How to allocate huge byte array safely

I don’t mind ergonomics being worse, but soundness/UB cannot be a figurative free-for-all.

…which is why, as you pointed out before, unsafe code guidelines are such an important project. I personally think that the reference’s list of “behaviour considered undefined” is already a pretty good start, but a longer-form text with more examples and more detailed explanations wouldn’t hurt.

@scottmcm

  1. Somehow, after reading the book, I had the impression that vec![0; len] expands into many push() calls. But, as the actual assembly shows, it is more like a loop with a dynamically given len. Thank you for insisting on this simpler form, which is appropriate for the majority of cases.

  2. Of course, the actual code populates the array and only later reads it randomly. I was simply trying out reads and writes to see what the computer would do after allocating 100GB of memory.

  3. As @vitalyd pointed out, this is a situation where an additional initialization write to memory is comparable in cost to the overall work of the code. In the case of scrypt, that work is one write to every byte and, on average, one or more reads, depending on the input parameters. Most cases have the scrypt parameter p == 1, which translates to one read on average. Hence, an additional write adds 30% overhead for this intentionally memory-hard code.

@vitalyd
I am a total noob in this. Can you sketch how to use mmap and VirtualAlloc without dropping into C code, that is, as a way to use platform-specific facilities from within Rust? Thank you in advance.

A good example to look at would be the slice_deque crate. @gnzlbg (its author) may have additional suggestions for you.

@vitalyd
Thank you for this “rust by example”. I can see here things like:

#[cfg(unix)]
extern crate libc;

#[cfg(target_os = "windows")]
extern crate winapi;

I’ll dig deeper. winapi looks encouraging, etc.

As @sfackler mentions, that's because you have overcommit enabled in Ubuntu. Try cat /proc/sys/vm/overcommit_memory and you will probably get 0 or 1 as the answer. As a superuser you can set it to 2 (not recommended) to constrain allocations by the overcommit ratio.
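
If you would rather check the setting from Rust than from the shell, here is a minimal sketch. It assumes a Linux system that exposes /proc/sys/vm/overcommit_memory; the function names (parse_overcommit_mode, overcommit_mode, describe) are made up for illustration and are not part of any crate.

```rust
use std::fs;

/// Parse the contents of /proc/sys/vm/overcommit_memory.
/// The kernel documents three modes: 0, 1 and 2.
fn parse_overcommit_mode(contents: &str) -> Option<u8> {
    match contents.trim().parse::<u8>() {
        Ok(mode @ 0..=2) => Some(mode),
        _ => None,
    }
}

/// Read the current mode. Returns None if the file is missing
/// (for example on non-Linux systems) or cannot be parsed.
fn overcommit_mode() -> Option<u8> {
    let contents = fs::read_to_string("/proc/sys/vm/overcommit_memory").ok()?;
    parse_overcommit_mode(&contents)
}

/// Human-readable summary of each mode.
fn describe(mode: u8) -> &'static str {
    match mode {
        0 => "heuristic overcommit (default)",
        1 => "always overcommit",
        2 => "strict accounting, limited by overcommit_ratio",
        _ => "unknown",
    }
}
```

Calling overcommit_mode().map(describe) at startup is one way to warn the user that a huge allocation may appear to succeed and only fail later, when the pages are touched.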

The easiest solution to your problem is to just touch the memory on allocation:

fn allocate_byte_array(len: usize) -> Vec<u8> {
    // EDIT: was vec![0; len]; see @sfackler's comment below
    vec![1; len]
}

If that doesn't crash your program, touching the memory later won't either.

Unless this becomes a significant performance bottleneck or you really, really need to be able to recover from an allocation failure, everything else is probably not worth the effort.
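
If recovering from an allocation failure is what you need, the standard library has a fallible reservation API that might be enough before reaching for platform APIs. A minimal sketch, assuming a toolchain recent enough to have Vec::try_reserve_exact; try_allocate_byte_array is a hypothetical name used only here:

```rust
use std::collections::TryReserveError;

/// Try to allocate `len` zeroed bytes, returning an error instead of
/// aborting the process when the allocator reports failure.
fn try_allocate_byte_array(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut v: Vec<u8> = Vec::new();
    // Fails with Err if the capacity overflows or the allocator says no.
    v.try_reserve_exact(len)?;
    // The capacity is already reserved, so this does not reallocate.
    v.resize(len, 0);
    Ok(v)
}
```

Keep in mind that with overcommit enabled the reservation itself can succeed for sizes the machine cannot actually back, so an Ok here is not a guarantee that touching the memory will work.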

The alternative requires knowledge of the kernel memory allocation APIs of the different platforms and might end with you reimplementing a memory allocator for your particular use case (like reimplementing malloc and free). If you have never done that, an allocator for serving a single allocation is probably a good place to start, but depending on what you want to achieve, this might not be a good use of your time. There are many Rust crates and APIs that help you with that:

  • on Linux, libc gives you everything you need (mmap, relevant constants, syscall, ...),
  • for Windows the winapi crate gives you everything you need as well (VirtualAlloc, VirtualFree, CreateFileMapping),
  • for macOS you will need the mach crate (can't remember the function names),
  • for everything in between (*BSDs, Solaris, ...) you need SystemV memory APIs (the libc crate exposes most of these).

The memory allocator behind slice_deque is very simple, yet it has to make use of all of these...

vec![0; len] is a calloc, so it may not explicitly touch the memory either.

@sfackler indeed, vec![1; len] should work. Or doing a with_capacity and then just a vec.push(0) in a loop.

That's a common misconception. Uninitialized is far more insidious than that; see the LLVM explanations:

Here are some examples of (potentially surprising) transformations that are valid [...] This example points out that two ‘undef’ operands are not necessarily the same. This can be surprising to people (and also matches C semantics) where they assume that “X^X” is always zero, even if X is undefined.
LLVM Language Reference Manual — LLVM 13 documentation

A common way to get UB from that is to use it to index into an array; the compiler can skip the bounds check -- it's undef, so it picks something that makes it fit -- but then pick something outside the bounds of the array when actually reading from the array.

I don't have a crashing example handy, but this one gives the nonsense error "thread 'main' panicked at 'index out of bounds: the len is 3 but the index is 0'" because of misusing undef.

Yes, the 0 being a calloc is why my suggestion above was with 1.

Of course, you could probably also just only set one byte out of every 4k or so.
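
A rough sketch of that one-byte-per-page idea, in case it helps. The 4096-byte page size is an assumption (query the OS for the real value), and allocate_and_touch is just an illustrative name. Volatile writes are used so the compiler cannot drop the stores, and the last byte is touched separately because the buffer is not necessarily page-aligned.

```rust
use std::ptr;

/// Assumed page size; 4 KiB is common, but this is not guaranteed.
const ASSUMED_PAGE_SIZE: usize = 4096;

/// Allocate `len` zeroed bytes and write to one byte per assumed page,
/// so that the OS has to back every page with real memory.
fn allocate_and_touch(len: usize) -> Vec<u8> {
    let mut v = vec![0u8; len];
    for i in (0..len).step_by(ASSUMED_PAGE_SIZE) {
        // Volatile write: the store cannot be optimized away.
        unsafe { ptr::write_volatile(&mut v[i], 0) };
    }
    // The buffer may start in the middle of a page, so the final page
    // can be missed by the strided loop; touch the last byte as well.
    if let Some(last) = v.last_mut() {
        unsafe { ptr::write_volatile(last, 0) };
    }
    v
}
```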

Does that apply to ptr::read()?

I think the trouble is branching more so than copying.

Yup, using ptr::read on undef memory will give you an undef back. Demo:

unsafe fn uninit_via_array<T>() -> T {
    let x: [u8; 300] = std::mem::uninitialized();
    std::ptr::read(&x as *const _ as *const T)
}
pub unsafe fn uninit_u32_via_array() -> u32 { uninit_via_array() }

Compiles to just

start:
  ret i32 undef

https://play.rust-lang.org/?gist=d498f28fcc966f097f2d943cbb5db9b5&version=stable&mode=release

Yikes. This is going to bite someone (if/when compiler can reason about it) - hard - I suspect. Not in plain form like the example, but in the Vec::with_capacity style.
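
For the Vec::with_capacity style specifically, one way to stay on the sound side is to never read the spare capacity and to call set_len only after every element has been written. A minimal sketch, assuming a toolchain that has Vec::spare_capacity_mut; the function name and the fill closure are made up for this example:

```rust
use std::mem::MaybeUninit;

/// Build a Vec<u8> of length `len` without zeroing it first.
/// Each byte is written exactly once through MaybeUninit, and the
/// uninitialized memory is never read.
fn filled_without_zeroing(len: usize, fill: impl Fn(usize) -> u8) -> Vec<u8> {
    let mut v: Vec<u8> = Vec::with_capacity(len);
    // The spare capacity is exposed as MaybeUninit<u8>, not u8.
    let spare: &mut [MaybeUninit<u8>] = v.spare_capacity_mut();
    // with_capacity may hand out more than `len`, so stop at `len`.
    for (i, slot) in spare.iter_mut().take(len).enumerate() {
        slot.write(fill(i));
    }
    // SAFETY: the first `len` elements were initialized by the loop
    // above, and `len` does not exceed the capacity.
    unsafe { v.set_len(len) };
    v
}
```

For something like scrypt, where the buffer is populated before it is read, this avoids the extra initialization write while keeping the "no reads of uninitialized bytes" rule intact.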

If you just want to touch the memory, you can also just do:

use std::hint::black_box;

fn allocate_byte_array(len: usize) -> Vec<u8> {
    let v = vec![0; len];
    for i in 0..len {  // can use a stride of a page size
        black_box(v[i]);  // can use get_unchecked here
    }
    v
}

The problem I see with just ptr::read is that the compiler can optimize reads away if the results are not used, but black_box prevents that.

I believe Miri can already reason about it, so the expansion of what can happen in const may make that day come sooner rather than later.

There are other ways to get undef as well via unsound code, like from padding:

pub unsafe fn uninit_u8_via_padding() -> u8 {
    let x = (0xFFFF_u16, 0xFF_u8);
    let a: [u8; 4] = std::mem::transmute(x);
    a[3]
}

is also just

start:
  ret i8 undef

https://play.rust-lang.org/?gist=01a7454171521d082cd7ece2a7003f82&version=stable&mode=release

So I wonder what semantics Rust would choose if it wasn’t so heavily influenced by what LLVM does and models.

As mentioned upthread, I’m a bit worried that writing correct unsafe code is going to be a lot harder than the more reasonable requirements of, say, respecting ptr validity or aliasing rules. And given that unsafe is going to be at the core of most projects, it’s imperative that it’s not a minefield.

This is probably fodder for a new thread, however.

ptr::read_volatile is a simple replacement for ptr::read in this case, I think.

If reading of uninitialized or padding bytes is not legal in any way under Rust, does that mean a function like memcpy is inexpressible in pure Rust?

I think read_volatile of uninitialized data should be safe in the absence of hardware-side UB detection (like Itanium’s), which would make it unsafe even if written in (naive) assembly.

The reason is that read_volatile is basically a way to tell your compiler “please do not assume anything about the contents of this memory region, external hardware might be modifying it behind our back”. This includes not assuming that the application’s data is uninitialized just because the application itself did not initialize it.

It does have the side-effect of killing most compiler optimizations though, so you’ll likely end up writing your memcpy implementation using SIMD intrinsics, which are effectively assembly with automatic register allocation.
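
On the memcpy question, a possibly simpler angle than volatile reads is to copy the bytes as MaybeUninit<u8>, so that no value is ever claimed to be an initialized u8. This is only a sketch of the idea with a made-up function name, not how any libc or compiler builtin implements memcpy:

```rust
use std::mem::MaybeUninit;

/// Copy bytes that may or may not be initialized.
/// MaybeUninit<u8> is Copy, and moving it around does not require the
/// byte to hold a defined value, so padding and uninitialized bytes can
/// be copied without ever being read as u8.
fn copy_maybe_uninit(dst: &mut [MaybeUninit<u8>], src: &[MaybeUninit<u8>]) {
    assert_eq!(dst.len(), src.len(), "length mismatch");
    dst.copy_from_slice(src);
}
```

The caller is still responsible for calling assume_init only on bytes that really were initialized in the source.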
