How to allocate huge byte array safely

Until Rust supports custom allocators, you will need to implement your own memory management on top of the underlying OS’s facilities. On Linux, you’ll likely want mmap with MAP_POPULATE to prefault the VM pages; on Windows, VirtualAlloc with MEM_COMMIT, or something like that.

Once you have the raw storage, perhaps you can use the c-vec crate mentioned upthread or roll your own data structures over it.
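
For what it's worth, here is a minimal sketch of the mmap route, assuming 64-bit Linux. The `PopulatedBuf` type is a made-up name for illustration, and the flag constants are hardcoded to the values I believe x86_64 and aarch64 Linux use; in real code you would take them (and the `mmap`/`munmap` declarations) from the libc crate instead.

```rust
use std::ffi::c_void;
use std::os::raw::c_int;
use std::{io, ptr, slice};

// ASSUMPTION: values for Linux on x86_64/aarch64. Other targets differ;
// the libc crate exports the right constants per platform.
const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_PRIVATE: c_int = 0x02;
const MAP_ANONYMOUS: c_int = 0x20;
const MAP_POPULATE: c_int = 0x8000;

// Hand-written declarations of the C library functions (off_t assumed 64-bit).
unsafe extern "C" {
    fn mmap(
        addr: *mut c_void,
        len: usize,
        prot: c_int,
        flags: c_int,
        fd: c_int,
        offset: i64,
    ) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

/// Hypothetical owner of an anonymous, prefaulted mapping; unmaps on drop.
pub struct PopulatedBuf {
    ptr: *mut u8,
    len: usize,
}

impl PopulatedBuf {
    pub fn new(len: usize) -> io::Result<Self> {
        // mmap rejects zero-length mappings, so report that up front.
        if len == 0 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "len must be non-zero",
            ));
        }
        let p = unsafe {
            mmap(
                ptr::null_mut(),
                len,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE,
                -1,
                0,
            )
        };
        // MAP_FAILED is (void *)-1.
        if p as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        Ok(PopulatedBuf { ptr: p as *mut u8, len })
    }

    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        // SAFETY: the mapping is `len` bytes long, owned by `self`, and
        // anonymous mappings are zero-filled by the kernel.
        unsafe { slice::from_raw_parts_mut(self.ptr, self.len) }
    }
}

impl Drop for PopulatedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr`/`len` describe a mapping created in `new`.
        unsafe {
            munmap(self.ptr as *mut c_void, self.len);
        }
    }
}
```

Usage would be along the lines of `let mut buf = PopulatedBuf::new(1 << 30)?;` followed by `buf.as_mut_slice()`. As far as I recall from the mmap man page, the call does not fail merely because the pages could not all be populated, so treat this as reducing later page faults rather than as a guarantee.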

Good advice in general, but OP has a Vec<u8> with uninitialized storage. I don’t think reading those bytes is unsound because every value is a valid u8. It’s garbage data, yes, but it’s not unsound.

Also, if you know you’re going to overwrite the whole buffer, then initializing it is wasted work, particularly for large allocations.

Beware LLVM's heavy-handed approach to undefined behaviour. If the compiler ever manages to figure out uninitialized data is being read, Bad Things can happen to the code. We're talking about branches being entirely optimized out because they obviously cannot happen, or function bodies being replaced with a trap instruction because they obviously cannot be entered.

According to this internals thread, there has even been some hardware in the past which detected uninitialized memory reads and trapped on that.

Ah, I should’ve clarified - I meant reading it with ptr::read, not "normal" reads. I suppose the writes need to be via ptr::write as well, even if the underlying type is !Drop (or otherwise not read on writes).

But this pattern of allocating a block, without initializing, and then overwriting the whole thing is fairly common in I/O with Rust. Read::initializer attempts to formalize it a bit.
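
In current Rust, one way to write the "allocate, don't initialize, then overwrite everything" pattern without ever reading uninitialized bytes is to go through `MaybeUninit`. The sketch below is only an illustration: `fill_without_zeroing` and its `fill` callback are names I made up, while `Vec::spare_capacity_mut`, `MaybeUninit::write` and `Vec::set_len` are the standard library calls it leans on.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: builds a Vec<u8> of `len` bytes where every byte is
/// written exactly once by `fill`, with no zeroing pass beforehand.
fn fill_without_zeroing(len: usize, mut fill: impl FnMut(usize) -> u8) -> Vec<u8> {
    let mut buf: Vec<u8> = Vec::with_capacity(len);
    {
        // The spare capacity is exposed as MaybeUninit<u8>, so the
        // uninitialized bytes are never viewed as plain u8 values.
        let spare: &mut [MaybeUninit<u8>] = buf.spare_capacity_mut();
        // The capacity may exceed `len`, so only touch the first `len` slots.
        for (i, slot) in spare.iter_mut().take(len).enumerate() {
            slot.write(fill(i));
        }
    }
    // SAFETY: the loop above initialized the first `len` elements, and
    // `len` does not exceed the capacity requested from with_capacity.
    unsafe { buf.set_len(len) };
    buf
}
```

For example, `fill_without_zeroing(5, |i| i as u8 * 2)` yields the bytes 0, 2, 4, 6, 8. The only unsafe step left is `set_len`, and its precondition is easy to check by eye.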

A comprehensive doc on Rust unsafe can’t come soon enough :slight_smile:. Although fun, debating UB in Rust is a bit academic and speculative at the moment because things aren’t fully clarified.

Since the UB handling craziness is not on the rustc side but on the LLVM side (or, in the Itanium case, on the hardware side), I think even ptr::read() without a prior ptr::write() is UB. This is also what the Rust reference currently hints at by using LLVM terminology.

I certainly can't wait to see a safe solution for more common data initialization patterns.

How would I implement reading a junk byte? Say I really wanted to do that.

If ptr::read'ing aligned but uninitialized (by Rust code) memory is UB, then I’m afraid Rust will quickly devolve into C territory of UB craziness.

From my understanding, the aforementioned list of Rust undefined behaviour prevents you from doing that with guaranteed forward-compatibility in this language. If you want to do it without depending on "compilers being nice", you must do it in platform-specific inline assembly.

To be fair, the way Rust avoids C's undefined behaviour craziness is not by adopting a narrower definition of UB (our definition is actually wider, because it includes things like &mut aliasing, immutability and misaligned reads) but by making all UB triggers accessible only via unsafe code.

I’m having a hard time believing that’s the Rust answer, in small part because inline asm is nowhere near being a stable feature.

Unsafe Rust is just as important, if not even more so, as the safe set - it’s going to be (and is already) used to build foundational pieces of the libs in the ecosystem.

This is true, and Rust's approach of isolating unsafe in a tiny fraction of the code means that you can get away with worse unsafe code ergonomics than in C/++, leading in turn to more aggressive optimizations.

I don't mind ergonomics being worse, but soundness/UB cannot be a figurative free-for-all.

...which is why, as you pointed out before, unsafe code guidelines are such an important project. I personally think that the reference's list of "behaviour considered undefined" is already a pretty good start, but a longer-form text with more examples and more detailed explanations wouldn't hurt.

@scottmcm

  1. Somehow, after reading the book, I had the impression that vec![0; len] is unrolled into many push() calls. But, as the actual assembly shows, it is more like a loop with a dynamically given len. Thank you for insisting on this simpler form, which is appropriate for the majority of cases.

  2. Of course, the actual code populates the array and only later reads it randomly. I was simply trying out reads and writes to see what the computer would do after allocating 100GB of memory :wink:

  3. As @vitalyd pointed out, this is a situation where the extra initialization write is comparable in cost to the real work of the code. For scrypt, that work is one write to every byte and, on average, one or more reads, depending on the input parameters. Most uses have the scrypt parameter p == 1, which translates to one read on average. Hence the additional write adds roughly 30% overhead to this intentionally memory-hard code.

@vitalyd
I am a total noob at this. Can you sketch how to use mmap and VirtualAlloc without dropping into C code, that is, as a way to use platform-specific facilities from within Rust? Thank you in advance.

A good example to look at would be the slice_deque crate - @gnzlbg (author of it) may have additional suggestions for you :slight_smile:.

@vitalyd
Thank you for this "rust by example". I can see here things like:

#[cfg(unix)]
extern crate libc;

#[cfg(target_os = "windows")]
extern crate winapi;

I'll dig deeper. winapi looks encouraging, etc.

As @sfackler mentions, that's because you have overcommit enabled on Ubuntu. Try cat /proc/sys/vm/overcommit_memory and you will probably get 0 or 1 as the answer. As superuser you can set it to 2 (not recommended) so that allocations are constrained by the overcommit ratio.
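
If you would rather check the setting from inside the program, something like the following might do. It is a sketch that assumes Linux with /proc mounted; both function names are made up for illustration. The modes are 0 (heuristic overcommit), 1 (always overcommit) and 2 (strict accounting).

```rust
use std::fs;
use std::io;

/// Hypothetical helper: parses the file contents (e.g. "0\n") into a mode.
/// Returns None for anything other than 0, 1 or 2.
fn parse_overcommit_mode(contents: &str) -> Option<u8> {
    match contents.trim().parse::<u8>() {
        Ok(mode @ 0..=2) => Some(mode),
        _ => None,
    }
}

/// Hypothetical helper: reads the current mode from procfs.
/// Fails with an io::Error on systems without this file.
fn read_overcommit_mode() -> io::Result<Option<u8>> {
    let contents = fs::read_to_string("/proc/sys/vm/overcommit_memory")?;
    Ok(parse_overcommit_mode(&contents))
}
```

A program that needs allocation failures to be reported up front could call `read_overcommit_mode()` at startup and warn when the result is not `Some(2)`.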

The easiest solution to your problem is to just touch the memory on allocation:

fn allocate_byte_array(len: usize) -> Vec<u8> {
    // EDIT: was vec![0; len], see @sfackler's comment below
    vec![1; len]
}

If that doesn't crash your program, touching the memory later won't either.

Unless this becomes a significant performance bottleneck, or you really need to be able to recover from an allocation failure, everything else is probably not worth the effort.

The alternative requires knowledge of the kernel memory allocation APIs of the different platforms and might end with you reimplementing a memory allocator for your particular use case (like reimplementing malloc and free). If you have never done that, an allocator that serves a single allocation is probably a good place to start, but depending on what you want to achieve, this might not be a good use of your time. There are many Rust crates and APIs that help you with that:

  • on Linux, the libc crate gives you everything you need (mmap, relevant constants, syscall, ...),
  • on Windows, the winapi crate gives you everything you need as well (VirtualAlloc, VirtualFree, CreateFileMapping),
  • on macOS, you will need the mach crate (can't remember the function names),
  • for everything in between (*BSDs, Solaris, ...) you need the System V memory APIs (the libc crate exposes most of these).

The memory allocator behind slice_deque is very simple, yet it has to make use of all of these...

vec![0; len] is a calloc, so it may not explicitly touch the memory either.

@sfackler indeed, vec![1; len] should work. Or doing a with_capacity and then just a vec.push(0) in a loop.

That's a common misconception. Uninitialized is far more insidious than that; see the LLVM explanations:

Here are some examples of (potentially surprising) transformations that are valid [...] This example points out that two ‘undef’ operands are not necessarily the same. This can be surprising to people (and also matches C semantics) where they assume that “X^X” is always zero, even if X is undefined.
LLVM Language Reference Manual — LLVM 18.0.0git documentation

A common way to get UB from that is to use it to index into an array; the compiler can skip the bounds check -- it's undef, so it picks something that makes it fit -- but then pick something outside the bounds of the array when actually reading from the array.

I don't have a crashing example handy, but this one gives the nonsense error "thread 'main' panicked at 'index out of bounds: the len is 3 but the index is 0'" because of misusing undef.

Yes, the 0 being a calloc is why my suggestion above was with 1 :grin:

Of course, you could probably also just only set one byte out of every 4k or so.
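
A rough sketch of that "one byte per page" idea, assuming a 4 KiB page size (the real value should be queried from the OS) and using a made-up function name. It keeps the cheap calloc path of vec![0; len] and then forces one volatile write per assumed page, so the buffer still ends up all zeros.

```rust
use std::ptr;

// ASSUMPTION: 4 KiB pages. Huge pages or other platforms will differ.
const ASSUMED_PAGE_SIZE: usize = 4096;

/// Hypothetical helper: zeroed allocation with every page touched once.
fn allocate_touched(len: usize) -> Vec<u8> {
    let mut buf = vec![0u8; len];
    let base = buf.as_mut_ptr();
    for offset in (0..len).step_by(ASSUMED_PAGE_SIZE) {
        // SAFETY: offset < len, so the write stays inside the allocation.
        // Volatile keeps the compiler from dropping the redundant store.
        unsafe { ptr::write_volatile(base.add(offset), 0) };
    }
    if len > 0 {
        // The buffer need not start on a page boundary, so the last page
        // can be missed by the loop above; touch the final byte as well.
        // SAFETY: len - 1 is the last valid index.
        unsafe { ptr::write_volatile(base.add(len - 1), 0) };
    }
    buf
}
```

This does one write per 4096 bytes instead of one per byte, so it should be much cheaper than vec![1; len] while still making the kernel back every page at allocation time.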

Does that apply to ptr::read()?