How to allocate huge byte array safely

mikalai · June 24, 2018, 7:47pm

Context

Scrypt is an algorithm for deriving a key from a password, a process sometimes called key stretching. It is a memory-hard algorithm: depending on its parameters, scrypt may use 1GB of memory for key derivation (see the last test vector in the RFC).
I am rewriting scrypt's C code in Rust, and I have run into the following issues when allocating huge byte arrays.

Problem with vec allocation

I can use the following function to create a huge vector.

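fn allocate_byte_array(len: usize) -> Vec<u8> {
    // Reserve capacity only; nothing is written to the buffer here.
    let mut v: Vec<u8> = Vec::with_capacity(len);
    // NOTE: the contents are uninitialized after this call.
    unsafe {
        v.set_len(len);
    }
    v
}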

But when I test it, asking for 1GB, the system monitor on my Ubuntu 18 machine doesn't show the memory being allocated in one chunk. Instead, the memory seems to be allocated lazily.

In fact, you can allocate 100GB and the function will still return. Only when you start to access the memory does the operating system start to use it, all the way to the point of freezing because it is out of memory.

In the playground, running the aforementioned function

fn main() {
    let v = allocate_byte_array(1024*1024*1024*100);
    print!("size {}", v.len());
}

outputs size 107374182400. We allocated 100GB on a server?
while the following

fn main() {
    let mut v = allocate_byte_array(1024*1024*1024*100);
    print!("size {}", v.len());
    for i in 0..v.len() {
        v[i] = i as u8;
    }
}

is killed due to timeout (/root/entrypoint.sh: line 8: 7 Killed timeout --signal=KILL ${timeout} "$@").

Judging from the system utilization graph, when the array is written to, it seems to be allocated in little chunks. But such multiple allocations are costly, especially for the scrypt algorithm, where we need one huge allocation.

Comparable C code uses malloc, which returns NULL when the requested length of the array is too big. Acquiring the memory quickly, or getting a fast indication of failure, is desirable in this context.

Question

Is there a way to allocate a huge chunk of memory in one go and use it as a Vec<u8>?
Is it possible to ensure that the allocation fails immediately when there is not enough memory?
Can this be done with some special allocator?

kornel · June 24, 2018, 8:25pm

~~Currently libc::malloc() is the best way~~, and there's no way to safely use it with Vec. You can try CVec. All OOM cases in Rust are dangerous.

There's a proposal to improve OOM in Rust:

https://github.com/rust-lang/rust/issues/48043

Ixrec · June 24, 2018, 8:46pm

@kornel I must be missing something really obvious, because it seems like Vec::with_capacity(n); should either immediately allocate n, or allocate n on first push, or else it's identical to Vec::new() and kinda pointless. What am I missing that creates the need for libc here?

kornel · June 24, 2018, 8:50pm

Currently in Rust Vec::with_capacity(n) may panic or even immediately unconditionally abort the whole process, depending on value of n (the same goes for Vec::new, push and everything else that allocates memory in Rust's stdlib).

libc::malloc() may return NULL in cases where OOM can be handled. With Vec, until try_reserve lands, there's no way to handle any problem with allocation at all.

On Linux with overcommit, and on OSes with a large enough swap file, malloc() won't be enough either. You may need mlock or madvise.

mikalai · June 24, 2018, 8:54pm

@Ixrec
As the libc docs say, use the crate: libc = "0.2.42" goes into dependencies.

mikalai · June 24, 2018, 9:00pm

@kornel
As long as OOM is signaled for an amount that doesn't fit into free_memory+free_swap, the scrypt use case will be fine, as it is a user-side operation that by design takes over tons of resources and is not supposed to run truly in the background.

kornel · June 24, 2018, 9:03pm

Rust only signals this by calling abort().

kornel · June 24, 2018, 9:08pm

My point is, currently if you rely on Rust stdlib, things that are too large for the system memory only cause awful behaviors of the program:

the OS will give it swap space, and your program will just take forever to run (for scrypt that's definitely game over, your hashing will take years).
the OS will overcommit memory and kill the process.
even when the OS behaves well, Rust will detect OOM and kill the process anyway.

So if you want RAM or graceful error, then you have to avoid Rust's stdlib, and you'll probably need low-level system-specific memory allocation calls.

mikalai · June 24, 2018, 9:11pm

@kornel
What sort of "low-level system-specific memory allocation calls"? Give me an example. This isn't familiar territory for me, so I could use some pointers.

kornel · June 24, 2018, 9:12pm

malloc + mlock, probably.
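A minimal sketch of what the malloc + mlock suggestion could look like, assuming a Unix-like target. The `LockedBuf` type and `AllocError` enum are hypothetical names made up for illustration, and the C functions are declared by hand only to keep the example dependency-free (the libc crate exposes the same calls). A successful mlock means the pages are resident, so a shortage should surface at this point rather than on first write. Note that unprivileged processes are usually capped by RLIMIT_MEMLOCK, so large requests may fail for that reason alone.

```rust
use std::os::raw::{c_int, c_void};

// Hand-written declarations (assumption: Unix-like target with a C library).
extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
    fn mlock(addr: *const c_void, len: usize) -> c_int;
    fn munlock(addr: *const c_void, len: usize) -> c_int;
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum AllocError {
    MallocFailed,
    MlockFailed,
}

/// Hypothetical owner of a malloc'ed buffer that is pinned in RAM.
pub struct LockedBuf {
    ptr: *mut u8,
    len: usize,
}

impl LockedBuf {
    pub fn new(len: usize) -> Result<LockedBuf, AllocError> {
        assert!(len > 0, "zero-sized buffers are not supported here");
        unsafe {
            let raw = malloc(len);
            if raw.is_null() {
                return Err(AllocError::MallocFailed);
            }
            // Pin the pages; on failure, release the memory and report it.
            if mlock(raw, len) != 0 {
                free(raw);
                return Err(AllocError::MlockFailed);
            }
            // Zero the buffer so that handing out a byte slice is sound.
            std::ptr::write_bytes(raw as *mut u8, 0, len);
            Ok(LockedBuf { ptr: raw as *mut u8, len: len })
        }
    }

    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.len) }
    }
}

impl Drop for LockedBuf {
    fn drop(&mut self) {
        unsafe {
            munlock(self.ptr as *const c_void, self.len);
            free(self.ptr as *mut c_void);
        }
    }
}
```

A caller would match on `LockedBuf::new(len)` and fall back or report an error on `Err`, instead of having the process aborted. This gives a byte slice rather than a Vec<u8>, which sidesteps the problem of handing malloc'ed memory to Vec.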

mikalai · June 24, 2018, 9:20pm

@kornel
Oh. You really mean going to the system level: one thing on Linux, another on Windows, etc.

How about a workaround? Is there something in Rust that will tell me how much memory the system has, the way there is a way (a crate?) to get the number of processors in the system? I am hoping for something cross-platform, or a crate that already covers all platforms.

kornel · June 24, 2018, 9:23pm

Checking amount of "free" memory will probably be even harder than allocating it, since modern OSes do all kinds of clever stuff to keep 100% of RAM used 100% of the time (free RAM is used for caches, things are deallocated only lazily, inactive data may live in RAM, but could be purged if needed, etc.).
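For illustration only, here is a Linux-specific sketch of reading the kernel's own estimate, the MemAvailable field of /proc/meminfo (present since Linux 3.14). It is not cross-platform, the function names are hypothetical, and the value is only an estimate that can change before the allocation happens, which is exactly the difficulty described above.

```rust
use std::fs;

/// Extracts the `MemAvailable` value (in KiB) from text in /proc/meminfo format.
pub fn parse_mem_available_kib(meminfo: &str) -> Option<u64> {
    for line in meminfo.lines() {
        if line.starts_with("MemAvailable:") {
            // The line looks like: "MemAvailable:    8765432 kB"
            return line.split_whitespace().nth(1)?.parse::<u64>().ok();
        }
    }
    None
}

/// Linux-only helper: returns None on other systems or on any read error.
pub fn mem_available_kib() -> Option<u64> {
    let text = fs::read_to_string("/proc/meminfo").ok()?;
    parse_mem_available_kib(&text)
}
```

A caller could compare the requested size against `mem_available_kib()` and refuse obviously hopeless requests early, while still treating the result as a hint rather than a guarantee.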

mikalai · June 24, 2018, 9:38pm

Wow. With libc::malloc the situation is exactly the same!
100GB without access to big indices
100GB timing-out case.

sfackler · June 24, 2018, 10:57pm

That's just overcommit in action.

mikalai · June 24, 2018, 10:58pm

This is nightly with things from RFC #2116:

fn allocate_byte_array(len: usize) -> Result<Vec<u8>, CollectionAllocErr> {
    let mut v = Vec::new();
    v.try_reserve(len)?;
    unsafe {
        v.set_len(len);
    }
    Ok(v)
}

At the moment this code runs, i.e. there is no error on a 100GB allocation.

mikalai · June 24, 2018, 11:03pm

@sfackler
It's a gross overcommit in action. The nightly example runs with 1000GB!

sfackler · June 25, 2018, 4:42am

That's just how malloc works on Linux. There's nothing special going on with the Rust side.

HadrienG · June 25, 2018, 6:13am

In any sufficiently complex system, there is a tension between being clever/accommodating and being predictable/debuggable/optimizable. Like the JVM and modern CPUs, the Linux kernel is a clever but unpredictable system in several ways, including its use of memory overcommitment.

Usually, when you investigate this kind of system further, you will find that the deeper rationale for the cleverness is to be compatible with bad existing practice or ill-advised past design decisions. In this particular case, I think Linux is being compatible with the fork()/exec() process creation model from Unix. Fork/exec only works in the presence of copy-on-write and memory overcommitment, because without these a 15 GB process cannot spawn a 4 KB process on a system with 16 GB of RAM.

See also: memory-mapped files and implicit synchronous file I/O versus explicit asynchronous file I/O.

mikalai · June 25, 2018, 2:59pm

If it is all about the OS not really enforcing allocation promises, what are the best defensive programming approaches here?

Try to use things that may return an error, because at least on Windows it will work (the Rust RFC #2116 mentioned above).
Populate the memory at allocation time with an error-returning method, so that any error happens in one localized spot.
???

scottmcm · June 26, 2018, 6:09am

I see that the code in the OP is using with_capacity+set_len, so it's uninitialized. Do you get the same behaviour if you do vec![1; len]? Writing the pages ought to force the OS to actually give you the memory.

(Note also that getting uninitialized memory in safe rust is unsound, so please do something about it regardless.)
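A minimal safe sketch combining the two ideas from this thread: a fallible reservation plus an explicit write of every byte. It assumes a newer toolchain than the one discussed here, since try_reserve_exact and TryReserveError were stabilized in Rust 1.57 (the nightly version in this thread used the name CollectionAllocErr). On a Linux system with overcommit the write may still get the process killed instead of returning an error, so this only localizes the failure; it does not guarantee a graceful one.

```rust
use std::collections::TryReserveError;

/// Reserves `len` bytes fallibly, then writes every byte so the pages are touched.
pub fn allocate_byte_array(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut v: Vec<u8> = Vec::new();
    // Returns Err instead of aborting when the allocator reports failure.
    v.try_reserve_exact(len)?;
    // Capacity is already reserved, so this only fills the buffer with 1s.
    // No unsafe code and no uninitialized memory are involved.
    v.resize(len, 1);
    Ok(v)
}
```

A non-zero fill value is used deliberately, following the vec![1; len] suggestion above, so that the buffer is really written rather than left as untouched zero pages.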
