Big buffer allocation

Hi Rustaceans !

As probably many of you have, I read the topic about a Box allocation that overflowed the stack. From what I read, Box allocates the object on the stack and then 'moves' it to the heap, so allocating a big object leads to a stack overflow.

As a first small project, I wrote a program that computes a 1.5-gigapixel Mandelbrot set. The size is large so that processing takes long enough to give a decent comparison with my C++ and Java implementations, and I chose the subject to get an idea of the code generated for floats [I heard about criterion ! :smile:]. As you can guess, using Rust 1.44, I still cannot allocate a big Box. As a workaround, I used a Vec to store the output image. It worked nicely, even compared to a native C++ array (I mean not std::vector). Nevertheless, this solution is just a workaround, and it prevents random access and/or partitioning.
Are there any changes to expect in this situation? In my opinion, allocating a big buffer is quite legitimate and should not be prohibited by a language. I know that a few crates are available, such as ndarrays or sjep array (on github), but as a Rust beginner, I'm a bit reluctant to use something not provided in the standard library. Will this problem be corrected, or will any new allocators be provided? This issue is quite old now, so is there any philosophical or idiomatic opposition to such a practice? Thanks for reading, regards

How does using a Vec<T> prevent random access and/or partitioning compared to a Box<[T; N]>?

From my experiments, I can't access the vector using [ ] until it has been filled with push(). So I can't split my image into different areas (what I mean by partitioning) and, for example, use several threads to fill each part.
Since I never managed to get the big Box, I never used it, but I was expecting (maybe foolishly) to get full-range access to the allocated memory.

You could zero out the Vec to the desired size and then convert it to a big Box:

let mut vs = Vec::with_capacity(img_size);
vs.fill(0);  // or whatever your default T is
let boxed = vs.into_boxed_slice();  // boxed: Box<[T]>
let mut vs = Vec::with_capacity(img_size);
vs.fill(0);  // or whatever your default T is

This won't work, because fill only writes up to the length of the vec, which is 0.
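A minimal sketch of why the snippet above leaves the buffer empty: with_capacity only reserves memory, so the length stays 0 and fill has nothing to write to. The function name capacity_is_not_length and the u8 element type are made up for illustration.

```rust
/// Returns (length, whether capacity is large enough) after the
/// "with_capacity + fill" attempt from the thread.
fn capacity_is_not_length(img_size: usize) -> (usize, bool) {
    let mut vs: Vec<u8> = Vec::with_capacity(img_size);
    // fill only touches the existing elements (0..len), and len is still 0 here.
    vs.fill(0);
    (vs.len(), vs.capacity() >= img_size)
}

fn main() {
    let (len, has_room) = capacity_is_not_length(1_000);
    println!("len = {len}, capacity >= 1000: {has_room}");
}
```

Indexing such a vector with vs[0] would panic, because indexing is checked against the length, not the capacity.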

Use this to efficiently allocate and zero a large heap buffer:

let buffer = vec![0; img_size];
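Once the buffer is created this way, every index is valid immediately, so random access and partitioning both work. The following is only a sketch under assumptions: the helper fill_in_bands, the band layout, and the placeholder "pixel" value are hypothetical, and std::thread::scope needs a newer compiler (Rust 1.63 or later) than the 1.44 mentioned in the question.

```rust
use std::thread;

/// Fills `buffer` in parallel, one horizontal band of rows per thread.
/// The "pixel" value is a placeholder: the row index, truncated to u8.
fn fill_in_bands(buffer: &mut [u8], width: usize, rows_per_band: usize) {
    thread::scope(|s| {
        // chunks_mut hands out non-overlapping mutable sub-slices,
        // so each thread owns its own band without any locking.
        for (band_idx, band) in buffer.chunks_mut(width * rows_per_band).enumerate() {
            s.spawn(move || {
                for (i, px) in band.iter_mut().enumerate() {
                    let row = band_idx * rows_per_band + i / width;
                    *px = row as u8;
                }
            });
        }
    });
}

fn main() {
    let (width, height) = (64, 48);
    let mut buffer = vec![0u8; width * height];
    buffer[5] = 1; // random access works right away, no push() needed
    fill_in_bands(&mut buffer, width, 12);
    println!("first pixel of last row: {}", buffer[width * (height - 1)]);
}
```

A real Mandelbrot renderer would replace the placeholder assignment with the per-pixel iteration count.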

The proposed solution worked flawlessly. Nevertheless, it feels to me like a kind of 'hack': we get a macro that allocates memory, memsets the buffer to zero, and then updates its length. I guess (correct me if I'm wrong) that it's a feature of the macro, not of Vec? Maybe I am too OOP-minded.

From what's written in the book, shouldn't Box be used here ? Sorry for insisting on the subject, but if not, maybe something should be corrected in the Book ?

The standard library uses a macro here just so that vector expressions like vec![0; n] have consistent syntax with array expressions like [0; N]. Internally, the macro expands to a regular function call.

Vec<T> is the common way of using buffers on the heap in Rust. Box<[T]> is occasionally used as an optimization when storing a huge number of buffers at once (because it is one word narrower), but otherwise there is not much reason to use it.
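The "one word narrower" remark can be checked with size_of. This is a small sketch, and the helper name handle_sizes_in_words is made up. The exact layout of Vec is an implementation detail, so the numbers are what current compilers produce in practice rather than a documented guarantee.

```rust
use std::mem::size_of;

/// Sizes of the two handles, measured in machine words (usize).
fn handle_sizes_in_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<Vec<u8>>() / word, size_of::<Box<[u8]>>() / word)
}

fn main() {
    let (vec_words, box_words) = handle_sizes_in_words();
    // Vec stores pointer + length + capacity; Box<[T]> stores pointer + length.
    println!("Vec<u8>: {vec_words} words, Box<[u8]>: {box_words} words");
}
```

The difference only concerns the handle itself; the heap allocation holding the elements is the same size either way.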

The vec! macro is there to create the vec in the most efficient manner possible. Macros in Rust aren't a hack; they're a useful tool.
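To illustrate the idea that such a macro only adds syntax on top of an ordinary function, here is a toy version. The names buf!, filled_buffer, and make_pixels are hypothetical, and this is not how the standard library implements vec!.

```rust
/// Ordinary function that does the real work.
fn filled_buffer<T: Clone>(value: T, len: usize) -> Vec<T> {
    let mut v = Vec::with_capacity(len);
    v.resize(len, value);
    v
}

/// Hypothetical macro: it only provides the array-like `[value; len]` syntax
/// and expands to a plain call to `filled_buffer`.
macro_rules! buf {
    ($value:expr; $len:expr) => {
        filled_buffer($value, $len)
    };
}

/// Small wrapper showing the macro in use.
fn make_pixels(n: usize) -> Vec<u8> {
    buf![0u8; n]
}

fn main() {
    let pixels = make_pixels(16);
    println!("{} pixels, first = {}", pixels.len(), pixels[0]);
}
```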

If you really want to set the length manually you could do:

let mut vs = Vec::with_capacity(img_size);
vs.resize(img_size, 0);

This may or may not do the same thing as the macro internally, but it should have the same effect.
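A quick way to convince yourself that both routes end in the same place, sketched with made-up helper names via_macro and via_resize and a u8 element type. It also shows the conversion to Box<[T]> mentioned earlier in the thread.

```rust
/// Buffer built with the macro.
fn via_macro(n: usize) -> Vec<u8> {
    vec![0; n]
}

/// Buffer built by reserving capacity and then resizing.
fn via_resize(n: usize) -> Vec<u8> {
    let mut vs = Vec::with_capacity(n);
    vs.resize(n, 0); // unlike fill, resize really changes the length
    vs
}

fn main() {
    let a = via_macro(1024);
    let b = via_resize(1024);
    println!("same contents: {}", a == b);

    // Either vector can be turned into a boxed slice afterwards.
    let boxed: Box<[u8]> = b.into_boxed_slice();
    println!("boxed length: {}", boxed.len());
}
```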

I'm not sure what your requirement is exactly but when I hit the problem of stack overflow when creating a huge array some kind person here suggested this:

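let mut a = unsafe {
    let layout = std::alloc::Layout::new::<Array>();
    // Allocate zeroed memory directly on the heap, so nothing large touches the stack.
    let ptr = std::alloc::alloc_zeroed(layout) as *mut Array;
    Box::from_raw(ptr)
};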

Which I could then pass to some function like so:

    someFunc(&mut a);

Where the function has a signature like:

type Array = [[i32; MAX]; MAX];
fn someFunc(a: &mut Array) {
    // ...
}

The only reason I wanted to do this weird thing was to have Rust functions with signatures as similar as possible to the C functions I was benchmarking against. I don't think it is really necessary.

I'm not happy with it because of the use of "unsafe" which I don't generally want to see scattered around my application level code.
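For what it's worth, the same Box of a 2D array can probably be obtained without unsafe by building a Vec of rows and converting it. This is a sketch under assumptions: MAX = 1024 is an arbitrary value, zeroed_array and some_func are hypothetical names, and it relies on the standard TryFrom<Box<[T]>> conversion for Box<[T; N]>.

```rust
use std::convert::TryInto; // needed on editions before 2021

const MAX: usize = 1024; // arbitrary size for this sketch (4 MiB of i32)
type Array = [[i32; MAX]; MAX];

/// Allocates the zeroed 2D array directly on the heap, without `unsafe`.
fn zeroed_array() -> Box<Array> {
    // Only a single row ([i32; MAX]) ever lives on the stack here.
    let rows: Box<[[i32; MAX]]> = vec![[0i32; MAX]; MAX].into_boxed_slice();
    match rows.try_into() {
        Ok(array) => array,
        Err(_) => unreachable!("the Vec was built with exactly MAX rows"),
    }
}

/// Hypothetical stand-in for the benchmarked function, with a C-like signature.
fn some_func(a: &mut Array) {
    a[MAX - 1][MAX - 1] = 42;
}

fn main() {
    let mut a = zeroed_array();
    some_func(&mut a); // &mut Box<Array> coerces to &mut Array
    println!("{}", a[MAX - 1][MAX - 1]);
}
```

The function signature stays the same as in the unsafe version, so the comparison with the C code is unaffected.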

Full code example is here:

OK, I misunderstood this; I was thinking about vectors as C++ vectors. I guess that Box is used in the recursive type pattern because of its narrower size. Things are clearer for me now, thanks.

Thanks for completing the answer. In the future I'll try to think about Rust macros as Rust macros, not C's. :slightly_smiling_face:

I have no requirement really. I'm just trying to learn Rust but it's quite hard for me to get rid of my (old) habits, and my low-level background. I will have a look at your github link.

Thanks all, for your replies.


I sympathize. I'm in the same boat.

In my example the point was that I did not want a C++ style vector. I just wanted a 2D array and function signatures to match, like in C.

Such a simple thing turned out to be a head scratcher. And likely not the best way to do it in Rust.

When we don't need to zero the buffer, isn't this solution the most optimized?

It seems to do the same as vec![0; n] but without zeroing.

Moreover, vec! calls Vec::with_capacity and then Vec::reserve. Isn't that redundant?

...and so is UB, according to the Vec::set_len docs:

The elements at old_len..new_len must be initialized.

If you wanted to skip initialization, you'd need to use MaybeUninit (probably using something like Box::new_uninit) and then write your output via writing through a pointer. But that's way premature to think about before just using the vec solution.
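As a rough illustration of the MaybeUninit route, here is one way to skip the zeroing soundly. Note the swap: this sketch uses Vec::spare_capacity_mut rather than Box::new_uninit, and the function build_buffer and its placeholder values are made up.

```rust
use std::mem::MaybeUninit;

/// Builds a buffer of `n` elements without zeroing it first.
/// Every slot is written exactly once before the length is set.
fn build_buffer(n: usize) -> Vec<u32> {
    let mut v: Vec<u32> = Vec::with_capacity(n);
    // The allocated-but-uninitialized tail, exposed as MaybeUninit slots.
    let spare: &mut [MaybeUninit<u32>] = v.spare_capacity_mut();
    // The capacity may exceed n, so only the first n slots are used.
    for (i, slot) in spare.iter_mut().take(n).enumerate() {
        slot.write(i as u32); // placeholder value: the index itself
    }
    // SAFETY: the first n elements were initialized above and n <= capacity.
    unsafe { v.set_len(n) };
    v
}

fn main() {
    let buffer = build_buffer(8);
    println!("{:?}", buffer);
}
```

The unsafe block is sound here only because set_len comes after every element has been written, which is exactly the condition the set_len docs ask for.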


Rust procedural macros are a Turing-complete language that permits you to do at compile-time what would otherwise need to be done during initialization at run-time§. They even permit you to compile another language, provided that the other language is lexeme-compatible with Rust and never has legal isolated instances of Rust's parentheses/bracket/brace lexemes, which are always paired in legal Rust: ( ), [ ], { }.

§ They actually permit you to modify the AST (abstract syntax tree) of the invoking program.

To be honest, I've studied The Book and Rust by Example together, but I stopped just before chapter 19, Advanced Features. I'm going to read at least an introduction to macros. That said, I must admit that The Book is probably the best language documentation I've ever read. It seems that the authors often successfully guess the reader's question just after it comes to mind, and answer it just in time!

Rust actually has two types of macros. The earlier-specified type is macro_rules!(…) macros, which are themselves Turing-complete, as the last section of The Little Book of Rust Macros demonstrates.
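A tiny taste of what macro_rules! can compute through recursion, as a sketch only: the count! macro and the constant THREE are made up here, and this is far from a demonstration of Turing-completeness.

```rust
/// Counts its token-tree arguments by recursing on the tail.
macro_rules! count {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count!($($tail)*) };
}

/// The expansion is a constant expression, so it can initialize a const.
const THREE: usize = count!(a b c);

/// Small wrapper so the value can be inspected at run time.
fn three() -> usize {
    THREE
}

fn main() {
    println!("count!(a b c) = {}", three());
}
```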

Procedural macros are more recently stabilized and are more like conventional programming. As such, they are more capable, but tend to increase compile time more than the macro_rules! macros.
