Why does putting an array in a Box cause stack overflow?

If I understand correctly this:

    let mut a = [0; 10_000];   

gets me an array of 10,000 elements on the stack.
And this:

    let mut a = Box::new([0; 10_000]);   

gets me 10,000 elements allocated on the heap.

So I was surprised that this:

    let mut a = Box::new([0; 10_000_000]);

Gets me a stack overflow:

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

What am I not understanding here?

1 Like

The zeroed array is created on the stack, then moved to the heap via the Box constructor.

See Stack overflow with Boxed array.

Wow, thanks.

Seems backwards to me. But this gets me going again:

let mut a = vec![0; 10_000_000].into_boxed_slice();

Hmmm... actually no, it does not get me going again.

What I actually wanted was a couple of two dimensional arrays 1024 * 1024. But on the heap.

    let mut a = [[0; WIDTH]; HEIGHT];
    let mut b = [[0; WIDTH]; HEIGHT];

Fiddling with the suggestions in the rust issues linked to I cannot get anything to give be that.

Every two-dimensional array is a one-dimensional array with index operations not just translating to a simple offset, but to a multiplication and addition operation to determine the actual offset. For a two-dimensional array with a total size of A * B, where A designates the size of the first dimension and B designates the size of the second dimension, accessing the element [x][y] is equal to accessing the element [x * B + y].

1 Like

You might be able to build what you want if you don’t mind an unsafe block and initializing the memory as all zeroes.

https://github.com/Axelderan/componentile/blob/3cdf30215c9be1cc7541ac906e6635e4d8f4975a/src/lib.rs#L140

World is a struct of many 4D arrays with nearly arbitrary array content types. So this should work for most cases.

1 Like

If you're really set on the 2D array approach perhaps try something like this: Rust Playground
It's a bit ugly, but you could probably wrap it in a nicer interface.

As an old time assembler programmer I am well aware of that. I really did not want to do that, its seem like a horrible band-aid in a high level language where we have a syntax for defining and indexing multiple dimensional arrays.

After tinkering around I found I can get what a want with a little sprinkling of 'unsafe':

type T = [[i64; WIDTH]; HEIGHT];

fn main() {

    let mut a = unsafe {
        let a = Box::<T>::new_uninit();
        let mut a = a.assume_init();
        for i in 0..a.len() {
            for j in 0..a[i].len() {
                a[i][j] = 0;
            }
        }
        a
    };

Which is kind of interesting.

Hope somebody has a better idea.

Would this work?

Allocating with zeroes as I suggested above might be safer in some cases. It also might not be. But zero is pretty much always at least a defined variant of an enum. It might not be logically correct to leave it zeroed but it will at least be an actual value your program can handle. Same with integers, characters, etc. Null pointers would be the biggest issue.

But that’s just if you for some reason didn’t follow up with properly initializing the memory after the zero init. Your strategy above has the same downside if not worse because the memory might have arbitrary bits in it.

Sadly this:

let mut a: Box<[Box<[i64]>]> = vec![vec![0i64; WIDTH].into_boxed_slice(); HEIGHT].into_boxed_slice();

Is a Box of Boxes. Which is not a huge contiguous glob of memory like a C style 2D array. If I understand correctly. Normally I would not mind but I set my heart on it for this exercise.

BINGO! Yes it does. Excellent!

Like so:

Did you read my suggestion? I'm quite sure it's simpler to understand and extends better as the dimensionality increases. It uses unsafe, but is fine because it doesn't actually try to use the value till it's completely initialized.

Obviously you'll have to adapt some of the type names and such because I excerpted it directly from some of my own code:

                unsafe {
                    let layout = std::alloc::Layout::new::<World>();
                    let ptr = std::alloc::alloc_zeroed(layout) as *mut World;
                    Box::from_raw(ptr)
                }

Indeed I did. Thanks.

Are you just averse to unsafe? I'm just curious because the iterator solution has mighty many parts to keep track of just to produce a 2D array of zeroes on the heap.

Not at all. I have been living in 'unsafe' world for decades in C/C++ and other languages. If anything I'm averse to all this unintelligible function style with it's iterators and maps and lambda functions etc.

But hence this whole exercise...

I was reading Steve Klabnik's blog about the C abstract machine: "C is not how the computer works" can lead to inefficient code there he links to an example from Intel about writing code that is cache friendly: How to Use Loop Blocking to Optimize Memory Use on 32-Bit Intel®... Which shows this example in C:

float A[MAX, MAX], B[MAX, MAX];
for (i=0; i< MAX; i+=block_size) {
  for (j=0; j< MAX; j+=block_size) {
    for (ii=i; ii<i+block_size; ii++) {
      for (jj=j; jj<j+block_size; jj++) {
        A[ii,jj] = A[ii,jj] + B[jj, ii];
      }
    }
}
}

Well, I thought, how would those kool kids using iterators and such do this in Rust? Time to give it a try, just as a learning exercise. Hence the code in the playground above.

I learned more than I ever imagined!

3 Likes

Ah, thanks for the context. I figured there was something else I wasn’t aware of going on.

Those look like interesting links.

Great! I have somewhat of a PTSD from having tried to explain array-to-pointer decay in C to too many people on Stack Overflow, so I constantly have the "you know the size of every dimension except the outermost one" idea in the back of my mind.

There is just one thing that niggles me about this solution:

I have the type T defined for convenience at the top of the file.

type T = [[i64; WIDTH]];

I have functions that take Ts as parameters:

fn do_it_1 (a: &mut T, b: &T)

I create values of T using:

    let mut a = vec![[0; WIDTH]; HEIGHT];

An there is my niggle. Nowhere in that declaration does it indicate that I am creating a T.
It would be nice to explicitly state that 'a' is a T.

But I can't use 'a:Vec' because the size is unknown.

Oh cool! Someday I will write part 3...

1 Like