Ndarray, stack and heap memory, and overhead


#1

As far as I know, Rust’s arrays are stack-based and have a fixed length. Vectors, by contrast, hold their data on the heap, have a variable length, and add some overhead compared to plain arrays. I hope I got it right up to here.
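To make the distinction concrete, here is a minimal sketch using only the standard library. The function names are made up for illustration. It shows that an array value stores its elements inline, wherever the array itself lives, while a Vec is a small handle that points to a separate heap allocation. The three-word size of the handle is what current compilers produce, not a documented guarantee.

```rust
use std::mem::size_of;

/// Size in bytes of the array value itself: the 16 elements are stored inline.
fn inline_array_bytes() -> usize {
    size_of::<[u8; 16]>()
}

/// Size in bytes of the Vec handle only (pointer, capacity, length).
/// The elements live in a separate heap allocation and are not counted here.
fn vec_handle_bytes() -> usize {
    size_of::<Vec<u8>>()
}

/// An array is only "on the stack" when it is a local variable.
/// Boxing it moves the same fixed-size array to the heap.
fn boxed_array_len() -> usize {
    let on_heap: Box<[u8; 16]> = Box::new([0u8; 16]);
    on_heap.len()
}
```

So "stack-based" is a property of where the array is placed rather than of the array type itself.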

I have been wondering whether the Arrays from ndarray are more similar to standard arrays or to vectors, and whether they introduce overhead. I could not find an answer in the ndarray documentation or elsewhere.

What makes me suspicious is that the dimensions of an Array from ndarray can be decided when the array is initialized, rather than being encoded in its type (as they are for standard arrays). I suspect that would require holding the data on the heap (though I don’t know much about memory models).

On the practical side: in a performance-critical scenario, should I avoid ndarray and rely solely on standard arrays?


#2

The owned Array always uses Vec, as it’s defined like this:

pub type Array<A, D> = ArrayBase<OwnedRepr<A>, D>;
pub struct OwnedRepr<A>(Vec<A>);

On the other hand, ArrayView and ArrayViewMut can be constructed from any slice, which you could create from a local array on the stack if you like. But that’s your choice, not ndarray's doing.
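As a rough illustration of what "a view over a slice" means, here is a sketch that uses only the standard library. `View2` and its methods are hypothetical names invented for this example. This is not ndarray's implementation, only the general idea of borrowing existing storage and attaching a shape to it.

```rust
/// Hypothetical 2-D view: borrows a slice and adds a shape on top.
/// Not ndarray's actual implementation, just an illustration of the idea.
struct View2<'a> {
    data: &'a [f64],
    rows: usize,
    cols: usize,
}

impl<'a> View2<'a> {
    /// Returns None if the slice length does not match rows * cols.
    fn from_shape(rows: usize, cols: usize, data: &'a [f64]) -> Option<Self> {
        if rows.checked_mul(cols)? == data.len() {
            Some(View2 { data, rows, cols })
        } else {
            None
        }
    }

    /// Row-major lookup; None when the index is out of bounds.
    fn get(&self, r: usize, c: usize) -> Option<f64> {
        if r < self.rows && c < self.cols {
            Some(self.data[r * self.cols + c])
        } else {
            None
        }
    }
}

/// The backing storage is a plain local array; the view only borrows it,
/// so nothing here allocates on the heap.
fn element_from_stack_array() -> Option<f64> {
    let storage = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]; // local [f64; 6]
    let view = View2::from_shape(2, 3, &storage)?;
    view.get(1, 2)
}
```

In ndarray itself the corresponding step would be building an ArrayView from `&storage`; the view never owns or copies the data, so where the data lives is entirely up to the caller.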


#3

Note that Rust’s array type isn’t necessarily faster than anything else.

There are several trade-offs, like ability to have partially-initialized arrays, cache locality, risk of overflowing the stack, optimizer’s ability to remove bounds checks, etc.

If you create lots of 2-element arrays in a loop, then the built-in array type will be faster. OTOH if you create a million-element array once, then it may be too large for the stack, and it may be cheaper to let Vec allocate lazily zeroed pages than to have memset run on the stack.
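The two cases can be sketched with the standard library alone. The function names are made up for illustration. The comment about zeroed allocation describes what current implementations typically do for `vec![0; n]`; it is an implementation detail, not a guarantee.

```rust
/// Many tiny fixed-size arrays in a loop: no heap allocation is involved,
/// and the optimizer is free to keep each pair in registers.
fn sum_small_arrays(n: u32) -> u64 {
    let mut total = 0u64;
    for i in 0..n {
        let pair = [i, i + 1]; // a [u32; 2] local
        total += u64::from(pair[0]) + u64::from(pair[1]);
    }
    total
}

/// One large zeroed buffer on the heap. On current implementations this
/// typically requests already-zeroed memory from the allocator, which the
/// OS may hand out lazily, instead of writing every byte up front.
fn big_zeroed_buffer(len: usize) -> Vec<u8> {
    vec![0u8; len]
}
```

Calling `big_zeroed_buffer(1_000_000)` is fine, whereas a local `[0u8; 1_000_000]` takes a large chunk of the stack, and bigger arrays can overflow it, depending on the thread's stack size.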


#4

Aha! Thanks, that’s very helpful. I even went as far as looking into ndarray's sources and found

pub type Array<A, D> = ArrayBase<OwnedRepr<A>, D>;

but I had no idea OwnedRepr was just a wrapper for Vec!


#5

What are you using these arrays for? ndarray isn’t meant for small arrays like the kind you’d be using in games and graphics and stuff (it probably sucks at that), it’s meant for big arrays, like the kind you’d use in statistics or whatever.

ndarray has a bunch of methods (componentwise operations, matrix multiplication, etc.) that are way faster than what you’d naively write with arrays, so if you need the functions that ndarray has, it’s probably the way to go in a “performance-critical scenario”.


#6

That’s interesting, I happened to stumble into both use cases. Is there a rule of thumb to tell roughly when something is too big for the stack?

I happened to use it in games. In some cases it was for small arrays, and it’s now clear that was not a good idea. To hold a world-map kind of structure, though, I could actually use a large-to-very-large array (say 1000x1000 or larger). Would ndarray be suited for that task, or are there better alternatives anyway?