In general, is there a preferred way to generate a vector: map or a for loop? For example, when creating a vector of vectors, I could use either approach.
The iterator solution might have fewer bounds checks on the vector, but I think either is fine.
You can eliminate the extra bounds checks in the loop like this:
let mut result = vec![vec![0.0; n]; n];
for (idx, row) in result.iter_mut().enumerate() {
    for (jdx, cell) in row.iter_mut().enumerate() {
        *cell = some_function(idx, jdx);
    }
}
I think many here would say the 'map' solution is more idiomatic Rust.
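For comparison, here is a minimal sketch of what the 'map' version might look like. The body of some_function is a hypothetical stand-in (the thread never shows it), and build_grid is a name made up for this example.

```rust
// Hypothetical stand-in for the `some_function` used in the thread.
fn some_function(i: usize, j: usize) -> f64 {
    (i * 10 + j) as f64
}

// Build an n-by-n grid with nested `map` calls: no pre-filled zeros
// and no explicit indexing into the result.
fn build_grid(n: usize) -> Vec<Vec<f64>> {
    (0..n)
        .map(|i| (0..n).map(|j| some_function(i, j)).collect())
        .collect()
}
```

The return type drives inference for both collect calls, so no turbofish is needed here. In other contexts you may have to annotate the target collection.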
Personally I vote for Alice's 'for' loop solution. It says what it does more clearly, and does not have the conceptual overhead of 'map', lambdas, and 'collect' to think about, or all those noisy type annotations in the way.
Alice probably has the best answer (readability/performance), but I tend to prefer functional solutions (e.g. ones that don't mutate *cell) if at all possible. (I have no good reason to; I just do.)
BUT, one thing I'd say is that Vec<Vec<T>> is less efficient than a Vec<T> where the width and height (row/col, idx/jdx) are tracked externally. I do a lot of image processing, and we rarely write a double for loop. We generally advance a cursor by a single stride offset (since, if I'm brightening all pixels, I don't CARE what the width/height is; I just want to hit every element in the array). If I force a 2D structure like this (AND require a double indirection), I'm introducing an unnecessary inefficiency at each row boundary.
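To make the brightening example concrete, here is a rough sketch assuming 8-bit pixels in a flat, row-major buffer. The function names and the saturating add are my own choices for illustration, not something from a particular image library.

```rust
// Brighten every pixel in a flat 8-bit image buffer.
// Width and height are never consulted: each element gets the same update.
fn brighten(pixels: &mut [u8], delta: u8) {
    for p in pixels.iter_mut() {
        *p = p.saturating_add(delta);
    }
}

// The same operation on nested rows needs an outer loop that hops
// from one row's allocation to the next.
fn brighten_nested(rows: &mut [Vec<u8>], delta: u8) {
    for row in rows.iter_mut() {
        for p in row.iter_mut() {
            *p = p.saturating_add(delta);
        }
    }
}
```

Both produce the same pixel values; the flat version just walks one contiguous slice.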
From that personal bias, I tend to make all my 2D, 3D, and 4D arrays a 1D array, and the above problem becomes moot. You can always have a fn get(&self,x:usize,y:usize) -> T { self.data[y*self.WIDTH+x] } call to provide an easy-to-use API. Similarly, you can easily produce iterators over a single row. BUT you could also make an iterator over a single COLUMN as well (something that is not efficient to do with a Vec<Vec<T>>).
I think my bias came from doing image processing in Java, where 2D arrays are basically the same as Vec<Vec<T>>, unlike C/C++, where a 2D array can be one contiguous block with computed offsets. It happens to suit my Rust code just fine.
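A minimal sketch of that idea, assuming row-major storage and T: Copy so that get can return by value. The Grid type, from_fn, row, and column are illustrative names, not an existing crate's API.

```rust
// A 2D grid stored as one flat, row-major Vec.
struct Grid<T> {
    width: usize,
    height: usize,
    data: Vec<T>,
}

impl<T: Copy> Grid<T> {
    // Fill the grid by calling `f(x, y)` for every cell, row by row.
    fn from_fn(width: usize, height: usize, mut f: impl FnMut(usize, usize) -> T) -> Self {
        let mut data = Vec::with_capacity(width * height);
        for y in 0..height {
            for x in 0..width {
                data.push(f(x, y));
            }
        }
        Grid { width, height, data }
    }

    // Element at column `x`, row `y`; panics if out of range.
    fn get(&self, x: usize, y: usize) -> T {
        assert!(x < self.width && y < self.height);
        self.data[y * self.width + x]
    }

    // A row is one contiguous slice of the buffer.
    fn row(&self, y: usize) -> impl Iterator<Item = T> + '_ {
        self.data[y * self.width..(y + 1) * self.width].iter().copied()
    }

    // A column is every `width`-th element, starting at offset `x`.
    fn column(&self, x: usize) -> impl Iterator<Item = T> + '_ {
        assert!(x < self.width);
        self.data.iter().skip(x).step_by(self.width).copied()
    }
}
```

For example, Grid::from_fn(3, 2, |x, y| y * 10 + x) builds a 3-wide, 2-high grid whose row(1) yields 10, 11, 12 and whose column(1) yields 1, 11.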
As always with μoptimizations like this, it will depend greatly on the situation. Something with a big body, like .map(fs::read_to_string) probably will show no difference at all.
But for something with a trivial body, it can matter. I tossed together this (silly) benchmark:
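// Requires a nightly toolchain: `#[bench]` and `test::Bencher` are unstable.
#![feature(test)]
extern crate test;
use std::hint::black_box;
use test::Bencher;

#[bench]
fn bench_array_map_add_pi(b: &mut Bencher) {
    let mut a: [f32; 64] = black_box(core::array::from_fn(|i| i as _));
    b.iter(|| {
        a = black_box(a.map(|x| x + 3.14));
    });
}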
And it said 3.3× faster:
BEFORE:
test bench_array_map_add_one ... bench: 57 ns/iter (+/- 0)
AFTER:
test bench_array_map_add_one ... bench: 17 ns/iter (+/- 0)
I don't know if your example is just a coincidence, but if your actual use case is a dynamically sized two-dimensional array, you will get the biggest performance benefit by using a single allocation instead of nested vectors. You might want to have a look at ndarray.
To add to this, a Vec<Vec<T>> can have differently sized "rows", whereas with a 1D structure you know everything is as it should be if the length matches width * height.
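As a small sketch of that difference (both helper names are made up for this example): validating a nested vector means visiting every row, while the flat layout needs a single comparison.

```rust
// True when every row of a nested vector has the same length.
fn is_rectangular<T>(rows: &[Vec<T>]) -> bool {
    match rows.first() {
        Some(first) => rows.iter().all(|r| r.len() == first.len()),
        None => true,
    }
}

// For a flat buffer, one comparison is enough.
// `checked_mul` guards against overflow in width * height.
fn flat_shape_ok<T>(data: &[T], width: usize, height: usize) -> bool {
    width.checked_mul(height) == Some(data.len())
}
```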