Should I use map or a for loop?

ksb1871 · February 6, 2023, 5:13pm

In general, is there a preferred way to generate a vector: map or a for loop? For example, if I wanted to create a vector of vectors, I could do

(0..n).map(|idx| (0..n).map(|jdx| some_function(idx,jdx)).collect::<Vec<f32>>()).collect::<Vec<Vec<f32>>>();

or I could

let mut result = vec![vec![0.0;n];n];
for idx in 0..n {
    for jdx in 0..n {
        result[idx][jdx] = some_function(idx,jdx);
    }
}

Is there a difference in performance, or is one of them used more commonly?

Thanks

alice · February 6, 2023, 5:16pm

The iterator solution might have fewer bounds checks on the vector, but I think either is fine.

You can eliminate the extra bounds checks in the loop like this:

let mut result = vec![vec![0.0;n];n];
for (idx, row) in result.iter_mut().enumerate() {
    for (jdx, cell) in row.iter_mut().enumerate() {
        *cell = some_function(idx,jdx);
    }
}

ZiCog · February 6, 2023, 5:25pm

I think many here would say the 'map' solution is more idiomatic Rust.

Personally I vote for Alice's 'for' loop solution. It says what it does more clearly and does not have the conceptual overhead of 'map', lambdas, and 'collect' to think about, or all those noisy type annotations in the way.
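
For comparison, here is a minimal sketch of the 'map' version with most of the type annotations removed. It assumes a stand-in some_function (hypothetical, the question never defines it) and relies on the function's return type to tell both collect calls what to build, so no turbofish is needed.

```rust
// Hypothetical stand-in for the `some_function` from the question.
fn some_function(idx: usize, jdx: usize) -> f32 {
    (idx * 10 + jdx) as f32
}

// Builds an n-by-n table with nested `map` calls. The declared return type
// drives inference for the outer `collect` (Vec<Vec<f32>>) and, through it,
// for the inner `collect` (Vec<f32>).
fn build_with_map(n: usize) -> Vec<Vec<f32>> {
    (0..n)
        .map(|idx| (0..n).map(|jdx| some_function(idx, jdx)).collect())
        .collect()
}
```

Whether this reads better than the nested loop is still a matter of taste; the sketch only shows that the annotations can usually be moved to a single place.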

maraist · February 6, 2023, 6:58pm

alice probably has the best answer (readability/performance), but I tend to prefer functional solutions (e.g. ones that don't mutate *cell) if at all possible. (I have no good reason to; I just do.)

BUT, one thing I'd say is that Vec<Vec<T>> is less efficient than a Vec<T> where the width and height (row/col, idx/jdx) are tracked externally. I do a lot of image processing, and we rarely write a double for loop. We generally step a single stride offset from a cursor (if I'm brightening all pixels, I don't CARE what the width/height is; I just want to hit every element in the array). If I force a 2D structure like this (AND require a double indirection), I'm introducing an unnecessary inefficiency at each row boundary.

From that personal bias, I tend to make all my 2D, 3D, and 4D arrays a 1D array, and the above problem becomes moot. You can always have a fn get(&self, x: usize, y: usize) -> T { self.data[y * self.width + x] } call to provide an easy-to-use API. Similarly, you can easily produce iterators over a single row. BUT you could also make an iterator over a single COLUMN as well (something not efficient to perform with a Vec<Vec<T>>).
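
A rough sketch of that idea, assuming a hypothetical Grid type (the name, the generate constructor, and the row/column helpers are all made up for illustration, not taken from any library). It stores cells row-major in one Vec and requires T: Copy so that get can return values.

```rust
// Hypothetical flat 2D grid: one allocation, cell (x, y) lives at y * width + x.
struct Grid<T> {
    width: usize,
    height: usize,
    data: Vec<T>,
}

impl<T: Copy> Grid<T> {
    // Fills the grid by calling `f(x, y)` for every cell in row-major order.
    fn generate(width: usize, height: usize, mut f: impl FnMut(usize, usize) -> T) -> Self {
        assert!(width > 0 && height > 0, "empty grids are not supported in this sketch");
        let data = (0..width * height).map(|k| f(k % width, k / width)).collect();
        Grid { width, height, data }
    }

    // Returns a copy of the cell at column `x`, row `y`.
    fn get(&self, x: usize, y: usize) -> T {
        assert!(x < self.width && y < self.height);
        self.data[y * self.width + x]
    }

    // Iterates over one row: a contiguous slice of the backing Vec.
    fn row(&self, y: usize) -> impl Iterator<Item = T> + '_ {
        self.data[y * self.width..(y + 1) * self.width].iter().copied()
    }

    // Iterates over one column: every `width`-th element, starting at `x`.
    fn column(&self, x: usize) -> impl Iterator<Item = T> + '_ {
        assert!(x < self.width);
        self.data.iter().skip(x).step_by(self.width).copied()
    }
}
```

Operations that touch every cell, such as brightening all pixels, can iterate over data directly and ignore the dimensions entirely.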

I think my bias came from doing image processing in Java, where 2D arrays are basically the same as Vec<Vec<T>>, unlike C/C++ where you can make them a single computed value. It happens to suit my Rust code just fine.

jendrikw · February 6, 2023, 8:44pm

If n is const, you can use std::array::from_fn:

let res: [[f32; N]; N] = std::array::from_fn(|idx|
    std::array::from_fn(|jdx| some_function(idx,jdx))
);
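
Here is a self-contained version of the same idea, with N fixed to 3 and a hypothetical some_function filled in purely for illustration. Note that the result is a plain array rather than a Vec, so it is stored inline (on the stack when it is a local variable), which may be a concern for large N.

```rust
// Compile-time size; the value 3 is an arbitrary choice for this sketch.
const N: usize = 3;

// Hypothetical stand-in for the `some_function` from the question.
fn some_function(idx: usize, jdx: usize) -> f32 {
    (idx * N + jdx) as f32
}

// Builds an N-by-N array without any heap allocation. The outer `from_fn`
// produces rows; the inner one produces the cells of each row.
fn build_array() -> [[f32; N]; N] {
    std::array::from_fn(|idx| std::array::from_fn(|jdx| some_function(idx, jdx)))
}
```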

emilHof · February 6, 2023, 9:06pm

That's super nice! Did not know this function existed in std!

One interesting thing I found is how from_fn() is implemented

pub fn from_fn<T, const N: usize, F>(mut cb: F) -> [T; N]
where
    F: FnMut(usize) -> T,
{
    let mut idx = 0;
    [(); N].map(|_| {
        let res = cb(idx);
        idx += 1;
        res
    })
}

It turns out you can map an array of one type to an array of another type without unsafe code! Neat!
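
A small illustration of mapping an array to a different element type, here from u32 to String. The labels function and its formatting are made up for the example.

```rust
// Converts an array of numbers into an array of strings of the same length.
// `map` consumes the input array and returns `[String; 3]`; no Vec is involved.
fn labels() -> [String; 3] {
    let numbers: [u32; 3] = [1, 2, 3];
    numbers.map(|n| format!("#{n}"))
}
```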

jendrikw · February 6, 2023, 9:08pm

Yes, map on arrays was stabilized in 1.55. Pretty awesome! See the std docs: array - Rust

scottmcm · February 6, 2023, 9:09pm

Well, there's a bunch of unsafe underneath it, of course.

If you're curious about implementation details, I've got a PR open drastically changing it to work better: https://github.com/rust-lang/rust/pull/107634

emilHof · February 6, 2023, 11:03pm

Oh that's really cool!

How drastic are the speed ups? Is there a huge difference in runtime? The asm definitely looks a lot cleaner!

scottmcm · February 7, 2023, 2:30am

As always with μoptimizations like this, it will depend greatly on the situation. Something with a big body, like .map(fs::read_to_string) probably will show no difference at all.

But for something with a trivial body, it can matter. I tossed together this (silly) benchmark:

#[bench]
fn bench_array_map_add_pi(b: &mut Bencher) {
    let mut a: [f32; 64] = black_box(core::array::from_fn(|i| i as _));
    b.iter(|| {
        a = black_box(a.map(|x| x + 3.14));
    });
}

And it said 3.3× faster:

BEFORE:
test bench_array_map_add_one               ... bench:      57 ns/iter (+/- 0)

AFTER:
test bench_array_map_add_one               ... bench:      17 ns/iter (+/- 0)

emilHof · February 7, 2023, 4:46am

Of course, but that is still an amazing speed up! I am awful at statistics, but this does seem like it's significant!

Bruecki · February 7, 2023, 9:40am

I don't know if your example is just a coincidence, but if your actual use case is a dynamically sized two-dimensional array, you will get the biggest performance benefit by using a single allocation instead of nested vectors. You might want to have a look at ndarray.

Raniz · February 13, 2023, 7:06pm

To add to this, a Vec<Vec<T>> can have differently sized "rows", whereas with a 1D structure you know everything is as it should be if the length matches width * height.
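
To make that concrete, here is a sketch under assumed names (Table, from_vec, and is_rectangular are all hypothetical). The flat version needs one length comparison at construction time, while the nested version has to inspect every row.

```rust
// Hypothetical row-major table backed by a single Vec.
#[derive(Debug)]
struct Table<T> {
    width: usize,
    height: usize,
    cells: Vec<T>,
}

impl<T> Table<T> {
    // Accepts the data only if its length is exactly width * height.
    // `checked_mul` returns None on overflow, which `?` propagates.
    fn from_vec(width: usize, height: usize, cells: Vec<T>) -> Option<Self> {
        if width.checked_mul(height)? == cells.len() {
            Some(Table { width, height, cells })
        } else {
            None
        }
    }
}

// For nested vectors, every row must be compared against the first one.
fn is_rectangular<T>(rows: &[Vec<T>]) -> bool {
    match rows.first() {
        Some(first) => rows.iter().all(|row| row.len() == first.len()),
        None => true, // no rows at all: trivially rectangular
    }
}
```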

