I have two slices and need to perform element-wise multiplication. The simplest, and probably the best, way is to iterate over both slices, map each pair to its product, and collect the result into a vector. For example:
Another, more complicated, way is to use MaybeUninit, as follows:
pub fn mul_maybeuinit(res: &mut [std::mem::MaybeUninit<u64>], a: &[u64], b: &[u64]) {
    // we can assume that the caller knows the exact length of result and has initialised a new vec with Vec::with_capacity
    res.iter_mut()
        .zip(a.iter().zip(b.iter()))
        .for_each(|(r, (va, vb))| {
            r.write(va * vb);
        })
}
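For context, here is a minimal sketch of how a caller might drive the function above. The helper name mul_into_vec is hypothetical and not from the post. It assumes the output is a fresh Vec allocated with Vec::with_capacity, and it uses Vec::spare_capacity_mut to borrow the uninitialised tail and an unsafe Vec::set_len to mark the written elements as initialised.

```rust
use std::mem::MaybeUninit;

// Same function as in the post, repeated so the sketch compiles on its own.
pub fn mul_maybeuinit(res: &mut [MaybeUninit<u64>], a: &[u64], b: &[u64]) {
    res.iter_mut()
        .zip(a.iter().zip(b.iter()))
        .for_each(|(r, (va, vb))| {
            r.write(va * vb);
        })
}

// Hypothetical helper (an assumption, not part of the post):
// allocates the output, fills it through MaybeUninit, and returns it.
pub fn mul_into_vec(a: &[u64], b: &[u64]) -> Vec<u64> {
    // zip stops at the shorter input, so that is how many products exist.
    let len = a.len().min(b.len());
    let mut out: Vec<u64> = Vec::with_capacity(len);

    // spare_capacity_mut exposes the allocated but uninitialised tail.
    // The real capacity may exceed `len`, so only the first `len` slots are passed on.
    mul_maybeuinit(&mut out.spare_capacity_mut()[..len], a, b);

    // SAFETY: res, a and b all have at least `len` elements, so the call above
    // wrote exactly the first `len` slots, and `len` does not exceed the capacity.
    unsafe { out.set_len(len) };
    out
}
```

The unsafe block is the price of this approach: the compiler cannot check that every slot was written before set_len, so that invariant has to be upheld by hand.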
Both versions seem to have the same performance, which makes me curious: how do they differ in how the operations in the closure are performed and how the data is allocated in memory?
pub fn mul_map<'a, A, B>(a: A, b: B) -> impl 'a + Iterator<Item = u64>
where
    A: 'a + IntoIterator<Item = &'a u64>,
    B: 'a + IntoIterator<Item = &'a u64>,
{
    a.into_iter()
        .zip(b.into_iter())
        .map(|(va, vb)| va * vb)
}
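As a rough illustration of how the iterator version above would be used to get a vector, here is a small sketch. The wrapper name mul_collect is hypothetical and not from the thread. Since zipping two slice iterators yields an iterator with an exact size hint, collect can typically reserve the whole output up front, which is likely part of why the two approaches perform similarly.

```rust
// Same function as above, repeated so the sketch compiles on its own.
pub fn mul_map<'a, A, B>(a: A, b: B) -> impl 'a + Iterator<Item = u64>
where
    A: 'a + IntoIterator<Item = &'a u64>,
    B: 'a + IntoIterator<Item = &'a u64>,
{
    a.into_iter()
        .zip(b.into_iter())
        .map(|(va, vb)| va * vb)
}

// Hypothetical wrapper (an assumption, not part of the thread):
// collects the lazily computed products into a Vec, with no unsafe code.
pub fn mul_collect(a: &[u64], b: &[u64]) -> Vec<u64> {
    // &[u64] implements IntoIterator<Item = &u64>, so slices can be passed directly.
    mul_map(a, b).collect()
}
```

Because mul_map returns a lazy iterator, nothing is multiplied or allocated until the caller consumes it, for example with collect or by extending an existing vector.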
I suspect both functions will behave more or less the same, then, though I have sometimes been surprised by the optimizations that Rust does or doesn't do. If in doubt, you can also check the generated assembly.