Idiomatic way to efficiently apply the same function to two vectors in a loop

I have to apply the same operation to two or more different vectors.
An example of such a function is below; it is very basic (multiplying a vector by a scalar in place).
I have other functions like this, but this one serves as an example.

    fn multiply_vector_in_place(inp: &mut [i32], scalar: i32) {
        for k in inp.iter_mut() {
            *k *= scalar;
        }
    }

Now I want to perform the same operation on two vectors, v1 and v2. For performance reasons I want to avoid looping over the vectors twice, so I do the following:

    fn multiply_vectors_in_place(inp: &mut [i32], scalar1: i32, inp2: &mut [i32], scalar2: i32) {
        // Note: zip stops at the end of the shorter slice.
        for (k1, k2) in inp.iter_mut().zip(inp2.iter_mut()) {
            *k1 *= scalar1;
            *k2 *= scalar2;
        }
    }

This works, but obviously this code is very ugly, not general, and not reusable.

Is there a clearer or more idiomatic way to do this?
Is there any performance benefit in combining loops this way? I am writing performance-critical code.

If your loops are really exactly this simple (multiplying by a scalar), then you are well into micro-optimization territory, where the performance depends greatly on all details of the real code and you cannot use general principles to find the best option. I would guess that combining the loops will usually not be beneficial (because you’re interleaving access to two different memory regions instead of just processing one at a time), but you really must run benchmarks of your real code, rather than asking people to guess on your behalf, because guessing is all we can do.
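As a minimal sketch of the "one vector at a time" alternative, the single-vector function from the question can be reused for any number of slices. The helper name `apply_to_each` and its signature are hypothetical and only for illustration; whether this is faster or slower than the combined loop is something only a benchmark of the real code can tell.

```rust
/// Multiplies every element of `inp` by `scalar`, in place (as in the question).
fn multiply_vector_in_place(inp: &mut [i32], scalar: i32) {
    for k in inp.iter_mut() {
        *k *= scalar;
    }
}

/// Hypothetical helper: applies `op` to each (slice, scalar) pair in turn,
/// finishing one slice completely before moving on to the next.
fn apply_to_each<F>(inputs: &mut [(&mut [i32], i32)], op: F)
where
    F: Fn(&mut [i32], i32),
{
    for (slice, scalar) in inputs.iter_mut() {
        // Reborrow the inner slice explicitly and copy the scalar out.
        op(&mut **slice, *scalar);
    }
}

fn main() {
    let mut v1 = vec![1, 2, 3];
    let mut v2 = vec![10, 20];
    // The slices may have different lengths; each one is processed in full.
    apply_to_each(
        &mut [(&mut v1[..], 2), (&mut v2[..], 3)],
        multiply_vector_in_place,
    );
    println!("{:?} {:?}", v1, v2);
}
```

Unlike the zipped version, this processes every element of both slices even when their lengths differ, and the same helper works with any other function of the same shape.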

If your actual operation requires running some non-trivial calculation per iteration, so that combining the loops lets you avoid doing that work twice, then combining the loops likely is a good idea. But, again, it depends on the actual operation you’re performing. You have to benchmark your actual code, and discuss your actual code, not a hypothetical, when it comes to performance of loops like this.
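To illustrate the case where combining loops can pay off, here is a sketch in which some per-index value is computed once and reused for both slices. The function `expensive_weight` is a made-up stand-in for whatever non-trivial calculation your real code does, and `scale_both_by_weight` is likewise a hypothetical name.

```rust
/// Hypothetical stand-in for a non-trivial per-index computation.
fn expensive_weight(i: usize) -> i32 {
    ((i * i) % 7) as i32 + 1
}

/// Combined loop: the weight is computed once per index and applied to both
/// slices. Note that zip stops at the end of the shorter slice.
fn scale_both_by_weight(a: &mut [i32], b: &mut [i32]) {
    for (i, (x, y)) in a.iter_mut().zip(b.iter_mut()).enumerate() {
        let w = expensive_weight(i);
        *x *= w;
        *y *= w;
    }
}

fn main() {
    let mut a = vec![1, 1, 1, 1];
    let mut b = vec![2, 2, 2, 2];
    scale_both_by_weight(&mut a, &mut b);
    println!("{:?} {:?}", a, b);
}
```

With two separate loops, `expensive_weight` would run twice per index (or its results would have to be stored somewhere), which is the duplicated work the combined loop avoids.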

Thank you for your response.
Indeed, in my case the code is rather simple, and when I benchmarked it in practice the improvements were within the noise threshold.