Let us look under the hood at the assembly code generated for iterating over a vector. We will see that the length of the vector is an important factor in whether the compiler vectorizes the loop. With vectorization, a single instruction operates on several elements at once. Finally, we will experiment with Compiler Explorer to see how the compiler unrolls loops and uses vector instructions to improve performance.
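To make the idea of "multiple operations per instruction" concrete, here is a hand-written sketch of the shape a vectorized loop takes. The function name increment_by_chunked and the lane count of four are assumptions for illustration (four i64 values fit in one 256-bit register); the compiler performs a comparable transformation on its own and does not need the source written this way.

```rust
/// Adds `num` to every element, handling four elements per loop iteration.
/// Illustrative only: this mimics the shape of an unrolled, vectorized loop.
pub fn increment_by_chunked(num: i64, list: &mut [i64]) {
    // Assumed lane count: four i64 values fill one 256-bit vector register.
    const LANES: usize = 4;

    let mut chunks = list.chunks_exact_mut(LANES);
    for chunk in &mut chunks {
        // Four independent additions; a SIMD instruction can perform
        // all of them at once.
        chunk[0] += num;
        chunk[1] += num;
        chunk[2] += num;
        chunk[3] += num;
    }

    // Scalar tail: elements left over when the length is not a multiple
    // of LANES are processed one at a time.
    for item in chunks.into_remainder() {
        *item += num;
    }
}
```

The scalar tail is the reason the length matters: a list shorter than one full chunk never enters the four-at-a-time loop in this sketch.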
We will look at the generated assembly for the following code:
pub fn increment_by(num: i64, list: &mut Vec<i64>) {
    for item in list {
        *item += num;
    }
}

pub fn increment_example(inc_by: i64, array: [i64; 4]) -> Vec<i64> {
    let mut list = array.to_vec();
    increment_by(inc_by, &mut list);
    list
}
I have changed the rustc version from 1.66.0, the version used in the article, to 1.81.0.
Here are a few observations:
The compiler is now using SIMD regardless of the length of the array.
No ud2 instruction is generated for error handling.
The code generated for increment_example_array_slice is a lot simpler than the code generated for increment_example.
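The source of increment_example_array_slice is not reproduced in this section, so the following is only a hypothetical sketch, under a different name, of why a variant that works on the array in place could compile to simpler code. Because it never creates a Vec, there is no heap allocation and therefore no allocation-failure path for the compiler to emit. The name increment_in_place_sketch and its signature are assumptions, not the article's code.

```rust
/// Hypothetical in-place variant: mutates the caller's array directly,
/// so no `Vec` is allocated and no allocation-failure handler is needed.
pub fn increment_in_place_sketch(inc_by: i64, array: &mut [i64; 4]) {
    for item in array.iter_mut() {
        *item += inc_by;
    }
}
```

Pasting a function like this into Compiler Explorer next to increment_example is one way to compare the two listings side by side.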
In the code generated by rustc 1.66.0, the error handler for memory allocation failure is followed by a ud2 instruction. I found the following discussion on the subject: