Map Rust vector iteration to assembly

Let us look under the hood at the assembly generated for iterating over a Vec. We will see that the length of the Vec is an important factor in whether the compiler vectorizes the loop. With vectorization (SIMD), the processor operates on multiple elements per instruction. Finally, we will experiment with the Compiler Explorer to see how the compiler unrolls loops and uses vector instructions to improve performance.

We will look at the generated assembly for the following code:

pub fn increment_by(num: i64, list: &mut Vec<i64>) {
    for item in list {
        *item += num;
    }
}

pub fn increment_example(inc_by: i64, array: [i64; 4]) -> Vec<i64> {
    let mut list = array.to_vec();
    increment_by(inc_by, &mut list);
    list
}
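As a rough mental model of what "unrolling" means here, the loop can be written by hand so that each iteration handles a fixed-size chunk and a tail loop handles the leftovers. This is only a sketch, not what the compiler literally emits: the name increment_by_unrolled is hypothetical, it takes a slice rather than a Vec, and the chunk width of 4 is an assumption (four i64 values fit in one 256-bit register).

```rust
/// Hand-unrolled version of the increment loop (illustrative only).
/// The chunk width of 4 is an assumption, not something taken from
/// the generated assembly.
pub fn increment_by_unrolled(num: i64, list: &mut [i64]) {
    // Split the slice into chunks of exactly 4 elements.
    let mut chunks = list.chunks_exact_mut(4);
    for chunk in &mut chunks {
        // Four independent additions: a shape the optimizer can
        // turn into a single vector add when the target allows it.
        chunk[0] += num;
        chunk[1] += num;
        chunk[2] += num;
        chunk[3] += num;
    }
    // Tail loop for the 0 to 3 elements that did not fill a chunk.
    for item in chunks.into_remainder() {
        *item += num;
    }
}
```

The result is the same as the simple loop for any length; only the shape of the loop changes.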

This is always better as &mut [i64]. Don't take &mut Vec as a parameter like this when you're not changing the length of the Vec. A &mut Vec<i64> coerces to &mut [i64] at the call site, so callers lose nothing.

This problem is why you ended up with the .to_vec() in

pub fn increment_example(inc_by: i64, array: [i64; 4]) -> Vec<i64> {
    let mut list = array.to_vec();
    increment_by(inc_by, &mut list);
    list
}

rather than being able to -> [i64; 4].

Also, this is wrong:

If the memory allocation fails, the function throws an exception using the ud2 instruction.

That ud2 is coming from -Z trap-unreachable. It marks code that is never supposed to execute, and isn't actually hit.


The article used a contrived example to illustrate the code generation for &mut Vec<i64>.

You are right that in practice one should use &mut [i64] when the length is not being modified.
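To make the practical benefit concrete, here is a small sketch showing that a function taking &mut [i64] can be called with a Vec, a fixed-size array, or just part of a Vec. The names increment_by_slice and demo are hypothetical and used only for illustration.

```rust
/// Same loop as in the article, but taking a slice.
pub fn increment_by_slice(num: i64, list: &mut [i64]) {
    for item in list {
        *item += num;
    }
}

/// Hypothetical demo: one function, three kinds of callers.
pub fn demo() -> (Vec<i64>, [i64; 4], Vec<i64>) {
    // A Vec: &mut Vec<i64> deref-coerces to &mut [i64].
    let mut v = vec![1, 2, 3, 4];
    increment_by_slice(10, &mut v);

    // A fixed-size array: &mut [i64; 4] coerces to &mut [i64],
    // so no .to_vec() and no heap allocation are needed.
    let mut a = [1, 2, 3, 4];
    increment_by_slice(10, &mut a);

    // Only part of a Vec: not expressible with a &mut Vec parameter.
    let mut partial = vec![1, 2, 3, 4];
    increment_by_slice(10, &mut partial[1..3]);

    (v, a, partial)
}
```

With a &mut Vec<i64> parameter, the second and third calls would each need a temporary Vec first.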

I was curious about the difference in the code generation in the two cases. I generated the assembly for the following code:

pub fn increment_by(num: i64, list: &mut Vec<i64>) {
    for item in list {
        *item += num;
    }
}

pub fn increment_example(inc_by: i64, array: [i64; 4]) -> Vec<i64> {
    let mut list = array.to_vec();
    increment_by(inc_by, &mut list);
    list
}

pub fn increment_by_array_slice(num: i64, list: &mut [i64]) {
    for item in list {
        *item += num;
    }
}

pub fn increment_example_array_slice(inc_by: i64, mut array: [i64; 4]) -> [i64; 4] {
    increment_by_array_slice(inc_by, &mut array);
    array
}

Please refer to the Compiler Explorer link.

I have changed the rustc version to 1.81.0 from the 1.66.0 version in the article.

Here are a few observations:

  • The compiler is now using SIMD regardless of the length of the array.
  • No ud2 instruction is generated for error handling.
  • The code generated for increment_example_array_slice is a lot simpler than the code generated for increment_example.
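The difference is only in the generated code, not in behavior. A quick sanity check, sketched below, repeats the four functions from this post and compares their results; the helper variants_agree is hypothetical and added only for this check.

```rust
pub fn increment_by(num: i64, list: &mut Vec<i64>) {
    for item in list {
        *item += num;
    }
}

pub fn increment_example(inc_by: i64, array: [i64; 4]) -> Vec<i64> {
    let mut list = array.to_vec();
    increment_by(inc_by, &mut list);
    list
}

pub fn increment_by_array_slice(num: i64, list: &mut [i64]) {
    for item in list {
        *item += num;
    }
}

pub fn increment_example_array_slice(inc_by: i64, mut array: [i64; 4]) -> [i64; 4] {
    increment_by_array_slice(inc_by, &mut array);
    array
}

/// Hypothetical helper: true when the Vec-based and the array-based
/// variants produce the same elements for the same input.
pub fn variants_agree(inc_by: i64, array: [i64; 4]) -> bool {
    let vec_result = increment_example(inc_by, array);
    let array_result = increment_example_array_slice(inc_by, array);
    vec_result.as_slice() == array_result.as_slice()
}
```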

In the code generated by rustc 1.66.0, the call to the error handler for memory allocation failure is followed by the ud2 instruction. I found the following discussion on the subject:
