Rust auto-vectorisation difference?

Because the two compiler versions come with a different standard library and a different LLVM version.

More instructions might be good or might be bad; it's hard to say from the assembly alone. The only way to know is to benchmark, ideally with Criterion.rs.
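For a rough idea of what such a measurement looks like, here is a minimal std-only sketch. It is not a substitute for Criterion.rs, which handles warm-up and statistical analysis for you. The `scale` kernel, `average_time`, and `bench_scale` are hypothetical names made up for illustration; swap in the function you actually care about.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical kernel standing in for the function under test.
pub fn scale(values: &mut [f32], factor: f32) {
    for v in values.iter_mut() {
        *v *= factor;
    }
}

/// Runs `f` `iters` times and returns the average wall-clock time per call.
pub fn average_time<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed() / iters
}

/// Times `scale` on a buffer of `len` elements.
pub fn bench_scale(len: usize, iters: u32) -> Duration {
    let mut data = vec![1.0_f32; len];
    average_time(iters, || {
        // black_box hides the inputs from the optimizer so the work
        // is not constant-folded or removed entirely.
        scale(black_box(&mut data), black_box(1.0001));
    })
}
```

Build in release mode and call something like `bench_scale(1 << 16, 1_000)` under each toolchain, then compare the results. Single runs are noisy, so treat small differences with suspicion.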

CAD97 made some great examples in the thread "Converting a BGRA &[u8] to RGB [u8;N] (for images)?" (post #13), showing that the shorter versions are sometimes way slower.

(That said, the newer one uses MULPS (Multiply Packed Single Precision Floating-Point Values) where the older one uses MULSS (Multiply Scalar Single Precision Floating-Point Values), so I bet the newer code is faster. You could also try optimizing for code size instead of speed, at which point newer Rust seems to generate code more like your 1.45 example: https://rust.godbolt.org/z/Kd5nddGez)
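To make the packed versus scalar distinction concrete, here is a small sketch of the kind of loop where it shows up. The `multiply` function is a hypothetical example, not the code from the question. Whether it compiles to MULPS (several floats per instruction) or MULSS (one float per instruction) depends on the compiler version and flags, so check the assembly rather than assume.

```rust
/// Element-wise product of two slices, written so the optimizer
/// has a good chance of vectorizing the loop.
pub fn multiply(a: &[f32], b: &[f32], out: &mut [f32]) {
    // Trim all three slices to a common length up front so the
    // bounds checks can be hoisted out of the loop.
    let n = a.len().min(b.len()).min(out.len());
    let (a, b, out) = (&a[..n], &b[..n], &mut out[..n]);
    for i in 0..n {
        out[i] = a[i] * b[i];
    }
}
```

Pasting this into Compiler Explorer with `-C opt-level=3` and then with `-C opt-level=s` or `-C opt-level=z` is a quick way to see how the generated code changes. Per the Cargo documentation, `z` also turns off loop vectorization.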
