This is not something you should ever worry about. For small, trivial types with no Drop glue, like u32 or Range<usize>, the optimizer will reliably remove them every time.
The core thing to think about is what you're assuming as a human: that the indexes will be in-bounds. The compiler, though, needs to handle all the cases where they might not be in-bounds; most importantly, it can't emit an out-of-bounds read on src.
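To illustrate (this is my sketch, not code from the question): in a naive loop, each src[i] carries its own bounds check that can panic partway through, which is exactly the case the compiler has to preserve and which tends to block vectorization.

```rust
// Naive sketch: `src[i]` must be bounds-checked on every
// iteration, and can panic mid-loop if `src` is shorter than
// `dst`, so LLVM can't simply turn this into one vector pass.
pub fn xor3_naive(dst: &mut [u8], src: &[u8]) {
    for i in 0..dst.len() {
        dst[i] ^= src[i]; // per-iteration check on `src`
    }
}

fn main() {
    let mut dst = [0b1010u8, 0b0110];
    let src = [0b0011u8, 0b0101];
    xor3_naive(&mut dst, &src);
    assert_eq!(dst, [0b1001, 0b0011]);
}
```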
For more information, here are two previous comments I've written about this:
- Rust auto-vectorisation difference? - #2 by scottmcm
- Understanding Rusts Auto-Vectorization and Methods for speed increase - #5 by scottmcm
So what you want is to say what you expect as part of the code.
The general strategy here is what I call re-slicing, which would look like this:
```rust
pub fn xor3(dst: &mut [u8], src: &[u8]) {
    let n = dst.len();
    // Check *before* looping that both are long enough,
    // in a way that makes it directly obvious to LLVM
    // that the indexing below will be in-bounds.
    let (dst, src) = (&mut dst[..n], &src[..n]);
    for i in 0..n {
        dst[i] ^= src[i];
    }
}
```
Which vectorizes as expected: https://rust.godbolt.org/z/sqhaoMbP7.
Note that this is subtly different from the zip approach. The zip approach is like using let n = std::cmp::min(src.len(), dst.len());, whereas the reslicing approach will still panic when dst.len() > src.len(): the panic just happens before the loop instead of inside it.
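For comparison, here's a zip-based sketch (mine, under the assumption above): iteration length is implicitly the shorter of the two slices, so it never panics, but a too-short src silently leaves the tail of dst untouched.

```rust
// Zip sketch: the loop runs min(dst.len(), src.len()) times,
// so no bounds check can fail, but a short `src` means part
// of `dst` is silently skipped rather than causing a panic.
pub fn xor3_zip(dst: &mut [u8], src: &[u8]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d ^= *s;
    }
}

fn main() {
    let mut dst = [0xFFu8, 0x0F, 0xF0];
    let src = [0x0Fu8, 0xFF]; // shorter: last dst byte untouched
    xor3_zip(&mut dst, &src);
    assert_eq!(dst, [0xF0, 0xF0, 0xF0]);
}
```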
Because of optimizers, adding more checks can actually make code faster.
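Another way to state the expectation up front (again my sketch, not from the post) is an explicit assert, which similarly hoists the check out of the loop; whether LLVM vectorizes it quite as cleanly as the reslicing version can vary, so check the generated assembly.

```rust
// Assert sketch: one up-front check establishes that every
// `src[i]` below is in-bounds for i < dst.len(), so the
// per-iteration bounds checks can be elided.
pub fn xor3_assert(dst: &mut [u8], src: &[u8]) {
    assert!(src.len() >= dst.len());
    for i in 0..dst.len() {
        dst[i] ^= src[i];
    }
}

fn main() {
    let mut dst = [1u8, 2, 3];
    let src = [3u8, 2, 1];
    xor3_assert(&mut dst, &src);
    assert_eq!(dst, [2, 0, 2]);
}
```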