Question about rustc aliasing analysis

So in C/C++ the standard says the compiler is allowed to assume a T* doesn't alias a U* if T and U are different types. That helps in general, but it means that if you have, say, two float*, the compiler has to assume they may alias.

Rust has &mut, which is guaranteed to be the only reference to an object at any given point in time, so the compiler can assume it doesn't alias anything, even when there are two of the same type.
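To make that concrete, here is a minimal sketch (the function name add_twice is made up for illustration). Because a and b are both &mut f32, they cannot point at the same float, so the optimizer is allowed to read *b once and reuse it. With two float* in C it would have to allow for the first store changing *b.

```rust
/// Adds `*b` to `*a` twice.
/// `a` and `b` are exclusive references, so they cannot refer to the same f32.
pub fn add_twice(a: &mut f32, b: &mut f32) {
    *a += *b;
    // The store above cannot have changed `*b`, so the compiler may reuse
    // the value it already loaded instead of reading `*b` again.
    *a += *b;
}
```

Calling it as add_twice(&mut x, &mut x) is rejected by the borrow checker, which is what makes the assumption valid in safe code.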

However Rust has a lot of types like Vec, Box, String etc that:

  • Use pointers internally not references
  • Refer to memory that they own exclusively

In other words, the internal pointers of one Vec<f32> instance will never in practice alias the internal pointers of another Vec<f32> instance. Is there anything about the Rust language design that helps the compiler figure this out?

I guess as a user, if I write code that's mutating two different Vecs, I am going to end up making mutable references. So is the compiler smart enough to work backwards from the fact that unsafe code internal to Vec made the mutable reference, and that it was derived from some internal pointer, and conclude that the pointers can't alias, just like the references can't alias? Or does this never matter because the mutation is only ever done through references? (ptr::write exists though...)

I wrote a little test.

In this case, the compiler isn't smart enough to figure out that changing one Vec doesn't affect the other (mutate_vectors).

But if you pass slice references then it can figure it out (mutate_slices).
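The test itself isn't reproduced in this thread, so the following is only a guess at its shape. The function names mutate_vectors and mutate_slices come from the post; the bodies and the i32 element type are my assumptions.

```rust
/// Takes two vectors by mutable reference. The two `&mut Vec` arguments are
/// known not to alias each other, but each one only points at the Vec's
/// (pointer, capacity, length) fields, not at the heap buffer itself.
pub fn mutate_vectors(a: &mut Vec<i32>, b: &mut Vec<i32>) -> i32 {
    a[0] = 1;
    b[0] = 2;
    // The optimizer may reload a[0] here, since it may not be able to prove
    // that the two heap buffers are distinct.
    a[0]
}

/// Takes the element storage directly. The `&mut [i32]` arguments point at
/// the elements themselves, so the non-aliasing guarantee covers them.
pub fn mutate_slices(a: &mut [i32], b: &mut [i32]) -> i32 {
    a[0] = 1;
    b[0] = 2;
    // Writing to b cannot have changed a[0], so this can be folded to 1.
    a[0]
}
```

Both functions return 1 at run time. The difference, if any, only shows up in the generated assembly, for example as an extra load in mutate_vectors.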

1 Like

Just so we're clear, Rust does not make type-based aliasing assumptions. This makes things like *(float_ptr as *mut i32) sound and ensures they do what's expected. Additionally, pointer-aliasing-based optimizations have needed to be turned off in the past because LLVM bugs caused miscompilations.
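A small sketch of that cast in use (the function names are made up for illustration). Since there is no type-based aliasing rule, accessing an f32 through a *mut i32 is fine as long as the usual size, alignment and validity requirements hold.

```rust
/// Reads the bits of an f32 as an i32 through a raw pointer cast.
pub fn bits_via_pointer_cast(mut x: f32) -> i32 {
    let float_ptr: *mut f32 = &mut x;
    // SAFETY: f32 and i32 have the same size, share an alignment on common
    // targets, and every 32-bit pattern is a valid i32.
    unsafe { *(float_ptr as *mut i32) }
}

/// Overwrites an f32 with the given bit pattern through the same kind of cast.
pub fn write_bits_via_pointer_cast(x: &mut f32, bits: i32) {
    let float_ptr = x as *mut f32;
    // SAFETY: as above; every 32-bit pattern is also a valid f32.
    unsafe { *(float_ptr as *mut i32) = bits };
}
```

In everyday code f32::to_bits and f32::from_bits do the same job without unsafe.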

To the best of my knowledge, Rust does not assume that Box<T>, Vec<T> or related types don't alias. That doesn't really matter all that much, because doing anything useful with those types requires a reference to the contents involved, and LLVM is extremely good at propagating them (heck, at least one of the issues causing miscompilation was LLVM propagating noalias annotations too aggressively). Additionally, LLVM is great at reasoning about things like heap allocations, which helps it perform optimizations, too. My guess is that the story will be similar for the GCC-based backend and frontend whenever they show up.

While Rust officially has no memory model, most of the time LLVM's noalias is how things work in practice; for further reading, see these links:

2 Likes

Box does currently get special-cased for noalias here.

I recall discussions that we should expand that to all uses of Unique<T>, but I don't remember where that ended up. Still, that wouldn't help @tczajka's example because &mut Vec<T> has double indirection -- the &mut arguments will be noalias, but that doesn't tell LLVM anything about their inner pointers.

4 Likes

Not necessarily with the Vec<T>, but often once you get to &mut [T]s there is.

You can take advantage of this at function boundaries. See the example of that I used here: https://github.com/rust-lang/rust/pull/90821/files#diff-e8ccaf64ce21f955ccebef33b52158631493a6f0966815a2ebc142d7cd2b5e06R640-R643.
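As a rough sketch of the general pattern (this is not the code from the linked PR, and the function names are made up): convert to slices once, then do the real work in a helper whose parameters are slice references, so the aliasing information is attached to pointers that point at the elements.

```rust
/// Public entry point that accepts vectors. The `&mut Vec` and `&Vec`
/// arguments only describe the Vec headers, not the element buffers.
pub fn add_assign_vecs(dst: &mut Vec<f32>, src: &Vec<f32>) {
    // Get slices once, then hand them to the helper below.
    add_assign_slices(dst.as_mut_slice(), src.as_slice());
}

/// Helper whose parameters point directly at the element storage.
/// `dst` is exclusive and `src` is shared, so they cannot overlap.
fn add_assign_slices(dst: &mut [f32], src: &[f32]) {
    // zip stops at the shorter of the two slices.
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}
```

Whether this changes the generated code in any particular case is something to check in the assembly. The sketch only shows where the function boundary goes.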

2 Likes