Why are multiple mutable references UB even without data access?

Rust 2024 has made `&mut STATIC_MUT` a hard error. I'm trying to understand why the alternative (e.g. `&UnsafeCell`) is safer than `static mut`, and I think one key reason is that multiple mutable references (even without data access) are UB:

Quoting from Disallow *references* to `static mut` [Edition Idea] · Issue #114447 · rust-lang/rust · GitHub

the differences are in whether you get a reference directly. *muts are allowed to alias, so you can make new ones all day long and keep them around -- the only problems are when you actually read or write through them. Whereas multiple independent live &muts at the same time is UB even without a data race. So that's the difference in footgun-ness.

And quoting Consider deprecation of UB-happy `static mut` · Issue #53639 · rust-lang/rust · GitHub

With unsafe impl Sync, you only have to prove the data accesses correct, but with static mut, the references themselves can conflict.
EDIT: Now if "references exist" cannot be UB, ever, then this is not a problem, but IIUC, they do.

With multiple mutable references and data accesses, there will be data races, which cause UB.
However, how does UB happen from the references merely existing?
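
For reference, the alternative shape I have in mind looks roughly like this (a minimal sketch with made-up names, not taken from either issue):

```rust
use std::cell::UnsafeCell;

// Wrapper so the static can be `Sync` without being `static mut`.
struct SyncCell(UnsafeCell<i32>);

// SAFETY: this claim is only for the sketch; a real type must ensure all
// accesses through `get()` are synchronized or confined to one thread.
unsafe impl Sync for SyncCell {}

static DATA: SyncCell = SyncCell(UnsafeCell::new(0));

fn set(v: i32) {
    // No `&mut` to the static is ever created; the mutation goes through
    // the raw pointer that `UnsafeCell::get` hands out.
    unsafe { *DATA.0.get() = v }
}
```

With that shape, only the data accesses themselves need to be proven correct, which is what the second quote above is getting at.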

2 Likes

UB happens when an invalid state in a program is reached, and having multiple live mutable references to the same place is exactly that.

The compiler is free to assume that two mutable references do not alias (unless one was derived/reborrowed from the other) and optimize accordingly.
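
A classic illustration of that assumption (my own sketch, not from the post above): because the two parameters below are assumed not to alias, the compiler can fold the return value to a constant without re-reading memory.

```rust
fn sum_after_writes(a: &mut i32, b: &mut i32) -> i32 {
    *a = 2;
    *b = 3;
    // Under the non-aliasing assumption this is just 2 + 3 = 5. If `a`
    // and `b` secretly pointed to the same place, the stored value would
    // actually be 3, and the "optimized" answer would be wrong.
    *a + *b
}
```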

4 Likes

The glib answer is that UB is a language-level concept, and the language says so.

For one practical possibility, consider that if you had for example...

```rust
fn one() {
    // Claim exclusive access to the global for as long as this binding lives.
    let exclusive_i_claim = unsafe { &mut GLOBAL };
    let read = *exclusive_i_claim;
    two();
    // The exclusive reference is still live across the call to two().
    let _still_alive = exclusive_i_claim;
    use_the_value(read);
}
```

...the compiler may assume `read` hasn't changed. But if `two()` writes to `GLOBAL`, that assumption would be incorrect.
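
For instance, with hypothetical fill-ins like these (my own illustration, not part of the sketch above, and assuming an edition or lint setting that still accepts `&mut` to a `static mut`), the assumed-stable `read` is visibly stale:

```rust
static mut GLOBAL: i32 = 0;

fn two() {
    // Creates its own exclusive reference to GLOBAL while the one held
    // in one() is still live, and writes through it.
    let also_exclusive = unsafe { &mut GLOBAL };
    *also_exclusive = 42;
}

fn use_the_value(read: i32) {
    // Under the aliasing assumption, the compiler may happily print 0
    // (the value loaded before two() ran) even though GLOBAL is now 42.
    println!("{read}");
}

fn main() {
    one();
}
```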

But note also that the example is just one simple possibility amongst innumerable others. Compilers are allowed to do literally anything when UB is encountered, leading to many unpredictable emergent miscompilations. (The days of "predictable lowering to machine code" and the like are 2+ decades behind us.)

4 Likes

In theory, I believe so (that the compiler could do anything when there's UB). And in the real world, the compiler tends to do something useful, like optimization.

Then I'm curious what difference/optimization a compiler could practically make from multiple references without data access being UB.
If I understand correctly, the example given by @quinedot shows what the compiler could do when multiple references with data access are UB. However, I can't seem to think of any specific optimizations that would apply in the case where there's no data access.

It allows speculative and out-of-order reads and writes.
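
One way to picture the "speculative" part (hypothetical function, not from the post): since `x` is assumed to be exclusively held while the reference is live, the compiler may hoist the load above the branch.

```rust
fn maybe_read(x: &mut i32, wanted: bool) -> i32 {
    // As written, *x is read only when `wanted` is true. The optimizer is
    // free to load *x unconditionally up front and select the result,
    // because nothing else is allowed to touch *x while this exclusive
    // reference is live. If another live &mut to the same place were being
    // written through at the same time, that hoisted load is a data race.
    if wanted { *x } else { 0 }
}
```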

2 Likes

As an example, the compiler is permitted to replace taking a mutable reference with loading from the place referred to, all (if any) operations on the dereference of the mutable reference with operations on the loaded value, and the destruction of the mutable reference with a write back to the original place.

This can be an optimization if there's not too much register pressure, because it means that you get to schedule the load very early on, and then operate entirely in registers; however, if it's done to two functions with mutable references to the same place, it converts a previously innocuous-looking function into a data race.
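
A sketch of that transform with a hypothetical counter (written with `&mut COUNTER`, as editions before 2024 allow): the "after" function is what the optimizer is entitled to produce from the "before" one.

```rust
static mut COUNTER: u64 = 0;

// Before: take a mutable reference, bump through it, let it die.
fn bump_as_written() {
    let r = unsafe { &mut COUNTER };
    *r += 1;
}

// After: what the compiler may turn it into, because `r` is assumed to be
// the only route to COUNTER while it is live: load once, work in a
// register, write back when the reference would have gone away.
fn bump_after_transform() {
    let mut local = unsafe { COUNTER }; // load scheduled early
    local += 1;                         // register-only work
    unsafe { COUNTER = local };         // write-back at the end
}
```

If two calls hold such references to the same counter at the same time (only possible by breaking the aliasing rules), the transformed versions can lose increments, which is exactly the "innocuous-looking function becomes a data race" problem described above.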

3 Likes

If you're not doing any data accesses, then why are you taking those references in the first place?

Most likely, you're not doing data access at this specific time. Or, you're accessing only part of the referenced memory, and don't access some other part. Or you don't access it yourself, but pass it to some other function which may do accesses.

In all of those cases, it's not that hard to imagine how optimizations could mess up your code if you violate the non-aliasing assumptions. The compiler may inline functions, move code around, load & cache data over function calls, erase writes which are non-observable according to its aliasing assumptions. It can load bigger regions of memory than you specified, because it may be faster (e.g. allows to load several struct fields in a single SIMD load). Similarly, it can write to bigger regions of memory, if it knows that those spurious writes are non-observable.

Technically, simply creating a &mut T provides non-aliasing guarantees to LLVM, so it's free to insert spurious loads and stores to that location, as long as the observable program behaviour is the same. In practice, it's obviously unlikely to do it if you don't do any accesses, but it can be hard to guarantee that. Creating overlapping &mut references that are really not used in any way at all is such a useless edge case that no one will bother to complicate the memory model just for that.
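
To make the earlier "only part of the referenced memory" point concrete (hypothetical types and names, again on an edition that still accepts the reference): the reference below covers the whole struct even though only one field is touched, so the compiler may legitimately issue wider accesses within it.

```rust
struct Pair {
    hot: u64,
    cold: u64,
}

static mut PAIR: Pair = Pair { hot: 0, cold: 0 };

fn touch_hot() {
    // The exclusive reference spans both fields even though only `hot` is
    // used, so the compiler may, for example, load or write back the whole
    // 16 bytes in one go if it judges that faster.
    let p = unsafe { &mut PAIR };
    p.hot += 1;
}
```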

2 Likes

This could be misunderstood as UB being an event. In practice UB is an assumption that the implementation is allowed to make. This is why consequences of UB are so hard to define, because the assumption that &mut is unique can be implicit in lots of places in the optimizer, creating all kinds of paradoxes when the assumption is violated.

5 Likes

From the operational semantics perspective, it's an event. That's why Miri is able to check for UB: it runs the code in an interpreter, and checks that it doesn't perform any invalid operations. My understanding is that turning cases of UB from a vague assumption into a checkable event is an important part of opsem work.
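
For example, Miri flags a program like the following (my own minimal example, run on an edition or lint setting that still accepts `&mut X`), even though it is single-threaded and has no data race:

```rust
static mut X: i32 = 0;

fn main() {
    unsafe {
        let a = &mut X; // first exclusive reference
        let b = &mut X; // second exclusive reference to the same place,
                        // created while `a` is still live
        *b = 1;
        // Miri reports undefined behaviour on this use of `a`: it was
        // invalidated when `b` was created and written through, with no
        // data race anywhere in the program.
        let _v = *a;
    }
}
```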

2 Likes

With this model in mind, I think it's important to understand how modern optimizing compilers actually work.

Modern optimizing compilers start by translating your program from source code into an Intermediate Representation (IR), whose semantics are not guaranteed to be the same as the source language. This translation is guaranteed[1] to preserve the defined behaviours of the source language.

Then, the compiler applies a series of transforms to the IR that are guaranteed to preserve the semantics of the IR, but that are in some sense making it better. We call these transforms "optimizations", and we use cost models to have a sense of what the "best" form of the IR was.

Then, we go down one of two paths:

  1. Translate the current IR into a different IR, preserving defined behaviours, and go back to applying transforms, just on a new IR.
  2. Translate this IR into machine code instead of a new IR.

The new IR is usually "lower level" in some sense; for example, the last time I looked at it, rustc translates into HIR, then MIR, then LLVM IR, then into machine-specific IR, then to machine code, and optimizations can take place at every level.

However, this has a nasty implication; because UB (definitionally) means that the translation from source to IR doesn't preserve the intended meaning of the code (since UB says "this code has no semantic meaning"), the translation output can be completely different to the source.

Further, because the optimizing transforms are done at the IR level, the transform may not even be sensible if you think purely about the source code and machine code; for a trivial example, the LLVM IR LCSSA transform cannot exist in Rust source code, since Rust doesn't have phi nodes, and by the time you get to machine code, phi nodes are gone.


  1. Guarantee in this sense meaning "if the translation does not preserve all the defined behaviours, the compiler is buggy and will be fixed". ↩︎

1 Like