Is it UB to convert &T to *mut T?

Using some unsafe code and a reference-to-pointer conversion, I can do this:

fn main() {
    let x: i32 = 0;
    let x_ref = &x;
    // we have a const reference here

    let x_ptr = x_ref as *const i32 as *mut i32;
    let x_mut_ref = unsafe { &mut *x_ptr };
    // Oops, we've converted &T to &mut T !!!
    println!("{}", x_mut_ref);

    // All we had was a &T, but now we can even mutate the content
    *x_mut_ref = 1;
    println!("{}", x_mut_ref);
}


But if we look at the documentation for dark-side functions like transmute, it explicitly says:

Transmuting an & to &mut is UB
Transmuting an & to &mut is always UB
No you can't do it
No you're not special

So I wonder whether what I did is UB or not. Is it?


Yes, you've effected the same UB as a transmute. I believe the only non-UB way to do this is within an UnsafeCell.
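For illustration, here is a minimal sketch of mutating through a shared reference by going through UnsafeCell. The Counter type and its methods are hypothetical names made up for this example; the only real API used is UnsafeCell::new and UnsafeCell::get.

```rust
use std::cell::UnsafeCell;

/// Hypothetical wrapper, for illustration only.
struct Counter {
    value: UnsafeCell<i32>,
}

impl Counter {
    fn new(start: i32) -> Self {
        Counter { value: UnsafeCell::new(start) }
    }

    /// Mutates the interior through `&self`.
    fn increment(&self) {
        // SAFETY: no reference to the interior ever escapes this type, and
        // UnsafeCell makes Counter !Sync, so this write cannot overlap with
        // any other access to the value.
        unsafe { *self.value.get() += 1 };
    }

    fn get(&self) -> i32 {
        // SAFETY: same reasoning as in `increment`.
        unsafe { *self.value.get() }
    }
}

fn main() {
    let c = Counter::new(0);
    let shared = &c; // only a shared reference from here on
    shared.increment();
    shared.increment();
    println!("{}", shared.get()); // prints 2
}
```

In everyday code, std::cell::Cell or RefCell would usually be the better choice, since they wrap this same pattern in a safe interface.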


UnsafeCell still requires that &T and &mut T that are formed from its *mut T obey the normal aliasing rules. This implies to me that it can be considered sort of like a relaxed atomic load/store (it’s not a perfect analogy but close enough) that forces it to load and store values rather than optimizing around them in some ways that would be legal otherwise. But, I don’t think these &T or &mut T are tagged or otherwise known to have come from an UnsafeCell once they’re released into safe code, cross function boundaries, and so on. I might be wrong in which case I’d love to know more about the mechanics behind this.

The other interesting question is when exactly is UB triggered in the original post of this thread? Is it literally when a shared alias is turned into a noalias reference? Is it UB only if you read/write through it? If one is to consider &mut T as a “token” of uniqueness then it would seem it should be UB when such a conversion takes place, and no read/write needs to happen. But an interpretation that leans on memory safety and data races would require that read/write occurs to trigger UB.

I’m looking forward to the memory model and UB being more precisely specified in the future.


That is an interesting question! Converting &T to *mut T via *const T (as in the title of the post) is definitely not UB. You can always convert between *const T and *mut T using as; such conversions are even safe. This is commonly done in practice, because you might want *const for covariance and yet still be able to get hold of a &mut T.

IIUC, the current consensus is that creating the pointer itself is not UB (after all, you have just reinterpreted bits), but writing through a *mut pointer that wasn't originally created from a &mut reference or another *mut pointer is UB (you are violating a bunch of compiler assumptions about aliasing and mutability, which may turn into funky behaviour once LLVM's noalias is finally fixed).
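To make the distinction concrete, here is a small sketch of the case that is generally considered fine: the pointer starts out from a &mut, passes through *const, and is cast back to *mut before the write. The function name bump_via_const is hypothetical and exists only for this example.

```rust
/// Hypothetical helper: increments `*x` through a pointer that briefly
/// passes through `*const i32`.
fn bump_via_const(x: &mut i32) {
    // The pointer originates from a `&mut`, so writing through it is allowed.
    let p_mut: *mut i32 = x;
    // Both casts below are safe operations; they only change the pointer type.
    let p_const = p_mut as *const i32;
    let p_back = p_const as *mut i32;
    // SAFETY: `p_back` points to a live i32 that we hold exclusively for the
    // duration of this call, and it was derived from a mutable reference.
    unsafe { *p_back += 1 };
}

fn main() {
    let mut v = 41;
    bump_via_const(&mut v);
    println!("{}", v); // prints 42
}
```

The difference from the original post is only the origin of the pointer: there it came from a &i32 to an immutable binding, so the same write would not be allowed.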

Moreover, @vitalyd is right that &mut references may carry extra caveats compared to *mut pointers: the language guarantees that only one &mut to a given value exists at any point in time, and that a &mut cannot be created from a & reference without help from an UnsafeCell. Unsafe code and the compiler may rely on both guarantees.

Even if only the first consideration is valid, it is just much easier, from a pedagogical point of view, to tell people "no turning shared references into mutable references, period" than to start a long explanation about what the compiler does with pointers and references under the hood. The exact memory model, once it is formalized, should always remain reserved to language lawyers and the craziest users of unsafe code.

If the code ended in println!("{}", x); (or did the same with x_ref), the compiler could optimise it so that it prints 0.

The UB comes from breaking this rule:
"Ensure that the access is unique when casting to &mut T, and ensure that there are no mutations or mutable aliases going on when casting to &T" (get)
I think it is any reading or writing afterwards that can be harmful, since references (& and &mut) are just pointers once compiled.

I think the same rule must apply when pointers are passed in from FFI. As @vitalyd says, once references are in safe code, the compiler just applies the normal aliasing rules.

Yeah, I don’t think there’s any debate that getting a raw pointer is safe. The interesting bit is OP’s code then forming a &mut out of it. Is that alone UB? Or does one need to read/write through it to trigger UB? So specifically:

// this is fine
let x_ptr = x_ref as *const i32 as *mut i32;
// is it immediately UB here?
let x_mut_ref = unsafe { &mut *x_ptr };

I don’t believe the answer to this has been definitively established.