Does UnsafeCell still function under &mut T?

UnsafeCell supports concurrent modification under &T without causing optimization issues because the compiler knows multiple references exist.

However, when using &mut T, does UnsafeCell still serve a purpose? Since the compiler assumes &mut T is a unique reference, will it become "more confident" in its optimizations and ignore the memory instability signaled by UnsafeCell?

Two simultaneous &muts to anything are immediate undefined behavior; UnsafeCell is not an exception.

Note that only the immutability guarantee for shared references is affected by UnsafeCell. The uniqueness guarantee for mutable references is unaffected. There is no legal way to obtain aliasing &mut, not even with UnsafeCell<T>.
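
A minimal sketch of that distinction, assuming a hypothetical helper bump_shared that is not from the thread: writing through a shared &UnsafeCell<i32> is permitted (inside unsafe, with the programmer ruling out data races and live references to the contents), while a second &mut to the same cell is still rejected.

```rust
use std::cell::UnsafeCell;

/// Hypothetical helper: increments the value behind a *shared* reference.
fn bump_shared(cell: &UnsafeCell<i32>) {
    // SAFETY: single-threaded, and no reference to the inner i32 is live here.
    unsafe { *cell.get() += 1 };
}

fn shared_demo() -> i32 {
    let cell = UnsafeCell::new(0);
    let a = &cell;
    let b = &cell; // two shared references may coexist
    bump_shared(a);
    bump_shared(b);
    cell.into_inner()
}

fn main() {
    assert_eq!(shared_demo(), 2);
    // Two overlapping `&mut` borrows of the same cell would not compile
    // (error E0499); UnsafeCell does nothing to relax that rule.
}
```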

I assume your T is a struct containing an UnsafeCell, rather than the element inside the cell. Either way, the compiler must behave as the documentation requires.

I don't think (though I'm not a compiler dev) that it restricts optimisation as much as black_box() does.

Your assumption is misleading. UnsafeCell is a much lower-level language primitive. By itself it has nothing to do with concurrent access or multi-threading. It is a way of telling the compiler that, given &UnsafeCell<T>, the memory containing T may be written to even while a shared reference to the UnsafeCell exists, which is never allowed under the "normal" reference rules. It is basically the same construct as the mutable keyword in C++.

But notice that the only thing you can get from &UnsafeCell<T> is a *mut T. You as the programmer are still responsible for upholding the aliasing rules (you can never create two exclusive references to the same object) and for any concurrent operations on T (there must never be a data race) when you dereference the raw pointer.
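
To make that division of responsibility concrete, here is a rough sketch of a Cell-like wrapper. MyCell and cell_demo are hypothetical names made up for illustration, and this is not how std implements Cell. The wrapper only ever copies values in and out through the raw pointer and never hands out a reference to the contents, which is what keeps the unsafe blocks sound.

```rust
use std::cell::UnsafeCell;

/// Hypothetical, simplified Cell-like type (illustration only).
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: MyCell is !Sync because UnsafeCell is, so there is no data
        // race, and no reference to the inner value ever escapes this type.
        unsafe { *self.value.get() }
    }

    fn set(&self, v: T) {
        // SAFETY: same reasoning as in `get`.
        unsafe { *self.value.get() = v };
    }
}

fn cell_demo() -> (i32, i32) {
    let c = MyCell::new(1);
    let r1 = &c;
    let r2 = &c;
    let before = r1.get();
    r2.set(5); // write through one shared reference...
    (before, r1.get()) // ...and observe it through the other
}

fn main() {
    assert_eq!(cell_demo(), (1, 5));
}
```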

Careful with the wording, as “optimization issues” sounds like performance issues, but of course this is about correctness (i.e. not using UnsafeCell when you needed it is straight-up UB[1]).

Yes, in principle &mut UnsafeCell<T> and &mut T should behave exactly the same.
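
One way to see this from the API side, as a small sketch with made-up function names (through_plain, through_cell): UnsafeCell::get_mut is a safe method, because a &mut UnsafeCell<T> already proves exclusive access to the contents.

```rust
use std::cell::UnsafeCell;

/// Hypothetical helper working on a plain exclusive reference.
fn through_plain(x: &mut i32) -> i32 {
    *x += 1;
    *x
}

/// Hypothetical helper doing the same through `&mut UnsafeCell<i32>`.
fn through_cell(x: &mut UnsafeCell<i32>) -> i32 {
    // No `unsafe` needed: exclusive access to the cell is exclusive access
    // to its contents.
    let inner: &mut i32 = x.get_mut();
    *inner += 1;
    *inner
}

fn main() {
    let mut plain = 0;
    let mut cell = UnsafeCell::new(0);
    assert_eq!(through_plain(&mut plain), 1);
    assert_eq!(through_cell(&mut cell), 1);
}
```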


FYI, there are use cases where one might not want it this way (in particular, for the internals of async fn/async {…} futures) and thus needs a way to opt out of this property, and that’s actually possible using the (currently still) nightly-only API UnsafePinned<T> from RFC 3467.


  1. “undefined behavior” (more info) ↩︎

Oh my. Is nothing sacred anymore? :exploding_head:

To be fair, it’s going to be an improvement over the status quo[1] which was … just … eh … disabling the property for all !Unpin types I guess :sweat_smile:


  1. by which I only mean the de-facto state, i.e. hacky reality; not that this was any kind of guaranteed promise or even a well-documented thing ↩︎

This reminds me of a great blog post about memory models: The Tower of Weakenings: Memory Models For Everyone - Faultlore. It discusses pointer provenance, but I feel like a similar argument can be made for other areas of the memory model (like aliasing).

It’s technically possible. For example:

struct S {
    r: UnsafeCell<*mut i32>
}

Then initialize two S values with the same *mut i32.
Now we can have two &mut S at once, one per value.

If S is !Send, it may provide a function with the signature fn load_shared_r(&mut UnsafeCell<*mut i32>) -> i32;

Then consider this code:

let mut_ref = &mut s1.r;
let n1 = load_shared_r(mut_ref);
// ... something modifies the i32 behind r through s2 ...
let n2 = load_shared_r(mut_ref);

Yes, you can say load_shared_r should receive &UnsafeCell<*mut i32>, not &mut UnsafeCell<*mut i32>, but this is not a bug.

But the question is, will it really become

let n2 = n1

?

But &UnsafeCell<T> and &T do not behave exactly the same. So why would &mut UnsafeCell<T> and &mut T?

Yes.

The UnsafeCell here is useless; just *mut i32 suffices for what you want to do.

That function would be immediately unsound since I could just pass a *mut i32 that's dangling.

It doesn't matter, because what really matters is only the value of the *mut i32. You can pass anything that allows you to get a *mut i32 out of it and it would be more or less the same.

To help illustrate why it’s reasonable that UnsafeCell is not special for &mut, consider that just about every interior mutability type in the Rust standard library offers a get_mut() function you can use instead of the specific &self mutation operations.

In all of these cases &mut access is an alternative to interior mutability, and it lets you get all the power and all the guarantees of &mut despite the alternative existing.
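
As a small illustration (the selection of types and the function name get_mut_demo are my own, not necessarily the ones the post originally listed), Cell, RefCell, Mutex and AtomicI32 all expose get_mut(&mut self), and none of the calls below involves a runtime borrow check, a lock acquisition, or an atomic operation:

```rust
use std::cell::{Cell, RefCell};
use std::sync::atomic::AtomicI32;
use std::sync::Mutex;

fn get_mut_demo() -> (i32, i32, i32, i32) {
    let mut cell = Cell::new(0);
    let mut ref_cell = RefCell::new(0);
    let mut mutex = Mutex::new(0);
    let mut atomic = AtomicI32::new(0);

    // Each call requires `&mut self`, so exclusivity is checked at compile time.
    *cell.get_mut() += 1;
    *ref_cell.get_mut() += 2;
    // Mutex::get_mut returns a LockResult because the mutex may be poisoned.
    *mutex.get_mut().unwrap() += 3;
    *atomic.get_mut() += 4;

    (
        cell.into_inner(),
        ref_cell.into_inner(),
        mutex.into_inner().unwrap(),
        atomic.into_inner(),
    )
}

fn main() {
    assert_eq!(get_mut_demo(), (1, 2, 3, 4));
}
```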

I have reconsidered my previous point and realized I should rephrase my question: How does the compiler treat &mut *mut T with respect to the underlying *mut T? Specifically, does it treat it as &mut &UnsafeCell<T> or &mut T?

Since *mut T is potentially writable, the &mut &UnsafeCell<T> interpretation seems more likely. Therefore:

let mut n = 0;
let mut np1: *mut i32 = &mut n;
let mut np2: *mut i32 = np1;
let ptr1 = &mut np1;
let ptr2 = &mut np2;

let n1 = unsafe { **ptr1 };
unsafe { **ptr2 = 9 };
let n2 = unsafe { **ptr1 };

assert!(n1 == n2);

This assert will fail (n1 is 0 and n2 is 9), meaning the second read is not optimized into let n2 = n1.

However, *mut T is only writable within an unsafe block. Furthermore, wrapping *mut T into something like &UnsafeCell<T> is just my guess and doesn't seem like a natural conversion. If the compiler instead treats &mut *mut T as &mut T, then there is a possibility it could be optimized into let n2 = n1.

This concern also seems to apply to Rc<T>. Seemingly independent Rc instances actually influence each other because they share a reference count. If &mut Rc<T> were optimized to cache the reference count it has already read, it would cause severe issues.

I am looking for the specific logic the compiler uses: How does a unique reference to a raw pointer (&mut *mut T) affect the compiler's assumptions about the stability of the memory being pointed to?

The existence of a *mut T tells the compiler nothing whatsoever about the state of the memory that might be pointed to by that pointer, regardless of where that *mut T might be stored or how it might be borrowed.
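
The Rc case from the question can be sketched the same way. Rc keeps a raw (non-null) pointer to a heap allocation whose counters live in Cells, so a &mut Rc<T> is exclusive only over the handle, not over the shared allocation. The function name rc_demo is made up for this illustration.

```rust
use std::rc::Rc;

fn rc_demo() -> (usize, usize) {
    let mut a = Rc::new(5);
    let b = Rc::clone(&a);

    // Exclusive borrow of the handle `a`, not of the shared allocation.
    let a_mut: &mut Rc<i32> = &mut a;

    let before = Rc::strong_count(&*a_mut);
    drop(b); // changes the shared count through a different handle
    let after = Rc::strong_count(&*a_mut);

    (before, after)
}

fn main() {
    // The second read sees the updated count; it is not a cached copy.
    assert_eq!(rc_demo(), (2, 1));
}
```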

Neither one. *mut T says nothing about the T, if there even is one, so having &mut *mut T also says nothing about the T, if there even is one.

You can look at use of &UnsafeCell<T> as weakening the strong rules of &T, but when you use only *mut T to refer to some T, you are working from no rules at all, until such time as you create an &mut T, &T, or &UnsafeCell<T> from that pointer.
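
A rough sketch of that last point, with a made-up function name (raw_ptr_demo): two copies of the same raw pointer can be used interleaved, and reference rules only start to apply for as long as a reference created from one of them is in use.

```rust
fn raw_ptr_demo() -> i32 {
    let mut n = 0_i32;
    let p1: *mut i32 = &mut n;
    let p2: *mut i32 = p1; // a plain copy; both pointers refer to `n`

    unsafe {
        // Interleaved writes and reads through both raw pointers.
        *p1 = 1;
        *p2 += 1;
        assert_eq!(*p1, 2);

        {
            // From here until `r` is last used, the `&mut` rules apply:
            // nothing else may touch the i32 in between.
            let r: &mut i32 = &mut *p1;
            *r += 10;
        }

        // `r` is no longer used, so the raw pointers may be used again.
        *p2
    }
}

fn main() {
    assert_eq!(raw_ptr_demo(), 12);
}
```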

To add to the previous reply, what you can do with a *mut depends on how you created it.

More details in the docs.
