What makes UnsafeCell special

Consider the following code, which is intentionally UB:

fn main() {
    let mut x = 10;
    let y = &mut x as *mut i32;
    let closure = || unsafe { println!("{}", *y); };
    unsafe { *(&mut x as *mut i32) = 20; }

    closure();
}

Miri rejects it, since y is deactivated after I reborrow x. However, if I wrap x in an UnsafeCell instead:

use std::cell::UnsafeCell;

fn main() {
    let x = UnsafeCell::new(10);
    let y = x.get();
    let closure = || unsafe { println!("{}", *y); };
    unsafe { *x.get() = 20; }

    closure();
}

Miri is now happy. So my question is: the first snippet clearly has UB, but why doesn't the second? Is it a limitation of Miri that it fails to detect UB in the second case, or is the second case indeed well-defined?

UnsafeCell is a language item. It's part of the language. It's special by definition.

Okay, let me ask another way: what assumptions can I make with UnsafeCell that I can't make with a raw pointer? In particular, why is the second case not considered UB by Miri?

None. You can do interior mutability via raw pointers just fine (otherwise pretty much any sort of FFI would be unsound/impossible).

The problem in your first example is not the raw pointer, it's the mutable reference, whose exclusivity is violated.

If you rewrite your first example using raw pointers only, then Miri accepts it.

There's also a stable version.
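
The playground links are not reproduced here, so as a rough sketch only: a raw-pointer-only rewrite of the first example on stable Rust might look like the following. It assumes ptr::addr_of_mut! is used to avoid creating any &mut x, and the function name raw_only is made up for illustration.

```rust
use std::ptr;

// Hypothetical helper: every access to `x` goes through one raw pointer.
fn raw_only() -> i32 {
    let mut x = 10;
    // addr_of_mut! yields a *mut i32 without an intermediate `&mut x`.
    let y = ptr::addr_of_mut!(x);
    // `move` copies the raw pointer into the closure.
    let closure = move || unsafe { *y };
    // Write through the same pointer the closure will read from.
    unsafe { *y = 20; }
    closure()
}

fn main() {
    println!("{}", raw_only()); // prints 20
}
```

No reference to x is ever created, so there is no exclusivity assertion for the write to violate.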

I will try to wrap my mind around it. Can I say that, in general, when using pointers it is better not to mix them with references, and to use UnsafeCell when possible?

You shouldn't gratuitously sprinkle your code with UnsafeCell. Generally, stick to safe abstractions such as Cell, RefCell and Mutex.

When you do need UnsafeCell and/or raw pointers, you'll need to be careful in your reference-pointer interactions; the Google search term you are looking for is "pointer provenance".

This is wrong. You can't use raw pointers to violate Rust's immutability guarantees, not even via FFI. Doing so is Undefined Behaviour. If a memory region isn't contained in an UnsafeCell and belongs to an immutable binding, then any way to mutate it is UB.

Your playground example doesn't implement interior mutability, since your binding is mutable. What you are really side-stepping is Rust's aliasing requirements for safe references. It's true that if you are exclusively using raw pointers, you can mostly ignore Rust's aliasing requirements (you still must uphold them at runtime, otherwise it's a data race, but for single-threaded execution it's automatic).

Note that you don't need #![feature(raw_ref_op)] since creation of raw pointers is already stable: use ptr::addr_of! and ptr::addr_of_mut!. Note that the compiler will prevent you from using ptr::addr_of_mut! on an immutable place. If you try to cheat by using ptr::addr_of! to create a *const T and then cast it to *mut T and write to it, you will get UB (unless you only mutate a part contained in an UnsafeCell).

&mut T asserts to the compiler that there are no other live pointers to T. You can "stack" mutable references by reborrowing the whole or part of it, but you can't use an unrelated or older pointer to access T. Doing so will invalidate &mut T. This applies even if you create a &mut T only to immediately cast it into a raw pointer: the creation step asserts the same invariants as any other use of a &mut T.

Thus your first example is UB: the closure needs to use a pointer derived from a unique &mut T which you create on line 3. But on line 5 you create a new independent &mut x, invalidating the previous pointers to x, and thus making it impossible to call the closure.
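
A minimal sketch of the "stacking" idea, assuming single-threaded code: instead of creating a fresh, independent &mut x, the later write is derived from y, so y stays valid. The function name stacked and the extra increment are illustrative assumptions, not part of the original example.

```rust
// Hypothetical helper: the child borrow is derived from `y`, not from `x`.
fn stacked() -> i32 {
    let mut x = 10;
    // Parent pointer, created from a single `&mut x`.
    let y = &mut x as *mut i32;
    unsafe {
        // Reborrow *through* y: the new reference sits on top of y.
        let z = &mut *y;
        *z = 20;
        // `z` is not used again, so going back to the parent `y` is fine.
        *y += 1;
        *y
    }
}

fn main() {
    println!("{}", stacked()); // prints 21
}
```

Using z again after the write through y would be the same mistake as in the first example, since the write through the parent invalidates the child.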

The second example doesn't cause UB because you don't create &mut T at any point. UnsafeCell::get takes a &UnsafeCell<T>, and & references don't assert any uniqueness properties (they can even be freely copied). The rest of the operations happen exclusively through raw pointers. Aliasing requirements for raw pointers are much milder, and must be upheld only at runtime (basically, you must not make a data race, anything else is fair game).

Note that you can't do this trick without UnsafeCell, since normally it is UB to mutate data behind a & reference. UnsafeCell is a language-level blessed exception. If you are familiar with C++, it's similar to the mutable qualifier which allows you to mutate fields and call mutating methods even on an immutable object.
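
As an illustration of mutating behind a & reference, here is a small sketch of a Cell-like wrapper. The Counter type and its methods are hypothetical names invented for this example; it relies on UnsafeCell<i32> being !Sync, so the type cannot be shared across threads.

```rust
use std::cell::UnsafeCell;

// Hypothetical type for illustration: a counter mutated through `&self`.
struct Counter {
    value: UnsafeCell<i32>,
}

impl Counter {
    fn new(start: i32) -> Self {
        Counter { value: UnsafeCell::new(start) }
    }

    // Takes `&self`, yet mutates: only allowed because of UnsafeCell.
    fn bump(&self) -> i32 {
        // SAFETY: Counter is !Sync and never hands out references into
        // the cell, so nothing else can be accessing it right now.
        unsafe {
            let p = self.value.get();
            *p += 1;
            *p
        }
    }
}

fn main() {
    let c = Counter::new(10);
    let (a, b) = (&c, &c); // two shared references to the same counter
    a.bump();
    println!("{}", b.bump()); // prints 12
}
```

Without the UnsafeCell field, writing to the integer behind &self would be UB no matter how the pointer was obtained.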

(member of T-opsem, but not speaking on behalf of the team, nor with any special merit)

Sorry for the technicality, but this is not 100% accurate, because it's undecided. Tbf, that it's undecided means that it's effectively UB (and certainly unsound). But specifically, while Stacked Borrows declares this UB, the Tree Borrows model does not. This is understandably contentious.

The reason why Stacked Borrows forbids writes through addr_of!(place) is not a good reason, and forbidding writes to a let binding because it isn't mut is actually surprisingly complicated: such bindings aren't fully immutable, they're just write-once. The linked thread covers it in more detail, although I must ask you not to comment there unless you have properly new context to add, since the discussion has trended a bit heated and talk-past-each-other already.

And to go in the complete other direction: even only mutating data covered in UnsafeCell is currently (library) UB by fiat, as the UnsafeCell docs state that

the only valid way to obtain a *mut T pointer to the contents of a shared UnsafeCell<T> is through .get() or .raw_get(). A &mut T reference can be obtained by either dereferencing this pointer or by calling .get_mut() on an exclusive UnsafeCell<T>. Even though T and UnsafeCell<T> have the same memory layout, the following is not allowed and undefined behavior:

unsafe fn not_allowed<T>(ptr: &UnsafeCell<T>) -> &mut T {
  let t = ptr as *const UnsafeCell<T> as *mut T;
  // This is undefined behavior, because the `*mut T` pointer
  // was not obtained through `.get()` nor `.raw_get()`:
  unsafe { &mut *t }
}

Weird. I have read those docs many times and don't remember those lines. When and why were they added? That claim is hard to square with the assertion that T and UnsafeCell<T> are guaranteed to have the same layout.

Is it a sort of "strict provenance"? We want to ensure that illegal mutations always pass through a laundering method?

Seems that this part was here from at least 1.66. I can't find the reason though - the corresponding MR speaks of this as if this is already well-known.

raw_get was added in rust-lang/rust pull request #66248, "add raw ptr variant of UnsafeCell::get", by RalfJung. From the quick sleuthing I did, all I found was roughly "the proposed model allows it, so std can do it, but it's not fully obvious that this is guaranteed to be the case." It's possible to imagine a provenance model where "outer" references permit observing writes from other live pointers but don't permit writes through themselves... although I don't think there's any benefit for that additional complexity in the borrowing model and extra case of UB to track.

I place this in a similar position as the "infectious" nature of UnsafeCell mutability: it's very likely we'll adopt the more permissive model, but it's not something we're quite prepared to fully commit to guaranteeing forever quite yet.

(This is my own off the cuff recollection and opinion, not in any way an indication of T-opsem position.)

If I obtain a &mut T by dereferencing the *mut T from UnsafeCell<T>::get, it will not invalidate other &T that I obtained earlier (by dereferencing get()). Is this true?

No, the &mut T must invalidate the &T because &mut T are always exclusive. UnsafeCell doesn't change that.
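
A hedged sketch of what this means in practice, assuming single-threaded code: a &T taken from the cell has to be finished with before a &mut T to the same contents is created. The function name read_then_write is made up for illustration.

```rust
use std::cell::UnsafeCell;

// Hypothetical helper: the shared borrow ends before the exclusive one begins.
fn read_then_write() -> (i32, i32) {
    let cell = UnsafeCell::new(10);

    let before = {
        let shared: &i32 = unsafe { &*cell.get() };
        *shared // last use of `shared`
    };

    {
        // Creating this `&mut i32` invalidates any earlier `&i32` into the cell.
        let exclusive: &mut i32 = unsafe { &mut *cell.get() };
        *exclusive = 20;
    }

    // Reading through `shared` at this point would be UB, so take a fresh read.
    let after = unsafe { *cell.get() };
    (before, after)
}

fn main() {
    println!("{:?}", read_then_write()); // prints (10, 20)
}
```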

Nothing I was claiming is wrong. I was claiming exactly what my code demonstrates. Do not put words in my mouth.

Now this is wrong: my code demonstrates shared mutability, which is another name for interior mutability. The "just side-stepping aliasing requirements" is interior mutability.

You are trying too hard.

Ohhh, I think I get it. Basically, the moment you cast a pointer to a reference, all the aliasing rules for references kick in. But if we use pointers exclusively, that's okay. So I think this is because the compiler makes a lot of assumptions about references but not about pointers. In some sense my second example delays the borrow of x: in the first example, x is borrowed mutably when y is created, but in the second example it is not borrowed until the closure is executed.

Correct me if I am wrong.

Yes, exactly. References carry extra assumptions. It's just a bit too easy to accidentally create a reference. So you have to be careful.

Pointers have far fewer requirements than references, yes. But pointers still have provenance, so there are still some requirements.

Miri rejected your first example because it considered your second &mut x to invalidate the first &mut x and the *mut you cast it to (y). If you just use the same *mut in both places, it's accepted.[1]

In the second example, due to UnsafeCell "magic", all *mut _ obtained by .get() can be dereferenced so long as the &UnsafeCell they were produced from is valid (and other UB such as data races are avoided).
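
To make that concrete, here is a small sketch with two pointers obtained from separate .get() calls and used in interleaved order. It assumes single-threaded code, and the function name interleaved is invented for this example.

```rust
use std::cell::UnsafeCell;

// Hypothetical helper: two pointers from separate `.get()` calls, interleaved.
fn interleaved() -> (i32, i32) {
    let x = UnsafeCell::new(10);
    let a = x.get();
    let b = x.get();
    unsafe {
        *a = 20;
        let seen_by_b = *b; // reads the write made through `a`
        *b = 30;
        let seen_by_a = *a; // reads the write made through `b`
        (seen_by_b, seen_by_a)
    }
}

fn main() {
    println!("{:?}", interleaved()); // prints (20, 30)
}
```

Neither pointer invalidates the other, because no &mut to the contents is ever created.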

Pedantic nit: It's not the compiler making assumptions per se, it's that UB is defined at the language level. The compiler can then rely on the language definition.


[1] I made the closure a move closure to force a copy of y to be captured by the closure.
