What makes UnsafeCell special

AlexHYF · July 3, 2024, 3:50pm

Consider the following code, which is intentionally UB:

fn main() {
    let mut x = 10;
    let y = &mut x as *mut i32;
    let closure = || unsafe { println!("{}", *y); };
    unsafe { *(&mut x as *mut i32) = 20; }

    closure();
}

Miri rejects it, since y is deactivated after I reborrow x. However, if I wrap x in an UnsafeCell instead:

use std::cell::UnsafeCell;
fn main() {
    let x = UnsafeCell::new(10);
    let y = x.get();
    let closure = || unsafe { println!("{}", *y); };
    unsafe { *x.get() = 20; }

    closure();
}

Miri is now happy. So my question is: the first snippet clearly has UB, but why doesn't the second? Is it a limitation of Miri that it fails to detect UB in the second case, or is the second case indeed well-defined code?

zirconium-n · July 3, 2024, 4:07pm

UnsafeCell is a language item. It's part of the language. It's special by definition.

AlexHYF · July 3, 2024, 4:09pm

Okay, let me ask another way: what assumptions can I make with UnsafeCell that I can't make with a raw pointer? In particular, why is the second case not considered UB by Miri?

paramagnetic · July 3, 2024, 4:12pm

None. You can do interior mutability via raw pointers just fine (otherwise pretty much any sort of FFI would be unsound/impossible).

The problem in your first example is not the raw pointer, it's the mutable reference, whose exclusivity is violated.

If you rewrite your first example using raw pointers only, then Miri accepts it.

There's also a stable version.
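The playground links are not reproduced here, so the following is only a minimal sketch of what a raw-pointer-only rewrite of the first example might look like on stable Rust. The function name raw_only is hypothetical, and the use of ptr::addr_of_mut! for both pointers is an assumption about the shape of the linked code, not a copy of it.

```rust
use std::ptr;

// Hypothetical helper: mirrors the first example but never creates a `&mut x`.
fn raw_only() -> i32 {
    let mut x = 10;
    // `addr_of_mut!` yields a `*mut i32` directly from the place `x`,
    // without going through an intermediate mutable reference.
    let y = ptr::addr_of_mut!(x);
    // `move` copies the raw pointer into the closure.
    let closure = move || unsafe { *y };
    // A second raw pointer to the same place; still no reference involved.
    unsafe { *ptr::addr_of_mut!(x) = 20; }
    closure()
}
```

Calling raw_only() reads the value back through the first pointer after the write through the second one.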

AlexHYF · July 3, 2024, 4:30pm

I will try to wrap my mind around it. Can I say that, in general, when using pointers it is better not to mix them with references, and to use UnsafeCell when possible?

paramagnetic · July 3, 2024, 4:33pm

You shouldn't gratuitously sprinkle your code with UnsafeCell. Generally, stick to safe abstractions such as Cell, RefCell and Mutex.
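As an illustration of the safe route, here is a small sketch of the original scenario expressed with Cell, with no unsafe at all. The function name with_cell is made up for this example.

```rust
use std::cell::Cell;

// Hypothetical helper: the closure and the later write share `x`
// through `&Cell<i32>`, so no raw pointers or unsafe blocks are needed.
fn with_cell() -> i32 {
    let x = Cell::new(10);
    // The closure captures `x` by shared reference.
    let closure = || x.get();
    // `Cell::set` takes `&self`, so it can run while the closure is alive.
    x.set(20);
    closure()
}
```

Cell only hands out copies of the value, never references into it, which is what makes this API safe.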

When you do need UnsafeCell and/or raw pointers, you'll need to be careful in your reference-pointer interactions; the Google search term you are looking for is "pointer provenance".

afetisov · July 4, 2024, 1:07am

This is wrong. You can't use raw pointers to violate Rust's immutability guarantees, not even via FFI. Doing so is Undefined Behaviour. If a memory region isn't contained in an UnsafeCell and belongs to an immutable binding, then any way to mutate it is UB.

Your playground example doesn't implement interior mutability, since your binding is mutable. What you are really side-stepping is Rust's aliasing requirements for safe references. It's true that if you are exclusively using raw pointers, you can mostly ignore Rust's aliasing requirements (you still must uphold them at runtime, otherwise it's a data race, but for single-threaded execution it's automatic).

Note that you don't need #![feature(raw_ref_op)] since creation of raw pointers is already stable: use ptr::addr_of! and ptr::addr_of_mut!. Note that the compiler will prevent you from using ptr::addr_of_mut! on an immutable place. If you try to cheat by using ptr::addr_of! to create a *const T and then cast it to *mut T and write to it, you will get UB (unless you only mutate a part contained in an UnsafeCell).
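A short sketch of ptr::addr_of_mut! on a field of a mutable binding, for illustration. The Pair type and the bump_b function are hypothetical names. If the binding were declared without mut, the addr_of_mut! line would be rejected at compile time, as described above.

```rust
use std::ptr;

// Hypothetical type used only for this illustration.
struct Pair {
    a: i32,
    b: i32,
}

fn bump_b() -> i32 {
    // The binding must be `mut` for `addr_of_mut!` to be accepted.
    let mut p = Pair { a: 1, b: 2 };
    // Raw pointer to the field, created without an intermediate `&mut`.
    let pb: *mut i32 = ptr::addr_of_mut!(p.b);
    // Write through the raw pointer.
    unsafe { *pb += 40; }
    p.a + p.b
}
```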

&mut T asserts to the compiler that there are no other live pointers to T. You can "stack" mutable references by reborrowing the whole or part of it, but you can't use an unrelated or older pointer to access T. Doing so will invalidate &mut T. This applies even if you create a &mut T only to immediately cast it into a raw pointer: the creation step asserts the same invariants as any other use of a &mut T.
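To make the "stacking" idea concrete, here is a minimal sketch in safe code. The name stacked is hypothetical. The inner reborrow is derived from the outer reference, and the outer one is only used again once the inner one is finished.

```rust
// Hypothetical helper showing a well-nested reborrow.
fn stacked() -> i32 {
    let mut x = 10;
    let a = &mut x;
    // `b` is a reborrow derived from `a`, so it sits "on top of" `a`.
    let b = &mut *a;
    *b += 1;
    // `b` is no longer used, so `a` may be used again.
    *a += 1;
    x
}
```

The first example in this thread breaks this nesting: its second &mut x is derived from x itself, not from y, so y ends up below a newer exclusive borrow.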

Thus your first example is UB: the closure needs to use a pointer derived from a unique &mut T which you create on line 3. But on line 5 you create a new independent &mut x, invalidating the previous pointers to x, and thus making it impossible to call the closure.

The second example doesn't cause UB because you don't create &mut T at any point. UnsafeCell::get takes a &UnsafeCell<T>, and & references don't assert any uniqueness properties (they can even be freely copied). The rest of the operations happen exclusively through raw pointers. Aliasing requirements for raw pointers are much milder, and must be upheld only at runtime (basically, you must not make a data race, anything else is fair game).

Note that you can't do this trick without UnsafeCell, since normally it is UB to mutate data behind a & reference. UnsafeCell is a language-level blessed exception. If you are familiar with C++, it's similar to the mutable qualifier which allows you to mutate fields and call mutating methods even on an immutable object.
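A rough sketch of the analogy with C++ mutable, for illustration only. Counter, record and count_twice are hypothetical names. The field is wrapped in UnsafeCell, so a method taking &self may write to it, provided no references to the contents are alive and there is no data race.

```rust
use std::cell::UnsafeCell;

// Hypothetical type: `hits` plays the role of a C++ `mutable` field.
struct Counter {
    hits: UnsafeCell<u32>,
}

impl Counter {
    // Takes `&self`, yet updates the field through the cell's raw pointer.
    fn record(&self) -> u32 {
        unsafe {
            *self.hits.get() += 1;
            *self.hits.get()
        }
    }
}

fn count_twice() -> u32 {
    let c = Counter { hits: UnsafeCell::new(0) };
    c.record();
    c.record()
}
```

In real code Cell<u32> would do the same job without unsafe. UnsafeCell makes a type !Sync, which is what rules out the data race here.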

CAD97 · July 4, 2024, 1:35am

(member of T-opsem, but not speaking on behalf of the team, nor with any special merit)

Sorry for the technicality, but this is not 100% accurate, because it's undecided. Tbf, that it's undecided means that it's effectively UB (and certainly unsound). But specifically, while Stacked Borrows declares this UB, the Tree Borrows model does not. This is understandably contentious.

The reason why Stacked Borrows forbids writes through addr_of!(place) is not a good reason, and forbidding writes to let because it isn't mut is actually surprisingly complicated because they aren't actually fully immutable, they're just write-once. The linked thread covers it in more detail, although I must ask you not to comment there unless you have properly new context to add, since the discussion has trended a bit heated and talk-past-each-other already.

And to go in the complete other direction: even only mutating data covered in UnsafeCell is currently (library) UB by fiat, as the UnsafeCell docs state that

the only valid way to obtain a *mut T pointer to the contents of a shared UnsafeCell<T> is through .get() or .raw_get(). A &mut T reference can be obtained by either dereferencing this pointer or by calling .get_mut() on an exclusive UnsafeCell<T>. Even though T and UnsafeCell<T> have the same memory layout, the following is not allowed and undefined behavior:
unsafe fn not_allowed<T>(ptr: &UnsafeCell<T>) -> &mut T {
  let t = ptr as *const UnsafeCell<T> as *mut T;
  // This is undefined behavior, because the `*mut T` pointer
  // was not obtained through `.get()` nor `.raw_get()`:
  unsafe { &mut *t }
}
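For contrast, a small sketch of the access paths that the quoted documentation does allow: .get(), .raw_get(), and .get_mut(). The function name allowed_access is made up for this example.

```rust
use std::cell::UnsafeCell;

// Hypothetical helper exercising the three documented access paths.
fn allowed_access() -> i32 {
    let mut cell = UnsafeCell::new(1);

    // Shared access: `.get()` takes `&self` and returns `*mut T`.
    let p: *mut i32 = cell.get();
    unsafe { *p += 1; }

    // Raw access: `raw_get` goes from `*const UnsafeCell<T>` to `*mut T`.
    let q: *mut i32 = UnsafeCell::raw_get(&cell as *const UnsafeCell<i32>);
    unsafe { *q += 1; }

    // Exclusive access: `.get_mut()` takes `&mut self` and needs no unsafe.
    *cell.get_mut() += 1;

    cell.into_inner()
}
```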

afetisov · July 4, 2024, 2:04am

Weird. I have read those docs many times and don't remember those lines. When and why were they added? That claim is hard to square with the assertion that T and UnsafeCell<T> are guaranteed to have the same layout.

Is it a sort of "strict provenance"? We want to ensure that illegal mutations always pass through a laundering method?

Cerber-Ursi · July 4, 2024, 2:33am

Seems that this part was here from at least 1.66. I can't find the reason though - the corresponding MR speaks of this as if this is already well-known.

CAD97 · July 4, 2024, 3:31am

raw_get was added in rust-lang/rust pull request #66248, "add raw ptr variant of UnsafeCell::get" by RalfJung. From the quick sleuthing I did, all I found was roughly "the proposed model allows it, so std can do it, but it's not fully obvious that this is guaranteed to be the case." It's possible to imagine a provenance model where "outer" references permit observing writes from other live pointers but don't permit writes through themselves... although I don't think there's any benefit for that additional complexity in the borrowing model and extra case of UB to track.

I place this in a similar position as the "infectious" nature of UnsafeCell mutability: it's very likely we'll adopt the more permissive model, but it's not something we're quite prepared to fully commit to guaranteeing forever quite yet.

(This is my own off the cuff recollection and opinion, not in any way an indication of T-opsem position.)

AlexHYF · July 4, 2024, 3:48am

If I obtain a &mut T by dereferencing the *mut T from UnsafeCell<T>::get, it will not invalidate other &T that I obtained earlier (by dereferencing get()). Is this true?

quinedot · July 4, 2024, 5:05am

No, the &mut T must invalidate the &T because &mut T are always exclusive. UnsafeCell doesn't change that.
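A minimal sketch of an ordering that respects this rule, with the hypothetical name sequential: the &T is completely finished with before the &mut T is created, so the two never overlap.

```rust
use std::cell::UnsafeCell;

// Hypothetical helper: the shared and exclusive references never coexist.
fn sequential() -> i32 {
    let cell = UnsafeCell::new(10);

    // Read through a shared reference, then let it go out of scope.
    let before = {
        let shared: &i32 = unsafe { &*cell.get() };
        *shared
    };

    // Only now create the exclusive reference.
    let exclusive: &mut i32 = unsafe { &mut *cell.get() };
    *exclusive += 5;

    before + *exclusive
}
```

Reading through the earlier &T after the write via the &mut T would be the invalid pattern described above.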

paramagnetic · July 4, 2024, 5:27am

Nothing I was claiming is wrong. I was claiming exactly what my code demonstrates. Do not put words in my mouth.

Now this is wrong: my code demonstrates shared mutability, which is another name for interior mutability. The "just side-stepping aliasing requirements" is interior mutability.

You are trying too hard.

AlexHYF · July 4, 2024, 5:36am

Ohhh, I think I get it. Basically, the moment you cast a pointer to a reference, all the aliasing rules for references kick in. But if we use the pointer exclusively, then that's okay. So I think this is basically because the compiler makes a lot of assumptions about references but not about pointers. In some sense my second example delayed the borrow of x. In the first example, x is borrowed mutably when y is created. But in the second example it is not borrowed until the closure is executed.

Correct me if I am wrong.

zirconium-n · July 4, 2024, 6:23am

Yes, exactly. References carry extra assumptions. It's just a bit too easy to accidentally create a reference. So you have to be careful.

quinedot · July 4, 2024, 7:06am

Pointers have far fewer requirements than references, yes. But pointers still have provenance, so there are still some requirements.

Miri rejected your first example because it considered your second &mut x to invalidate the first &mut x and the *mut you cast it to (y). If you just use the same *mut in both places, it's accepted. [1]
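A sketch of that variant, assuming it looks roughly like this (same_pointer is a hypothetical name, and the value is returned instead of printed): the single *mut created from the one &mut x is used for both the write and the read, and the closure is a move closure so that it captures a copy of y.

```rust
// Hypothetical helper: one `&mut x`, one raw pointer, used in both places.
fn same_pointer() -> i32 {
    let mut x = 10;
    let y = &mut x as *mut i32;
    // `move` copies `y` into the closure.
    let closure = move || unsafe { *y };
    // Write through the same pointer; no second `&mut x` is created.
    unsafe { *y = 20; }
    closure()
}
```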

In the second example, due to UnsafeCell "magic", all *mut _ obtained by .get() can be dereferenced so long as the &UnsafeCell they were produced from is valid (and other UB such as data races are avoided).

Pedantic nit: It's not the compiler making assumptions per se, it's that UB is defined at the language level. The compiler can then rely on the language definition.

[1] I made the closure a move closure to force a copy of y to be captured by the closure.

