UnsafeCell: what exactly does it do, and when do I need it?


#1

Reading the documentation for UnsafeCell has left me unsure what it does and what I’m allowed to do with pointers that I get from references.

If I have an UnsafeCell<T>, and T has boxes or other memory behind a reference, does UnsafeCell give me guarantees about that memory, or just the stuff immediately inside T?

In particular, if I have for some reason an UnsafeCell<&'a mut T>, is it OK to modify the T through the pointer I get out of the UnsafeCell?

When is it OK to use pointers I get out of references? Can I do this:

let x = 0u8;
let p: *const u8 = &x;
println!("{}", unsafe { *p });

What about this:

fn f(x: *mut u8) {
    unsafe { *x = 1 };
}
let mut x = 0u8;
f(&mut x);
println!("{}", x);

#2

UnsafeCell is basically an optimization barrier to the compiler – rustc cannot assume anything about aliasing (or lack thereof) of the data within the cell. In turn, once you pull & or &mut references out of it, it’s your responsibility to ensure that the aliasing guarantees are upheld.

As mentioned above, you don’t typically have a reference inside the cell - you have just the value. If you have &mut T inside the cell, then you’re already guaranteed to be the sole possessor of a reference to that T (unless the mutable reference was obtained unsoundly). UnsafeCell is a building block for handing out &T or &mut T - it doesn’t quite make sense (I think) to already have mutable or immutable references inside it.

The aliasing guarantees apply to references - raw pointers by themselves don’t carry any, AFAIK. What you cannot do is fabricate, for example, 2+ &mut references to the same location and write through them, or fabricate immutable references while a mutable reference exists - Rust assumes no aliasing for a &mut reference, and these would break that. Accesses through a raw pointer still must not violate the guarantees of any reference that is live at the time.
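To make the raw pointer point concrete, here is a minimal sketch (the name `bump_twice` is made up for illustration). Both raw pointers come from the same &mut, and no reference is used while they are live, so the two writes don’t conflict with any aliasing guarantee:

```rust
// Hypothetical helper: two raw pointers to the same byte, used alternately.
fn bump_twice(x: &mut u8) -> u8 {
    let p1: *mut u8 = x; // derived from the unique reference
    let p2: *mut u8 = p1; // raw pointers are Copy and are allowed to alias
    unsafe {
        *p1 += 1;
        *p2 += 1;
        *p1
    }
}

fn main() {
    let mut x = 0u8;
    assert_eq!(bump_twice(&mut x), 2);
    println!("{}", x);
}
```

Doing the same with two &mut references to `x` instead of two raw pointers is exactly the kind of fabrication that is not allowed.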

So UnsafeCell<T> is a low-level building block for manufacturing valid &T and &mut T references where you take care of ensuring the aliasing guarantees are upheld.
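As a rough illustration of "building block", here is a sketch of a tiny cell for Copy types, similar in spirit to std’s Cell. `MyCell` is a hypothetical name, and this is not how the standard library is actually written. It never hands out references to its contents, which is what keeps the unsafe blocks sound:

```rust
use std::cell::UnsafeCell;

// Hypothetical minimal cell for `Copy` values.
struct MyCell<T: Copy> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: MyCell is !Sync (UnsafeCell is), and no reference into the
        // cell is ever handed out, so nothing can alias this read.
        unsafe { *self.value.get() }
    }

    fn set(&self, new: T) {
        // SAFETY: same reasoning as in `get`.
        unsafe { *self.value.get() = new };
    }
}

fn main() {
    let c = MyCell::new(1u8);
    let (a, b) = (&c, &c); // two shared references to the same cell
    a.set(5);
    assert_eq!(b.get(), 5);
}
```

Note that `set` takes &self, not &mut self: that is the interior mutability UnsafeCell makes possible.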


#3

Thanks again.

Right, that was meant to be a simple example to illustrate a question rather than something specific I had a burning desire to do. What about this then:

use std::cell::UnsafeCell;

struct S(Vec<u8>);
let cell = UnsafeCell::new(S(vec![0u8; 18]));
let ptr = cell.get();
unsafe { &mut *ptr }.0[0] = 12;

Is this allowed, since the memory I’m changing is not in the S struct itself?

So then my example is acceptable? What prompts the question is that in the body of the function f, I can mutate the object, but there’s no mutable reference present… What’s to stop the compiler from doing something weird and letting an optimization screw up what my pointer is pointing at? Isn’t this what UnsafeCell is supposed to be needed for?


#4

Most code should be using RefCell (or Cell), both of which use UnsafeCell internally. The reason not to use the safe wrappers is to avoid the space/performance penalty they add.

On its own, your 4-line example does nothing wrong.
A simple example that would break the aliasing rules:

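let r1 = unsafe { &mut *ptr }; // exclusive reference into the cell
let r2 = unsafe { &*ptr };     // shared reference to the same data
let val = r2.0[0];
r1.0[0] = 12;                  // r1 and r2 are both in use: aliasing rules broken
let val2 = r2.0[0];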

Alternatively, with RefCell, it would panic (i.e. stay safe) when creating r2.
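A small sketch of the RefCell version, for comparison. The function name is made up, and it uses `try_borrow` so that the conflict shows up as an Err rather than a panic:

```rust
use std::cell::RefCell;

struct S(Vec<u8>);

// Hypothetical demo: returns true if the shared borrow was refused while the
// mutable borrow was alive, and the write is visible afterwards.
fn second_borrow_is_rejected() -> bool {
    let cell = RefCell::new(S(vec![0u8; 18]));
    let mut r1 = cell.borrow_mut();
    // `try_borrow` reports the conflict as an Err; plain `borrow` would panic here.
    let rejected = cell.try_borrow().is_err();
    r1.0[0] = 12;
    drop(r1);
    // Once r1 is gone, a shared borrow succeeds and sees the write.
    let seen = cell.borrow().0[0];
    rejected && seen == 12
}

fn main() {
    assert!(second_borrow_is_rejected());
}
```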


#5

Yes, this is fine. The fact that you’re changing a field of S, rather than S itself, isn’t important - the compiler already can track disjoint borrows of struct fields with normal references; it’s perfectly fine to have 2 mutable borrows at the same time, one pointing at field f1 and another at field f2 of the same struct value. While those mutable borrows exist, the struct itself cannot be borrowed.
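For reference, the disjoint field borrows look like this with ordinary references and no unsafe at all (`Pair` and `bump_both` are hypothetical names for the sketch):

```rust
struct Pair {
    f1: u8,
    f2: u8,
}

fn bump_both(p: &mut Pair) {
    let a = &mut p.f1; // mutable borrow of field f1
    let b = &mut p.f2; // second mutable borrow, but of a different field
    *a += 1;
    *b += 10;
}

fn main() {
    let mut p = Pair { f1: 1, f2: 2 };
    bump_both(&mut p);
    assert_eq!((p.f1, p.f2), (2, 12));
}
```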

What is important in this example is that nothing aliases the data behind ptr - you create a single mutable reference from it.

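The compiler can’t really do anything weird here. You can mutate through a raw *mut pointer - that’s totally valid, as long as the pointer was derived from something you’re allowed to mutate through (a &mut in your example).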

UnsafeCell is for building interior mutability. The classic example is RefCell<T>. A RefCell allows you to do interior mutation, e.g.:

use std::cell::RefCell;

struct S(RefCell<Vec<i32>>);

impl S {
    fn mutate_via_shared_borrow(&self) {
        // note the caller only needs a shared reference, which means we can be aliased multiple times
        let mut_vec = &mut *self.0.borrow_mut();
        // we now hold a mutable reference, but `borrow_mut` verifies the soundness of this
        mut_vec.push(1); // mutate the vec (push is just an example mutation)
    }
}

Internally, RefCell<T> has an UnsafeCell<T> along with a flag indicating the borrow status. So when you call borrow() or borrow_mut() on it, it internally checks this flag to dynamically enforce that it’s not about to hand out an invalid borrow while there’s another borrow (or multiple, in the case of immutable references) outstanding.
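To give a feel for the "UnsafeCell plus flag" idea, here is a heavily simplified sketch. `MiniRefCell` and `try_with_mut` are made-up names; the real RefCell also counts shared borrows and returns guard objects, whereas this one only supports exclusive access through a closure to stay short:

```rust
use std::cell::{Cell, UnsafeCell};

// Hypothetical stand-in for RefCell: exclusive borrows only.
struct MiniRefCell<T> {
    borrowed: Cell<bool>, // the dynamic borrow flag
    value: UnsafeCell<T>,
}

impl<T> MiniRefCell<T> {
    fn new(value: T) -> Self {
        MiniRefCell { borrowed: Cell::new(false), value: UnsafeCell::new(value) }
    }

    // Runs `f` with exclusive access, or returns None if already borrowed.
    fn try_with_mut<R>(&self, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        if self.borrowed.get() {
            return None;
        }
        self.borrowed.set(true);
        // SAFETY: the flag guarantees no other reference into the cell is
        // live, and the &mut cannot escape the closure.
        let result = f(unsafe { &mut *self.value.get() });
        self.borrowed.set(false);
        Some(result)
    }
}

fn main() {
    let c = MiniRefCell::new(vec![1, 2]);
    assert_eq!(c.try_with_mut(|v| { v.push(3); v.len() }), Some(3));
    // A nested borrow attempt is refused instead of creating a second &mut.
    assert_eq!(c.try_with_mut(|_| c.try_with_mut(|v| v.len())), Some(None));
}
```

If `f` panics, the flag stays set and later calls return None, which is conservative but still safe.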

So why is UnsafeCell needed here and what’s special about it? The compiler sees a &S borrow when someone calls mutate_via_shared_borrow; ordinarily, it can assume that no mutation will occur behind a shared reference. That assumption doesn’t hold here, of course. UnsafeCell is the only primitive known to the compiler that tells it not to assume anything about aliasing inside the cell.
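A small sketch of what that buys you, with made-up function names (`read_around`, `demo`). The value is read, some other code runs, and the value is read again, all through a shared reference to the cell:

```rust
use std::cell::UnsafeCell;

// Reads the cell, runs `f`, then reads again. Because the data sits in an
// UnsafeCell, the compiler has to assume `f` may have changed it. With a
// plain `&u8` parameter it would be allowed to reuse the first value, and
// writing behind that reference would be undefined behavior.
fn read_around(x: &UnsafeCell<u8>, f: impl Fn()) -> (u8, u8) {
    // SAFETY: no reference to the inner u8 is created here; both reads go
    // through the raw pointer returned by `get`.
    let before = unsafe { *x.get() };
    f();
    let after = unsafe { *x.get() };
    (before, after)
}

fn demo() -> (u8, u8) {
    let cell = UnsafeCell::new(0u8);
    // The closure only captures `&cell`, yet it can still write.
    let result = read_around(&cell, || unsafe { *cell.get() = 1 });
    result
}

fn main() {
    assert_eq!(demo(), (0, 1));
}
```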

Hope that makes sense - this is somewhat of a subtle topic.