Why does Cell require Copy instead of Clone?

Cell is a tool to extend the normal Rust rule of sharing == no mutation, but being able to be mutated while shared comes with a trade-off: fewer things can be mutated. Some sort of restriction is necessary, or else one could get dangling pointers caused by doing a mutation that invalidates a reference that has previously been retrieved. Cell is designed to never give any references to its interior (the value that can be mutated while shared) to external code, and thus ensures that there's no risk of invalidation. (RefCell is designed to allow "references", aka Ref and RefMut, but it needs those fancy references so that it can track them.)

The problem with Clone is that it requires &self, that is, Cell would need to pass a reference to its interior into the external/arbitrary/user-defined code of a Clone implementation. Once this reference exists, the Clone implementation could theoretically use another path to the Cell (this is where the sharing comes in) to invalidate it. On the other hand, T: Copy is safe because it means T can be duplicated in a known, controlled way (a byte copy) without having to run any user-defined code, and thus no references to the interior can escape.

Having a concrete example is good:

use std::cell::UnsafeCell;

// version of Cell with Clone
struct Cell<T> {
    data: UnsafeCell<T>
}

impl<T: Clone> Cell<T> {
    fn new(x: T) -> Cell<T> {
        Cell { data: UnsafeCell::new(x) }
    }
    
    fn get(&self) -> T {
        unsafe {(*self.data.get()).clone()}
    }
    fn set(&self, x: T) {
        unsafe {
            *self.data.get() = x;
        }
    }
}


// the "magic" type
struct BadClone<'a> {
    data: i32,
    pointer: &'a Cell<Option<BadClone<'a>>>,
}

impl<'a> Clone for BadClone<'a> {
    fn clone(&self) -> BadClone<'a> {
        // grab a reference to our internals
        let data: &i32 = &self.data;
        // what is it?
        println!("before: {}", *data);
        
        // now clear out the cell we point to...
        self.pointer.set(None); 
        
        // print it again (should be no change!)
        println!("after: {}", *data);

        BadClone { data: self.data, pointer: self.pointer }
    }
}

fn main() {
    let cell = Cell::new(None);
    cell.set(Some(BadClone {
        data: 12345678,
        pointer: &cell,
    }));
    

    cell.get();
}

A reference like &i32 shouldn't change---while it lives the i32 is immutable---meaning printing it multiple times should always be the same (no matter what happens between them), but this code prints:

before: 12345678
after: 0

Whoops!

The problem is that the Clone implementation can kill the exact BadClone value it is being called on, by having a circular references.

This is undefined behaviour, and could easily lead to memory corruption e.g. if data included some pointers, they'll get set to an effectively random value (Option is nice enough to zero in this case, but there's no guarantee about what value will be there), meaning code could segfault---if you're lucky---or end up writing to random parts of memory.

11 Likes