Confused between Box, Rc, Cell, Arc

Hi all,

Despite having read articles on each, I'm still confused. Could someone explain their differences here or point me to a link that does, and if possible compare them to C?

I know Box is like a malloc(), but why use Rc instead, etc.?

Thanks a lot.

5 Likes

Shortest explanation:

  • Box<T> is for single ownership.
  • Rc<T> is for multiple ownership.
  • Arc<T> is for multiple ownership, but threadsafe.
  • Cell<T> is for "interior mutability" for Copy types; that is, when you need to mutate something behind a &T.
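To make the four bullets concrete, here is a minimal sketch with one small function per type. The function names (boxed_sum, rc_owner_count, arc_sum_on_thread, cell_bump) are made up for illustration; only the standard library types are real.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Box<T>: a single owner; the heap allocation is freed when `b` goes out of scope.
fn boxed_sum() -> i32 {
    let b: Box<i32> = Box::new(40);
    *b + 2
}

// Rc<T>: several owners on one thread; returns the owner count after one clone.
fn rc_owner_count() -> usize {
    let first = Rc::new(String::from("shared"));
    let second = Rc::clone(&first); // bumps the reference count, no deep copy
    assert_eq!(*first, *second);
    Rc::strong_count(&first)
}

// Arc<T>: like Rc, but the count is atomic, so a clone can move to another thread.
fn arc_sum_on_thread() -> i32 {
    let data = Arc::new(vec![1, 2, 3]);
    let for_worker = Arc::clone(&data);
    let handle = thread::spawn(move || for_worker.iter().sum::<i32>());
    handle.join().unwrap()
}

// Cell<T>: mutate a Copy value through a shared reference (&Cell<u32>, not &mut).
fn cell_bump(counter: &Cell<u32>) -> u32 {
    counter.set(counter.get() + 1);
    counter.get()
}

fn main() {
    println!("{}", boxed_sum());
    println!("{}", rc_owner_count());
    println!("{}", arc_sum_on_thread());
    println!("{}", cell_bump(&Cell::new(0)));
}
```

Note that none of the first three gives you mutation of a shared value on its own; that is the separate job Cell (or RefCell) does.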

If you share which ones you've read, that'd be helpful so we don't send them to you again :smile:

30 Likes

Isn't Cell kind of unsafe then?

Thanks for answering so quickly :slight_smile:

Mainly here: Wrapper Types in Rust: Choosing Your Guarantees - In Pursuit of Laziness

but not crystal clear to me.

1 Like

In what sense?

1 Like

Are there more specific questions that you have? Perhaps you could elaborate a bit on what's confusing. These types can be described in general, as @steveklabnik did above, but that's perhaps part of the issue here - you'd like more specifics on some (all) of them?

It's hard to compare these to C since these are all smart pointer types, and C doesn't have such a thing :slight_smile:. The closest analog would be C++, but I don't know if that will help you.

1 Like

@vitalyd, thanks a lot, I'll read this article again. Actually I have more experience in D, where all classes are references and structs are copied. Not really the same here...

So in Rust values (enums, tuples, structs, etc) are on-stack/inlined (into parent value) and moved, by default. If you want/need to put something on the heap, then that's where Box, Rc, Arc come into play. To understand these, it's important to understand the ownership model in Rust.

Cell is really unrelated to those 3, and is mostly a way to "cope" with the fact that Rust does not have field-level mutability specifiers, as is common in other languages.

2 Likes

It's not; the only things that are unsafe are what are marked as such. It is built on top of unsafe code, but that code is safe to use. The reason it's okay is interesting, actually; there's an unsafe primitive, UnsafeCell<T>, and Cell<T> and RefCell<T> are layered on top to make it safe.
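As a rough illustration of that layering, here is a simplified sketch of how a Cell-like type could be built on UnsafeCell. MyCell is a hypothetical name, and this is not the actual standard library source; it only shows the idea of a safe API wrapping an unsafe primitive.

```rust
use std::cell::UnsafeCell;

// Hypothetical `MyCell`: a simplified stand-in for std's Cell<T>, for illustration only.
pub struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    pub fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    // Copies the value out; no reference to the interior ever escapes.
    pub fn get(&self) -> T {
        // SAFETY: UnsafeCell is !Sync, so MyCell cannot be shared across threads,
        // and no reference into the cell exists that this read could invalidate.
        unsafe { *self.value.get() }
    }

    // Overwrites the value through a shared reference.
    pub fn set(&self, new: T) {
        // SAFETY: same reasoning as in `get`; T: Copy means there is no destructor to run.
        unsafe { *self.value.get() = new; }
    }
}

fn main() {
    let c = MyCell::new(1u8);
    let alias = &c; // a second shared reference to the same cell
    alias.set(7);
    println!("{}", c.get());
}
```

The unsafe blocks are confined to the implementation; callers only ever see the safe get and set methods.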

4 Likes

Ok, this is clearer. So if Arc<T> is for multiple ownership but thread safe, is it possible to share a vector of structs between threads, having one thread mutating indices from 0 to n/2 and the other one from n/2+1 to n (if n is the vector length)?

This is one of my next challenges, but I'll create another subject for that.

2 Likes

For vectors there's split_at_mut(), which will give you two slices that can be mutated independently (although I haven't used it across threads). There's also the Rayon crate, which does this automagically.

2 Likes

But aren't you circumventing the borrow checker if you can have multiple (externally) immutable references that can modify the same value?

Cell does not allow you to hold references to the inner data.

You are circumventing that, but that's its raison d'etre :slight_smile:. As @steveklabnik mentioned, it uses UnsafeCell internally, which rustc understands as an optimization barrier (e.g. it's not safe to hoist a read). So there's no safety issue there.

4 Likes

Arc is for multiple ownership (across threads), but that's really talking about lifetime of the underlying value - once the last reference drops, the value is destroyed. Arc does not allow mutable access to the underlying value - it's only about cross-thread liveness. As the docs for Arc state:

In your description above, I'm not sure you need Arc. Will your main thread (or whichever one forks the work off) join (i.e. wait) for the workers to complete working on the structs? If so, you don't need Arc. You would split_at_mut, as @kornel mentioned, and then move the two (disjoint) mutable halves to the two workers; once the workers exit (assuming you're joining on them), those mutable borrows expire and your main thread "regains" full control over the Vec.
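A minimal sketch of that approach, assuming a reasonably recent toolchain where std::thread::scope is available (it handles the joining for you). The helper name process_halves and the work done on each half are made up for illustration.

```rust
use std::thread;

// Hypothetical helper: doubles the first half on one thread and negates
// the second half on another, with no Arc involved.
fn process_halves(data: &mut [i32]) {
    let mid = data.len() / 2;
    // Two disjoint mutable slices borrowed from the same buffer.
    let (left, right) = data.split_at_mut(mid);
    thread::scope(|s| {
        s.spawn(move || left.iter_mut().for_each(|x| *x *= 2));
        s.spawn(move || right.iter_mut().for_each(|x| *x = -*x));
    }); // both workers are joined here, so the mutable borrows end
}

fn main() {
    let mut v = vec![1, 2, 3, 4];
    process_halves(&mut v);
    // The main thread has full control over the Vec again.
    println!("{:?}", v);
}
```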

3 Likes

What do you mean by "hoist a read"? How does this relate to the borrow checker's prevention of taking multiple mutable references?

I'm surprised @steveklabnik hasn't linked to these chapters in the second edition of the book yet:

2 Likes

In this code:

fn foo(a: &mut u8, b: &u8) -> u8 {
    *a = 5;          //
    let answer = *b; // compiler can reorder these
    answer
}

The compiler can freely reorder the two annotated lines: it can hoist the read before the write, because a is an exclusive (&mut) reference and b is a shared (&), and therefore immutable, one, so the two cannot point at the same value. But consider this:

use std::cell::Cell;

fn foo(a: &Cell<u8>, b: &Cell<u8>) -> u8 {
    a.set(5);             //
    let answer = b.get(); // compiler can't reorder these
    answer
}

In this case, the Cell<u8> is built on UnsafeCell, so it contains one. And the compiler treats types containing UnsafeCell in a special way. For them, shared (&) does not imply immutable, as opposed to the usual types such as u8, String or Vec<char>, for which shared always means immutable. So in this (Cell) case, the compiler has less room for optimizations such as reordering statements or "memoizing" expressions.
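A small runnable sketch of why the order matters here: with Cell, both parameters are allowed to point at the same value, so the write through a can be observed by the read through b. The variable names shared and other are just for illustration.

```rust
use std::cell::Cell;

fn foo(a: &Cell<u8>, b: &Cell<u8>) -> u8 {
    a.set(5);
    b.get()
}

fn main() {
    let shared = Cell::new(1);
    // Both arguments alias the same Cell, which is legal for shared references.
    // If the read were hoisted above the write, this would see the old value.
    println!("{}", foo(&shared, &shared));

    let other = Cell::new(9);
    // Distinct cells: the write to `shared` does not affect `other`.
    println!("{}", foo(&shared, &other));
}
```

With the earlier &mut u8 / &u8 version, the first call would not even compile, since the borrow checker rejects taking &mut and & to the same value at once.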

4 Likes

Suppose you have the following:

struct Foo {
    x: u32,
}

fn loop_on_foo(f: &Foo) {
    for i in 0..100 {
        call_some_func_taking_foo(f);
        call_another_func(f.x);
    }
}

The code, as written, reads f.x on each iteration. However, given f is borrowed immutably, the compiler knows f.x cannot be modified across iterations. It'll then "hoist" the read out of the loop - remove the field read from each loop iteration. "Hoisting" is commonly used to refer to "hoisting" operations out of a loop (i.e. pulling it out of the loop body), but I was referring to this facility in general - avoiding re-reads of the field, inside a loop or not.

Now, if you change x: u32 to x: Cell<u32>, it's no longer safe to assume that the value doesn't change - in the example above, call_some_func_taking_foo could mutate the value (assume the compiler doesn't inline it for whatever reason). The UnsafeCell inside Cell prevents the compiler from doing the above optimization.
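Here is a runnable sketch of the Cell variant. The function bump is a made-up stand-in for call_some_func_taking_foo, and sum_over_iterations is a hypothetical replacement for the loop; the point is only that the field really does change between iterations even though f is a shared reference.

```rust
use std::cell::Cell;

struct Foo {
    x: Cell<u32>,
}

// Stand-in for call_some_func_taking_foo: mutates through a shared reference.
fn bump(f: &Foo) {
    f.x.set(f.x.get() + 1);
}

// Sums the value of f.x observed on each iteration.
fn sum_over_iterations(f: &Foo, iterations: u32) -> u32 {
    let mut total = 0;
    for _ in 0..iterations {
        bump(f);
        // This read has to happen on every iteration; caching it would be wrong.
        total += f.x.get();
    }
    total
}

fn main() {
    let f = Foo { x: Cell::new(0) };
    println!("{}", sum_over_iterations(&f, 3));
}
```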

So there are no mutable references involved here, and all modification is done via an immutable reference. Yes, that sounds contrary to what Rust tries to prevent, but then you have to be mindful of why it's trying to prevent those cases and consequently, why Cell is safe despite not being updated through a mutable ref.

EDIT: worth reiterating that Cell doesn't allow getting a ref to the value and just copies the bits back and forth (since the value has to be Copy). RefCell, on the other hand, doesn't require Copy and, more importantly, allows getting refs to the underlying value. This cell type is also built on UnsafeCell internally, but it performs dynamic (runtime) borrow checks (since references can be obtained).
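A short sketch of those runtime checks, using RefCell's try_borrow_mut so the conflict shows up as an Err instead of a panic. The helper name second_mut_borrow_fails is made up for illustration.

```rust
use std::cell::RefCell;

// Returns true if a second mutable borrow is refused while the first is alive.
fn second_mut_borrow_fails(cell: &RefCell<Vec<i32>>) -> bool {
    let _first = cell.borrow_mut(); // guard holds the mutable borrow
    cell.try_borrow_mut().is_err() // checked at runtime, not at compile time
}

fn main() {
    let cell = RefCell::new(vec![1, 2]);
    println!("{}", second_mut_borrow_fails(&cell));

    // The earlier guard was dropped when the function returned, so this is fine.
    cell.borrow_mut().push(3);
    println!("{:?}", cell.borrow());
}
```

Calling borrow_mut() instead of try_borrow_mut() for the second borrow would panic at runtime, which is the dynamic counterpart of the compile-time borrow error.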

6 Likes

Ok, so hopefully I'm getting it now: Being able to share data mutably twice is usually dangerous since optimizations could re-order or "hoist" instructions, so the borrow checker forbids it. But since Cell and RefCell wrap UnsafeCell the compiler knows not to do those optimizations so it's safe. Correct?