How is it unsafe to use Rc in multithreads program?

I am thinking about the execution orders with Rc pointer, so when we clone, it derefereces into RcBox. Then the nonatomic strong reference increases, and creates a new Rc pointer as cloned.

However, easily inceasing for cloning and decreasing for dropping seems like that Sync is not necessary.
Is there an actual example for it to show why do we needs Arc?

We need Arc because Rc doesn't atomically update its counter. If two threads simultaneously write to the counter, we have a problem. Or if one thread reads the value while another thread writes to it, the result will likely be garbage.

1 Like

Do you mean that writing to one number simultaneously, like from 35, when two threads increase it by 1 may cause 36 as the result?

For example. Or in a less thread-safe language, have you ever tried writing from one thread and reading from another? Sometimes the result doesn't even resemble something close to 30 at all but are just random 64 bits

I see, thanks for your patience!

1 Like

As for the severity, data races are UB.

2 Likes

I went digging in the code to support @jofas' answer.

The documentation for Rc claims only that it "uses non-atomic reference counting." The implementation, however, is more specific, and (as of this writing) uses usize::wrapping_add and <usize as std::op::Sub>::sub to manipulate the reference count, combined with a read (to get the current value) and a write (to store the incremented or decremented value). These are primitive arithmetic and storage operations, and, critically, do not have well-defined rules for thread visibility on their own.

It's tempting to think that given the series of operations

  • read self.strong,
  • add one to the result, and
  • store the result in self.strong,

that another thread can only intervene between those three steps. Even under that simple model, Rc could misbehave if two threads try to increment the reference count simultaneously:

  • [thread A] read self.strong, obtaining 1
  • [thread B] read self.strong, obtaining 1
  • [thread B] add one, obtaining 2
  • [thread B] store the result 2 in self.strong
  • [thread A] add one, obtaining 2
  • [thread A] store the result 2 in self.strong

In spite of two increments, the reference count would go from 1 to 2, and not 1 to 3.

On real systems, the range of behaviours is not this narrowly-specified. In the absence of a memory barrier (which Rc does not contain), the values read and stored may not be synched to memory from the core's cache for a very long time, leading to a lot of uncoordinated increments and decrements, most of which are working with stale values. Given an initial strong count of one, the final result can be nearly any number at all - including, possibly, very large ones, much larger than the number of clones performed, and including results much smaller than the number of clones performed.

Arc, on the other hand, uses std::atomic::usize, which provides access to arithmetic operations that also synchronize the results between threads. This is, in most cases, incrementally slower, but ensures that an Arc shared between threads maintains a coherent reference count.

3 Likes

As a big fan of writing code examples showing undeniable UB whenever possible (as opposed to UB only reported by checkers like miri) let’s make some practical examples.

We'll start with breaking the soundness by providing Sync:

struct MakeMyRcSync(Rc<String>);

unsafe impl Sync for MakeMyRcSync {}

and the goal is to use this for undefined behavior. As for undefined behavior, I particularly like use-after-free of Strings, because that prints such nice and obvious garbage data. Look at this:

fn main() {
    let x = String::from("Hello World!");
    let r: &str = unsafe {
        &*(x.as_str() as *const str)
    };

    println!("Here's the string: {r}");
    drop(x);
    println!("Now it's garbage: {r}");
}
Here's the string: Hello World!
Now it's garbage: 9��X2tX

(exact output of garbage data varies)


Now let’s create this using only safe code and the unsound MakeMyRcSync wrapper.

As mentioned above by previous commenters, it should be possible to get an Rc with too low a reference counter by concurrent cloning. Let’s try to reproduce this with two threads running in a loop.

fn main() {
    let mut rcs: Vec<Rc<String>> = vec![Rc::new(String::from("Hello World!"))];
    // goal: achieve that strong_count < rcs.len()
    while Rc::strong_count(&rcs[0]) >= rcs.len() {
        let sync = MakeMyRcSync(rcs[0].clone());
        let (done_tx, done_rx) = mpsc::channel();
        thread::scope(|s| {
            s.spawn(|| {
                let sync = &sync;
                let mut v = vec![];
                for _ in 1..1000 {
                    v.push(sync.0.clone())
                }
                { done_rx }.recv().unwrap();
                drop(v);
            });
            for _ in 1..1000 {
                rcs.push(sync.0.clone())
            }
            done_tx.send(()).unwrap();
        });
    }
    dbg!(Rc::strong_count(&rcs[0]), rcs.len());
}

Rust Playground

[src/main.rs:29:5] Rc::strong_count(&rcs[0]) = 8492
[src/main.rs:29:5] rcs.len() = 8992

(exact output of reference counts varies)

as we can see, that worked! Let’s now turn this into the use-after-free of Strings from above.
The obvious approach to get there is to let the count drop to zero while still having access through one of the rcs:

dbg!(Rc::strong_count(&rcs[0]), rcs.len());
for _ in 1..Rc::strong_count(&rcs[0]) {
    rcs.pop();
}
assert_eq!(Rc::strong_count(&rcs[0]), 1);

let x = rcs.pop();

let r: &str = &rcs[0];

println!("Here's the string: {r}");
drop(x);
println!("Now it's garbage: {r}");

Rust Playground

[src/main.rs:29:5] Rc::strong_count(&rcs[0]) = 4584
[src/main.rs:29:5] rcs.len() = 4996
Here's the string: Hello World!
Now it's garbage: �▒<Y�n�!

Another approach to unsoundness would involve the capabilities of Rc with strong-count of 1 to provide mutable access to the contained element through Rc::get_mut:

assert_eq!(Rc::strong_count(&rcs[0]), 1);

let mut rc = rcs.pop().unwrap();

let r: &str = &rcs[0];

let x: String = mem::take(Rc::get_mut(&mut rc).unwrap());

println!("Here's the string: {r}");
drop(x);
println!("Now it's garbage: {r}");
15 Likes

It is so cool, thanks for your explanation!