Why rust Arc has a weak reference

I know that Rust releases memory when the strong reference count is 0, and weak references are used to solve the problem of circular references. However, I am not sure how it solves the issue of circular references. Could you provide more detailed information on this?

It doesn't solve circular references. You will get a memory leak if you create a circular reference and don't break it manually.

It doesn't exactly solve the problem of circular references, but instead it's a tool that you can use to help take care of the problem. For the sake of illustration, consider a doubly-linked listĀ¹:

struct Node<T> {
    prev: Option<Arc<Node<T>>>,
    next: Option<Arc<Node<T>>>,
    data: T
}

You're likely to end up with the in-memory state something like this:

local variable ----->  Node 1  ----->  Node 2  ----->  Node 3
                      strong 2 <----- strong 2 <----- strong 1

Now, when the local goes out of scope, it reduces the strong cound of node 1, leaving things like this:

                       Node 1  ----->  Node 2  ----->  Node 3
                      strong 1 <----- strong 2 <----- strong 1

Because none of the strong counts have reached zero, these nodes are never dropped but are unreachable. You have a memory leak.


You can fix this particular problem by making prev a weak reference instead:

struct Node<T> {
    prev: Option<Weak<Node<T>>>,
    next: Option<Arc<Node<T>>>,
    data: T
}
local variable ----->  Node 1  ----->  Node 2  ----->  Node 3
                      strong 1 <- - - strong 1 <- - - strong 1

Now, when the local variable is dropped, the strong count for Node 1 does go to zero:

                       Node 1  ----->  Node 2  ----->  Node 3
                      strong 0 <- - - strong 1 <- - - strong 1

Because it's now zero, Node 1 gets dropped reducing the strong count for Node 2:

                                       Node 2  ----->  Node 3
                                      strong 0 <- - - strong 1

This is now zero, so it also gets dropped, and the process repeats along the entire chain of nodes (regardless of its length):

                                                       Node 3
                                                      strong 0
// Nothing left

Ā¹ Note that I've simplified this example for illustrative purposes to the point that it is very difficult (or impossible) to construct the list. In practice, you'll need some kind of interior mutability as well.

14 Likes

I made a playground where you can try it out for yourself. (I used Rc's Weak, but it's the same for Arc's Weak references.)

#![allow(dead_code, unused_imports)]

use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct Node {
    name: String,
    child: Option<Rc<RefCell<Node>>>,
    parent: Option<Weak<RefCell<Node>>>, // comment me
    // parent: Option<Rc<RefCell<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        println!("{}", self.name);
    }
}

fn main() {
    let root = Rc::new(RefCell::new(Node {
        name: "root".to_owned(),
        child: None,
        parent: None,
    }));
    
    let child = Rc::new(RefCell::new(Node {
        name: "child".to_owned(),
        child: None,
        parent: Some(Rc::downgrade(&root.clone())), // comment me
        // parent: Some(root.clone()),
    }));

    root.borrow_mut().child = Some(child);
}

If you comment the lines marked with // comment me and uncomment the commented lines you will see that the Nodes never get dropped (nothing is printed to stdout).

7 Likes

That's a pretty neat demonstration of the idea.

That said, I would like to caution against linked lists in general, as they're pretty cache-inefficient and as such cause more round-trips to RAM, with the corresponding performance penalty for the cache misses.
This didn't used to be much of an issue, but as CPUs have gotten faster relative to RAM in the last decade and a half, in 2024 it would really be wasting a lot of available performance relative to just using a Vec<T>.

2 Likes

I agree; their primary utility is pedagogical as the simplest data structure that forces you to understand how pointers/references behave. In an actual program, you should almost always be using something else (though intrusive linked lists still have some niche applications).

4 Likes

Indeed they do, though in my experience, generally those use cases aren't too cache-sensitive so that works out nicely.

1 Like

Linked lists certainly shouldn't be used as much as they were in the past due to cache-inefficiency. They shouldn't be the first choice when a cache-efficient structure can be used instead.

But I end up using them fairly often because I write in no_std (and no alloc) and do most of my own memory management. So for me slab allocation is common and that usually has a linked list of free entries. Linked lists are also commonly used in the OS for the same reasons. And Rust is particularly good for such lower level work.

But even at a "higher level" I think a couple very useful Rust data structures are slabs and slotmaps, both of which are slab-like and have linked lists of free entries (here and here).

I guess you could say this sort of usage is niche but it is also pretty important.

I just realized that because these linked lists use indexes instead of references, they don't apply to the general topic of Arc and Weak. I never use references for linked lists because I can save space by using a smaller integer. Sorry for going off topic.

local variable -----> Node 1 -----> Node 2 -----> Node 3
strong 2 <----- strong 2 <----- strong 1

likew this .I don't know how these numbers corresponding to "strong" are obtained. And I don't seem to understand the purpose of weak references either.

When you make a new Arc, it allocates memory for the contents and sets the initial strong count to 1. If you then clone the Arc, no new memory is allocated and the strong count is incremented instead. In essence, then, the strong count is the number of live (not yet dropped) clones of the Arc<Node>. In these diagrams, it's the number of solid (non-dashed) arrows pointing at the node.

The collection algorithm for Arc is very dumb and completely localā€” It can't see that there's a group of objects that have become unreachable, only that there's still a copy of the original Arc that hasn't been dropped yet.

Weak (the dashed arrows) exists so that you can hold a reference without having a full-fledged Arc that will prevent the target from being dropped. When used strategically, this can let the collection algorithm roll up and drop large groups of objects when they become unreachable.

1 Like
fn main() {
//  when create the rc ,the strong reference is 1 .
    let root = Rc::new(RefCell::new(Node {
        name: "root".to_owned(),
        child: None,
        parent: None,
    }));
    // the child strong is 1  ,weak is 0 ??
    let child = Rc::new(RefCell::new(Node {
        name: "child".to_owned(),
        child: None,
        // execute root.clone ,the root variable strong reference add 1  , then the result is 2 
        parent: Some(Rc::downgrade(&root.clone())), // comment me
        // parent: Some(root.clone()),
    }));
    
    root.borrow_mut().child = Some(child);
    // when child leave the scope . the ???
    // when root leave the scope . the strong is sub 1 . result is 2-1=1 
    
    
}

look my comment . I don't how to analyse. help me analyse

The key point that you're missing here is that the Rc returned from root.clone() is a temporary value that is not stored anywhere, so it's dropped at the end of the statement. This returns the strong reference count to 1.


Here, child is moved and is not droppedā€” it's instead stored inside root.child and will be dropped when the root node is dropped (after its strong reference count reaches zero).


As I mentioned above, the strong count at this point is 1 (not 2). So, at the end of the function:

  1. The child variable would normally be dropped, but it no longer contains anything due to the move above. This does nothing.
  2. The root variable is dropped, which contains an Rc<Node> with strong count of 1. This reduces the strong count to zero, which drops the contained Node structure.
  3. This, in turn drops all of its fields. One of which is child which also contains an Rc<Node> (pointing to the child node) with a strong count of 1.
  4. The child node's strong count is now zero, so its Node structure is also dropped along with all of its fields.

I don't understand, doesn't cloning increase the strong reference count? And I also can't see that the cloned data of the root is dropped when passed into the downgrade function.

It's not; the drop happens at the end of the statement because the clone is a temporary value that was never stored anywhere. This code is equivalent:

     // Increase strong count from 1 to 2
    let temporary_root = root.clone();
    let child = Rc::new(RefCell::new(Node {
        name: "child".to_owned(),
        child: None,
        // No change to strong count here
        parent: Some(Rc::downgrade(&temporary_root)), 
    }));
    // Drop the temporary, reducing the strong count from 2 to 1
    std::mem::drop(temporary_root);

Alternatively, you can remove the clone entirely to simplify your analysis. This code also works:

    let child = Rc::new(RefCell::new(Node {
        name: "child".to_owned(),
        child: None,
        parent: Some(Rc::downgrade(&root)), // No change to strong count
    }));
1 Like

So the purpose of weak references is to refer back to objects in a cyclic reference without increasing the strong reference count. This ensures that when dropping occurs, the strong reference count can reach zero and the objects can be released. This allows us to use both strong references (such as RC or ARC) to access resources and weak references to access them.

In the code provided by the previous person, there are two cyclic dependencies, and weak references are used to avoid incrementing the reference count. This ensures that the strong reference count can reach zero during the drop process. Without using weak references, the objects would not be released because the strong reference count couldn't reach zero when leaving the scope.

3 Likes
fn main() {
    let root = Rc::new(RefCell::new(Node {
        name: "root".to_owned(),
        child: None,
        parent: None,
    }));
    
    let child = Rc::new(RefCell::new(Node {
        name: "child".to_owned(),
        child: None,
        //parent: Some(Rc::downgrade(&root.clone())), // comment me
        parent: Some(root.clone()),
    }));

    root.borrow_mut().child = Some(child);
}

if the code like this . when the main function is over .whether the child variable is drop ??? i don't konw .

if not ,isn't it the ownership is inthe root field . or ?????

You know because of this Drop implementation that prints something to stdout when a Node is dropped:

When you use Weak this is printed to stdout:

root
child

When you use Rc for the parent reference and you create a cycle, stdout stays empty, because neither the root nor the child node gets dropped.

It's the parent reference in the child node that is causing the cycle. When the root variable is dropped at the end of main, the Rc's strong count is decremented from 2 to 1, as the child node still has a reference to the root node. Because the strong count isn't 0, the root node doesn't get dropped, which means the child node won't get dropped, which means the root node won't get dropped. Boom, cycle.

This is how I understand it: because the root node owns the child, as long as the root node is not dropped, the ownership of the child will also not be dropped. This is because the ownership of the child resides within the root node, and as long as the root node is alive, the child will also remain alive.

Yes. And because the child node also has a strong reference to the root node, the root node will stay alive as long as the child is alive. Both stay alive forever, waiting for the other one to be dropped first.

1 Like

Also, while circular references are the main reason for Weak to exist, it does have other uses. You can use it anywhere you want to keep a reference to something without necessarily forcing it to stay in memory.

Here's a (slightly contrivedĀ¹) example of a size-limited cache for function call results: map holds Weaks so that items can be evicted from slots directly without looking up the old item.

#[derive(Debug)]
pub struct CacheItem<T> {
    accessed: Cell<usize>,
    data: T,
}

impl<T> std::ops::Deref for CacheItem<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.data
    }
}

pub struct Cached<F, I, O> {
    func: F,
    t: usize,
    map: HashMap<I, Weak<CacheItem<O>>>,
    slots: Vec<Rc<CacheItem<O>>>,
}

impl<F, I, O> Cached<F, I, O>
where
    F: Fn(I) -> O,
    I: Hash + Eq + Clone,
{
    pub fn new(func: F, cap: usize) -> Self {
        Cached {
            func,
            t: 0,
            map: HashMap::new(),
            slots: Vec::with_capacity(cap),
        }
    }

    pub fn get(&mut self, i: I) -> Rc<CacheItem<O>> {
        self.t += 1;

        // Check for valid cache
        if let Some(weak) = self.map.get(&i) {
            if let Some(item) = weak.upgrade() {
                item.accessed.set(self.t);
                return item; // Use previous value
            }
        }

        // Not in cache; recalculate
        let item = Rc::new(CacheItem {
            accessed: Cell::new(self.t),
            data: (self.func)(i.clone()),
        });
        self.map.insert(i, Rc::downgrade(&item));

        // Now, figure out where to put it...
        if self.slots.len() < self.slots.capacity() {
            // We have room left, just add it to the vec
            self.slots.push(item.clone());
        } else {
            // Pick two random slots, and evict the older one
            let mut rng = rand::thread_rng();
            let a = rng.gen_range(0..self.slots.len());
            let b = rng.gen_range(0..self.slots.len());
            if self.slots[a].accessed.get() < self.slots[b].accessed.get() {
                self.slots[a] = item.clone();
            } else {
                self.slots[b] = item.clone();
            }
        }

        item
    }

    pub fn dbg_slots(&self)
    where
        O: Debug,
    {
        eprintln!(
            "slots: {:?}",
            self.slots
                .iter()
                .map(|x| (x.accessed.get(), &x.data))
                .collect::<Vec<_>>()
        );
    }
}

Ā¹ There are probably better ways to write this that don't involve Rc/Weak at all.

2 Likes