How to return reference to value in Rc or RefCell

Hello,

I am attempting to build a doubly linked list in Rust (I know about the entirely too many linked lists book, but I want to try everything I can to fix errors I encounter before resorting to that book). I have been struggling with this for a while now, so I was wondering how do I return a reference to the data inside an Rc (or RefCell)?

These are the type definitions for my linked list:

use std::{
    cell::RefCell,
    rc::{Rc, Weak},
};

const INDEX_OUT_OF_BOUNDS: &str = "Error: index is out of bounds.";

type StrongListNode<T> = Option<Rc<RefCell<ListNode<T>>>>;
type WeakListNode<T> = Option<Weak<RefCell<ListNode<T>>>>;

pub struct LinkedList<T> {
    head: StrongListNode<T>,
    tail: StrongListNode<T>,
    size: usize,
}

struct ListNode<T> {
    data: T,
    prev: WeakListNode<T>,
    next: StrongListNode<T>,
}

And this is the implementation (currently, I am stuck on implementing the get method):

impl<T> LinkedList<T> {
    pub fn new() -> Self {
        LinkedList {
            head: None,
            tail: None,
            size: 0,
        }
    }

    pub fn get(&self, index: usize) -> Result<&T, &str> {
        if index >= self.size {
            return Err(INDEX_OUT_OF_BOUNDS);
        }

        let mut curr = Rc::clone(self.head.as_ref().unwrap());
        for _ in 0..index {
            let temp = curr;
            curr = Rc::clone(temp.borrow().next.as_ref().unwrap());
        }

        Ok(&curr.borrow().data)
    }
}

These are the compiler errors I get (one for Rc and one for RefCell I assume):

error[E0515]: cannot return value referencing local variable `curr`
  --> src/lib.rs:43:9
   |
43 |         Ok(&curr.borrow().data)
   |         ^^^^-------------^^^^^^
   |         |   |
   |         |   `curr` is borrowed here
   |         returns a value referencing data owned by the current function

error[E0515]: cannot return value referencing temporary value
  --> src/lib.rs:43:9
   |
43 |         Ok(&curr.borrow().data)
   |         ^^^^-------------^^^^^^
   |         |   |
   |         |   temporary value created here
   |         returns a value referencing data owned by the current function

I would greatly appreciate any help/response. I tried testing the reference behavior out with Option<Rc<ListNode<T>>> instead of Option<Rc<RefCell<ListNode<T>>>>, but when I tried to dereference the Rc, I got a cannot move error, which does not make sense because isn't Deref for Rc supposed to return a reference to the data inside the Rc?

    pub fn get(&self, index: usize) -> Result<(), &str> {
        if index >= self.size {
            return Err(INDEX_OUT_OF_BOUNDS);
        }

        let mut curr = Rc::clone(self.head.as_ref().unwrap());
        for _ in 0..index {
            let temp = curr;
            curr = Rc::clone(temp.next.as_ref().unwrap());
        }

        let hello = *curr;
        Ok(())
    }
error[E0507]: cannot move out of an `Rc`
  --> src/lib.rs:43:21
   |
43 |         let hello = *curr;
   |                     ^^^^^
   |                     |
   |                     move occurs because value has type `ListNode<T>`, which does not implement the `Copy` trait
   |                     help: consider borrowing here: `&*curr`
1 Like

You cannot hide a RefCell in API boundaries. If you use a RefCell, your clients have to know about that. You need to return Ref or RefMut guards

This is covered here in the book you mentioned.

3 Likes

Ok, thanks for the help. Then, what about the Rc? Why does dereferencing the Rc result in the owned type rather than a reference to the value inside the Rc?

If you want a reference, you can do &*rc.

That works, but shouldn't I just have to do *rc to get the reference? In the documentation, the implementation of the deref function for the Deref trait for Rc returns a reference:
Screen Shot 2022-06-09 at 7.22.03 PM

The desugaring for * is *Deref::deref(&v) (note the *).

2 Likes

I didn't know that. Thanks for the clarification. So (just to imagine I was making an immutable list in a hypothetical scenario with only Rc and no RefCell), how would I return a reference to the inner data in an Rc created with Rc::clone from a function?

    pub fn get(&self, index: usize) -> Result<&T, &str> {
        if index >= self.size {
            return Err(INDEX_OUT_OF_BOUNDS);
        }

        let mut curr = Rc::clone(self.head.as_ref().unwrap());
        for _ in 0..index {
            let temp = curr;
            curr = Rc::clone(temp.next.as_ref().unwrap());
        }

        Ok(&curr.data)
    }
error[E0515]: cannot return value referencing local variable `curr`
  --> src/lib.rs:43:9
   |
43 |         Ok(&curr.data)
   |         ^^^^----^^^^^^
   |         |   |
   |         |   `curr` is borrowed here
   |         returns a value referencing data owned by the current function

I see. It's mentioned here in the book: Treating Smart Pointers Like Regular References with the Deref Trait - The Rust Programming Language. I covered it a while ago, so I must have missed the detail.

You cannot. If the Rc was created inside the function, it was already dropped and you may be returning reference to freed data.

But the data the Rc points to was not created inside the function. It also maintains reference counts, which means that the value that the Rc points to won't be dropped if the ref count is greater than 1, so the compiler can't determine that, or am I blurring the line between what I know and what the compiler knows?

The compiler doesn't track that.

1 Like

I think you would find this blog post informative.

Ok. Thank you so much for your help. I have one remaining confusion. How does the compiler know that the reference inside my_option lives longer than my_option?

struct Foo {
    data: Option<String>
}

impl Foo {
    fn bar(&self) -> &String {
        let my_option = self.data.as_ref();
        return my_option.unwrap();
    }
}

fn main() {
    let foo = Foo { data: Some("hello".to_owned()) };
    println!("{}", foo.bar());
}

If we annotate the lifetime:

fn bar<'a>(&'a self) -> &'a String

Then &self.data -> &'a Option<String>. self.data.as_ref() is actually Option::as_ref(&self.data), and the signature for as_ref() (desugared) is:

fn as_ref<'b>(&'b Option<T>) -> Option<&'b T>

So it produces Option<&'a String>. Unwraping it gives &'a String, which is the correct type.

1 Like

OK, thank you so much for the help. So, the following example would compile because the lifetime of hi is directly related to that of &self.data, but if I write something like this: let hello = Rc::clone(&self.data); return &*hello, it would not compile because the lifetime information would disappear when I call clone.

use std::rc::Rc;

struct Foo {
    data: Rc<String>
}

impl Foo {
    fn bar(&self) -> &String {
        let hi = &*self.data;
        return hi;
    }
}

fn main() {
    let foo = Foo { data: Rc::new("hello".to_owned()) };
    println!("{}", foo.bar());
}

Is the line of reasoning above correct?

Yes. The compiler does not know about the sharing nature of Rc; it thinks of it as an owned type.

2 Likes

Ok, thank you so much for your help. You helped clarify a lot of misconceptions I had.

This is an interesting case to think about, really. Maybe this will lend some insight (I hope so).

First a couple basics that are pretty trivial.

This doesn't compile:

fn borrow_from_owned<T>(rc: Rc<T>) -> &T {
    &*rc
}

The compiler doesn't like that the lifetime of the returned reference "came out of nowhere", which is already a big clue. We can tell it, hey, don't mind that:

fn borrow_from_owned<'any, T>(rc: Rc<T>) -> &'any T {
    &*rc
}

But it will still fail, because you can't borrow from a local and return a reference to it. And that's definitely correct here too! If that rc is the only one still pointing at the contents, the contents will drop when rc drops. Good, fine, makes sense.


This compiles just fine:

fn borrow_from_borrow<'rc, T>(rc: &'rc Rc<T>) -> &'rc T {
    &*rc
}

It's very much like a field access, or borrowing a &str from a &String. Good, fine, makes sense.


Now the interesting one. Knowing how Rc works, you may expect this to work:

fn borrow_from_clone<'rc, T>(rc: &'rc Rc<T>) -> &'rc T {
    let clone = Rc::clone(rc);
    &*clone
}

And as was explained, the compiler doesn't understand that the returned reference is just fine, because it really pointing at someting owned by rc. Let's take it in slow motion and point out a few reasons why it "has" to be this way (without Rc becoming very, very special-cased by the language).

First, let's be a little more explicit about the clone type.

fn borrow_from_clone<'rc, T>(rc: &'rc Rc<T>) -> &'rc T {
    let clone: Rc<T> = Rc::clone(rc);
    &*clone
}

That Rc<T> of clone has no lifetime on it. There's nothing on the language level still connecting it to rc. This is similar to the first case, where a lifetime is going to have to come "out of nowhere".

Let's imagine there's a way to overcome that though, and be more explicit about the returned value too. Maybe the compiler will get that they're related.

fn borrow_from_clone<'rc, T>(rc: &'rc Rc<T>) -> &'rc T {
    let clone: Rc<T> = Rc::clone(rc);
    let borrow: &'rc T = &*clone;
    borrow
}

Nope, it's not buying it. Here, we get a whole new error about 'rc being too long. This comes up occasionally in other situations: When you have a function with a lifetime parameter, that lifetime is chosen by the caller of the function. Within the function, it could be almost anything -- the main thing you know is that it lasts longer than the function body (it lasts at least as long as the call site expression). [1]

But clone is a local, so you can't borrow it for longer than the function body.

Rust generally wont just take our word that things are okay. You make assertions like that by using unsafe and taking on the responsibilities to uphold Rust's safety guarantees yourself.


Finally, I'd like to outline another way to think of this situation. Disclaimer: it's a bit more hand-wavey than the rest, and it's possible Rust will find ways to grow beyond this mental model. Anyway, the general idea is to think of a chain of exclusive access.

First note that shared references are very flexible -- you can Copy them, so you can even copy a &'long T out of a &'short &long T, and so forth. For example, if I have a &String, and then I create some &str from that, and then I use the &String again, I can keep on using the &str. No problem.

The exclusive nature of &mut make it much more rigid, though. You can only move or reborrow them, not copy them, and you cannot get a &'long mut T from a &'short mut &'long mut T. And if I have a &mut Vec<T>, and I create a &mut [T] or even &[T] from that, and then I use the &mut Vec<T> again... my &[T] sub-borrow gets "cancelled". This makes sense -- maybe my use of &mut Vec<T> was to call .clear() after all.

And the same is true if you have a &T you got from an owned value, and then use the owned value again. You can do anything with owned values that you can with a &mut T and more, after all. The semantics for ownership are at least as strong.

Beyond a single borrow, a chain of exclusive access back to the owned, borrowed object must remain "unbroken".

let mut owned = String::new();
let a = &mut owned;
let b = &mut *a;
let c = &mut *b;
let d = &mut *c;

// This use of `b` makes both `c` and `d` unusable.
// But `b` is still ok to use,
// - and so is `a` (but using it makes `b` unusable),
// - and so is `owned` (but using it makes all borrows unusable)
println!("{b}");

So with that in mind, consider this function:

fn borrow_unique(i: &mut Holder<i32>) -> &mut i32 {
    let mut h: Holder<&mut i32> = Holder(&mut i.0);
    let h = &mut h;
    
    &mut *h.0
}

It fails, while the shared version succeeds. We can talk about why in terms of lifetimes and borrowing and copying shared references -- we've created some &'local mut Holder<&'foreign i32> value, and you can't get a &'long mut T from inside a &'short mut &'long mut T and so on.

But we can also reason about it in terms of the chain of exclusive access. When our local goes out of scope at the end of this function, we break the chain of exclusive access -- an intermediate link is invalid, so our returned value has no path back to the owned value, and it too must be invalid.

With that viewpoint, you can forget about lifetimes per se and see how borrow_from_clone is analogous to borrow_unique in some way: you stuck the data behind something the language considers exclusive (an owned Rc) and then borrowed through that exclusive thing. Once the exclusive thing is gone, so is your access.


  1. The lifetimes also have to make sense for the types, so here: fn foo<'a, 'b>(_: &'a &'b str) {} you know that 'b: 'a. ↩ī¸Ž

9 Likes

You can make the type of data to Rc<T>

use std::{
    cell::RefCell,
    rc::{Rc, Weak},
};

const INDEX_OUT_OF_BOUNDS: &str = "Error: index is out of bounds.";

type StrongListNode<T> = Option<Rc<RefCell<ListNode<T>>>>;
type WeakListNode<T> = Option<Weak<RefCell<ListNode<T>>>>;

pub struct LinkedList<T> {
    head: StrongListNode<T>,
    tail: StrongListNode<T>,
    size: usize,
}

struct ListNode<T> {
    data: Rc<T>,
    prev: WeakListNode<T>,
    next: StrongListNode<T>,
}

impl<T> LinkedList<T> {
    pub fn new() -> Self {
        LinkedList {
            head: None,
            tail: None,
            size: 0,
        }
    }

    pub fn get(&self, index: usize) -> Result<Rc<T>, &str> {
        if index >= self.size {
            return Err(INDEX_OUT_OF_BOUNDS);
        }

        let mut curr = Rc::clone(self.head.as_ref().unwrap());
        for _ in 0..index {
            let temp = curr;
            curr = Rc::clone(temp.borrow().next.as_ref().unwrap());
        }
        // get value step by step, you can write it into one step
        let a = Rc::as_ref(&curr);
        let b = a.borrow();
        let c = &*b;
        let d = c.data.clone();
        Ok(d.clone())
    }
}

Note that this is not an arbitrary decision, and you shouldn't expect anything else, basically. The desugaring of Deref very deliberately includes dereferencing the returned (&Deref::Target) type. That is the point (and the definition) of dereferencing: it removes a layer of indirection. Converting a reference of one type into a reference of another type (possibly to one of its fields) definitely does not count as "dereferencing".

To compare it with a simpler example: an Rc is a pointer. If you expected to get a &T when you dereference an Rc<T>, then by the same line of thought, you should also expect a simple reference, &T, to yield the exact same reference, &T when "dereferenced". But that would be useless – it would mean there is no way to get (or mutate) a value from behind a reference.