Questions about smart pointers, value access, and dereferencing

I have come across some confusing behavior of smart pointers and how they relate to dereferencing, and I am pretty sure this is due to gaps in my understanding of how they work. Hence this question; I am looking forward to some clarification.

I will be using Box and Rc, but my guess is that the same confusion applies to other smart pointers.

Q1. I can transfer ownership and move a value into a smart pointer like Box. Can I move the value out and transfer ownership out? If not, why not?

For example, in the code below:

#[derive(Debug)]
enum MyBool {
    True,
    False
}

let my_value: MyBool = MyBool::True;
let value_in_box = Box::new(my_value); // ownership goes from my_value to Box
// println!("{:?}", my_value); won't compile. Use of moved value
let my_value = ??? // is it possible to transfer ownership from Box back to my_value?

I moved a value originally owned by my_value into the Box. Is there any syntax to move this value out of the Box back to my_value?

Q2. I can use a value in a Box as if it were not in a Box, but why can't I pass a value in a Box to a function expecting the value without a Box?

This continues the theme of accessing a value within a smart pointer like Box. The first question was about moving the value out; this one is about using it in place.

I realize if I have a value in a Box, I can work with the value transparently as if it were not in a Box. For example:

    let my_value: u8 = 2;
    let value_in_box = Box::new(my_value);
    let pow_two = value_in_box.pow(2);

I was able to call the pow method on value_in_box as if it were not in a Box. But if I have a function defined to take a u8, I cannot just pass in value_in_box. For example, the following code won't work:

fn pow_2(i: u8) -> u8 {
    i.pow(2)
}

    let my_value: u8 = 2;
    let value_in_box = Box::new(my_value);
    pow_2(value_in_box)

It will complain about mismatched types. Also a bit confusing is the fact that using the value as below won't work either.

    let my_value: u8 = 2;
    let value_in_box = Box::new(my_value);
    value_in_box * 2;

One can interact with the value in the Box transparently in some ways (i.e. when calling methods on it) but not in others (i.e. when passing it into a function or using it with operators like *).

Why this distinction? And what is the best way to think about accessing values inside a smart pointer like Box?

Q3. What is the benefit of putting values in a hashmap into an Rc?

I have come across advice to put values in a hashmap into an Rc. By default, getting a value from a hashmap with .get yields a reference. The only ways to get an owned value are to clone it or to take it out of the hashmap with the remove method.

The advice to wrap the value in an Rc is given as a way to keep the value in the HashMap while also owning it elsewhere.

Trying this advice out, I end up with the following code:

use std::collections::HashMap;
use std::rc::Rc;

#[derive(Clone)]
struct Person {
    username: String,
    is_admin: bool
}

fn main() {

    let person = Person {
        username: "alice".to_string(),
        is_admin: true
    };

    let mut rced_values: HashMap<String, Rc<Person>> = HashMap::new();
    rced_values.insert("Alice".to_string(), Rc::new(person.clone()));
    let retrieved_rc: &Rc<Person> = rced_values.get("Alice").unwrap();

    let mut raw_values: HashMap<String, Person> = HashMap::new();
    raw_values.insert("Alice".to_string(), person);
    let retrieved_raw: &Person = raw_values.get("Alice").unwrap();
}

But I am not sure what the use is of having the values in an Rc in the first HashMap. How is getting retrieved_rc: &Rc<Person> out of the map better than getting retrieved_raw: &Person?

Just dereference it.
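
As a minimal sketch of what "just dereference it" could look like for the code in Q1 (the helper name `unbox` is made up for illustration and is not part of any library):

```rust
#[derive(Debug, PartialEq)]
enum MyBool {
    True,
    False,
}

// Hypothetical helper: takes the Box by value and moves the inner value out.
fn unbox(boxed: Box<MyBool>) -> MyBool {
    *boxed // moving out through `*` is special to Box
}

fn main() {
    let my_value = MyBool::True;
    let value_in_box = Box::new(my_value); // ownership moves into the Box
    let my_value = *value_in_box; // ownership moves back out; the Box is consumed
    println!("{:?}", my_value);
    // println!("{:?}", value_in_box); // would not compile: use of moved value
    println!("{:?}", unbox(Box::new(MyBool::False)));
}
```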

It's because method calls auto-reference and auto-dereference as much as needed (as a first approximation). So calling .foo() on Box<T> will eventually find T::foo() if it takes a reference to self. This doesn't make the types identical, so of course when you pass one type to a function and it wants a different one, you will naturally get an error.
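
To make the difference concrete, here is a small sketch reusing the `pow_2` function from the question. The method call compiles as-is, while the function call needs an explicit dereference so that the argument type matches:

```rust
fn pow_2(i: u8) -> u8 {
    i.pow(2)
}

fn main() {
    let value_in_box = Box::new(2u8);

    // Method call: the receiver is auto-dereferenced until u8::pow is found.
    let via_method = value_in_box.pow(2);

    // Function argument: the parameter type is u8, so dereference explicitly.
    // u8 is Copy, so this copies the value and leaves the Box usable.
    let via_function = pow_2(*value_in_box);

    println!("{via_method} {via_function}");
}
```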

It's impossible to tell without any context. By default, you definitely shouldn't put all your hash map values into an Rc, just for the sake of it. It is for sure perfectly fine to have a HashMap<Key, NotAnRc>. In the situation you described, you might have needed shared ownership, in which case Rc is useful, but then it has nothing to do with hash maps. There's no intrinsic added value in passing around &Rc<Person>.
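
For illustration only, this is roughly what the shared-ownership case might look like. The point is cloning the Rc out of the map, not borrowing it. The helper `share` is a hypothetical name, and whether you need this at all depends on your program:

```rust
use std::collections::HashMap;
use std::rc::Rc;

struct Person {
    username: String,
    is_admin: bool,
}

// Hypothetical helper: returns an owning handle while the map keeps its own.
fn share(map: &HashMap<String, Rc<Person>>, key: &str) -> Option<Rc<Person>> {
    map.get(key).map(Rc::clone) // clones the pointer, not the Person
}

fn main() {
    let mut people: HashMap<String, Rc<Person>> = HashMap::new();
    let alice = Person { username: "alice".to_string(), is_admin: true };
    people.insert("Alice".to_string(), Rc::new(alice));

    let alice = share(&people, "Alice").unwrap();
    people.clear(); // the map lets go of its handle...
    println!("{} {}", alice.username, alice.is_admin); // ...but ours keeps the Person alive
}
```

With a plain HashMap<String, Person>, the borrowed &Person could not outlive the entry in the map; you would have to clone the whole Person or remove it.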


I think the insight I got from your response is to treat smart pointers more or less like normal references (at least in terms of accessing them) so just using * to de-reference them works as it should.

Note that Box is special:

use std::ops::{Deref, DerefMut};

struct NonCopy; // Some type which does not implement `Copy`

struct Wrapper<T>(T);

impl<T> Wrapper<T> {
    fn into_inner(self) -> T {
        self.0
    }
}

impl<T> Deref for Wrapper<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Wrapper<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

fn main() {
    let a = NonCopy;
    let b = Box::new(a);
    let _c = *b; // we can move out of `Box` even if the inner value is `!Copy`
    let d = NonCopy;
    let e = Wrapper(d);
    //let _f = *e; // we can't move out normally if the inner value is `!Copy`
    let _f = e.into_inner(); // we must use a specific mechanism
}



See also: Tracking Issue for box_into_inner #80437:

The language actually supports *x as a special case [in case of Box]. However it can't be generalized to other types easily. It also doesn't consume the Box value immediately if T is Copy, which is quite a footgun.
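
A short sketch of the footgun mentioned in that quote, as I understand it: with a Copy type, `*x` only copies and the Box stays alive, whereas with a non-Copy type the same syntax moves and consumes the Box.

```rust
fn main() {
    let boxed = Box::new(5u8);
    let copied = *boxed; // u8 is Copy: this copies and does NOT consume the Box
    println!("{copied} {boxed}"); // the Box is still alive and usable

    let boxed_string = Box::new(String::from("hi"));
    let moved = *boxed_string; // String is not Copy: this moves and consumes the Box
    println!("{moved}");
    // println!("{boxed_string}"); // would not compile: use of moved value
}
```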


The Rust reference lists Box<T> as a special case here:

Box<T> has a few special features that Rust doesn't currently allow for user defined types.

  • The dereference operator for Box<T> produces a place which can be moved from. This means that the * operator and the destructor of Box<T> are built-in to the language.
  • Methods can take Box<Self> as a receiver. [also applies to Rc, Arc, and Pin]
  • A trait may be implemented for Box<T> in the same crate as T, which the orphan rules prevent for other generic types.
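
To illustrate the second bullet, here is a minimal sketch of methods with Box<Self> and Rc<Self> receivers. The type `Job` and its methods are invented for this example:

```rust
use std::rc::Rc;

struct Job {
    name: String,
}

impl Job {
    // Callable only on a Box<Job>; consumes the Box and returns the name.
    fn finish(self: Box<Self>) -> String {
        let job = *self; // move the Job out of the Box
        job.name
    }

    // Rc<Self> works as a receiver too.
    fn name_len(self: Rc<Self>) -> usize {
        self.name.len()
    }
}

fn main() {
    let boxed = Box::new(Job { name: "build".to_string() });
    println!("{}", boxed.finish());

    let shared = Rc::new(Job { name: "test".to_string() });
    println!("{}", shared.name_len());
}
```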

It's a bit unfortunate (in my opinion) that there is no corresponding note in the section on the dereferencing operator itself. That part of the reference says:

On non-pointer types *x is equivalent to *std::ops::Deref::deref(&x) in an immutable place expression context and *std::ops::DerefMut::deref_mut(&mut x) in a mutable place expression context.
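
For a user-defined type, that equivalence can be checked directly. The sketch below uses a cut-down version of the `Wrapper` type from my earlier example:

```rust
use std::ops::Deref;

struct Wrapper<T>(T);

impl<T> Deref for Wrapper<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let w = Wrapper(41u8);
    let sugar = *w; // what you write
    let desugared = *Deref::deref(&w); // what it stands for on a user-defined type
    println!("{}", sugar == desugared);
}
```

Because the desugared form goes through a shared reference, it can only copy the value out, which is why moving a non-Copy value out of `Wrapper` with `*` does not work.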

I find the wording "non-pointer types" misleading here because smart pointers (which include Rc, for example[1]) are explicitly listed as "pointer types" in this other section of the reference.

Edit: There is already an open issue #1298 for the Rust reference on that matter.


  1. Also see the thread "What are smart pointers?" as well as issue #91004.


Moving out of a Box can be done with the dereferencing operator * (which is a special case in the language as explained in my previous post).

You can call methods because method resolution auto-dereferences the receiver through Deref. You can also pass a &Box<T> where a &T is expected, thanks to Deref coercion and the fact that function arguments are a coercion site. So, for example, this works:

-fn pow_2(i: u8) -> u8 {
+fn pow_2(i: &u8) -> u8 {
     i.pow(2)
 }
 
 fn main() {
     let my_value: u8 = 2;
     let value_in_box = Box::new(my_value);
-    pow_2(value_in_box)
+    pow_2(&value_in_box);
 }


Well, with numbers and operators, there are a few surprising issues, unfortunately.
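
For the `value_in_box * 2` case from the question, a sketch of the usual workaround is to dereference explicitly. The function name `double` is just for illustration:

```rust
fn double(boxed: Box<u8>) -> u8 {
    // `boxed * 2` would not compile: the operator is looked up on Box<u8>
    // itself, and Box<u8> does not implement Mul.
    *boxed * 2
}

fn main() {
    println!("{}", double(Box::new(2)));
}
```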

Basically, when the smart pointer is behind a reference (for example, when you pass references as arguments) or when you call methods that take &self, there is an automatic coercion to the pointed-to value. The difference arises when you pass or work on owned values. Note that you can't always "unwrap" a value from a smart pointer in the general case. For example, if you have an Rc<String>, you cannot obtain the String if there are clones of the Rc:

use std::rc::Rc;

fn main() {
    let a = String::from("Alice");
    let b = Rc::new(a);
    //let _c = b.clone(); // uncommenting this will make the program panic
    let _d = Rc::try_unwrap(b).unwrap();
}

(Playground)

So if an Rc<T> were implicitly converted to a T, this would cause very nasty errors. I guess that's why the documentation of Deref also states:

For similar reasons, this trait should never fail. Failure during dereferencing can be extremely confusing when Deref is invoked implicitly.

I guess Rust could have decided to make Box a special case and coerce Box<T> into T automatically, but maybe there are downsides to it (I don't really know). But it's not generally possible for smart pointers; see Playground above and the more verbose example below:

use std::rc::Rc;

fn greet(name: String) {
    println!("Hello {name}!");
}

fn main() {
    let joe = String::from("Joe");
    let rc = Rc::new(joe);
    let rc_clone = rc.clone();
    //greet(rc); // how could we obtain an owned `String`?
    
    // We must do:
    greet((*rc).clone());
    // Or if we know there is just one `Rc`:
    drop(rc_clone);
    greet(Rc::try_unwrap(rc).unwrap());
}


You can see that going from Rc<String> to String requires cloning the String unless the Rc has a strong_count of exactly one (see Rc::try_unwrap). Performing an implicit clone would be a bad idea (and moreover require T: Clone, which is true for Strings, but might not be the case for other types T stored in an Rc<T>).
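
If you do want "take it if I am the only owner, otherwise clone", you can spell that out explicitly. A minimal sketch, with `into_owned` being a hypothetical helper name:

```rust
use std::rc::Rc;

// Hypothetical helper: moves the String out if this is the last Rc,
// otherwise falls back to an explicit clone of the String.
fn into_owned(rc: Rc<String>) -> String {
    Rc::try_unwrap(rc).unwrap_or_else(|shared| (*shared).clone())
}

fn main() {
    let rc = Rc::new(String::from("Joe"));
    let rc_clone = Rc::clone(&rc);
    let first = into_owned(rc); // two owners: the String is cloned
    let second = into_owned(rc_clone); // last owner: the String is moved out
    println!("{first} {second}");
}
```

The clone is visible in the code here, which is the opposite of an implicit conversion doing it behind your back.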


I figured out that automatically coercing Box<T> into T would be pretty inconsistent, because Rust is strict about whether arguments are passed by value or by reference (and the same should hold for passing "by Box"):

struct S;

fn takes_owned(_: S) {}
fn takes_reference(_: &S) {}
fn takes_box(_: Box<S>) {}

fn main() {
    takes_owned(S);
    takes_reference(&S);
    takes_box(Box::new(S));

    //takes_reference(S); // this isn't allowed
    //takes_box(S); // so why should this be allowed?
    
    //takes_owned(&S); // this isn't allowed
    //takes_owned(Box::new(S)); // so why should this be allowed?
}



Also note that in regard to the receiver of a method, Box<T> gets unwrapped automatically:

struct S;

impl S {
    fn foo(self) {}
}

fn main() {
    Box::new(S).foo(); // works
}
