Ch::15::Box:: I find this example-selection a bit confusing

The point of the chapter is very good, and the write-up is very clear.

But there is an important caveat imho, in this paragraph:

The reason the deref method returns a reference to a value, and that the plain dereference outside the parentheses in *(y.deref()) is still necessary, has to do with the ownership system. If the deref method returned the value directly instead of a reference to the value, the value would be moved out of self. We don’t want to take ownership of the inner value inside MyBox<T> in this case or in most cases where we use the dereference operator.

However, Box seems to be one of the only cases where this actually happens!

fn main() {
   let a = Box::<String>::new(String::from("hell"));
   let b = *a; // expects &str
   let c = *a; // but clearly wasn't
}

Playground

You are right. Box, despite being defined in a library (the alloc crate), actually has special support from the compiler, and among all the smart pointers (meaning types that implement Deref), only Box can be moved out of.
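A minimal sketch of what "can be moved out of" means here. The helper name `take_inner` is made up for illustration:

```rust
// Hypothetical helper: moves the String out of the Box with a plain `*`.
fn take_inner(boxed: Box<String>) -> String {
    // This is accepted for Box: the built-in dereference lets the
    // inner value be moved out, consuming the Box in the process.
    *boxed
}

fn main() {
    let a = Box::new(String::from("hell"));
    let s = take_inner(a);
    println!("{s}");
}
```

Writing the same `*boxed` for a user-defined type that only implements Deref is rejected with E0507.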

and curiously, the Deref implementation of Box is seemingly "recursive":

see also this blog post:


Box is special. Let's discuss this using something else first.

use std::ops::Deref;

struct MyBox<T: ?Sized>(Box<T>);

// This is analogous to how `Deref` works for `Box<T>`, modulo the
// special properties, some of which we'll talk about shortly.
impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

fn main() {
   let a = MyBox(Box::new(String::from("hell")));
   let b = *a;
   let c = *a;
}

error[E0507]: cannot move out of dereference of `MyBox<String>`
  --> src/main.rs:17:12
   |
17 |    let b = *a; // expects &str
   |            ^^ move occurs because value has type `String`, which does not implement the `Copy` trait

And recall the notional desugaring they mentioned:

    //        vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv &String
    let b = *(<MyBox<String> as Deref>::deref(&a));
    //      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ String

So b is expecting a String, not a &str.[1]

The error happens because you can't move from behind a &_ reference, as that would make the reference dangle.
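If the goal is just to get at the String without hitting that error, borrowing or cloning both work. A small sketch, where `clone_inner` is a hypothetical helper and not something from the book:

```rust
use std::ops::Deref;

struct MyBox<T: ?Sized>(Box<T>);

impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

// Hypothetical helper: borrows through `deref`, then clones to get an owned String.
fn clone_inner(a: &MyBox<String>) -> String {
    // `*a` is the MyBox, `**a` goes through Deref to the String,
    // and the leading `&` borrows it, so nothing is moved.
    let borrowed: &String = &**a;
    borrowed.clone()
}

fn main() {
    let a = MyBox(Box::new(String::from("hell")));
    let b = clone_inner(&a);
    let c = clone_inner(&a); // `a` is still intact, so this is fine
    println!("{b} {c}");
}
```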


Okay, back to Box<String>. The * deref operator implementation for Box<_> is a magic built-in operation. It has a trait implementation too for the sake of generics,[2] but when Box<_> is a concrete type, the built-in operation is used.[3]

There is no intermediate &_ in the built-in operation like there is for the trait implementation. And the built-in operation for Box<_> is special in that it does allow moving out of the place that results from *bx. The Box's allocation place is treated similarly to a variable, where the compiler can track whether it is initialized or not -- whether you have moved out of it or not, and whether you have reassigned a value or not.

So your OP playground error is similar to this, in terms of why it errored:

    let star_a = String::from("hell");
    let b = star_a;
    let c = star_a;
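To illustrate the "tracked like a variable" part, here is a sketch of moving out of a Box and then putting a new value back in. The function name is made up, and this reflects my understanding of how the borrow checker treats `*bx`:

```rust
// Hypothetical helper: moves the String out of the Box, then refills it.
fn move_out_then_refill() -> (String, String) {
    let mut bx = Box::new(String::from("hell"));
    let moved = *bx; // `*bx` is now uninitialized, like a moved-from variable
    // Using `*bx` at this point would be rejected as a use after move.
    *bx = String::from("o"); // assigning re-initializes the place
    (moved, *bx) // so it can be moved out of again
}

fn main() {
    let (first, second) = move_out_then_refill();
    println!("{first}{second}");
}
```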

  1. MyBox<String> and Box<String> have Target = String, not Target = &str (or str).

  2. for example

  3. Which is why the trait implementation is not actually recursive.


interesting post, thank you

Just a few questions to understand what you say better. What is the ? doing there (in a simplified way, please)? I do know that T: Sized is a trait bound requiring T to implement Sized.

Oh, okay, it's just that the trait is optional. Seems strange to me somehow.

Also, can't we use -> &T directly instead of -> &Self::Target? They look equivalent, to me at least.

But maybe it's simply to satisfy the definition of Deref.

For the dereferencing part

So here what you are saying is that the comment "expects &str" was wrong, correct? It is &String, then * returns the String.

Confusing: why not use the original MyBox<String> for the example? Doesn't it illustrate this as well? I don't know whether this was on purpose or not.

struct MyBox<T: ?Sized>(Box<T>);

Why does * go all the way down?

Why isn't b of type Box? Is it dereferencing "recursively", until T is reached? (but why?!)


Test
I tried this, which prints "hell" so I wonder whether this is the "recursive" part, but I don't think it is.

#[allow(unused_variables)]
use std::ops::Deref;

struct MyBox<T: ?Sized>(Box<Box<Box<T>>>);

impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

fn main() {
   let a = MyBox(Box::new(Box::new(Box::new(String::from("hell")))));
   let b = &*a; // &String
   println!("{}",b)
}

I'd really appreciate you put a heading in bold to what you are trying to show in a section (if you can), so I don't get lost trying to figure out other possibilities. I tried to do that in my reply, so that's also more clear.

The behavior of * is determined by the Deref implementation, not by the fields of the type. The Deref implementation has

type Target = T;

therefore the type of *a is exactly T.
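A sketch to make that concrete, using a made-up type `Labeled` whose fields have nothing to do with the chosen Target:

```rust
use std::ops::Deref;

// Hypothetical type: no field has type `str`, yet `*x` will be a `str`.
struct Labeled {
    id: u32,
    name: String,
}

impl Deref for Labeled {
    type Target = str; // this alone decides the type of `*x`
    fn deref(&self) -> &str {
        &self.name // &String coerces to &str at the return site
    }
}

fn main() {
    let x = Labeled { id: 1, name: String::from("hell") };
    let s: &str = &*x; // `*x` is `str`, whatever the fields are
    println!("{} {}", x.id, s);
}
```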

but MyBox should dereference once / one level? Up to the first Box given that it's our own implementation and just does &*self ?

Sorry, I'm aware I'm quite confused.

actually it does go one down on "my impl"

use std::ops::Deref;

#[derive(Debug)]
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main(){
  let a = MyBox::new(MyBox::new(String::from("hell")));
  println!("{:?}",a);
  println!("{:?}",*a); // expect going 1 level down to MyBox::MyBox
}

i.e. only if I use Box::new does it dereference all the way down.

In @quinedot’s original code, &*self.0 dereferences the Box<T> stored in self.0 to get T, then takes a reference, producing &T.

In your code with extra Boxes, the required extra dereferencing happens automatically via deref coercion inside the code of fn deref(). The compiler sees that the function must return &T and inserts additional dereferences to make that true. Your code ends up being equivalent to:

fn deref(&self) -> &Self::Target {
    &***self.0
}

This coercion happens all the time in Rust code. There is no exception for fn deref(); it's just particularly confusing here.
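For comparison, here is the same coercion at an ordinary function call, which is probably where you have already relied on it without noticing. `shout` is a hypothetical function for illustration:

```rust
// Hypothetical function that wants a plain &str.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let nested: Box<Box<String>> = Box::new(Box::new(String::from("hell")));
    // &Box<Box<String>> -> &Box<String> -> &String -> &str,
    // with every step inserted by the compiler.
    let implicit = shout(&nested);
    // The same call with each dereference spelled out.
    let explicit = shout(&***nested);
    assert_eq!(implicit, explicit);
    println!("{implicit}");
}
```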


The type coercion here should be happening between the deref's last expression and the &Self::Target, right? It tries to match the return value with the return type signature and may insert extra *?

But my question really is about comparing these two:

let a = MyBox::new(MyBox::new(String::from("hell")));
let b = Box::new(Box::new(String::from("hell")));

Here is the playground with the prints.

But it's possible there is something I simply don't understand just yet.


PS yes I missed the * or at least didn't get it at first, but it's clear now.

Yes, that is right.

The Debug implementation of Box does not print Box, just its contents.
The Debug implementation of MyBox does print MyBox in addition to its contents.
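A small sketch of the difference, with no dereferencing involved at all:

```rust
#[derive(Debug)]
struct MyBox<T>(T);

fn main() {
    let a = MyBox(MyBox(String::from("hell")));
    let b = Box::new(Box::new(String::from("hell")));
    println!("{:?}", a); // derived Debug names the wrapper at each level
    println!("{:?}", b); // Box forwards to the Debug of what it holds
}
```

So the two printouts differ even though both values are nested two levels deep.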

Does that clarify what is going on?


omg... then that's why I got even more confused. Yes, thanks!

When you introduce a generic parameter like impl<T>, it has an implicit T: Sized bound. : ?Sized removes the implicit bound. Not strictly required for the example.
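A sketch of what the relaxed bound buys you: with `?Sized`, the wrapper can hold an unsized type such as `str`. This is only an illustration and not part of the earlier example:

```rust
use std::ops::Deref;

struct MyBox<T: ?Sized>(Box<T>);

impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

fn main() {
    // `str` is unsized, so `MyBox<str>` is only allowed because of `?Sized`.
    let a: MyBox<str> = MyBox(Box::from("hell"));
    println!("{}", a.len()); // the method call auto-derefs to `str`
}
```

Dropping `?Sized` from the struct and the impl makes the `MyBox<str>` line fail to compile, while `MyBox<String>` keeps working.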

On -> &T versus -> &Self::Target: yes. It's partially a matter of taste here.

Yeah, I'm not sure exactly what your expectations were, but &str doesn't factor into the example. Instead of String you could use a HashMap or such (that isn't Copy and isn't Deref) to get a feel, perhaps.

I'm not sure which part you're talking about here, sorry.


(And I think the rest was answered by others.)

I had simply meant &String. I was trying to show precisely what was different from MyBox, which doesn't move it. I think it was a miscommunication in that sense.


I just want to say that the history of boxes in Rust is fascinating and a bit confusing.

If you start reading here: 1211-mir - The Rust RFC Book

Note that having this kind of builtin box code is a legacy thing. The more generalized protocol that RFC 809 specifies works in more-or-less exactly the same way: when that is adopted uniformly, the need for shallow drop and the Box rvalue will go away.

And follow the thread on the generalized syntax's acceptance and then retraction, then box syntax addition and removal, it's easy to see how things have changed, but hard to get a clear picture of how they are.

If I'm not mistaken, box remains special syntax within the MIR or other intermediate representations, but I could be wrong about that.

box expressions got replaced by an attribute which got replaced by an intrinsic.

Edit: There are box patterns though, and box is a keyword, so there is still box syntax.

Oh cool, I didn't know about the new intrinsic in the THIR, though admittedly I don't know much about that representation.

And yes, I should have been more clear that box patterns remain. Which is very nice, even if the syntax isn't as uniform as it was when we briefly had box constructor syntax.