The point of the article is very good, and the write-up is very clear.
But there is an important caveat imho, in this paragraph:
The reason the deref method returns a reference to a value, and that the plain dereference outside the parentheses in *(y.deref()) is still necessary, has to do with the ownership system. If the deref method returned the value directly instead of a reference to the value, the value would be moved out of self. We don’t want to take ownership of the inner value inside MyBox<T> in this case or in most cases where we use the dereference operator.
However, Box seems to be one of the only cases where this actually happens!
fn main() {
    let a = Box::<String>::new(String::from("hell"));
    let b = *a; // expects &str
    let c = *a; // but clearly wasn't
}
You are right. Box, despite being defined in a library (the alloc crate), has special support from the compiler, and among all the smart pointers (meaning types that implement Deref), only Box can be moved out of.
And curiously, the Deref implementation of Box is seemingly "recursive":
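A minimal sketch of that one-time move, in case it helps. The helper name `unbox` is made up for illustration; the only thing being shown is that a plain `*` on a `Box<String>` yields an owned `String` and consumes the box.

```rust
/// Moves the `String` out of its `Box` with a plain dereference.
/// (`unbox` is a hypothetical helper, not a standard library function.)
fn unbox(b: Box<String>) -> String {
    // For a concrete `Box<_>`, `*b` moves the value out of the heap allocation.
    *b
}

fn main() {
    let a = Box::new(String::from("hell"));
    let s: String = unbox(a);
    // `a` was consumed by the call; using it again here would be a
    // use-after-move error, just like the second `*a` in the original snippet.
    println!("{}", s);
}
```

Swapping `Box<String>` for a user-defined wrapper that only implements `Deref` should make the body of `unbox` fail with E0507 instead.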
Box is special. Let's discuss this using something else first.
use std::ops::Deref;

struct MyBox<T: ?Sized>(Box<T>);

// This is analogous to how `Deref` works for `Box<T>`, modulo the
// special properties, some of which we'll talk about shortly.
impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

fn main() {
    let a = MyBox(Box::new(String::from("hell")));
    let b = *a;
    let c = *a;
}
error[E0507]: cannot move out of dereference of `MyBox<String>`
  --> src/main.rs:17:12
   |
17 |     let b = *a; // expects &str
   |             ^^ move occurs because value has type `String`, which does not implement the `Copy` trait
And recall the notional desugaring they mentioned:
//        vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv &String
let b = *(<MyBox<String> as Deref>::deref(&a));
//      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ String
The error happens because you can't move from behind a &_ reference, as that would make the reference dangle.
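For completeness, here is a sketch of the two usual ways around that error when you only have a `Deref`-based wrapper: reborrow, or clone through the reference. The function names `borrow_inner` and `clone_inner` are hypothetical and exist only for this example.

```rust
use std::ops::Deref;

struct MyBox<T: ?Sized>(Box<T>);

impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

/// Reborrows the inner value; nothing is moved.
fn borrow_inner(a: &MyBox<String>) -> &String {
    // `*a` is the `MyBox<String>`, `**a` is the `String` place reached via `Deref`.
    &**a
}

/// Produces an owned copy by cloning through the reference.
fn clone_inner(a: &MyBox<String>) -> String {
    (**a).clone()
}

fn main() {
    let a = MyBox(Box::new(String::from("hell")));
    let b = borrow_inner(&a);
    let c = clone_inner(&a);
    // `a` is still fully usable because neither call moved out of it.
    println!("{} {} {}", b, c, *a);
}
```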
Okay, back to Box<String>. The * deref operator implementation for Box<_> is a magic built-in operation. It has a trait implementation too for the sake of generics,[2] but when Box<_> is a concrete type, the built-in operation is used.[3]
There is no intermediate &_ in the built-in operation like there is for the trait implementation. And the built-in operation for Box<_> is special in that it does allow moving out of the place that results from *bx. The Box's allocation place is treated similarly to a variable, where the compiler can track whether it is initialized or not -- whether you have moved out of it or not, and whether you have reassigned a value or not.
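A small sketch of that "tracked like a variable" behavior, as I understand it: you can move out of `*bx` and later assign a new value to `*bx`, after which the box counts as initialized again. The function name `shout_in_place` is made up for the example.

```rust
/// Takes the `String` out of the box, transforms it, and stores the result
/// back through the same `Box`. (`shout_in_place` is a hypothetical name.)
fn shout_in_place(mut bx: Box<String>) -> Box<String> {
    let s = *bx;            // move out: the place `*bx` is now uninitialized
    *bx = s.to_uppercase(); // assign a new value to the same place
    bx                      // `*bx` is initialized again, so `bx` can be used
}

fn main() {
    let bx = shout_in_place(Box::new(String::from("hell")));
    println!("{}", bx);
}
```

Using `bx` between the move and the reassignment would be rejected, the same way using a moved-from local variable is.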
So your OP playground error is similar to this, in terms of why it errored:
let star_a = String::from("hell");
let b = star_a;
let c = star_a;
MyBox<String> and Box<String> have Target = String, not Target = &str (or str).
Just a few questions to better understand what you're saying. What is the ? doing there (in simple terms, please)? I do know that T: Sized is a trait bound requiring T to implement Sized.
So here what you are saying is that the comment "expects &str" was wrong, correct? It is &String, then * returns the String.
Confusing: why not use the original MyBox<String> for the example? Doesn't it illustrate this as well? I don't know whether this was on purpose or not.
struct MyBox<T: ?Sized>(Box<T>);
Why does * go all the way down?
Why isn't b of type Box? Is it dereferencing "recursively", until T is reached? (but why?!)
Test
I tried this, which prints "hell" so I wonder whether this is the "recursive" part, but I don't think it is.
#[allow(unused_variables)]
use std::ops::Deref;

struct MyBox<T: ?Sized>(Box<Box<Box<T>>>);

impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &*self.0
    }
}

fn main() {
    let a = MyBox(Box::new(Box::new(Box::new(String::from("hell")))));
    let b = &*a; // &String
    println!("{}", b);
}
I'd really appreciate it if you could put a bold heading on what you are trying to show in each section, so I don't get lost trying to figure out other possibilities. I tried to do that in my reply, so that it's clearer too.
In @quinedot’s original code, &*self.0 dereferences the Box<T> stored in self.0 to get T, then takes a reference, producing &T.
In your code with extra Boxes, the required extra dereferencing happens automatically via deref coercion inside the code of fn deref(). The compiler sees that the function must return &T and inserts additional dereferences to make that true. Your code ends up being equivalent to:
fn deref(&self) -> &Self::Target {
    &***self.0
}
This coercion happens all the time in Rust code. There is no exception for fn deref(); it's just particularly confusing here.
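To see the same coercion outside of `fn deref()`, here is a sketch with two free functions (both names are hypothetical). The first relies on deref coercion at the return position; the second spells out every dereference by hand.

```rust
/// The body just returns the reference it was given. Because the declared
/// return type is `&String`, the compiler applies deref coercion through
/// all three `Box` layers.
fn innermost(nested: &Box<Box<Box<String>>>) -> &String {
    nested
}

/// The same result with the dereferences written explicitly:
/// one `*` for the reference, then one per `Box`.
fn innermost_explicit(nested: &Box<Box<Box<String>>>) -> &String {
    &****nested
}

fn main() {
    let nested = Box::new(Box::new(Box::new(String::from("hell"))));
    println!("{}", innermost(&nested));
    println!("{}", innermost_explicit(&nested));
}
```

Both functions return a reference to the same `String`; only the amount of work left to the compiler differs.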
The type coercion here should be happening between the deref's last expression and the &Self::Target, right? It tries to match the return value with the return type signature and may insert extra *?
But my question really is about comparing these two:
let a = MyBox::new(MyBox::new(String::from("hell")));
let b = Box::new(Box::new(String::from("hell")));
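For the plain Box half of that comparison, a rough sketch of what each `*` does: with nested boxes, one dereference peels exactly one layer, so nothing "recursive" happens unless a coercion asks for it. The helper names `peel_once` and `peel_twice` are invented for this example.

```rust
/// One `*` removes one layer: `Box<Box<String>>` becomes `Box<String>`.
fn peel_once(outer: Box<Box<String>>) -> Box<String> {
    *outer
}

/// Two separate moves are needed to reach the `String` itself.
fn peel_twice(outer: Box<Box<String>>) -> String {
    let inner: Box<String> = *outer;
    *inner
}

fn main() {
    let b = Box::new(Box::new(String::from("hell")));
    let inner = peel_once(b);
    println!("{}", inner);

    let b2 = Box::new(Box::new(String::from("hell")));
    println!("{}", peel_twice(b2));
}
```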
When you introduce a generic parameter like impl<T>, it has an implicit T: Sized bound. : ?Sized removes the implicit bound. Not strictly required for the example.
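A quick sketch of the difference, using two hypothetical functions that are identical except for the bound. `str` and `[i32]` are unsized types, so they can only be used as `T` when the implicit `Sized` bound has been relaxed.

```rust
use std::fmt::Debug;

/// Implicit `T: Sized` bound: `T` cannot be `str` or `[i32]`.
fn show_sized<T: Debug>(value: &T) -> String {
    format!("{:?}", value)
}

/// `?Sized` removes the implicit bound, so `T` may be unsized.
fn show_any<T: ?Sized + Debug>(value: &T) -> String {
    format!("{:?}", value)
}

fn main() {
    let s: &str = "hell";
    // show_sized(s) would not compile: `T` would have to be `str`.
    println!("{}", show_any(s));    // T = str
    println!("{}", show_sized(&s)); // T = &str, which is Sized

    let slice: &[i32] = &[1, 2, 3];
    println!("{}", show_any(slice)); // T = [i32]
}
```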
Yes. It's partially a matter of taste here.
Yeah, I'm not sure exactly what your expectations were, but &str doesn't factor into the example. Instead of String you could use a HashMap or such (that isn't Copy and isn't Deref) to get a feel, perhaps.
I'm not sure which part you're talking about here, sorry.
I had simply meant &String. I was trying to show precisely what was different from MyBox, which doesn't move it. I think it was a miscommunication in that sense.
Note that having this kind of builtin box code is a legacy thing. The more generalized protocol that RFC 809 specifies works in more-or-less exactly the same way: when that is adopted uniformly, the need for shallow drop and the Box rvalue will go away.
And if you follow the thread on the generalized syntax's acceptance and then retraction, then the box syntax's addition and removal, it's easy to see how things have changed, but hard to get a clear picture of how they are now.
If I'm not mistaken, box remains special syntax within the MIR or other intermediate representations, but I could be wrong about that.
Oh cool, I didn't know about the new intrinsic in the THIR, though admittedly I don't know much about that representation.
And yes, I should have been more clear that box patterns remain. Which is very nice, even if the syntax isn't as uniform as it was when we briefly had box constructor syntax.