Puzzling Box drop implementation

This isn’t important, but I don’t understand Box::drop.

unsafe impl<#[may_dangle] T: ?Sized, A: Allocator> Drop for Box<T, A> {
    #[inline]
    fn drop(&mut self) {
        // the T in the Box is dropped by the compiler before the destructor is run

        let ptr = self.0;

        unsafe {
            let layout = Layout::for_value_raw(ptr.as_ptr());
            if layout.size() != 0 {
                self.1.deallocate(From::from(ptr.cast()), layout);
            }
        }
    }
}

Specifically the comment:

“// the T in the Box is dropped by the compiler before the destructor is run”

How does that happen? How does the compiler know?

I used drop_in_place to drop the T, which seems logical.
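For comparison, here is a minimal sketch of what a user-written box has to do. `MyBox`, `Flag`, and `contents_dropped_with_box` are hypothetical names made up for illustration; this is not the standard library's code. Since the compiler does nothing special for such a type, its `Drop` impl must call `drop_in_place` on the contents itself before freeing the allocation.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::cell::Cell;
use std::ptr::{self, NonNull};
use std::rc::Rc;

/// Hypothetical user-written box for sized `T` only (an assumption for this sketch).
struct MyBox<T> {
    ptr: NonNull<T>,
}

impl<T> MyBox<T> {
    fn new(value: T) -> Self {
        let layout = Layout::new::<T>();
        let ptr = if layout.size() == 0 {
            // Zero-sized types need no allocation; an aligned dangling pointer is enough.
            NonNull::dangling()
        } else {
            // SAFETY: `layout` has a non-zero size.
            let raw = unsafe { alloc(layout) } as *mut T;
            match NonNull::new(raw) {
                Some(p) => p,
                None => handle_alloc_error(layout),
            }
        };
        // SAFETY: `ptr` is aligned and valid for a write of `T`.
        unsafe { ptr.as_ptr().write(value) };
        MyBox { ptr }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Unlike `Box`, nothing drops the contents for us, so do it by hand...
            ptr::drop_in_place(self.ptr.as_ptr());
            // ...and only then release the memory.
            let layout = Layout::new::<T>();
            if layout.size() != 0 {
                dealloc(self.ptr.as_ptr() as *mut u8, layout);
            }
        }
    }
}

/// Sets a shared flag when dropped, so the drop can be observed.
struct Flag(Rc<Cell<bool>>);

impl Drop for Flag {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

/// Returns true if dropping the `MyBox` also ran the destructor of its contents.
fn contents_dropped_with_box() -> bool {
    let dropped = Rc::new(Cell::new(false));
    let b = MyBox::new(Flag(Rc::clone(&dropped)));
    drop(b);
    dropped.get()
}

fn main() {
    println!("contents dropped: {}", contents_dropped_with_box());
}
```

A type like this always drops its contents as a whole, which is exactly why it cannot support moving the value out of the box.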

Box is a magic compiler-known type. It has capabilities that types implemented outside the compiler do not. For example, you can move values in and out of it, and the compiler tracks when the heap place is and isn't initialized, analogous to a stack slot (variable). The exact part of the compiler that emits code to drop the contents before dropping the Box is probably closely related to that.


I seem to have managed to reproduce most of the functionality of Box, although it requires some nightly magic to get ?Sized to coerce.

I guess there may be a reason for the compiler to do the drop of T, but it seems strange and unnecessary. Perhaps it is for historical reasons.

This is what enables moving a T value out of a Box<T>, including partially.
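A small sketch of what that looks like in practice (the function name `move_out_and_refill` is made up for illustration): dereferencing an owned Box moves the value out, and the same Box can be re-initialized by assigning through it afterwards.

```rust
/// Moves a String out of a Box, refills the same Box, and moves out again.
fn move_out_and_refill() -> (String, String) {
    let mut b: Box<String> = Box::new(String::from("first"));

    // Moves the String out of the heap place; `*b` is now uninitialized,
    // but `b` itself has not been consumed.
    let taken: String = *b;
    // println!("{b}"); // would not compile: the contents have been moved out

    // Re-initialize the box's place; no old value is dropped here,
    // because the compiler knows the place is currently empty.
    *b = String::from("second");

    // The box is usable again, so the value can be moved out once more.
    let refilled: String = *b;

    (taken, refilled)
}

fn main() {
    let (taken, refilled) = move_out_and_refill();
    println!("{taken}, {refilled}");
}
```

The same `let taken = *b;` line is rejected for other smart pointers such as `Rc<String>`, since they only offer `Deref` and the compiler does not track their contents as a movable place.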


OK (I think!). I am not sure why that is useful; I guess there is all kinds of history here.

I can see there is an example here:

It seems quite un-rust-like!

Try this.

Here's one of the relevant pieces. (Called from line 824.)

I would never have thought of doing that. It seems crazy that this made it into stable Rust 1.0.

I found a blog here:

“I suspect that the DerefMove semantics could even have been removed from Box<T> before 1.0 (I don’t find it necessary), but people had better things to do, like fixing the million other rough edges of the language that can’t be touched after backwards compatibility is a thing.”

I can hardly imagine anyone using that. How would you even know about it?

I can't remember how I first learned about it. But maybe just try to move out of a box? Or see some other code that does.

Here's another magic Box example.

For completeness, there's even more than just an all-or-nothing ability to move values out of an owned Box. You can also move things out partially. (@kpreid mentioned that as well already, I'm just noticing.)

So you can try out things like

#[allow(unused)]
#[derive(Debug)]
struct PrintOnDrop(&'static str);
impl Drop for PrintOnDrop {
    fn drop(&mut self) {
        println!("=> dropped: {self:?}");
    }
}

#[allow(unused)]
struct Struct {
    field1: PrintOnDrop,
    field2: PrintOnDrop,
    field3: PrintOnDrop,
    field4: PrintOnDrop,
}

fn main() {
    let mut x: Box<Struct> = Box::new(Struct {
        field1: PrintOnDrop("field1"),
        field2: PrintOnDrop("field2"),
        field3: PrintOnDrop("field3"),
        field4: PrintOnDrop("field4"),
    });
    println!("initialized.");
    println!("moving out & dropping some fields");
    drop(x.field1);
    drop(x.field2);

    // x now partially deinitialized (field1 and field2 not containing any value)
    _ = &x.field3; // accessible
    // _ = &x.field1; // would error: "borrow of moved value: `x.field1`"
    // x as a whole of course wouldn't be accessible either
    // (until all missing fields are restored)

    println!("\ndoing some assignments..");
    // can be re-initialized
    println!("..to field1");
    x.field1 = PrintOnDrop("field1 - new");
    _ = &x.field1; // accessible again
    println!("..and to field3");
    // already-initialized fields will trigger drop of previous value
    x.field3 = PrintOnDrop("field3 - new");

    println!("\nmain function ending now.");
    // drops field1, field3 (both new values), and field4 (original value from construction)
    // no action for field2

    // precise set of drops is determined by the compiler directly; no logic in a custom
    // `Drop` implementation could replicate (or inspect) this, so `impl Drop for Box<T>` in
    // the standard library just skips this part initially, and only handles the box's allocation
    // itself, not the destructors of its contents.
}

Output:

initialized.
moving out & dropping some fields
=> dropped: PrintOnDrop("field1")
=> dropped: PrintOnDrop("field2")

doing some assignments..
..to field1
..and to field3
=> dropped: PrintOnDrop("field3")

main function ending now.
=> dropped: PrintOnDrop("field1 - new")
=> dropped: PrintOnDrop("field3 - new")
=> dropped: PrintOnDrop("field4")

I think I personally would have first learned this only partially when I was looking for an into_inner method for Box (before even the unstable one existed). I don't remember how I eventually found out one can just dereference, but it wasn't obvious to me. Even less so was the fact that doing that (unlike an into_inner method) doesn’t actually consume the Box yet. IIRC, I only learned that much later and also only because of reading about it in some context.

The relevant kind of detail (that helped me really understand something special is going on) was the fact that you can move something back into a box afterwards, and then use it normally again. On the other hand, the detail of just when exactly deallocation of the Box’s memory happens[1] is too subtle (and also generally unimportant to people) to actually take note of, even if it was mentioned somewhere.


  1. which is often the only actual effect of this mechanism, compared to an .into_inner() method, if you don’t do anything else with the box afterwards


Good point that recognizing that the allocation sticks around is a step beyond recognizing you can unbox stuff and even re-initialize in place. I can't remember how exactly I learned that either...