Why does Box<dyn T> work?

If I recall correctly, dyn T is unsized, as follows:

fn main() {
  struct U;
  trait Tr {}
  impl Tr for U {}
  let a : dyn Tr;   // can't compile
}

But I can use the syntax inside Box, and it seems that

  1. lines A and B reveal that x holds a reference to a fat pointer,
  2. line C reveals that y holds a fat pointer.

It seems the size of dyn Tr is 16, so in what sense do we call dyn Tr unsized?

use std::mem::size_of_val;
fn main() {
  struct U;
  trait Tr {}
  impl Tr for U {}
  let x : Box<&dyn Tr> = Box::new(&U);
  let y : Box<dyn Tr> = Box::new(U);
  println!("{}",size_of_val(&*x));   // 16  B
  println!("{}",size_of_val(&*y));   // 0

  println!("{}",size_of_val(&x));    // 8   A
  println!("{}",size_of_val(&y));    // 16  C
}

Here are some more data points:

use std::mem::size_of_val;
fn main() {
  struct U;
  trait Tr {}
  impl Tr for U {}
  struct U1(u64);
  impl Tr for U1 {}
  struct U2([u8; 1337]);
  impl Tr for U2 {}
  let x : Box<&dyn Tr> = Box::new(&U);
  let y : Box<dyn Tr> = Box::new(U);
  let z1 : Box<dyn Tr> = Box::new(U1(0));
  let z2 : Box<dyn Tr> = Box::new(U2([0; 1337]));
  println!("{}",size_of_val(&*x));   // 16  B
  println!("{}",size_of_val(&*y));   // 0
  println!("{}",size_of_val(&*z1));  // 8
  println!("{}",size_of_val(&*z2));  // 1337

  println!("{}",size_of_val(&x));    // 8   A
  println!("{}",size_of_val(&y));    // 16  C
  println!("{}",size_of_val(&z1));   // 16
  println!("{}",size_of_val(&z2));   // 16
}

The size of Box<dyn Tr> is 16, not the size of dyn Tr. The size of values of type dyn Tr differs from value to value. “unsized” means something like “size only known at run-time”. The size of the dyn Tr value in your example was 0.
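To make the distinction concrete, here is a minimal sketch. The types `Small` and `Big` and the helper `sizes` are hypothetical names introduced only for illustration. It reports the size of the Box handle and the size of the value behind it separately; the second number is looked up at run time.

```rust
use std::mem::{size_of, size_of_val};

trait Tr {}

// Hypothetical implementors with different sizes.
#[allow(dead_code)]
struct Small(u8);
#[allow(dead_code)]
struct Big([u8; 100]);
impl Tr for Small {}
impl Tr for Big {}

/// Returns (size of the Box handle, size of the value it points to).
fn sizes(b: &Box<dyn Tr>) -> (usize, usize) {
    (size_of_val(b), size_of_val(&**b))
}

fn main() {
    let small: Box<dyn Tr> = Box::new(Small(0));
    let big: Box<dyn Tr> = Box::new(Big([0; 100]));

    // The handle is two pointers wide regardless of what it points to.
    assert_eq!(size_of::<Box<dyn Tr>>(), 2 * size_of::<usize>());

    println!("{:?}", sizes(&small));
    println!("{:?}", sizes(&big));
}
```

The first component is the same for both boxes; only the second one depends on which concrete type was boxed.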


Each box has two different locations to store things: one is inside the Box<T> value itself and the other is in the heap allocation managed by the Box. The inline value is the pointer, which is always Sized, but the heap allocation might not be. It's the size of this inline value that you're seeing on line C.

When you initially create the box, you store a U inside it. This is Sized, so the compiler knows how much memory needs to be allocated on the heap. You then coerce this into a dyn Tr value, which makes the compiler forget that the original value was a U. During this coercion process, the box's pointer is converted from *mut U, a thin pointer, into *mut dyn Tr, a fat pointer.
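The coercion step can be sketched as follows. The helper `pointer_widths` is a hypothetical name, and the widths are expressed in terms of `usize` rather than hard-coded, since they depend on the target.

```rust
use std::mem::size_of;

trait Tr {}
struct U;
impl Tr for U {}

/// Returns the widths of a thin and a fat raw pointer, in bytes.
fn pointer_widths() -> (usize, usize) {
    (size_of::<*mut U>(), size_of::<*mut dyn Tr>())
}

fn main() {
    let concrete: Box<U> = Box::new(U);
    // Unsizing coercion: the static type changes from Box<U> to Box<dyn Tr>,
    // and the pointer gains a second word referring to U's vtable for Tr.
    let erased: Box<dyn Tr> = concrete;

    let (thin, fat) = pointer_widths();
    assert_eq!(thin, size_of::<usize>());
    assert_eq!(fat, 2 * size_of::<usize>());
    drop(erased);
}
```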

At this point, there's no longer any way to convert the value back to a U and get it out of the box. As far as the compiler is concerned, it might be any type that also implements the Tr trait, which isn't necessarily the same size as U.

NB: There are some simplifications here; Box is special in many ways, and can't be wholly described in terms of ordinary Rust structures


Agreed. The same goes for everything built on it: for example, if we replace Box in the example with Rc or Arc, we get the same output.
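A quick way to check this, as a sketch with a hypothetical helper `handle_sizes`:

```rust
use std::mem::size_of_val;
use std::rc::Rc;
use std::sync::Arc;

trait Tr {}
struct U;
impl Tr for U {}

/// Returns the inline sizes of a Box, an Rc and an Arc holding a `dyn Tr`.
fn handle_sizes() -> (usize, usize, usize) {
    let b: Box<dyn Tr> = Box::new(U);
    let r: Rc<dyn Tr> = Rc::new(U);
    let a: Arc<dyn Tr> = Arc::new(U);
    (size_of_val(&b), size_of_val(&r), size_of_val(&a))
}

fn main() {
    // All three handles are fat pointers: data pointer plus vtable pointer.
    println!("{:?}", handle_sizes());
}
```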

If we allow

let a : dyn Tr;

and define dyn Tr as the fat pointer, which has a size of 2*usize, what kind of bad things would happen?

Rust is different from other programming languages in that it offers lots of different pointer types with different properties, particularly w.r.t. ownership. There are owned pointers Box<T>, borrowed references in two flavors &mut T and &T, and raw pointers in 3 variations *const T, *mut T, and NonNull<T>, which vary in variance and nullability. Just saying “dyn Tr is the fat pointer” doesn’t answer the question “what kind of pointer?”, hence Rust chooses to be explicit and uses Box<dyn Tr> or &dyn Tr or &mut dyn Tr or *const dyn Tr etc. to differentiate.

It’s often useful not to think of dyn Tr as a separate thing or type on its own and only consider things like &dyn Tr or Box<dyn Tr> to be (useful / ordinary) types, just like with slices you’ll usually work with &[T] or &mut [T] but not with [T] without any indirection. The whole concept of “unsized types” can be seen as a hack to allow re-using the existing ownership+borrowing infrastructure of ordinary pointer types like Box, &, &mut for fat pointers such as slices and trait objects.
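The slice analogy can be sketched like this (the helper `slice_sizes` is a hypothetical name): the reference &[u32] always has the same width, while the size of the [u32] behind it depends on the length stored in the fat pointer.

```rust
use std::mem::{size_of, size_of_val};

/// Returns (size of the `&[u32]` reference itself, size of the `[u32]` it points to).
fn slice_sizes(s: &[u32]) -> (usize, usize) {
    (size_of_val(&s), size_of_val(s))
}

fn main() {
    let data = [1u32, 2, 3, 4, 5];
    let short: &[u32] = &data[..2];
    let long: &[u32] = &data[..];

    // A slice reference is a data pointer plus a length.
    assert_eq!(size_of::<&[u32]>(), 2 * size_of::<usize>());

    println!("{:?}", slice_sizes(short));
    println!("{:?}", slice_sizes(long));
}
```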


The utility of unsized types is in type parameters. e.g. Borrow and Cow. You can't make a variable be one.
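As a rough illustration of unsized types in type-parameter position: `byte_len` and `shout` below are hypothetical helpers, not standard library functions. The first accepts any `T: ?Sized` behind a reference; the second uses `Cow<str>`, where the type parameter is the unsized `str`.

```rust
use std::borrow::Cow;
use std::mem::size_of_val;

// `T: ?Sized` opts out of the implicit `Sized` bound,
// so T may be `str`, `[u16]`, `dyn Trait`, and so on.
fn byte_len<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

// `Cow<'_, str>` is parameterized by the unsized `str`:
// it either borrows a `&str` or owns a `String`.
fn shout(input: &str) -> Cow<'_, str> {
    if input.chars().any(|c| c.is_lowercase()) {
        Cow::Owned(input.to_uppercase())
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    println!("{}", byte_len("hello"));        // T = str
    println!("{}", byte_len(&[1u16, 2][..])); // T = [u16]
    println!("{}", shout("abc"));
}
```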

That's not the size of dyn Tr, that's the size of the box (data pointer and vtable pointer). The size of a pointer doesn't depend on the size of the value it is pointing to; this is why it is at all possible to use dyn Trait objects: you have to put them behind indirection, so that the compiler can get away with only knowing the size of the pointer, without having to know the size of the pointed-to object.

- and define dyn Tr as the fat pointer, which has size of 2*usize,
+ and define dyn Tr as any object that implements the Tr trait,
dyn Tr itself is implemented as a fat pointer, but this is made transparent to the user:
struct U(u128);
trait Tr{
  fn test(&self){}
}
impl Tr for U {}
let mut a : dyn Tr = U(0);  // a will be the owner
let b = &a;              // can borrow , its semantic is reference to U
let c = &mut a;          // semantic is mut reference to U
b.test();                // test is dynamic dispatched : translated into an index into the vtable
assert_eq!(size_of_val(&a),16);    // size of a should be the size of U , not the size of fat pointer

I think what you want is: &dyn Tr, not dyn Tr. That's what a dynamic reference to an unsized type is. It's consistent notation with things like &i32.

dyn Tr is sometimes useful by itself. For instance, it can be a struct member:

struct Foo {
    a: i32,
    b: dyn Tr,
}

This one is hard to use, though; in practice, if you wanted to do this, you'd do it in a more complex way with generics.
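One way the generic version might look, as a sketch: the struct is made generic over its last field with a `?Sized` bound, a sized `Foo<U>` is built first, and the Box is then coerced to `Box<Foo<dyn Tr>>`. The method `id` and the helper `make_foo` are hypothetical additions for the demonstration.

```rust
trait Tr {
    fn id(&self) -> u32;
}

struct U;
impl Tr for U {
    fn id(&self) -> u32 {
        7
    }
}

// Generic over the last field; `?Sized` lets it be instantiated with `dyn Tr`.
struct Foo<T: ?Sized> {
    a: i32,
    b: T,
}

fn make_foo() -> Box<Foo<dyn Tr>> {
    // Construct the sized version first...
    let sized: Box<Foo<U>> = Box::new(Foo { a: 1, b: U });
    // ...then let the unsizing coercion happen at the return.
    sized
}

fn main() {
    let f = make_foo();
    println!("{} {}", f.a, f.b.id());
}
```

A `Foo<dyn Tr>` still cannot be held by value; it only exists behind the Box, just like a bare dyn Tr.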

No, it isn't. A trait object in itself is not a pointer. (You can make a trait object out of a pointer-like type if it happens to implement the trait, but the point is that there's no automatic, implicit indirection around dyn Trait. You have to be explicit about it by adding &, Box<_>, etc.)

@yushang is proposing an extension to the language that allows unsized variables.

This already exists as an unstable feature: unsized_locals - The Rust Unstable Book


Thank you, I'll check it. At least it makes let a: dyn Tr valid.

So you mean that dyn Tr always adds another indirection and is an owning fat pointer? “owning pointer” would’ve been one possible answer to the question “what kind of pointer”. (I.e. you didn’t really answer the question.) What’s gained here? You can already have Box<dyn Tr>, so I don’t see the advantage. Just use Box<dyn Tr> and you get what you want. Disadvantages of the proposed language change? Many! Adding implicit Boxing means adding implicit heap allocations; Rust likes being explicit about those. It even means forcing users to live with those heap allocations; no way to opt out.

(It also means that &dyn Tr can no longer be created from &T where T: Tr; similarly for &mut dyn Tr or Arc<dyn Tr>.)

Edit: The last point does not apply if &dyn Tr and &mut dyn Tr etc. have special meaning and don’t mean a reference to the fat pointer, but instead only one layer of indirection. However, in order to give special meaning to &dyn Tr, dyn Tr would still have to be a special kind of type, similar to the current !Sized situation; you’d just gain the ability to (syntactically) “directly” work with a dyn Tr by-value.
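For comparison, this is roughly what the current design permits: a &dyn Tr or Arc<dyn Tr> is created from an existing &T or Arc<T> in place, without any new allocation. The method `name` and the helper `describe` are hypothetical names for the sketch.

```rust
use std::sync::Arc;

trait Tr {
    fn name(&self) -> &'static str;
}

struct U;
impl Tr for U {
    fn name(&self) -> &'static str {
        "U"
    }
}

fn describe(t: &dyn Tr) -> &'static str {
    t.name()
}

fn main() {
    let on_stack = U;
    // &U coerces to &dyn Tr in place: the value stays on the stack.
    let r: &dyn Tr = &on_stack;

    let shared: Arc<U> = Arc::new(U);
    // Arc<U> coerces to Arc<dyn Tr>, reusing the same allocation.
    let erased: Arc<dyn Tr> = shared;

    println!("{} {}", describe(r), erased.name());
}
```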


To translate your example into the Box<dyn Tr> setting, add the Box and add dereferencing at the use sites:

use std::mem::size_of_val;
fn main() {
    struct U(u128);
    trait Tr {
        fn test(&self) {}
    }
    impl Tr for U {}
    let mut a: Box<dyn Tr> = Box::new(U(0)); // a will be the owner
    let c: &mut dyn Tr = &mut *a; // semantic is mut reference to U
                                  // (line moved up to avoid borrow-checking error)
    let b: &dyn Tr = &*a; // can re-borrow, its semantic is reference to U
    b.test(); // test is dynamic dispatched : translated into an index into the vtable
    assert_eq!(size_of_val(&*a), 16); // size of *a is the size of U , not the size of fat pointer
}

Unsized locals does not necessarily mean the data is stored on the heap. It can be stored on the stack, with the hidden pointer pointing to the stack. I think that's what the unstable implementation of the RFC does.

I know that unsized locals don’t introduce implicit heap allocations. But they do keep lots of restrictions around handling dyn Trait directly; it certainly isn’t the case that dyn Trait itself just becomes a fat pointer. E.g. you still cannot create a Vec<dyn Trait> and things like that. AFAICT, the current proposal doesn’t support unsized return values either.

Yes, but Box unconditionally heap-allocates (unless the wrapped value is zero-sized).

True, but @yushang never said that dyn Tr would implicitly use a Box.

But the syntax is neater. If dyn Trait were sized, we could write something like this:

let a : [dyn Trait;2]; // or
let b : Vec<dyn Trait>;
// we can even unify the impl syntax
fn test(x:dyn Trait)->dyn Trait {}
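For reference, a sketch of how these three lines are usually spelled in stable Rust today, with the indirection written out as Box. The implementors `One` and `Two`, the method `value`, and the helper `total` are hypothetical names for the demonstration.

```rust
trait Trait {
    fn value(&self) -> i32;
}

struct One;
struct Two;
impl Trait for One {
    fn value(&self) -> i32 {
        1
    }
}
impl Trait for Two {
    fn value(&self) -> i32 {
        2
    }
}

// Takes and returns a trait object by owning pointer.
fn test(x: Box<dyn Trait>) -> Box<dyn Trait> {
    x
}

fn total(items: &[Box<dyn Trait>]) -> i32 {
    items.iter().map(|t| t.value()).sum()
}

fn main() {
    // Every element is a two-word Box, so arrays and Vecs can be laid out.
    let a: [Box<dyn Trait>; 2] = [Box::new(One), Box::new(Two)];
    let b: Vec<Box<dyn Trait>> = vec![Box::new(One), test(Box::new(Two))];
    println!("{} {}", total(&a), total(&b));
}
```

The elements can have different concrete types and sizes precisely because the collection only stores the fixed-size pointers.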