Just like you can think of an ordinary reference type like &DoerImpl as being implemented as a simple pointer *const (), you can think of a trait object reference type like &dyn Doer as being implemented as a pair (*const (), &'static DoerVtable). Such a pair is often also called a “fat pointer”.
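To make the "pair" idea a bit more tangible, here is a small sketch that compares the sizes of the two reference types. The helper names thin_ref_size and fat_ref_size are made up for illustration. Note that the language does not promise the two-word size as a stable guarantee; it is simply what current compilers do.

```rust
trait Doer {
    fn do_thing(&self);
}

struct DoerImpl {}

impl Doer for DoerImpl {
    fn do_thing(&self) {
        println!("called!");
    }
}

// Hypothetical helper: size of an ordinary ("thin") reference.
fn thin_ref_size() -> usize {
    std::mem::size_of::<&DoerImpl>()
}

// Hypothetical helper: size of a trait object ("fat") reference,
// i.e. data pointer plus vtable pointer on current compilers.
fn fat_ref_size() -> usize {
    std::mem::size_of::<&dyn Doer>()
}
```

On a 64-bit target I would expect thin_ref_size() to be 8 and fat_ref_size() to be 16, i.e. one machine word versus two.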
The type DoerVtable looks roughly like
struct DoerVtable {
    size: usize,
    alignment: usize,
    drop_value: fn(*mut ()),
    do_thing: fn(*const ()),
}
and for impl Doer for DoerImpl { fn do_thing(&self) { println!("called!"); } } the compiler will also generate a vtable (if needed) that looks roughly like
static DOER_IMPL_VTABLE_FOR_DOER: DoerVtable = DoerVtable {
    size: std::mem::size_of::<DoerImpl>(),
    alignment: std::mem::align_of::<DoerImpl>(),
    drop_value: |this| {
        let this: *mut DoerImpl = this as _;
        unsafe {
            std::ptr::drop_in_place(this);
        }
    },
    do_thing: |this| {
        let this: &DoerImpl = unsafe {
            &*(this as *const DoerImpl)
        };
        this.do_thing();
    },
};
Finally, you can think of your code
let doer = DoerImpl { };
let doer_ref: &dyn Doer = &doer;
as being essentially translated into
// pseudo-code
let doer = DoerImpl { };
let doer_ref: &dyn Doer = (&doer, &DOER_IMPL_VTABLE_FOR_DOER);
I.e. all that happens is that the compiler adds in, for the cast of &DoerImpl to &dyn Doer, code that pairs up the pointer for the &DoerImpl reference with a pointer to static memory where the pre-existing correct vtable is located.
As you can see, doer is unaffected by anything, it’s still an ordinary DoerImpl. The vtable itself is stored in neither of the variables in question; it’s put into static memory at compile time, and loaded into RAM with the rest of the program’s machine code the moment your program gets started/loaded by your OS. Only a pointer to the vtable is part of doer_ref.
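If you want to play with this model, here is a hand-rolled version of the same idea that actually compiles. To be clear, this is only a sketch of the mental model, not what rustc literally emits: DoerRef, the CALLS counter, the two free functions and call_twice_through_hand_rolled_ref are all names I made up, and I use plain functions instead of closures for the vtable entries just to keep the static simple.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, only here so the calls can be observed.
static CALLS: AtomicUsize = AtomicUsize::new(0);

struct DoerImpl {}

impl DoerImpl {
    fn do_thing(&self) {
        CALLS.fetch_add(1, Ordering::SeqCst);
        println!("called!");
    }
}

// Hand-written stand-in for the compiler-generated vtable type.
#[allow(dead_code)]
struct DoerVtable {
    size: usize,
    alignment: usize,
    drop_value: fn(*mut ()),
    do_thing: fn(*const ()),
}

// Like the closures in the sketch above, these trust the caller to pass
// a pointer that really points to a DoerImpl.
fn drop_doer_impl(this: *mut ()) {
    unsafe { std::ptr::drop_in_place(this as *mut DoerImpl) }
}

fn do_thing_doer_impl(this: *const ()) {
    let this: &DoerImpl = unsafe { &*(this as *const DoerImpl) };
    this.do_thing();
}

// Lives in static memory; nothing is built at run time.
static DOER_IMPL_VTABLE_FOR_DOER: DoerVtable = DoerVtable {
    size: std::mem::size_of::<DoerImpl>(),
    alignment: std::mem::align_of::<DoerImpl>(),
    drop_value: drop_doer_impl,
    do_thing: do_thing_doer_impl,
};

// Hand-written stand-in for &'a dyn Doer: data pointer plus vtable pointer.
struct DoerRef<'a> {
    data: *const (),
    vtable: &'static DoerVtable,
    _borrow: PhantomData<&'a DoerImpl>,
}

impl<'a> DoerRef<'a> {
    // The "cast": pair the data pointer with the address of the static vtable.
    fn new(doer: &'a DoerImpl) -> Self {
        DoerRef {
            data: doer as *const DoerImpl as *const (),
            vtable: &DOER_IMPL_VTABLE_FOR_DOER,
            _borrow: PhantomData,
        }
    }

    // Dynamic dispatch: look up the function pointer, pass the data pointer.
    fn do_thing(&self) {
        (self.vtable.do_thing)(self.data);
    }
}

// Returns how many calls went through the hand-rolled fat pointer.
fn call_twice_through_hand_rolled_ref() -> usize {
    let doer = DoerImpl {};
    let doer_ref = DoerRef::new(&doer);
    let before = CALLS.load(Ordering::SeqCst);
    doer_ref.do_thing();
    doer_ref.do_thing();
    CALLS.load(Ordering::SeqCst) - before
}
```

Calling call_twice_through_hand_rolled_ref() from main runs DoerImpl::do_thing twice through the vtable. Creating the DoerRef only copies two addresses, and doer itself is never touched.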
Perhaps an apt comparison is
let s = "Hello World";
for _ in 0..100 {
    do_something_with(s);
}
vs
for _ in 0..100 {
    do_something_with("Hello World");
}
as this also involves a pointer into static memory.
Or to compare
let n = 42;
for _ in 0..100 {
    do_something_with(n);
}
vs
for _ in 0..100 {
    do_something_with(42);
}
as this also features reading from a variable in a loop vs. using a compiler-generated constant in a loop (the constant is a number in this case, and a fixed pointer value in the other case).
As far as I can tell, the latter version is “more performant” before optimizations because the value is a constant baked into the machine code instead of being read out of a variable. In the best case that variable lives in a register (which then can’t be used for other purposes), and otherwise it has to be read from stack memory.
However, after optimization, the compiler will almost certainly have figured out that the variable always holds the same constant value, so it will optimize the code into the better version anyway, and the choice really doesn’t matter.
As the latter version is also more convenient to write (no need for the doer_ref), I don’t see any reason not to do it that way.
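For completeness, here are the two variants side by side for the trait object case. This is just a sketch under my own assumptions: do_something_with is a hypothetical function taking a &dyn Doer, and the CALLS counter and calls_made_by helper exist only so that you can check both variants do the same work.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, only here so the calls can be observed.
static CALLS: AtomicUsize = AtomicUsize::new(0);

trait Doer {
    fn do_thing(&self);
}

struct DoerImpl {}

impl Doer for DoerImpl {
    fn do_thing(&self) {
        CALLS.fetch_add(1, Ordering::SeqCst);
    }
}

// Hypothetical consumer of a trait object reference.
fn do_something_with(doer: &dyn Doer) {
    doer.do_thing();
}

// Variant 1: create the fat pointer once and reuse the variable.
fn cast_once_outside_loop(doer: &DoerImpl) {
    let doer_ref: &dyn Doer = doer;
    for _ in 0..100 {
        do_something_with(doer_ref);
    }
}

// Variant 2: let the coercion to &dyn Doer happen at every call.
fn cast_inside_loop(doer: &DoerImpl) {
    for _ in 0..100 {
        do_something_with(doer);
    }
}

// Counts how many do_thing calls a variant performs.
fn calls_made_by(variant: fn(&DoerImpl)) -> usize {
    let doer = DoerImpl {};
    let before = CALLS.load(Ordering::SeqCst);
    variant(&doer);
    CALLS.load(Ordering::SeqCst) - before
}
```

Both calls_made_by(cast_once_outside_loop) and calls_made_by(cast_inside_loop) give 100. The coercion in the second variant only pairs the existing pointer with the fixed address of the vtable, so there is nothing expensive to hoist out of the loop.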