Dyn trait vs (data, vtable)


New to Rust, and I've been confused by the following:

What are the advantages of manually constructing a (data, vtable)-like structure, rather than using a dyn trait, which itself should be a (data, vtable) representation?

I've seen this pattern in:

  • std::task::RawWaker
  • tokio's RawTask

Neither of these examples documents the rationale behind this type of design, so I assume I must be missing something obvious.

In this blog post, withoutboats says:

Rust has support for relatively easy dynamic dispatch using trait objects, but because of the rules of object safety, this easy form of dynamic dispatch is often quite limited. Indeed, we'll find it's too limited to support our use case, which is why all of the API proposals have a "Waker" type, instead of just using references to dyn Wake trait objects.

What are the limitations of dynamic dispatch through normal dyn Traits? Is this the reason the examples mentioned above do not use dyn Traits?

We use dyn Trait over (data, vtable) for the same reason we use references over raw pointers, Rc over (pointer_data, pointer_counter), and Box over malloc/free: it's a safe abstraction which can be proven not to cause errors.

A few reasons:

  • (data, vtable) layout is unstable for dyn Trait.
  • The vtable pointer could end up pointing at the wrong type's vtable, because (safe) code can reassign it:
    // VTable here is a hypothetical hand-rolled vtable type.
    let mut my_dyn: (*const (), *const VTable) = (&2usize as *const usize as *const _, VTable::usize_table());
    my_dyn.1 = VTable::string_table(); // nothing stops this reassignment
    unsafe { ((*my_dyn.1).print)(my_dyn.0) }; // Uh oh! usize data, String vtable
  • As seen above, there are potentially unsafe uses of the type, so we'd end up building a safe abstraction over a potentially unsafe thing... oh wait; we're back to dyn Trait!
  • You, as the programmer, cannot directly access the vtable for a type without unsafe code and going through an unstable structure (the layout of (data, vtable) is, once again, unstable).
  • How would I go about trying to make sure the destructor is run?
  • dyn Trait is a static type and can be checked at compile time. While it may be called a "runtime type", it can still be statically checked, which makes the type system stronger.

The main reason not to use dynamic dispatch in general is the performance loss from going through several pointers (the data, the vtable, and then the function).

If your structure cannot be expressed using the standard dyn Trait structure, restructuring your program should be your first try; only develop your own (data, vtable) structure once you've convinced yourself you cannot go forward without it.

In most cases, generics are fine.

@OptimisticPeach I think he was asking the opposite: why use (data, vtable) over dyn Trait, as is the case with the types he listed.

There are a few potential use cases for this:

  1. To have a more compact representation of the (data, vtable) pair, as with tokio's RawTask

RawTask is only 1 word in size, whereas the fat pointers that represent trait objects are 2 words.

  2. To allow dynamically generating parts of the process (none of the above)
  3. To control the layout (RawTask)
  4. Other optimizations (std::task::RawWaker)

With RawWaker we don't want to constrain what can be used to represent the waker, so we would like to type erase it efficiently. We have two options here, since we don't want the end product (Waker) to have any generics on its interface:

  1. use trait objects

But there is a catch: it requires an allocation

pub trait RawWaker { ... }

pub struct Waker {
    raw: Arc<dyn RawWaker + Send + Sync>,
    //   /\ this allocation; pick your favorite smart pointer
}
This is unacceptable, because we want to support no_std futures in environments that don't have an allocator. So how do we fix this?

  2. roll our own trait objects

This is what Rust went with. This way you supply a data pointer and a RawWakerVTable, which together do all the same things as a normal data pointer and vtable, but with the added bonus of not having to deal with the allocation.

Normally in std applications we do allocate, using an Arc, but Rust doesn't want to compromise on no_std futures. So we must use dirty tricks like this to get things working.


Note that both components you mentioned are:

  1. Very low level, so only developers of async executors and runtimes need to touch them, and those developers are expected to be used to reasoning about every bit of the memory representation.

  2. Triggered in the hot path of all async code, so optimizing them benefits the entire ecosystem.

So they're exceptionally optimized down to every last bit, at the cost of large parts of their convenience.

Also note that those "optimizations" were not present in the initial versions of futures@0.1 and tokio@0.1; such hardcore optimizations were added based on years of experience gathered from use cases including production servers. You shouldn't optimize hard without working code and a profiling setup. Correct code first, make it faster next.

