Vtable method search technique

Hello.

How is the virtual table and the search for a method called for a specific structure implemented internally? I mean by the number of instances of these vtable/es - one instance of a vtable is created, which is stored in a trait and then, as in an array (a vtable in the form of an array?), is searched for dynamically. If so, how is the search and comparison of the array index with the specific implementation for the specific structure performed? Or is a separate internal structure created for each impl block of a trait that is linked to the vtable, and while the compiler determines the type (which was substituted for 'dyn') at runtime it is replaced with an instance of this structure and then a call is made based on the order in which the methods are defined?

trait Shape {
    fn area(&self);

    fn draw(&self);
};


struct Rect;

struct Circle;

impl Shape for Rect {
   fn area(&self) {
      println!("3")
   };
	
   fn draw(&self) {
      println!("Rect")
   };
};


impl Shape for Circle {
   fn area(&self) {
      println!("4")
   };
	
   fn draw(&self) {
      println!("Circle")
   };
}


fn draw_dynamic(a: &dyn Shape) {
    a.draw();
    a.area();
}

how the draw_dynamic function will determine in the vtable for which structure it needs to call the methods 'draw' and 'area'?

I'm not sure what specifically you are asking about, but the main difference between rust and C++ regarding dynamic dispatch is that, rust use "fat pointers" where the vptr (a pointer to the vtable) is part of the pointer itself, while C++ uses "thin pointers" but "fat" objects, meaning the vptr is embedded in the pointee.

in your example, the argument a of function draw_dynamic is of type &dyn Shape, which contains both both the data pointer and the vptr.

here's an example:

// trait defines the vtable structure
trait Draw {
  fn draw(&self);
}
// type defines the data structure
struct Circle {
    radius: f32;
}
// compiler generates vtable according to `impl` block
impl Draw for Circle {
    fn draw(&self) { todo!() }
}

fn main() {
    // object has no "hidden" field
    assert_eq!(size_of::<Circle>(), size_of::<f32>());
    // object pointer is "thin"
    assert_eq!(size_of::<&Circle>(), size_of::<isize>());
    // trait object pointer is "fat"
    assert_eq!(size_of::<&dyn Draw>(), size_of::<isize>() + size_of::<&Circle>);
}
// base class defines vtable structure
struct Drawable {
    virtual void draw() = 0;
}
// derived class implements the virtual methods:
class Circle: public Drawable {
    float radius;
public:
    virtual void draw() override {
        //...
    }
}
int main() {
    // the "base" class has a "hidden" vptr member
    assert(sizeof(Drawable) == sizeof(intptr_t));
    // the derived class inherits the "hidden" vptr member from the base class
    assert(sizeof(Circle) == sizeof(intptr_t) + sizeof(float));
    // pointers in C++ are all thin, though
    assert(sizeof(Drawable *) == sizeof(intptr_t));
    assert(sizeof(Circle *) == sizeof(intptr_t));
    assert(sizeof(void *) == sizeof(intptr_t));
}

if this is not what you are asking for, feel free to elaborate the question.

There is no search. A vtable is closer to a struct than a “table” or array data structure, and, like a struct, every piece of it is at a fixed offset. There is one (potential) vtable for each (concrete, not generic) implementation of each trait. A pointer to dyn Shape includes a pointer to the vtable structure for that particular implementation, so calling a method on &dyn Shape or Box<dyn Shape> consists of

  • following the vtable pointer to find the vtable struct,
  • getting a function pointer out of that struct,
  • then calling the function.

Here is a DIY implementation of vtables you can plug into your example:

/// This struct is like the vtable that the compiler creates.
/// (Actual vtables have a little more information than just the
/// methods.)
struct ShapeVtable {
    // The `self` arguments have been replaced with `*const ()` to
    // “erase” the type; we can’t store them with the actual type.
    area: fn(*const ()),
    draw: fn(*const ()),
}

/// This struct resembles a `&'a dyn Shape`.
struct ShapeDynRef<'a> {
    ptr: *const (),
    vtable: &'static ShapeVtable,
    lt: PhantomData<&'a ()>,
}

/// The compiler internally produces an `impl Shape for dyn Shape`,
/// which does roughly what this impl does.
impl Shape for ShapeDynRef<'_> {
    fn area(&self) {
        (self.vtable.area)(self.ptr)
    }

    fn draw(&self) {
        (self.vtable.draw)(self.ptr)
    }
}

/// This `From` implementation corresponds to coercing a `&T` to
/// `&dyn Shape`.
impl<'a, T: Shape> From<&'a T> for ShapeDynRef<'a> {
    fn from(reference: &'a T) -> Self {
        ShapeDynRef {
            ptr: std::ptr::from_ref(reference).cast::<()>(),
            lt: PhantomData,

            // Create a vtable from the methods of `<T as Shape>`.
            // In reality, the compiler allocates this only once;
            // for this toy example I’ll just leak the allocation.
            vtable: Box::leak(Box::new(ShapeVtable {
                draw: |this| T::draw(unsafe { &*this.cast::<T>() }),
                area: |this| T::area(unsafe { &*this.cast::<T>() }),
            })),
        }
    }
}

You can try it out like:

fn main() {
    let shape = Rect;
    let shape_ref = &shape;
    let shape_dyn = ShapeDynRef::from(shape_ref);
    draw_dynamic(shape_dyn);
}

(Don’t actually use this code — it’s much less flexible than actual dyn.)

7 Likes

In the fat pointer @nerditation mentioned there is a data pointer and a vtable pointer. The layout of a vtable is shown in this article, for example (Trait Objects and the Vtable). The trait must also be object safe Traits - The Rust Reference to support dynamic dispatch, this way the method on a trait can be resolved to the corresponding implementation on a concrete type. I stumbled upon this diagram by Raph Levien which has a diagram for Rust fat pointers (top right) which visualizes a trait object.

Your answer is exactly what I was looking for, thank you. But I still have some questions about the details that you described in your "toy" example.

  1. why should we “erase” type and can't store link to the "self" which points to concrete type?
struct ShapeVtable {
    // The `self` arguments have been replaced with `*const ()` to
    // “erase” the type; we can’t store them with the actual type.
    area: fn(*const ()),
    draw: fn(*const ()),
}

1.2 Same question for ptr: *const () in ShapeDynRef struct. Why we should "erase" type?

1.3. Did I understand correctly that an "instance/object" of the 'ShapeDynRef' structure is created for each type that implements the trait 'Shape' and it enters the work when it sees the keyword 'dyn' in runtime?

1.3 Same question and for ShapeVtable. For example, if I have 5 structures that implements 1 trate, does this mean that 5 "instances/objects" of the vtable structure will be implemented (for each of the impl block trait)?

1.4 it is not entirely clear how the necessary instance of the structure is linked at the moment when the compiler sees the keyword 'dyn' in 'draw_dynamic' function - if I can have several instances of 'ShapeDynRef' structure, how does the moment happen (it also happens in runtime and not at the compilation stage as in the case of generics)?

1.5 What you mean by compiler allocates this only once? Do you mean that it allocates a piece of memory in the heap for a particular structure, rather than storing it in binary form for the entire lifetime of the program?

// Create a vtable from the methods of `<T as Shape>`.
            // In reality, the compiler allocates this only once;
            // for this toy example I’ll just leak the allocation.
            vtable: Box::leak(Box::new(ShapeVtable {
                draw: |this| T::draw(unsafe { &*this.cast::<T>() }),
                area: |this| T::area(unsafe { &*this.cast::<T>() }),
            })),

1.6 maybe there is an exact code for the implementation of the 'dyn' structure/trait (the place where the vtable is stored and implemented) by analogy with go? Or are these details hidden behind LLVM and not shown/written in rust itself?

1.7 at what point does the conversion of a specific type of implementing trate (Circle, Rect) to the ShapeDynRef type work out? Mean, when does the implementation from the From trait work out?

I wanted to understand at a more primitive level how vtable works (without comparing it with other languages). It caused difficulties, because rust the first language that has the concept of "virtual" tables that I saw, but thank you for responding and drawing an analogy with c++ and their virtual methods and introducing a new concept of "fat pointers" for me :wink:. I have few questions about your example:

  1. Do I understand correctly that by "fat pointer" you mean an abstract structure that stores pointers to other abstract structures and acts as an "intermediary"?
  2. Why did you specify the sizeof in the asserts block in the code? Does this have any meaning in the concept of dynamic dispatch or does it relate to another part?
  3. Do I understand correctly that the whole difference between virtual methods in c++ and rust is how many "jumps" need to make before get to the desired function (rust == 2, c++ == 3)?

The type information is erased from the structs because the pointer/reference type does not carry any information about the actual type it points to and ShapeDynRef is the equivalent of &'a dyn Shape. The type does not contain any information about the actual type that implements Shape. It just includes the pointer as a value to the accoring vtable.

There is a vtable for every type that implements the trait. If the compiler has to translate a specific type T to a dyn Trait it creates a vtable for this type if it has not done so already. The vtable is not on the heap but in the binary. Nevertheless, the space has to be allocated (in the binary).

At runtime the thin pointer &T is translated to &dyn Trait by combining the pointer value and the pointer to the vtable. In C++ the vtable is stored as a reference for every object inline so you have one mor indirection. This post has a nice visualization.

One implementation detail: Since every codegen unit is independent there might be multiple vtables for the same type's trait implementation. This is definitely true for dynamic linking. I do not know what the current status is regarding deduplication with static linking.

1 Like

sorry, I assumed you were from a C++ background. because I only hear C++ people talking about it, and never the Java people, despite in Java essentially every method is "virtual" by default. the concept of "virtual" functions is a core concept for the object-oriented programming paradigm, and is how "runtime polymophism" is impelmented, but with the exception of C++, most languages don't expose the implementation details to the programmer.

the code example given by @kpreid explains the vtable structure and rust's trait objects way better than I can do it, so I won't repeat here.

not exactly. any pointer type could 'acts as an "intermediary"', "fat" pointers are different from "thin" pointers in that, traditionally in C, all pointers have the size, i.e. sizeof(void*) (not including vendor/platform specific extensions such as the far pointers in 16-bit msvc); however, in other languages, a pointer might need extra information besides the address itself, and "fat" pointer just means it is a pointer type and has larger size than a single address.

for example, rust uses fat pointers for trait objects, and this extra piece of information happens to be a pointer to the vtable for the implementing type.

it's just a concise way to show the difference between C++'s (thin) pointers and rust (fat) pointers. this is orthogonal to how the vtable is implemented by the compiler though.

well, I think you get the idea -- if by "jumps", you mean memory access, then yes, C++ would indeed need one more load in order to resolve the target address. and if you count both d-cache side "loads" and i-cache side "branches", then 2 for rust and 3 for C++ would be the correct number.

in C++, you need to first load the hidden vptr (the pointer to the vtable) through the object pointer, then load the function pointer through the vptr, and finally call the function pointer.

in rust, the trait object reference is "fat", so you already have the vptr at hand, alongside the pointer to the object "data", just load the function pointer and call it.

I’ll individually answer each of the questions you directed to me:

The whole point is that there is only one type ShapeVtable which points to the functions for any kind of Shape implementor. We can’t store the function pointers with the original concrete parameter type because that type might be &Rect or might be &Circle, which couldn’t be stored in the same vtable type.

Because it might be actually a &Rect or a &Circle. Now, we could make ShapeDynRef generic,

struct ShapeDynRef<'a, T> {
    ptr: &'a T,
    vtable: &'static ShapeVtable,
}

but that would defeat the point, because then you can’t store ShapeDynRefs made from different concrete types. Perhaps it would help to clarify this point of terminology: dyn Shape is a form of type erasure. The reason my example code contains type erasure is because if it didn’t contain any type erasure, it would not be doing what dyn Shape does.

Per value, not per type. ShapeDynRef<'a> corresponds to &'a dyn Shape. You create one every time you use an &Rect or an &Circle as a &dyn Shape. The real dyn version of the line

let shape_dyn = ShapeDynRef::from(shape_ref);

is

let shape_dyn: &dyn Shape = shape_ref;

but it doesn’t matter whether the type is written out; draw_dynamic(&Rect) would do the same thing in one line.

Abstractly, that is exactly correct. (However, in practice, the compiler is allowed to duplicate vtables for its convenience, or to combine ones that in fact have all the same functions when looked at at the machine-code level. So actually comparing vtable pointers can be misleading; the compiler even warns against doing this.)

No, I mean the second thing. The allocation happens at compile time and the vtable data becomes part of the binary, just like any static or literal value, or function code.

dyn is entirely made up of things the compiler does, not the standard library. I can’t tell you much about it. Maybe this page on unsizing coercions has some vaguely useful pointers?. I can say that it’s all part of rustc per se and not LLVM, though.

Changing &Rect to &dyn Shape is a coercion, specifically an unsized (or unsizing) coercion. Coercions can happen at any coercion site if the input type and the required type are different. In particular, in your code, a coercion will typically happen whenever you write draw_dynamic(&some_shape) — the use of &some_shape as function argument is a coercion site, and draw_dynamic’s signature provides the expected dyn type to coerce to.

3 Likes

Thanks for the detailed explanation, this answered my question exactly, but it also raised a number of questions related to "erased" types, and the general question sounds like - why is everything exactly like this and implemented in no other way for dynamic polymorphism?

  1. "tricks" with customizing the desired type from an null pointer exist due to the dynamic nature of polymorphism, but why not use some "hybrid" format to create a "ShapeDynRef" structure (as you indicated in the answer with generics)? And why did you indicate that this would not make sense - just because it expects casts from an null pointer instead of a specific link in the vtable structure? Maybe I don't see some feature of compiled languages or a rust type system? In theory, a hybrid version(with generic + dynamic cast from null pointer) could be implemented at the compilation stage due to monomorphization?
struct ShapeDynRe1<'a, T> {
    ptr: &'a T,
    vtable: &'static ShapeVtable,
    lt: PhantomData<&'a ()>,
}
  1. Can the compiler apply optimizations for the 'dyn' structure (according to the number of created "instances/objects")? By analogy with Vtable structures, if they are duplicates? Or is there a feature here and that's why for each line where the 'dyn' structure is created, there must be a unique instance of this structure (and it doesn't matter that inside it may contain the same pointer to 'data' and 'vtable')?

Per value , not per type . ShapeDynRef<'a> corresponds to &'a dyn Shape .

Thank you for answer. learning a system language after simpler languages like python/js is hell.
I didn't quite understand the idea related to the size of the pointer, do you mean that the "fat" pointer takes up more space in memory because it stores several (in the case of rust 2) pointers to other memory areas? Or did you mean that in other languages pointers have some other "metadata", but in rust this "metadata" is not needed and therefore the pointer weighs less in memory?

Thank you for details. Yep, i saw this post, but some details are not disclosed there, and they just misled me in my thoughts (I took visualization - so the "instance" of vtable is the same for all-all implementations and is represented as a data structure like an array or a hash map. I didn't quite understand about code generation, do you mean that duplicate "vtable" can be stored in a static (binary format) for different modules? And you also mentioned that for static linking - do you mean by the word "linking" - dispatching? If so, do you want to say that the result of the "monomorphization" process for static polymorphism is also stored in binary format and there may be the same "anomalies" with duplicate implementations as with vtable?

Thanks for answer, by this article that the functions declared in the trait should have the limitations specified in the article (in simple words, do not have a generic/self - because they are not output immediately?)

I’m not familiar with the history of how this particular part of Rust was designed. It does fit neatly into Rust’s use of traits for polymorphism — one can often use generics or trait objects in the same place, and the choice is a tradeoff of performance characteristics (compile speed vs. execution speed vs. machine code size).

I don’t understand what you mean. If you mean my remark about ShapeDynRef<'a, T> — that simply doesn’t work; if you don’t erase the type then you’ve achieved nothing.

By the way, “null pointer” means a pointer that points to address 0 and is always invalid. It has nothing to do with the type of the pointer.

Under some conditions the optimizer can “devirtualize” the code, completely eliminating the use of the vtable pointer, if it notices that only one concrete type is used. But there is no way to reduce “duplicates” at run time — that would be like trying to deduplicate all occurrences of the number 0. The additional management would be more expensive than storing all the copies of the number.

2 Likes

this.

using the term "pointer" for memory addresses was popularized by C (I don't know if it was invented by C though). in C, "pointer" just means "address", because of the simplicity of its type system.

for languages with advanced types, "pointer" and "address" are not necessarily synonymous. some languages doesn't use the term "pointer" for memory references so no confusion there, while many languages inherit the C terminology. in situations where it is necessary to differentiate "more than an address" and "just an address", the term "fat pointer" (or sometimes "wide pointer") is often used, where the opposite is often called "thin pointer", "narrow pointer", or just "pointer".

in rust, all pointers semantically consist of both an address and some metadata. for "normal" types (the term is "statically-sized types"), the metadata is fully known at compile time and no information need to be stored at runtime. however, pointers to dynamically sized types (DST for short) must be "fat" because the metadata (see Pointee for more info) can only be known at runtime.

trait objects are one example of DST, the other ones are str and slices, and user defined structs where the last field is a DST.

2 Likes

this is really useful, but it also raises the question of why "fat" pointers(your answer: in rust, all pointers semantically consist of both an address and some metadata) are used in rust for all types (even scalars whose size is fixed), if, as you said, this information is known at the compilation stage and is not needed in runtime. Or does rust at the compilation stage for types with a known size remove all meta information that is not available in runtime?

Yes, you're correctly interpreting what they said.

1 Like

thx

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.