How much slower is dynamic dispatch, really?

So, if I understood it correctly, the performance hit of dynamic dispatch comes from having to look up which implementation to use at runtime. At the same time, I've seen people in the wild recommending strategies such as using an enum to figure it out, but isn't that also basically an extra step at runtime?

I don't know much about what happens at the low level, so I'm wondering if there is a significant difference between dynamic dispatch and something like pattern matching on an enum.

In practice, in absolute terms, either method is very fast, and you won't notice any performance difference unless the call is in some very hot path called many millions of times.

Dynamic dispatch usually prevents the compiler from inlining the code and knowing what the called code is doing. So the cost is not just the dynamic call itself, but also a lost opportunity to optimize further. If you dispatch via an enum, there's still some run-time work required to figure out what code to run, but the compiler can see through the enum variants and knows what it's calling. It might be able to optimize that further if the functions called are small enough to be inlined and/or if multiple enum variants execute some code in common.
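As a minimal sketch of the two approaches (the trait, types, and functions here are made up for illustration): the enum version exposes every possible callee to the compiler, so each arm can be inlined:

```rust
trait Animal {
    fn legs(&self) -> u32;
}

struct Dog;
struct Spider;

impl Animal for Dog {
    fn legs(&self) -> u32 { 4 }
}
impl Animal for Spider {
    fn legs(&self) -> u32 { 8 }
}

// Dynamic dispatch: the compiler only sees "some Animal" here, so the
// actual `legs` implementation is found through the vtable at runtime.
fn legs_dyn(animal: &dyn Animal) -> u32 {
    animal.legs()
}

// Enum dispatch: all variants are visible, so each arm can be inlined
// and common code between arms can be merged by the optimizer.
enum AnyAnimal {
    Dog(Dog),
    Spider(Spider),
}

fn legs_enum(animal: &AnyAnimal) -> u32 {
    match animal {
        AnyAnimal::Dog(d) => d.legs(),
        AnyAnimal::Spider(s) => s.legs(),
    }
}

fn main() {
    println!("{}", legs_dyn(&Dog));
    println!("{}", legs_enum(&AnyAnimal::Dog(Dog)));
    println!("{}", legs_enum(&AnyAnimal::Spider(Spider)));
}
```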

Dynamic dispatch has some memory overhead: Box<dyn Trait> is a wide pointer that uses an extra word to store the vtable pointer. (C++ is different; the vtable pointer is stored in the object itself, which can be worse for latency when you have to look something up in it.) On the other hand, compiling code with dyn Trait can be faster than compiling generic code and can significantly slim down your executable (which is good because instruction cache space is limited). See this article.
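You can check that extra word yourself; a minimal sketch (the trait and struct names are made up):

```rust
use std::mem::size_of;

trait Draw {
    fn draw(&self);
}

struct Circle;

impl Draw for Circle {
    fn draw(&self) {}
}

fn main() {
    Circle.draw();

    // Box<dyn Draw> is a "wide" pointer: data pointer + vtable pointer.
    println!("Box<dyn Draw>: {} bytes", size_of::<Box<dyn Draw>>());
    // A box of a concrete type is a single thin pointer.
    println!("Box<Circle>:   {} bytes", size_of::<Box<Circle>>());
    // On a 64-bit target this prints 16 and 8.
}
```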

No, not only.

A purely dynamically dispatched call:

  • has to look up the function pointer in the vtable;
  • has to perform the call indirectly, through the function pointer;
  • cannot be inlined;
  • is harder for the branch predictor and speculative execution than a direct call;
  • may cause an instruction cache miss if the function pointer (i.e., the underlying type) changes frequently on a given code path.

In contrast, a match on an enum variant is just a direct conditional jump.
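As a rough sketch of the difference in shape (the Op enum and the helper functions here are made up): a match branches directly on the discriminant, while a call through a function pointer, which is roughly what a vtable call boils down to, is an indirect call:

```rust
enum Op {
    Inc,
    Double,
}

// Enum dispatch: a direct conditional branch on the discriminant.
fn apply_enum(op: &Op, x: u32) -> u32 {
    match op {
        Op::Inc => x + 1,
        Op::Double => x * 2,
    }
}

// Indirect dispatch through a function pointer: roughly the shape of a
// call through a vtable slot.
fn apply_ptr(f: fn(u32) -> u32, x: u32) -> u32 {
    f(x)
}

fn inc(x: u32) -> u32 { x + 1 }
fn double(x: u32) -> u32 { x * 2 }

fn main() {
    println!("{}", apply_enum(&Op::Inc, 41));
    println!("{}", apply_enum(&Op::Double, 21));
    println!("{}", apply_ptr(inc, 41));
    println!("{}", apply_ptr(double, 21));
}
```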

In reality, however, whatever the compiler ends up generating in the final binary may not be mechanistically the same as the source code you wrote. Optimizers can see through certain virtual calls and inline them or transform them to jumps just like static calls or enum matching. This is called devirtualization.

For example, here you can observe for yourself that whatever the "dynamic" call is doing has been completely eliminated and the body of greet() has been inlined into main().
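The linked code isn't reproduced here, but a sketch of that kind of example might look like this (only greet() comes from the post above; the other names are made up). In an optimized build, the compiler can see that only one concrete type ever reaches the trait object and devirtualize the call:

```rust
trait Greeter {
    fn greet(&self) -> &'static str;
}

struct English;

impl Greeter for English {
    fn greet(&self) -> &'static str {
        "hello"
    }
}

// Nominally a dynamic call through the vtable...
fn print_greeting(g: &dyn Greeter) {
    println!("{}", g.greet());
}

fn main() {
    // ...but the only concrete type flowing in here is `English`, so an
    // optimized build can devirtualize the call and inline greet() into main().
    print_greeting(&English);
}
```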

Also, Box<dyn Trait> requires a heap allocation, but the equivalent enum doesn't. This doesn't affect the cost of dynamic calls directly, but it can be a significant performance penalty if your objects are short-lived.
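As a sketch of that difference (the Shape trait and its types are made up): the trait-object version pays one heap allocation per value, while the enum values can live inline, e.g. directly in a Vec's buffer:

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}
impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.0 * self.0 }
}

enum AnyShape {
    Square(Square),
    Circle(Circle),
}

impl AnyShape {
    fn area(&self) -> f64 {
        match self {
            AnyShape::Square(s) => s.area(),
            AnyShape::Circle(c) => c.area(),
        }
    }
}

fn main() {
    // Each Box<dyn Shape> is a separate heap allocation (plus the Vec's buffer).
    let boxed: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];

    // The enum values are stored inline in the Vec's buffer: no per-value allocation.
    let inline: Vec<AnyShape> = vec![AnyShape::Square(Square(2.0)), AnyShape::Circle(Circle(1.0))];

    let total: f64 = boxed.iter().map(|s| s.area()).sum::<f64>()
        + inline.iter().map(|s| s.area()).sum::<f64>();
    println!("total area: {total}");
}
```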
