Is there any reason to use `fn(t: &impl Trait)` and not `fn(t: &dyn Trait)`?

I tried to see the difference with godbolt. It seems that the assembly for all of those functions is going to be the same (exactly as I was expecting). But using `impl Trait` will create a different implementation for every instantiation, which means more code bloat.

I absolutely understand why one would like to use `fn foo(t: impl Trait)`, since having a different monomorphisation for each type may lead to better runtime performance. But when using references, especially const references, is it useful? I would expect that all indices would be fixed anyway. And if not, could the compiler generate only the `&dyn` version instead of the `&impl` one? Maybe with `impl`, the reference can be a thin pointer, while it needs to be a fat pointer for `&dyn`?

I mean, just like for non-references, they don't do the same thing, and there are some tradeoffs. With impl Trait, there are advantages such as thin pointers and the compiler being able to apply optimizations that are specific to the underlying type, whereas with dyn Trait, only a single function is compiled.
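To make the thin vs. fat pointer point concrete, here is a minimal sketch. The `Shape` trait and `Square` type are made up for illustration and do not come from the thread. It only compares reference sizes; the two-pointer size for `&dyn Shape` is what current compilers produce in practice rather than something I'd present as a formal layout guarantee.

```rust
use std::mem::size_of;

// Hypothetical trait and type, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Size of a reference to a concrete (sized) type: a single data pointer.
fn thin_ref_size() -> usize {
    size_of::<&Square>()
}

/// Size of a trait-object reference: data pointer plus vtable pointer.
fn fat_ref_size() -> usize {
    size_of::<&dyn Shape>()
}

fn main() {
    let s = Square(2.0);
    // Coercing &Square to &dyn Shape attaches the vtable pointer.
    let d: &dyn Shape = &s;
    println!("area via dyn: {}", d.area());
    println!("&Square:    {} bytes", thin_ref_size());
    println!("&dyn Shape: {} bytes", fat_ref_size());
}
```

Inside a `fn f(t: &impl Shape)` the argument is a plain `&Square` (or whatever the caller passed), so it is passed as one pointer, while `fn g(t: &dyn Shape)` receives both halves.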

That gives the difference I would expect:

The impl version calls the method directly (well, as directly as PIC can), while the dyn version calls through the vtable.

So it has all the normal differences that one would expect from this around what happens when other functions call these ones. It might not matter here, but if it were making multiple calls that could be inlined in one but not the other, say, that may impact the behaviour of the optimizer.
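For anyone who wants to reproduce this, here is a sketch of the kind of code being compared. It is a reconstruction, not the original godbolt snippet: the names `T`, `X`, `Y`, `foo`, `use_impl` and `use_dyn` echo the symbols in the assembly quoted in this thread, but the method bodies and return values are assumptions.

```rust
// Hypothetical reconstruction; bodies and return values are assumed.
trait T {
    fn foo(&self) -> u32;
}

struct X;
struct Y;

impl T for X {
    fn foo(&self) -> u32 {
        1
    }
}

impl T for Y {
    fn foo(&self) -> u32 {
        2
    }
}

// Generic: a separate copy is compiled per concrete type used, and each
// copy calls that type's `foo` statically, so the calls can be inlined.
fn use_impl(t: &impl T) -> u32 {
    t.foo() + t.foo()
}

// Trait object: compiled once; each `foo` call goes through the vtable
// unless the optimizer can see the concrete type at the call site.
fn use_dyn(t: &dyn T) -> u32 {
    t.foo() + t.foo()
}

fn main() {
    println!("{} {}", use_impl(&X), use_impl(&Y));
    println!("{} {}", use_dyn(&X), use_dyn(&Y));
}
```

With two calls to `foo` per function, the generic copies can be folded down to constants, whereas the `dyn` version has to make two indirect calls whenever it is not itself inlined into a caller that knows the type.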

(Though it certainly seems to be true that it's more common to overuse impl in Rust than it is to overuse dyn.)
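If the bloat from `impl` is the concern, one pattern that can help is a thin generic shim that forwards to a single non-generic `dyn` worker. This is only a sketch: `describe` and `describe_dyn` are hypothetical names, and `std::fmt::Display` is used simply because it is a convenient standard trait.

```rust
use std::fmt::Display;

// Non-generic worker: compiled once, no matter how many types are used.
fn describe_dyn(item: &dyn Display) -> String {
    format!("<{}>", item)
}

// Thin generic shim: each instantiation only performs the coercion from
// `&T` to `&dyn Display` and forwards, so the duplicated code stays tiny.
#[inline]
fn describe(item: &impl Display) -> String {
    describe_dyn(item)
}

fn main() {
    println!("{}", describe(&42));
    println!("{}", describe(&"hi"));
}
```

Callers keep the generic signature, while the bulk of the work exists only once in the binary, at the cost of dynamic dispatch inside the worker.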

But isn’t

        call    qword ptr [rip + <example::Y as example::T>::foo@GOTPCREL]

and

        call    qword ptr [rsi + 24]

both calls at a constant offset from a register (rip or rsi), and thus exactly as fast in both cases?

And regarding the optimizer, why would `use_dyn(&dyn)` prevent the optimizer from inlining `use_dyn(&A)` and not `use_dyn(&B)`?

Well, I guess it would be clearer with `-C relocation-model=static`. With that, it's

        call    <example::X as example::T>::foo

vs

        call    qword ptr [rsi + 24]

I'm not actually sure what the loader does with PIC. Once the code is loaded, it could conceivably rewrite all the RIP-relative addressing to the now-known address. But if nothing else, it'll affect the CPU itself, since the RIP-relative call always goes to the same place, so no speculation or branch prediction is needed to execute through it.

Also, the calls to `use_dyn` have a couple more instructions than the calls to `use_impl`.
