Missing optimizations for "flattened" trait objects

I use the "flattened" trait objects discussed in this thread. Unfortunately, this results in sub-optimal assembly, as can be seen here: Compiler Explorer

Instead of pointing to the original functions (lines 17 and 21), in the flattening function the compiler creates call_once wrappers which immediately jump to the original functions (the jump is possible because the wrappers and the original functions have exactly the same ABI). Yes, it's a very minor overhead, but still an unpleasant one. Is there a reason why the compiler cannot collapse those wrappers? Or is it simply a missed optimization opportunity?
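For readers without the Compiler Explorer link handy, here is a rough sketch of the pattern I mean (names like FlatFoo, flatten and the shim functions are mine, not necessarily the exact code from the godbolt link):

```rust
// A "flattened" trait object: instead of the usual (data pointer,
// vtable pointer) pair, the data pointer is stored next to one
// function pointer per method.
trait Foo {
    fn foo1(&self) -> u32;
    fn foo2(&self) -> u32;
}

struct Bar(u32);

impl Foo for Bar {
    #[inline(never)] // emulates a non-#[inline] impl from another crate
    fn foo1(&self) -> u32 { self.0 + 1 }
    #[inline(never)]
    fn foo2(&self) -> u32 { self.0 + 2 }
}

// Flattened representation.
struct FlatFoo {
    data: *const (),
    foo1_ptr: unsafe fn(*const ()) -> u32,
    foo2_ptr: unsafe fn(*const ()) -> u32,
}

// These shims are what shows up in the assembly as `call_once`-style
// wrappers that immediately jump to the original methods. Since the
// ABIs match, the compiler could in principle store pointers to
// <T as Foo>::foo1 / <T as Foo>::foo2 directly instead.
unsafe fn shim_foo1<T: Foo>(data: *const ()) -> u32 {
    unsafe { (*(data as *const T)).foo1() }
}
unsafe fn shim_foo2<T: Foo>(data: *const ()) -> u32 {
    unsafe { (*(data as *const T)).foo2() }
}

fn flatten<T: Foo>(val: &T) -> FlatFoo {
    FlatFoo {
        data: val as *const T as *const (),
        foo1_ptr: shim_foo1::<T>,
        foo2_ptr: shim_foo2::<T>,
    }
}
```

Calling (flat.foo1_ptr)(flat.data) then goes through the shim, which is the extra jump I would like to see optimized away.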

I could work around it by using #[inline(always)] on Bar::foo1 and Bar::foo2, but it's not always desirable to force inlining (e.g. if the Foo trait is also used directly). It would also make such code a bit harder to analyze in a debugger. An alternative solution could be to transmute function pointers, as discussed in the linked thread, but I am not confident about the soundness of such an approach.
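For reference, the transmute alternative looks roughly like this (my own sketch, not code from the linked thread; as said above, its soundness is exactly what I am unsure about):

```rust
trait Foo {
    fn foo1(&self) -> u32;
}

struct Bar(u32);

impl Foo for Bar {
    #[inline(never)]
    fn foo1(&self) -> u32 { self.0 + 1 }
}

// Transmute the concrete method's fn pointer to the type-erased
// signature, relying on &Bar and *const () being passed identically.
// This skips the wrapper entirely, at the cost of unclear soundness.
fn foo1_ptr_for_bar() -> unsafe fn(*const ()) -> u32 {
    let concrete: fn(&Bar) -> u32 = <Bar as Foo>::foo1;
    unsafe { std::mem::transmute(concrete) }
}
```

It works in practice on mainstream platforms, but I don't know whether the language guarantees it.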

cc @SkiFire13

You're using #[inline(never)], so I think that it's quite reasonable that they're not being inlined.

Having no annotation at all has the same effect as #[inline(always)] in this case, so you don't have to force inlining.

2 Likes

#[inline(never)] is used to emulate a large function or a trait implementation which comes from a different crate and is not marked with #[inline]. In theory, because of the matching ABI the compiler could skip the wrappers and store pointers to the original functions in foo1_ptr and foo2_ptr, but for some reason this is not done.