Is `struct Dip(f32)` always as fast as `f32`?

I have some UI layout code, and I was thinking of creating a new type to help keep my units straight.

/// Device independent pixels
struct Dip(f32);
impl Add for Dip { ... }
impl Mul for Dip { ... }
...

Is this always going to be as fast as raw f32? I remember a C++ talk called something like "There are no zero cost abstractions".

Will you be working with thousands of Dip per form?

I think that's possible, if I have a more complex UI widget like a long scrolling list.

Most optimizations are not guaranteed in Rust, but it is extremely common to have simple wrappers for primitive types that are expected to perform as well as the unwrapped types.

The layout is practically guaranteed to be the same (if you want a hard guarantee, add #[repr(transparent)]). The operations, unless you jump through some serious hoops to trick the compiler, should all get trivially optimized down to the exact same thing.
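
To make that concrete, here is a minimal sketch of the wrapper with the layout guarantee spelled out. The derives, the `pub` field and the `same_layout_as_f32` helper are my own additions for illustration, not something from the original question, and the operator bodies are just the obvious guess at what the elided impls do.

```rust
use std::mem::{align_of, size_of};
use std::ops::{Add, Mul};

/// Device independent pixels.
// repr(transparent) guarantees the same layout and ABI as the inner f32.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Dip(pub f32);

impl Add for Dip {
    type Output = Dip;

    fn add(self, rhs: Dip) -> Dip {
        // Forwards straight to the f32 addition.
        Dip(self.0 + rhs.0)
    }
}

impl Mul for Dip {
    type Output = Dip;

    fn mul(self, rhs: Dip) -> Dip {
        // Forwards straight to the f32 multiplication.
        Dip(self.0 * rhs.0)
    }
}

/// Hypothetical helper: true if Dip has the same size and alignment as f32.
pub fn same_layout_as_f32() -> bool {
    size_of::<Dip>() == size_of::<f32>() && align_of::<Dip>() == align_of::<f32>()
}
```

With #[repr(transparent)] in place, `same_layout_as_f32()` has to return true. Without the attribute it will almost certainly still return true, it just isn't promised.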

Within a single crate this will typically be equally performant. Across crates you may have to mark functions #[inline] to make it performant.
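
A rough sketch of what that looks like, assuming Dip lives in a library crate that other crates depend on. The `get` accessor is a hypothetical name I made up for the example. Keep in mind that #[inline] is only a hint to the compiler, not a guarantee.

```rust
use std::ops::Add;

/// Device independent pixels.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Dip(pub f32);

impl Add for Dip {
    type Output = Dip;

    // Makes the function body available for inlining in downstream crates.
    #[inline]
    fn add(self, rhs: Dip) -> Dip {
        Dip(self.0 + rhs.0)
    }
}

impl Dip {
    /// Hypothetical accessor that unwraps the raw value.
    #[inline]
    pub fn get(self) -> f32 {
        self.0
    }
}
```

Generic functions are instantiated in the calling crate anyway, so the attribute matters mostly for small non-generic methods like these.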

I find the title of that talk somewhat misleading. He's talking about costs other than runtime performance: human comprehensibility of the code, etc.

You can check if it ends up optimizing the same:

And file bugs if #[repr(transparent)] and #[inline] are not enough.
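
One way to do that check yourself is to write the same function twice, once over f32 and once over the wrapper, and compare the generated assembly (for example in Compiler Explorer, or locally with `cargo rustc --release -- --emit asm`). This is only a sketch: `sum_raw` and `sum_dip` are names I picked for the example, and #[inline(never)] is there just so each function shows up as its own symbol in the output.

```rust
use std::ops::Add;

/// Device independent pixels.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Dip(pub f32);

impl Add for Dip {
    type Output = Dip;

    #[inline]
    fn add(self, rhs: Dip) -> Dip {
        Dip(self.0 + rhs.0)
    }
}

/// Baseline: sums raw f32 values from left to right.
#[inline(never)]
pub fn sum_raw(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0, |acc, &x| acc + x)
}

/// Same computation through the wrapper type.
#[inline(never)]
pub fn sum_dip(xs: &[Dip]) -> Dip {
    xs.iter().fold(Dip(0.0), |acc, &x| acc + x)
}
```

If the two function bodies come out identical in a release build, the wrapper costs nothing for that code path.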

Note that if you ever want to transmute between Dip and f32, or e.g. &[Dip] and &[f32], repr(transparent) is required to avoid UB.
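
For the slice case, a minimal sketch could look like the following. Note that I'm using `std::slice::from_raw_parts` with a pointer cast instead of a literal `mem::transmute` on the slice reference, since that is the more usual way to reinterpret a slice. The function names are hypothetical.

```rust
/// Device independent pixels.
// Without repr(transparent) the casts below would not be sound.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Dip(pub f32);

/// Views a slice of Dip as a slice of the underlying f32 values.
pub fn as_f32_slice(dips: &[Dip]) -> &[f32] {
    // SAFETY: Dip is #[repr(transparent)] over f32, so both types have the
    // same size and alignment, and every Dip is a valid f32.
    unsafe { std::slice::from_raw_parts(dips.as_ptr().cast::<f32>(), dips.len()) }
}

/// Views a slice of f32 as a slice of Dip.
pub fn as_dip_slice(values: &[f32]) -> &[Dip] {
    // SAFETY: same layout argument as above, and Dip adds no extra
    // invariants on top of f32.
    unsafe { std::slice::from_raw_parts(values.as_ptr().cast::<Dip>(), values.len()) }
}
```

Both functions borrow the input, so the returned slice cannot outlive the original data.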

Note that the talk also mentions some abstractions with runtime costs. An example mentioned in the talk and very similar to the one in this thread is unique_ptr, which despite just wrapping a pointer is ultimately a struct and is hence passed on the stack when performing function calls.

Note that these cases usually stem from the all-important backward compatibility, and they only apply if you are not on Windows.

unique_ptr is non-trivial for the purposes of calls not because it's a struct, but because it has a non-trivial destructor!

And while there's an attribute that can make it trivial… clang and LLVM are stuck with the inefficiency (while MSVC, surprisingly enough, is the winner here).

The story with tuple is similar, and the distinction is even more subtle. If you are using libc++ (the default on Android, iOS and macOS) then it's zero-cost, but if you use libstdc++… then nope.

Rust doesn't have a stable ABI, so I would expect such crazy corner cases to be rare… but they are still possible. That's why people say that “nothing is guaranteed with optimizations, but usually X happens”.

That's exactly what repr(transparent) avoids in Rust.

It's passed in a register either way. All three versions (f32, regular wrapper, transparent wrapper) produce the same assembly code, passing the argument and return value in a register: playground.

I don't think repr(transparent) should be recommended as a performance optimization. It only restricts what the compiler is allowed to do. If the compiler has no reason to think adding padding will improve performance, it will not arbitrarily add padding. There is no need to force that manually (unless you need that for reasons other than performance).
