Most optimizations are not guaranteed in Rust, but it is extremely common to have simple wrappers for primitive types that are expected to perform as well as the unwrapped types.
The layout is practically guaranteed to be the same (if you want a hard guarantee, add #[repr(transparent)]). The operations, unless you jump through some serious hoops to trick the compiler, should all get trivially optimized down to the exact same thing.
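To make that concrete, here is a minimal sketch of such a wrapper. The names `Meters`, `MetersTransparent` and `same_layout_as_f32` are made up for illustration. Only the `#[repr(transparent)]` version has a documented layout guarantee; for the plain one, the size and alignment check below is what you should expect to see in practice, not something the language promises.

```rust
use std::mem::{align_of, size_of};
use std::ops::Add;

// Hypothetical newtype wrappers, named for illustration only.

/// Plain newtype: default (Rust) representation, layout not formally specified.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Meters(f32);

/// Same wrapper, but with a layout that is guaranteed to match `f32`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct MetersTransparent(f32);

impl Add for Meters {
    type Output = Meters;

    // Forwards straight to the `f32` addition; nothing here should survive
    // optimization as extra work.
    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

/// Returns true if `T` has the same size and alignment as `f32`.
fn same_layout_as_f32<T>() -> bool {
    size_of::<T>() == size_of::<f32>() && align_of::<T>() == align_of::<f32>()
}
```

Calling `same_layout_as_f32::<Meters>()` and `same_layout_as_f32::<MetersTransparent>()` from `main` is a quick way to check this on your own target.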
Note that the talk also mentions some abstractions with runtime costs. An example mentioned in the talk, and very similar to the one in this thread, is unique_ptr, which, despite just wrapping a pointer, is ultimately a struct and is hence passed on the stack when performing function calls.
And while there's an attribute that can make it trivial… Clang and LLVM are stuck with the inefficiency (while MSVC, surprisingly enough, is the winner here).
The story is similar with tuple, and here the distinction is even more subtle. If you are using libc++ (the default on Android, iOS and macOS) then it's zero-cost, but if you use libstdc++… then nope.
Rust doesn't have a stable ABI, so I would expect such crazy corner cases to be rare… but they are still possible. That's why people say that “nothing is guaranteed with optimizations, but usually X happens”.
It's passed in a register either way. All three versions (f32, regular wrapper, transparent wrapper) produce the same assembly code, passing the argument and return value in a register: playground.
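For anyone who wants to reproduce the comparison, a sketch along these lines should do. This is a hypothetical reconstruction, not the code from the playground link; the type and function names are made up. Build it in release mode and compare the assembly of the three functions.

```rust
// Hypothetical reconstruction of the three-way comparison; names are
// illustrative and not taken from the linked playground.

/// Regular wrapper with the default representation.
#[derive(Clone, Copy)]
pub struct Wrapper(pub f32);

/// Wrapper with a layout and ABI guaranteed to match `f32`.
#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct TransparentWrapper(pub f32);

// `#[inline(never)]` keeps each function as a separate symbol so that its
// assembly can be inspected on its own.

#[inline(never)]
pub fn double_raw(x: f32) -> f32 {
    x * 2.0
}

#[inline(never)]
pub fn double_wrapped(x: Wrapper) -> Wrapper {
    Wrapper(x.0 * 2.0)
}

#[inline(never)]
pub fn double_transparent(x: TransparentWrapper) -> TransparentWrapper {
    TransparentWrapper(x.0 * 2.0)
}
```

The assembly you get depends on the target and compiler version, so treat any single result as an observation rather than a guarantee.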
I don't think repr(transparent) should be recommended as a performance optimization. It only restricts what the compiler is allowed to do. If the compiler has no reason to think adding padding will improve performance, it will not arbitrarily add padding. There is no need to force that manually (unless you need that for reasons other than performance).