Math result varies in debug vs. release builds

On Linux, this program produces different results depending on whether you run with --release:

fn main() {
    println!("{:?}", 0.7f32.tanh());
}

Debug playground: 0.60436773

Release playground: 0.6043678

I know tanh returns an approximation, not a perfectly rounded answer for all inputs. So it's not surprising when the results vary by platform or by compiler implementation. That's fine.

Nonetheless, it's odd that opt-level alone causes this difference in behavior.

  • It doesn't seem too hard to make math behave consistently across opt-levels
  • It's certainly valuable for math to behave consistently
  • Looking at the assembly, both builds contain a call to tanhf, and both pass the same constant value! How can it be right that they get different answers? Or... is the release build ignoring the result??

Is this a bug?


For reference, on Windows 7 both debug and release print 0.6043678.

Confirmed on Debian 10 / rustc 1.68.0: debug 0.60436773, release 0.6043678.

gcc at all optimization levels: 0.6043678.

In the release build's assembly, the result is already precomputed as a constant, and that constant is what gets printed.
Why it still calls tanhf seems weird. (My guess is whoever added the code for the precalculated result just forgot to remove the call.)

OK, that at least seems like it has to be a minor bug. How does it make sense to constant-fold a function call if you're somehow not allowed to optimize the actual call away? Is it calling the function for its side effects?

If I change the method to .exp(), .tan(), .asin(), .sinh(), or .cosh(), the call out to libm is fully optimized away. For .atan(), it's not.

Looks like a bug; please file an issue on the rust-lang repo.


Filed #108965.


OK, now that that's dealt with, back to the original issue.

How about this example?

fn main() {
    println!("{:?}", 0.7f32.tanh());
    println!("{:?}", std::hint::black_box(0.7f32).tanh());
}

On a scale of 0-10, how dirty does it feel that this prints two different numbers in a release build?

I'm about a 7.

Depends which one is correct.

0.7_f32 is exactly 11744051/16777216, and

tanh(11744051/16777216)
= 0.6043677695504778846372451444485265950613765494380278310390125379...
= 10139608.613.../16777216
~ 10139609/16777216
= 0.6043678_f32

So the optimized one is exactly right (it's the only representable value within ±½ULP of the exact result), but the runtime one is the other answer within ±1ULP.
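The bit-level claims are easy to check mechanically; here's a sketch that decodes 0.7f32 into its exact fraction and confirms the two printed values are adjacent floats:

```rust
// Check: 0.7f32 is exactly 11744051 / 2^24, and the debug and release
// outputs are neighbouring f32 values, i.e. exactly 1 ULP apart.
fn main() {
    let bits = 0.7f32.to_bits();
    let significand = (bits & 0x7f_ffff) | 0x80_0000; // restore the implicit leading 1
    let exponent = ((bits >> 23) & 0xff) as i32 - 127 - 23;
    assert_eq!(significand, 11_744_051);
    assert_eq!(exponent, -24); // so 0.7f32 == 11744051 / 16777216 exactly

    let lo = 0.60436773f32; // debug output
    let hi = 0.6043678f32;  // release output
    assert_eq!(hi.to_bits() - lo.to_bits(), 1); // adjacent representable values
    println!("ok");
}
```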

So for me it's not that dirty. I'm more sad that libm isn't doing better. (Though getting these functions both accurate and fast is a seriously hard problem.)


Floats by themselves must be at least a 7.5.