NaN vs Result in math standard libraries


#1

I am new to Rust and loving it. I recently hit a bug in C++ using acos, where rounding errors made a calculation slightly larger than 1, which made acos return NaN and caused failures later on. It took a while to track down and fix, and I thought “I bet Rust would have prevented that error by having acos return a Result”. When I looked, however, the std acos in Rust returns NaN instead of a Result.

I am curious what the reasoning was for Rust to return NaN instead of Result for cases such as this or getting the sqrt of a negative number. Is it for performance reasons? Or just a result of directly wrapping the standard C library? I was also hoping there might be a crate that safely wraps them, but couldn’t find one. Does a crate exist that returns Results for standard math functions?
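For illustration, here is a minimal sketch of what such a wrapper could look like. The names `DomainError`, `checked_acos` and `checked_sqrt` are hypothetical and not taken from std or any existing crate; only `f64::acos` and `f64::sqrt` are real std methods.

```rust
/// Hypothetical error type for this sketch; not part of std.
#[derive(Debug, PartialEq)]
enum DomainError {
    /// The input was NaN or outside the function's domain.
    OutOfDomain(f64),
}

/// acos that reports out-of-domain input instead of returning NaN.
fn checked_acos(x: f64) -> Result<f64, DomainError> {
    // The negated range test also rejects NaN, which fails every comparison.
    if !(-1.0..=1.0).contains(&x) {
        return Err(DomainError::OutOfDomain(x));
    }
    Ok(x.acos())
}

/// sqrt that reports negative or NaN input instead of returning NaN.
fn checked_sqrt(x: f64) -> Result<f64, DomainError> {
    if x.is_nan() || x < 0.0 {
        return Err(DomainError::OutOfDomain(x));
    }
    Ok(x.sqrt())
}

fn main() {
    // A value pushed slightly above 1, as a rounding error might do.
    let x = 1.0 + f64::EPSILON;
    assert!(x.acos().is_nan()); // std silently returns NaN
    assert_eq!(checked_acos(x), Err(DomainError::OutOfDomain(x)));
    assert_eq!(checked_acos(1.0), Ok(0.0));
    assert!(checked_sqrt(-1.0).is_err());
}
```

The caller then has to decide at the call site whether to clamp, propagate with `?`, or unwrap.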


#2

Rust in general has its floating-point types match whatever the IEEE behaviour is, so that canonical algorithms do what’s expected of them.

I’d love to see a thorough crate with never-NAN float types, especially since I think that’s a better answer to "f32 isn’t Ord" sadness than a newtype that arbitrarily sorts NANs at one end, making min and max confusing.


#3

What do you think about noisy_float, which was recently discussed in the context of the Chucklefish AMA on Reddit? Basically, it gives you float wrappers that check for absence of NaN and inf in debug mode.

In general, I tend to think about NaN as a product of bad design in IEEE 754, right next to denormals. Trapping on invalid or abnormally lossy math operations would have been a much better default, but the IEEE committee got stuck in a JavaScript-like “must keep running at all cost no matter how stupid the result gets” mentality, of which you can still find traces if you look at the writings of Kahan from around that time. Of course, you can work around this hardware ALU design flaw in software, but it’s quite costly, which is why no sane programming environment does it in release mode.


#4

Thanks for the reference to the Chucklefish AMA on Reddit. It was a great read.

I also hadn’t seen noisy_float. It looks like a good start. With a panic though I do have to hope that it gets triggered during testing and not after deployment. I like that Result forces me to consider the possibilities at the time I am writing it.


#5

A Result would be clearer, but that clarity comes at a price: you would then need to explicitly check every float operation (of which the average CPU can execute hundreds of billions per second) for a failure case that happens only very rarely in well-written code. This would make computational code both messier (you cannot just compute “a + b” anymore) and less efficient (the error check must run on every float operation). Exceptions/panics just work better for this kind of “exceptional failure in a very common operation” use case.
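To make the trade-off concrete, here is a small sketch comparing a plain expression with a version that checks after every operation. `NotFinite`, `check`, `scale_shift` and `scale_shift_checked` are hypothetical names invented for this example.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct NotFinite;

/// Turns a NaN or infinite value into an error.
fn check(x: f64) -> Result<f64, NotFinite> {
    if x.is_finite() { Ok(x) } else { Err(NotFinite) }
}

/// Plain float code: one readable expression, no checks.
fn scale_shift(a: f64, x: f64, b: f64) -> f64 {
    a * x + b
}

/// The same computation with a check after every operation.
fn scale_shift_checked(a: f64, x: f64, b: f64) -> Result<f64, NotFinite> {
    let product = check(a * x)?;
    check(product + b)
}

fn main() {
    assert_eq!(scale_shift(2.0, 3.0, 4.0), 10.0);
    assert_eq!(scale_shift_checked(2.0, 3.0, 4.0), Ok(10.0));

    // The multiplication overflows: the plain version quietly yields Inf,
    // the checked version stops at the first failing step.
    assert!(scale_shift(f64::MAX, 2.0, 0.0).is_infinite());
    assert_eq!(scale_shift_checked(f64::MAX, 2.0, 0.0), Err(NotFinite));
}
```

Even with `?`, every intermediate result needs its own binding and its own branch, which is the cost described above.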


#6

I’m currently writing a floating-point-heavy program and I’m definitely feeling the pain. I’ve been thinking about a library solution for a while now.

My current idea is: Using session types like so: say, F64 is a wrapper around f64 that is guaranteed to be non-NaN and non-Inf, and it dereferences to f64 so that it remains usable with libs that are built around standard floats.
Of course, F64 is totally ordered etc.

But all operations on F64 that may produce NaN or Inf return DirtyF64. DirtyF64 still implements all float operations, so you can use it in calculations, but to get the value out, it must be unwrapped (panics on failure) or unpacked (returns Option).

The idea is that instead of checking every single operation, we let the compiler track the range, and checks occur only when necessary.
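A rough sketch of how the two types described above might fit together. This is an assumption about the design, not a published crate: only `Add` and `Div` are shown, and `new`, `unpack` and `unwrap` are hypothetical method names following the description.

```rust
use std::cmp::Ordering;
use std::ops::{Add, Deref, Div};

/// Finite float: never NaN, never ±Inf. In a real crate the field would
/// be private to a module so that `new` is the only way in.
#[derive(Debug, Clone, Copy, PartialEq)]
struct F64(f64);

/// Result of arithmetic that may have produced NaN or ±Inf.
#[derive(Debug, Clone, Copy)]
struct DirtyF64(f64);

impl F64 {
    /// The range check happens here, once.
    fn new(x: f64) -> Option<F64> {
        if x.is_finite() { Some(F64(x)) } else { None }
    }
}

// Deref makes f64 methods available on F64.
impl Deref for F64 {
    type Target = f64;
    fn deref(&self) -> &f64 {
        &self.0
    }
}

// A total order is sound because NaN is excluded by construction.
impl Eq for F64 {}
impl Ord for F64 {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.partial_cmp(&other.0).expect("F64 is never NaN")
    }
}
impl PartialOrd for F64 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl From<F64> for DirtyF64 {
    fn from(x: F64) -> DirtyF64 {
        DirtyF64(x.0)
    }
}

impl DirtyF64 {
    /// Checked conversion back to F64; None if the value is NaN or ±Inf.
    fn unpack(self) -> Option<F64> {
        F64::new(self.0)
    }
    /// Panicking conversion back to F64.
    fn unwrap(self) -> F64 {
        self.unpack().expect("float computation produced NaN or Inf")
    }
}

// Arithmetic on clean values yields a dirty value without any check.
impl Add for F64 {
    type Output = DirtyF64;
    fn add(self, rhs: F64) -> DirtyF64 {
        DirtyF64(self.0 + rhs.0)
    }
}
impl Div for F64 {
    type Output = DirtyF64;
    fn div(self, rhs: F64) -> DirtyF64 {
        DirtyF64(self.0 / rhs.0)
    }
}
// Dirty values keep flowing through further arithmetic unchecked.
impl Add for DirtyF64 {
    type Output = DirtyF64;
    fn add(self, rhs: DirtyF64) -> DirtyF64 {
        DirtyF64(self.0 + rhs.0)
    }
}

fn main() {
    let one = F64::new(1.0).unwrap();
    let zero = F64::new(0.0).unwrap();

    // Two operations, a single check at the end.
    let bad = zero / zero + DirtyF64::from(one);
    assert!(bad.unpack().is_none());

    let two = (one + one).unwrap();
    assert!(one < two); // ordering comes from Ord
    assert_eq!(two.sqrt(), 2.0_f64.sqrt()); // f64 method through Deref
}
```

Because `F64` implements `Ord`, a `Vec<F64>` can be sorted with plain `sort()`, which is one of the ergonomic wins the design is after.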

Does anyone see holes in that concept? Currently I’m prototyping a little bit to see if it’s possible to get the ergonomics right.

Another possibility: make one type, but with parameters for various invariants (non-NaN, non-zero etc.), and define all operations so that they return fitting types (for example, adding two positive, non-NaN, non-Inf floats gives a positive, non-NaN float)… maybe overengineered and pointless.


#7

I like your DirtyF64 idea. It seems to find the balance between normal f64 and noisy_float in that you can still pick up errors, but won’t blow up the entire application if you encounter one (i.e. from a debug_assert!(!some_number.is_nan())).


#8

That looks like a nice design direction! Compared to the “check value when variable is set” approach used by noisy_float, I expect this “check value when variable is read” approach to allow better run-time performance, which could in turn allow leaving the checks on in release mode, at the cost of exhibiting a worse ability to locate the source of the error.

In this sense, the two approaches seem complementary: the noisy_float approach seems better suited for debugging, and the DirtyF64 approach seems better suited for contract correctness checking. So maybe a nice design would be to have a “debug” hook which checks whenever the floating-point value is set in order to achieve precise error reporting via panicking, and an always-on “release” hook which checks when the value is read in order to make sure that the F64 contract remains upheld. This could be done by implementing DirtyF64 using a noisy_float. Just food for thought.
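As a sketch of that two-hook idea, the example below uses a plain `debug_assert!` for the set-time check rather than noisy_float, so it does not show noisy_float's actual API. `DirtyF64`, `set`, `unpack` and `unwrap_or` are hypothetical names.

```rust
/// Hypothetical wrapper; the name follows the discussion above.
#[derive(Debug, Clone, Copy)]
struct DirtyF64(f64);

impl DirtyF64 {
    /// "Set" hook: panics at the point of failure, in debug builds only.
    fn set(value: f64) -> DirtyF64 {
        debug_assert!(value.is_finite(), "non-finite float produced: {value}");
        DirtyF64(value)
    }

    /// "Read" hook: always on, in debug and release builds alike.
    fn unpack(self) -> Option<f64> {
        if self.0.is_finite() { Some(self.0) } else { None }
    }

    /// Recover from a failed computation with a fallback value.
    fn unwrap_or(self, fallback: f64) -> f64 {
        self.unpack().unwrap_or(fallback)
    }
}

fn main() {
    let ok = DirtyF64::set(2.0_f64.sqrt());
    assert!(ok.unpack().is_some());

    // With debug assertions enabled this `set` call would panic right here,
    // so the sketch only exercises it when they are compiled out.
    if !cfg!(debug_assertions) {
        let bad = DirtyF64::set((-1.0_f64).sqrt());
        assert_eq!(bad.unwrap_or(0.0), 0.0);
    }
}
```

In a debug build the panic points at the operation that went wrong; in a release build the error only surfaces at the read, where it can still be handled.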


#9

I agree. I like the balance achieved with the DirtyF64 idea. It could also be nice to have an optional way of checking and panicking as suggested. If you do start developing it I would like to try it out. If not I might try to put something together like that.


#10

On a side note, you may want to keep the table at the top of this article from Kahan under your pillow, as a reference of what typical circumstances will lead an IEEE 754 implementation to generate NaNs and Infs from non-NaN and non-Inf input. In particular, note that even add/sub/mul are “unsafe”, as overflow will lead to the generation of a +/-Inf value.
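A quick way to see the overflow case in Rust, using only std constants and methods (`classify` is a hypothetical helper written for this example):

```rust
/// Labels a float as "finite", "inf" or "nan".
fn classify(x: f64) -> &'static str {
    if x.is_nan() {
        "nan"
    } else if x.is_infinite() {
        "inf"
    } else {
        "finite"
    }
}

fn main() {
    let big = f64::MAX;
    assert_eq!(classify(big), "finite");

    // Finite inputs, infinite outputs: the exponent overflows.
    assert_eq!(classify(big + big), "inf");
    assert_eq!(classify(big * 2.0), "inf");
    assert_eq!(classify(-big - big), "inf");

    // Once an Inf exists, the next step can turn it into NaN.
    let overflowed = big * 2.0;
    assert_eq!(classify(overflowed - overflowed), "nan");
}
```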


#11

So maybe a nice design would be to have a “debug” hook which checks whenever the floating-point value is set

It would be a tradeoff between better error reporting and the ability to handle failure without panic (for example, calling unwrap_or on DirtyF64). I think debug checking should be at least optional.

If you do start developing it I would like to try it out.

Cool! I would like to do it (famous last words :smiley:)


#12

noisy_float has got you covered here. If you check the documentation, you will find that the float-checking behaviour is fully customizable, which in particular lets you turn it off given a suitable compile-time configuration.


#13

Personally I was imagining the type as extended reals, so ±∞ would be fine. If you ever flip things into logarithms, for example, it’s very useful to have log(0) = −∞ and exp(−∞) = 0. And the infinities sort fine, so they don’t block Ord.

(Of course that only makes mul safe, since ∞−∞ is still NaN.)
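The logarithm and ordering behaviour mentioned above can be checked directly with std. `sorted_no_nan` is a hypothetical helper for this example and assumes its input contains no NaN.

```rust
/// Sorts floats, assuming the caller guarantees there is no NaN.
fn sorted_no_nan(mut v: Vec<f64>) -> Vec<f64> {
    v.sort_by(|a, b| a.partial_cmp(b).expect("no NaN in this list"));
    v
}

fn main() {
    // log(0) = -inf and exp(-inf) = 0 hold for IEEE floats.
    assert_eq!(0.0_f64.ln(), f64::NEG_INFINITY);
    assert_eq!(f64::NEG_INFINITY.exp(), 0.0);

    // Infinities compare like any other value, so they sort to the ends.
    let v = sorted_no_nan(vec![1.0, f64::INFINITY, f64::NEG_INFINITY, 0.0]);
    assert_eq!(v, [f64::NEG_INFINITY, 0.0, 1.0, f64::INFINITY]);
}
```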


#14

Not quite. ∞ × 0 = NaN.


#15

Oops! Good point. (And ∞∕∞ in div.)
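Putting the cases from the last few posts side by side: the sketch below lists which of the four basic operations turn a given non-NaN pair into NaN. `nan_ops` is a hypothetical helper written for this example.

```rust
/// Returns the names of the basic operations that produce NaN for (a, b).
fn nan_ops(a: f64, b: f64) -> Vec<&'static str> {
    let results = [
        ("add", a + b),
        ("sub", a - b),
        ("mul", a * b),
        ("div", a / b),
    ];
    results
        .iter()
        .filter(|(_, value)| value.is_nan())
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    let inf = f64::INFINITY;
    // inf - inf and inf / inf are indeterminate.
    assert_eq!(nan_ops(inf, inf), ["sub", "div"]);
    // inf * 0 is indeterminate as well.
    assert_eq!(nan_ops(inf, 0.0), ["mul"]);
    // Opposite signs move the problem from sub to add.
    assert_eq!(nan_ops(inf, -inf), ["add", "div"]);
    // Finite operands are not safe either: 0 / 0 is NaN.
    assert_eq!(nan_ops(0.0, 0.0), ["div"]);
}
```

So with infinities allowed, none of the four operations is closed over the non-NaN values.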


#16

Great. I will keep an eye on this thread, and try it out as soon as you have something ready to share.


#17

I can see the rationale here, and that’s probably why noisy_float provides both “no NaNs” and “no NaNs and no Infs” types. At the same time, the problem with Inf in IEEE 754 is that it is ambiguous: given an Inf value, you don’t know if you’re getting it because you have purposely used it or because some computation somewhere has overflowed the exponent.

In theory, there’s a way to know: just check the overflow flag in the status word. In practice, however, that’s somewhat fragile. Assuming you even have a way to read that status word (not sure if that’s always the case, e.g. on GPUs I don’t recall having an instruction for that), you will then face the fact that, like errno, the IEEE 754 status word is effectively a global (more precisely, thread-local) variable which can be modified by any computation, so all it takes to make it meaningless is an OS/runtime optimization which decides that it’s not worth saving on context switches…

…which is why I think the IEEE committee should really have used trap-based error handling for all but the most trivial “Inexact” exceptions, and let OSs and math libraries handle the traps in the way they like best. But alas, there’s no rewriting the past, and as of IEEE 754-2008, trapping on floating-point exceptions remains an optional extension of the standard which cannot be relied upon in portable programs.


#18

I’m really on the fence about that one. Infinity is often a valid “starting value” in many algorithms. But it’s rare that it’s a desired result.
And I really dislike providing separate types for that. IMO finding the right defaults is key for such a library. If it’s unergonomic and people have to read a lot of docs before using it and are forced to make decisions that they can’t really make in an informed way, they’ll probably just use standard floating point instead.


#19

If you want to allow for “inf-as-an-extended-real” without allowing for “inf-as-an-error-code”, then you will need to somehow preserve the relevant bits of the IEEE 754 status word until the DirtyF64 -> F64 transformation is carried out in order to tell the difference.

This might get tricky if, for example, a DirtyF64 is passed around from one CPU thread to another, which can happen in parallel computations or when using some flavors of coroutines. I’m not sure if Rust has a way to hook on that event, otherwise you may need to resort to ugly tricks like impl !Send for DirtyF64.
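For reference, writing `impl !Send` directly needs the unstable negative-impls feature, but the same effect is available on stable Rust with a marker field. The sketch below is an assumption about how a hypothetical DirtyF64 could opt out of `Send`; it does not read or preserve any status-word bits.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical DirtyF64 that cannot be moved to another thread.
#[derive(Debug, Clone, Copy)]
struct DirtyF64 {
    value: f64,
    // Raw pointers are neither Send nor Sync, so this zero-sized marker
    // removes both auto traits from the struct.
    _not_send: PhantomData<*const ()>,
}

impl DirtyF64 {
    fn new(value: f64) -> DirtyF64 {
        DirtyF64 { value, _not_send: PhantomData }
    }

    fn get(self) -> f64 {
        self.value
    }
}

fn main() {
    let d = DirtyF64::new(1.5);
    assert_eq!(d.get(), 1.5);

    // The marker costs nothing at run time.
    assert_eq!(size_of::<DirtyF64>(), size_of::<f64>());

    // Uncommenting the next line fails to compile, because DirtyF64
    // is not Send:
    // std::thread::spawn(move || d.get());
}
```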


#20

Eyup :smiley: I’ll look into that, but my gut feeling is that this would get awkward pretty quickly.

On a general note: Do we want floats to behave like array indexing (don’t bother the user and blow up at runtime) or should they behave more like an Option (it’s possible to handle failure explicitly if wanted)?

My initial direction is more variant 2. But then it doesn’t make sense to do debug-asserts on every operation.