# Watch out for NaNs

Whoa there.

My high-school maths (sorry, I have forgotten all the years of maths I did after that) makes my mind reel at that suggestion. It's just wrong: square root is not defined for negative numbers, at least not on the real number line. Even my ancient calculator displays "-E-" for it.

Sure, floating-point values are usually approximate. That is in the nature of things, given the limitations of the hardware we have. Even in the world of actual mathematics I have never seen a perfect representation of the real number called π.

Sure, you could say -0.1 is close enough to 0.0 that we can return sqrt(-0.1) as 0.0. But that seems to introduce an inaccuracy far greater than the ones we already have.

Good question. The answer is simple: you can define whatever operations you want; the question is which operations are the most useful.

E.g. if you're simulating a spaceship's orbital parameters, and your formulas call for sqrt(-0.0001) after a bunch of approximations, then 0.00 is likely a better approximation to the answer than 0.01 is, because 0.00 * 0.00 is closer to -0.0001 than 0.01 * 0.01 is.
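The arithmetic behind that claim can be checked directly (a tiny sketch; the numbers are the ones from the example above):

```rust
// After rounding errors, a formula may yield a small negative number where
// the exact value is 0. Clamping to 0 before sqrt gives a candidate that
// squares back closer to the input than any positive guess does.
fn main() {
    let x = -0.0001_f64; // what the formula produced after rounding
    let clamped = 0.00_f64;
    let guess = 0.01_f64;
    // Compare how well each candidate squares back to x.
    let err_clamped = (clamped * clamped - x).abs(); // ~0.0001
    let err_guess = (guess * guess - x).abs();       // ~0.0002
    assert!(err_clamped < err_guess);
}
```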

When somebody says "not defined", they just mean "not yet defined"

But more seriously, I'm not saying we should change the mathematical notion of square root. People seem confused, as if I were trying to change an existing function in all of mathematics. I'm just writing a new function in my new API. After all, f64::sqrt(2.0) also isn't quite the same as the mathematical square root of two; for one thing, one is rational and the other is irrational.

No confusion here. I do get the idea.

I have no qualms about that.

This is correct!

What they most often mean is "it is not useful to define it." You know... because of the thing you just said above.

It is not useful to make an FP alternative which for some reason doesn't include sqrt but does have sqrt_but_wrong_sometimes. You'd probably be the only person on the planet who would ever use it, because everyone else can apparently do the exact same thing with IEEE754 by just calling abs or max on their inputs, or checking for NaNs, all with the added benefit of doing it on hardware...
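For instance, the clamping approach mentioned above can be done in ordinary Rust with stable IEEE 754 operations; `safe_sqrt` is just an illustrative name for this sketch, not a real API:

```rust
// A minimal sketch of the "clamp your inputs" approach with plain IEEE 754:
// if a value can only go negative through rounding error, clamp before sqrt.
fn safe_sqrt(x: f64) -> f64 {
    // f64::max ignores a NaN operand, so NaN inputs also become 0.0 here.
    x.max(0.0).sqrt()
}

fn main() {
    assert_eq!(safe_sqrt(-0.1), 0.0);
    assert_eq!(safe_sqrt(4.0), 2.0);
    assert_eq!(safe_sqrt(f64::NAN), 0.0);
}
```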

So perhaps put aside the rant, and take the time to understand that IEEE 754 was designed for a wide array of use cases, and that it is as popular as it is because of how well it is supported and understood, and because it achieves some modicum of universality. It is not without flaws or compromises, but there are no alternatives that do not also suffer (often severe) flaws and compromises, on top of lacking support and common understanding. It's the best we've got until someone comes up with something so near perfect that it makes sense to overlook all the costs of abandoning IEEE 754.

And if you truly believe you're onto something better, just implement it. Prove everyone wrong with your billion dollar idea. You certainly don't need the support of anyone in this thread to do that.


Seems like what OP actually wants is automatic precision tracking through FP calculations, and for FP functions with limited domains to take that precision into account. Potentially useful? But it raises a lot of questions I wouldn't want to be the one to answer.

The downside I can think of: sqrt(-1) = 0 simply is wrong (and not just a bit wrong, such as when rounding or returning infinity instead of a very huge number).

Of course, rounding errors can add up. But when you set sqrt(-1) = 0, then even a single calculation gives a wrong result (one that is not just off by rounding).

Note that IEEE 754 has deterministic rounding rules. And setting sqrt(-1) = 0 can't be called "rounding" anymore.

That said, I think one can argue that NaNs are a bad idea. I see that there are pros and cons to them. I like them, but I do understand if other people don't. Maybe we wouldn't have ended up with PartialOrd (see also Traits in std::cmp and mathematical terminology) if NaNs didn't exist. I think it's legitimate to defend the hypothesis that they cause more harm than good, even if I feel differently (for now).

As I said, in my case there is no infinity involved either: "Note that both are finite."

P.S.: But maybe you meant that infinity can be returned in some cases (when numbers are a bit off).

Incorrect. Perhaps you meant x.maximum(0.0)? https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=a6ecce04c0dbf8a71aa9ab7a1f86fb96
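The distinction matters because stable `f64::max` ignores a NaN operand, while `f64::maximum` (the IEEE 754-2019 operation, a nightly-only API at the time of writing) propagates it. A quick sketch using only the stable method:

```rust
// `max` returns the non-NaN operand when one argument is NaN, so it silently
// swallows NaNs; `maximum` (nightly) would propagate them instead.
fn main() {
    assert_eq!(f64::NAN.max(0.0), 0.0);   // the NaN is ignored
    assert_eq!((-0.1_f64).max(0.0), 0.0); // clamps negatives as expected
    // On nightly: assert!(f64::NAN.maximum(0.0).is_nan());
}
```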

This is called Interval arithmetic - Wikipedia -- it's handy sometimes, but has its own issues.

We probably wouldn't have, but I wish we actually used it more. I've never heard a good reason that Ok < Err, for example. The codegen for derive(PartialOrd) on enums would be much simpler if it just deemed different variants incomparable.

No. I discussed a NaN-free arithmetic and numerical stability, nothing to do with precision tracking. Nothing I said requires tracking precision.

As @scottmcm said, precision tracking would be interval arithmetic. I think interval arithmetic is a good idea, it's just a separate, different idea.

Notably, I think an equivalent of NaN makes sense in interval-arithmetic, because it's sort of inherently an interval-arithmetic thing! Which is why I think it's not useful if you're not doing interval arithmetic. Also -0.0 makes sense as an interval-arithmetic thing, but not fun if you're not doing interval arithmetic.

NaN sort of means "I have no idea, I lost all precision, it could be anything", which in interval arithmetic would be a full interval containing every number, a very valid interval. -0.0 sort of means "something just below 0", which in interval arithmetic would be a short interval ending at 0.

Yes oops, for some reason I thought .max propagated NaNs. So yes, I meant .maximum (of course it's the same thing in my NaN-free version of arithmetic).

This allows you to:

• sort and de-duplicate Results
• binary search in a list of Results
• store Results in a B-Tree.
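Concretely, the derived ordering compares the variant first (Ok before Err, since Ok is declared first) and the contents second, which is exactly what sorting and searching need. A small sketch:

```rust
// The derived Ord on Result compares the discriminant first (Ok < Err
// because Ok is declared first), then the contained values.
fn main() {
    let mut v: Vec<Result<i32, i32>> = vec![Err(1), Ok(3), Ok(1), Err(0)];
    v.sort();
    assert_eq!(v, vec![Ok(1), Ok(3), Err(0), Err(1)]);
    // Every Ok sorts before every Err, regardless of contents:
    assert!(Ok::<i32, i32>(100) < Err::<i32, i32>(0));
}
```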

If you only care about comparing Oks with Oks, and Errs with Errs, you probably don't even need PartialOrd, you would more likely compare the contents after unwrapping both values.

I don't know what this means. You can consider my opening message somewhat rantish, but after that I've been only responding to feedback rather than ranting.

So you basically say: forget your idea. Just stick to IEEE-754 because it's better.

That's a fine approach, but it doesn't really help with my design goal of having NaN-free arithmetic that doesn't unnecessarily blow up in the presence of rounding errors, because both of these are false in IEEE-754.

That's actually sort of what I've already done / was doing.

I don’t think this is quite right. Instead, I think the correct interpretation is something like “I performed an operation that isn’t closed over the real numbers, and the result was not a member of ℝ.”


In a way it's both. If the NaN is the result of doing sqrt(-1) then you're right. If it's the result of 0.0 / 0.0, then I think my interpretation makes sense: in interval arithmetic, if you had two intervals containing zeroes and divided them, you would get an interval containing every possible number, and then this would propagate through most subsequent operations, which looks a lot like a NaN.
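Both cases can be observed in ordinary IEEE 754 arithmetic; the result is the same NaN even though the interpretations differ:

```rust
// In IEEE 754 both operations produce NaN, though for different "reasons":
// sqrt(-1.0) leaves the reals, while 0.0 / 0.0 is indeterminate.
fn main() {
    assert!((0.0_f64 / 0.0).is_nan());
    assert!((-1.0_f64).sqrt().is_nan());
    // Either NaN then propagates through most subsequent arithmetic:
    assert!((0.0_f64 / 0.0 + 1.0).is_nan());
}
```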


As I said, I think it's legitimate to hypothesize about different floating-point arithmetic, also in the context of Rust. A lot of languages come with their own numeric types that aren't tied to what the processor does. Consider Python:

```
>>> 2**1000
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376
```

I don't want to promote introducing any of that in Rust, changing behavior of f64, or anything like that, but I find it interesting to think about alternative rules (as it also helps understanding the current rules better).

(And I don't consider the original post to be a rant.)

I still like NaNs, but I'd like to understand the proposal better.

@tczajka, basically you propose that, in your model, any result is legitimate as long as it would be contained in the corresponding interval under interval arithmetic? I'm not familiar with the latter. Why would [0.0, 0.0] / [0.0, 0.0] be [−∞, +∞]? Because [−∞, +∞] * [0.0, 0.0] = [0.0, 0.0], I guess. But no, correct would be [−huge, +huge] * [0.0, 0.0] with |huge| < ∞. Or am I wrong? (Really not sure here.)

Thus (within the model of interval arithmetic, if I understand it right), I'd say the following doesn't make sense:

Instead, 0.0 / 0.0 could give 0.0, because 0.0 is one possible solution to the equation x × 0.0 = 0.0. This would also be average (and median) of all possible solutions when using finite floating-point representation.

But how about 1.0 / 0.0? Here −∞ and +∞ would be possible solutions, so an interval containing all solutions would be [−∞, +∞]. But finite numbers are no solution. So not sure what to do here.

Given these considerations, how would you answer the following questions (as of now):

• Can 0.0 / 0.0 be infinite? Or wouldn't 0.0 / 0.0 = 0.0 be a better choice?
• Should −0.0 and +0.0 be distinguished?
• Should −∞ and +∞ be distinguished, and if we don't distinguish them, how do they compare to other numbers?

Anyway, setting sqrt(−1) = 0 doesn't make sense in any case, I think, as 0² is not −1. I'd rather say sqrt(−1) = −∞ (because d(sqrt(x))/dx → ∞ as x → 0).

I would propose (for this hypothetical arithmetic):

• 0.0 / 0.0 = 0.0
• 1.0 / +0.0 = +∞
• 1.0 / −0.0 = −∞
• sqrt(∞) = ln(∞) = ∞
• ln(1) = 0
• sqrt(0) = 0
• ln(0) = −∞
• sqrt(−1) = ln(−1) = −∞
• sqrt(−∞) = ln(−∞) = −∞
• sin(−∞) = sin(+∞) = 0.0
• cos(−∞) = cos(+∞) = 0.0
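To make the proposal concrete, a few of these rules could be written as plain functions. `hyp_div` and `hyp_sqrt` below are hypothetical names for this sketch, not real APIs, and the rules are the ones proposed in the list above (note that sqrt(−1) = −∞ here deliberately differs from the OP's sqrt(−1) = 0):

```rust
// Sketch of the proposed NaN-free rules: 0/0 = 0, x/±0 = ±inf depending on
// signs, and sqrt of a negative number is -inf. No NaN ever escapes.
fn hyp_div(a: f64, b: f64) -> f64 {
    if b == 0.0 {
        if a == 0.0 {
            0.0 // proposed: 0.0 / 0.0 = 0.0
        } else if (a > 0.0) == b.is_sign_positive() {
            f64::INFINITY // e.g. 1.0 / +0.0 = +inf
        } else {
            f64::NEG_INFINITY // e.g. 1.0 / -0.0 = -inf
        }
    } else {
        a / b
    }
}

fn hyp_sqrt(x: f64) -> f64 {
    // proposed: sqrt of any negative number (including -inf) is -inf
    if x < 0.0 { f64::NEG_INFINITY } else { x.sqrt() }
}

fn main() {
    assert_eq!(hyp_div(0.0, 0.0), 0.0);
    assert_eq!(hyp_div(1.0, 0.0), f64::INFINITY);
    assert_eq!(hyp_div(1.0, -0.0), f64::NEG_INFINITY);
    assert_eq!(hyp_sqrt(-1.0), f64::NEG_INFINITY);
}
```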

No, the model I discussed in the OP has nothing to do with interval arithmetic. Every operation would have one unambiguous answer.

I was talking about interval arithmetic there. In interval arithmetic, something like 1 / (−ε, 0) is (−∞, −huge), which is why I said things like 1/−0 = −∞ in IEEE try to emulate interval arithmetic in certain special cases like −0. And similarly, in interval arithmetic (−ε, 0) / (−ε, 0) = (0, ∞), because you can get any (positive) answer from these ranges: you can get 0.01, 42.7, 170.23, etc. So for −0 / −0 IEEE says "we can't possibly decide which, so NaN".
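The 1 / (−ε, 0) example can be sketched with a toy interval type (a sketch only: it ignores rounding direction and zero-straddling divisors, which real interval libraries must handle, and the `Interval` type is made up for illustration):

```rust
// A toy interval type to illustrate the claim: dividing 1 by an interval
// hugging zero from below blows up toward -infinity.
#[derive(Debug, PartialEq)]
struct Interval {
    lo: f64,
    hi: f64,
}

// Division assuming the divisor interval is strictly negative, with -0.0
// standing in for the open endpoint at 0, as in the 1 / (-eps, 0) example.
fn div_by_negative(x: f64, iv: &Interval) -> Interval {
    Interval { lo: x / iv.hi, hi: x / iv.lo }
}

fn main() {
    let eps = 1e-300;
    let near_zero = Interval { lo: -eps, hi: -0.0 };
    let result = div_by_negative(1.0, &near_zero);
    assert_eq!(result.lo, f64::NEG_INFINITY); // 1 / -0.0 = -inf
    assert!(result.hi < -9.0e299);            // 1 / -1e-300, the "-huge" end
}
```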

It was just an analogy to interval arithmetic. But in my ideal system there is no -0, there is no NaN, and there is no such analogy to the interval arithmetic.

NaN² is also not -1.

You can rename my sqrt function sqrt_re, the real part of the square root. I called it "sqrt" because it's what you would use in every use case where you actually want to take a real square root of a nonnegative number. It's not identical to the complex square root or whatever, and doesn't have to be.

(−∞)² is also not -1.

This doesn't work for my purposes, because it makes sqrt discontinuous at 0, thus making the function numerically unstable when you take the square root of (an approximation of) a tiny positive number or 0.

I think I follow (kinda) (maybe), though I find it difficult to operate with infinity as a value unless the calculation rules are defined beforehand (which is a bit difficult, as we argue about how to define calculation rules).

Yeah, −0.0 is somewhat odd. But then I guess 1.0 / 0.0 is… ∞? Or +∞, which is distinct from −∞? I guess you could say:

• 1.0 / 0.0 = +∞
• −1.0 / 0.0 = −∞

But that's a somewhat arbitrary choice. And setting −∞ = +∞ = ∞ would make comparisons very difficult again. That's why I'm thinking that distinguishing between −0.0 and +0.0 is somewhat justified (besides the practical reason that 0.0 does have a sign bit in the binary representation).

Well, sometimes it (kinda) is (Playground)

Yes I see. But isn't 1/x also numerically unstable at around zero? I agree, however, that this isn't the same because there's a pole/singularity involved here.

Maybe:

• sin(−∞) = −0.0
• sin(+∞) = +0.0
• cos(−∞) = +0.0
• cos(+∞) = +0.0

(due to symmetry)

Yes it is, but I don't have a solution to that one.

It's a well known issue that books on numerical analysis deal with in detail. Basically you have to rewrite your algorithm to avoid doing this if you want it to be numerically stable, and it's not always easy, but sometimes possible.

Whereas for sqrt, asin, acos, atanh there is an easy solution, it's an artificial problem caused by the choices made by IEEE-754.

What should ln(−1) be? ln(−1) = −∞?

Agreed that the distinction between −∞ and ∞ is somewhat debatable, but assuming we still have both, then yes, −∞.

I found this Rust Playground rather surprising:

sqrt(-0) = -0

It may be correct according to IEEE (?), but I find the negative sign in the answer very surprising. If -0 is taken to mean -ε (for some ε too small for the given precision), then the "correct" answer is i√ε, or NaN, or something that is OK to round to (positive) 0, but certainly not -√ε.
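The behavior is easy to confirm on stable Rust; IEEE 754 does specify squareRoot(−0) = −0, and `f64::sqrt` follows it:

```rust
// sqrt(-0.0) returns -0.0: equal to 0.0 under IEEE comparison, but with the
// sign bit still set.
fn main() {
    let r = (-0.0_f64).sqrt();
    assert_eq!(r, 0.0);            // -0.0 == 0.0 under IEEE comparison
    assert!(r.is_sign_negative()); // but the sign bit is set
}
```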


Indeed surprising. But I think there are not many options. Making it return NaN would be difficult due to things like sqrt(-(1.0-1.0)), which should be == 0.0 (and it is (Playground)). But the sign is a bit weird indeed.


Interestingly, (-0.0).powf(0.5) is 0.0: playground.

Relevant (partial) explanation: stackoverflow.
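For comparison, IEEE 754 (and C's pow, which Rust's `powf` typically defers to) defines pow(±0, y) = +0 for finite positive y that is not an odd integer, so here the sign bit comes out cleared, unlike with sqrt:

```rust
// powf clears the sign bit of a -0.0 base for exponents like 0.5, whereas
// sqrt preserves it.
fn main() {
    let r = (-0.0_f64).powf(0.5);
    assert_eq!(r, 0.0);
    assert!(r.is_sign_positive());                 // +0.0, not -0.0
    assert!((-0.0_f64).sqrt().is_sign_negative()); // sqrt keeps the sign
}
```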
