When I check two floating-point values for equality, I basically use a tolerance instead of the equality operator (==), like this:
let almost_eq = (a - b).abs() < f32::EPSILON;
However, I'm wondering if I should do the same when writing a code example. This is because I feel that assert_eq!(result, <float literal>) is easier to understand when the result is a simple short decimal number such as 1.0 or 1.5.
I'm not even sure what a good code example looks like.
How do you write a code example of a function that returns f32?
/// Performs a linear interpolation between `a` and `b`.
///
/// # Examples
/// ```
/// // Do you write like this?
/// let result = lerp(0.5, 10.0, 20.0);
/// assert!((result - 15.0).abs() < f32::EPSILON);
///
/// // Or like this?
/// let result = lerp(0.5, 10.0, 20.0);
/// assert_eq!(result, 15.0);
///
/// // Or something else?
/// ```
pub fn lerp(t: f32, a: f32, b: f32) -> f32 {
a + (b - a) * t
}
Every time you write a function that returns f32, do you document its accuracy?
So sadly that's too difficult for me. (´;ω;`)
I don't know how accurate a + (b - a) * t is.
I don't even know whether it will be executed in software or on an FPU,
or if it will yield the same result in any environment.
Should programmers who are unsure of the accuracy of their floating-point functions stop writing code examples rather than racking their brains?
Perhaps, for examples that are just conveying the meaning of the arguments, you can stick to simple cases that you know for sure are exact, and thus doing an exact match is fine.
You can easily be confident in all of those, given that 0.25, 0.50, and 0.75 are all exactly representable, as are the integers used, so in pretty much any plausible implementation you're not going to run into accuracy problems for those particular values.
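For instance, a sketch of what those exact-match examples could look like, reusing the lerp from your post (each intermediate result here rounds to the exact mathematical answer):

fn lerp(t: f32, a: f32, b: f32) -> f32 {
    a + (b - a) * t
}

fn main() {
    // 0.25, 0.50, and 0.75 are exact binary fractions, and the endpoints are
    // small integers, so every step of a + (b - a) * t is exact here.
    assert_eq!(lerp(0.25, 10.0, 20.0), 12.5);
    assert_eq!(lerp(0.50, 10.0, 20.0), 15.0);
    assert_eq!(lerp(0.75, 10.0, 20.0), 17.5);
}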
Where lerp gets really tough is in things like lerp(0.5, -10000000.0_f32, 10000001.0).
I think that adding a test documenting the result of lerp(0.5, -10000000.0_f32, 10000001.0) can be insightful: it documents possible issues and draws attention to the fact that floating-point inaccuracies are something to be aware of when using this function. But if you put an exact value inside the doctest, it may fail in some environments.
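One sketch of how that could look, assuming the lerp from the original post is in scope: scale the tolerance to the magnitude of the inputs, so the assertion documents the issue without pinning down an implementation-specific result.

let result = lerp(0.5, -10000000.0_f32, 10000001.0);
// The mathematically exact answer is 0.5, but `b - a` (20000001.0) is not
// representable in f32, so the computed value depends on how the intermediate
// results round and on whether a fused multiply-add is used; it may well not
// be exactly 0.5.
let tolerance = 20000001.0 * f32::EPSILON; // a few ULPs at the inputs' scale
assert!((result - 0.5).abs() <= tolerance);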
Given that we are using floats according to the same standard (IEEE 754) no matter what target the code is compiled for, any floating-point calculation will have some specific value, even if it is not exactly what one might expect. Therefore it seems to me that it is quite OK to use == in asserts in test code. After all, the same test should always produce the same result.
Or am I missing a point here?
(Edit: This may not work if the calculation is split over multiple threads as the order of operations may get changed. But I would argue one is then not doing the same calculation every time).
I get your point. That is why I said "to the same standard (IEEE 754)". As far as I know, if some compiler on some machine produces a different result and my test fails, there is a bug, but it is not a bug in my test, as long as my test conforms to IEEE 754.
All computers discussed there conform to IEEE 754.
Well, you may call it a “bug in your head” or, maybe, “a misunderstanding”.
Well, if your test requires bit-to-bit identical output then it doesn't conform to IEEE 754, because if you open said standard you would find various functions like sum: “sum(p, n) is an implementation-defined approximation to ∑(i = 1, n) pᵢ, where p is a vector of length n” (emphasis mine).
Sure, the standard places some limitations on how good that approximation is, but it's still implementation-defined, and that's written right in the text of said standard.
If your test uses these functions and expects them to produce the same results on all supported hardware then it would repeat the Space Cadet's story.
To make sure your test is an IEEE 754 conforming one, you need to add some tolerance to it, and these tolerances depend on what exactly you are calculating. In the SPEC CPU benchmarks they are specified on a per-benchmark basis.
I know because someone tried to use these benchmarks as if their output were 100% predictable and, of course, when someone added RISC-V to the CI they started failing. I had to look at the original harness and transfer the tolerances to our CI pipeline.
As a project manager said to our team back in the early 1980s:
If you need to use floating point to solve the problem you don't understand the problem. If the problem needs floating point to solve then you have a problem you don't understand.
The whole point of a unit test is to see if the output is what you expect (and to document what you expect).
What you're describing is a regression test. The downside of making a bit-exact regression test is that it will fail every time you change the algorithm (or even when you update the standard library or a dependency), which sort of defeats the purpose of having a test.
Actually, I'm losing the thread here. IEEE 754 represents numbers as a 1-bit sign, an 8-bit exponent, and a 23-bit mantissa (for f32). On reading that, I would naively assume that performing addition, subtraction, etc. on that, as described by the standard, had to always produce the same results. If not, there is something wrong.
Now, if I recall correctly, the Intel FPUs from the 387 chip up did something a bit different: their floating-point registers maintained more bits than the standard describes. Which meant that during long calculations the intermediate results maintained higher resolution, and the final result could then be different.
Is it so that the IEEE 754 standard accommodates this kind of slop from Intel and no doubt other implementations?
While exact comparison of computed floats is, in fact, generally a poor idea, it's fine for examples, especially if the implementation only relies on primitive floating-point operations (i.e. those guaranteed to be within ±½ ULP, e.g. add/sub, mul/div, and sign manipulation). If you know the error bounds it's good to provide them, but this often isn't super practical.
Keep in mind, the examples' primary purpose is to demonstrate a believable but minimal usage of the function. So long as you accomplish that without being actively misleading, it's a decent example.
If you really want to avoid implicitly endorsing exact float comparisons, you can use something like an approx_eq! macro to show and test the result without explicitly documenting a specific tolerance.
A lerp is just an unfortunately complicated operation for IEEE floating point. Both of the standard formulas break "obvious" properties in different ways, and most implementations want to be able to use a hardware fused multiply-add when available for speed, despite that changing the computation's accuracy. If you actually write an implementation which is all three of bounded (lerp(a, b, n ∈ 0..=1) ∈ a..=b ∀ a < b), exact (lerp(a, b, 0) == a and lerp(a, b, 1) == b), and monotonic (lerp(a, b, n) < lerp(a, b, m) == n < m ∀ a < b), it'll be too slow for anyone to actually use.
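For reference, here is a sketch of the two textbook formulations referred to above; the function names are made up for illustration, and neither is being endorsed as the right choice:

/// `a + (b - a) * t`: monotonic in `t`, but not guaranteed to return exactly
/// `b` at `t == 1.0`, because `a + (b - a)` can round to something other than `b`.
fn lerp_delta(t: f32, a: f32, b: f32) -> f32 {
    a + (b - a) * t
}

/// `(1 - t) * a + t * b`: exact at both endpoints, but not guaranteed to be
/// monotonic in between, because the two products round independently.
fn lerp_weighted(t: f32, a: f32, b: f32) -> f32 {
    (1.0 - t) * a + t * b
}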
It doesn't for the most primitive operations, but people have been burnt by -ffast-math style flags which do. And any realistically interesting computation will generally end up doing an operation beyond the primitive guaranteed-correctly-rounded ones.
Do you mean allowing the use of third party macros in code examples? Or do you mean allowing the use of macros that are not actually defined in code examples for illustrative purposes?
// `assert_almost_eq!` is used in the following example,
// but its definition is not provided.
/// # Examples
/// ```
/// let result = lerp(0.5, 10.0, 20.0);
/// assert_almost_eq!(result, 15.0);
/// ```
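If the concern is just that the doctest wouldn't compile, one option (a sketch, not necessarily the best style) is to define or import the helper on hidden lines: rustdoc strips lines starting with `# ` from the rendered example but still compiles and runs them.

/// # Examples
/// ```
/// # macro_rules! assert_almost_eq {
/// #     ($a:expr, $b:expr) => { assert!(($a - $b).abs() < f32::EPSILON) };
/// # }
/// let result = lerp(0.5, 10.0, 20.0);
/// assert_almost_eq!(result, 15.0);
/// ```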
But which result? Let's start with the simplest possible numbers and addition:
+0.0 + +0.0 = ?
+0.0 + -0.0 = ?
-0.0 + +0.0 = ?
-0.0 + -0.0 = ?
The difference between #2 and #3 is immediately interesting, because most “normal” people naïvely expect that A + B = B + A would hold, and they also, for some reason, want that pesky rule they learned in school, -(A + B) = (-A) + (-B), to hold, too!
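If you want to see those answers on real hardware, here is a tiny check; under the default round-to-nearest mode all four results are fully specified by IEEE 754, and the sign is observable even though == treats the two zeros as equal.

fn main() {
    assert!((0.0f32 + 0.0).is_sign_positive());   // #1: +0.0
    assert!((0.0f32 + -0.0).is_sign_positive());  // #2: +0.0
    assert!((-0.0f32 + 0.0).is_sign_positive());  // #3: +0.0
    assert!((-0.0f32 + -0.0).is_sign_negative()); // #4: -0.0
    // `==` can't tell the difference: -0.0 and +0.0 compare equal.
    assert_eq!(-0.0f32, 0.0f32);
}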
Ultimately this all boils down to the simple fact that one couldn't fit the ℝ field into a finite computer (while keeping all the properties that ℝ holds).
Integers are different. Sure, they try, unsuccessfully, to emulate the ℤ ring, but they also 100% conform to the rules of the ℤ₂³² ring (integers mod 2³²), which gives us solid math to use as a base for our reasoning.
But floats emulate the ℝ field very imprecisely, and, more importantly, they don't correspond to any math entity with well-defined rules. Instead, IEEE 754 defines many rules that an implementer can pick from.
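A minimal illustration of that contrast; the constants are only chosen to force a rounding loss:

fn main() {
    // f32 addition is not associative: the order in which rounding happens matters.
    let x = (1.0e30_f32 + -1.0e30) + 1.0; // 0.0 + 1.0 == 1.0
    let y = 1.0e30_f32 + (-1.0e30 + 1.0); // 1e30 + -1e30 == 0.0 (the 1.0 was rounded away)
    assert_ne!(x, y);

    // u32 wrapping arithmetic follows the ℤ₂³² ring exactly, so associativity
    // (and the other ring laws) always holds.
    let (a, b, c) = (u32::MAX, 5_u32, 7_u32);
    assert_eq!(a.wrapping_add(b).wrapping_add(c), a.wrapping_add(b.wrapping_add(c)));
}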