How do you write a code example for a function that returns f32?

When I check equality of two floating-point values, I basically use a tolerance value instead of the equality operator (==), like this:
let almost_eq = (a - b).abs() < f32::EPSILON;

However, I'm wondering if I should do the same when writing a code example. This is because I feel that assert_eq!(result, <float literal>) is easier to understand when the result is a simple short decimal number such as 1.0 or 1.5.

I'm not even sure what a good code example looks like.
How do you write a code example of a function that returns f32?


/// Performs a linear interpolation between `a` and `b`.
///
/// # Examples
/// ```
/// // Do you write like this?
/// let result = lerp(0.5, 10.0, 20.0);
/// assert!((result - 15.0).abs() < f32::EPSILON);
///
/// // Or like this?
/// let result = lerp(0.5, 10.0, 20.0);
/// assert_eq!(result, 15.0);
///
/// // Or something else?
/// ```
pub fn lerp(t: f32, a: f32, b: f32) -> f32 {
    a + (b - a) * t
}

The docs for std seem to use a comparison with epsilon a lot (see the `f32` docs).


The way the docs use epsilon is bad. That's not what it's for, especially in the examples where it's used as a "close to zero" check.

How much are you willing to promise from your lerp implementation? Is it always accurate to ½ ULP? If so, then you can do an exact match.

Be warned, though, there's surprising nuance to writing a good lerp.


Every time you write a function that returns f32, do you document its accuracy?
So sadly that's too difficult for me. (´;ω;`)

I don't know how accurate a + (b - a) * t is.
I don't even know whether it will be executed in software or on an FPU,
or whether it will yield the same result in every environment.

Should programmers who are unsure of the accuracy of their floating-point functions stop writing code examples rather than racking their brains?

Perhaps, for examples that are just conveying the meaning of the arguments, you're fine sticking to simple cases that you know for sure are exact, and thus doing an exact match is fine.

For example, you might have

assert_eq!(lerp(0.25, 1.0, 9.0), 3.0);
assert_eq!(lerp(0.50, 1.0, 9.0), 5.0);
assert_eq!(lerp(0.75, 1.0, 9.0), 7.0);

since you can easily be confident in all of those, given that 0.25, 0.50, and 0.75 are all exactly representable, as are the integers used, and thus in pretty much any plausible implementation you're not going to run into accuracy problems for those particular values.

Where lerp gets really tough is in things like lerp(0.5, -10000000.0_f32, 10000001.0).
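With the naive a + (b - a) * t formula from the original post (an illustration, not a claim about any particular implementation), that case is off by far more than any epsilon, because b - a already rounds in f32:

```rust
// Naive lerp, as in the original post.
fn lerp(t: f32, a: f32, b: f32) -> f32 {
    a + (b - a) * t
}

fn main() {
    // 20000001 needs 25 significand bits, but f32 only has 24,
    // so `b - a` rounds to 20000000.0 before the multiply.
    let result = lerp(0.5, -10000000.0_f32, 10000001.0);
    assert_eq!(result, 0.0); // the mathematically exact answer is 0.5
}
```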


I agree.

I think that adding a test to document the result of lerp(0.5, -10000000.0_f32, 10000001.0) can be insightful as it will document possible issues and bring attention to the fact that floating point inaccuracies are something to be aware of when using this function. But if you put an exact value inside the doctest, it may fail in some environments.


This is equivalent to result == 15.0, because f32::EPSILON is smaller than the spacing between adjacent f32 values at this magnitude.

I would do something like this:

/// assert!((result - 15.0).abs() < 0.01);
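A quick check (my own illustration, not from the thread) of why f32::EPSILON is too tight here: the gap between adjacent f32 values near 15.0 is already several times larger than f32::EPSILON, so the epsilon test can only pass on exact equality:

```rust
fn main() {
    let x = 15.0_f32;
    // Next representable f32 above 15.0, via bit manipulation.
    let next = f32::from_bits(x.to_bits() + 1);
    // The spacing at this magnitude is 2^-20 ≈ 9.5e-7,
    // while f32::EPSILON is 2^-23 ≈ 1.19e-7.
    assert!(next - x > f32::EPSILON);
}
```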

BTW, remember that "short decimal" doesn't mean much in itself, as the classic 0.1 + 0.2 != 0.3 shows.

From the perspective of f32, 0.00390625 is a much simpler number than 0.1.
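Both claims are easy to check (a quick sketch of my own): 0.1 has no exact binary representation, while 0.00390625 is exactly 2⁻⁸:

```rust
fn main() {
    // The classic: none of 0.1, 0.2, 0.3 is exactly representable
    // in binary, and the rounding errors don't cancel.
    assert_ne!(0.1_f64 + 0.2, 0.3);

    // 0.00390625 == 2^-8 is exact in binary, so arithmetic on it can be too.
    assert_eq!(0.00390625_f64 * 256.0, 1.0);
}
```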


You could also consider rounding to a fixed number of decimals, e.g.

use std::f64::consts::PI;

fn round_to_two_decimals(value: f64) -> f64 {
    (value * 100.0).round() / 100.0
}

fn main() {
    let x = round_to_two_decimals(22.0_f64 / 7.0_f64);
    let y = round_to_two_decimals(PI);
    assert_eq!(x, y);
}

This is worse than checking the difference is within bounds.

You could have two values that are within 1 ulp of each other but round differently, regardless of the number of digits you're rounding to.

For instance, if you're rounding to 2 fractional digits, this would fail when comparing 3.1450001 vs 3.1449999.
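That failure mode is easy to reproduce with the round_to_two_decimals helper from above (the concrete values are my own illustration):

```rust
fn round_to_two_decimals(value: f64) -> f64 {
    (value * 100.0).round() / 100.0
}

fn main() {
    // Two values a hair apart land on opposite sides of the
    // rounding boundary at 3.145.
    let a = round_to_two_decimals(3.1450001); // rounds up to 3.15
    let b = round_to_two_decimals(3.1449999); // rounds down to 3.14
    assert_ne!(a, b);
}
```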


Given that we are all using floats according to the same standard (IEEE 754), no matter what target the code is compiled for, any floating-point calculation will have some specific value, even if it is not exactly what one might expect. Therefore it seems to me that it is quite OK to use == in asserts in test code. After all, the same test should always produce the same result.

Or am I missing a point here?

(Edit: This may not work if the calculation is split over multiple threads as the order of operations may get changed. But I would argue one is then not doing the same calculation every time).


You do remember why Space Cadets were removed from Windows, right?

Wow, what a gripping story of something nobody cares about. A true nerd masterpiece.

I get your point. That is why I said "to the same standard (IEEE 754)". As far as I know, if some compiler on some machine produces a different result and my test fails, there is a bug, but it is not a bug in my test, as long as my test conforms to IEEE 754.

All computers discussed there conform to IEEE 754.

Well, you may call it a "bug in your head" or, maybe, "a misunderstanding".

Well, if your test requires bit-to-bit identical output then it doesn't conform to IEEE 754. Because if you open said standard you would find various functions like sum: "sum(p, n) is an implementation-defined approximation to ∑(i = 1, n) pᵢ, where p is a vector of length n" (emphasis mine).

Sure, standard places some limitations on how good that approximation is, but it's still implementation-defined and that's written right in the text of said standard.

If your test uses these functions and expects them to produce the same results on all supported hardware then it would repeat the Space Cadet's story.

To make your test an IEEE 754-conforming one, you need to add some tolerance to it, and these tolerances depend on what exactly you are calculating. In the SPEC CPU benchmarks they are specified on a per-benchmark basis, e.g.

I know because someone tried to use these benchmarks as if their output were 100% predictable and, of course, when someone added RISC-V to the CI they started failing. I had to look at the original harness and transfer the tolerances to our CI pipeline.

OK. In that case I retract my argument. 🙂

As a project manager said to our team back in the early 1980's:

If you need to use floating point to solve the problem, you don't understand the problem. If the problem needs floating point to solve it, then you have a problem you don't understand.


The whole point of a unit test is to see if the output is what you expect (and to document what you expect).

What you're describing is a regression test. The downside of making a bit-exact regression test is that it will fail every time you change the algorithm (or even when you update the standard library or a dependency), which sort of defeats the purpose of having a test.


That can be true when using integers as well.

Actually, I'm losing the thread here. IEEE 754 represents numbers as a 1-bit sign, an 8-bit exponent, and a 23-bit mantissa (for f32). On reading that, I would naively assume that performing addition, subtraction, etc. on that, as described by the standard, had to always produce the same results. If not, there is something wrong.

Now, if I recall correctly, the Intel FPUs from the 387 chip up did something a bit different: their floating-point registers maintained more bits than the standard describes. This meant that during long calculations the intermediate results maintained higher precision, and the final result could then be different.

Is it so that the IEEE 754 standard accommodates this kind of slop from Intel and no doubt other implementations?


While exact comparisons of computed floats are, in fact, generally a poor idea, they're fine for examples, especially if the implementation only relies on primitive floating point operations (i.e. those guaranteed to be within ±½ ULP, e.g. add/sub, mul/div, and sign manipulation). If you know the error bounds it's good to provide them, but this often isn't super practical.

Keep in mind, the examples' primary purpose is to demonstrate a believable but minimal usage of the function. So long as you accomplish that without being actively misleading, it's a decent example.

If you really want to avoid implicitly endorsing exact float comparisons, you can use something like an approx_eq! macro to show and test the result without explicitly documenting a specific tolerance.

A lerp is just an unfortunately complicated operation for IEEE floating point. Both of the standard formulas break "obvious" properties in different ways, and most implementations want to be able to use a hardware fused multiply-add when available for speed, despite that changing the computation's accuracy. If you actually write an implementation which is all three of bounded (lerp(a, b, n ∈ 0..=1) ∈ a..=b, given a < b), exact (lerp(a, b, 0) == a and lerp(a, b, 1) == b), and monotonic ((lerp(a, b, n) < lerp(a, b, m)) == (n < m), given a < b), it'll be too slow for anyone to actually use.
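To make the broken properties concrete, here is a sketch of my own (the two formulas are the commonly cited ones, not anything from this thread) showing how the subtraction form loses endpoint exactness while the blend form keeps it:

```rust
// a + (b - a) * t: the rounding in (b - a) means t = 1 can miss b.
fn lerp_sub(t: f32, a: f32, b: f32) -> f32 {
    a + (b - a) * t
}

// a * (1 - t) + b * t: exact at both endpoints, but commonly noted
// to lose monotonicity for some inputs in between.
fn lerp_blend(t: f32, a: f32, b: f32) -> f32 {
    a * (1.0 - t) + b * t
}

fn main() {
    let (a, b) = (-10000000.0_f32, 10000001.0);
    assert_ne!(lerp_sub(1.0, a, b), b); // not exact: b - a rounded
    assert_eq!(lerp_blend(0.0, a, b), a);
    assert_eq!(lerp_blend(1.0, a, b), b);
}
```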

It doesn't for the most primitive operations, but people have been burnt by -ffast-math style flags which do. And any realistically interesting computation will generally end up doing an operation beyond the primitive guaranteed-correctly-rounded ones.


Thanks a lot for all the advice.

Do you mean allowing the use of third party macros in code examples? Or do you mean allowing the use of macros that are not actually defined in code examples for illustrative purposes?

// `assert_almost_eq!` is used in the following example,
// but its definition is not provided.

/// # Examples
/// ```
/// let result = lerp(0.5, 10.0, 20.0);
/// assert_almost_eq!(result, 15.0);
/// ```
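For what it's worth, rustdoc hides doctest lines that start with `#`, so a helper like this could be defined inside the example without being shown to readers. Here is a standalone sketch (the assert_almost_eq! macro and the tolerance are my own invention, not an existing API):

```rust
// In a doctest, each of these definition lines would be prefixed
// with `# ` so that readers only see the final two lines.
macro_rules! assert_almost_eq {
    ($a:expr, $b:expr, $tol:expr) => {{
        let (a, b) = ($a, $b);
        assert!((a - b).abs() <= $tol, "{} is not approximately {}", a, b);
    }};
}

fn lerp(t: f32, a: f32, b: f32) -> f32 {
    a + (b - a) * t
}

fn main() {
    let result = lerp(0.5, 10.0, 20.0);
    assert_almost_eq!(result, 15.0, 1e-4);
}
```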

But which result? Let's start with simplest possible numbers and addition:

  1. +0.0 + +0.0 = ?
  2. +0.0 + -0.0 = ?
  3. -0.0 + +0.0 = ?
  4. -0.0 + -0.0 = ?

The difference between #2 and #3 is immediately interesting, because most "normal" people naïvely expect that A + B = B + A would hold, and they also, for some reason, want to have that pesky rule they learned in school, -(A + B) = (-A) + (-B), to hold, too!
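Under the default round-to-nearest mode, these answers can be checked directly (my own illustration; the sign bit is inspected via is_sign_negative/is_sign_positive, since == treats the two zeros as equal):

```rust
fn main() {
    assert!((0.0_f32 + 0.0).is_sign_positive());   // #1: +0.0
    assert!((0.0_f32 + -0.0).is_sign_positive());  // #2: +0.0
    assert!((-0.0_f32 + 0.0).is_sign_positive());  // #3: +0.0
    assert!((-0.0_f32 + -0.0).is_sign_negative()); // #4: -0.0

    // The school rule -(A + B) = (-A) + (-B) breaks at the sign-bit
    // level for A = +0.0, B = -0.0:
    assert!((-(0.0_f32 + -0.0)).is_sign_negative()); // -(+0.0) is -0.0
    assert!((-0.0_f32 + 0.0).is_sign_positive());    // but (-A) + (-B) is +0.0
}
```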

Ultimately this all boils down to the simple fact that one couldn't fit the ℝ field into a finite computer (while keeping all the properties that ℝ has).

Integers are different. Sure, they try, unsuccessfully, to emulate the ℤ ring, but they also 100% conform to the rules of the ℤ₂³² ring, which gives us solid math to use as a base for our reasoning.
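In Rust terms (my own illustration), the wrapping operations are exactly that modulo-2³² ring arithmetic, fully defined for every input:

```rust
fn main() {
    // u32 addition modulo 2^32: well-defined, associative, commutative.
    assert_eq!(u32::MAX.wrapping_add(1), 0);
    // 6_000_000_000 mod 2^32 = 6_000_000_000 - 4_294_967_296
    assert_eq!(3_000_000_000_u32.wrapping_add(3_000_000_000), 1_705_032_704);
}
```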

But floats are very imprecisely emulating the ℝ field and, more importantly, they don't correspond to any mathematical entity with well-defined rules. Instead, IEEE 754 defines many rules that an implementer can pick from.