Comparing floats for equality

I understand why Rust doesn't let you use == with floating point numbers. Comparing the results of two floating point computations for equality is foolish.

BUT!

I've got some code that's working with data coming in over a network channel, and I want to know whether the f64 I got this time is the same or different than the f64 I got last time. Did I get the same bit pattern, or not? But by the time I get it, it's already been presented to me as an f64.

I'm currently checking (val1 - val2).abs() < f64::EPSILON, which really puts it on entirely the wrong footing.
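
To make the "wrong footing" concrete, here is a minimal sketch of that epsilon check. The function name `epsilon_same` is made up for illustration; only `f64::EPSILON` and `abs` come from the standard library.

```rust
/// The epsilon check from the question, wrapped in a (hypothetical) helper.
/// f64::EPSILON is the gap between 1.0 and the next larger f64, so it is
/// only a meaningful tolerance for values close to 1.0 in magnitude.
fn epsilon_same(a: f64, b: f64) -> bool {
    (a - b).abs() < f64::EPSILON
}
```

With this helper, `epsilon_same(1.0e-20, 2.0e-20)` reports "same" even though one value is twice the other, while `epsilon_same(1.0e16, 1.0e16 + 2.0)` reports "different" for two adjacent representable values. Neither answer says anything about whether the bits changed.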

How do I easily check whether the bits are the same?

val1.to_ne_bytes() == val2.to_ne_bytes()

or

val1.to_bits() == val2.to_bits()
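
Wrapped up as functions, the two options might look like the sketch below. The names `same_bits` and `same_bytes` are placeholders; `to_bits` and `to_ne_bytes` are the standard `f64` methods.

```rust
/// True when `a` and `b` have exactly the same 64-bit pattern.
fn same_bits(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}

/// The same check, done on the native-endian byte representation.
fn same_bytes(a: f64, b: f64) -> bool {
    a.to_ne_bytes() == b.to_ne_bytes()
}
```

Both should give the same answer for any pair of values. Note that a bitwise check treats 0.0 and -0.0 as different, because their sign bits differ.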

Excellent! Thank you!

So you are suspecting corruption of your data due to network noise/errors or whatever.

So riddle me this:

If the transmitting end intentionally sends different values of that f64, but the second one happens to get corrupted to be the same as the first one, how will you know a network error occurred?

Really, I think your detection of network errors should be handled by other means. Checksums, error correcting codes, etc. That f64 should just be a bunch of bits as far as the network is concerned.

Or am I assuming your intention wrongly?

That's exactly what @jplwill asked about: how to compare two f64 bit-for-bit. And that's what val1.to_bits() == val2.to_bits() does.

By definition, if the bits sent and received are the same, there is no data corruption. (ignore this, I was misreading)

It does allow it.

In a communication, over whatever medium, the receiving end cannot know if the bits it gets are the same as the transmitting end sent.

Edit: Unless the receiver and transmitter are the same device shouting down a loopback. But then, why would we be doing that?

One can become more confident that the correct thing was received by adding checksums, error correcting codes, sequence numbers, ever more complex protocols with retries and so on.

In short, on receiving a bunch of bits as an f64, one cannot tell if that is what was sent or not, or if it is supposed to be the same or different value as sent last time.

Not necessarily. For all we know, the check could be to determine whether or not to re-run some calculation that uses this f64 as input.

Yeah; I'm not trying to check for errors at all. I'm monitoring a system, from which records of data come in every few seconds; by the time I get them, they've already been broken out into individual values. What I want to do is check whether I've received new data or whether it's just the same old data, i.e., which fields have changed.
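
A rough sketch of that kind of change detection, assuming a made-up `Record` type. The field names here are invented for illustration; the real records aren't described in this thread.

```rust
/// Hypothetical record; the real field names are not given in the thread.
#[derive(Clone, Copy, Debug)]
struct Record {
    temperature: f64,
    pressure: f64,
    voltage: f64,
}

/// Returns the names of the fields whose bit patterns differ between
/// the previous record and the current one.
fn changed_fields(prev: &Record, curr: &Record) -> Vec<&'static str> {
    let pairs = [
        ("temperature", prev.temperature, curr.temperature),
        ("pressure", prev.pressure, curr.pressure),
        ("voltage", prev.voltage, curr.voltage),
    ];
    pairs
        .iter()
        .filter(|(_, old, new)| old.to_bits() != new.to_bits())
        .map(|(name, _, _)| *name)
        .collect()
}
```

An empty result means nothing changed since the last record.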

D'oh. I jumped from PartialEq isn't implemented to "can't use == with floats". My bad.

If your data might include NaN values, then you may need special code to compare them. Otherwise, you can use ==.
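
For what "special code" could mean here, one possible sketch is below. The helper name `eq_or_both_nan` is invented; the behaviour of `==` on NaN and on signed zeros is standard IEEE 754 and is what Rust's `PartialEq` for `f64` implements.

```rust
/// Hypothetical helper: ordinary `==`, except that two NaNs count as equal.
/// Plain `==` returns false whenever either side is NaN, and returns true
/// for 0.0 == -0.0 even though their bit patterns differ.
fn eq_or_both_nan(a: f64, b: f64) -> bool {
    a == b || (a.is_nan() && b.is_nan())
}
```

Whether a NaN followed by another NaN should count as "unchanged" depends on the application, so treat this as one option rather than the answer.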

That would be correct, but PartialEq is implemented. (Eq is not.)

I see. Sounds reasonable.

A potential problem with that plan is that the same floating point calculation on the same input values can produce different results, for example if the calculation is performed by parallel threads that carry out the operations in a different order depending on thread timing.

This is likely not an issue for you but personally I would feel happier adding some meta-data to your messages to indicate that it's a new calculation and/or the input data the calculation is based on has changed. A simple message sequence number would do.
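
As a sketch of the sequence-number idea, assuming a hypothetical `Message` type in which the sender bumps `seq` for every new record (nothing in this thread says the real protocol has such a field):

```rust
/// Hypothetical message layout: `seq` is incremented by the sender
/// each time it produces a new record.
struct Message {
    seq: u64,
    value: f64,
}

/// True if `msg` is a different record from the last one seen.
/// `last_seq` is None before the first message arrives.
fn is_new(last_seq: Option<u64>, msg: &Message) -> bool {
    match last_seq {
        None => true,
        Some(last) => msg.seq != last,
    }
}
```

With this, "is it new data?" no longer depends on the float values at all; comparing `value` only tells you whether the number itself changed.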

This sounds like one of the classical thought experiments around networking; no matter how many times you re-transmit the message you can't be sure that there were no networking errors just by comparing the payloads (e.g. because every message could have been corrupted in the same way).

Do you know if it has a name?

As you've pointed out, the correct way to handle transmission issues is to use checksums/signatures to detect errors and re-transmit or error-correcting codes for automatically resolving transmission errors.
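
To illustrate the checksum part, here is a toy sketch that frames an f64 as 8 bytes plus a 1-byte checksum. The frame layout and the function names are invented for this example, and a wrapping byte sum is far weaker than the CRCs real protocols use.

```rust
/// Toy checksum: wrapping sum of all bytes. Illustration only.
fn checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

/// Encodes a value as 8 little-endian bytes followed by a 1-byte checksum.
fn encode(value: f64) -> [u8; 9] {
    let mut frame = [0u8; 9];
    frame[..8].copy_from_slice(&value.to_le_bytes());
    frame[8] = checksum(&frame[..8]);
    frame
}

/// Decodes a frame, returning None if the checksum does not match.
fn decode(frame: &[u8; 9]) -> Option<f64> {
    if checksum(&frame[..8]) != frame[8] {
        return None;
    }
    let mut payload = [0u8; 8];
    payload.copy_from_slice(&frame[..8]);
    Some(f64::from_le_bytes(payload))
}
```

The float is just eight opaque bytes to this layer. A mismatch can be detected, but a matching checksum still only makes corruption less likely, not impossible.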

It seems related to the Byzantine generals problem, which shows that confirmation messages aren’t good enough to establish a consensus: In any such protocol, the “last” message is a single point of failure.

I think this comes under the notion of a "transaction". See "Atomicity (database systems)" on Wikipedia. As far as I understand, the best one can do is ensure that either the transaction happens correctly or nothing happens at all.

To my mind the Byzantine generals problem is tackling an even bigger problem: how to arrange compute nodes and connections so that the system as a whole is guaranteed to function correctly in the face of one or more errors, where "errors" can include actors that are deliberately trying to confuse the system to make it fail.

If I recall correctly, the result is that a system with N faulty nodes/connections can be made to work only if it has at least 3N + 1 fully connected nodes.

Fun fact: Even Fly By Wire systems, as in the Boeing 777, do not meet the Byzantine General's criteria for fault tolerance.

That sounds to me like a bug. Best to avoid buggy code rather than design around it.

Sorry, I was misreading what you wrote there.

As far as I know it is not a bug. It is a fact of life when working with floating point numbers in threads. Floating point arithmetic is not associative. Changing the order in which operations are grouped can yield different results. The non-deterministic timing of thread scheduling can cause operations to be reordered and hence yield slightly different results.
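
A small single-threaded sketch of the effect: summing the same three values with a different grouping. The function names are made up; the point is only that regrouping, which is what a different thread schedule can amount to in a parallel reduction, may change the last bit of the result.

```rust
/// Sums three values, grouping the first two together.
fn sum_left(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

/// Sums the same three values, grouping the last two together.
fn sum_right(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}
```

For the inputs 0.1, 0.2, 0.3, `sum_left` gives 0.6000000000000001 and `sum_right` gives 0.6, so a bit-for-bit comparison sees them as different. Swapping the two operands of a single addition does not change the result; it is the grouping that matters.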

See for example here: https://blogs.mathworks.com/loren/2009/12/04/comparing-single-threaded-vs-multithreaded-floating-point-calculations/

And see many other discussions on the net re: floats, threads, and results.

See also: "What Every Computer Scientist Should Know About Floating-Point Arithmetic": https://docs.oracle.com/cd/E19957-01/800-7895/800-7895.pdf

Actually there was a long discussion about this here only days ago.

It is not; a friend of mine is doing his PhD in Computer Science in the topic of "reproducible computations". I can assure you it's not as simple as "if your code is correct, you'll get the same result every time".