Performance of sorting floating point numbers?


#1

As a long term C++ veteran turned bumbling Rust newbie, I’ve just stumbled upon the (apparently dreaded) question of how to sort a vector of floating point numbers. The standard answer seems to be this:

v.sort_by(|a, b| a.partial_cmp(b).unwrap());

This made me wonder about the performance impact of doing partial_cmp(b).unwrap(). This looks an awful lot more involved than the simple a < b comparison that is used when sorting floating point vectors with C++'s std::sort. Does Rust’s version involve an explicit test for NaN values to figure out the None case? That would sound like a bad thing to do at every comparison, performancewise. Or is there something more clever going on that the Rust compiler can magically optimize away?

I don’t mind the awkward partial_cmp syntax all that much, but I’d hate to pay a performance penalty for something (the occurence of NaNs) that is a very minor issue my usage of floating point data.

(P.S. I do realize the ultimate answer to any perfomance question is “shut up and measure” anyway)


#2

The partial_cmp is done as:

match (self <= other, self >= other) {
    (false, false) => None,
    // rest of the combos return a Some(...)
}

source

So it doesn’t check for NaN explicitly.


#3

What Rust is telling you here is that IEEE-754 was designed by dangerous psychopaths and should not be trusted even on basic things like comparisons. From this point on, how you choose to handle it is up to you. You can, for example…

  • Sanitize your floats at sorting time and panic if a NaN is encountered, as your current code does.
  • Build your own comparison function (or totally ordered wrapper type) that ignores the problem like C++ does. Face the consequences.
  • Use float wrappers that catch NaNs from input and computations (e.g. noisy_float).

#4

Yes, the best kind of awesome psychopaths. Unfortunately I feel a distinct lack of them in my daily life… :wink:


#5

Maybe I should add to my previous reply that the fact that NaNs are a rare occurence is precisely what makes them so dangerous. You will not encounter them in your normal usage scenarios. You will not think about adding them to your test suites. Most of the time, you will be able to do as if they did not exist…

…until that one day, once in a blue moon, possibly in the middle of some very important demo, where your program will mysteriously crash or produce ridiculously wrong output because some operation somewhere boiled down, in IEEE-754’s eyes, to a 0/0, a 0*inf or something similarly mathematically undefined. Or maybe because your program blindly accepted user input, which happened to feature a NaN or cause one in a later computation.

Then you will go “what is this?”, run the program again, and this time it will produce perfectly reasonable output. As if nothing happened. And no amount of debugging will manage to bring the odd NaN back. Until someday, once you are off-guard, it will come to haunt you again.

IMO, this puts NaN-induced bugs alongside off-by-one, integer overflow, and race conditions, in the league of “intermittent gotchas” that most experienced programmers know about, but fall for anyway and in spite of extensive testing due to their rarity. I think it’s nice that Rust forces you to pause and think about this specific one instead of letting you silently introduce it in your program.


#6

Alternatively, check out https://docs.rs/noisy_float/

This way you can ensure ahead of time that there are no NaNs, and be able to use regular Ord and fast < comparison.


#7

Well, it’s always a trade-off that depends on your use case, isn’t it? In my case, I know from experience that NaNs are a very minor issue that only show up once in a blue moon. And when they do, the source of trouble is usually easy to track down. Putting in extra checking code for every comparison is not a particulary worthwile trade-off for me.

And if the stars indeed misalign and some NaN pops up in the middle of that very important demo, then an unwrap-panic won’t save the day either :slight_smile:

If I were that paranoid about NaNs I’d probably rather try the option of directly enabling the trapping of floating point exceptions by setting appropriate FPU flags, although this comes with its own set of headaches in my experience.

Thanks! I will have a look.


#8

There are also panic-less solutions:

arr.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Less));


#9

I disagree. Certainly ieee arithmetic presents pitfalls for people new to numerics, but it also allows people who know the rules to write efficient code that does the right thing so far as is possible even when there are challenges.

Floating point numbers are inherently complicated , and it wasn’t easy to figure out the best behavior for every operation. In cases where the answer is mathematically undefined, you have to either crash or return an answer of some sort. Crashing is a poor idea at the lowest level, and the rustish way would be to return something like a Result for an operation the couple have an undefined result. IEEE 754 arithmetic allows you to choose which of these behaviors you want, with the latter being the default.

Having the FPU handle and propagate NaNs allows you to write beautiful, fast, and correct code that otherwise would be impossible. Take as a simple example a function to do Monte Carlo minimization:

fn minim(f: impl FnOnce(f64) -> f64) -> f64 {
  let mut xmin = std::f64::NAN;
  let mut lowest = std::f64::INFINITY;
  for i in 0..100 {
    let x = ran();
    let fx = f(x);
    if fx < lowest {
      lowest = fx;
      xmin = x;
  }
  xmin
}

This code will return a correct answer (approximate, of course) regardless of whether f sometimes returns NaNs or infinity. If I switched the direction of the comparison, it would not, i.e. ! lowest > fx would give a NaN if we ever saw a NaN. But the code as written will return the lowest non-NaN number seen.

This kind of distinction is confusing and can definitely catch users by surprise, but that is the cost of being able to write correct, concise and fast code.

If anything, I’d say that history shows the the folks in charge of integers are the dangerous psychopaths. It’s not possible to write fast, concise and correct integer code. For that matter, it’s not possible to write fast and correct integer code.

Regarding the original question, the sort_by function seems like a poor fit for efficient sorting of floating point numbers.


#10

You are missing the third alternative, which some IEEE 754 implementations do support, and whose implementation should IMO have been mandated by the standard and made the default error-handling mechanism: trapping to an OS-defined exception handler, and letting the OS and application runtime decide how to best deal with the problem.

After all, what do reasonable CPU architectures do when one of the following happens?

  • Invalid opcode in the instruction stream.
  • Valid opcode with invalid operands (e.g. fetch from a invalid address, integer divide-by-zero)
  • Valid instruction, but in an invalid context (e.g. insufficient privilege level)

In all of these cases, the CPU architecture designers could have decided to halt the system, or to return a nonsensical result which exists entirely outside of the normal data model, has ridiculous mathematical properties, and is processed 100x slower by normal arithmetic instructions, crossing fingers that someone someday will hopefully check for its presence in the output.

Thankfully, however, they chose neither option. Instead, what they did was to think “Okay, we have some operations like memory accesses which are extremely frequent, but can fail occasionally. Having the user check for success on every such instruction would be prohibitely expensive, so we won’t ask them to. Having the user perform such checks “infrequently enough” is a sure-fire path to users not performing the check at all, so we won’t ask them to either. Instead, we will jump to an OS-defined error handler, and let the OS deal with the rest, possibly by back-propagating the exception to the application”.

For infrequent failures, where the performance impact of trapping to the OS is not an issue, there is no denying that this was the correct choice. For sure, I wouldn’t want to live in a world where segfaults during fetches are handled by returning an otherwise impossible integer value which fails every single check of mathematical sanity. Or where program jumps to invalid memory addresses are handled by starting a Tetris easter egg. So why should invalid floating-point computations be any different?

Here’s to wondering about how different a world we would have lived in if FPU exceptions were a mandatory feature, and enabled by default. In such a world, programming languages and operating systems could have evolved in order to provide first-class support for this feature. Safety-conscious languages would trigger compiler warnings or errors when a program manipulates floats without defining a top-level FPU exception handler, and provide runtime support for transparently translating these exceptions into more handly monadic return types like Rust’s Result at a cold point of the code. And overall, floating-point error handling would become a consistent and well-integrated part of the system error handling strategy.

Instead, we got error handling design by commitee: +/-0, nonsensical NaNs, infinities which are not really infinite, and surprise denormals which slow down computations to a crawl, silently lose information, and can turn into infinities at the first inversion. And our floating-point divide by zero is completely inconsistent with our integer divide-by-zero. This is hardly what I would call a reasonable contender :slight_smile:


#11

And this is why I proposed multiple approaches :slight_smile: Rust is just making you aware of the problem here, it does not choose a solution for you, because as you say the solution is bound to be situational.