I just updated to 1.81.0 which caused a panic in our large project in a call to Vec::sort
We used to use this convenience thing to be able to sort f32:s easily
#[derive(Copy, Clone, Debug, PartialEq, PartialOrd)]
pub struct SortF32(pub f32);
impl Eq for SortF32 {}
impl Ord for SortF32 {
#[inline(always)]
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
self.0.total_cmp(&other.0)
}
}
/// example use-case, not panicing
let mut a = vec![9.0, 1.0, 3.0, 7.0];
a.sort_by_key(|v| SortF32(*v));
can anyone explain why this would cause a panic according to the new limitations on sort? I thought f32::total_cmp would cover the requirement of total ordering? Maybe the Eq impl is naive? I highly doubt that the vec we sorted even contained any 'strange' floats but has worked up until now
thread '<unnamed>' panicked at library/core/src/slice/sort/shared/smallsort.rs:862:5:
user-provided comparison function does not correctly implement a total order
stack backtrace:
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: core::slice::sort::shared::smallsort::panic_on_ord_violation
3: core::slice::sort::shared::smallsort::small_sort_general_with_scratch
4: core::slice::sort::stable::quicksort::quicksort
5: core::slice::sort::stable::quicksort::quicksort
6: core::slice::sort::stable::quicksort::quicksort
7: core::slice::sort::stable::quicksort::quicksort
8: core::slice::sort::stable::quicksort::quicksort
9: core::slice::sort::stable::quicksort::quicksort
10: core::slice::sort::stable::quicksort::quicksort
11: core::slice::sort::stable::quicksort::quicksort
12: core::slice::sort::stable::quicksort::quicksort
13: core::slice::sort::stable::drift::sort
14: core::slice::sort::stable::driftsort_main
There’s at least the fact that you’re lying to the compiler when you impl Eq for the wrapper. f32 is not Eq and hence the wrapper isn’t either, unless you write a custom PartialEq that actually satisfies the requirements of Eq, as described by its documentation.
I'm a little baffled why the rust docs for Ord don't even address floats. Seems like such a common use-case that there should be some ability to sanely point the user in the right direction.
Ord in std::cmp - Rust The docs for Ord showcase an example of implementing it for a struct but just ignores the fact that floats might exist and might need to be ordered
I hope this doesn't come across as patronizing, because that is really not the intent. However, floating point numbers cannot be compared for equality. A good reference for this is the Goldberg paper: here's one link to it ... not sure it's the best link though What Every Computer Scientist Should Know About Floating-Point Arithmetic.
As for whether or not this should be mentioned in the Rust docs ... that's above my pay grade.
I definately think that given how much effort has been put into the language's PartialEq, Eq, PartialOrd and Ord to handle floats, there should maybe be a docs example showing how one could go about doing it sanely for floats that won't cause a panic on sorting... for example, maybe NaNs are always 'larger' and equal and whatever special cases need to be defined by the user
Just to add context to @jlegare's link; floats are deceptive. They're a many:1 mapping from the reals (an infinite set) to a fixed representation integer (finite set), and not well-behaved numbers, but there are a set of rules about how the mapping behaves, called IEEE-754, which allow you to forget that they're not well-behaved most of the time.
Sorting, unfortunately, is one of the cases where the "not well-behaved" shines through.
I believe you can get the behaviour you want with a more complicated wrapper, where you custom-implement both PartialEq and PartialOrd:
This uses total_cmp for all comparisons (via Ord in all cases), and is thus consistent, even if not always what you expect (for example -0.0 == 0.0, but SortF32(-0.0) != SortF32(0.0)).
Also, FWIW, I'd not use a wrapper, personally. For the cases where I need a sorted list of floats, I'd use v.sort_by(f32::total_cmp) instead of v.sort(), since that leaves floats in their normal weird state most of the time, and means that when I come to look at the caller of sort, I know that it's doing a thing that isn't quite what I'd normally think of as a "sort".
Given that floats are weird, it's nice to have the reminder of the weirdness in front of you.
Edit: And I note that the documentation of sort calls this case out, and even mentions sort_by and f32::total_cmp as a solution.
This is an overstatement. There are plenty of cases where comparing values, that happen to be or include some floating-point numbers, is a reasonable thing to do. These cases are generally unlike numerical calculation, and more like algorithms on more arbitrary values. For example, you might compare a configuration against its previous value to determine if the computations that depend on that configuration need to be recalculated. The possibility that part of that configuration is floating-point does not matter to the correctness of that algorithm.
I disagree. Floats are tricky, but claims like “cannot be compared” make them sound more unusably magical than they are, and lead to people trying to address the problem with rules like “always compare with an epsilon” even when they don't apply to the situation.
Hmmm ... allow me then to rephrase: my original statement was a blanket one, which I generally find easier to make and then allow to refinements later, rather than outlining every corner case upfront. My second statement was not meant to be profound.
@kpreid and everyone: Please accept my apologies for the confusion I have caused in making grand sweeping statements that could lead to misunderstandings on the issues involved.
In such a situation you probably want to refuse NaN entering the system. ordered-float, already mentioned, offers wrappers that can help with this, either rejecting NaNs or counting all NaNs as equal to each other rather than unequal. But it would also be sufficient to just check is_nan() or is_finite() in a relevant constructor/setter function.
This is a well-known BUG. You should either derive both PartialOrd and Ord, or implement them by hand (e.g., impl Ord, and then impl PartialOrd with the Ord implementation.)
In your case, the PartialOrd's implementation comes directly from float's PartialOrd, then you impl Ord by hand, which is illegal and directly panic.
I was trying to point out that even in the seemingly simple case floats still introduce additional complications. Yes of course they can be handled. But even there naive comparisons introduce pitfalls. Basically, for a cache you want Eq, not PartialEq, otherwise you incur spurious, persistent cache misses.