Why `sort()` need `T` to be `Ord`?

Posting here instead of the other thread I forked, because this is related to the original question: "Why sort() need T to be Ord?"

@kornel, how about something like:

trait PartialSort {
    fn partial_sort(&mut self);
}

impl<T: std::cmp::PartialOrd> PartialSort for [T] {
    fn partial_sort(&mut self) {
        todo!()
    }
}

(Playground)

This could be easily implemented in a crate, right?

That could work.

Rust also has a "hack" of implementing regular inherent min/max methods directly on floats, which syntactically mimic what Ord has, so number.min(number) works for both floats and integers.

How about doing the same hack for sort functions?

impl [f32] { // is that a real syntax? :)
  fn sort_unstable(&mut self) {…}
}

"Partial sort" is unfortunately the name of a completely different algorithm that is not directly related to partial ordering. So the name is a no-go

1 Like

We could name it .sloppy_sort ^^

1 Like

But note that number.minimum(other) was also just added (in nightly), for the other "minimum" defined in IEEE 754.

I'm not really a fan, because you end up with the whole "well, where do the NANs go?" problem when defining said function. For example, do they go at the beginning or the end? And does the NAN payload matter?

I think having people do .sort_by(f32::total_cmp) would be fine (once we eventually stabilize that, or something like it).

And note that it puts NANs at both ends, depending on their sign bits.

1 Like

So, what does it mean to sort something by PartialOrd? It seems like a reasonable expectation that for any two indices i < j, it holds that v[i].partial_cmp(v[j]) != Some(Greater). For the general (worst) case, consider sorting a haystack with two needles and a million pieces of hay in it:

  • NeedleA < NeedleB
  • PieceOfHay(k) not comparable to anything but itself
  • Everything equal to itself

The needles have to end up correctly ordered, but the only way to get any useful information (including recognizing if it's a needle or not) is to compare the two needles to each other. Therefore we have to make O(n²) comparisons!

What might happen in reality is that someone tries to sort a large array that accidentally ended up containing only NaNs. If the algorithm is to be correct and generic for T: PartialOrd, it may effectively hang the program at that point. Of course for actually sorting floats, f32::total_cmp (or anything that can e.g. consider all NaNs equal) will have no issues.

1 Like

Couldn't the algorithm just arbitrarily sort PieceOfHay before any other value when applying quicksort to everything (which is O(n * log(n)), if I'm not mistaken?

Ah, PieceOfHay is comparable to itself. That might make things more difficult.

The point is that in a T: PartialOrd generic function the only thing you can do with a T is to try to compare it to another T. I suppose it could try something like examining the bit patterns, and if they match you know that the two things are interchangeable. But even that won't help if they never do match. It could equivalently be the integers 0..n with two randomly chosen as the needles. The algorithm can't know which ones they are until it finds both at the same time and compares them.

Floats are saved by the property that the incomparable elements (NaNs) are not comparable to anything, but that's not required by PartialOrd.

1 Like

Note that when I hear "partial sort" I think std::partial_sort - cppreference.com, which is different from "allows partial ord".

1 Like

I looked up this kind of question earlier, this page contains interesting links

the first answer cites a paper that contains algorithms that are more efficient depending on a parameter "w" which is (AFAIR) something like the largest size of a set of mutually uncomparable items.

On the other hand, if very few items are comparable at all (and you somehow know beforehand which are) then topological sort might be efficient; otherwise a O(n²) time might be correct for the best possible approach, in a fully general setting.

2 Likes

It most certainly is, since if all comparisons you've done so far were not comparable and there's any pair you haven't compared directly, that pair could go either way.

Thanks, I clarified my wording.

1 Like

Topological sort was my first thought. A poset is a DAG. (And a DAG can be associated with a poset.)

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.