Why `sort()` need `T` to be `Ord`?

AurevoirXavier · January 5, 2022, 5:27am

pub fn sort(&mut self)
where
    T: Ord,
{
    merge_sort(self, |a, b| a.lt(b));
}

fn lt(&self, other: &Rhs) -> bool {
    matches!(self.partial_cmp(other), Some(Less))
}

It uses lt() inside the merge_sort(). If partial_cmp() returns None then lt() returns Less.
So, it seems like PartialOrd is enough for sort().

RedDocMD · January 5, 2022, 5:34am

From a mathematical POV, you cannot sort a partially ordered set. That's why you should have Ord as the trait bound for sort not PartialOrd. Although I don't know why the code you have shown bothers to use lt. I am assuming the trait bound initially used to be PartialOrd - but then someone refactored it to be Ord and didn't remove lt. For a sensible implementation of Ord based off PartialOrd, a.lt(b) returns the same result as a < b.

AurevoirXavier · January 5, 2022, 5:35am

The code is from std.

Michael-F-Bryan · January 5, 2022, 5:36am

What happens for types that are !Ord like f32? If we relaxed the requirement to only PartialOrd, it'd mean sorting a slice containing a NaN will break the sorting.

Other languages suffer from exactly this problem:

I'm trying to sort an array that sometimes has Infinity or NaN . When I use a standard JavaScript array.sort() it seems to sort until it reaches a NaN and then I get random results after that.
var array =[.02,.2,-.2,Nan,Infinity,20];
Is there a way to still sort this so that the end result is from negative to positive and still have NaN or Infinity at the end.
-.2,.02,.2,20,NaN,Infinity

AurevoirXavier · January 5, 2022, 5:38am

It will return Some(Less) if partial_cmp return None.
So, no NaN in this case.

Michael-F-Bryan · January 5, 2022, 5:42am

And what should sort() do when it asks "is this item before that one?" and gets None instead of a yes/no answer?

Your only options are:

Return a Result to indicate that sorting was unsuccessful
Trigger a panic to indicate the input was malformed
Copy JavaScript and try to keep going, resulting in a slice that isn't sorted even after you call sort()

All three options are equally terrible, so instead of making sort() do the wrong thing out of the box the standard library force developers to handle the fact that their type can't be properly ordered. Hence the T: Ord requirement instead of T: PartialOrd.

In these situations, you can use the slice::sort_by() method to provide your own comparison logic (e.g. my_floats.sort_by(|left, right| left.partial_cmp(right).unwrap_or(Ordering::Less))).

AurevoirXavier · January 5, 2022, 5:51am

impl PartialOrd for S {
	fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
		if self.a == other.a {
			return other.b.partial_cmp(&self.b);
		}
		self.a.partial_cmp(&other.a)
	}
}

impl Ord for S {
	fn cmp(&self, other: &Self) -> Ordering {
		if self.a == other.a {
			return self.b.cmp(&other.b);
		}
		self.a.cmp(&other.a)
	}
}

The partial_cmp(PartialOrd) is the right one that I want. cmp(Ord)'s order is reversed.
But sort works as expected.
This is so confusing. sort didn't depend on Ord's implementation.

the standard library force developers to handle the fact that their type can't be properly ordered

So I don't think this is a good idea.

RedDocMD · January 5, 2022, 5:56am

AurevoirXavier:

impl PartialOrd for S {
	fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
		if self.a == other.a {
			return other.b.partial_cmp(&self.b);
		}
		self.a.partial_cmp(&other.a)
	}
}

impl Ord for S {
	fn cmp(&self, other: &Self) -> Ordering {
		if self.a == other.a {
			return self.b.cmp(&other.b);
		}
		self.a.cmp(&other.a)
	}
}

The partial_cmp(PartialOrd) is the right one that I want. cmp(Ord) 's order is reversed.

This isn't a mathematically sound implementation of Ord given the implementation of PartialOrd. Rust, though, has no way of preventing you from doing it.

Michael-F-Bryan · January 5, 2022, 5:58am

I don't think it's a good idea for your PartialOrd and Ord implementations to behave differently. In particular, you are actually violating Ord's invariants:

Implementations must be consistent with the PartialOrd implementation, and ensure max , min , and clamp are consistent with cmp :

partial_cmp(a, b) == Some(cmp(a, b)) .

max(a, b) == max_by(a, b, cmp) (ensured by the default implementation).

min(a, b) == min_by(a, b, cmp) (ensured by the default implementation).

For a.clamp(min, max) , see the method docs (ensured by the default implementation).

Normally if your type implements both, you'll implement one in terms of the other.

struct Foo { ... }

impl Ord for Foo {
  fn cmp(&self, other: &Self) -> Ordering { ... }
}

impl PartialOrd for Foo {
  fn partial_cmp(&self, other: &Self) -> Option<Ordering> { Some(self.cmp(other)) }
}

If you want your sorting to be reversed, then provide your own comparison which explicitly reverses the ordering - slice.sort_by(|a, b| b.cmp(a)).

H2CO3 · January 5, 2022, 5:58am

Decades of industrial experience and probably hundreds or thousands of CVEs disagree with you.

AurevoirXavier · January 5, 2022, 6:00am

Yes, there is a bug in my Ord's implementation. But sort() works well and I didn't find this bug before.
So, I think sort() require PartialOrd would be better.

RedDocMD · January 5, 2022, 6:04am

Look at what H2C03 said....
Also Michael-F-Bryan gives the technical reason behind it....

AurevoirXavier · January 5, 2022, 6:54am

This one persuades me.
Yep, also some historical reasons. That makes sense.

https://stackoverflow.com/questions/70588237/why-sort-need-t-to-be-ord/70588789#70588789

tczajka · January 5, 2022, 11:15am

That is very surprising. While technically fine because requirements say this is equivalent to using cmp when Ord is implemented, I think it should be nevertheless changed to use Ord::cmp.

jbe · January 5, 2022, 11:22am

Let's look at this from a mathematical p.o.v.

There are partial orders and weak orders (in form of a "strict weak order", or a "non-strict weak order", of which the latter is also called "total preorder", see linked Wikipedia article).

IMHO sorting a slice based on a partial order usually makes no sense because there could be two subsets which each can be sorted, but there is no relation between two elements when each of these elements is in a different of the two subsets.

For example:

USD 50.00, USD 43.22, EUR 1.00, EUR 2.00

We could define a partial order where equal currencies can be compared, while different currencies don't compare. Possible sorting results could be:

USD 43.22, USD 50.00, EUR 1.00, EUR 2.00
EUR 1.00, EUR 2.00, USD 43.22, USD 50.00

I would claim that it is semantically wrong to call this "sorting" the list. I would say you could "partially" sort it (which is something different than an "unstable" sort).

However, if we have a weak order, e.g. ordering rectangles based on their area, then we could easily sort these. But there is no trait for a weak order in Rust!

alice · January 5, 2022, 11:27am

I mean, you totally can do the sort though:

rects.sort_by_key(|rect| rect.area());

tczajka · January 5, 2022, 11:29am

OK I see what the reason for this is:

github.com/rust-lang/rust

Slightly optimize slice::sort

rust-lang:master ←

opened 03:40PM - 04 Feb 17 UTC

ghost

+39 -38

First, get rid of some bound checks. Second, instead of comparing Benchmark: ``` name slice::bench::sort_large_ascending slice::bench::sort_large_big_ascending slice::bench::sort_large_big_descending slice::bench::sort_large_big_random slice::bench::sort_large_descending slice::bench::sort_large_mostly_ascending slice::bench::sort_large_mostly_descending slice::bench::sort_large_random slice::bench::sort_large_random_expensive slice::bench::sort_medium_ascending slice::bench::sort_medium_descending slice::bench::sort_medium_random slice::bench::sort_small_ascending slice::bench::sort_small_big_ascending slice::bench::sort_small_big_descending slice::bench::sort_small_big_random slice::bench::sort_small_descending slice::bench::sort_small_random ``` by ternary …`compare` function, use a binary function testing whether an element is less than some other element. This apparently makes it easier for the compiler to reason about the code. I've noticed the same effect with [pdqsort](https://github.com/stjepang/pdqsort) crate. before ns/iter after ns/iter diff ns/iter diff % 8,969 (8919 MB/s) 7,410 (10796 MB/s) -1,559 -17.38% 355,640 (3599 MB/s) 359,137 (3564 MB/s) 3,497 0.98% 427,112 (2996 MB/s) 424,721 (3013 MB/s) -2,391 -0.56% 2,207,799 (579 MB/s) 2,138,804 (598 MB/s) -68,995 -3.13% 13,694 (5841 MB/s) 13,514 (5919 MB/s) -180 -1.31% 239,697 (333 MB/s) 203,542 (393 MB/s) -36,155 -15.08% 270,102 (296 MB/s) 234,263 (341 MB/s) -35,839 -13.27% 513,406 (155 MB/s) 470,084 (170 MB/s) -43,322 -8.44% 23,650,321 (3 MB/s) 23,675,098 (3 MB/s) 24,777 0.10% 143 (5594 MB/s) 132 (6060 MB/s) -11 -7.69% 197 (4060 MB/s) 188 (4255 MB/s) -9 -4.57% 3,358 (238 MB/s) 3,271 (244 MB/s) -87 -2.59% 32 (2500 MB/s) 32 (2500 MB/s) 0 0.00% 97 (13195 MB/s) 97 (13195 MB/s) 0 0.00% 247 (5182 MB/s) 249 (5140 MB/s) 2 0.81% 502 (2549 MB/s) 498 (2570 MB/s) -4 -0.80% 55 (1454 MB/s) 61 (1311 MB/s) 6 10.91% 358 (223 MB/s) 356 (224 MB/s) -2 -0.56%

It's faster when lt() can do less work than cmp when it's specialized.

I'm not sure why it's actually (very slightly) faster for built-in types, according to that ticket, since the compiler should be able to see the two ways to compare built-in types are equivalent.

jbe · January 5, 2022, 11:30am

Yes, but you can't do this with .sort (which uses Ord) unless you make two rectangles with the same area equal.

Thus there is no trait to represent the weak ordering. It's represented through the closure |rect| rect.area() instead.

P.S.: My screaming smiley was an exaggeration. I guess there isn't really a need to introduce a trait for weak orders, as these can always be achieved easily through a sorting key like you suggested. I just wanted to point out that PartialOrd shouldn't be confused with WeakOrd (which doesn't exist).

louy2 · January 5, 2022, 11:34am

Could you expand on there is no trait for a weak order in Rust? If there are partial orders and those are PartialOrd, and there are weak orders but those are not Ord, then what is Ord?

steffahn · January 5, 2022, 11:38am

IMO, Ord is definitely a trait for weak orders. It's just incorrectly documented (it should say "total preorder", yet it does say "total order" im the docs). Look at Eq which is correctly documented as a trait for equivalence relations, not for equality. The distinction between stable and unstable sorting algorithms only really makes sense for weak orders, too. (The German version of the Wikipedia article also does point that out.)

(The use of "total order" in the docs makes some sense if the equivalence by the Eq implementation is (mis-)interpreted as equality. After all, a total preorder is the same as an actual total order on the equivalence classes.)

Topic		Replies	Views
Why do all sorted Data Structures only allows types that implement `Ord` help	6	1448	May 6, 2022
Traits in `std::cmp` and mathematical terminology help	44	3175	April 7, 2022
Error with simple sort implementation help	12	1633	January 12, 2023
Total order for floats? help	123	4968	December 30, 2023
Performance of sorting floating point numbers?	11	3542	January 12, 2023

Why `sort()` need `T` to be `Ord`?

Related topics