Cannot call Vec::sort_by_key without cloning data?

pretzelhammer · May 29, 2021, 4:59pm

I'd like to sort a Vec<String> in ascending or descending order. Here's how I'd do both in a simple example:

use std::cmp::Ordering;

fn main() {
    let mut strings = vec!["b".into(), "c".into(), "a".into()];
    sort_asc(&mut strings);
    assert_eq!(strings, ["a", "b", "c"]); // passes
    sort_desc(&mut strings);
    assert_eq!(strings, ["c", "b", "a"]); // passes
}


fn sort_asc(strings: &mut Vec<String>) {
    strings.sort();
}

fn sort_desc(strings: &mut Vec<String>) {
    strings.sort_by(|x, y| {
        match x.cmp(y) {
            Ordering::Less => Ordering::Greater,
            Ordering::Greater => Ordering::Less,
            _ => Ordering::Equal,
        }
    })
}

The sort_asc function is concise enough that I can just get rid of it and replace it with vec.sort() wherever I would need it, but the sort_desc is quite a lot to remember. Ideally I'd like to reduce sort_desc down to a 1-liner and get rid of it as well. So here's my 1st attempt at that:

use std::cmp::Reverse;

fn sort_desc_2(strings: &mut Vec<String>) { // works
    strings.sort_by(|x, y| {
        Reverse(x).cmp(&Reverse(y))
    })
}

This is slightly better, but not quite there yet. Here's my 2nd attempt:

use std::cmp::Reverse;

fn sort_desc_3(strings: &mut Vec<String>) {
    strings.sort_by_key(|x| Reverse(x)) // compile error
}

This is ideal... except it doesn't compile. It throws:

error: lifetime may not live long enough
  --> src/main.rs:34:29
   |
34 |     strings.sort_by_key(|x| Reverse(x))
   |                          -- ^^^^^^^^^^ returning this value requires that `'1` must outlive `'2`
   |                          ||
   |                          |return type of closure is Reverse<&'2 String>
   |                          has type `&'1 String`

This error doesn't make any sense. It doesn't make sense why the input and output lifetimes would be different and it doesn't make sense why the sort_by_key method would require that in the first place, and in fact it doesn't, this is just a gotcha of Rust: lifetime elision works differently for closures than for regular functions. Thankfully I am familiar with this gotcha and how to fix it, or so I thought, because I tried this and it also didn't work but now with a different error:

fn sort_desc_by_string_key(s: &String) -> Reverse<&String> {
    Reverse(s)
}

fn sort_desc_4(strings: &mut Vec<String>) {
    strings.sort_by_key(sort_desc_by_string_key) // compile error
}

Now throws:

error[E0308]: mismatched types
  --> src/main.rs:42:13
   |
42 |     strings.sort_by_key(sort_desc_by_string_key)
   |             ^^^^^^^^^^^ one type is more general than the other
   |
   = note: expected associated type `<for<'r> fn(&'r String) -> Reverse<&'r String> {sort_desc_by_string_key} as FnOnce<(&String,)>>::Output`
              found associated type `<for<'r> fn(&'r String) -> Reverse<&'r String> {sort_desc_by_string_key} as FnOnce<(&String,)>>::Output`

This error also makes no sense. It says it expects $TYPE and it found $TYPE so what's the issue? Of course I could dodge all these errors by cloning but it feels super wasteful because it should be totally unnecessary and this shouldn't be so hard!

fn sort_desc_5(strings: &mut Vec<String>) {
    strings.sort_by_key(|x| Reverse(x.clone())) // 🤦
}

Can someone please show me how I can call sort_by_key on a Vec<String> without having to clone the Strings? Thank you.

drewkett · May 29, 2021, 5:23pm

How about

fn sort_desc(strings: &mut Vec<String>) {
    strings.sort_by(|a,b| b.cmp(a))
}

pretzelhammer · May 29, 2021, 5:50pm

Wow, can't believe I missed that. I completely tunnel-visioned on using sort_by_key and Reverse. While this solves my immediate problem of having a 1-liner to sort stuff in descending order I would still like to see a solution to my less-immediate problem of sorting stuff using sort_by_key without having to clone data. For example, let's say I had this:

struct Person {
    name: String,
    // many other fields
}

fn sort_people(mut people: Vec<Person>) {
    people.sort_by_key(|person| &person.name) // compile error
}

Person does not impl Ord but I'd just like to sort people by their name fields without having to clone it. Well, in that case I guess I can just do this then:

fn example(mut people: Vec<Person>) {
    people.sort_by(|p1, p2| p1.name.cmp(&p2.name)) // works
}

I dunno, I guess I'm just frustrated that the sort_by_key method comes with such a random and unexpected gotcha. I'd still like to understand why I'm getting the error that I am when trying to compile the sort_desc_4 example.

alice · May 29, 2021, 6:07pm

pretzelhammer:

Person does not impl Ord but I'd just like to sort people by their name fields without having to clone it. Well, in that case I guess I can just do this then:
fn example(mut people: Vec<Person>) {
    people.sort_by(|p1, p2| p1.name.cmp(&p2.name)) // works
}

That's the only workaround, unfortunately.

The problem is that sort_by_key is defined with the following signature:

pub fn sort_by_key<K, F>(&mut self, f: F) where
    F: for<'a> FnMut(&'a T) -> K,
    K: Ord,

This says, given two fixed types K and T, the type F must implement the following infinite list of traits:

FnMut(&'lifetime1 T) -> K
FnMut(&'lifetime2 T) -> K
FnMut(&'lifetime3 T) -> K
... and so on for every possible lifetime

However, the traits that your |person| &person.name closure implements are:

FnMut(&'lifetime1 Person) -> &'lifetime1 str
FnMut(&'lifetime2 Person) -> &'lifetime2 str
FnMut(&'lifetime3 Person) -> &'lifetime3 str
... and so on for every possible lifetime

Unfortunately, &'lifetime1 str and &'lifetime2 str are two different types, which is in violation the signature because it requires a single specific type K.

There is no way to write down a signature for sort_by_key that would allow a closure like yours besides something like this:

pub fn sort_by_key<K, F>(&mut self, f: F) where
    F: for<'a> FnMut(&'a T) -> &'a K,
    K: Ord,

However now you require the key to be a reference, which disallows keys that are computed in some manner.

alice · May 29, 2021, 6:09pm

By the way, another solution is to call Ordering::reverse.

fn sort_desc(strings: &mut Vec<String>) {
    strings.sort_by(|a,b| a.cmp(b).reverse())
}

This solution arguably makes it more obvious that the order is reversed.

2e71828 · May 29, 2021, 6:29pm

Edit: this code is incorrect; see below.

~~It's a bit less elegant than the other solutions, but you could do something like this:~~

// Determine correct order let mut indices: Vec<usize> = (0..strings.len()).collect(); indices.sort_by_key(|&i| Reverse(&strings[i])); // Put strings in the right place for (i,j) in indices.into_iter().enumerate() { strings.swap(i, j); }

~~(Rust Playground)~~

Cerber-Ursi · May 30, 2021, 5:32am

That's not working - the already correctly-placed item would be swapped away while trying to put in place the later one. In playground, output is ["a", "b", "c", "e", "d"].

2e71828 · May 30, 2021, 8:15am

You’re right; I must have been out of it when I wrote that last night. ~~The correct swap code is~~

~~// Put strings in the right place for i in 0..strings.len() { while i != indices[i] { let j = indices[i]; strings.swap(i, j); indices.swap(i, j); } }~~

~~But there’s still a problem I can’t work out: The indices list is being calculated as [4,2,3,0,1] and I don’t understand why.~~

(Removing the Reverse correctly produces an ascending list).

zthompson47 · May 30, 2021, 2:05pm

It's all a bit confusing to me, but that looks like the correct order. strings[4] is 'e', strings[2] is 'd', etc... I got forward and reverse sort working by replacing strings.swap(i, j) with strings.swap(indices[i], indices[j]) as in this playground.

steffahn · May 30, 2021, 3:42pm

I disagree.

trait BetterSortByKeyExt<T> {
    fn better_sort_by_key<F>(&mut self, f: F)
    where
        for<'a> F: GetKey<&'a T>;
}

trait GetKey<RefT>: FnMut(RefT) -> Self::Key {
    type Key: Ord;
}

impl<S: ?Sized, RefT, K> GetKey<RefT> for S
where
    S: FnMut(RefT) -> K,
    K: Ord,
{
    type Key = K;
}

impl<T> BetterSortByKeyExt<T> for [T] {
    fn better_sort_by_key<F>(&mut self, mut f: F)
    where
        for<'a> F: GetKey<&'a T>,
    {
        self.sort_by(|a, b| f(a).cmp(&f(b)))
    }
}

Now let’s try using this…

// helps because it has an explicit type signature
fn reverse_helper<T: ?Sized>(x: &T) -> Reverse<&T> {
    Reverse(x)
}

fn sort_desc_3(strings: &mut Vec<String>) {
    // strings.better_sort_by_key(Reverse); // compile error
    // strings.better_sort_by_key(|x| Reverse(x)); // compile error
    // strings.better_sort_by_key(|x: &_| Reverse(x)); // different compile error
    strings.better_sort_by_key(reverse_helper); // works
    // strings.better_sort_by_key(|x| reverse_helper(x)); // compile error again
    // strings.better_sort_by_key(|x: &_| reverse_helper(x)); // different compile error again
}

looks like we’re still having to work around limitations of closures, but it demonstrates the principle. But I’m fairly confident that all the cases where sort_by_key already works should work with this more general version, too. Except for use cases that explicitly pass the type parameter K, because that one’s gone now, so it’s technically a breaking change to bring something like this into std.

(playground)

steffahn · May 30, 2021, 4:12pm

maybe in the meantime macros can help here for writing closures that are generic over elided lifetimes. This is what I came up with:

macro_rules! generic_parametrized {
    ($Fn:ident $([$($generics:tt)*]$([$($bounds:tt)*])?)? |$($($arg:ident: $Arg:ty),+$(,)?)?| $(-> $Ret:ty)? $body:block) => {
        {
            fn __local_helper_function<$($($generics)*,)?F: $Fn($($($Arg),+)?) $(-> $Ret)?>(f: F) -> F
            $($(where $($bounds)*)?)?
            { f }
            __local_helper_function::<$($($generics)*,)?_>(|$($($arg),+)?| $body)
        }
    };
}

macro_rules! generic_once {
    ($($t:tt)*) => { generic_parametrized!{FnOnce $($t)*} }
}

macro_rules! generic_mut {
    ($($t:tt)*) => { generic_parametrized!{FnMut $($t)*} }
}

macro_rules! generic {
    ($($t:tt)*) => { generic_parametrized!{Fn $($t)*} }
}

use std::cmp::Reverse;

fn sort_desc_3(strings: &mut Vec<String>) {
    strings.better_sort_by_key(generic_mut!(|x: &String| -> Reverse<&String> { Reverse(x) }));
}

// using generics
fn sort_vec_ref<T: Ord>(v: &mut Vec<T>) {
    v.better_sort_by_key(generic_mut!([T] |x: &T| -> Reverse<&T> { Reverse(x) }))
}

#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct WithBounds<T: Clone>(T);

// with bounds
fn sort_vec_with_bounds<T: Ord + Clone>(v: &mut Vec<WithBounds<T>>) {
    v.better_sort_by_key(generic_mut!([T][T: Clone] |x: &WithBounds<T>| -> Reverse<&WithBounds<T>> { Reverse(x) }))
}

(playground)

Yandros · May 30, 2021, 5:04pm

(Tip: you shouldn't need the FnMut / Fn variants of the macro, since you are returning the original F anyways)

steffahn · May 30, 2021, 5:18pm

I thought so too. It’s not true though. I had to learn that the hard way, since I started with just a macro using FnOnce and just got the weird error messages. Try it yourself by replacing any of the generic_mut! with generic_once!. Rust’s closure type inference is a weird beast.

alice · May 30, 2021, 8:37pm

I understand that you can make custom traits for these kinds of things, but closures don't work with those traits.

steffahn · May 30, 2021, 8:53pm

It’s less of a “custom trait” and more of a “synonym for specific closure trait bound (without the syntactic limitation that forces you to always name the return type)” though, provided that the “custom trait” has no methods at all and instead relies on a closure supertrait, and also that its blanket impl provides implementations for all types that could possibly implement it. I don’t know what exactly you mean by “closures don’t work with those traits”, as they do work with my trait, just closure type inference is crappy when the return type contains lifetimes that must depend on closure argument type’s lifetimes. (It doesn’t really work at all unless for the case that the types involved are already very explicitly provided to the level of detail that it’s clear where the relevant lifetime parameters are and how they’re interacting).
[Here’s a playground where with feature(unboxed_closures) some closure bounds are used directly. It doesn’t like the Ord bounds; without Ord bounds, the behavior is the same as with the GetKey trait synonym.]

Edit: I guess, there is a difference in usability in the case where the lifetime in the return type isn’t needed at all.

    strings.better_sort_by_key(|x: &String| (Reverse(x.clone()))); // works
    // strings.better_sort_by_key(|x| (Reverse(x.clone()))); // doesn’t work!
    strings.sort_by_key(|x: &String| (Reverse(x.clone()))); // works
    strings.sort_by_key(|x| (Reverse(x.clone()))); // works

This difference applies equally to the for<'a> F: GetKey<&'a T> version as it does to the unboxed_closures bound for<'a> F: FnMut<(&'a T,)>. I supposes, eventually better type inference should hopefully fix both cases in the future.

Raniz · June 3, 2021, 8:01pm

However now you require the key to be a reference, which disallows keys that are computed in some manner.

Couldn't that be solved with a Cow?

alice · June 3, 2021, 8:20pm

Hm, I suppose it seems like you could. It would have a runtime cost though. And the ToOwned bound that Cow uses could be inconvenient.

Raniz · June 4, 2021, 6:23am

Yeah, the requirement on ToOwned may be a bit superflous, maybe one could have a similar enum without that requirement.

Any runtime cost of Cow would likely be less than that of cloning though (when a reference is enough).

However, the best solution might be to just have an additional method named sort_by_key_ref.

alice · June 4, 2021, 8:36am

Of course, but if you compare to using sort_by with a cmp call, then the sort_by will still be best.

system · September 2, 2021, 8:37am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
How do I sort a `Vec<(f32, f32)>` help	4	2408	August 13, 2020
[Solved] Advanced sort on vector of slice help	18	6653	January 12, 2023
Sorting complex vec structures	7	2061	January 12, 2023
Sorting in Rust	3	1859	August 19, 2020
Sort Vec by another Vec help	5	824	October 19, 2021

Cannot call Vec::sort_by_key without cloning data?

Related topics