Can't understand this lifetime

Hi folks!

Please, could you help me understand this lifetime and make it work?

/// Sort the files by groups, and apply some algorithm on each.
fn detect_duplicates<'a, G, FG, FA>(medias: &'a mut [Media], grouping: FG, algo: FA) -> usize
where
    G: PartialEq + Ord + 'a,
    FG: Fn(&Media) -> &'a G,
    FA: Fn(&G, &[Media], &mut usize),
{
    medias.sort_unstable_by_key(|m| grouping(m));
    let mut count = 0;
    medias
        .chunk_by(|m, n| grouping(m) == grouping(n))
        .for_each(|acc| algo(grouping(&acc[0]), acc, &mut count));
    count
}

This code actually type-checks. I thought I've covered the lifetime here, making it very clear that FG (the func for grouping) returns 'a bounded references from the original &'a mut [Media].

But, when I try to use it, Rust is not happy:

    // first by size.
    detect_duplicates(&mut medias, |m| &m.size, |&size, acc, count| todo!());

    // then by name.
    let count_name = detect_duplicates(&mut medias, |m| &m.words, |words, acc, count| todo!());
error: lifetime may not live long enough
  --> src/lib.rs:18:40
   |
18 |     detect_duplicates(&mut medias, |m| &m.size, |&size, acc, count| todo!());
   |                                     -- ^^^^^^^ returning this value requires that `'1` must outlive `'2`
   |                                     ||
   |                                     |return type of closure is &'2 u64
   |                                     has type `&'1 Media`

error: lifetime may not live long enough
  --> src/lib.rs:21:57
   |
21 |     let count_name = detect_duplicates(&mut medias, |m| &m.words, |words, acc, count| todo!());
   |                                                      -- ^^^^^^^^ returning this value requires that `'1` must outlive `'2`
   |                                                      ||
   |                                                      |return type of closure is &'2 Vec<String>
   |                                                      has type `&'1 Media`

I've tried every possible combination of lifetimes, including FG: for<'b> Fn(&'b Media) -> &'b G, but nothing has worked.
@quinedot You've helped me before with a similar problem... I'd appreciate your input.
Playground: here

Thanks!

1 Like

I think the issue is that sort_unstable_by_key has been written in a way that allows it to move elements while holding the key of an element, which means the key can't reference the element. It doesn't do that, but it has the same signature as sort_by_cached_key which does do that. I don't think it's possible in current Rust to write it differently from sort_by_cached_key.

So in summary, detect_duplicates can't call sort_unstable_by_key.

You can make it work with sort_unstable_by.

fn detect_duplicates<G, FG, FA>(medias: &mut [Media], grouping: FG, algo: FA) -> usize
where
    G: PartialEq + Ord,
    FG: Fn(&Media) -> &G,
    FA: Fn(&G, &[Media], &mut usize),
{
    medias.sort_unstable_by(|a, b| Ord::cmp(grouping(a), grouping(b)));
    let mut count = 0;
    medias
        .chunk_by(|m, n| grouping(m) == grouping(n))
        .for_each(|acc| algo(grouping(&acc[0]), acc, &mut count));
    count
}
5 Likes

Two things:

  1. Lifetimes in Fn traits are higher-order; no amount of generic lifetime params will satisfy those.
  2. You can't sort_unstable_by_key with a key function that returns a type of which the lifetime depends on the item.

Fixed code.

4 Likes

I agree :100: with the other two replies.

You can see the problem in the sort_unstable_by_key closure bound:

pub fn sort_unstable_by_key<K, F>(&mut self, f: F)
where
    F: FnMut(&T) -> K, // AKA for<'a> FnMut(&'a T) -> K
    K: Ord,

K must be a single type, which means it can't vary based on the input lifetime 'a, which means it can't be a borrow of some field of T. sort_unstable_by is more flexible in this sense.

4 Likes

Wow, amazing answers, thank you @drewtato, @paramagnetic, and @quinedot!
But I confess I didn't quite get it, even after reading several times each answer this afternoon...

How was this deduced? And how would Rust detect that sort_unstable_by_key could move elements while holding a key? And why does this capability "taint" my function somehow?

  1. I think I get this part, but why did this impact my function? It seems that that would be beneficial, actually.
  2. But why does it compile then? I specifically say that FG will return a reference, and no error is reported there.

I found this fascinating! Lifetimes are generics, I know, but at this level?? As far as I could see, they did have the same type K, e.g. &u64 on the first case. But OK, each ref should have an internal elided lifetime, which makes their types different for each element, is that it? Yikes! In this case, the compiler should be able to report something there, as it does seem like an error...

Why is sort_unstable_by "more flexible" in this sense? How could I notice this, and how can the compiler?


Even with almost 4 years of Rust, I still find myself getting stuck like this. And I love Rust so I study it a lot, I code a lot, and I follow blogs of a lot of great Rustaceans, hence I consider myself an expert-ish.
So please, what more should I study to get there? To grasp this kind of advanced concepts? I've never read the Reference or the Rustonomicon, but I will if you say this magic is in there.
Thanks again.

Right, there isn't really "a type &i32". For every distinct lifetime 'a, &'a i32 is a distinct type. Knowing that, and then combining it with the knowledge that type parameters like K have to resolve to a single type, (and then encountering the situation enough times to be familiar), is what let's you see

F: FnMut(&T) -> K

and conclude "ohhh okay, the closure can't return a borrow of T's fields".

Or more generally, "this closure can't borrow". This comes up with closure bound errors a lot: If the inputs have elided (higher-ranked) lifetimes, and the return type is a type parameter, the closure can't borrow from the inputs.


With sort_unstable_by, you perform the comparison between two &T within the closure, instead of trying to return a "key" for a single element. So it's okay if you use a key that borrows from T temporarily: the keys never escape the closure.

One could imagine another method that basically encapsulates the solution of this thread:

pub fn sort_unstable_by_field<K, F>(&mut self, f: F)
where
    F: FnMut(&T) -> &K,
    K: Ord + ?Sized,
{
    self.sort_unstable_by(|a, b| f(a).cmp(f(b)))
}

But note that this bound basically forces the closure to return a field of T,[1] so it is not strictly more general than sort_unstable_by_key (where you can calculate some value on the fly, like an integer, and return that). They serve different use cases.

(Rust still doesn't have a good way to support both use cases in a single method.)

This thread does make me feel that at least an example should be added to the docs that says "if you need to borrow, use sort_unstable_by instead".


It does notice them, but the error messages aren't great. For example, let's see what happens if your detect_duplicates had looked like so:

fn detect_duplicates<G, FG, FA>(medias: &mut [Media], grouping: FG, algo: FA) -> usize
where
    G: PartialEq + Ord,
    FG: Fn(&Media) -> &G,
    FA: Fn(&G, &[Media], &mut usize),
{
    medias.sort_unstable_by_key(|m| grouping(m));
    let mut count = 0;
    medias
        .chunk_by(|m, n| grouping(m) == grouping(n))
        .for_each(|acc| algo(grouping(&acc[0]), acc, &mut count));
    count
}
34 |     medias.sort_unstable_by_key(|m| grouping(m));
   |                                  -- ^^^^^^^^^^^ returning this value requires that `'1` must outlive `'2`
   |                                  ||
   |                                  |return type of closure is &'2 G
   |                                  has type `&'1 Media`

The compiler saw that the closure returned &G for some lifetime, and tried to find a way to make this a single type (K for sort_unstable_by_key). That would mean the closure would have to return &'g G for one specific lifetime 'g. But because grouping is a borrowing closure, that's not possible -- if the input lifetime is shorter than 'g, you can't return a &'g G. The error is trying to communicate this last part, but the real error is "you can't use a borrowing closure here".

(If you make the closure return &'g G for a single lifetime, you end up with your OP where this function compiles -- but then the error just moves to the call site instead.)

If you try to use grouping itself instead of creating a new closure:

    medias.sort_unstable_by_key(&grouping);

You get a different pair of errors:

error[E0310]: the parameter type `G` may not live long enough
  --> src/lib.rs:34:5
   |
34 |     medias.sort_unstable_by_key(&grouping);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |     |
   |     the parameter type `G` must be valid for the static lifetime...
   |     ...so that the type `G` will meet its required lifetime bounds
   |
help: consider adding an explicit lifetime bound
   |
30 |     G: PartialEq + Ord + 'static,
   |                        +++++++++

That one is mostly a distraction, but there's also:

error[E0308]: mismatched types
    --> src/lib.rs:34:5
     |
34   |     medias.sort_unstable_by_key(&grouping);
     |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ one type is more general than the other
     |
     = note: expected reference `&_`
                found reference `&_`
note: the lifetime requirement is introduced here
    --> /playground/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/mod.rs:3007:25
     |
3007 |         F: FnMut(&T) -> K,
     |                         ^

and the "one type is more general than the other" is the compiler's way of saying "I expected a closure that returns a single type K, but you gave me one that returns a different type based on the input lifetime". That's a lot closer to something useful, at least, but you still have to have enough experience to translate from "more general" to "can't use a borrowing closure here".

In summary, the compiler does notices these things, but struggles to communicate the real problem back to the programmer. Until that improves, you just need the experience to be able to unravel the errors, IMO.


  1. without a bunch of tricks that probably aren't worth it for this small of a convenience, anyway ↩ī¸Ž

5 Likes

Fascinating answer, again, very comprehensive @quinedot!!
Thank you very, very much. I do understand it now!

I particularly liked the insights you provided to notice such things...
Your example, "let's see what happens if your detect_duplicates had looked like so" is exactly what I initially wrote, and after being greeted by that error I tried to bend the rules with manual lifetimes... I didn't know I would just move it to the call sites, awesome!

But the best one was using grouping itself instead of a new closure!! I never imagined this shorthand could have such implications! The errors are pretty interesting, a "type that may not live long enough" (Wow), and a mismatched types because "one is more general than the other".... All the clues were there, but this experience and familiarity were really needed or they just look like cryptic nonsense.

Thanks to everyone who replied here, the Rust community is awesome!

PS.: @quinedot You have a great book there! That will be my next read.

@quinedot already answered this one. The signature of the key projection function is FnMut(&Item) -> Key, which means that the returned key has no lifetime parameter dependent on the argument's (item's) lifetime.

The compiler doesn't look at the implementation of the sort_unstable_by_key() function itself to detect this; it's merely the interface that causes the problem. The implementation is then allowed to rely on this property. (The converse would also be true: if the interface didn't enforce this stronger requirement on the key extraction function, then the implementation of the sorting algorithm wouldn't be allowed to reorder the elements while holding onto the key.)

Because you are trying to return a reference to a field of the item, which does depend on its lifetime.

You can pass any lifetime to a HRTB, including references to very short-lived locals/temporaries, such as the items of an iterator. There's no way to express those lifetimes with a generic lifetime parameter.

Lifetime parameters, like all generic parameters, are chosen by the caller, which means they always outlive the function body. By adding an explicit lifetime param to the argument type of the key function, your were trying to force the key function to accept and/or return references with an arbitrarily long lifetime, which is of course nonsensical if you only have the borrow of a local variable.

1 Like