Why are two lifetime parameters needed here?

While going through this, I came across a lifetimes issue:

This compiles fine:

struct S<'long, 'short> {
    data: &'long &'short str,
}

impl<'long, 'short> S<'long, 'short> {
    fn act(&self) -> &'short str {
        self.data
    }
}

fn act_on_str(x: &str) -> &str {
    let s = S { data: &x };
    s.act()
}

But it fails if the same lifetime is used for both levels of indirection of data:

struct S<'a> {
    data: &'a &'a str,
}

impl<'a> S<'a> {
    fn act(&self) -> &'a str {
        self.data
    }
}

fn act_on_str(x: &str) -> &str {
    let s = S { data: &x };
    s.act()
}

with

error[E0515]: cannot return value referencing function parameter `x`
  --> src/lib.rs:15:5
   |
14 |     let s = S { data: &x };
   |                       -- `x` is borrowed here
15 |     s.act()
   |     ^^^^^^^ returns a value referencing data owned by the current function

I don't understand why we need 2 lifetime parameters here. Why does the compiler demand that the outer reference (&&str) have a strictly longer lifetime than the inner one (&str)? Or is something else going on?

PS: I looked through older questions to no avail, but I suspect this should have been asked before. The most related I found is this, but you may point me to something more relevant.

The input reference to a string slice, x, is a borrow whose lifetime (may) extend beyond the end of the function body (since generic parameters, including lifetime generics, are chosen by the caller).

Then we have x, which is an instance in and of itself.

As with any function param, it becomes a local binding / local variable of the function, and is thus moved away / dropped / ceases to exist when the function returns.

And since an instance cannot be borrowed after it is dropped / goes away, the lifetime of the borrow over x itself cannot go beyond the end of the function body.

let inner = String::new();
let ret = act_on_str(&inner);
drop(ret);
drop(inner);

// becomes:

let inner = String::new();
let ret = {
    let x = &inner; // ------------------+
    let s = S { data: &x }; //           |
    s.act() //                           |
}; // <--------------- `x` dropped here -+ (outer lifetime cannot go beyond this point)
drop(ret); //                            | (inner lifetime must go beyond this point)
drop(inner); // <- `inner` dropped here -+ 

That's why requiring that both lifetimes be exactly the same is too restrictive.

  • One thing that can happen is that the inner lifetime shrinks down to that of the borrow over x itself, but in that case the inner lifetime is too short for the result to be returned.
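To illustrate the shrinking case, here is a minimal sketch that reuses the single-lifetime S from the question. The helper name len_via_s is hypothetical, and act dereferences explicitly with *self.data instead of relying on a coercion. It compiles because the result never leaves the function, so 'a can shrink to the borrow over x:

```rust
struct S<'a> {
    data: &'a &'a str,
}

impl<'a> S<'a> {
    fn act(&self) -> &'a str {
        // Copy the inner reference out from behind the outer one.
        *self.data
    }
}

// Hypothetical helper: the `&'a str` from `act` is only used locally,
// so shrinking `'a` down to the borrow over `x` is not a problem.
fn len_via_s(x: &str) -> usize {
    let s = S { data: &x };
    s.act().len()
}
```

Only returning the &str itself, as act_on_str does, requires the inner lifetime to stay longer than the function body.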

Addendum

You therefore got 'short and 'long swapped:

//                     ≥
struct S<'short, 'long : 'short> {
    data: &'short &'long str,
}
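With the names swapped this way, the rest of the question's code could look like the sketch below. It is the same logic as the first snippet in the question, except that act now returns the inner ('long) lifetime and dereferences explicitly:

```rust
struct S<'short, 'long> {
    // Outer borrow is the short one; the string data lives longer.
    data: &'short &'long str,
}

impl<'short, 'long> S<'short, 'long> {
    fn act(&self) -> &'long str {
        // Copy the inner `&'long str` out from behind the outer reference.
        *self.data
    }
}

fn act_on_str(x: &str) -> &str {
    // `&x` borrows the local binding `x` ('short, ends with this function),
    // while `x` itself points at caller-owned data ('long).
    let s = S { data: &x };
    s.act()
}
```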

You got it all mixed up. You need to swap the names long and short in your source code to correctly reflect what’s going on. (Ah, now that I’ve finished writing this, @Yandros already noticed this, too.)

A reference type &'a T is only a valid type if T: 'a, i.e. the lifetime 'a of the reference must be such that T outlives 'a (you read the : as “outlives”). This notation, T: 'a, means, roughly, that (the lifetimes of) all references (transitively) contained inside of T must outlive 'a. Finally, 'b: 'a between two lifetimes, read as “'b outlives 'a”, means that 'b is “longer” than 'a. It does not need to be strictly longer (i.e. 'a: 'a is true), and longer vs. shorter means, thinking about lifetimes roughly as scopes / code blocks, that the “shorter” lifetime is entirely contained within the longer one.

The intuition for why &'a T is only a valid type if T: 'a is something like this: a reference of type &'a T is supposed to be valid for its whole lifetime 'a, and a valid reference must point to valid data. If that data itself contains references, then the data is only valid while all the contained references are valid. Thus, throughout all of 'a, all of the references inside T must still be valid.
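These bounds can also be written out in code. The following is a small sketch with hypothetical names (Holder, peek, shorten, same) that spells out each relation explicitly:

```rust
// `T: 'a` spelled out: everything inside `T` must outlive the borrow `'a`.
// (The compiler would infer this bound even if it were omitted.)
struct Holder<'a, T: 'a> {
    value: &'a T,
}

fn peek<'a, T: 'a>(h: &Holder<'a, T>) -> &'a T {
    h.value
}

// `'long: 'short` reads "'long outlives 'short", so a `&'long str`
// can be handed out where only a `&'short str` is promised.
fn shorten<'short, 'long: 'short>(r: &'long str) -> &'short str {
    r
}

// "Outlives" is not strict: `'a: 'a` is a valid (trivially true) bound.
fn same<'a: 'a>(r: &'a str) -> &'a str {
    r
}
```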

In your example you have a nested reference, so we need to apply the knowledge from the previous paragraph to the case where T is of the form &'b S. Now all references in T is the reference &'b S itself as well as all references inside S. Thus T: 'a, i.e. &'b S: 'a is fulfilled whenever both 'b: 'a and S: 'a.

We can conclude that the type &'a &'b S is valid whenever &'b S: 'a holds and &'b S is valid, i.e. whenever 'b: 'a and S: 'a as well as S: 'b. The “outlives” relation is transitive, hence S: 'a follows from S: 'b and 'b: 'a. The final conclusion is that &'a &'b S is valid whenever 'b: 'a and S: 'b.

For the concrete type &'a &'b str, the type str does not contain any more references, hence str: 'b is trivially true for all lifetimes 'b. (The fact “str: 'b is true for all lifetimes 'b” is usually written as “str: 'static”. The lifetime 'static is longer than [i.e. outlives] every other lifetime, which means 'static: 'b is true for any 'b. Then str: 'b follows transitively from str: 'static and 'static: 'b.) This means that &'a &'b str is valid whenever 'b: 'a, i.e. 'b is longer than 'a. By the way, the reason that you can still write

struct S<'short, 'long> {
    data: &'short &'long str,
}

as opposed to

struct S<'short, 'long: 'short> {
    data: &'short &'long str,
}

is that the Rust compiler infers and enforces these relations between lifetimes implicitly and automatically. Both versions of the code do the same thing, one is just more explicit.
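The same implicit inference applies to function signatures. As a sketch (inner_as_outer and its parameter names are made up for illustration), the function below never states 'b: 'a, yet its body may rely on it, because the argument type &'a &'b str is only well-formed when that relation holds:

```rust
// No explicit `'b: 'a` bound: it is implied by the type of `_witness`.
fn inner_as_outer<'a, 'b>(_witness: &'a &'b str, s: &'b str) -> &'a str {
    // Returning a `&'b str` as a `&'a str` is only sound if `'b: 'a`.
    s
}
```

Dropping the _witness parameter should make the function fail to compile, since nothing would relate 'a and 'b anymore.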


I did get it backwards. On second thought, I presume it is in general impossible to have x: &'a &'b T where the lifetime 'b is strictly shorter than 'a, because then after the end of 'b and before the end of 'a, x would be a dangling pointer. Please correct me if this is wrong.

Also, just to see if I got it: the issue with a single lifetime is that it demands data to live at least as long as *data (call this data >= *data), since they share the same lifetime. However, because s and its data are created within the scope of act_on_str() (and thus both are dropped right at the end of this scope) whereas *data is returned (and thus outlives the end of the scope), we actually have data < *data, so the demand can't be satisfied. Is this intuition right?


It seems correct.