Help annotate lifetimes

According to a tutorial, the compiler has a problem validating which reference to use here:

fn main() {
    let magic1 = String::from("abracadabra!");
    let magic2 = String::from("shazam!");

    let result = longest_word(&magic1, &magic2);
    println!("The longest magic word is {}", result);
}

fn longest_word(x: &String, y: &String) -> &String {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

The error will be:

error[E0106]: missing lifetime specifier
     --> src/main.rs:9:38
      |
    9 | fn longest_word(x: &String, y: &String) -> &String {
      |                    ----        ----        ^ expected named lifetime parameter
      |
      = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
    help: consider introducing a named lifetime parameter
      |
    9 | fn longest_word<'a>(x: &'a String, y: &'a String) -> &'a String {
      |                ^^^^    ^^^^^^^        ^^^^^^^        ^^^

The solution is to use named lifetime parameters:

fn longest_word<'a>(x: &'a String, y: &'a String) -> &'a String {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

My question is that even the solution is confusing. It only prefixes the parameters with the same name 'a, so how can the borrow checker know which one, x or y, the return value refers to?

Also, why would the return type care which parameter is returned, as long as the type is the same as declared?

It doesn't. The whole point of the annotation is that both x and y and the return type have the same lifetime, so the return value can come from either. The compiler doesn't analyze the signature differently depending on the body. The very goal of the signature is to provide an interface that can be used for analyzing code that calls the function independently from the function's implementation.

Again, a reference in itself is not a type. &String is not a type. Only &'a String is a type for some lifetime 'a. The desugaring of the signature of the original longest_word is:

fn longest_word<'x, 'y>(x: &'x String, y: &'y String) -> &'??? String {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

so there are two distinct lifetime parameters introduced by the two reference-typed arguments. The compiler doesn't attempt to guess which one the return value should be tied to. That choice matters for correctness, so you must specify it by hand.

This is nothing special, it's just elementary logic – if you are borrowing from a value, then the function's signature must contain enough information to allow the compiler to determine whether the code is correct or you would be creating dangling pointers.

For all the compiler knows, it would be entirely possible to create a function with the following signature:

fn longest_word<'x, 'y>(x: &'x String, y: &'y String) -> &'x String

which would then mean that you are not allowed to return y from it, because the compiler couldn't verify that y is also valid for long enough.
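To see what that alternative signature gives the caller, here is a minimal sketch. The name first_word and the inner block are made up for illustration; the point is only that a result tied to 'x alone lets the second argument go out of scope before the result is used.

```rust
// Hypothetical variant: the result borrows only from `x`.
fn first_word<'x, 'y>(x: &'x String, _y: &'y String) -> &'x String {
    // Returning `_y` here would be rejected: nothing says 'y outlives 'x.
    x
}

fn main() {
    let magic1 = String::from("abracadabra!");
    let result;
    {
        let magic2 = String::from("shazam!");
        result = first_word(&magic1, &magic2);
    } // `magic2` is dropped here, which is fine: `result` only borrows `magic1`.
    println!("The magic word is {}", result);
}
```

With the single-lifetime signature of longest_word, the same main would not compile, because the result might borrow from magic2.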

This is perhaps one of the most important aspects to understand about Rust. Things like lifetime analysis and type inference do not cross function boundaries. A function signature must be valid by itself, without reference to how it is used, or the body of the function.

When checking the body of a function, the compiler will not take how or where the function is used into account. When calling a function, the compiler will not take the body of that function into account.

If you ever catch yourself thinking "but this should be valid because of how I'm going to use this" or "... because of the implementation", you're thinking about it wrong, and will likely just confuse yourself.

The great upshot is that when you are trying to read and understand code, you can limit your analysis to the signature of the function you're looking at, and the signature of functions you're calling.
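A small sketch of what "the signature is all the caller sees" means in practice. The function always_first is hypothetical: its body only ever returns x, but its signature uses one lifetime for everything, so callers are checked as if the result could borrow from either argument.

```rust
// Hypothetical function: the body only ever returns `x`, but the signature
// says the result may borrow from either argument.
fn always_first<'a>(x: &'a String, _y: &'a String) -> &'a String {
    x
}

fn main() {
    let magic1 = String::from("abracadabra!");
    let magic2 = String::from("shazam!");

    // Both arguments are still alive where `result` is used, so this is accepted.
    let result = always_first(&magic1, &magic2);
    println!("The magic word is {}", result);

    // If `magic2` were declared in an inner block that ended before the
    // `println!`, the call would be rejected because `magic2` does not live
    // long enough, even though the body never returns `_y`.
}
```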

Types which are the same except for the lifetime are still different types. So &'a String and &'b String are different types unless 'a == 'b.

If you had a function like so:

fn one_or_the_other(x: bool, y: u32) -> ???

there's no way to return exactly a bool or exactly a u32 to be determined at runtime, because Rust is statically typed.

In the ambiguous lifetime case, you want to do something like:

// fn longest_word(x: &String, y: &String) -> [ one or the other ]
// Desugared
fn longest_word<'a, 'b>(x: &'a String, y: &'b String) -> ???

So a similar problem presents itself. In this case, however, there's an out: shared references are covariant in their lifetime, meaning a shorter lifetime is a supertype of a longer lifetime. You can upcast from the longer lifetime to the shorter lifetime.

Or more casually, you can shorten the lifetime of a reference. For example, we can shorten both of the input lifetimes down to some intersection of the two:

fn longest_word<'a: 'c, 'b: 'c, 'c>(x: &'a String, y: &'b String) -> &'c String

'a: 'c means "'a outlives (or is equal to) 'c", and similarly for 'b: 'c. Together, this means either one can be "shortened" to 'c and so either input parameter could be returned.
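The outlives bound can also be looked at in isolation, with a single reference. This is only a sketch; shorten, 'long and 'short are names invented here for illustration.

```rust
// Hypothetical helper: `'long: 'short` reads "'long outlives 'short", so a
// `&'long String` may be handed out as a `&'short String`.
fn shorten<'long: 'short, 'short>(r: &'long String) -> &'short String {
    r
}

fn main() {
    let magic = String::from("abracadabra!");
    let short: &String = shorten(&magic);
    println!("{}", short);
}
```

Without the 'long: 'short bound the body would be rejected, since nothing would say the input reference is valid for all of 'short.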

But this is a pretty verbose signature. Instead, we can have the caller shorten the lifetimes of the arguments down to some common intersection lifetime at the call site:

fn longest_word<'c>(x: &'c String, y: &'c String) -> &'c String

There's no practical loss in functionality compared to the version with three different lifetimes.
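As a rough check of that claim, the sketch below puts both signatures next to each other and calls them with arguments that live for different scopes. The name longest_word_verbose is invented here just to keep the two versions apart.

```rust
// Verbose form: three lifetimes with explicit outlives bounds.
fn longest_word_verbose<'a: 'c, 'b: 'c, 'c>(x: &'a String, y: &'b String) -> &'c String {
    if x.len() > y.len() { x } else { y }
}

// Short form: the caller picks one lifetime that both borrows can be shortened to.
fn longest_word<'c>(x: &'c String, y: &'c String) -> &'c String {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let magic1 = String::from("abracadabra!");
    {
        let magic2 = String::from("shazam!");
        // The two borrows last for different scopes, yet both calls are accepted.
        let a = longest_word_verbose(&magic1, &magic2);
        let b = longest_word(&magic1, &magic2);
        println!("{} / {}", a, b);
    }
}
```

In both calls the result can only be used while magic2 is still alive, which is exactly the common lifetime the two borrows share.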

Now our signature is more like:

fn one_or_the_other(x: u32, y: u32) -> u32

where returning either x or y is possible.
