Why can't the compiler elide this lifetime?

Hi guys, first post. I am a beginner.

Trying to wrap my head around lifetimes, as many beginners are. I honestly thought it had clicked, but then I tried out this useless but simple code:

fn main() { }

fn longest(x: &str, y: &str) -> &str {
    x
}

Which gives:

error[E0106]: missing lifetime specifier
  --> src/main.rs:39:33
   |
39 | fn longest(x: &str, y: &str) -> &str {
   |               ----     ----     ^ expected named lifetime parameter
   |
   = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
help: consider introducing a named lifetime parameter
   |
39 | fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
   |           ^^^^    ^^^^^^^     ^^^^^^^     ^^^

I do not understand: how can Rust not figure out that (in my eyes, obviously!) the returned reference needs to have the same lifetime as the reference to x? I understand that the error says "the signature does not say whether it is borrowed from x or y". But why does this have to be a problem for an intelligent compiler? The compiler sees the body of the function, right?

I mean, if I can see it, why can the compiler not?

That's exactly it: the compiler doesn't look at the body to check the signature. Rust will never use the body of the function to infer any part of the signature. This lets you write forward-compatible signatures when you want, or precise signatures when necessary. But the compiler can't know which you prefer, so it forces you to decide.
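To make the forward-compatibility point concrete, here is a minimal sketch. The names longest_v1 and longest_v2 are made up for illustration: they stand for two versions of the same function over time. Both share one signature, so a caller that compiled against the first body keeps compiling against the second.

```rust
// Version 1: the body only ever returns `x`, but the signature
// deliberately promises less: "the result borrows from x or y".
fn longest_v1<'a>(x: &'a str, _y: &'a str) -> &'a str {
    x
}

// Version 2: same signature, new body. Callers are unaffected
// because they were only ever checked against the signature.
fn longest_v2<'a>(x: &'a str, y: &'a str) -> &'a str {
    if y.len() > x.len() { y } else { x }
}

fn main() {
    let a = String::from("apple");
    let b = String::from("banana");
    println!("{}", longest_v1(&a, &b));
    println!("{}", longest_v2(&a, &b));
}
```

Had the first version been given the tighter signature that ties the result to x only, switching to the second body would have been a breaking change for callers.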


Note the distinction between type inference and lifetime elision.

Inference is smart. It looks at all the details, and runs a complicated algorithm to find something that works.

Whereas elision is basic. It has a few simple rules, and applies them straightforwardly without checking to see if they fit with your code -- it'll happily apply elision rules that result in borrowck errors later.
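As a rough sketch of how simple those rules are, here are the two cases where elision does give the output a lifetime. The function first_word and the type Doc are hypothetical, picked only for illustration. With a single reference parameter, the output borrows from it; with a &self parameter, the output borrows from self, whatever the other parameters are. The original longest has two reference parameters and no self, so no rule applies and you get E0106.

```rust
// Rule: exactly one input lifetime, so the output gets that lifetime.
// Equivalent to: fn first_word<'a>(s: &'a str) -> &'a str
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

struct Doc {
    text: String,
}

impl Doc {
    // Rule: with `&self`, the output gets the lifetime of `self`.
    // The rule never considers `_fallback`, even though it is also a reference.
    fn title(&self, _fallback: &str) -> &str {
        self.text.lines().next().unwrap_or("")
    }
}

fn main() {
    let doc = Doc { text: String::from("Title\nbody text") };
    println!("{}", first_word("hello world"));
    println!("{}", doc.title("untitled"));
}
```

If the body of title returned _fallback instead, the elided signature would stay the same and the borrow checker would reject the body, which is the "happily applies rules that fail later" behaviour described above.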

For function signatures, the language has intentionally chosen to be less smart than it could be. For example, it could usually figure out the return type without you needing to specify it either. But it chooses not to because that makes error messages less localized.

For example, imagine you write this:

fn longest(x: &str, y: &str) -> &str {
    todo!()
}

If Rust looked at the body of the function, it would be legal for it to say "well, none of the lifetimes matter" and give you

fn longest<'a, 'b, 'c>(x: &'a str, y: &'b str) -> &'c str

But that's really not what you want as you're writing the rest of the code. By specifying the signature unambiguously you can call it from other code and get the right borrowck compiler errors in those other methods, then come back and fill in the implementation details in place of the todo!() later.
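Here is a small sketch of that workflow, assuming you pick the single-lifetime signature the compiler suggested. The caller describe is a hypothetical name. It is type-checked and borrow-checked against the stub's signature alone, so it compiles even though longest has no real body yet.

```rust
// Stub: the signature is complete even though the body is not.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    todo!()
}

// Hypothetical caller. It is checked against the signature of
// `longest` only, so it compiles while the stub is unfinished.
fn describe<'a>(a: &'a str, b: &'a str) -> &'a str {
    longest(a, b)
}

fn main() {
    // Everything compiles, but actually running the stub panics.
    // catch_unwind is used here only to observe that panic.
    let result = std::panic::catch_unwind(|| describe("a", "bb"));
    println!("stub panicked: {}", result.is_err());
}
```

The compiler will warn that x and y are unused until the body is filled in.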


I think one of the most important considerations for why it doesn't do that is that it would prevent the documentation from fully describing the code. As things are now, the generated documentation will use the function header pretty much exactly as written to show you how the function can be called. If the body were used to decide what lifetimes needed to be there, then the header would be insufficient, so the documentation would be insufficient. Lifetimes can be given descriptive names, so automatically generated names for the docs would not be a great solution either.


Incidentally, the suggestion below basically means "I might return (a copy of or something referencing) either argument", which keeps the options open for the function writer:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str

While these would mean it must be one of the specific arguments:

fn must_be_x<'a>(x: &'a str, y: &str) -> &'a str
fn must_be_y<'b>(x: &str, y: &'b str) -> &'b str
// Same things, but more explicit
fn must_be_x<'a, 'b>(x: &'a str, y: &'b str) -> &'a str
fn must_be_y<'a, 'b>(x: &'a str, y: &'b str) -> &'b str

Neither of these makes sense for a real longest, which may return either argument, and the compiler won't let you get away with a signature that the body doesn't live up to. (Imagine what would happen if you told the compiler you were returning a &'static str and it believed you, but you really returned a shorter-lived &str borrowed from a String.)
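A quick sketch of what the tighter signature buys the caller, using the must_be_x signature from above with a body filled in for illustration. Because the result is tied to x only, the value passed as y can go away while the result is still in use.

```rust
// The result is tied to `x` only; `_y` gets its own, unrelated lifetime.
fn must_be_x<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

fn main() {
    let x = String::from("long-lived");
    let result;
    {
        let y = String::from("short-lived");
        result = must_be_x(&x, &y);
    } // `y` is dropped here, but `result` only borrows from `x`.
    println!("{result}");
}
```

With the single-lifetime signature `fn longest<'a>(x: &'a str, y: &'a str) -> &'a str`, the same main is rejected, because the result might borrow from y, which does not live long enough.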


OK, that's the little missing piece I needed.

Thinking about the forward-compatible part of your answer, in combination with other answers in this topic, it makes sense. Let's say someone uses my function in production code; then changing my body could have a huge impact if Rust used it to infer lifetimes. It could suddenly break the caller's code. Whereas when Rust does not infer lifetimes from the body, I can do whatever I want in the body, as long as I respect the signature I once created, with explicit lifetimes if necessary.

Right? Just thinking out loud.


Thanks, your answer helped me understand it's especially important to remain unambiguous for the caller of the code, independent of the body.


I like the mental model of thinking from the perspective of the documentation, which should accurately and explicitly describe function signatures. If lifetimes were inferred from bodies, this would indeed go all over the place, which would be bad. Thanks.

Thanks, I love this overview!

I still need to dig into what 'static is precisely. So I will keep your final rhetorical question in mind once I do.

'static is the simplest lifetime: it just means the data can live for the rest of the program. In most cases, you get a 'static reference from a literal, from a static or const item, or by leaking a Box.
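A small sketch of those three sources. The names GREETING, NAME, leaked_label and needs_static are made up for the example. Box::leak gives up ownership of a heap allocation so that it is never freed, which is what makes handing out a 'static reference to it sound.

```rust
// Static and const items holding string literals.
static GREETING: &str = "hello";
const NAME: &str = "world";

// Leaking a box: the allocation is never freed, so a reference
// to it can be valid for the rest of the program.
fn leaked_label() -> &'static str {
    let owned = String::from("built at runtime");
    Box::leak(owned.into_boxed_str())
}

// Hypothetical function that insists on a 'static reference.
fn needs_static(s: &'static str) -> usize {
    s.len()
}

fn main() {
    // String literals are stored in the program binary itself.
    let literal: &'static str = "a string literal";
    println!("{}", needs_static(literal));
    println!("{}", needs_static(GREETING));
    println!("{}", needs_static(NAME));
    println!("{}", needs_static(leaked_label()));
}
```

A &str borrowed from a local String would not be accepted by needs_static, because that String is dropped at the end of its scope.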

