Lifetime rules for functions without arguments

Yes, that nails the problem. The first time I saw Rust I was quite put off by all the 'a, and that's coming from C++, which is no stranger to weird syntax constructs. However, I can say now that the sooner a Rust newbie learns what a lifetime is, the better.

However, at the risk of sounding like a broken record, I would still +1 a rule to allow elision for the case of zero arguments, like fn foo() -> &X. I just can't see any option other than interpreting it as fn foo<'a>() -> &'a X. Maybe this is a really bad example though, as it probably only makes sense for &str (I guess it is allowed for &str..). However, for a type X<'_>, allowing elision for fn foo() -> X could make sense in practice, since it can only be interpreted as fn foo<'a>() -> X<'a>.

If it was going to be added, it would be as fn foo() -> X<'_>, because even if it's elided, having that syntactic marker for "there's a lifetime here" is still helpful.

I still think that that's rare enough that it's not worth adding the elision rule, and that overall it's better to just have people write fn foo() -> X<'static>. (And unless it's invariant in the lifetime you can turn a 'static into anything else, so that's usually fine.)
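To make the parenthetical concrete, here is a minimal sketch of a 'static value being used where a shorter lifetime is expected. The type X and the helpers foo, longer and demo are made up for illustration, and X is assumed to be covariant in its lifetime (it is here, because Cow<'a, str> is).

```rust
use std::borrow::Cow;

// Hypothetical type for illustration; covariant in 'a because Cow<'a, str> is.
struct X<'a> {
    c: Cow<'a, str>,
}

// No arguments, so the signature names 'static explicitly.
fn foo() -> X<'static> {
    X { c: Cow::Borrowed("literal") }
}

// Requires both arguments to share one lifetime 'a.
fn longer<'a>(x: X<'a>, other: &'a str) -> Cow<'a, str> {
    if x.c.len() >= other.len() {
        x.c
    } else {
        Cow::Borrowed(other)
    }
}

// Mixes the 'static value with a borrow of a local String.
fn demo() -> String {
    let local = String::from("a much longer local string");
    // X<'static> is accepted as X<'a>, where 'a is the borrow of `local`.
    let picked = longer(foo(), &local).into_owned();
    picked
}
```

If X were invariant in 'a (for example by holding a Cell<&'a str>), the call in demo would presumably be rejected, which is the caveat mentioned above.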

But more concrete examples might help, if you think this is really common enough to be worth doing.


I am generally in favor of less "magic", so X<'_> feels better than a game of "guess if that type holds a reference".

concrete examples might help

A free function which builds an object based on some "pure by-value" settings should be treated the same as a member function which builds an object based on some "pure by-value" settings stored in a field. In the same sense, it should not matter whether I pass those settings by value or by reference. Example:

use std::borrow::Cow;
struct Settings { n: i32 }
struct X<'a> { x: i32, c: Cow<'a, str> }

struct Factory { s: Settings }
impl Factory {
  fn build1(&self) -> X { ... }  // ok
}

fn build2(s: &Settings) -> X { ... }  // ok

fn build3(s: Settings) -> X { ... }  // oh no ..

fn build4() -> X { ... }  // oh no ..

Playground

These are still ... examples, so they're not terribly persuasive to me. Especially since

fn build2(s: &Settings) -> X { ... }  // ok

will compile, but looks from those type definitions like it's not what you actually want anyway.

Unelided, it's

fn build2<'a>(s: &'a Settings) -> X<'a> { ... }

But there's nothing useful to borrow from the Settings struct you defined, and thus you probably actually wanted to write

fn build2(s: &Settings) -> X<'static> { ... }

instead, since the only useful thing you can put in the X is a string literal, and thus tying its lifetime to that of the argument is unnecessarily restrictive to your caller.
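A small sketch of what "unnecessarily restrictive" means for the caller. Settings and X are the types from the example above; the names build2_tied, build2_static and make are made up here, and the bodies use the X { x: s.n, c: Cow::default() } construction from the playground.

```rust
use std::borrow::Cow;

struct Settings { n: i32 }
struct X<'a> { x: i32, c: Cow<'a, str> }

// What elision picks: the result is tied to the borrow of `s`.
fn build2_tied<'a>(s: &'a Settings) -> X<'a> {
    X { x: s.n, c: Cow::default() }
}

// Explicit 'static: the result does not borrow from `s`.
fn build2_static(s: &Settings) -> X<'static> {
    X { x: s.n, c: Cow::default() }
}

// Hypothetical caller that wants to return the built value.
fn make() -> X<'static> {
    let s = Settings { n: 7 };
    // Fine: the result is independent of `s`, so it can outlive `s`.
    build2_static(&s)
    // build2_tied(&s) would be rejected here, because its result
    // is treated as borrowing from the local `s`.
}
```

Both bodies are identical; only the signature decides whether the caller may keep the X after the Settings is gone.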


The playground has the full text. The ... is just an X { x: s.n, c: Cow::default() } for this example.

My take would be: I agree that for build2 the elision rules did not pick the right one of the basically two possible interpretations. However, for build3 and build4 there is only one way it could work, but the elision rules do not allow picking it.
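For reference, here is a sketch of the four builders in a form that should compile as things stand, with the ... filled in as described above. The 'static on build3 and build4 is one of the two spellings discussed in this thread, and the x: 0 in build4 is an assumption, since that function has no settings to read from.

```rust
use std::borrow::Cow;

struct Settings { n: i32 }
struct X<'a> { x: i32, c: Cow<'a, str> }
struct Factory { s: Settings }

impl Factory {
    // Accepted with a bare `X` too; '_ only makes the elided lifetime visible.
    fn build1(&self) -> X<'_> {
        X { x: self.s.n, c: Cow::default() }
    }
}

// One reference among the inputs, so elision ties the output to it.
fn build2(s: &Settings) -> X<'_> {
    X { x: s.n, c: Cow::default() }
}

// No reference among the inputs: nothing to elide from, a lifetime must be named.
fn build3(s: Settings) -> X<'static> {
    X { x: s.n, c: Cow::default() }
}

// Same situation with no inputs at all (x: 0 is a placeholder value).
fn build4() -> X<'static> {
    X { x: 0, c: Cow::default() }
}
```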

Now, as a newbie "fixing compiler errors as they appear" by following the suggestions in the compiler's error output, I end up with:

fn build1(&self) -> X {...}
fn build2(s: &S) -> X {...}
fn build3<'a>(s: S) -> X<'a> {...}
fn build4<'a>() -> X<'a> {...}

Which imo is the worst outcome, because now I still don't have the correct lifetimes for build1 and build2, but I was also forced to "fix" and spend brain cycles on build3 and build4, which were not ambiguous to begin with.

Imho if the compiler forced me towards the code below, it would be a better outcome:

fn build1<'a>(&self) -> X<'a> {...}
fn build2<'a>(s: &S) -> X<'a> {...}
fn build3(s: S) -> X {...}
fn build4() -> X {...}

Now the first two cases are "fixed" and have the correct and intended lifetimes, while the other two cases can happily live on because they cannot have lifetime issues.

Not sure if that is the intention of the proposal, but something like this looks good to me too:

fn build1(&self) -> X<'_> {...}
fn build2(s: &S) -> X<'_> {...}
fn build3(s: S) -> X<'_> {...}
fn build4() -> X<'_> {...}

Because although this looks a bit busy, at least now it is both perfectly predictable and correct.

Just as a last thought, maybe the syntax X<'> could also be an option instead of X<'_>.


If you come from the C or C++ world then you, most likely, already know what lifetimes are. You can't write valid C or C++ code without understanding them.

What you actually need to know is what lifetime markup is. And for some reason most Rust documentation hides the truth among so many words.

Basically, two rules:

  1. Lifetime markup doesn't affect the runtime behaviour of a valid Rust program at all; it can simply be ignored (that's what mrustc does).
  2. Which means it is not part of the program (in some sense). It is a mandatory proof of correctness that every Rust program carries around.

That's it. It took me one reading of the appropriate part of the book to understand what it is and why you want it. Maybe that's because I studied in a specialized math class and my first degree is in mathematics, but in general it wasn't merely easy for me to understand. It was obvious. I always struggle to understand why anyone would have trouble with that concept.

But I wouldn't call that topic trivial. I see how people genuinely struggle with that concept, so perhaps it's another "a monad is a monoid in the category of endofunctors, what's the problem?" moment? Obvious for some, crazy hard for most?

Anyway. Lifetime markups are theorems, essentially. I know: math, scary, want to avoid. But that is what they are, in reality.

When the compiler looks at a function from the inside, it verifies that the promised "lifetime theorem" holds. When the compiler looks at the function from the outside, it relies on that "lifetime theorem" to prove the correctness of other, higher-level theorems. If the whole program is OK, then it's time to produce a binary.

As simple as that.
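A tiny sketch of the inside/outside view, with made-up names (first_word, caller) that are not from this thread:

```rust
// The signature is the "theorem": the result lives no longer than `text`.
fn first_word<'a>(text: &'a str) -> &'a str {
    // Inside: the compiler checks that the returned slice really
    // borrows from `text` (or from something that lives at least as long).
    text.split_whitespace().next().unwrap_or("")
}

fn caller() -> usize {
    let owned = String::from("hello world");
    // Outside: only the signature is consulted, never the body.
    let word = first_word(&owned);
    // drop(owned); // would be rejected here: `word` still borrows from `owned`
    word.len()
}
```

Changing the body of first_word without touching its signature cannot break caller, and that is the point of treating the signature as the theorem.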

The problem is not with the fact that one theorem is hard to imagine and another is hard to imagine. The idea is that the transformation from fn foo(&X) -> &Y to fn foo<'a>(&'a X) -> &'a Y is very useful: 99% of the time it's precisely what you want. But the transformation from fn foo() -> &Y to fn foo<'a>() -> &'a Y? Nope: 99% of the time that wouldn't be the theorem you want.

As was already explained: 99% of the time, when the compiler sees fn foo<'a>() -> &X, it does not mean that someone forgot to add 'a, but that someone picked the wrong type to return.

You don't need an interpretation for that nonsense. You need an error message.

And that's exactly what you are getting.

Rust is not about being 100% self-consistent (that's Haskell and even Haskell doesn't achieve that ideal). Rust is about being useful. That rule wouldn't be useful.

It would suffer from precisely the same problem as the transformation from fn foo() -> &Y to fn foo<'a>() -> &'a Y. It wouldn't be useful.

Well… that's precisely the faith which leads you to believe that the compiler understands you, even though the language doesn't work like this:

If you understand that the compiler cannot follow your intent, then how can you turn around and hope to arrive at a correct program while blindly following the compiler's recommendations?

This is an argument in favor of a better error message for functions which don't have arguments yet return references. Maybe it's even worth filing the appropriate bug.

As you, yourself, noted: this transformation hasn't helped you at all; in the end, it just made it easier to see the actual bug in your program.

How is it better? For someone who understands how lifetime markup works it would be, more or less, the same. For someone who doesn't understand that, it would still be strange that the first three make some sense while the last one doesn't.

Come on: the build4 function doesn't have access to any object with a non-'static lifetime, so where would it get an 'a to put into X<'_> to produce a useful "lifetime theorem"?

Because an argument may have non-trivial lifetime marks. Nothing that a function without arguments can access has these. Thus the only thing it could put in place of '_ would be 'static (or an arbitrary lifetime, which is practically the same thing). And as the whole discussion shows, that decision wouldn't be useful. It wouldn't lead to a correct program any faster.
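A sketch of why "an arbitrary lifetime" and 'static come out practically the same for a function without arguments. The function names here are made up for illustration.

```rust
// Generic over 'a, but with no inputs the body has nothing
// non-'static to borrow from, so it ends up returning a literal.
fn greeting_generic<'a>() -> &'a str {
    "hello"
}

// The same function with the lifetime spelled out as 'static.
fn greeting_static() -> &'static str {
    "hello"
}

fn demo() -> (&'static str, &'static str) {
    // The caller picks 'a, and is free to pick 'static.
    let a: &'static str = greeting_generic();
    let b = greeting_static();
    (a, b)
}
```

Either spelling lets the caller treat the result as 'static, so the generic form does not express anything extra here.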


It's worth noting, in this context, that Rust's elision rules evolved based on what was seen in real code.

In pre-1.0 Rust, there was a time where lifetime elision did not exist, at all. So you could not write fn hello() -> &str (still true today), and fn hello(x: &str) -> &str had to be written as fn hello<'a>(x: &'a str) -> &'a str.

The lifetime elision rules we have came from the observation that there were a few patterns in Rust code that recurred all the time, and we should make code like this easier to write:

  1. For input lifetimes only, when you have multiple inputs to a function that need lifetime parameters filled in, guessing that they each have their own unique lifetime is the common case. In other words, most code taking multiple inputs ended up looking like fn foo<'a, 'b, 'c>(a: &'a Bar, u: &'b mut Baz, v: Cow<'c, str>) -> u32, with a unique lifetime for each input that needed a lifetime parameter filled in.
  2. When one of your inputs to a function is a receiver (&'a self or &'b mut self) with a lifetime, the output lifetime is almost always the receiver's lifetime. In other words, fn foo<'a, 'b, 'c>(&'a self, a: &'b Bar, u: &'c mut Baz) -> Cow<'a, str> is almost always the right way to fill in the lifetime parameters on foo.
  3. When you have only one input with a lifetime, and the output includes lifetimes, the only choices for the output lifetimes are 'static and the same as the input. In practice, if you're taking an input by reference, and have an output with a lifetime, the chances are very high that your output lifetime is bounded by your input lifetime (your output contains a reference to part of your input). Plus 'static lifetimes in outputs are unusual, and often a sign that you need ownership (possibly shared via Arc or Rc) rather than references.
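The three patterns above can be sketched as pairs of equivalent signatures, elided first and fully written out second. Bar and Baz are the placeholder names from the list; their fields and all function names and bodies here are made up for illustration.

```rust
use std::borrow::Cow;

struct Bar { name: String }
struct Baz { count: u32 }

// Pattern 1: every input that needs a lifetime gets its own.
fn total(a: &Bar, u: &mut Baz, v: Cow<'_, str>) -> u32 {
    u.count += 1;
    a.name.len() as u32 + v.len() as u32
}
fn total_explicit<'a, 'b, 'c>(a: &'a Bar, u: &'b mut Baz, v: Cow<'c, str>) -> u32 {
    u.count += 1;
    a.name.len() as u32 + v.len() as u32
}

impl Bar {
    // Pattern 2: with a receiver, the output takes the receiver's lifetime.
    fn label(&self, _other: &Baz) -> Cow<'_, str> {
        Cow::Borrowed(self.name.as_str())
    }
    fn label_explicit<'a, 'b>(&'a self, _other: &'b Baz) -> Cow<'a, str> {
        Cow::Borrowed(self.name.as_str())
    }
}

// Pattern 3: a single input lifetime is reused for the output.
fn name_of(b: &Bar) -> &str {
    &b.name
}
fn name_of_explicit<'a>(b: &'a Bar) -> &'a str {
    &b.name
}
```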

For a new elision rule to be added the way the original three were, you'd need to show that in a significant corpus of Rust code (such as all 1.0 or above crates on https://crates.io/), there are lots of explicit lifetimes that are almost always written a particular way, and thus that the elision rule will cover the vast majority of cases where those lifetimes were previously explicit.

From the experience we've had with current elision rules, two things make an elision rule not worth having:

  1. There isn't a "best" choice for the elided lifetime. For example - and this existing elision rule is bad for other reasons, notably that it hides that there's a lifetime hidden in there at all - fn foo() -> Bar could become fn foo<'a>() -> Bar<'a> or fn foo() -> Bar<'static>. If Bar is actually &str, then the second is more likely, while if Bar is actually Cow<'_, str>, then the first one is more likely. Making elision rules depend on the types involved is also not a good choice, since we then need some way to determine which rules are correct for a user-defined type (I should be able to write my own equivalent of Cow, or of str, and have it behave just like the provided versions).
  2. Not much code would benefit from the elision rule. The input lifetime elision rule gets rid of a huge number of lifetimes from code that takes parameters with lifetimes, and either returns an owned value (such as u32 or String) or is only executed for its side effects. The first output lifetime elision rule means that "computed getter" type methods in an impl block don't need a lifetime, while the second means that any case with only one alternative to 'static gets the alternative chosen by default (which is more likely to compile, and more likely to get good error messages if it should have been 'static).