Hi everyone, I've been trying to wrap my head around the semantics of the anonymous lifetime '_. I originally learned about it from Jon's video, and I don't think he's right about how anonymous lifetimes interact.
He claims that these two function signatures are the same:
The problem is that I couldn't get foo to compile in any edition, so I don't think he's right about that.
I understand his explanation as follows:
In argument position, the anonymous lifetime means a fresh unnameable lifetime, so it can be used to "discard" lifetimes from consideration during inference (because it's unnameable?). Yet I've never been able to use it that way.
In return position it means lifetime inference (?). I don't think that is a very good term for what happens, because in Rust there's no inference across function signatures, and the compiler does not analyse code to determine which lifetime is being returned.
In my opinion it's more of a "placeholder" for a lifetime parameter in a type annotation where there's only one possible value for it, as in:
This compiles just fine.
This, however, is just an explicit annotation that a borrow occurs; the underlying mechanism is lifetime elision, which dictates in which cases lifetimes can be trivially inferred. It's not the anonymous lifetime that causes inference, it's the lifetime elision rules; the anonymous lifetime is needed here only to make the borrow explicit.
I found very little information about this topic in authoritative sources like the Rust Reference.
How do you understand / use this language feature?
Indeed, foo will never compile. &str and &'_ str are completely identical, so adding '_ does not resolve anything compared to the version without lifetime annotations.
'_ does not mean a specific lifetime, but rather, it is equivalent to leaving out the lifetime but can be used in places where, syntactically, including a lifetime name is required.
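To make the point above concrete, here is a minimal sketch. The thread's original foo is not reproduced, so `pick`, `pick_first`, `len_elided` and `len_anonymous` are hypothetical names made up for illustration. It shows that writing '_ on the parameters changes nothing, and that a named lifetime is what actually tells the compiler which input the result borrows from.

```rust
// Hypothetical functions for illustration only.

// Rejected by the compiler (E0106, missing lifetime specifier), with or
// without `'_`, because there are two input lifetimes and no `&self`:
//
// fn pick(a: &'_ str, b: &'_ str) -> &'_ str { a }

// Naming the lifetime states which input the result borrows from.
fn pick_first<'a>(a: &'a str, _b: &str) -> &'a str {
    a
}

// In argument position `&str` and `&'_ str` are the same type.
fn len_elided(s: &str) -> usize {
    s.len()
}

fn len_anonymous(s: &'_ str) -> usize {
    s.len()
}

fn main() {
    let long_lived = String::from("first");
    let result;
    {
        let short_lived = String::from("second");
        // Only `long_lived` is tied to the result, so `short_lived`
        // is allowed to go out of scope before `result` is used.
        result = pick_first(&long_lived, &short_lived);
    }
    println!("{result}");
    assert_eq!(len_elided("abc"), len_anonymous("abc"));
}
```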
This is true. But it is equally true that in argument position, leaving out the lifetime entirely means a fresh unnameable lifetime. This is a fact about argument position, not about '_.
That’s right. The lifetime inference rules — sometimes called lifetime elision rules, though lifetime elision is technically the practice of writing &str instead of &'a str, because they are rules that apply only when lifetimes are elided — are very simple syntactic rules. They will never infer a lifetime for foo because foo has two unnamed input lifetimes. Return-position lifetime inference works in either of two cases:
The function parameters include exactly one lifetime (elided or not). That one lifetime is used.
The function parameters include an &self or &mut self parameter. The lifetime of that reference is used.
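A small sketch of the two cases listed above, using made-up names (`first_word`, `Buffer`, `contents`) that are not from the thread:

```rust
// Hypothetical type for illustration.
struct Buffer {
    data: String,
}

impl Buffer {
    // `&self` case: the elided output lifetime is the lifetime of `&self`,
    // even though `_prefix` is a reference too.
    fn contents(&self, _prefix: &str) -> &str {
        &self.data
    }
}

// Single-input-lifetime case: the parameters contain exactly one lifetime,
// so that lifetime is used for the returned reference.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let buffer = Buffer { data: String::from("payload") };
    println!("{}", buffer.contents("ignored"));
    println!("{}", first_word("hello world"));
}
```

Neither function body is inspected to pick the lifetime; the signature alone decides it.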
The only thing I'll add is that writing out '_ is different than eliding it for a particular case: The dyn Trait<..> + '_ lifetime outside of expressions. The main time this matters is when you get a 'static requirement you didn't want, usually with Box<dyn ...>.
// These are the same
fn take_box(bx: Box<dyn Display>) { ... }
fn take_box(bx: Box<dyn Display + 'static>) { ... }
// These are the same and accept more inputs
fn take_more_box(bx: Box<dyn Display + '_>) { ... }
fn take_more_box<'a>(bx: Box<dyn Display + 'a>) { ... }
// These are the same
impl Whatever for dyn Trait { ... }
impl Whatever for dyn Trait + 'static { ... }
// These are the same and provide more implementations
impl Whatever for dyn Trait + '_ { ... }
impl<'a> Whatever for dyn Trait + 'a { ... }
// These are the same, which is almost never a problem.
//
// But if you ever get a complaint about `'static` when calling
// a function like this, perhaps it's due to an implementation
// that left off `+ '_`.
fn take_ref(r: &dyn Display) { ... }
fn take_ref<'a>(r: &'a (dyn Display + 'a)) { ... }
(The actual set of rules around the default dyn lifetime is complicated and the Reference is not accurate on the topic. Aside from knowing about the hidden 'static, it's niche knowledge anyway, so don't sweat it.)
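Here is a runnable sketch of the Box case, assuming a hypothetical borrowing type `Borrowed` that is not part of the thread. It shows the hidden 'static rejecting a trait object that borrows local data, while the + '_ version accepts it.

```rust
use std::fmt::{self, Display};

// Hypothetical wrapper that borrows its contents, so it is not `'static`.
struct Borrowed<'a>(&'a str);

impl Display for Borrowed<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "<{}>", self.0)
    }
}

// Implicitly `Box<dyn Display + 'static>`.
fn take_box(bx: Box<dyn Display>) -> String {
    bx.to_string()
}

// Also accepts trait objects that borrow from shorter-lived data.
fn take_more_box(bx: Box<dyn Display + '_>) -> String {
    bx.to_string()
}

fn main() {
    let local = String::from("local");
    println!("{}", take_box(Box::new(42)));
    println!("{}", take_more_box(Box::new(Borrowed(&local))));
    // Uncommenting the next line fails to compile, because the boxed value
    // borrows `local` and therefore cannot satisfy the implicit `'static`:
    // take_box(Box::new(Borrowed(&local)));
}
```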
The '_ lifetime isn't an actual lifetime like 'a in your example. It's more like a placeholder that you write when a lifetime specifier is syntactically required but leaving the lifetime unspecified causes no conflicts or issues for the borrow, so it can't really solve any lifetime problems.
After some further deliberation it seems to me like a very overloaded language feature.
I'd much rather have it just be a placeholder where a lifetime is syntactically required but can be trivially inferred. That is not what lifetime elision does: in my opinion lifetime elision functions more like a macro than logical inference, since it arbitrarily assumes the receiver's lifetime (&'a self) in methods, which is not at all how inference works.
But then '_ also establishes lifetime relations in impl Trait + use<'_> and, as @quinedot mentioned, in dyn Trait + '_. Then there are edge cases like this where it causes lifetime subtyping bounds to be included, which has nothing to do with lifetime elision, nor does it seem to me like a simple placeholder, as the two fresh lifetimes are implicitly bounded?
I guess there's no getting around finally having to spend a few days properly studying the associated RFCs.
The whole point of lifetimes, their raison d'être, is to avoid inference in function interfaces. Thus, of course, lifetime elision and anonymous lifetimes couldn't work like that: it would make the whole construct rather pointless.
That being said, for functions that are not part of your crate's interface you very often don't really care about properly declared interfaces, and perhaps some kind of Rust++ could decide to infer these (as in C++, where the return type of an auto function can be inferred from its body).
This would be a radically different language, though.
P.S. As for the dyn Trait and use<'_> story: it's a question of picking orthogonality vs. usability. Most of the time it's more usable to have an implicit 'static in these cases, but then you need special syntax for the “more generic” (even if less often needed) case. A similar story happened with the implied : Sized bound.
I agree, and that's why I find the use of the term "lifetime inference" a very poor choice of words, as there is no, and in my opinion there should never be any, type inference of function signatures in Rust. The signature should always be the true source of typing information.
Eliding lifetimes does result in inference within function bodies.
'_ being the same as elision (where elision is allowed) is what allows getting rid of completely hidden lifetimes in function signatures without naming the lifetime.
fn whatever(&self) -> Something<'_> { ... }
// Your choices used to be:
// fn whatever(&self) -> Something { ... }
// fn whatever<'s>(&'s self) -> Something<'s> { ... }
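For a complete version of the signature above, here is a sketch with hypothetical definitions of `Something` and an `Owner` type (neither is given in the thread). The two methods have the same signature; the second just names the lifetime.

```rust
// Hypothetical types for illustration.
struct Something<'a> {
    slice: &'a [u8],
}

struct Owner {
    bytes: Vec<u8>,
}

impl Owner {
    // `'_` makes it visible that `Something` carries a lifetime;
    // elision ties it to `&self`.
    fn whatever(&self) -> Something<'_> {
        Something { slice: &self.bytes }
    }

    // The same signature with the lifetime written out.
    fn whatever_named<'s>(&'s self) -> Something<'s> {
        Something { slice: &self.bytes }
    }
}

fn main() {
    let owner = Owner { bytes: vec![1, 2, 3] };
    println!("{}", owner.whatever().slice.len());
    println!("{}", owner.whatever_named().slice.len());
}
```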
It was introduced in a kitchen sink RFC and what actually happened strayed quite a bit from the RFC. Though I guess the motivation section holds up.
impl<'a> Iterator for MyIter<'a> { ... }
impl<'a, 'b> SomeTrait<'a> for SomeType<'a, 'b> { ... }
tomorrow you would write:
impl Iterator for MyIter<'_> { ... }
// They changed the number of trait parameters in the RFC lulz
impl<'tcx> SomeTrait<'tcx> for SomeType<'tcx, '_> { ... }
...assuming you don't need to name the elided lifetimes in the implementation block. (You probably need to name the lifetime in the Iterator implementation...)
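As a sketch of that caveat, assuming a hypothetical `MyIter` over a slice of u32 (the RFC does not define its fields): an impl that never mentions the lifetime can use '_, while the Iterator impl has to name it because the associated type refers to it.

```rust
use std::fmt;

// Hypothetical iterator type for illustration.
struct MyIter<'a> {
    items: &'a [u32],
    pos: usize,
}

// The lifetime is never mentioned inside this impl, so `'_` is enough.
impl fmt::Debug for MyIter<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "MyIter at {} of {}", self.pos, self.items.len())
    }
}

// `type Item` refers to the lifetime, so it has to be named here.
impl<'a> Iterator for MyIter<'a> {
    type Item = &'a u32;

    fn next(&mut self) -> Option<Self::Item> {
        // Copy the shared slice reference out so the item borrows for `'a`,
        // not for the shorter borrow of `self`.
        let items: &'a [u32] = self.items;
        let item = items.get(self.pos)?;
        self.pos += 1;
        Some(item)
    }
}

fn main() {
    let data = [1, 2, 3];
    let iter = MyIter { items: &data, pos: 0 };
    println!("{iter:?}");
    println!("{}", iter.sum::<u32>());
}
```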
The original elision lint (elided_lifetimes_in_paths) was deemed too strict and is allow-by-default. More nuanced lints were a long time coming, but have finally arrived. However, they are only warn-by-default. (They arrived mid-edition.)
Other
There was no change to struct definitions, no implicit lifetimes based on variable or field names, no automatically introducing ("binding") lifetimes just by naming them (thank goodness), and no linting against single-letter lifetime names.
When you elide lifetimes in a method that returns a borrow, the compiler simply assumes that the borrow has the lifetime of &self (which happens to be quite a common case). There's no actual type inference there, hence this does not compile:
struct Foo(u32);
impl Foo {
fn elision_happens_to_work_here(&self, _: &u32) -> &u32 {
&self.0
}
fn naive_elision_failes_here(&self, a: &u32) -> &u32 {
// do something using self
a
}
}
The compiler assumed that, since lifetimes are elided and both items are methods, the lifetime of the returned &u32 is the lifetime of &self. It did not perform any form of reasoning about which lifetime is the correct one, so treating lifetime elision as synonymous with inference does not sit right with me.
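For completeness, a sketch of how the second method can be made to compile: naming the lifetime on the argument and on the return type overrides the &self default. The method names `returns_field` and `returns_argument` are made up for this sketch.

```rust
struct Foo(u32);

impl Foo {
    // Elision: the result is tied to `&self`, which matches the body.
    fn returns_field(&self, _: &u32) -> &u32 {
        &self.0
    }

    // Explicit lifetime: the result is tied to `a`, not to `&self`.
    fn returns_argument<'a>(&self, a: &'a u32) -> &'a u32 {
        let _ = self.0; // stand-in for "do something using self"
        a
    }
}

fn main() {
    let outer = 7;
    let from_arg;
    {
        let foo = Foo(1);
        println!("{}", foo.returns_field(&outer));
        from_arg = foo.returns_argument(&outer);
    } // `foo` is gone here, but `from_arg` only borrows `outer`.
    println!("{from_arg}");
}
```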
This is what I mean by the "placeholder" functionality: we know that lifetime elision is possible in the case you've provided, so we don't have to use named lifetimes, and the compiler's guess (the only one possible) will be the right one. But we still need to annotate on a syntactic level that a borrow occurs, hence we'd use the anonymous lifetime as a placeholder (which is exactly the wildcard thing).
Thanks a lot for the breakdown! For some reason even after years in the Rust community I still find RFCs, tracking issues, drafts, working group repos [...] quite difficult to wrap my head around and more often than not I simply fold.
That's an interesting point. Wouldn't that be an unwarranted narrowing of the type parameter's value though? I don't know if that's what you meant, but I guess nothing stops one from devising a type inference scheme with such a rule, since this situation seems ill-defined in all sorts of ways.
It's impossible to provide an analogue for other type parameters, since no other type parameter can be omitted in function signatures, and functions form a boundary for type inference.
How so? You cannot infer a value for that type parameter without an actual usage. It would be similar to claiming that in fn foo<T>() { } T should be () or String; it all depends on the context of the usage, no?
It was a minor nit. I just meant there are different ways elision can be defined. The case was:
fn whatever(&self) -> Something<'_> { ... }
And you said:
we know that lifetime elision is possible in the case you've provided - thus we don't have to use named lifetimes, compiler's guess (the only one possible) will be the right one
But there is more than one way elision could be defined for this use case.
The '_ could be the same as the &self reference (what we have)
The '_ could be 'static
The '_ could be an independent lifetime variable
Elision could just not be defined for this case
More elaborate and silly possibilities like "the first lifetime parameter of Self if it has one" or "something from the trait parameters if present"
(Same goes for the cases where elision doesn't apply. We could make up rules for those, but haven't; it's less clear that there's an "obvious winner" for those signatures.)
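To illustrate the first three possibilities in the list, here is a sketch that spells each one out with explicit lifetimes. `Something`, `Holder` and the method names are hypothetical; only the first method matches what `fn whatever(&self) -> Something<'_>` means today.

```rust
// Hypothetical types for illustration.
struct Something<'a>(&'a str);

struct Holder {
    name: String,
}

impl Holder {
    // What elision actually does: same lifetime as the `&self` reference.
    fn tied_to_self<'s>(&'s self) -> Something<'s> {
        Something(&self.name)
    }

    // If `'_` meant `'static`, the body could not borrow from `self`.
    fn always_static(&self) -> Something<'static> {
        Something("constant")
    }

    // If `'_` were an independent lifetime variable, the caller would choose
    // `'x`, so the body could only return data valid for every `'x`.
    fn independent<'x>(&self) -> Something<'x> {
        Something("constant")
    }
}

fn main() {
    let holder = Holder { name: String::from("held") };
    println!("{}", holder.tied_to_self().0);
    println!("{}", holder.always_static().0);
    println!("{}", holder.independent().0);
}
```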
Also, the &self elision we have is not always right. If you have something like a zero-copy framework where you're SomeStruct<'source>, a lot of time you want