Hi all! I am still struggling with lifetimes, the single hardest concept in Rust. I feel fairly comfortable with lifetime annotations on functions, but not with lifetime annotations on structs containing references. Let's look at an example:
struct S<'a> {
    x: &'a i32,
    y: &'a i32
}
fn main() {
    let x = 10;
    let r;
    {
        let y = 20;
        {
            let s = S { x: &x, y: &y };
            r = s.x;
        }
    }
    println!("{}", r);
}
The program fails to compile and the compiler complains that "error[E0597]: y does not live long enough". If we replace the definition of the S struct with the following version:
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32
}
Then the program compiles fine.
I understand that a struct definition with lifetime annotations is like a contract, and I know the first contract is more restrictive than the second. But exactly what contracts do these two struct definitions provide (a detailed explanation would be welcome)? What is the difference between them?
Suggestions for reading material (the more detailed the better) would be highly appreciated.
Just as with algebra, the same letters/names mean the same thing. Thus, your first declaration requires that the two lifetimes be the same, while the second one allows different lifetimes.
In the first case, the "same lifetimes" constraint can only be satisfied if the compiler takes the intersection of both lifetimes, i.e. the shorter one. That, however, means that the struct, and any reference read out of it, isn't valid for long enough to be used outside the scope of y.
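As a minimal sketch of the second declaration at work: with two lifetime parameters, the reference taken from the x field keeps the longer lifetime and can be used after y is gone. The function name read_x_after_y_is_gone is a hypothetical helper introduced here for illustration, not something from the thread.

```rust
// Two independent lifetime parameters: x and y may be borrowed for
// different durations.
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32,
}

// Hypothetical helper (an assumption for illustration): mirrors the
// original main(), but returns the value instead of printing it.
fn read_x_after_y_is_gone() -> i32 {
    let x = 10;
    let r;
    {
        let y = 20;
        let s = S { x: &x, y: &y };
        let _y_value = *s.y; // y can only be read inside this block
        r = s.x; // type &'a i32: tied to the borrow of x alone
    }
    // y no longer exists here, but r never depended on its lifetime.
    *r
}
```

Calling read_x_after_y_is_gone() returns 10. With the single-parameter S<'a>, the same function body is rejected with E0597, as in the original post.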
Each lifetime corresponds to a location where the lifetime ends. (That is, the start of a lifetime is not important, only when it ends.)
Whenever you borrow a variable, that creates a lifetime representing the borrow. The lifetime must end before the value is destroyed or moved, and if the value is ever borrowed mutably, the lifetime must end before the next time it is mutably borrowed. (Notably, this implies that lifetimes are the duration of borrows, not the duration in which a variable exists.)
Any value whose type is annotated with a lifetime must not exist after the end of that lifetime.
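The point that a lifetime is the duration of a borrow, not of a variable, can be illustrated with a small sketch. The function name borrow_ends_at_last_use is hypothetical and only serves the example.

```rust
// Hypothetical example: the borrow held by r ends at r's last use,
// even though the variable r stays in scope until the end of the
// function.
fn borrow_ends_at_last_use() -> i32 {
    let mut v = 1;
    let r = &v; // a shared borrow of v starts here
    let seen = *r; // last use of r, so the borrow can end here
    v += 1; // accepted: no borrow of v is live any more
    seen + v
}
```

Here seen is 1 and v is 2 after the increment, so the function returns 3. Using r again after v += 1 would stretch the borrow across the mutation, and the compiler would reject it.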
The compiler will automatically shorten most lifetimes, e.g. the expression &x in your code actually has an implicit shorten-the-lifetime operation so that it looks like this:
let s = S {
    x: shorten_the_lifetime(&x),
    y: shorten_the_lifetime(&y),
};
Shortening the lifetimes lets them become equal when S has only one lifetime annotation, without requiring that x and y are borrowed for the same duration.
Other than that, the borrow checker essentially works by turning the above into a system of inequalities. If there is a solution to the system of inequalities, then it compiles, otherwise it doesn't.
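The implicit shortening can be written out by hand. The sketch below assumes a hypothetical shorten_the_lifetime function with an explicit 'long: 'short bound; the real compiler does this coercion without any function call, so treat it as an illustration only.

```rust
struct S<'a> {
    x: &'a i32,
    y: &'a i32,
}

// Hypothetical, explicit version of the implicit coercion: a reference
// valid for 'long may be used as one valid for the shorter 'short.
fn shorten_the_lifetime<'long: 'short, 'short>(r: &'long i32) -> &'short i32 {
    r
}

// Hypothetical helper: x and y are borrowed for different durations,
// yet both fit the single parameter 'a after shortening.
fn sum_fields() -> i32 {
    let x = 10;
    let total;
    {
        let y = 20;
        let s = S {
            x: shorten_the_lifetime(&x),
            y: shorten_the_lifetime(&y),
        };
        // Only plain i32 values leave the block, so no borrow escapes.
        total = *s.x + *s.y;
    }
    total
}
```

sum_fields() returns 30. The system of inequalities has a solution here because nothing of type &'a i32 is used after the inner block.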
The single-lifetime definition of S does not let the program compile, because it promises the Rust compiler that both x and y live for the same length of time (i.e. they share the same lifetime annotation 'a).
The Rust compiler will work out that x exists for the whole of the main() function, but y only exists inside your anonymous scope {} between let r; and println!. Therefore, 'a must refer to the shorter of these two lifetimes (the lifetime of y in the anonymous scope).
But since you need to access s.x outside this scope, the compiler will stop you by telling you that y doesn't live long enough. You can fix this problem by defining y outside the anonymous scope, which gives it a longer lifetime equivalent to x, or by moving the println! inside the anonymous scope which is covered by 'a. You can also allow x and y to have different lifetimes 'a and 'b.
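The first two fixes can be sketched while keeping the single-lifetime S<'a>. The function names fix_declare_y_outside and fix_use_r_inside are hypothetical, and they return values instead of calling println! so the result is easy to check.

```rust
struct S<'a> {
    x: &'a i32,
    y: &'a i32,
}

// Fix 1 (sketch): declare y next to x, so both borrows can last
// until r is read.
fn fix_declare_y_outside() -> i32 {
    let x = 10;
    let y = 20;
    let r;
    {
        let s = S { x: &x, y: &y };
        let _y_value = *s.y;
        r = s.x;
    }
    *r // y is still alive here, so 'a can cover this use
}

// Fix 2 (sketch): use the reference inside the scope where y lives.
fn fix_use_r_inside() -> i32 {
    let x = 10;
    let result;
    {
        let y = 20;
        let s = S { x: &x, y: &y };
        let r = s.x;
        result = *r + *s.y; // r is only used while y exists
    }
    result
}
```

fix_declare_y_outside() returns 10 and fix_use_r_inside() returns 30.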
You can read more about lifetimes in various Rust books:
struct S<'a> {
    x: &'a i32,
    y: &'a i32,
}
fn main() {
    let x = 10;
    let r;
    {
        let y = 20;
        {
            let s = S { x: &x, y: &y };
-           r = s.x;
+           r = &x;
        }
    }
    println!("{}", r);
}
This version compiles because the compiler implicitly converts a &'b i32 into a &'a i32 if 'b: 'a (i.e. if 'b outlives 'a). So the &x in S { x: &x, y: &y } can be shortened, and when you later set r = &x, it doesn't use the shorter-lived reference but a new, longer-lived one.
I generally try to avoid introducing too many lifetimes. Say if the struct S contains two (borrowed) coordinates x and y, then it's (in my opinion) okay to say they have the same lifetime. As you can see, my above Playground compiles with S<'a> (without 'b).
The only thing you need to be aware of is that when you rely on subtyping to shorten your lifetimes, you can't later "undo" the shortening. That's why, when you try to re-obtain &x by accessing s.x, you only get the shortened lifetime (which can't be used outside of the inner block).
Depending on the semantics of x and y in your example, I'd most probably use a single lifetime 'a for both references.
Thanks to H2CO3, alice, hax10, jbe! I have read your detailed explanations, which I can now only understand partially.
But just now an idea came to my mind: Are lifetime annotations for structs really necessary?
I understand lifetime annotations for functions are necessary, because they dramatically reduce the burden on the borrow checker. Without lifetime annotations, the borrow checker would need to look into both the calling function and the called function(s), and the logical structure would be far from linear. With lifetime annotations, the borrow checker only needs to do two tasks:
1. For every called function, verify that the implementation fulfills the lifetime relationship promised by its lifetime annotations.
2. For the calling function, verify that there will be no dangling pointers, provided that the called function(s) satisfy the lifetime relationship promised by their lifetime annotations.
These two tasks are independent of each other, so the borrow checker can be dramatically simplified.
But a struct definition does not provide any implementation. We could make do with a bare struct definition without any lifetime annotations (as in C/C++). It would then not be much harder to analyze directly the calling function that contains the struct construction (the statement let s = S { x: &x, y: &y } in my example), so task 1 would not be necessary, only task 2.
I am not a native English speaker, so forgive me for unidiomatic English. Hope you can understand me well.
Struct types can be used in function signatures, and there needs to be a way of referring to the lifetimes used inside the struct. The fields of a struct could be private and subject to change, so the lifetimes required must be specified as part of the definition of the struct, and they form part of its public interface.
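A minimal sketch of this point, assuming a hypothetical Pair type with private fields: callers never see the fields, yet the signatures of larger and larger_of can only be written because the lifetime parameter is part of the type's public interface.

```rust
// Hypothetical type for illustration: the fields are private, but the
// lifetime parameter 'a is visible to every user of the type.
pub struct Pair<'a> {
    x: &'a i32,
    y: &'a i32,
}

impl<'a> Pair<'a> {
    pub fn new(x: &'a i32, y: &'a i32) -> Self {
        Pair { x, y }
    }

    // The returned reference lives as long as 'a, not merely as long
    // as the borrow of self.
    pub fn larger(&self) -> &'a i32 {
        if self.x >= self.y { self.x } else { self.y }
    }
}

// A free function whose signature has to name the struct's lifetime.
pub fn larger_of<'a>(pair: &Pair<'a>) -> &'a i32 {
    pair.larger()
}
```

For example, with let a = 3; let b = 7; let p = Pair::new(&a, &b); the call larger_of(&p) yields a reference to 7. If the fields were later changed, say to a tuple of references, these signatures could stay the same.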
Yes, totally. They are part of the type just like field types are. They can't come out of nowhere.
This is like saying "types can be inferred, so we don't need them in struct fields". Then tell me, how is a struct definition without types, like this one, supposed to work?
struct Foo {
    field_1: _,
    field_2: _,
}
There's no context for them to be inferred, so they can't be inferred. The same is true for lifetimes.
You inspired me a lot. Structs and references to them can be function arguments and return values. Tuples and references to them can also appear directly in function signatures, and if they contain references, their lifetimes need to be specified. A struct, once defined, is denoted by a single name, but it is essentially the same as a tuple, so we cannot avoid giving lifetime specifiers for its references either. The only place to give them is the definition of the struct.
Thank you. I have learned a lot. But I still don't understand the exact meaning of lifetime annotations on structs. That will take time.
I also realized that in a function there is no need to use a struct as a local variable, unless that struct is to be the function's output, or needs to be used as an argument when calling another function.
For pure local variables, use separate variables, not structs, not even tuples.
Lifetime inference is not different from type inference. In fact, the compiler does both for local variables inside functions (type inference and lifetime inference are in fact the same thing). It could, if it wanted, infer the most general types and lifetimes for a function, just like languages from the ML family let you define functions without type-annotating the arguments and the return type. In fact, this is what it does for closures.
However, the Rust designers have deliberately enforced the requirement that every function is annotated with the argument types (which may include lifetimes) and return type (which may include lifetimes). This is done on purpose. It's easier for a human to understand a program if you don't have to look inside a function to immediately understand how to use it. It also means the error messages can be better, because if function types were inferred, when calling a function with an argument of a wrong type, it would not be possible to know whether the error was made on the call site or on the function definition site. And finally, this makes the interface of a library explicit. There is no risk of accidentally changing your interface without noticing (e.g., by forcing two lifetimes to be the same whereas they could previously be different) when you change the source of a function.
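The contrast between inferred closures and annotated functions can be seen in a short sketch. The names longer and closure_and_function are hypothetical and only serve the illustration.

```rust
// A function must spell out its argument and return types, including
// the lifetime that relates the inputs to the output.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// Hypothetical helper combining both styles.
fn closure_and_function() -> (i32, usize) {
    // A closure may omit every annotation: the parameter and return
    // types are inferred from how the closure is used below.
    let add_one = |n| n + 1;
    let word = longer("lifetime", "type");
    (add_one(41), word.len())
}
```

closure_and_function() returns (42, 8). A caller of longer can tell from the signature alone that the result borrows from both arguments, without reading the body.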
Others have already given good explanations, but here's how I personally think of it. I will use pseudocode:
struct S<'a> {
    x: &'a i32,
    y: &'a i32
}
fn main() {
    'a {
        let x = 10;
        let r : &'b i32;
        'b {
            let y = 20;
            {
                let s : S<'b> = S { x: &'a x, y: &'b y };
                r = s.x;
            }
        }
        println!("{}", r);
    }
}
The thing is, the language doesn't let us declare lifetimes explicitly. The reason for that is that control flow statements like break and things like non-lexical lifetimes would make inventing a syntax for local lifetimes very hard, since lifetimes do not strictly correspond to lexical regions. But in this case, the lifetimes are simple, so let's imagine we can write them as 'a { ... }.
When the compiler looks at the function definition, it notices that x is defined for the duration of the scope annotated as 'a[1]. This creates an anonymous phantom type called 'a. Just like there is no Vec type but only a Vec<X> type for a certain X, there is no &T type but only a &'a T type, for 'a a certain "lifetime type". The same goes for 'b. When you construct S, it works because 'b is a supertype of 'a, just like a class B inheriting from A is a subtype of A in OOP languages: since A is a supertype of B, you can convert a B to an A, or if you prefer, reinterpret a B as an A. In the same way, a &'a i32 can be used where a &'b i32 is expected. In Rust, we don't have classes; the only source of subtyping is lifetimes.
Bottom line: if you understand types, you almost understand lifetimes. Mainly, lifetimes are phantom types. A struct field needs a type so that the compiler can understand what kinds of operations can be done with it. Lifetimes are no different: you can do more things with a value of type struct S<'a> { x: &'a i32, y: &'a i32 } than with one of type struct S<'a, 'b> { x: &'a i32, y: &'b i32 }, just like you can do more things with a struct S<T> { x: T, y: T } than with a struct S<T, U> { x: T, y: U }. If you leave out the ' characters, you can exercise all the intuition you have from types with lifetimes. And just like with all types, the compiler could in theory let you leave out lifetimes in structs, make each struct be defined with the most general lifetimes possible, insert implicit lifetime constraints on function arguments, and check them at the call site. But it would make for confusing error messages and difficult-to-understand interfaces. The only aspect of lifetimes that this doesn't explain is how local lifetimes are assigned (the borrow checker), but that is not the question for lifetimes in structs.
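One concrete instance of "you can do more things" with a single parameter is swapping the two fields. The sketch below uses a hypothetical struct named Same and a hypothetical helper swapped; with two separate lifetime parameters and no bound relating them, the same swap would be rejected, because &'a i32 and &'b i32 are different types.

```rust
// Hypothetical single-lifetime struct for illustration.
struct Same<'a> {
    x: &'a i32,
    y: &'a i32,
}

impl<'a> Same<'a> {
    // Both fields have the type &'a i32, so they can be exchanged,
    // exactly as in a generic struct S<T> { x: T, y: T }.
    fn swap_fields(&mut self) {
        std::mem::swap(&mut self.x, &mut self.y);
    }

    fn values(&self) -> (i32, i32) {
        (*self.x, *self.y)
    }
}

// Hypothetical helper showing the swap in use.
fn swapped() -> (i32, i32) {
    let a = 1;
    let b = 2;
    let mut s = Same { x: &a, y: &b };
    s.swap_fields();
    s.values()
}
```

swapped() returns (2, 1). The price for this extra capability is the one discussed throughout the thread: both borrows are treated as lasting equally long.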
[1] This is only valid because we're in a simplistic case; as someone already said, the lifetime of a reference is the lifetime of a borrow, since mutable borrows must be exclusive.