Understanding lifetimes


#1

Can someone enlighten me about lifetimes? I don’t quite get why we sometimes need to write them out explicitly.

Question 1

We need to write out lifetimes in structs that contain references:

struct Foo<'a> {
    x: &'a i32,
}

fn main() {
    let x;                    // -+ x goes into scope
                              //  |
    {                         //  |
        let y = &5;           // ---+ y goes into scope
        let f = Foo { x: y }; // ---+ f goes into scope
        x = &f.x;             //  | | error here
    }                         // ---+ f and y go out of scope
                              //  |
    println!("{}", x);        //  |
}  

However, even if the lifetime were not explicitly defined, I would expect the same result. f gets destroyed and thus x becomes invalid. What good does it do to explicitly write out the lifetime in the struct Foo?

Question 2

How is it useful to have two named lifetimes in a function? e.g.

fn bar<'a, 'b>(x: &'a i32, y: &'b i32) -> ();

What difference does it make that the lifetimes are named uniquely? They both will be valid in the scope of the function anyway (they will have the same scope). Can someone give me an example where uniquely named lifetimes are necessary? What are they good for?

Question 3

This is somewhat of a bonus question. On the same page I also found this:

struct Foo<'a> {
    x: &'a i32,
}

impl<'a> Foo<'a> {
    fn x(&self) -> &'a i32 { self.x }
}

fn main() {
    let y = &5; // this is the same as `let _y = 5; let y = &_y;`
    let f = Foo { x: y };

    println!("x is: {}", f.x());
}

I am referring to the function in the implementation. What is the reason that I have to write:

fn x(&self)

the &self confuses me. Is this just a convention? I mean why can’t I write

fn x()

instead? After all the function call is

f.x() // without parameter

What happens if I have a function definition in the implementation like this

fn x(y: i32, &self, z: i32)

Is this illegal, or can I still call the function like

f.x(2,3) // where y=2 and z=3


#2

Maybe I can help!

Answer 1

There are only 1.5 places where you can omit lifetimes:

  1. Inside function bodies. In fact, you wouldn’t be able to write them out even if you wanted to.
    1.5. In function signatures, in basic cases (known as “lifetime elision”).

Everywhere else you need to explicitly annotate lifetimes. That’s just the rule.

I think you’re looking at it the wrong way. It’s not that you need to explicitly annotate them to convey some additional meaning. You always have to, except when it’s done implicitly with lifetime elision (but the elided lifetimes are still very easy to write out explicitly: fn foo(s: &str) -> &str means fn foo<'a>(s: &'a str) -> &'a str). Lifetime elision doesn’t add or subtract any meaning, it just shortens some basic common cases a bit.
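To make the elision point concrete, here is a minimal sketch. The function names first_word and first_word_explicit are made up for illustration; the two signatures should mean exactly the same thing to the compiler.

```rust
// Elided form: the compiler fills in the lifetime using the elision rules.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Explicit form: the same signature with the lifetime written out.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = String::from("hello world");
    // Both versions borrow from `text` and return the same slice.
    assert_eq!(first_word(&text), first_word_explicit(&text));
    println!("{}", first_word(&text));
}
```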

Now, it might make sense to extend lifetime elision rules to allow eliding lifetimes in simple structs. I’m quite sure it was a deliberate decision to disallow this; it’s up to someone more qualified than me to tell you why.

Answer 2

Consider this sample code (from my lifetimes explanation):

fn search<'a, 'b>(needle: &'a str, haystack: &'b str) -> Option<&'b str> {
    // imagine some clever algorithm here
    // that returns a slice of the original string
    let len = needle.len();
    if haystack.chars().nth(0) == needle.chars().nth(0) {
        Some(&haystack[..len])
    } else if haystack.chars().nth(1) == needle.chars().nth(0) {
        Some(&haystack[1..len+1])
    } else {
        None
    }
}

fn main() {
    let haystack = "hello little girl";
    let res;
    {
        let needle = String::from("ello");
        res = search(&needle, haystack);
    }
    match res {
        Some(x) => println!("found {}", x),
        None => println!("nothing found")
    }
    // outputs "found ello"
}

Notice that needle only needs to be in scope while the function itself is executed. haystack, on the other hand, stays borrowed for as long as we use res. You can find some more examples in my post (link above).

Answer 3

foo.bar(args) is “just” a syntactic sugar for Foo::bar(foo, args). Now,

impl Foo {
    fn bar(&self, x: i32) {}
}

is also “just” a syntactic sugar for

impl Foo {
    fn bar(_self: &Foo, x: i32) {}
}

This “omit type, put references in front” style is only applicable to self, and it must be the first argument.
Also notice that you can use the “method notation” foo.bar(args) only if Foo::bar was declared with &self (or self, or &mut self). On the other hand, you can always call it as a “static method” (known as Universal Function Call Syntax): Foo::bar(&foo, args), no matter which way (of these two) it was declared.
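A small sketch of the two call styles side by side. Counter and add are hypothetical names chosen for this example, not anything from the standard library.

```rust
struct Counter {
    value: i32,
}

impl Counter {
    // `&self` is shorthand for `self: &Counter`.
    fn add(&self, x: i32) -> i32 {
        self.value + x
    }
}

fn main() {
    let c = Counter { value: 40 };
    // Method notation: `c` is borrowed automatically.
    let via_method = c.add(2);
    // Path notation: the borrow is written out by hand.
    let via_path = Counter::add(&c, 2);
    assert_eq!(via_method, via_path);
    println!("{}", via_method);
}
```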

If you were to declare it as just fn bar() without &self, you wouldn’t be able to call it like foo.bar(), only like Foo::bar() (without passing any instance of Foo), and the function obviously wouldn’t be able to access self. This is essentially how you define a static method in Rust. For example, Box::new() is declared this way:

impl<T> Box<T> {
    fn new(t: T) -> Box<T> {
        // some magic here
    }
    ...
}
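The same pattern works for your own types. This is only a sketch with a made-up Point type: origin has no self parameter and is called through the type, while sum takes &self and is called on an instance.

```rust
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // No `self` parameter: an associated function, called as `Point::origin()`.
    fn origin() -> Point {
        Point { x: 0, y: 0 }
    }

    // Takes `&self`: a method, callable as `p.sum()`.
    fn sum(&self) -> i32 {
        self.x + self.y
    }
}

fn main() {
    let p = Point::origin();
    // `p.origin()` would not compile: `origin` has no `self` parameter.
    println!("{}", p.sum());
}
```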

#3

It’s an important distinction that lifetime parameters are not inferred, they are elided. This means that if they are omitted, they are assumed to follow some specific pattern, rather than the compiler attempting to determine what your intent actually was.

Lifetimes are not elided in structs, even though the rule to do so would be quite simple (i.e. just assume they’re all the same lifetime), because the lifetime parameter provides important documentation that this struct contains a reference. We want to be able to tell that, because e.g. String doesn’t have a visible lifetime parameter, String must be a fully owned type.
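As a rough illustration of the documentation argument, compare two made-up structs (Borrowed, Owned, and make_owned are names invented for this sketch). The lifetime parameter on the first one tells a reader at a glance that it holds a reference.

```rust
// Borrows its text: the lifetime parameter makes that visible in the type.
struct Borrowed<'a> {
    text: &'a str,
}

// Owns its text: no lifetime parameter needed.
struct Owned {
    text: String,
}

// Returning `Owned` from a function is fine: it carries its data with it.
fn make_owned() -> Owned {
    let s = String::from("owned");
    Owned { text: s }
}

fn main() {
    let source = String::from("borrowed");
    // `b` must not outlive `source`, and the `'a` in its type says so.
    let b = Borrowed { text: &source };
    let o = make_owned();
    println!("{} {}", b.text, o.text);
}
```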


#4

Thanks for the answers. I would like to check if I got it right (still somewhat confused here)

To Question 1

So, lifetimes are basically always required. However, thanks to some syntactic sugar they may be elided in some cases, depending on whether Rust wants to be explicit about it.

To Question 2

If I were to write the function like this:

fn search<'a, 'b>(needle: &'a str, haystack: &'b str) -> Option<&'a str> {}

Changing the lifetime of the return value to 'a, then res would be invalid after the scope of the {}. Meaning that the match would give us an error because we would want to access a freed resource?

To Question 3

I think I get it. However, it’s still strange because bar is an implementation of Foo. So, there should be a keyword self that always refers to the object Foo, even if I didn’t pass it to the function with a parameter. Given that it is an implementation of Foo we would know what self is referring to. Am I correct in assuming this has not been done so that we have the possibility to declare static functions?

In that case I would have preferred some sort of declaration for static functions that then disallow the use of self. E.g.

impl Foo {
    fn bar(x: i32) -> () {
        self.x = x;    // this should work
    }

    [static]
    fn double(x: i32) -> i32 {
        2 * x  // this should work because self is not used. It could be made static
    }

    [static]
    fn baz(x: i32) -> i32 {
        self.x * x  // this would panic
    }
}

I think that would have been more clear and easier to understand. The distinction of self, &self and &mut self would then not be necessary either. I am probably thinking too much in terms of OOP…


#5

You wouldn’t be able to implement the search function in the first place, because you want to return a slice of the haystack. Instead you would have to return a slice of the needle which is not really useful.

The crucial point is that you couple the lifetime of the haystack and the result, because the result is borrowed from the haystack.
This means for the caller, that the result must not outlive the haystack.
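Here is a sketch of a search that really does return a slice of the haystack. It uses str::find as a stand-in for the “clever algorithm”, which is my assumption and not what the earlier post implemented. Because the returned slice comes from haystack, the result has to carry 'b; changing the return type to Option<&'a str> should be rejected by the compiler when it checks the function body.

```rust
fn search<'a, 'b>(needle: &'a str, haystack: &'b str) -> Option<&'b str> {
    // `str::find` returns the byte index of the first match, if any.
    let start = haystack.find(needle)?;
    // The slice is taken from `haystack`, so it lives as long as `'b`.
    Some(&haystack[start..start + needle.len()])
}

fn main() {
    let haystack = String::from("hello little girl");
    let res;
    {
        let needle = String::from("little");
        res = search(&needle, &haystack);
    } // `needle` is dropped here; `res` only borrows from `haystack`
    assert_eq!(res, Some("little"));
}
```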


#6

Ok, I know this example won’t make much sense (since nothing I do really compiles, I had to come up with something functional that illustrates my confusion :blush: ) It should however illustrate what I mean:

fn test_a<'a, 'b>(one: &'a str, two: &'b str) -> Option<&'a str> {
    Some("Lifetime of a")
}

fn test_b<'a, 'b>(one: &'a str, two: &'b str) -> Option<&'b str> {
    Some("Lifetime of b")
}

fn main() {
    let x = "foo";
    let res_a;
    let res_b;

    {
        let y = "bar";
        res_a = test_a(&x, &y);
        res_b = test_b(&x, &y);
    }

    println!("res_a: {}", res_a.unwrap());
    println!("res_b: {}", res_b.unwrap());
}

This prints:
res_a: Lifetime of a
res_b: Lifetime of b

x is the lifetime 'a and y is the lifetime 'b
So, I would expect that
println!("res_b: {}", res_b.unwrap());
should have given an error because res_b outlives the lifetime 'b when it shouldn’t.


#7

The distinction between self, &self, and &mut self is extremely necessary. A method with self moves the object into the method, whereas &self and &mut self borrow the object immutably or mutably. This is a huge difference, so you need to specify which you want. Explicit self is a very elegant solution to this problem in my opinion (and Rust isn’t the only language with methods like that, see Python for example).
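A minimal sketch of the three receiver kinds on a made-up Buffer type (the type and method names are only for illustration):

```rust
struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    // Borrows immutably: the caller keeps the buffer and can keep reading it.
    fn len(&self) -> usize {
        self.data.len()
    }

    // Borrows mutably: the caller keeps the buffer, but it may be changed.
    fn push(&mut self, byte: u8) {
        self.data.push(byte);
    }

    // Takes ownership: the buffer is moved in and cannot be used afterwards.
    fn into_vec(self) -> Vec<u8> {
        self.data
    }
}

fn main() {
    let mut buf = Buffer { data: vec![1, 2] };
    buf.push(3);
    assert_eq!(buf.len(), 3);
    let v = buf.into_vec();
    // Calling `buf.len()` here would not compile: `buf` was moved by `into_vec`.
    assert_eq!(v, vec![1, 2, 3]);
}
```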

EDIT: I should clarify also about some important sugar in method syntax:

struct Foo;

impl Foo {
    fn bar(&self) { }
}

let foo = Foo;
foo.bar();

The method call will automatically take foo as an &Foo, even though foo is a Foo. It can be reduced like this: foo.bar() -> (&foo).bar() -> Foo::bar(&foo). How would it know to do that if you weren’t explicit?


#8

Yes, that’s why I mentioned that I probably think too much like OOP.
If

struct Foo;

impl Foo {
    fn bar(&self) { }
}

would translate to C++ like

class Foo
{
public:
    void bar() { }
};

then the self would be like the this in C++. As that is not the case the distinction between self, &self and &mut self is of course important.

I’m not quite at impl and traits yet. Still reading the documentation. For now I am actually more curious about what is going on with my second question about lifetimes.


#9

This actually has to do with string literal magic. Both x and y are &'static str, meaning the str lives for the lifetime of the program (because string literals are stored in read-only data). So you’re passing &'x &'static str values to the function, which get dereferenced to &'static str (making both 'a and 'b the 'static lifetime). If you make them both owned Strings, you get the expected error.
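A sketch of the same test with owned Strings. I changed the function bodies to return their arguments so the lifetimes actually matter; that is my modification, not the original test. With this version, res_b can only be used while y is alive. Moving the res_b declaration out of the block and using it after the block should be rejected by the borrow checker because y does not live long enough.

```rust
fn test_a<'a, 'b>(one: &'a str, _two: &'b str) -> Option<&'a str> {
    Some(one)
}

fn test_b<'a, 'b>(_one: &'a str, two: &'b str) -> Option<&'b str> {
    Some(two)
}

fn main() {
    let x = String::from("foo");
    let res_a;
    {
        let y = String::from("bar");
        res_a = test_a(&x, &y);
        let res_b = test_b(&x, &y);
        // `res_b` borrows from `y`, so it has to be used inside this block.
        assert_eq!(res_b, Some("bar"));
    } // `y` is dropped here
    // `res_a` borrows only from `x`, which is still alive.
    assert_eq!(res_a, Some("foo"));
}
```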


#10

Argh, I used the worst possible type to make my test :smile:

That makes sense. I think I get the point now. Knowing this everything behaves as expected. Thanks for the clarification.


#11

I don’t completely understand what you mean here. This is indeed how the Rust example translates to C++, barring const and private:

class Foo
{
private:
    void bar() const { }
};

And self is indeed like (explicit) this. You see, C++ went with implicit this, but it’s still passed just like in Rust under the hood (“thiscall”). The downside is the rather strange syntax for specifying qualifiers that apply to this after the argument list:

void bar() const volatile && {}

(By the way, this is still a pointer, not a reference, but it’s denoted as an rvalue reference here).

You do not need a special #[static] thing. You need to mark static methods in C++ as such precisely because this is implicit, and static is an opt-out (I don’t want an implicit pointer to the instance passed, and I also want the method to be callable without an instance at all). In Rust, if you need the instance, you pass it explicitly (by value (moving), by immutable reference, or by mutable reference; the distinction is as important here as it is in the rest of Rust). If you don’t need it, you don’t pass it; it’s as simple as that. If you do, there is a convenient “instance-dot-method” syntax for the actual call that references and dereferences the object automatically.

Rust is quite OOP here, it’s just using the explicit notation (like Python, unlike C++/Java).

You’re thinking too much like C++ here :wink:
You’d better assume all methods are static by default (thus there’s no instance passed), and if you want one, you ask for it with self.


#12

Thanks for the clarification. My argument essentially boiled down to implicit vs. explicit, only I didn’t explain it very well. You added some extra input that helps me think about how Rust handles this.