[Solved, Newbie question] About lifetimes

Hi there !

I have some questions about lifetime syntax. I understand the need for and the importance of this feature in Rust, but I really don't understand what I'm writing when I use lifetimes in my Rust code. I've read the part of the book about lifetimes, but it's not enough for me to understand :cry:

If I understand correctly: each variable definition comes with its own lifetime, which generally depends on the scope of that variable. At first sight, the compiler can handle lifetimes on its own when we are not using references in a function (for example), because not using references means that we give ownership of the resources to the function arguments.

But what is the default behaviour? For example:

fn function1(b: bool, s: usize) -> u32
{ ... }

1/ Can we use explicit lifetime annotations for this? Is it useful?

Another question, about this code:

fn function2<'a>(b: &'a bool) -> &'a bool

2/ What are we doing here? If I understood correctly, our annotations don't change the real lifetimes associated with the variables; they are just annotations "for the compiler's understanding". But which of these 'a are definitions? Which of them carry necessary information? We have three 'a, and I assume they have different meanings. What are they?

So, I'm totally lost with the lifetime annotation syntax. If someone could clarify it, I would be grateful. :slight_smile:

Solution
The whole conversation is worth reading for anyone who doesn't understand lifetimes. Thanks to alice! :slight_smile:

Here are a few thoughts of mine:

The way I think of it, lifetimes are not "the lifetime of a value", but instead they are used to describe a region in which a value is marked borrowed, and this region may very well be (and usually is) shorter than the "lifetime of the value".

The basic idea is that when you borrow a value (by creating a reference to it), that creates a new lifetime, and the resulting reference is annotated with that lifetime. The compiler now uses ordinary type inference (with the assumption that &'a u32 and &'b u32 are different types!) to figure out which lifetime every reference in the function should have.

Once it has figured out every line where a lifetime is used, it figures out what the smallest region of code is that contains every use of the lifetime, and to ensure that the reference stays valid in this region, it checks that the value you created a reference to is not accessed in any bad way during this region.
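To make the idea of a borrow region concrete, here is a minimal sketch (the function name push_after_borrow and the vector are made up for illustration). The region of the borrow stored in first only has to cover its uses, so the later push is accepted:

```rust
// Hypothetical helper, used only to illustrate a borrow region.
fn push_after_borrow() -> Vec<i32> {
    let mut numbers = vec![1, 2, 3];

    // Borrowing creates a lifetime; its region starts here.
    let first = &numbers[0];

    // Last use of `first`, so the region can end here.
    println!("first = {}", first);

    // Allowed: `numbers` is no longer marked as borrowed.
    numbers.push(4);

    // Uncommenting the next line would stretch the region over the
    // `push` above, and the compiler should then reject the code
    // (error E0502: mutable borrow while immutably borrowed).
    // println!("first = {}", first);

    numbers
}

fn main() {
    let numbers = push_after_borrow();
    println!("{:?}", numbers);
}
```

The value numbers lives until the end of the function, but the region in which it is marked borrowed is only the two lines between the borrow and the first println!.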

Regarding this type inference, if a function calls your function2, it knows that your function2 will return a new reference with the same lifetime annotation as the reference it was given as an argument, and by looking at those lifetimes, the compiler knows that the pointed-to value must also stay valid during any uses of the returned reference. Note that when type-checking the insides of function2, the compiler simply assumes that 'a is some lifetime whose region is larger than the entirety of the body of function2.

So the <'a> on your function2 simply tells the compiler that this return value stores a pointer into the memory of the argument. When you have multiple arguments that are all references, you can use this to tell the compiler which argument the returned reference points into.

There is nothing to annotate with a lifetime on your function1, as there are no references involved in the arguments or the return value.

If you have any examples you are confused about, I'll be glad to walk you through how my idea of lifetimes works on that example.


To add to @alice's explanation from my own experience: I used to wrongly assume that the <'a> on a function or a struct meant "lifetime of this function" or "lifetime of this struct". That is totally wrong because, as alice said, it only applies to references.


Try taking a look at this example:

// notice, no `'a` on snd
fn return_first<'a>(fst: &'a str, snd: &str) -> &'a str {
    fst
}

fn main() {
    let foo = "foo".to_string();
    let bar = "bar".to_string();
    
    // this creates a lifetime
    let ref_to_foo: &str = &foo;
    
    // this also creates a lifetime
    let ref_to_bar: &str = &bar;
    
    // Because of our lifetimes on `return_first`, the
    // compiler annotates `also_ref_to_foo` with the same
    // lifetime as the one on `ref_to_foo`.
    let also_ref_to_foo = return_first(ref_to_foo, ref_to_bar);
    
    // Destroy bar.
    drop(bar);
    
    // This would not compile because the region of the
    // lifetime on `ref_to_bar` would have to include the drop
    // above.
    // println!("{}", ref_to_bar);
    
    // But this is fine, because drop(bar) does not touch foo.
    println!("{}", ref_to_foo);
    
    // And this is also fine. The lifetime on `also_ref_to_foo`
    // is only tied to the variable `foo`, and not to `bar`, even
    // though `ref_to_bar` was involved when creating `also_ref_to_foo`.
    println!("{}", also_ref_to_foo);
    
    // Destroy foo.
    drop(foo);
    
    // These would both fail now, as the region would include 
    // the drop above.
    // println!("{}", ref_to_foo);
    // println!("{}", also_ref_to_foo);
}


This uses some of the reasoning I outlined in the post above. After reading it, try changing return_first to the following (notice that I added 'a to snd):

fn return_first<'a>(fst: &'a str, snd: &'a str) -> &'a str {
    fst
}
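As a hedged sketch of what that change does (the parameter is renamed to _snd here only to silence the unused-variable warning): with 'a on both arguments, the returned reference is tied to both foo and bar, so it can no longer be used after bar is dropped.

```rust
// Both arguments and the return value now share the lifetime 'a.
fn return_first<'a>(fst: &'a str, _snd: &'a str) -> &'a str {
    fst
}

fn main() {
    let foo = "foo".to_string();
    let bar = "bar".to_string();

    // The compiler must pick one region 'a that fits inside both borrows.
    let also_ref_to_foo = return_first(&foo, &bar);

    // Fine: both `foo` and `bar` are still alive.
    println!("{}", also_ref_to_foo);

    // Destroy bar.
    drop(bar);

    // With 'a on `snd`, this line would no longer compile (E0505),
    // even though the returned reference only ever points into `foo`.
    // println!("{}", also_ref_to_foo);
}
```

The compiler only looks at the signature of return_first, not at its body, which is why the extra 'a makes the caller more restricted.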

Thanks for your replies! They have already cleared up some of my confused thoughts about lifetimes.

But I'm still confused, so I'm about to test the code alice wrote to explain this to me (thank you!). But before that... what is the difference between these pieces of code, and are they valid?

fn return_first<'a>(fst: &'a str, snd: &'a str) -> &'a str { fst }
fn return_first<'a,'b>(fst: &'a str, snd: &'b str) -> &'a str { fst }
fn return_first<'a,'b>(fst: &'a str, snd: &'a str) -> &'a str { fst }
fn return_first<'a,'b>(fst: &'a str, snd: &'a str) -> &'b str { fst }

And does the "<'a,'b>" bracket only mean "Hi compiler! I'm going to use lifetime annotations. Here are the names I'm going to use"? (So the bracket is just a definition?)

[Now I'm going to play with Alice's code; I'll come back here afterwards :3 ]

Yes :slight_smile:

Sorta. You have to use them on the arguments or return values. It doesn't work if you only use the lifetime on types inside the body of the function. In some sense, it means that your code must be valid for any possible choice of lifetimes in the brackets.

To explain what your four examples mean, here they are:

  1. In this case, it means that the return value may contain a pointer into either argument, or maybe even both arguments at the same time. (A &str can't point into both at once, but other return types could.)
  2. In this case, it means that the return value can only point into the first argument. Note that this is the same as the one in my example; I just elided the 'b in the example.
  3. This is the same as your first case.
  4. In this case, the return value is unrelated to the two arguments. The only possible implementations would return pointers into an immutable global or into leaked memory.

Note that the fourth case highlights the "any possible choice" point above. The code must be valid for every choice of 'b, including 'b = 'static, so the returned reference must remain valid forever.
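Here is a small sketch of cases 1 and 4 with bodies that do compile. The names either and unrelated, and the extra bool parameter, are made up for illustration and are not part of the original examples.

```rust
// Case 1: the result may point into either argument, so both
// arguments must stay valid for as long as the result is used.
fn either<'a>(fst: &'a str, snd: &'a str, pick_first: bool) -> &'a str {
    if pick_first { fst } else { snd }
}

// Case 4: the result is unrelated to the arguments. Returning `fst`
// here would not compile; a string literal works because it is a
// `&'static str`, which is valid for every possible choice of 'b.
fn unrelated<'a, 'b>(_fst: &'a str, _snd: &'a str) -> &'b str {
    "a string literal"
}

fn main() {
    let foo = "foo".to_string();
    let bar = "bar".to_string();

    println!("{}", either(&foo, &bar, false));

    let forever = unrelated(&foo, &bar);
    drop(foo);
    drop(bar);

    // Fine: `forever` borrows from neither `foo` nor `bar`.
    println!("{}", forever);
}
```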


[After reading Alice's code]

Ok, I think I understand better thanks to this code. Thanks a lot !

So a reference type is a type composed of:

  • A lifetime (generally inferred by the compiler)
  • The type pointed to

But what about a reference used as the return value of a function? For example,

fn newreference<'a>(refargs: &'a onetype) -> &'a othertype { ... }

By writing this, I tell the compiler that the lifetime of the new reference is the same as that of the refargs reference. But the new reference was defined in the function. So the ownership always stays in the function if the reference is returned (and not the resource itself). So there is no "borrowed" value, because a borrow assumes there exists an owner, no?

An example of that would be:

struct HasString {
    field: String,
}

fn newreference<'a>(refargs: &'a HasString) -> &'a str {
    &refargs.field
}

In this case, newreference does not have ownership of the HasString as it was only given a reference to it, but it returns a reference that points into the HasString, thus whoever calls newreference may not destroy the HasString while the returned &str exists.

fn main() {
    let has_string = HasString {
        field: "foo".to_string()
    };
    
    let ref_to_field = newreference(&has_string);
    
    // This is ok, the HasString is still valid.
    println!("{}", ref_to_field);
    
    drop(has_string);
    
    // But this will fail:
    // println!("{}", ref_to_field);
}


The owner of the HasString does not have to be visible from inside newreference.


Note that the fourth case highlights the "any possible choice" point above. The code must be valid for every choice of 'b, including 'b = 'static, so the returned reference must remain valid forever.

Imagine I want to trick my compiler and return a reference that doesn't remain valid forever. Normally that would show up in the code of the function, so Rust would read it and refuse the code? (I imagine there is tricky code that Rust doesn't see through, but I'll have fun with those cases when I'm more fluent in Rust.)

I'm not sure what you are asking here. If the compiler is not convinced that your references are valid, the code will not compile.


It's not really important. I was asking whether it is possible to "bypass" the borrow checker with some tricky code, and to write code (for the function newreference) that creates a reference which doesn't remain valid for the rest of the code.

So, about the struct lifetime annotations. When I write:

struct HasReference<'a> {
    field: &'a onetype,
}

do I mean: "For every lifetime of a reference pointing to a "HasReference" structure, the reference in field will be available"?

What about lifetimes with impl? Why do we need to annotate the whole impl block with impl<'a>? Don't we already annotate the functions themselves?

I think this is just not possible as the whole point of the borrow checker is to avoid this case :slight_smile:


Hm, that makes sense ! :slight_smile: Definitely falling in love with this language <3

Recall that I said that HasReference<'a> is a different type to HasReference<'b>. A HasReference<'a> has a field of type &'a onetype, whereas a HasReference<'b> has a field of type &'b onetype.

You can't put a &'b onetype into a variable of type HasReference<'a>, because the type must match.
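As a rough illustration of what the annotation on the struct buys you (the variable names owner and holder are made up for this sketch): a HasReference<'a> is restricted to the region 'a of the borrow stored inside it, just like a plain reference would be.

```rust
struct HasReference<'a> {
    field: &'a str,
}

fn main() {
    let owner = String::from("hello");

    // `holder` has type HasReference<'a>, where 'a is the lifetime
    // created by borrowing `owner` on this line.
    let holder = HasReference { field: &owner };

    // Fine: `owner` is still alive.
    println!("{}", holder.field);

    drop(owner);

    // This would not compile: the region of 'a would have to include
    // the drop above.
    // println!("{}", holder.field);
}
```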

As for the impl block, let's consider this example:

struct HasReference<'a> {
    field: &'a str,
}

impl<'a> HasReference<'a> {
    fn new(s: &'a str) -> Self {
        HasReference {
            field: s,
        }
    }
}


In this case, the type HasReference<'a> has a function called new that takes a &'a str and returns a HasReference<'a>. Similarly, there is also the type HasReference<'b>, which also has a method called new, but this other new instead takes an argument of type &'b str and returns a HasReference<'b>.

You can think of impl<...> meaning "duplicate this impl block for every choice of stuff inside the brackets". The same applies to functions, where the function is duplicated for every choice of stuff inside the generic parameters.
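A short usage sketch of that "one new per lifetime" idea, assuming the HasReference definition from the example above (the variable names are illustrative). The same new is called twice, and each call gets its own choice of 'a:

```rust
struct HasReference<'a> {
    field: &'a str,
}

impl<'a> HasReference<'a> {
    fn new(s: &'a str) -> Self {
        HasReference { field: s }
    }
}

fn main() {
    // A string literal is a &'static str, so here 'a can be 'static.
    let from_literal = HasReference::new("literal");

    // Here 'a is the (much shorter) lifetime of the borrow of `owned`.
    let owned = String::from("owned");
    let from_local = HasReference::new(&owned);

    println!("{} {}", from_literal.field, from_local.field);
}
```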

You may sometimes see this:

impl<'a> HasReference<'a> {
    fn new<'b>(s: &'b str) -> HasReference<'b> {
        HasReference {
            field: s,
        }
    }
}


But this is a bit weird, because now the HasReference<'a> type has many new functions. There is a HasReference::<'a>::new::<'a> and a HasReference::<'a>::new::<'b> and so on for every combination of two lifetimes.

Note that with the pair of lifetimes case, returning Self would not work, as Self always refers to the type on the impl block, which is HasReference<'a> in this case, so you get a type mismatch when trying to return a HasReference<'b> from the function.


Regarding tricking the compiler, the type system normally makes it impossible to write code that compiles but does something invalid with references. It may sometimes reject code that would be valid, but that is much better than accepting invalid code. That said, it is possible to circumvent this with unsafe code.


Oh! I didn't know the "Self" keyword (I haven't read the whole book yet)! :slight_smile:

OK, I think I understand. I'll try to summarize everything in order to be sure:

  • Each reference type is in fact a type built from two components: one is a lifetime parameter, and the other is the type the reference points to. In addition, lifetimes only concern references (and pointers).

  • Generally, the compiler can infer the lifetimes and the links between them when everything is in the same piece of code. But when a new reference comes from a function, the compiler needs our help to establish the lifetimes.

  • fn newreference<'a,'b>(arg1: &'a onetype, arg2: &'b othertype) -> &'a otherothertype { ... }
    

    is a function annotated with lifetime annotations. The part "<'a,'b>" only declares the lifetime names we are going to use. Lifetime annotations only apply to references (as said first), and here they mean: for every lifetime of the parameters given to the function (which we call 'a for the first argument and 'b for the second), the result has a lifetime longer than that of arg1. This is used to indicate that the result relies on the first reference argument, so the first argument must be valid during the whole life of the result.

  • For

    struct hasReference<'a> { field: &'a onetype, }
    

    means that a reference to a struct hasReference is valid only as long as the reference stored in its field is valid.

  • hasReference<'a> and hasReference<'b> are two different types!

  • Using impl<'a> hasReference<'a>, we are telling the compiler that we are going to use the lifetime annotation 'a (declared by impl<'a>) to implement the type hasReference<'a>.

Is my summary right? (I've written it so it can be pinned as the solution, for other newbies who might have the same questions.)

I think the gist of it is right. There are some inaccuracies that may just be the wording.

Here you talk about "a reference to a struct hasReference", but we should really just be talking about the hasReference directly. Of course, you can have a &'b hasReference<'a> too, and here we must have 'b ≤ 'a (written 'a: 'b in Rust), but I think we would usually talk about the struct directly.

Well, you used the same lifetime annotation on both arg1 and the return value, so they have the same lifetime. The one on the return value is not longer than the one on arg1, but equal.
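To illustrate the &'b hasReference<'a> remark, here is a minimal sketch (the function name inner and the variable names are hypothetical, and the struct is spelled HasReference as in the earlier examples). The reference stored in the field keeps its own lifetime 'a, so it can be used after the struct, and the shorter 'b borrow of the struct, are gone:

```rust
struct HasReference<'a> {
    field: &'a str,
}

// 'b is the borrow of the struct; 'a is the borrow stored inside it.
// The result is tied to 'a only, not to 'b.
fn inner<'a, 'b>(holder: &'b HasReference<'a>) -> &'a str {
    holder.field
}

fn main() {
    let owner = String::from("hello");
    let field_ref;
    {
        let holder = HasReference { field: &owner };
        field_ref = inner(&holder);
    } // `holder` is destroyed here, ending the region of 'b.

    // Still fine: `field_ref` only depends on `owner`.
    println!("{}", field_ref);
}
```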

Not just the wording.

  • So lifetimes aren't only a question of references. They may concern non-basic types too. Does that mean that each resource which is a struct has a lifetime?
  • "The same"? The struct can't live without the reference it relies on, but this reference can live without the struct, no? So the lifetime of the reference may be longer than that of the struct?

Recall from the beginning:

Lifetimes are not "the lifetime of a value", but instead they are used to describe a region in which a value is marked borrowed, and this region may very well be (and usually is) shorter than the "lifetime of the value".

To add to this, a value annotated with a lifetime cannot exist outside the region of that lifetime, but may exist for a shorter time than the region.

In some sense if we borrow value, we have value ≥ 'a ≥ &'a value. Here value is the value we borrowed from, and 'a refers to the region for the lifetime 'a, and &'a value is where the reference we got to value exists. So the lifetime lies in between in some sense.

In short, the returned struct has the same lifetime as the input in the sense that they are annotated with the same lifetime marker, but this does not mean that "the lifetimes of the values themselves" are equal. It only means that they have the same annotation, and thus both the reference and the struct have the same upper bound on where they can exist.

Consider this example:

fn main() {
    let has_string = HasString {
        field: "foo".to_string()
    };
    
    let ref_to_has_string = &has_string;
    let ref_to_field = newreference(ref_to_has_string);
    
    drop(ref_to_has_string);
    
    // It doesn't matter whether the reference stays valid.
    // This is ok because the has_string is still valid.
    println!("{}", ref_to_field);
}


Oh. Yes, indeed. Thanks for your patience :slight_smile: My questions are getting smaller :slight_smile:

  • Are structs allocated on the heap or on the stack by default in Rust? (I ask in order to completely understand why we need a lifetime for a structure that itself contains a reference; that seems intuitive, but since "lifetimes are a question of references", it seems inconsistent.)
  • OK for "the same" question. Just one small thing: can we consider that value ≥ 'a for every lifetime? Or did you mean "if we have value ≥ 'a, then we have 'a ≥ &'a value"?