Lifetime Specifier Confusion

I have to admit, I am past the point of knowing where and when I need to specify lifetimes… Generally speaking, when I get lifetime specifier errors, I know how to fix them… But there is still a huge misunderstanding of what the actual purpose is. Best way to describe this is to show an example. Say I have this program:

fn main()
{
    let mut my_names = ThreeNames
    {
        name1: "Thomas",
        name2: "James",
        name3: "Sarah"
    };
    
    set_two(&mut my_names);
    set_three(&mut my_names);
    my_names.print_me();
    
    
}
impl<'a,'b,'c> ThreeNames<'a,'b,'c>
{
    fn print_me(&self)
    {
        println!("Name 1:{}\nName 2:{}\nName 3:{}", self.name1, self.name2, self.name3);
    }
}
struct ThreeNames <'a, 'b, 'c>
{
    name1: &'a str,
    name2: &'b str,
    name3: &'c str
}

fn set_two(tn: &mut ThreeNames)
{
    tn.name2 = "Spielberg";
}

fn set_three(tn: &mut ThreeNames)
{
    tn.name3 = "Lindsay";
}

This code compiles and runs fine. But that’s not my problem; my problem is that I have no idea why this works when I’ve given 3 different lifetime specifiers 'a, 'b, and 'c and I’ve provided 3 references to memory of the same lifetime. In other words, I’ve specified that there are 3 different lifetimes here, but in actuality, there is only 1. It is my understanding that lifetimes 'b and 'c would be for memory defined in indented blocks past 'a. But that’s not actually the case. So that begs my questions:

  1. Why does this actually compile/work when there are in fact not 2 additional nested lifetimes?
  2. What is the actual difference between this and if I were to just specify 'a 'a 'a for each member?

In general, although I’ve gotten my programs to compile, I feel like I’m just throwing arbitrary lifetimes at the compiler to clear up the errors, and they always seem to work; I still just don’t understand exactly why. I hope this question makes sense.

Thank you.

1 Like

Before going into this topic further, I’d like you to consider the following and how it relates to what I quoted above:

struct ThreeThings<A, B, C> {
  one: A,
  two: B,
  three: C,
}

let things = ThreeThings {
   one: 1,
   two: 2,
   three: 3,
};

Take your statement above about 3 different lifetime parameters, yet using only 1 ('static), and replace it with 3 different type parameters, yet using the same one for all 3 (i32).

In this thought experiment, consider why you’d want 3 different type parameters - I find people tend to mentally model things easier when dealing with generic type parameters.
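To make the thought experiment concrete, here is a minimal sketch. `SameThings` is a made-up counterpart with a single type parameter (it is not from the posts above); it plays the role that `'a 'a 'a` would play for lifetimes.

```rust
// Three independent type parameters: the fields MAY differ, but need not.
struct ThreeThings<A, B, C> {
    one: A,
    two: B,
    three: C,
}

// Hypothetical single-parameter variant: all fields MUST share one type.
struct SameThings<T> {
    one: T,
    two: T,
    three: T,
}

fn main() {
    // All three parameters happen to be i32 here; that is allowed.
    let same = ThreeThings { one: 1, two: 2, three: 3 };
    // The same struct also accepts three different types.
    let mixed = ThreeThings { one: 1, two: "two", three: 3.0 };
    // SameThings only accepts the first shape.
    let strict = SameThings { one: 1, two: 2, three: 3 };
    // let bad = SameThings { one: 1, two: "two", three: 3.0 }; // error: mismatched types

    println!("{} {} {}", same.one, same.two, same.three);
    println!("{} {} {}", mixed.one, mixed.two, mixed.three);
    println!("{} {} {}", strict.one, strict.two, strict.three);
}
```

Separate parameters do not force the fields to be different; they only stop forcing them to be the same.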

4 Likes

Does this usage of the functions make more sense to you given the lifetimes?

#[derive(Debug, Clone, Copy)]
struct ThreeNames <'a, 'b, 'c>
{
    name1: &'a str,
    name2: &'b str,
    name3: &'c str
}

fn set_two<'a, 'b, 'c>(tn: &'a mut ThreeNames<'a, 'b, 'c>)
{
    tn.name2 = "Spielberg";
    let nest1: &str = "Jason";
    set_three(tn, nest1);
}

fn set_three<'a, 'b, 'c>(tn: &'a mut ThreeNames<'a, 'b, 'c>, last: &'b str)
{
    tn.name2 = last;
}

In this example, set_two() calls set_three() and passes it a variable with a lifetime which began inside of set_two() rather than inside of main() so I decided to give it 'b since I was visualizing 'a to be originating at the main() block.

I noticed that the compiler does actually enforce the lifetimes inside the struct because if I try to give the last parameter of set_three() a lifetime of 'c instead of 'b, it won’t build because it recognizes that tn.name2 is of lifetime 'b instead.

nest1 is &'static str because it’s a string literal. Given 'static lifetime is substitutable for any other lifetime (in immutable contexts), you’re not going to really see the difference in how lifetime parameters affect things if you only use static references.

The nest1 variable/binding is declared in set_two() but that doesn’t matter, because the binding plays no role here - what matters is that the value it holds is a &'static str.

Yes, this is similar to my example with ThreeThings - if you take &mut ThreeThings<String, i32, f32>, then last would need to be i32 and not some other type - it’d need to match the type of the field stored into.

I guess what I’m trying to convey is you should think of generic lifetime parameters the same way as generic type parameters. They’re essentially the same concept, except lifetimes can also participate in subtyping/variance relationships and type parameters can have trait bounds.

When you declare generic lifetime parameters you only indicate the relationship, if any, between the fields using those lifetimes. When it comes time to instantiate your struct and you provide the concrete references, the compiler will ensure that the ascribed relationships hold - not much different to instantiating a generic struct with concrete types.
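A small sketch of that "relationship" idea, using heap data instead of a string literal so that 'static doesn't hide the effect. The extra `value` parameter on `set_two` is my own addition for illustration, not the original poster's signature.

```rust
struct ThreeNames<'a, 'b, 'c> {
    name1: &'a str,
    name2: &'b str,
    name3: &'c str,
}

// `value` must live at least as long as the 'b slot it is stored into.
// Declaring it as `&'c str` instead would be rejected, because nothing
// says that 'c outlives 'b.
fn set_two<'a, 'b, 'c>(tn: &mut ThreeNames<'a, 'b, 'c>, value: &'b str) {
    tn.name2 = value;
}

fn main() {
    let owned = String::from("Spielberg"); // heap data, not 'static
    let mut names = ThreeNames {
        name1: "Thomas",
        name2: "James",
        name3: "Sarah",
    };
    // The compiler picks a 'b that both "James" and `owned` can satisfy.
    set_two(&mut names, &owned);
    println!("{} {} {}", names.name1, names.name2, names.name3);
}
```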

1 Like

Aha! Key word - relationships really helped here. Also oh I got it now about the static lifetimes… Would be better to test this by newing up some stuff on the heap. This makes a lot more sense now, thanks a lot @vitalyd !

1 Like

BTW, if your intention is to just “store a string in a struct” (rather than write a contrived example to specifically exercise overuse of lifetimes), then the correct answer is: none. Use String which doesn’t have a lifetime and is used to own strings.

The difference is that owned string is stored permanently and “lives” in the struct. References are only for temporarily borrowing things that have already been stored somewhere else.

String literals are one confusing sort-of exception to this, because they borrow an “imaginary” String owned by your program, so code written to work only with string literals is unrealistic and unusual for Rust code.
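For comparison, here is a rough sketch of the owned version of the struct from the first post. `make_names` is a hypothetical helper I added to show that the data can leave the function it was created in; the rest mirrors the original names.

```rust
// Owned fields: no lifetime parameters needed on the struct.
struct ThreeNames {
    name1: String,
    name2: String,
    name3: String,
}

// Hypothetical constructor, added only for this illustration.
fn make_names() -> ThreeNames {
    let local = String::from("James"); // created inside this function
    ThreeNames {
        name1: String::from("Thomas"),
        name2: local, // moved into the struct, so it outlives this function
        name3: String::from("Sarah"),
    }
}

fn set_two(tn: &mut ThreeNames, value: &str) {
    tn.name2 = value.to_string(); // copy the borrowed text into an owned String
}

fn main() {
    let mut names = make_names();
    set_two(&mut names, "Spielberg");
    println!("Name 1:{}\nName 2:{}\nName 3:{}", names.name1, names.name2, names.name3);
}
```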

1 Like

So I just contrived another example which has introduced a tiny bit more confusion… I don’t understand why this program compiles because here I’ve newed up some data in main() inside of a block… Wouldn’t this mean that I’ve passed references of two different lifetimes into the function new() which is defined to take two references of the same lifetime ('a)? Here, my_second_fav has a different lifetime than my_fav_sent yet the program builds and runs fine. Is this because the function doesn’t actually return either reference?? That’s my only thought at this point.

UPDATE: Actually, I even tried returning the inner-scoped var from the func as a 'a and it still runs: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=e53f22875ea68da761226665c5ec807a

Don’t use foo: &'a String, but foo: String.

Try not to use struct Foo<'a>. That <'a> there is not for regular simple uses, but for advanced cases (like custom temporary views of other data structures, or complex relationships between data in FFI), and for novice users it doesn’t do what they expect.

Unlike C, Rust doesn’t have syntax that distinguishes passing things by pointer vs by value (mainly because you don’t need to worry about it, as there are no situations which could accidentally cause a large wasteful copy).

Rust has syntax for passing things as owned or borrowed, but that is a very different thing, and owned things can be pointers too, e.g. Box<Foo> and &Foo are both pointers and are identical in their physical representation. OTOH &str is not a pointer, but a struct that contains a pointer and a length.

& is not a pointer in Rust in the same way that char * is not a number in C.

String contains the pointer to the actual string data, so &String is an expensive double indirection. And it’s not a string, but only a borrow of (a temporary read-only lock on) a string that must already be owned somewhere, and has its use (lifetime) limited by that somewhere.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=fc5ef1ddc5de28c8373f3e646b72bf91
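The representation claims above can be checked with `std::mem::size_of`. This is only a quick sketch; `shout` is a made-up function showing that a parameter of type &str accepts both a literal and a borrowed String.

```rust
use std::mem::size_of;

// Taking &str accepts both string literals and borrowed Strings.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    // &T and Box<T> (for a sized T) are both a single pointer.
    assert_eq!(size_of::<&u64>(), size_of::<usize>());
    assert_eq!(size_of::<Box<u64>>(), size_of::<usize>());
    // &str carries a pointer plus a length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    // &String is a single pointer to the String, which in turn points at the data.
    assert_eq!(size_of::<&String>(), size_of::<usize>());

    let owned = String::from("hello");
    println!("{}", shout(&owned)); // &String coerces to &str
    println!("{}", shout("world"));
}
```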

2 Likes

The reason you can pass two arguments with different lifetimes into a function which expects two references with the same lifetime ('a) is due to something called subtyping. The idea is that the two lifetimes (e.g. 'a and 'b) both have a common subset/range in which they are valid ('c). The compiler is smart enough to figure out the correct 'c lifetime and will use that when calling the function.

Using this pseudo-code as our example:

fn main() {
  let first: &'a str = "Hello World";
  let some_string = String::from("Second");  // create an owned `String` on the heap

  {
    // get a reference to our owned string
    let second: &'b str = &some_string;
    call_with_references(first, second);
  }
}

fn call_with_references<'c>(first: &'c str, second: &'c str) {
  ...
}

An analogy from the OO world would be that you've got an Animal class with two sub-classes Dog and Cat. In the same way that I can pass both a Dog and a Cat into a function expecting two Animal arguments, because the region that 'a is valid for overlaps with the 'b region, the compiler infers that 'c is the overlapping region and uses that as the lifetime when calling the call_with_references() function.

That's the general idea around why you can use one lifetime when passing in two references with different lifetimes.
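Here is a compilable sketch of the same idea. The named lifetimes on the `let` bindings are dropped (real Rust doesn't allow them there), and I made `call_with_references` return the longer of the two strings; that return value is my assumption, purely so the shared lifetime 'c has something to apply to.

```rust
// Both arguments are viewed under one common lifetime 'c,
// and the result is only valid for that common region.
fn call_with_references<'c>(first: &'c str, second: &'c str) -> &'c str {
    if first.len() >= second.len() {
        first
    } else {
        second
    }
}

fn main() {
    let first: &'static str = "Hello World"; // lives for the whole program
    let some_string = String::from("Second"); // owned String on the heap

    {
        // A shorter-lived reference to our owned string.
        let second: &str = &some_string;
        // The 'static reference is treated as a shorter-lived one for this call.
        let longer = call_with_references(first, second);
        println!("{}", longer);
    }
}
```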


As others have pointed out, there may be a bit of a misunderstanding around &str and String. Basically, &str is a reference to a string owned by something else (where a &'static str is a reference to some literal compiled into the binary) while String is a string allocated on the heap which will be automatically deallocated when the String goes out of scope.

Lifetimes are mainly used when you need to retain a reference to something you don't own, typically to avoid creating unnecessary copies or in more advanced scenarios like RAII guards. For most "normal" applications your structs should own the data they contain, so you should be using Strings.

1 Like

So I applaud you for trying to figure out lifetimes 🙂. Lifetimes/references are one of the key parts of Rust: they enable efficient code (both in CPU and memory footprint) and are memory safe(!) - there’s no reason to be afraid of them, and sooner or later, one needs to learn them to be truly effective (or be able to read/modify other people’s code).

So, the subtyping was mentioned and that’s the reason for your example working. I just wanted to illustrate the subtyping a bit more here. Take this variation of your code, where we try to store a reference to the fav_sentence field into a binding that outlives the struct (but not the source String) - this is safe, and should be allowed. But if you try to compile the code above, it’ll fail telling you that my_second_fav doesn’t live long enough - and this is true in so far as my_second_fav is confined to that inner scope, but r is outside that scope. The compiler “unifies” both my_fav_sent and my_second_fav to the narrower of the two - in other words, it “shrinks” my_fav_sent to the same lifetime as my_second_fav; this is the subtyping part: a longer lived reference is a subtype of a shorter one.

Now, if we actually wanted to allow the above code, we’d introduce a second lifetime parameter - this would allow the compiler to track the two lifetimes separately, and no shrinking/subtyping is needed. Here is that code.
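Since the linked code isn't reproduced here, this is a rough reconstruction of the two-parameter version under my own assumptions: the struct name `Favorites` and the field `second_fav` are made up, while `my_fav_sent`, `my_second_fav`, `fav_sentence` and `r` follow the names used in the discussion.

```rust
// Two lifetime parameters let the compiler track each reference separately.
struct Favorites<'a, 'b> {
    fav_sentence: &'a str,
    second_fav: &'b str,
}

impl<'a, 'b> Favorites<'a, 'b> {
    fn new(fav_sentence: &'a str, second_fav: &'b str) -> Self {
        Favorites { fav_sentence, second_fav }
    }
}

fn main() {
    let my_fav_sent = String::from("The long-lived sentence");
    let r: &str;
    {
        let my_second_fav = String::from("The short-lived sentence");
        let favs = Favorites::new(&my_fav_sent, &my_second_fav);
        // Copy out the &'a str; 'a is not shrunk to match 'b.
        r = favs.fav_sentence;
        println!("{}", favs.second_fav);
    } // my_second_fav and favs are gone here
    println!("{}", r); // still valid: it only borrows my_fav_sent
}
```

If `Favorites` had a single parameter 'a on both fields, the last `println!` would be rejected with "`my_second_fav` does not live long enough", which is the failure described above.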

7 Likes

WOW, this has to be the best explanation/illustration of lifetimes that I’ve ever seen so far… Definitely going to bookmark this and refer others to it when they are confused.

So when there are two references from separate scopes (lifetimes) with the same specifier, the compiler shrinks down as far as it can, but if they are separately specified/designated, it will “make way” for the separate scopes.

That shows clearly the purpose of lifetime specifiers… Thank you very much. And yes, I love learning and experimenting, and I find this stuff very interesting… That tends to happen when you’ve been in GDB far too long chasing after memory bugs in past lifetimes… No pun intended 😃

1 Like

@vitalyd so I got a follow-up question for you on this. I was trying to sorta reproduce a situation like the one you showed here… I came up with this playground

As you can see, I wanted to add more complexity to the lifetime mix by having a function which could return one of multiple references. However, you will note that I’ve actually only specified 'a for my use-cases.

We can see in the below code:

    let a: int32 = -423;
    let b: int32 = 4590452;
    let c: int32 = 2321;
    
    let d: u32 = 5;
    let str_ref: &str = "Hahaha";
    
    let cr: ComplexRefs = ComplexRefs
    {
        int_ref: &a,
        unsigned_ref: &d,
        str_ref: str_ref
    };
    //let r;
    {
        let r = cr;
        three_refs(&a, &b, &c, &r);
    }

cr is moved into r which does not live as long as the lifetime of int32 a.

But my function signature claims it is of the same lifetime as &'a int32:

fn three_refs<'a>(a: &'a int32, b: &int32, c: &int32, cr: &'a ComplexRefs<'a>) -> &'a int32
{
    if(*a < 5)
    {
        cr.int_ref
    }
    else
    {
        a
    }
        
}

However, for all intents and purposes of this function, both live long enough for the function to execute on the data… Is that why this is ok?

Note: for this example, consider type int32 = i32;

Rust will infer the intersection of the multiple lifetimes and use that as the lifetime of the return type.
You can see this in this playground link.

type int32 = i32;

struct ComplexRefs<'a>
{
    int_ref: &'a i32,
    unsigned_ref: &'a u32,
    str_ref: &'a str
}

fn main()
{
    let a: int32 = -423;
    let b: int32 = 4590452;
    let c: int32 = 2321;
    
    let d: u32 = 5;
    let str_ref: &str = "Hahaha";
    
    let cr: ComplexRefs = ComplexRefs
    {
        int_ref: &a,
        unsigned_ref: &d,
        str_ref: str_ref
    };
    //let r;
    let e; // this is new
    {
        let r = cr;
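        e = three_refs(&a, &b, &c, &r);
    } // r is dropped here, but the lifetime of e is tied to the borrow of r
    println!("{}", e); // using e after the block is what the compiler rejects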
}

fn three_refs<'a>(a: &'a int32, b: &int32, c: &int32, cr: &'a ComplexRefs<'a>) -> &'a int32
{
    if(*a < 5) 
    {
        cr.int_ref
    }
    else
    {
        a
    }
}

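Here we can see variable e lives longer than the reference returned by three_refs (which is tied to the borrow of r), so this refuses to compile once e is actually used after the inner block. Note that current compilers only report the error if such a later use exists.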

2 Likes

Aha! That makes a lot of sense and it reminds me of an earlier answer where I was confused about why something would compile when it appeared as though the data should have been “used after a borrow” of a function… The answer in that case was that since I was not assigning the return to anything, the function was freeing up (or “giving back”) the borrow.

Again, since I was not actually trying to borrow the returned ref here, Rust allowed it, but since we are now trying to borrow a returned ref of an out-of-scope variable, it complains…

Thank you very much! I learned a TON in this thread and I can finally say I’ve got a grasp of how these specifiers and the borrow-checker works. What I’m getting from this is that the key player is the returned reference and what is done with it. The lifetimes seem to just be a mapping from an input to an output.
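To illustrate that "mapping from an input to an output" idea, here is a small sketch with a made-up function `first_match`: only `haystack` shares a lifetime with the return value, so the other argument can go away earlier.

```rust
// The returned slice is tied to `haystack` only; `needle` has its own,
// unrelated (elided) lifetime and may be short-lived.
fn first_match<'h>(haystack: &'h str, needle: &str) -> Option<&'h str> {
    haystack
        .split_whitespace()
        .find(|word| word.contains(needle))
}

fn main() {
    let text = String::from("lifetimes map inputs to outputs");
    let found;
    {
        let needle = String::from("put"); // dropped at the end of this block
        found = first_match(&text, &needle);
    }
    // Fine: `found` borrows from `text`, not from `needle`.
    println!("{:?}", found);
}
```

Had both parameters been given the same lifetime 'h, the use of `found` after the block would be rejected, because the result would also be tied to `needle`.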