How is the lifetime parameter `'a` determined for a temporary reference `&'a mut T`?

For a temporary reference &'a mut T, how is the lifetime parameter 'a determined?

For example, consider the following code:

use std::marker::PhantomData;

pub struct OutRef<'a, T: 'a> {
    ptr: std::ptr::NonNull<T>,  
    _lifetime: PhantomData<&'a ()>,     // UAF
    // _lifetime: PhantomData<&'a mut T>, // ok
}

impl<'a, T: 'a> OutRef<'a, T> {
    pub fn write(self: OutRef<'a, T>, value: T) -> &'a mut T {
        unsafe {
            self.ptr.write(value);  
            &mut *(self.ptr.as_ptr())
        }
    }
}

impl<'a, T: 'a> From<&'a mut T> for OutRef<'a, T> {
    #[inline]
    fn from(val_ref: &'a mut T) -> OutRef<'a, T> {
        unsafe {
            OutRef {
                ptr: std::ptr::NonNull::new_unchecked(val_ref as *mut T), 
                _lifetime: PhantomData,
            }
        }
    }
}

fn main() {
    let mut my_str: &'static str = "static str";
    let at_s = OutRef::<&'static str>::from(&mut my_str);   // point 0

    {
        let s = String::from("short-lived str");
        let x = at_s.write(&s); 
    }

    println!("{:?}", my_str); 

}

In the main function above, &mut my_str is a temporary mutable reference (variable), and its type is &'a mut &'static str. According to my understanding, the lifetime (loan scope) of &'a mut &'static str should only be the line marked as point 0. However, I have seen from other documentation that &'a mut &'static str is associated with the return value of the OutRef::from function, OutRef<'a, &'static str>, and therefore the value of 'a is the scope of the variable at_s. I am not sure if this statement is correct. Could someone help clarify this?

@ Yandros

Shouldn't a reference to which you can only write be contravariant, not covariant, in its lifetime? It is basically a function (&mut T, T) -> (). In fact if you try to write this example without unsafe you will hit this issue as the borrow checker correctly forbids it.

Your function signature ties the lifetime of its input and output together. So the lifetime on the borrow of my_str and at_s are related. (What's confusing about Rust, to me at least, are the reborrowing rules; you can't use the borrow of my_str while you're using at_s). Lifetimes aren't quite scope based though. Anyway, since you use at_s in the inner block, the borrow of my_str has to last as long.

So, are you saying that the borrow of my_str in this example cannot be treated as a temporary value?

The reference itself no longer exists after that point; however, the lifetime 'a refers to the borrow that the reference carries, and the borrow continues to be alive through the OutRef because your From implementation specifies so (by using the same 'a for both the mutable reference and the OutRef).

This is the correct interpretation.

The outer lifetime should be covariant and the inner T (in this case also a reference) should be invariant. (The T can still be read.)

Edit: contravariance is fine (but covariance is not) as noted in the following comments.

Which was indirectly noted in the OP (or so I interpret).

OutRef<'a, &'static str> being a subtype of OutRef<'b, &'s str> where 'a: 'b and for all 's is what allowed main to pass borrow check. At the at_s.write call, the inner lifetime can coerce to a lifetime that ends before s goes out of scope,[1] so the call passes borrow check.

I don't see how it would help the OP if this were the case. If your borrow was instantly expired, why would you expect writing a new value based on that borrow later to be sound?

(You can get rid of the lifetimes by using *mut say, but notionally there's still a borrow going on, just not one the compiler checks.)

It's not the lexical scope, but uses of at_s will keep the original borrow alive. So it can end after at_s.write but before the println!, for example.

The problem in the OP isn't that my_str stays borrowed, though, it is the variance of T as mentioned above.


  1. and the outer lifetime can coerce to anything not longer than that ↩︎

Is it? Assuming OutRef was put into its own module, the only way to read a T through it is to call write, which will however first replace the old T with the new value, and leak the old one in the process (so not even the destructor can read from it). Arguably the leaking is likely unintentional though.

I believe you're correct; that's the part I didn't fully consider.

@phil-skillwon Short answer: See the Rustonomicon chapter on subtyping and variance.

Long answer: The bug is not related to the lifetime of the temporary reference. The lifetime 'a is correctly inferred as the lifetime of the variable my_str. The type of at_s is then OutRef<'a, &'static str>, as expected.

What's breaking your code's safety here is that the OutRef<'a, &'static str> is being incorrectly interpreted as an OutRef<'a, &'s str> (where 's is the lifetime of the string s) when you call write on it.

To see that this is true, try changing the line that says

let x = at_s.write(&s);

to require OutRef<&'static str>:

let x = OutRef::<&'static str>::write(at_s, &s);

This will result in a compiler error, `s` does not live long enough. This proves that your original code was not calling OutRef<'a, &'static str>::write(), because that call would have been a compile error. Instead, your code is calling OutRef<'a, &'s str>::write(), which satisfies the borrow checker, but is fatally unsafe.

Rust reinterprets the lifetime this way because OutRef is covariant in T. For this type of unsafe code, you need to get variance right.
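For comparison, here is a minimal sketch of a variant that is invariant in T, assuming the PhantomData<&'a mut T> marker from the commented-out line in the original struct. The helper overwrite_static is a made-up name for illustration, and NonNull::from is used here instead of new_unchecked purely as a convenience. With this marker the inner lifetime can no longer shrink, so only values that really are &'static str can be written:

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

pub struct OutRef<'a, T: 'a> {
    ptr: NonNull<T>,
    // `&'a mut T` makes the wrapper invariant in `T`,
    // so `OutRef<'a, &'static str>` cannot turn into `OutRef<'a, &'s str>`.
    _lifetime: PhantomData<&'a mut T>,
}

impl<'a, T: 'a> OutRef<'a, T> {
    pub fn write(self, value: T) -> &'a mut T {
        unsafe {
            // Overwrites the slot without dropping the old value.
            self.ptr.as_ptr().write(value);
            &mut *self.ptr.as_ptr()
        }
    }
}

impl<'a, T: 'a> From<&'a mut T> for OutRef<'a, T> {
    fn from(val_ref: &'a mut T) -> Self {
        OutRef {
            ptr: NonNull::from(val_ref),
            _lifetime: PhantomData,
        }
    }
}

// Hypothetical helper: writes another `'static` string through the wrapper.
fn overwrite_static() -> &'static str {
    let mut my_str: &'static str = "static str";
    let out = OutRef::<&'static str>::from(&mut my_str);
    out.write("another static str");
    // Passing a reference to a local `String` to `write` above would be
    // rejected, because `T` stays fixed at `&'static str`.
    my_str
}

fn main() {
    println!("{}", overwrite_static());
}
```

The analysis of 'a is unchanged by this; only the inner lifetime inside T is affected.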

@jorendorff @quinedot @SkiFire13

Let me clarify: there is no doubt that the code I provided has errors, and the correct implementation should be:

pub struct OutRef<'a, T: 'a> 
{
    ptr: std::ptr::NonNull<T>,  
    _lifetime: PhantomData<&'a mut T>, 
}

OutRef must be invariant over T. However, the purpose of my question is not to discuss how the _lifetime field should be correctly defined. What I don't understand is this line:

let at_s = OutRef::<&'static str>::from(&mut my_str);

&mut my_str is a temporary variable, and its type is &'a mut &'static str. As a temporary variable, &mut my_str is destroyed immediately after the call to OutRef::from completes. So, what is the scope corresponding to the lifetime 'a in &'a mut &'static str?

Although this code has safety issues, it compiles successfully. From this perspective, the scope of 'a should be: from the moment at_s is created until the last use of at_s, which is at_s.write(&s). However, I completely fail to understand how the specific value of 'a is inferred. Therefore, I have two questions:

  1. Is it true that &mut my_str, as a temporary variable, is destroyed immediately after the call to OutRef::from completes?
  2. How is the lifetime 'a in &'a mut &'static str inferred, and what is its corresponding scope?

There is no temporary here; the mutable reference created by &mut my_str is moved into the OutRef::from() function call. Temporaries happen when you write expressions like &mut Vec::new() because the Vec has to be put somewhere.

Lifetime variables do not necessarily correspond to any scope; they merely have constraints arising from how they are used, and they are free to be underconstrained. There is no one true answer to what a lifetime “is”. For example, if you write

fn foo<T>() {}
foo();

you will get a “type annotations needed” error: if there is no concrete type to pick, the program cannot be compiled. But fn foo<'a>() {} has no such restriction; the program is valid regardless of the lifetime “value” of the parameter 'a. In your case, there are two constraints that apply:

  • 'a must end before println!("{:?}", my_str) is executed (would be a borrow conflict).
  • 'a must not end before the OutRef<'a, &'static str> created from it is dropped. This constraint exists solely because of the signature of OutRef::from(), and has nothing to do with whether OutRef “actually contains” a &'a mut ....
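A small sketch of both points, using safe code only. Token, make_token and constraints_demo are hypothetical names; Token deliberately stores no reference at all, only a PhantomData marker, to show that the constraint comes from the signature alone:

```rust
use std::marker::PhantomData;

// Carries a lifetime but does not contain any reference.
struct Token<'a>(PhantomData<&'a ()>);

impl<'a> Token<'a> {
    fn touch(&self) {}
}

// Same shape as `OutRef::from`: input and output share `'a`.
fn make_token<'a, T>(_r: &'a mut T) -> Token<'a> {
    Token(PhantomData)
}

// A lifetime parameter that nothing constrains is fine.
fn foo<'a>() {}

fn constraints_demo() -> i32 {
    foo(); // compiles without any annotation

    let mut n = 1;
    let token = make_token(&mut n);
    // Using `n` here would be a borrow conflict:
    // `token` is still used below, so `'a` must still be active.
    token.touch();
    // Nothing uses `token` after this point, so `'a` may end and `n` is free.
    n += 1;
    n
}

fn main() {
    println!("{}", constraints_demo());
}
```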

Arguably/technically, it’s being re-borrowed and there is a temporary, but only its target is actually borrowed, so it’s fine and the re-borrow ends up acting like a move anyway :sweat_smile:

[Some looking at MIR later:] Oh… never mind, in this particular case it’s going through From at the call site in a way that “nails down the type enough, early enough” to actually allow for implicit re-borrowing. In MIR (in the playground) currently, you’d see

<OutRef<'_, &str> as From<&mut &str>>::from(copy _3)

vs

<OutRef<'_, &str> as From<&mut &str>>::from(move _3)

for the re-borrow vs. the “actual” move.

Either way, for comparison, this works, too, making a temporary variable explicit, and the re-borrow, too:

    let at_s: OutRef<'_, &'static str> = {
        let tmp = &mut my_str;
        OutRef::from(&mut *tmp)
    };

The lifetime of the &mut my_str can be longer than the reference itself … ahem … “lives”. This language can be confusing, it looks like I’ve touched on this same kind of point before in an earlier thread from @phil-skillwon

Anyway, what I mean to say is that the lifetime 'a in the type &'a mut &'static str of &mut my_str can denote a longer scope than the scope of the (temporary) variable that held the reference created by the expression &mut my_str before it was moved into the From::from call.[1]

On the note of “temporaries” it seems like one could also more generally speak of temporary variables being introduced for all kinds of operands; the reference says:

Temporaries are also created to hold the result of operands to an expression while the other operands are evaluated. The temporaries are associated to the scope of the expression with that operand. Since the temporaries are moved from once the expression is evaluated, dropping them has no effect unless one of the operands to an expression breaks out of the expression, returns, or panics.

I don’t know whether this strictly applies only to operands that exist while other operands are evaluated (i.e. never to the last operand), though arguably it semantically doesn’t matter anyway, and one might as well imagine there’s always a temporary for every operand.


I think the correct lower-bound is that 'a must not end before the created OutRef is last used because the type actually has no drop-glue. The last use is in the let x = at_s.write(&s); line; but as it’s being passed by-value there, that’s also exactly where it’s dropped, so no difference in this particular case :innocent:

For illustration @phil-skillwon – together, these bounds mean that 'a pretty much ends around:


fn main() {
    let mut my_str: &'static str = "static str";
    let at_s = OutRef::<&'static str>::from(&mut my_str);

    {
        let s = String::from("short-lived str");
        let x = at_s.write(&s); 
        // <- EITHER HERE
    }
    // <- OR HERE

    println!("{:?}", my_str); 

}

Between these possibilities, there is ultimately no single “correct” answer, because lifetimes are allowed to be underconstrained.


  1. Whether it’s a re-borrow or a move operation that allows the reference to be passed (by value) to the from function – and then e.g. potentially stashed into the OutRef from within there – doesn’t really matter. ↩︎

IMO there's no point trying to understand what the "scope" of a lifetime is, or how it's "inferred", because that's not how they work. Instead, lifetimes are just used to generate a bunch of constraints, and the compiler checks whether those constraints are satisfiable. Most notably, this check does not require finding a concrete solution that satisfies them!

Think of it as having to prove whether a system of inequalities has a solution. Inequalities like x>10 or a<b && b<c are obviously satisfiable, but I don't need to pick a value for x, a, b or c to say that. Meanwhile, x<10 && x>=15 or a<b && b<c && c<=a have no solution, and again I can say that without trying to pick values for x, a, b or c.

@SkiFire13 @quinedot @kpreid @steffahn @jorendorff @user16251

I found a plausible explanation in the Rustonomicon: Limits of Lifetimes - The Rustonomicon
As mentioned in LearningRust: Once you declare a bound like 'a: 'b, the two lifetimes "infect" each other. Even though the return type had a different lifetime than the input, it was still effectively a reborrow of the input.

Although the function signature fn from(val_ref: &'a mut T) -> OutRef<'a, T> is overly rigid and restricts flexibility, sometimes we have no choice but to do it this way. Even if we modify it to fn from<'b, 'a: 'b>(val_ref: &'a mut T) -> OutRef<'b, T>, we still cannot avoid the relationship between the input type and the output type, because the lifetime parameters contained in the input and output types are inherently related, such as 'a: 'b or 'a: 'a. This "infection" of lifetimes is a result of the function or method signature.

By the way, although LearningRust provides an explanation for this situation, to my knowledge, LearningRust is not an official document. Could someone confirm if there is a similar explanation in the official Rust documentation?

This took a while to write out for a few reasons:

  • You're asking about 'a, but the unsound code and the error you get when you fix the unsoundness are not about 'a, so it's a bit awkward to be talking about how 'a is calculated right next to some unrelated unsoundness
  • I don't have a great, succinct citation on how lifetimes are calculated / how the borrow checker generally (and actually) works, even though I feel there should be one that makes this example clear without too much trouble
  • Simply being bad at being succinct

But hopefully it's still useful to you.

First, to make sure it's clear: despite the unfortunate overlap in terminology, Rust lifetimes ('_ things) don't directly correspond to the liveness of variables/values. They more directly correspond to the duration of borrows. The main interaction between the two is that it's not allowed for something to be borrowed when it is destructed,[1] including when it goes out of scope.[2]

For references in particular, keep in mind that the lifetime corresponds to how long the referent is borrowed. The duration of the borrow can be longer than the reference itself exists.[3] And in general, values with lifetime-carrying types can go out of scope before the end of the (Rust, '_) lifetime.
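A minimal sketch of a borrow outlasting the reference variable that carried it. The names first_mut and borrow_outlives_parameter are made up for illustration:

```rust
// The parameter `v` stops existing when the call returns,
// but the borrow of the vector lives on in the returned reference.
fn first_mut<'a>(v: &'a mut Vec<i32>) -> &'a mut i32 {
    &mut v[0]
}

fn borrow_outlives_parameter() -> Vec<i32> {
    let mut data = vec![1, 2, 3];
    let slot = first_mut(&mut data);
    // `data` is still exclusively borrowed here, even though the
    // `&mut data` value passed to the call is long gone.
    *slot = 10;
    // Last use of `slot` was above, so the borrow can end and `data` is free.
    data
}

fn main() {
    println!("{:?}", borrow_outlives_parameter());
}
```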

A reference like &mut my_str going out of scope is practically a no-op;[4] it just doesn't matter to the borrow checker in this example. References going out of scope can cause borrow checker errors when the reference itself is borrowed -- like a & &mut _ -- but that's about it.

The borrow checker computes lifetimes by using mostly data flow analysis. When a value with a lifetime in its type is going to be used along an execution path -- when the value is "alive" in compiler parlance -- the (Rust, '_) lifetime must also be active. Additionally, constraints due to subtyping or annotations, etc., also come into play. If there's a 'long: 'short constraint, then everywhere 'short is active, 'long is also active.[5]

Let's look at the example.

    let mut my_str: &'static str = "static str";
    let at_s = OutRef::<&'static str>::from(&mut my_str);  // L1
    {                                                      // L2
        let s = String::from("short-lived str");           // L3
        let x = at_s.write(&s);                            // L4
    }                                                      // L5
    println!("{:?}", my_str);                              // L6

Let's say at_s has a lifetime 't in its type, and the &mut my_str has a lifetime 'a. When you pass it to OutRef::from, it's possible that something shorter than 'a is passed in, due to subtyping and reborrowing. And when the result of the call is assigned to at_s, it's possible again. But let's ignore that second one for brevity and just say that the return value also has 't in its type.

That means you called OutRef::<'t, _>::from, and from the signature[6] that means you must have passed in a &'t mut _. In order for this to be sound, a 'a: 't constraint is noted by the borrow checker.[7]

at_s is used on L4, and not overwritten before that, so it's alive on L2, L3, and L4. Therefore 't is active on those lines. And because of the 'a: 't constraint, 'a is also active on those lines.

Nothing else involving 't or 'a is used after that, so the lifetimes can end after L4.[8]

If you change the code to fix the unsoundness, the analysis for 'a is exactly the same. The error in that case is about the borrow of s on L4, not about 'a.
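The same analysis can be reproduced with safe code. The sketch below uses a hypothetical Slot type that stores a plain &'a mut T instead of a raw pointer, and shows the lifetimes ending in the middle of the block, right after the last use:

```rust
// Safe stand-in for `OutRef`, for illustration only.
struct Slot<'a, T>(&'a mut T);

impl<'a, T> Slot<'a, T> {
    fn write(self, value: T) -> &'a mut T {
        let Slot(inner) = self;
        *inner = value;
        inner
    }
}

fn borrow_ends_mid_block() -> i32 {
    let mut n = 1;
    let slot = Slot(&mut n);
    {
        // Last use of `slot`; the returned reference is not used either.
        slot.write(2);
        // Still inside the block, but nothing keeps the borrow alive,
        // so `n` can be used directly again.
        n += 10;
    }
    n
}

fn main() {
    println!("{}", borrow_ends_mid_block());
}
```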

Unfortunately, most things I would consider official documentation[9] are, IMNSHO, quite bad at explaining how borrow checking actually works. They lean very heavily on analogies to scopes, they say inaccurate things about how lifetimes are "chosen", and so on. They're typically trying to give the reader the general gist and not get into the weeds, which is understandable. But I don't think they're very successful -- I've seen lots of confusion and incorrect take-aways.

There are references I consider good, but they are also usually technical, detailed, and dense. I think there's a better introduction waiting to be written somewhere. It is challenging though, as the borrow checker analysis is complex (so it can support many scenarios) and also still evolving. Probably it needs to present multiple, increasingly fine-grained approximations to how borrow checking works.

This is a pretty good introduction to the type of analysis I did above, though still quite low-level/detailed.[10] It doesn't talk about making a function call specifically, though. Those can generally be understood by recognizing that subtyping may come into play during the call, and by understanding how the lifetime annotations on the function turn into constraints at the call site.


Finally, let me spend a little bit of time tackling a specific function signature pattern, since you explicitly asked about it: an input and the output have the same lifetime.

Here's an example from the book:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

The book says:

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a. In practice, it means that the lifetime of the reference returned by the longest function is the same as the smaller of the lifetimes of the values referred to by the function arguments.
[...]
When we pass concrete references to longest, the concrete lifetime that is substituted for 'a is the part of the scope of x that overlaps with the scope of y. In other words, the generic lifetime 'a will get the concrete lifetime that is equal to the smaller of the lifetimes of x and y. Because we’ve annotated the returned reference with the same lifetime parameter 'a, the returned reference will also be valid for the length of the smaller of the lifetimes of x and y.

That is not how the borrow checker works. A more accurate description would be:

The function signature now tells Rust that for some lifetime 'a, the function takes references to two string slices which are both borrowed for the same duration 'a. The function signature also tells Rust that the string slice returned from the function will have the same duration 'a.

Due to variance, the string slice parameters at the call site do not have to have the exact same borrow duration. The lifetimes in the types of the parameters both can coerce down to some shorter lifetime in the intersection of the two. However, that coercion requires that the original lifetimes remain active whenever the shorter lifetime is active.

In practice, this means that every use of the returned string slice will keep the borrows of the input string slices alive. That is, the use of the return value determines the duration of the borrows of the inputs, and not vice-versa!

This is the key bit: because the lifetimes were tied together -- the signature says that the input lifetimes and output lifetime are the same -- uses of the returned value determine where the inputs remain borrowed. The compiler does not look at scopes[11] around the call site and then assign an input lifetime based on the parameters. The compiler computes the lifetimes as part of figuring out where things are borrowed, and then checks all uses[12] for conflicts with active borrows.
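A short sketch of that point, reusing longest from the Book. The function borrows_end_after_last_use is a made-up name. Both inputs stay borrowed exactly as long as the result is used, even though all three variables share one scope:

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn borrows_end_after_last_use() -> (usize, String) {
    let mut a = String::from("long string");
    let mut b = String::from("short");

    let result = longest(&a, &b);
    // Mutating `a` or `b` here would be rejected: `result` is used below.
    let len = result.len(); // last use of `result`

    // Both borrows are over now, although `a`, `b` and `result`
    // are all still in scope.
    a.push('!');
    b.push('?');
    (len, a)
}

fn main() {
    println!("{:?}", borrows_end_after_last_use());
}
```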

So the Book's explanation is roughly the opposite of how the borrow checker actually works for this example.

Now let's look at the Nomicon example:

#[derive(Debug)]
struct Foo;

impl Foo {
    fn mutate_and_share(&mut self) -> &Self { &*self }
    fn share(&self) {}
}

fn main() {
    let mut foo = Foo;
    let loan = foo.mutate_and_share(); // M2
    foo.share();                       // M3
    println!("{:?}", loan);            // M4
}

Here we have a similar signature, and a borrow check error that has nothing to do with lexical scopes (it happens on M3).

The same pattern applies: Using the returned value keeps the input borrow alive. In this case, the lifetime in the type of loan must be alive on M3 and M4 due to the use on M4, which keeps the exclusive borrow needed to call the method on M2 alive as well. So foo is exclusively borrowed on M3, which conflicts with taking a fresh shared reference to call .share().
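As a sketch of the same rule from the other direction: if the last use of loan is moved before the call to share, the exclusive borrow is over by then and the program compiles. The function name reordered is hypothetical:

```rust
#[derive(Debug)]
struct Foo;

impl Foo {
    fn mutate_and_share(&mut self) -> &Self {
        &*self
    }
    fn share(&self) {}
}

fn reordered() -> String {
    let mut foo = Foo;
    let loan = foo.mutate_and_share();
    // Last use of `loan`: the exclusive borrow of `foo` can end after this line.
    let shown = format!("{:?}", loan);
    // No conflict any more, because nothing uses `loan` from here on.
    foo.share();
    shown
}

fn main() {
    println!("{}", reordered());
}
```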

The nomicon says:

This program is clearly correct according to the reference semantics we actually care about, but the lifetime system is too coarse-grained to handle that.

To me this reads like we could make the example compile some day by refining the borrow checker, but that's not true without also changing the function signature. We need a new kind of function signature in order to allow the example to compile, because the "returned value keeps input exclusively borrowed" semantics can be relied upon for soundness.

Regarding the fn from<'b, 'a: 'b>(val_ref: &'a mut T) -> OutRef<'b, T> variant: because &'a mut T is covariant in 'a, adding 'b doesn't really add anything valuable to this function. Without 'b, the input parameter at the call site (with lifetime 'input) will just be coerced or reborrowed for however long the use of the return value requires ('a such that 'input: 'a).

The extra parameter technically allows you to force the borrow in the input to be longer than required by the use of the return value, but you generally don't want that anyway.[13] It doesn't allow it to be shorter.

In terms of determining the borrow duration of the input parameter at the call site, it's just adding one more intermediate lifetime to the chain of constraints.

`'a` and `'b` such that 'input: 'a and 'a: 'b

The use of the return value still keeps the input borrow active.
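A rough sketch comparing the two signatures with a safe, hypothetical Wrapper type (from_one, from_two and compare_signatures are also made-up names). From the caller's side both behave the same: the input stays borrowed until the last use of the returned value, and no longer:

```rust
struct Wrapper<'a>(&'a mut i32);

impl<'a> Wrapper<'a> {
    fn bump(self) {
        let Wrapper(inner) = self;
        *inner += 1;
    }
}

// One lifetime shared by input and output.
fn from_one<'a>(r: &'a mut i32) -> Wrapper<'a> {
    Wrapper(r)
}

// Extra lifetime `'b` with `'a: 'b`; the input is coerced down to `'b`.
fn from_two<'b, 'a: 'b>(r: &'a mut i32) -> Wrapper<'b> {
    Wrapper(r)
}

fn compare_signatures() -> (i32, i32) {
    let mut x = 1;

    let w = from_one(&mut x);
    w.bump(); // last use of `w`
    let after_one = x; // `x` is usable again

    let w = from_two(&mut x);
    w.bump(); // last use of `w`
    let after_two = x; // same behaviour as with `from_one`

    (after_one, after_two)
}

fn main() {
    println!("{:?}", compare_signatures());
}
```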


  1. or moved, or have a &mut taken to it ↩︎

  2. Most non-trivial destructors are also considered to observe lifetimes in their types, but that doesn't apply to the examples in this topic so far. ↩︎

  3. Although we don't tend to think about it, this happens all the time -- e.g. practically every time you take a reference as a function argument. Or for a simpler, concrete example: You can put a &'static str into a local variable. ↩︎

  4. it doesn't observe its referent, it doesn't keep the lifetime alive, and it doesn't have a destructor ↩︎

  5. There's even more nuance than that, but this is a good starting point. ↩︎

  6. with the same lifetime in the input and output positions ↩︎

  7. You can coerce a &'long mut _ to a &'short mut _, but not vice-versa. ↩︎

  8. This happens to be at the end of a block, but that's irrelevant -- you could add some statements not involving at_s before the end of the block and the lifetimes would end in the middle of the block. ↩︎

  9. such as the Book ↩︎

  10. There are some links to some other resources at the end of this post. ↩︎

  11. or any other uses ↩︎

  12. including but not limited to going out of scope ↩︎

  13. It also won't happen in practice unless you force it somehow, like by using turbofish. ↩︎

Brilliant answer, especially this part: "In practice, this means that every use of the returned string slice will keep the borrows of the input string slices alive. That is, the use of the return value determines the duration of the borrows of the inputs, and not vice-versa!" I think the answer might be found in RFC 2094.