Is `&'a Foo<'a>` a footgun?

(I feel like this sort of thing has been discussed a bunch on this forum and elsewhere, but I'm having trouble figuring out what exactly to search for. Forgive me if you're sick of this topic.)

I have a struct Foo<'a> that contains references to some data. I would like to store an instance of Foo inside a struct Bar<'b> (which already contains some references to other data) without cloning Foo or giving ownership to Bar. If I write:

struct Bar<'a> {
    some_ref: &'a Quux,
    foo: &'a Foo<'a>,
}

will that come back to bite me? Should I change Bar to use two different lifetimes, one for the references and one for the Foo parameter? I know there are gotchas when using the same lifetime twice in &'a Foo<'a>, but that might only be a true problem for mutable refs, not for shared refs. Does it matter exactly what I'm doing with foo outside of Bar?

Bonus wrinkle: One of the fields of Foo<'a> is a reference to a field in Quux. Does that change anything?

Links to references/blog posts/etc. about this sort of thing are welcome.

I wouldn't call it a footgun; it's just incorrect, like writing 1 / 0. All your references should have different lifetimes unless they have a reason to be the same. It sounds like the 'a in &'a Quux and the 'a in Foo<'a> really are the same, so your struct should look like:

struct Bar<'a, 'b> {
    some_ref: &'a Quux,
    foo: &'b Foo<'a>,
}
3 Likes

From a technical perspective, all of the potential issues arise from incompatible variance of the two lifetime positions that share the same name.

2 Likes

It's a footgun if the inner lifetime is invariant.

Otherwise covariance is often flexible enough that it's ok.

Probably not, since Quux is also borrowed and not owned by the struct.

9 Likes

I think it can be a problem even with covariance in niche circumstances, but I can't recall the exact scenario in which I encountered it, so I can't easily back that up... probably I could make up something very contrived, heh.

Anyway I agree it's rare at most.

&'a Foo<'a> is completely fine to use provided that:

  1. Foo<'a> is covariant in 'a.
  2. You do not intend to obtain a longer-lived reference from the Foo than you can with this definition.

To clarify the second point, consider these two possible structs:

struct One<'slice> {
    foo: &'slice [&'slice str],
}

struct Two<'slice, 'strings> {
    foo: &'slice [&'strings str],
}

If you have Two, you can copy out &'strings str references from it. If you have One, you cannot do that — you can only copy out &'slice strs, which are valid only as long as the whole slice reference is.

So, in your original case, if Foo<'a> has no public API by which you can obtain &'a SomethingElse or SomethingElse<'a>, or if Bar never uses Foo in that way, then there is no disadvantage to using &'a Foo<'a>.

You only need two lifetime parameters if you intend to make use of the difference between them, or if there is invariance.

There are also cases where you cannot use two lifetime parameters; in particular, constructing a linked list that points up the stack (or uses an arena, or is in any other way made of references):

struct Node {
    value: String,
    children: Vec<Node>,
}

struct Context<'a> {
    value: &'a str,
    parent: Option<&'a Context<'a>>,
}

fn walk(node: &Node, ctx: Option<&Context<'_>>) {
    let ctx = Context {
        value: &node.value,
        parent: ctx,
    };
    for child in &node.children {
        walk(child, Some(&ctx));
    }
}

If you try to make this code have Context<'a, 'b> (or try to use &mut Context references), you'll find you need Context<'a, 'b, 'c, ...> and so on to infinity — but it works fine with the single lifetime.

5 Likes

If you're using an arena with mutually recursive types, &'a Foo<'a> is exactly what you want, e.g.:

4 Likes

I'm trying to think of a simple way to understand and explain this, without bringing in lifetime variance. And I love these two examples of when it is necessary to use a single lifetime:

I can think of these cases as a single case defined by the fact that the outer and inner refs both refer to the same arena-like memory pool.

Since an arena is freed all-at-once, it seems obvious even to my simple mind that the inner and outer lifetimes must be the same. So I like this as a way to explain it.

Refs on the stack are a little different because the stack shrinks as well as grows. However, although it is a bit of a stretch, I can think of the stack as an arena when all refs and values of a data structure are on the stack. At any point in time, all existing refs are in the stack's arena, as are all the values they refer to.

So if that is the only use case where the inner and outer lifetimes must be the same, then a simple rule would be to make them the same only for that use case. Then I don't need to think about lifetime variance.

I can't think of other use cases where the lifetimes must be the same.

What do you think?

  • Am I thinking correctly?
  • Is there any other use case where a single lifetime must be used? (I realize there are other cases where a single lifetime can optionally be used.)

I think this is not a good way to think about it. You cannot use a single lifetime to describe different stack frames in this fashion, because then a reference into a shorter-lived frame could be carried up and outlive that frame.

But, in fact, my Context<'a> example is not using a single lifetime; it's using a single lifetime parameter. Each time Context<'a> is constructed in a more-nested, shorter-lived frame, the lifetime of that particular Context’s borrows is shorter. When we initialize the parent field, the &Context reference given by its parent is coerced to an &'a Context<'a> reference, where 'a is a new lifetime that is no longer than the shortest of the 3 elided lifetimes in the input (which, in this case, will always be the &Context reference, but that doesn't matter at all because all 3 are guaranteed to outlive the stack frame for that call).

So, covariance is (in my opinion) actually critical to a proper understanding of what is happening here. Each time you construct a Context, you're taking some references and considering their lifetimes to be shorter than they could be, which is fine (because the code doesn't pass the references out of that stack frame) and necessary (because we can't keep track of the unbounded list of parent lifetimes with a finite set of lifetime parameters). And shortening those lifetimes is valid because, and only because, Context and & are both covariant in their lifetime parameters.

4 Likes

You're right. Thanks for explaining!

Regarding “think of the stack as an arena”, I think the general model — of what you can do with lifetimes that the borrow checker understands natively, that is — actually has to be “stack of arenas”. That is,

  • At any time you can allocate more data and borrow it with some lifetime 'a, as long as deallocation of it will happen after 'a ends; that is, if you have “an arena” with lifetime 'a, you can always validly put more stuff in it to borrow, if you have a means to do that. (The simple case with ordinary stack variables is that you don't have the ability to allocate more stuff with that lifetime, because the allocations were either chosen statically (simple let bindings), or the place the allocations go is not interior-mutable, like a reference to an element of a Vec prevents Vec::push().)
  • if you have a borrow of some data, you can always interpret it as a shorter-lived borrow. This is equivalent, in our everything-is-arenas model, to pretending the data was allocated in a more-nested, shorter-lived arena than it is — which is valid because the whole idea of the arena model is that you don't track deallocation in detail, just be sure to stop soon enough.

And I think that captures a lot of the character of what the borrow checker actually checks. If it makes sense.

1 Like

&'a Foo<'a> itself isn't always wrong, but such types are very often a symptom of novice users trying to use references without understanding ownership and borrowing.

If your Quux data wasn't behind a reference in the same struct, you would have a self-referential struct, the lifetimes would be wrong, and the code using this struct would need to be rearchitected.

If any of the references involved were mutable, the variance and exclusivity restrictions could easily make this struct impossible to use.

This design forces you to have some other storage for Foo and Quux objects. If you already have built data structures that own all of them, or you're using an arena, then this design may be fine. But very often novice users don't understand that structs with lifetimes are temporary views that can't store data, and get completely stuck when they need to create a new instance of Foo, and try to lend a local variable.

So I'd say &'a Foo<'a> is not a footgun if you know what you're doing, and fully understand why you need this and not &'b Foo<'c>, &'a Foo<'static>, or Arc<Foo>.

But if you've just kept adding &'a because the compiler complained about lifetimes, you're in big trouble, and loading a double-barrel footbazooka.

7 Likes

Exactly. &'a Foo<'a> should not bite you, but &'a mut Foo<'a> will sometimes bite you with incomprehensible error messages, because 'a becomes invariant.

This might be a non sequitur to the OP's thread; however, I have to admit that this is really the only time I ever add 'a to a type's signature (when the compiler complains). The reason is that, in my estimation after having used Rust for several years now, it's still entirely too complicated for me to grasp intuitively. The idea of covariant fields in a structure still makes no sense to me after all this time.

(But this hasn't stopped me from making very productive use of Rust, for fun and for profit.)

Where can one go to get an intuitive understanding of lifetimes? The Rust book is right out; I've read that thing cover to cover and still don't get it. I've watched countless YouTube videos on the subject; the one that makes the most sense to me is this one, but even there I still have to think long and hard for things to make sense, and only in some cases. I've read blog post after blog post. And they all seem to say the same things, in the same order, without any actual clarification.

So, in retrospect, I just gave up and have relied heavily on the Rust compiler's elision support. Am I writing ideal code? Far from it, and I know it. But the code still works, and I'm able to get things done regardless. The foot-bazooka is powerful enough to pull off some spectacular rocket-jumps if you're patient enough with it.

3 Likes

Yes

And WTF does covariant mean in this context?

1 Like

They mean variance of lifetime parameters, since only lifetimes in Rust have subtyping.

I wish I had a great answer for this question. In an ideal world, it would be a short article that gives an excellent introduction and quickly brings you to an "aha!" moment. Whatever that is, it sounds awesome.

What I can say briefly that might be helpful about lifetime annotations (the little 'a things) is that they are just names, in the same way that you name variables and functions. These annotations refer to something else, something temporal in nature: a region of use. It specifies how long a loan can (or must) last, depending on whether it is used as a disambiguation or a constraint.

The way I personally rationalize lifetimes is by thinking of them as stack frames. Lower frames (parents) can be referenced by higher frames (children), but not the opposite [1]. The intuition here is that a parent can't (normally) reference something owned by a child because the child would have to return control flow to the parent. For instance, with the return keyword, destroying its own stack frame and everything in it.

Even more importantly, things created on the heap are rooted to a stack frame somewhere [2]. Value liveness is based in part upon that fact. A lifetime is the duration of a reference, which must be no longer than the value's liveness, but may be (and usually is) much shorter.

This should help explain why lifetimes (except for 'static) are temporary. Even in the unusual case of &'a Foo<'a>, it's expected to be shorter than 'static [3]. That Foo is rooted to some stack frame, either directly in main() or in a function that it calls.

That's a quick and dirty intuitive description. But, you might have to reach your "aha!" moment spontaneously and organically, as most of us have.


On the topic of variance, I am very fond of this article: Variance in Rust: An intuitive explanation - Ehsan's Blog. It sidesteps the whole problem that most readers are typically expected to have a background in mathematics and formal logic. Instead, it presents the topic in a more casual, conversational manner. This is the style of article I would like to see applied specifically to Rust lifetimes.


  1. This is a bit of a fib for the sake of simplicity, please bear with me. A more accurate statement would add some exception like "... without going out of your way to contort control flow and either making the referent 'static or allowing the child to allocate within the parent's stack frame." ↩︎

  2. Except for 'static types: ref-counted values, thread-locals, static allocations, globals, leaked heap allocations, etc. ↩︎

  3. It is however not disallowed to be 'static. That's actually perfectly fine. A non-static lifetime is more flexible, because you can guarantee that the owner drops the value. ↩︎

2 Likes

Not really. Most tutorials avoid it like the plague – and for good reason, too!

In an ideal world the key phrase “this is math” wouldn't send people running for the hills. And if you are actually allowed to utter that phrase then everything is very simple and you don't even need that magic “aha” moment.

Alas, we don't live in such a world, and after many years in school, aversion to math runs so deep that the phrase “this is math” leads not to enlightenment and an “aha” moment but to an internal debate about whether it's appropriate to “run for the hills” pronto… or ex post pronto.

Writers of Rust tutorials rightfully[1] try to explain things without uttering that “awful” phrase, but it works approximately as well as trying to teach music without being able to utter the words “sound” or “pitch”.

Lifetimes were never problematic for me, and I suspect that anyone who claims they “couldn't understand” lifetimes is just looking at the absolutely clear and simple explanations while attempting to suppress and ignore them (as much as possible, anyway).

Ultimately the question that matters is not “what are lifetimes” but rather “what are lifetime annotations”.

And if you accept “this is math” thingie then they are not magical at all. They are similar to these α, β, γ, δ, ε marks that you saw in school when they tried to teach geometry to you (but managed only to cultivate [un]healthy loathing of that subject).

The Rust compiler includes a theorem prover that is there to ensure that your program is “correct” (for some definition of “correct”). After said theorem prover says “Ok, this program is correct, it can be turned into machine code”, the program is compiled while mostly[2] ignoring the lifetimes and lifetime annotations.

That's why “make the compiler happy by blindly following the instructions in the error messages” works so well: because lifetimes are, technically, not part of your program, but more of “an attached machine-checkable documentation” about your program, it doesn't actually matter[3] whether that description is correct or not. If the compiler is happy, then the code works and you are happy, too! Even if the annotations are all wrong.

And now that we know what lifetime markups are, it's easy to explain other things, too, including that dreaded covariance/contravariance/invariance thingie.

It's directly related to shared (meaning &T) and exclusive (meaning &mut T) borrows.

Because exclusive borrows are, well… exclusive, the compiler has to pick one specific lifetime for every &'a mut T variable in its automatically created proof of program correctness. The exact way it's done is complicated, but the idea is simple: we need to invent one lifetime for all cases where it's used.

That is known as “invariance”. Easy, right?

Because shared borrows are, well… shared, the compiler can be more relaxed and is allowed to “imagine” that every shared reference carries not one lifetime markup but many lifetime markups. So when you write &'b T there are lifetimes b₁, b₂, b₃, b₄, b₅, b₆, b… as many of them as needed: these are shared references, we can make as many of them as needed… so we may as well imagine that we are passing around not one reference with one lifetime but lots of references (all with identical bit patterns in memory) with many lifetimes.

Again: the exact method of picking these lifetimes can be complicated, but the important thing is that every shared reference carries many of them. As many as needed.

This gives us covariance and contravariance: that imaginary “endless pile of imaginary shared references” with different lifetimes becomes “a covariant type”, and functions that are ready to accept said “endless pile of imaginary shared references” with different lifetimes become “a contravariant type”. Not as easy as invariant types, but still… no mystery.

And that's it. Not much to talk about, really. If you are allowed to tell “how the math works” with lifetimes, then they're a very handy and easy-to-understand tool[4]… but in school most of us are just taught to remember lots of scary words without any understanding of the concepts behind those words… and as a result the “this is math” phrase is a big taboo for Rust tutorials, which makes some pieces of Rust very cryptic to explain.

But they are not intrinsically complex or scary; they are just made to seem that way by how we teach math in schools.


  1. If you don't believe that's the right approach, then think about the fate of Haskell, which openly embraces math. How much adoption has it gotten? ↩︎

  2. There are some HRTB-related corner cases where lifetimes can actually affect the generated code, but most users never hit them in practice. ↩︎

  3. Again, except for a few very-very rare corner cases. ↩︎

  4. The devil is in the details, as usual: we have told, so far, how many lifetimes each lifetime mark can carry, but said nothing about how exactly we pick them! That can be pretty non-trivial in some cases. ↩︎

1 Like

Variance itself is fairly intuitive once you have enough setup. You've probably even used it without any trouble at all in other languages without realizing it!

For that setup, let's pretend we have a Java like language with traditional class subtyping; and that we have the following types:

class Fruit {}
class Apple : Fruit {}
class Banana : Fruit {}

class Orchard<T : Fruit> {
    T pick();
}
class Salad<T : Fruit> {
    void add(T fruit);
}

Hopefully it's pretty clear that if you need an Orchard<Apple>, using an Orchard<Fruit> is not ok, because it might actually be an Orchard<Banana>, and pick would give code that expects an Apple a Banana. But, in contrast, if you need a Salad<Apple>, it is actually fine to give it a Salad<Fruit>, because it just means the code that expects a Fruit will only get Apples. Further, the reverse of each of these has the opposite result: if you need an Orchard<Fruit>, then an Orchard<Apple> is ok, but if you need a Salad<Fruit>, a Salad<Apple> is not ok, because someone could add a Banana to it.

This "can use instead of" is called a subtyping relationship: just like Apple is a subtype of Fruit because you can use it anywhere you need a Fruit but not vice versa, Salad<Fruit> is a subtype of Salad<Apple>, and Orchard<Apple> is a subtype of Orchard<Fruit>.

That's the setup, because (and this is the stumbling block of a lot of these explanations) so far nothing I've described has been variance (directly), only subtyping. Variance only applies when you start talking about the generic types, Orchard and Salad; specifically, when instantiated with a parameter, they have a subtyping relationship based on the subtyping relationship of the parameter. For example, an Orchard<X> is a subtype of Orchard<Y> if and only if X is a subtype of Y, while a Salad<X> is a subtype of Salad<Y> if and only if Y is a subtype of X. The latter situation, where the relative subtyping is reversed, is called "contravariance", where contra- means "against", while the former is "covariance", co- meaning "with".

This is all a bit heady, but you can really boil it down: some parameters are used for outputs, so the values are going in the same direction, which is covariance, while other parameters are used for inputs, so the values are going in opposite directions, which is contravariance. C# made the clever decision to let you explicitly annotate such parameters with the keywords out or in.

There are a couple more cases: when a value is both an input and an output, the parameter is "invariant", because the generic type is not assignable either way; and when a parameter isn't used at all, you can put whatever you like in there and it's still assignable either way, called "bivariant" (but that last one is often not permitted).

Bringing this back to Rust: we don't have variance in type parameters (because distinct types might be different sizes and we don't want to force going through a vtable), but we do have lifetime parameter variance, which basically works the same way as type parameters in other languages: outputs are covariant, inputs are contravariant. The thing to consider is that if the lifetime is used in a mutable reference, then it's both an input and an output, so it's invariant.

That's a lot, I know, but hopefully it breaks it down enough for it to "feel obvious" at each step?

2 Likes

Uhm… I would say that's another “a monad is a monoid in the category of endofunctors, what's the problem?” case.

If you already know how covariance and contravariance work in C#, and you also know how covariance and contravariance work in Rust, then your explanation would be very well understood, and it would explain why two things that are, at first glance, entirely different and not even remotely similar have the exact same name.

A bit similar to an attempt to “explain” to a middle-school kid why -4 divided by -2 is 2 (and not -2) by bringing in abelian groups and fields. Anyone who can understand such an explanation doesn't need it!

But yes, I have seen such things attempted in school, with the end result that nobody understands anything, but everyone remembers lots of names!

1 Like