Practical suggestions for building intuition around borrow errors
Ownership, borrowing, and lifetimes cover a lot of ground in Rust, so this ended up being quite long. It's also hard to be succinct because which aspects of the topic a newcomer first encounters depends on what projects they take on while learning Rust. You picked regexes and hit a lot of lifetime issues off the bat; someone else might choose a web framework with lots of Arc and Mutex, or async with its yield points, etc. (Even with how long this is, there are entire areas I didn't even touch on, like destructor mechanics, shared ownership and shared mutability, etc.)
So I'm not expecting anyone to absorb this all at once, but hope it's a useful, broad introduction to the topic. Skim it on the first pass and take time on the parts that seem relevant, or use them as pointers to look up more in-depth documentation, perhaps. In general, your mental model of ownership and borrowing will probably go through a few stages of evolution, and I feel it pays to return to any given area of the topic with fresh eyes every once in a while.
1. Keep at it and participate
As others have pointed out, experience plays a big role, and it will take some time to build the intuition. This thread, and asking questions here in general, is part of that process. Another thing you can do is to read others' lifetime questions on this forum, read the replies in those threads, and try playing with the problem yourself.
- When lost: read the replies and see if you can understand them / learn more
- When you played with it and got it to compile but aren't sure why: "I don't know if this fixes your use case, but this compiles for me: Playground"
- After you've got the hang of it: "It's because XYZ, but you can do this instead"
- Etc
Even after I got my sea-legs, some of the more advanced lifetime/borrow-checker skills I've developed came from solving other people's problems on this forum. Once you can answer others' questions, you may still learn things if someone else comes along and provides a different or better solution.
2. Understand elision and get a feel for when to name lifetimes
Use #![deny(elided_lifetimes_in_paths)] to help avoid invisible borrowing.
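As a rough illustration of what "invisible borrowing" means, here is a small sketch. Wrapper and wrap are hypothetical names, and the lint is applied as an item-level attribute only to keep the snippet self-contained; in a real crate you'd put the #![deny(...)] form at the top of the crate root.

```rust
// `Wrapper` and `wrap` are made-up names for illustration.
struct Wrapper<'a>(&'a str);

// Item-level form of the lint, used here so the snippet stands alone.
#[deny(elided_lifetimes_in_paths)]
fn wrap(s: &str) -> Wrapper<'_> {
    // Writing `-> Wrapper` instead would trip the lint: nothing in that
    // signature would show that the return value borrows from `s`.
    Wrapper(s)
}
```

With the '_ written out, a reader can see from the signature alone that the result keeps s borrowed.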
Read the function lifetime elision rules. They're intuitive for the most part. The elision rules are built around being what you need in the common case, but they're not always the solution, like this thread has illustrated.
(Namely, they assume an input of &'s self means you meant to return a borrow of self with lifetime 's, versus returning a copy of an inner reference with some longer lifetime.)
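Here's a minimal sketch of that difference. Holder, short, long, and the two free functions are hypothetical names, not anything from this thread.

```rust
// Hypothetical type holding a borrowed string.
struct Holder<'a> {
    inner: &'a str,
}

impl<'a> Holder<'a> {
    // Elided form; it means `fn short<'s>(&'s self) -> &'s str`.
    fn short(&self) -> &str {
        self.inner
    }

    // Named form: hands out a copy of the inner reference with its full lifetime.
    fn long(&self) -> &'a str {
        self.inner
    }
}

fn through_temporary(text: &str) -> &str {
    let holder = Holder { inner: text };
    // `holder.short()` would be rejected here, because the result would be
    // tied to the local `holder`. `long` only depends on `text`.
    holder.long()
}

fn short_len(text: &str) -> usize {
    let holder = Holder { inner: text };
    // Fine: the borrow of `holder` ends before `holder` goes out of scope.
    holder.short().len()
}
```

Both methods have the same body; only the signature decides how long the caller's borrow of the Holder has to last.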
If you get a lifetime error involving elided lifetimes, try giving all the lifetimes names. Refer to the elision rules so you're careful not to change the meaning of the signature while you're doing so.
If you end up with a struct that has more than one lifetime... first pause and reconsider, there's a good chance you're overusing borrowing and/or getting into deep water. But if it is actually needed (or you're just experimenting), give them semantically relevant names and use those names consistently.
3. Get a feel for variance, references, and reborrows
Here are some official docs on the topic of variance, but reading them may make you go cross-eyed. Let me try to instead introduce some basic rules in a few layers. If it still makes you cross-eyed, just skim or skip ahead.
3a. A side-note on syntax and implied "outlives" bounds
A 'a: 'b bound means, roughly speaking, 'long: 'short. It's often read as "'a outlives 'b". I also like to read it as "'a is valid for (at least) 'b".
When you have a function argument with a nested reference like &'b Foo<'a>, a 'a: 'b bound is inferred.
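A small sketch of both points, with an explicit bound in one function and the implied bound in the other. Foo here is a stand-in struct, and shorten and peek are hypothetical names.

```rust
// Stand-in for the `Foo<'a>` mentioned above.
struct Foo<'a>(&'a str);

// Explicit bound: `'long: 'short` is what lets the longer borrow be
// returned where the shorter one is promised.
fn shorten<'long: 'short, 'short>(s: &'long str) -> &'short str {
    s
}

// No bound written. `&'b Foo<'a>` is only well-formed if `'a: 'b`, so the
// compiler assumes it, and that is what allows the inner `&'a str` to be
// returned as a `&'b str`.
fn peek<'a, 'b>(foo: &'b Foo<'a>) -> &'b str {
    foo.0
}
```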
3b. A high-level analogy
Some find it helpful to think of shared (&T) and exclusive (&mut T) references like so:
- &T is a compiler-checked RwLockReadGuard
- &mut T is a compiler-checked RwLockWriteGuard
You can have a lot of the former, but only one of the latter, at any given time. The exclusivity is key.
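A quick sketch of the analogy in action (readers_then_writer is a made-up name):

```rust
fn readers_then_writer() -> String {
    let mut text = String::from("a");

    // Any number of shared borrows may coexist.
    let r1 = &text;
    let r2 = &text;
    let total = r1.len() + r2.len();

    // An exclusive borrow is allowed once the shared ones are no longer used.
    let w = &mut text;
    w.push('b');
    // Using `r1` or `r2` at this point would be rejected by the borrow checker.

    format!("{text}{total}")
}
```

Unlike a real RwLock, nothing blocks at run time; a conflicting use is a compile error instead.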
3c. The lifetimes of references
- A &'long T coerces to a &'short T
- A &'long mut T coerces to a &'short mut T
The technical term is "covariant (in the lifetime)" but a practical mental model is "the (outer) lifetime of references can shrink".
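Here's a minimal sketch of the outer lifetime shrinking, for both kinds of reference. The function names are hypothetical.

```rust
fn longer<'short>(a: &'short str, b: &'short str) -> &'short str {
    if a.len() >= b.len() { a } else { b }
}

fn shrink_shared() -> usize {
    let long: &'static str = "static";
    let local = String::from("a local string");
    // `long` is a `&'static str`, but it is accepted where a borrow no
    // longer than the borrow of `local` is expected.
    longer(long, &local).len()
}

// The same shrinking works for exclusive references.
fn shrink_exclusive<'long: 'short, 'short>(r: &'long mut i32) -> &'short mut i32 {
    r
}
```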
3d. Copy and reborrows
Shared references (&T) implement Copy, which makes them very flexible. Once you have one, you can have as many as you want; once you've exposed one, you can't keep track of how many there are.
Exclusive references (&mut T) do not implement Copy. Instead, you can use them ergonomically through a mechanism called reborrowing. For example here:
fn foo<'v>(v: &'v mut Vec<i32>) {
    v.push(0); // line 1
    println!("{v:?}"); // line 2
}
You're not moving v: &mut _ when you pass it to push on line 1, or you couldn't print it on line 2. But you're not copying it either, because &mut _ does not implement Copy. Instead *v is reborrowed for some shorter lifetime than 'v, which ends on line 1. An explicit reborrow would look like this:
Vec::push(&mut *v, 0);
v can't be used while the reborrow &mut *v exists, but after it "expires", you can use v again.
Though tragically underdocumented, reborrowing is what makes &mut usable; there's a lot of implicit reborrowing in Rust. Reborrowing makes &mut T act like the Copy-able &T in some ways. But the necessity that &mut T is exclusive while it exists leads to it being much less flexible. It's also a large topic on its own so I'll stop here.
3e. Nested borrows and the dreaded invariance
Now let's consider nested references:
- A &'medium &'long U coerces to a &'short &'short U
- A &'medium mut &'long mut U coerces to a &'short mut &'long mut U...
  - ...but not to a &'short mut &'short mut U
We say that &mut T is invariant in T, which means any lifetimes in T cannot change (grow or shrink) at all. In the example, T is &'long mut U, and the 'long cannot be changed.
Why not? Consider this:
fn bar(v: &mut Vec<&'static str>) {
    let w: &mut Vec<&'_ str> = v; // call the lifetime 'w
    let local = "Gottem".to_string();
    w.push(&*local);
} // `local` drops
If 'w were allowed to be shorter than 'static, we'd end up with a dangling reference in *v after bar returns.
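The coercions that are allowed can be written out as functions. This is only a sketch with hypothetical names; the rejected case is left as a comment.

```rust
// Shared references: both the outer and the inner lifetime may shrink.
fn shrink_both<'short, 'medium: 'short, 'long: 'medium>(
    r: &'medium &'long str,
) -> &'short &'short str {
    r
}

// Exclusive references: only the outer lifetime may shrink.
fn shrink_outer<'short, 'medium: 'short, 'long: 'medium>(
    r: &'medium mut &'long mut i32,
) -> &'short mut &'long mut i32 {
    r
}

// Rejected, because `&mut T` is invariant in `T`:
//
// fn shrink_inner<'short, 'long: 'short>(
//     r: &'short mut &'long mut i32,
// ) -> &'short mut &'short mut i32 {
//     r
// }
```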
I called invariance "dreaded" because you will inevitably end up with a feel for covariance from using references with their outer lifetimes, but eventually hit a use case where invariance matters and causes some borrow check errors, because it's (necessarily) so much less flexible. It's just part of the Rust learning experience.
Let's look at one more property of nested references:
- You can get a &'long U from a &'short &'long U
- You cannot get a &'long mut U from a &'short mut &'long mut U
  - You can only reborrow a &'short mut U
(The reason is again to prevent memory unsafety.)
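In code, that property looks roughly like this (function names are made up):

```rust
// The inner shared reference is `Copy`, so it can be copied out with its
// full lifetime.
fn copy_out<'short, 'long>(r: &'short &'long str) -> &'long str {
    *r
}

// Through an exclusive outer reference, the best you can do is reborrow
// for the outer (shorter) lifetime.
fn reborrow_only<'short, 'long>(r: &'short mut &'long mut i32) -> &'short mut i32 {
    &mut **r
}

// Rejected: this would hand out a second `&'long mut i32` while the
// original one still exists.
//
// fn escape<'short, 'long>(r: &'short mut &'long mut i32) -> &'long mut i32 {
//     &mut **r
// }
```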
3f. Invariance elsewhere you may run into
Cell<T> and RefCell<T> are also invariant in T.
Trait parameters are invariant too. As a result, lifetime-parameterized traits can be onerous to work with. And if you have a bound like T: Trait<U>, U becomes invariant even though it's a type parameter to the trait.
GAT parameters are also invariant.
4. Get a feel for borrow-returning methods
4a. Get a feel for when not to name lifetimes
Sometimes newcomers try to solve borrow check errors by making things more generic, which often involves adding lifetimes and naming previously-elided lifetimes:
fn quz<'a: 'b, 'b>(&'a mut self) -> &'b str { /* ... */ }
But this doesn't actually permit more lifetimes than this:
fn quz<'b>(&'b mut self) -> &'b str { /* ... */ }
Because in the first case, &'a mut self can coerce to &'b mut self. And, in fact, you want it to, because you generally don't want to exclusively borrow self longer than necessary. So instead you can stick with:
fn quz(&mut self) -> &str { /* ... */ }
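To see that the extra lifetime buys nothing, here is a sketch that puts both signatures on a hypothetical Buf type and calls them the same way:

```rust
// Hypothetical type for illustration.
struct Buf {
    text: String,
}

impl Buf {
    // Over-generic version with two named lifetimes.
    fn named<'a: 'b, 'b>(&'a mut self) -> &'b str {
        self.text.push('!');
        &self.text
    }

    // Elided version; callers can use it in exactly the same places.
    fn elided(&mut self) -> &str {
        self.text.push('?');
        &self.text
    }
}

fn call_both() -> String {
    let mut buf = Buf { text: String::from("hi") };
    let first = buf.named().to_owned();
    let second = buf.elided().to_owned();
    format!("{first} {second}")
}
```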
4b. Bound-related lifetimes "infect" each other
Separating 'a and 'b above didn't make things any more flexible in terms of self being borrowed. Once you declare a bound like 'a: 'b, then the two lifetimes "infect" each other. Even though the return has a different lifetime than the input, it's still effectively a reborrow of the input.
(This can actually happen between two input parameters too: if you've stated a lifetime relationship between two borrows, the compiler assumes they can observe each other in some sense. It's probably not anything you'll run into soon though. The compiler errors read something like "data flows from X into Y".)
4c. &mut inputs don't "downgrade" to &
Still talking about this signature:
fn quz(&mut self) -> &str { /* ... */ }
Newcomers often expect self to only be shared-borrowed after quz returns, because the return is a shared reference. But that's not how things work; self remains exclusively borrowed for as long as the returned &str is valid.
I find looking at the exact return type a trap when trying to build a mental model for this pattern. The fact that the lifetimes are connected is crucial, but beyond that, instead focus on the input parameter: You cannot call the method until you have created a &mut self with a lifetime as long as the return type has. Once that exclusive borrow (or reborrow) is created, the exclusiveness lasts for the entirety of the lifetime. Moreover, you give the &mut self away by calling the method, so you can't create any other reborrows of self except through whatever the method returns to you.
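A short sketch of what that looks like from the caller's side, using a hypothetical Counter type:

```rust
// Hypothetical type for illustration.
struct Counter {
    label: String,
    hits: u32,
}

impl Counter {
    // Takes `&mut self`, returns a shared `&str` tied to the same lifetime.
    fn bump(&mut self) -> &str {
        self.hits += 1;
        &self.label
    }
}

fn use_counter() -> u32 {
    let mut counter = Counter { label: String::from("ctr"), hits: 0 };
    let label = counter.bump();
    // Even just reading `counter.hits` here would be rejected: `counter` is
    // still exclusively borrowed because `label` is used below.
    let len = label.len() as u32;
    // `label` is no longer used, so `counter` is accessible again.
    counter.hits + len
}
```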
5. Understand function lifetime parameters
First, note that lifetimes in function signatures are invisible lifetime parameters on the function.
fn zed(s: &str) {}
// same thing
fn zed<'s>(s: &'s str) {}
When you have a lifetime parameter like this, the caller chooses the lifetime. But the body of your function is opaque to the caller: the shortest lifetime they can choose is one just longer than your function body.
So when you have a lifetime parameter on your function (without any further bounds), the only things you know are
- It's longer than your function body
- You don't get to pick it, it could be arbitrarily longer (even 'static)
- But it could be arbitrarily shorter than 'static too, you have to support both cases
And the main corollaries are
- You can't borrow locals for a caller-chosen lifetime
- You can't extend a caller-chosen lifetime to some other named lifetime in scope
- Unless there's some other outlives bound that makes it possible
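Here's a sketch of those corollaries. The functions are hypothetical, and the two rejected ones are left as comments.

```rust
// Fine: the returned borrow is derived from the caller's own borrow.
fn first_word<'s>(s: &'s str) -> &'s str {
    s.split_whitespace().next().unwrap_or("")
}

// Rejected: a local can't be borrowed for a caller-chosen lifetime,
// since every such lifetime is longer than the function body.
//
// fn borrow_local<'s>() -> &'s str {
//     let local = String::from("x");
//     &local
// }

// Rejected: nothing says `'a` lasts as long as `'b`.
//
// fn extend<'a, 'b>(s: &'a str) -> &'b str {
//     s
// }

// Accepted once an outlives bound makes it possible.
fn extend_ok<'a: 'b, 'b>(s: &'a str) -> &'b str {
    s
}
```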
6. Learn some pitfalls and antipatterns
6a. Common Misconceptions
Read this, skipping or skimming the parts that don't make sense yet. Return to it occasionally.
6b. dyn Trait lifetimes and Box<dyn Trait>
Every trait object (dyn Trait) has an elide-able lifetime with its own defaults when completely elided. The most common way to run into a lifetime error about this is with Box<dyn Trait> in your function signatures, structs, and type aliases, where it means Box<dyn Trait + 'static>.
Often this means non-'static references/types aren't allowed in that context, but sometimes it means that you should add an explicit lifetime, like Box<dyn Trait + 'a> or Box<dyn Trait + '_>. (The latter will act like "normal" elision in function signatures and the like.)
Short example.
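As a self-contained sketch of the same idea (Shape, Label, and boxed are hypothetical names, not taken from the linked example):

```rust
trait Shape {
    fn name(&self) -> String;
}

// A type that borrows, so it is not `'static` in general.
struct Label<'a>(&'a str);

impl Shape for Label<'_> {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

// Rejected: `Box<dyn Shape>` here means `Box<dyn Shape + 'static>`,
// but `Label` borrows from `s`.
//
// fn boxed(s: &str) -> Box<dyn Shape> {
//     Box::new(Label(s))
// }

// Accepted: `'_` ties the trait object's lifetime to the borrow of `s`.
fn boxed(s: &str) -> Box<dyn Shape + '_> {
    Box::new(Label(s))
}
```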
6c. Conditional return of a borrow
The compiler isn't perfect, and there are some things it doesn't yet accept which are in fact sound and could be accepted. Perhaps the most common one to trip on is conditional return of a borrow, aka NLL Problem Case #3. There are some examples and workarounds in the issue and related issues.
The plan is still to accept that pattern some day.
If you run into something and don't understand why it's an error / think it should be allowed, try asking in a forum post.
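For reference, here is a sketch of the usual shape of a conditional return of a borrow, along with one possible workaround. get_or_default is a hypothetical function, and the map types are arbitrary choices for the example.

```rust
use std::collections::HashMap;

// The natural way to write it, which the current borrow checker rejects:
//
// fn get_or_default(map: &mut HashMap<u32, String>, key: u32) -> &mut String {
//     match map.get_mut(&key) {
//         Some(value) => value,
//         None => {
//             map.insert(key, String::new());
//             map.get_mut(&key).unwrap()
//         }
//     }
// }

// One workaround: keep the check separate from the borrow that is returned,
// at the cost of an extra lookup.
fn get_or_default(map: &mut HashMap<u32, String>, key: u32) -> &mut String {
    if !map.contains_key(&key) {
        map.insert(key, String::new());
    }
    map.get_mut(&key).expect("key was just checked or inserted")
}
```

For maps specifically, the entry API (map.entry(key).or_default()) avoids both the error and the double lookup; the workaround above is just the general pattern.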
6d. Avoid self-referential structs
By self-referential, I mean you have one field that is a reference, and that reference points to another field (or contents of a field) in the same struct.
struct Snek<'a> {
    owned: String,
    // Like if you want this to point to the `owned` field
    borrowed: &'a str,
}
The only safe way to construct this to be self-referential is to take a &'a mut Snek<'a>, get a &'a str to the owned field, and assign it to the borrowed field.
impl<'a> Snek<'a> {
    fn bite(&'a mut self) {
        self.borrowed = &self.owned;
    }
}
And as I believe was covered earlier in this thread, that's an anti-pattern because once you create a &'a mut Thing<'a>, you can never directly use the Thing<'a> again. You can't call methods on it, you can't move it, and if you have a non-trivial destructor, you can't (safely) make the code compile at all. The only way to use it at all is through a (reborrowed) return value from the method call that required &'a mut self.
So it's technically possible, but so restrictive it's pretty much always useless.
Trying to create self-referential structs is a common newcomer misstep, and you may see the response to questions about them in the approximated form of "you can't do that in safe Rust".
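To make "so restrictive it's pretty much always useless" concrete, here's a sketch that follows the Snek example but has bite return the borrow, which is my own variation so that there's something left to use afterwards. use_snek is a hypothetical caller.

```rust
struct Snek<'a> {
    owned: String,
    borrowed: &'a str,
}

impl<'a> Snek<'a> {
    // Variation on `bite` that also returns the self-referential borrow.
    fn bite(&'a mut self) -> &'a str {
        self.borrowed = &self.owned;
        self.borrowed
    }
}

fn use_snek() -> usize {
    let mut snek = Snek { owned: String::from("hiss"), borrowed: "" };
    let s = snek.bite();
    // From here on `snek` is exclusively borrowed for the rest of its
    // existence: `snek.owned.len()`, `snek.bite()`, or moving `snek`
    // would all be rejected. Only the returned value is usable.
    s.len()
}
```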
6e. &'a mut self and Self aliasing more generally
This thread has already covered how &'a mut Thing<'a> is an anti-pattern which is often disguised in the form of &'a mut self.
More generally, self types and the Self alias include any parameters on the type constructor post-resolution. Which means here:
impl<'a> Node<'a> {
    fn new(s: &str) -> Self {
        Node(s)
    }
}
Self is an alias for Node<'a>. It is not an alias for Node<'_>. So it means:
fn new<'s>(s: &'s str) -> Node<'a> {
And not:
fn new<'s>(s: &'s str) -> Node<'s> {
And you really want one of these:
fn new(s: &'a str) -> Self {
fn new(s: &str) -> Node<'_> {
Similarly, using Self as a constructor will use the resolved type parameters. So this won't work:
fn new(s: &str) -> Node<'_> {
    Self(s)
}
You need
fn new(s: &str) -> Node<'_> {
    Node(s)
}
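Putting the two working variants into one compilable sketch (the method names with_named and with_elided are made up so both can live in the same impl block):

```rust
struct Node<'a>(&'a str);

impl<'a> Node<'a> {
    // The input uses the impl's `'a`, so `Self` (that is, `Node<'a>`) fits,
    // both as the return type and as the constructor.
    fn with_named(s: &'a str) -> Self {
        Self(s)
    }

    // The input has its own elided lifetime, so the return type and the
    // constructor are spelled `Node` rather than `Self`.
    fn with_elided(s: &str) -> Node<'_> {
        Node(s)
    }
}
```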
7. Scrutinize compiler advice
The compiler gives better errors than pretty much any other language I've used, but it still does give some poor suggestions in some cases. It's hard to turn a borrow check error into an accurate "what did the programmer mean" error. So suggested bounds are an area where it can be better to take a moment to try and understand what's going on with the lifetimes, and not just blindly apply compiler advice.
I'll cover a few scenarios here.
7a. Advice to change function signature when aliases are involved
Here's one of the scenarios from just above. The compiler advice is:
error[E0621]: explicit lifetime required in the type of `s`
 --> src/lib.rs:5:9
  |
4 |     fn new(s: &str) -> Node<'_> {
  |               ---- help: add explicit lifetime `'a` to the type of `s`: `&'a str`
5 |         Self(s)
  |         ^^^^^^^ lifetime `'a` required
But this works just as well:
- Self(s)
+ Node(s)
And you may get this advice when implementing a trait, where you can't change the signature.
7b. Advice to add bound which implies lifetime equality
The example for this one is very contrived, but consider the output here:
fn f<'a, 'b>(s: &'a mut &'b mut str) -> &'b str {
    *s
}
= help: consider adding the following bound: `'a: 'b`
With the nested lifetime in the argument, there's already an implied 'b: 'a bound. If you follow the advice and add a 'a: 'b bound, then the two bounds together imply that 'a and 'b are in fact the same lifetime. Clearer advice would be to use a single lifetime. Even better advice for this particular example would be to return &'a str instead.
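A sketch of that last suggestion, with no extra bound needed. I've renamed the function to inner_str and written the reborrow out explicitly; both are my choices for the example.

```rust
// Returning `&'a str` only needs a reborrow through the outer reference,
// so the two lifetimes stay independent.
fn inner_str<'a, 'b>(s: &'a mut &'b mut str) -> &'a str {
    &**s
}
```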
This class of advice can also be relevant because by blindly following the advice, you can end up with something like this:
impl<'a> Node<'a> {
    fn g<'s: 'a>(&'s mut self) { /* ... */ }
}
And that's the &'a mut Node<'a> anti-pattern in disguise! This will probably be unusable and hints at a deeper problem that needs to be solved.
7c. Advice to add a static bound
The compiler is gradually getting better about this, but when it suggests to use a &'static or that a lifetime needs to outlive 'static, it usually actually means either
- You're in a context where non-'static references/types aren't allowed
- You should add a lifetime parameter somewhere
Rather than try to cook up my own example, I'll just link to this issue. Although it's closed, there's still room for improvement in some of the examples within.
8. Circle back
Ownership, borrowing, and lifetimes is a huge topic. There's way too much in this "intro" post alone for you to absorb everything in it at once. So occasionally circle back and revisit the common misconceptions, or the documentation on variance, or take another crack at some complicated problem you saw. Your mental model will expand over time; it's enough in the beginning to know some things exist and revisit them when you run into a wall.
(Moreover, Rust is practicality oriented, and the abilities of the compiler have developed organically to allow common patterns soundly and ergonomically. Which is to say that the borrow checker has a fractal surface; there's an exception to any mental model of the borrow checker. So there's always something new to learn, forget, and relearn, if you're into that.)