What is a good mental model of the borrow checker?

Edit: added a TLDR at the end of this post to summarize points raised during this discussion; it might be helpful if you are looking for a quick guide to the topics related to the borrow checker.


Whenever we write any piece of code, we rely on our mental models. We rely on our mental model of the hardware, and map the code we write to hypothetical actions that would be performed by that imagined hardware. Obviously our mental model of the hardware is just that, a model. It does not correspond to any actual hardware, it does not include weird edge cases that might exist on real hardware, and it might include operations that do not exist on real hardware.

Our mental model of the hardware might stem from an abstract machine that is specified by some programming language specification. We might think that some expressions are lvalues and can be addressed while others are not. We rely on that model even when there is no address-of operator in the language we are writing, e.g. any language without pointers.

But even when our mental model is derived from a specification somewhere, it does not include all the details of that specification. The specification might talk about expiring values (xvalues) and so on, but that might not be part of the mental model we use in our day-to-day programming.

When we write code in a statically typed language, we rely on our mental model of the typechecker. For example, we think that if f is of type fn<T,U>(T)->U and x is of type T, the expression f(x) will be of type U. Or that f(x) is only well typed if f is a function with some type fn<T,U>(T)->U and x is of type T.

This mental model might also be derived from the specification of a particular typechecker or from an academic paper on type theory, but it does not include the exact algorithm by which these rules are checked, or how they are represented as data structures inside the typechecker.

What is distinctive about Rust is its ownership (destructive moves), lifetime and borrowing system. I assume that all proficient Rustaceans have a good mental model of the borrow checker.

That mental model might be derived from your knowledge of the borrow checker's implementation, from what you gathered from books and tutorials, or from your experience.

Even though I do not yet have such a model myself, I assume that yours also does not include all the nitty-gritty details of the borrow checker's implementation.

So I would like to know: in your opinion, what is a good mental model of the borrow checker?

I presume it is more detailed than the following:

  1. each value has an owner.
  2. each variable has a lifetime
  3. the lifetime of a reference must be smaller than the lifetime of the referent
  4. there can be many shared references simultaneously
  5. there can only be one mutable reference at a given time (see the sketch right after this list).
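
To make rules 4 and 5 concrete, this is the kind of minimal toy example I have in mind (my own sketch, nothing authoritative):

fn main() {
    let mut value = String::from("hello");

    // rule 4: many shared references may coexist
    let a = &value;
    let b = &value;
    println!("{a} {b}");

    // rule 5: only one mutable reference at a time
    let m = &mut value;
    m.push_str(", world");
    // let n = &mut value; // error[E0499]: cannot borrow `value` as mutable more than once at a time
    println!("{m}");
}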

This might be a good mental model as a beginning, but I assume that an experienced Rustacean has a more detailed, more accurate model.

Let me mention my incipient mental model for shared references.

  1. every variable has a lifetime that is determined at compile time
  2. that lifetime starts when the variable is initialized and definitely ends when the scope of the variable ends or it is moved.
  3. if a is a variable with type T and lifetime 'a then &a returns a reference with type &T: 'a, which means that the lifetime of the reference is bound by the lifetime of the referent.
  4. A lifetime bound 'a: 'b is satisfied when the lifetime 'a is completely within lifetime 'b
  5. So I can scan the source code, think about the region that corresponds to the lifetime of a and the region that corresponds to lifetime 'b, and if I see that b is totally within a, I have a well-typed program.

This mental model might correspond to what the borrow checker actually does, but it does not concern itself with the computational algorithm that the checker executes. It gives a rule of thumb that you can follow while writing code.

However, this model is limited to shared references. For example, it does not capture how a well-typed program avoids having two mutable references at the same time. It does not explain the relationship between a reference to a struct and a reference to a field of that struct. It does not explain when a function (possibly an unsafe one) that accepts a raw pointer and returns two mutable references can be well typed.

What I would like to hear from you is your mental model of the borrow checker: not how it is actually implemented, but what you assume about it when you write your code, and what heuristics you use when you read a piece of code and guess whether it will pass the borrow checker or not.

This might sound like a huge open-ended question, and questions on forums tend to get the most attention when there is a specific problem and a two-line answer might be sufficient. But I think this is an important question, because ownership, borrowing and lifetimes are the most distinctive features of Rust, and all beginners, even veteran programmers coming from other languages, struggle with them. Providing detailed answers in forum posts, blog posts and other long-form writing would be an invaluable contribution to the community. And hopefully some of you have a new year's resolution to contribute to the Rust community.

I really hope to hear your mental model of the borrow checker!


TLDR: A roadmap to understanding borrows and lifetimes

Later entries in this thread make several good points and contain extremely helpful remarks and examples. Some concepts keep recurring, which suggests they are fundamental for understanding borrows and lifetimes. The following is my attempt to summarize those concepts and divide them into buckets so that they might be helpful for other Rustaceans. I might edit it further if it proves useful to others; tell me how it can be improved.

  • beginner
    • shared and mutable references must live at most as long as the referent, and references lock the value they point to, i.e. once you have a shared reference your interaction with the value must go through it (or you can drop the reference and keep interacting with the value directly), and similarly with mutable references.
    • Reborrows: if you have a shared reference you can get new shared references through it, and if you have a mutable reference you can get shared or mutable references from it, but the same rules of locking and interaction apply (i.e. if you get a mut ref x from a mut ref y, all your interactions with the data must go via x until you drop it). See the code sketch right after this roadmap.
      • Links:
        • Reborrows are ubiquitous: 1
        • Can be visualized as access patterns: 1
        • Reborrows form a stack: 1 2 (two posts that explain it in a visual and very accessible manner)
  • intermediate
    • Interior mutability: some types give you the ability to mutate the data even when you only have a shared reference to that type.
    • field/borrow splitting: for some compound types the borrow checker understands that you can have two mut references at the same time to disjoint parts of the compound type.
  • advanced
    • two-phase borrows: the borrow checker can sometimes reinterpret the order in which a mut ref and a shared ref to the same data came into being, so that patterns like data.set(data.get()) can be allowed.
  • common pitfalls
    • dropping a reference has no effect on the lifetime of the reference itself.
    • common lifetime misconceptions: 1 (a classic that should be anthologized and widely read)
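
A tiny sketch of my own illustrating the beginner-level locking and reborrow rules above:

fn main() {
    let mut value = vec![1, 2, 3];

    let m = &mut value;     // `value` is now "locked" behind `m`
    let m2 = &mut *m;       // reborrow of `m`: interact through `m2` for now
    m2.push(4);
    m.push(5);              // fine again: the reborrow `m2` has already ended
    // value.push(6);       // error here: `value` is still locked, `m` is used below
    println!("{m:?}");

    println!("{value:?}");  // fine: `m` is not used after its last use above
}
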
12 Likes

There’s really two parts to mental models of the borrow checker in my opinion, and this even relates to the reasons the borrow checker exists in the first place. You can approximate the borrow checker from two sides:

One side is: things that aren’t allowed. One instance of this is the rule that (active) mutable references must be unique (i.e. unaliased) references to their target. Or that (active) references must always point to valid targets. Another example would be: since ownership implies the ability (and conventionally also the responsibility) to drop the value, and double-frees (i.e. dropping the same value twice) are forbidden, ownership must be unique. In a world without a borrow checker (and ownership&initialization tracking, etc…), such rules would be the bare minimum of what the programmer has to avoid in order to avoid “undefined behavior”… such a world kind-of exists in the form of unsafe Rust… but fortunately, outside of unsafe Rust, the borrow checker relieves us of the burden of paying attention to these things in too much detail, and especially once your code compiles fine, you don’t have to think about any of those concerns anymore.

The other side is: simple settings that are safe, in the sense that none of the things that aren’t allowed could ever happen in such a setting. You collect a few of them yourself in your post above, e.g. giving a collection of rules how one could reason about a setting where only shared references are used. The real borrow checker itself does this, too, following a set of rules that makes sure (or at least tries to make sure) that everything that can ever be done will be safe and not violate basic rules such as uniqueness of mutable references, avoiding double-frees or use-after-frees, etc… And one reason for unsafe Rust to exist is because it can help in situations where the rules from the borrow checker are still unnecessarily restrictive, though you are on your own then, which is what makes unsafe Rust quite hard.

Now, most experienced Rust programmers will thus have two mental models about ownership+borrowing. One model is an intuitive understanding of a large set of generally safe patterns; some conservatively simplified (hence possibly slightly more restrictive than the real deal) version of borrow checking rules, though usually in an intuitively understood sense, not rules as an actual algorithm to execute, like the borrow checker is. The other is an understanding of the dynamic “what happens and what’s always disallowed” side. You can use this to judge whether unsafe code might be sound, but it’s also very useful to judge whether an implementation strategy you have is clearly wrong and will never be able to work in safe Rust, or whether there is hope in being able to – somehow – convince the Rust compiler and borrow checker that what you’re trying to do is fine. One would use this understanding in the design phase; or also when running into complicated borrow-checking errors to ask oneself the question “is what I’m trying here sane in the first place?” (and often enough, the answer will be “Wait, no! There’s this terrible corner case that I hadn’t thought of, but the borrow checker saved me!”)

The learning process is then iterative as well, of course. The mental model of “what isn’t allowed” typically starts out still way too permissive, especially with a C background where there’s no analogue of “these kinds of pointers must be unique” or “this API must be safe to use with whatever code the user writes (as long as it’s not using ‘unsafe’)”, which is why inexperienced or even intermediate Rust users are still prone to writing unsafe code that isn’t actually sound, even though they tried; or why they might have a hard time understanding certain scenarios where they just cannot convince the borrow checker that what they’re doing is fine, because it isn’t fine, yet they couldn’t see that: for example because a function signature they wrote would allow use-cases they didn’t think of, maybe just because they got the lifetime annotations in such a signature wrong, etc…

The mental model of “what can the borrow checker accept” also iteratively improves. Generally, you’ll learn about more capabilities of the borrow checker, starting out with a mental model that only knows of a more narrow, more restrictive set of things you know you can pull off without the borrow checker complaining; but of course there’s also the “I don’t know what the borrow checker would think of this” category that just shrinks from both sides, with you learning about new things that it can accept, and things that it cannot.

Even some things that might technically be possible to pull off can turn out to be impractical, or inconvenient, so one might also learn about anti-patterns of things that are possible but tedious, or things that do compile but are almost useless / impossible to use without causing compilation failure down the road.

What nobody, except a compiler developer, will ever have is an actual model of the borrow checking algorithm. Even with a good understanding of what rules are actually in place, and good judgement for most Rust code (as long as it’s not so much that you cannot keep it all in your head) about whether the borrow checker would accept it, you usually don’t know how the compiler actually figures these things out. And it also doesn’t matter: The borrow checker itself is evolving, improving over time. Large past improvements included the “non-lexical lifetimes” feature, and future improvements include a re-design called “polonius”[1].

Also, as I have indicated above, the good thing about the borrow checker is that you can usually ignore it, at least until you run into compilation errors; so the practice of only locally needing to understand some aspects of borrow checking in order to fix problems means that mental models of borrow checking are usually not concerned with getting an understanding of how to comprehensively borrow-check a whole function in your head, but more about how to think in more detail about a specific pin-pointed borrow-checking aspect, e.g. for a handful of accesses to a single variable… and it’s a useful skill to learn how to figure out which parts of your code are possibly relevant or aren’t relevant for a specific borrow checking error, especially in cases when the compiler might unfortunately not be pointing out the relevant connection – say e.g. something like a specific function call that relates or restricts some lifetimes – in the error message.


Finally, a small remark on notation in your question:

the Rust syntax 'a: 'b actually means the opposite. Such a relationship is sometimes called an “outlives” relationship, to help you get it right. Read it as “'a outlives 'b”, and see that it means that 'b is completely within 'a, not the other way around. Similarly where you write “&T: 'a”, the same problem applies, that the typical interpretation of this notation goes the other way.
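
For instance, spelled out as a toy function (my own sketch), an explicit outlives bound looks like this:

// `'long: 'short` reads "'long outlives 'short": the region 'short is
// completely contained in 'long, so a `&'long str` may be used where a
// `&'short str` is expected.
fn shorten<'long: 'short, 'short>(s: &'long str) -> &'short str {
    s
}

fn main() {
    let owned = String::from("hello");
    let short: &str = shorten(&owned);
    println!("{short}");
}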


  1. in case you do any research into the kinds of code examples that that improvement will help make compile, you’ll see that things are getting so complex and subtle, that there isn’t much use in trying to understand the exact rules of the borrow checker anyways; with those hardest cases, if anything, it’s more useful IMO to simply learn to recognize them, and to get an intuition as to what approaches/changes one can try in order to avoid them ↩︎

14 Likes

One thing I'd suggest from reading your sketch is to not think of the liveness scope of a value as its "lifetime". They're related - a reference to a variable must not exceed the liveness scope of said variable - but conflating the two can cause confusion.

My current preferred reading of 'a: 'b is "'a is valid for 'b". This also works for types: T: 'a means T (the type) is valid for 'a.

Note how you can put these constraints on lifetimes and types, but not variables / values.


A few things you will probably have to iterate on as your mental model advances are:

  • Reborrows, which are woefully underdocumented. Briefly, &'long mut can be reborrowed as a &'short mut; the short one must end before the next use of the long one (see the sketch after this list)
    • and this is transitive
  • shared (&) reborrows of a &mut are more flexible
  • lifetime analysis is code point and flow sensitive (though in my experience this is more crucial to understanding why something worked rather than why it failed)
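
A minimal sketch of the reborrow rule from the first bullet above:

fn main() {
    let mut x = 0;
    let long: &mut i32 = &mut x;          // the `&'long mut`
    {
        let short: &mut i32 = &mut *long; // reborrowed as a `&'short mut`
        *short += 1;
    }                                     // the short reborrow ends here...
    *long += 1;                           // ...before the next use of the long one
    println!("{long}");
}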

Here's a post that points to another discussion about mental models, mostly about &mut.

Here's another with a diagram that also considers shared reborrowing.

5 Likes

My mental model of the borrow checker is really simple. It is the model of real-world owning and lending of books (anyone remember real books?). It's a model presented by Niko Matsakis in one of his Rust presentations.

Basically:

A data item in Rust is like a book.

Books have owners. Say I own a book. I can do what I like with my book, read it, write in it, ultimately destroy it. I can of course create books for myself.

Books can be given away. I could give my book to you. After giving away a book I cannot read or write or destroy it anymore. I do not expect to get the book back. You are then the owner and can do what you like with the book as above.

Books can be loaned to others. I could loan my book to you. You have to return the book at some point. When books are loaned out there are two possibilities:

  1. It is loaned with permission for the borrower to read it and write in it and make changes. Even add or remove pages. But not destroy it. I typically expect the book to come back modified in some way.
  2. It is loaned on the condition that the borrower can read it but not write on it or damage it in any way, certainly not destroy it. I expect you to give back my book undamaged.

In case 1) it only makes sense to loan the book to one person at a time. Having multiple people writing into the same book could get confusing.

In case 2) it is possible to loan the book to many people at the same time. (Odd for physical books but imagine displaying pages in a shop window or some such). All those people can read it simultaneously but as they are not trying to write in it there is no confusion.

Finally, this all works transitively: you can give away or loan out books under the same conditions as you borrowed them.
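
If you want to squint at the analogy in code, it maps roughly like this (a rough sketch of mine, names made up):

fn main() {
    // I own a book.
    let book = String::from("my book");

    // Loaning it out read-only: many readers at once is fine.
    let reader_a = &book;
    let reader_b = &book;
    println!("{reader_a} {reader_b}");

    // Loaning it out for writing: only one borrower at a time.
    let mut notebook = String::from("notes");
    let editor = &mut notebook;
    editor.push_str(", more notes");

    // Giving a book away: a move. I can't touch it afterwards.
    let new_owner = book;
    // println!("{book}"); // error[E0382]: borrow of moved value: `book`
    println!("{new_owner} {notebook}");
}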

Of course there is my other mental model. Just write code and have the compiler complain. Tweak the code until the compiler is happy :slight_smile: No seriously. This model is based on real life law. None of us know all the laws of the land. We just do what seems reasonable until lawyers or the police or somebody starts reminding us we are not allowed to do that.

By the way, why do we call it the "borrow checker"? Why not the "loan checker"?

4 Likes

From what I can tell, you mean a formal specification-like collection of legalese paragraphs when you write "mental model". I can only speak for myself, but for me, this would be way too complicated to be of any practical use. Hence, my mental model is nothing like that.

I don't think it's reasonably possible to be explicit about all details to this extent, as the procedure for scanning and evaluating a long list of rules over and over would simply be too time-consuming and too resource-intensive, to the point where I would basically be unable to think about anything else (including in particular the high-level problem I am trying to solve) at the same time.

The way I write Rust code is much less formal and significantly less structured. To begin with, I try to write the simplest, most obviously correct code. This is possible because the borrow checker really isn't magic (unless of course we consider ingenious language design as such). Instead, the borrow checker enforces correctness in a way that is directly related to the physical representation of values in memory.

Borrow checking rules aren't ad-hoc, nor are they derived from some arcane, unfamiliar mathematical theory. They encode simple and hopefully obvious facts about the "physical" reality of memory, which you would have to follow in every other language in order to write correct code anyway – except that other languages don't try to enforce them at compile time, dismissing the responsibility by making memory safety violations your fault. Thus, programmers don't tend to think about them in a principled enough manner.

But if you think a bit about it, "don't use memory that was freed" or "don't point to stuff that's not there" or "don't free memory that is not yours" is not exactly rocket science. So first of all, you should write code that doesn't contradict these obvious principles, for example, by not returning references to locals, and not trying to make values with multiple owners.

And there's this one thing that people miss because they have been taught the wrong thing for decades: that race conditions are a result of multi-threading. This is plainly false, as race conditions exist due to shared mutability of any form, including single-threaded serial code, but nobody teaches this, so most newbie programmers live with the misunderstanding that single-threaded shared mutability is OK. (Except C++ programmers who have been bitten by iterator invalidation more frequently than they would have liked.) Thus, this is an anti-pattern that most people will need to simply un-learn.
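
The single-threaded case is exactly the kind of thing the borrow checker catches, e.g. in a sketch like this (the C++ analogue of this loop compiles and can happily invalidate the iterator):

fn main() {
    let mut v = vec![1, 2, 3];
    // The shared borrow of `v` lives for the whole loop...
    for x in &v {
        if *x == 2 {
            // ...so this mutation is rejected:
            // error[E0502]: cannot borrow `v` as mutable because it is also borrowed as immutable
            v.push(4);
        }
    }
}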

The third problem is not thinking generally enough. When people encounter borrow checking errors while using an API specific to Rust's memory management style, they will often not see the forest for the trees, thinking that every such error stems from a separate rule they will need to learn. The reality is, however, that borrow checking looks for a relatively small set of constraints, and any errors you get are eventually derived from this small set.

For instance, people who use RefCell or Mutex will often complain that they are unable to return a reference to the value wrapped inside the cell/mutex. They will note that this is not explicitly documented anywhere, and they will be puzzled by the fact that Ref and RefMut and MutexGuard will not lend a reference to the data for any longer than these lock guards are themselves alive, even though the wrapped value clearly lives longer – namely as long as the cell/mutex lives.

However, a bit of deep thinking would have let them infer that this is not a separate rule, and that this is the way things must necessarily be. If the RefCell or Mutex handed out direct references to its contents, it would stand no chance of knowing when to lock and unlock. Preventing direct references to the wrapped object is in fact the very point of using a cell or a mutex, because the dynamic tracking of borrows is exactly what these types must perform in order to prevent race conditions, thereby guaranteeing memory safety.

There is no borrow checker rule saying that "you can't return a reference to the value in a mutex". Mutex is not a type the borrow checker specially knows about at all. Instead, there is a rule that says that the reference to a part of a value can't outlive that value itself, and by equipping Ref and MutexGuard with a Deref impl, the borrow checker will simply enforce that the references handed out by these guard objects don't outlive the guards, which happens to be exactly what is needed by the mutex or the cell to enforce safety.
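
To make that concrete, here is a small sketch (my own, not from any particular post) of the pattern people typically run into, and of what the Deref-based design enforces instead:

use std::cell::RefCell;

// Does NOT compile: the returned `&str` would have to outlive the temporary
// `Ref` guard created by `borrow()`, and the guard is what tracks the
// dynamic borrow.
//
// fn broken(cell: &RefCell<String>) -> &str {
//     &cell.borrow()
// }

// What works instead: keep the guard alive for as long as the data is used.
fn print_it(cell: &RefCell<String>) {
    let guard = cell.borrow(); // Ref<'_, String>
    let s: &str = &guard;      // valid only while `guard` is alive
    println!("{s}");
}                              // guard dropped here, dynamic borrow released

fn main() {
    print_it(&RefCell::new(String::from("hi")));
}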

Thus, a good amount of these errors can be prevented, and the "why" behind them answered, by studying good API design practice, rather than by memorizing a long list of non-existent "borrow checking rules".

All in all, my advice boils down to:

  1. Write the simplest code you can by default;
  2. Read good-quality, idiomatic Rust code (including the source of the standard library) in order to train your brain's pattern matching engine to quickly recognize patterns as well as anti-patterns. A lot of errors and fighting-with-borrowck can be prevented by recognizing that e.g. self-referential structs or &'a mut T<'a> don't work, and so you can re-organize your code once you encounter such a construct, even before trying to feed it to the compiler (a short sketch of the &'a mut T<'a> trap follows below).
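
For the record, the &'a mut T<'a> trap looks roughly like this (a toy example of mine):

struct Wrapper<'a> {
    data: &'a mut i32,
}

impl<'a> Wrapper<'a> {
    // Anti-pattern: `&'a mut self` on a type that is itself parameterized by 'a
    // borrows the wrapper for the *entire* 'a, i.e. for as long as the wrapper
    // is allowed to exist at all.
    fn touch(&'a mut self) {
        *self.data += 1;
    }
}

fn main() {
    let mut x = 0;
    let mut w = Wrapper { data: &mut x };
    w.touch();
    w.touch(); // error[E0499]: cannot borrow `w` as mutable more than once at a time
}
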
6 Likes

I work to a very simple rule: Never write any lifetime tick marks.

Well, of course at some point you have to but I don't like to see them cluttering up my code. The syntax is jarring. And life gets confusing. There are very few tick marks in my applications. As soon as the compiler starts nagging about lifetimes I start thinking of how to do the job without any.

Does that mean my code is suboptimal? Probably. Don't care, it's plenty fast enough. I value simplicity and clarity.

5 Likes

TBH, I've never once thought about it that way. And I think most people don't either -- a big part of the new "NLL" borrow checker is that it doesn't correspond to lexical scopes in the code. The first one did, on the assumption that being able to see those scopes and talk about them would be helpful, but it ended up being more confusing than helpful.

If you're just talking in terms of your first-instinct mental model, I'd start with something based on the new NLL errors. The old errors ended up saying accurate-but-often-unhelpful things like

6 | }
  | - immutable borrow ends here

But the new ones tell a good story:

error[E0502]: cannot borrow `x` as mutable because it is also borrowed as immutable
 --> src/main.rs:5:13
  |
4 |     let y = &x;
  |             -- immutable borrow occurs here
5 |     let z = &mut x;
  |             ^^^^^^ mutable borrow occurs here
6 |
7 |     println!("y: {}", y);
  |                       - borrow later used here

That leads to a simple guideline like "don't take a mutable borrow of something until you're done with all the existing borrows". And that covers what I need the vast majority of the time.

(And if I need anything more complicated, I rely heavily on just running it and seeing what the compiler says, and using that to figure out what the problem was. You'll see lots of people here say "we can't help with your borrowing problem without a compilable example that can repro the error" so that seems common to me -- though that's also because people who can't fix borrowing errors tend not to understand the problem well enough to be able to give useful prose about it.)

6 Likes

@steffahn I really enjoyed reading your response. It explains very well how one should approach the acquisition of a mental model, namely iteratively and from two perspectives on the borrow checker: what it enables and what it forbids.

However, I would very much appreciate it if you could also tell us more, not just about the process of acquisition but also about the resulting mental model. So, can you tell us what specific nuggets of wisdom you now know about borrow checking after having completed a few steps of this iterative process?

For example, you say one starts with a permissive mental model (C-like thinking) and then learns new patterns that the borrow checker forbids. Can you elaborate on what that initial, more permissive mental model is and what specific items you added to it to make it more accurate and less permissive?

Similarly, you say one also starts with a very limited mental model of the borrow checker, not utilizing certain patterns on the assumption that it won't allow them, but then one learns new patterns that the borrow checker does allow. Can you elaborate on what this initial limited mental model is and what specific items you added to make it more accurate and more permissive?

@quinedot thanks for pointing out reborrows; this is one of the kinds of things I am looking for, i.e. concepts and considerations that one must add to one's arsenal in order to have an accurate model of the borrow checker.

Instead of a single example, I would very much appreciate it if we could somehow collect a list of borrow checker concepts and considerations that lead to a good mental model. What are the concepts and considerations that a beginner is advised to be aware of? What are the things that an intermediate Rustacean might learn next? And what are the things that one could think about at the more advanced level?

If we can name them and list them (and possibly link to places where one can learn about them), then we have a clear roadmap. Then we can say: here are the items that comprise the mental model of a proficient Rustacean, and here are the places where you can learn about them.

I agree more things should be in one place, and even have some draft "all about lifetime" skeletons around, but there is a lot and it's time-consuming too, at least the way I work (checking myself, providing examples, and providing official citations).

Thank you for your comment. Let me clarify what I mean by mental model, so that we can be on the same page.

I think a good analogy for typechecking (and similarly for borrow checking) is code review. It is even better than code review because it is automated and fast. Let's assume that we are working with a dynamically typed language. We write a piece of code and then send it for code review. Even though in this dynamically typed language variables do not have a type, if we change the types of variables in a haphazard way, it diminishes the readability and maintainability of the code, and the reviewer would probably point that out and ask for a revision. The language is not typed, but there are good coding practices that reviewers look after. This is just like what you say about borrow checking:

typechecking rules are not arcane or magic. They (at least partially) codify good practices, safety and correctness conditions. Therefore, even though we do not work with a typed language, we might get to know the reviewer, and the things that they care about, and write our code keeping those qualities in mind so that our code is not rejected.

The only thing that changes when we adopt a typed language is that this code review is automated. Instead of caring about what the reviewer expects, approves or disapproves of, we start to care about what the typechecker expects, approves or disapproves of. Now that the process is automated, one thing we can do is write garbage code and let the typechecker yell at us until we correct all those mistakes, which we would not have done if the reviewer were a person and not an algorithm.

It goes without saying that after a while every team member gets to know the code reviewer: they get a feel for what the reviewer looks for, what they approve of and what they disapprove of, so that while writing their code they keep those considerations in mind. Instead of just sending what they personally believe must be acceptable, they write in a way that the reviewer would also consider quality code that is safe, does not leak memory, etc.

And similarly, if we adopt a typed language, we form a mental model of the typechecker and keep what it cares about in mind. Instead of throwing arbitrary code at it and getting yelled at, we write our code in a way that minimizes type errors.

Now let us talk about Rust's borrow checker. As you said, its rules are not magic or arcane; they are best practices in other languages. So a code reviewer might (and probably would) pay attention to them and enforce them manually. After a while the team members get a mental model of what the reviewer cares about, and they would code in a way that these best practices are observed.

And the borrow checker is just the codification and automation of the code reviewer's job. Just like the team members form ideas about the things the code reviewer cares about, we can form ideas about the things the borrow checker cares about.

Now assume we only have manual code review, and a new person joins the team. They want to be productive, they want their code to be committed to the main branch as soon as possible, and they ask for your advice. They ask you about the preferences of the code reviewer: what do they care about, what do they like in code and allow, what do they dislike and forbid?

If you just say "Write good code, avoid bad code, and you will be fine", even though it is correct, you have not helped the new member a lot. If you say "initialize every variable at the time you declare it", it is more helpful, because this is something they can follow. This might give the impression that what I'm looking for is a legalese specification, but it is not. It is just a question of how to impart our knowledge, our mental model of the reviewer's preferences, in the most helpful and actionable way.

In this analogy the mental model is your set of ideas about the reviewer, and as a new team member I am asking you for advice: what should I know about the reviewer so that I can be productive and proficient?

The advice you give can be a step-by-step guide like the one I described; it might sound legalistic, but it just boils down to "Look at the scope of the variables and make sure that references are dropped before the variable itself".

You can also direct the person to new topics they are not aware of among the things the reviewer cares about. It would be even more helpful if you could also summarize what they will learn, but you have your own work to do, so just pointing in the right direction would be much appreciated.

You might also point at some code that the reviewer has already approved and say "learn patterns from these" (which is similar to what you suggest).

But this advice on its own is not very actionable. The person does not yet know what the reviewer allows or forbids; just by looking at the approved code you would not know which alternatives it avoids in order to please the reviewer.

Moreover, I believe the question has a very well-defined scope. It does not ask what the reviewer cares about in general. It does not care whether they like factory methods or the reactor pattern, etc. It only asks: what should I know about the reviewer when it comes to borrowing and lifetimes? I would expect there is a small set of rules (which you seem to agree with)

and a small set of concepts that one must be familiar with in order to pass code review when it comes to lifetime and borrowing issues alone.

And it seems that we have not done a good job of getting our colleagues up to speed, given the number of lifetime and borrow checking questions on the forums and Stack Overflow.

You might say that it takes too much effort and time to rectify that and that it is not your job. The point would be well taken. But at least we can curate a list of topics and concepts that would lead newer Rustaceans to a better place.

And who knows, maybe someone will take on this Herculean(!) challenge.

1 Like

Your “code reviewer” concept is actually a good one for the borrow checker, because one of the facts about the borrow checker is that it never affects how the program behaves — it only either allows or rejects programs. (Type checking, on the other hand, is also type inference and can affect trait implementation selection and method resolution.)

I would say that what the anthropomorphized borrow checker is looking for is:

“You want to use this pointer into some existing data structure? Okay, show me a mathematical proof that this pointer will always be valid for as long as it is going to be used, and nothing else will mutate it. Use really small steps. Use three colors of markers on the source code if you need them to keep things clear.”

The ‘three colors of markers’ are explicit lifetime names. Each function, type, or trait impl that has lifetime annotations (or elided lifetimes!) is an element in the proof. For example, take a simple method:

fn get_foo<'a>(&'a self) -> &'a str {
    &self.foo
}

get_foo's signature with lifetimes is a theorem we can use in the proof of the whole program: “if we call get_foo() then we have a reference with a lifetime as long as the one we started with”. Lifetime-checking get_foo()'s body is the lemma that get_foo() in fact works that way.

Much like a code reviewer, the borrow checker isn't happy if you get too clever. There are lots of programs that are entirely sound, but which the borrow checker will reject, because the types of reasoning the borrow checker is willing to do for you are very narrow. That reasoning is very focused on the primary purpose of references — temporary pointers passed to functions and discarded when the function returns — and often isn't as useful when the situation isn't shaped like a call-stack.

9 Likes

This is a good question. Ingredients in my mental model:

  • "Physical" intuition about what lives where in memory, and for how long. I'll actually have a vague sense of some buffer floating in space, with various pointers attached to it. This applies mostly to big, long-lived heap allocations. I don't know how to train this intuition.
  • More locally, a sense of what a "good" type signature looks like from the perspective of the borrow check, or rather what the red flags are that suggest a signature is not possible to implement sanely. This is very syntactic or even typographical, there's a little warning sign that starts flashing in my head when I see 'a repeated too many times in a function signature. I think maybe this is the kind of thing you can train by working with the compiler for a long enough time, but you do need a human to explain to you why these patterns are bad at least once.
  • A bunch of rules of thumb that I've internalized enough that I don't have to apply them consciously -- when I'm defining a type with a dynamic string field, I just instinctively reach for String and not &'static str. Reading users.rust-lang.org is good for picking this up, I'd say.
  • "Design pattern"-level knowledge of what kinds of high-level code organization do/don't work well with the borrow check. You can't use references to build a graph, if you have a big "context" singleton you should prefer to pass it explicitly down the call stack instead of stashing it in a global, etc. URLO can be good for this, or you can look at std or high-quality community crates for examples of how to solve particular design problems.

Thinking about the notorious "self-referential struct" as an example, I'm able to avoid that particular trap by a combination of "physical intuition" (picturing the struct moving around in memory and causing the internal pointer to dangle; also, visualizing the ownership hierarchy of my data and noticing when it has a back- or cross-reference), an internalized "rule of thumb", and the habit of using "design patterns" that don't cause me to reach for a self-referential struct in the first place.

Which means I'm not really mentally modeling the borrow check, so much as working in a frame of mind that I've developed based on feedback from the borrow check. As steffahn said,

Except that I have a (very vague) sense of the kind of valid, "normal-looking" programs that the current borrow check sometimes rejects, but that are accepted by polonius. That doesn't come up very often.
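
The usual shape I have in mind is the "look it up, insert if missing" pattern; I believe this sketch is one of the classic examples (rejected today, accepted by polonius):

use std::collections::HashMap;

// Because `v` may be returned, its borrow of `*map` is (conservatively)
// considered to last for the rest of the function, so the `insert` on the
// "missing" path is flagged as a conflict.
fn get_or_insert(map: &mut HashMap<u32, String>) -> &String {
    if let Some(v) = map.get(&0) {
        return v;
    }
    map.insert(0, String::new());
    map.get(&0).unwrap()
}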

3 Likes

It's good to have good documentation, but I don't think it's necessarily a bad sign that lots of people are running into problems with the borrow check and asking questions about it. Fighting the compiler is (anecdotally) a necessary stage in the learning process; asking about a specific problem you're having can be a more efficient/pleasant way to build knowledge than reading docs; and at least on this forum people's borrow-check questions usually get a prompt and thoughtful response.

1 Like

This is the kind of thing I am looking for: a set of reasoning steps that the borrow checker takes, such that if I take those steps in my mind I would reach more or less the same conclusion as the borrow checker.

Let me flesh it out. So far, everybody in this thread agrees that you cannot have two mutable references to the same data. And indeed, you cannot have the following.

fn main(){
    let mut v = vec![0,1];
    let f_ref = &mut v[0];
    let s_ref = &mut v[1];
    
    println!("{f_ref:?}{s_ref:?}")
}

but you can have the following

fn main() {
    let mut v = vec![0,1];
    let (f_ref,s_ref) = v.split_at_mut(1);
    f_ref[0] += 1;
    s_ref[0] += 1;
    
    println!("{f_ref:?}{s_ref:?}");
    println!("{v:?}");
}

in both examples f_ref and s_ref are mutable references to the buffer owned by the same vector.

Aside: that these references are to disjoint parts of the buffer is not an explanation by itself; they are disjoint also in the case that the borrow checker rejects. It also does not matter that split_at_mut (possibly) has an unsafe implementation, since unsafe does not turn off the borrow checker. And the possible reply that the first version should have been accepted is not relevant either, since this is the reviewer that we have, and we should understand what they care about before we criticize what they care about.

So, if I assume that the reasoning step the borrow checker takes is "don't allow two mutable references to the same buffer", then my mental model of the borrow checker is not accurate. It must be enhanced somehow. But how? That is the question.

How should I reason about my code so that I reach the same conclusion as the borrow checker? Since we are talking about reasoning, I would like to hear about the premises and inference rules, but if that sounds too formal, I would also be satisfied with a more discursive presentation.

I can assure you 100% that the standard library compiles and passes borrow checking. Thus, you know that whatever it contains (in safe code) is allowed (and idiomatic, as it is written by the language team itself).

There are huge amounts of questions about all sorts of things on Stack Overflow. Raw question count is not a reasonable measure of how hard or under-documented something is. There is no royal road to geometry. To gain knowledge and experience, you will have to practice.

1 Like

The rule is, &mut cannot be aliased. "Aliased" is not formally defined, but there is general agreement that aliasing is when two things can access the same memory. A &mut to v[0] and a &mut to v[1] cannot do that.

To show it's definitely not UB, you don't need unsafe for this particular example:

fn main(){
    let mut v = vec![0,1];
    let (f_ref, s_ref): (&mut i32, &mut i32) = match &mut v[..] {
        [one, two] => (one, two),
        _ => unreachable!(),
    };

    println!("{f_ref:?}{s_ref:?}");
}

(However, the fact that split_at_mut is safe to call is also a demonstration that it's sound, despite the use of unsafe, given that std upholds Rust's soundness guarantees. [1])

That's enhancement number 1.

So why can't the first one work? Because Vec's IndexMut implementation is not magical, it takes a &mut self which covers all the elements (and the length field and the capacity field); that API says that any mutable borrow obtained via indexing must borrow the entire Vec.

That's enhancement number 2.
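
For reference, here is roughly what the desugaring looks like; note that the call takes a mutable borrow of the whole Vec:

use std::ops::IndexMut;

fn main() {
    let mut v = vec![0, 1];
    // `&mut v[0]` is (roughly) sugar for this call: it takes `&mut Vec<i32>`,
    // a mutable borrow of the *whole* vector, and the returned `&mut i32`
    // keeps that whole-vector borrow alive.
    let f_ref: &mut i32 = IndexMut::index_mut(&mut v, 0usize);
    *f_ref += 1;
    println!("{v:?}");
}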


I almost went on to mention some indexing which is magical, but I think I'll emphasize a larger point instead: Rust's complexity around borrowing is fractal, and there's often an exception to the exception and so on. It's also somewhat organic: The slice pattern I used in the code above wasn't available before 1.42. I'm pretty sure you needed unsafe to do the slice borrow splitting before that.

Implicit enhancement number 3: There are sound programs the borrow checker doesn't know how to prove. Sometimes the language evolves to allow more proofs. This is also implicit in the unsafe used for split_at_mut -- it's there because the borrow checker couldn't prove it was sound, even though it is, when implemented correctly.

I don't think knowing exactly how the borrow checker is implemented is as helpful as a good mental model. Also, the implementation is shifting all the time in minor ways; you'll probably never keep up with everything.


This fractal nature does make it hard to split the (very large) topic into beginner, intermediate, and advanced topics, in my opinion anyway.


  1. Possible side-bar here on unsafe and "turning off the borrow checker" and validity vs. safety invariants... ↩︎

4 Likes

@quinedot already gave a good answer, but instead of starting out discussing the “fundamentally forbidden” rules of “aliased &muts”, which (at least apparently) neither of the versions of the code actually produces, since the two items are different, let me try to outline the borrow checker's point of view, as I understand it.


One important part of basic knowledge is that borrows form a sort of stack. You can have a &mut T reference, reborrow that for a shorter time to a new &mut T (or maybe &mut S …) reference, re-borrow that one again for even shorter; maybe re-borrow it twice, but not at the same time, etc…

So at any given time, with mutable references alone, there’s a stack of borrows from the original variable down to the currently active usable reference that you’ve derived… and if you think about a whole time-frame, the access-pattern becomes tree-shaped.

// using non-“standard” indentation to emphasize
// the tree-like structure of (mutable) borrows and re-borrows over time
fn main() {
    let mut x = 0;
    x += 1;
        let r1 = &mut x;
        *r1 += 1;
            let r1_1 = &mut *r1;
            *r1_1 += 1;
        *r1 += 1;
            let r1_2 = &mut *r1;
            *r1_2 += 1; // **** (marked line)
        *r1 += 1;
    x += 1;
        let r2 = &mut x;
        *r2 += 1;
            let r2_1 = &mut *r2;
            *r2_1 += 1;
        *r2 += 1;
            let r2_2 = &mut *r2;
            *r2_2 += 1;
        *r2 += 1;
    x += 1;
    dbg!(x); // -> 13
}

By the way, if at any leaf of this tree, the mutable reference is re-borrowed immutably, then such an immutable re-borrow can alias other immutable re-borrows at the same time, giving a tree-shaped (or… “long non-branching (mutable re-borrows) stem, followed by tree of immutable re-borrows, and re-re-borrows, etc…”) borrow pattern even at a single point in time.

fn main() {
    let mut y = 0;
    let mut x = 0;
    x += 1;
        let r1 = &mut x;
        *r1 += 1;
            let r1_1 = &*r1;
            let r1_2 = &*r1;
            y += *r1_1;
            y += *r1_2;
            y += *r1_1;
            y += *r1_2;
        *r1 += 1;
    x += 1;
    dbg!(x); // -> 4
    dbg!(y); // -> 8
}

But without immutable borrows, there is no branching at any given point. E.g. in the // **** (marked line) above, there is a borrow stack of x being (mutably) borrowed by r1, which is re-borrowed into r1_2, which is then probably even re-borrowed again to create the &mut i32 typed self argument passed to the AddAssign implementation of i32.[1]

So we have a stack x -> r1 -> r1_2 -> re-borrow of r1_2 at this point of executing the +=, there are no branches. Still there are notably multiple references existing at the same time; just all but the top-of-stack are sort-of inactive / unusable at the time until their respective re-borrow ends; this is how the “only one mutable reference at a time” kind of rule manifests.

Now onto yet another way of branching… even with only mutable references, branches in this re-borrowing stack/tree are possible, in two ways: One of them is via function calls. If you call a fn(&'a mut X) -> (&'a mut Y, &'a mut Z) function, then you turned your one mutable reference into two new ones, and you created a branching at the top of your re-borrowing tree. Their lifetimes are connected (they’re the same; but if the function returned e.g. &'b mut Y with a 'a: 'b bound, the effect would be the same) so the two new references replace the old one in your mental borrow tree. That’s what happens with the split_at_mut. Let’s first blow up

let mut v = vec![0,1];
let (f_ref,s_ref) = v.split_at_mut(1);

into more steps…

let mut v = vec![0,1];
let v_ref = &mut v; // line 2
let v_slice_ref = <Vec<_> as std::ops::DerefMut>::deref_mut(v_ref); // the `&mut self` argument for `split_at_mut` // line 3
let (f_ref, s_ref) = <[_]>::split_at_mut(v_slice_ref, 1); // line 4

borrow stack at/after line 2

v <-mut- v_ref

borrow stack at/after line 3

v <-mut- v_slice_ref

and at/after line 4

v <-mut- f_ref
 ^
 |
 +--mut- s_ref

[2]

So well… now we do have more than one borrow next to each other. But still we cannot create more such borrows directly by accessing v, of course. Both f_ref and s_ref do imply full mutable access over the whole of v as far as the borrow checker is concerned (the information relevant for safety that, and how, they’re non-overlapping is hidden from the borrow checker behind a function boundary), which means as long as either of f_ref or s_ref still exists, you cannot access v, in particular you cannot directly create new references to v using syntax such as &v, until both f_ref and s_ref reach the end of their life-time.


Now a short discussion why

    let mut v = vec![0,1];
    let f_ref = &mut v[0]; // line 2
    let s_ref = &mut v[1]; // line 3

(and then accessing f_ref and s_ref) fails. The borrow stack after line 2 is

v <-mut- f_ref

which comes from passing a mutable references to v to the index_mut operation. (It is by the way a general principle that the borrow checker will work on desugared code, so don’t mind me preferring to talk about an index_mut method rather than indexing syntax. We have a simple fn (&'a mut Vec<T>, usize) -> &'a mut T method for indexing, and reason about it like any ordinary method call.)

Now the next line, line 3, accesses v again. Currently f_ref’s re-borrowing of v prevents that, so we must “kill” f_ref (figuratively) and turn the borrow stack back into

v

before being able to execute line 3 resulting in

v <-mut- s_ref

Since f_ref is “dead” now (figuratively), using f_ref in a print statement will result in a compilation error.


I alluded to these borrows borrowing v completely above. There are also partial borrows. You’d need to make your borrowing stacks/trees even more elaborate to cover those. E.g. if you have a struct foo, you can create both let ref_a = &mut foo.a and let ref_b = &mut foo.b and use them in parallel. Or even borrow one field mutably and another one immutably. I could imagine you might want to write this as

foo <--mut @.a-- ref_a
   ^
   |
   +---mut @.b-- ref_b

If foo also has a c field, it is still not completely borrowed; for every access to foo while it’s partially borrowed, you would need to consider “what parts of foo am I trying to access in what manner (mutably vs. immutably)?” and “what parts of foo are currently borrowed in what manner (mutably vs. immutably)?” and check that the answers to these questions are not involving mutable access to any already borrowed (mutably or immutably) part, and that they don’t involve any access to any part that is already mutably borrowed. Or you’ll need to “kill” existing borrows appropriately.[3]

In fact, the slice patterns that @quinedot demonstrates seem to give the borrow checker a fairly fine-grained view of what parts of the slice are borrowed by what reference. To the point that the following code does compile!

fn main() {
    let mut v = vec![0,1];
    let slice_ref = &mut v[..];
    let (f_ref, s_ref): (&mut i32, &mut i32) = match slice_ref {
        [one, two] => (one, two),
        _ => unreachable!(),
    };
    
    println!("{f_ref:?}{s_ref:?}"); // point A
    let f_ref2: &mut i32 = match slice_ref {
        [one, _] => one,
        _ => unreachable!(),
    };
    // s_ref is still usable!!
    println!("{f_ref2:?}{s_ref:?}");
}

The borrow stack at point A would need to look something like

v <-mut- slice_ref <-mut @[0]- f_ref
                  ^
                  |
                  +--mut @[1]- s_ref

and killing/ending f_ref makes the index [0] of the slice accessible again for the new f_ref2, resulting in the fact that s_ref is allowed to stay alive throughout the second match expression, where the second entry is not touched at all because it’s matched against a _ pattern.

See… things are getting complex, and more complex, and more complex… which is why no-one will have a complete mental model of the borrow checker (I needed to test the above code to find out whether or not it would be accepted, I didn’t know myself beforehand, but experimenting with the compiler’s reaction to toy examples is a great way of learning more…), and no-one will apply all their knowledge of the borrow checker to all their code either, especially when code just works.

Anyways, to mention even more… I have not gone into (and admittedly also do not have a particularly detailed mental model of) what happens if you put references into other structs. In fact, we glossed over the fact that the split_at_mut function actually returns a single value (&'a mut [i32], &'a mut [i32]), at which point one could wonder “how does this compound fit into this borrowing-stack?” before we break it up into two separate &'a mut [i32] values by pattern matching. To give some negative answers: “how does this compound fit into this borrowing-stack?” cannot be answered with “the borrow checker just considers the individual fields”, since the fields of a struct (in this context, I would consider a tuple to be merely a struct, too) are an implementation detail[4]. If the tuple was a struct Foo<'a>(&'a mut i32, &'a mut i32) instead, then I’d say: treat the Foo<'a> like any other reference in your re-borrowing tree. But what if there are two lifetime parameters, as in Bar<'a, 'b>(&'a mut i32, &'b mut i32)? Can we get a borrow-DAG[5]? Probably yes!? But moving a &'a mut i32 field out of Bar<'a, 'b> would re-instantiate the position of that &'a mut i32 in the part of the borrow-tree where the 'a lifetime came from[6]. So maybe Bar<'a, 'b> gets one entry for each of its lifetime parameters… of course a fn foo(&'a mut X, &'b mut Y) -> &'c mut Z function with 'a: 'c and 'b: 'c bounds could now create a single lifetime 'c that does combine two different original lifetimes, so we do get a DAG after all, for sure. I guess it’s getting really really complicated now, so let me just stop… :slight_smile:


  1. Well… technically the += operator for i32 is magically compiler-implemented and doesn’t use that trait, but one shouldn’t care about this implementation detail. ↩︎

  2. By the way, one could also argue that passing the intermediate &mut references to deref_mut and split_at_mut would add extra re-borrows. This is still talking about mental models, so either approach should be fine; also I don’t think there’s an easy way to figure out what the borrow checker “actually” does, as it shouldn’t make a difference. With this approach, you’d have the following diagrams:

    borrow stack at/after line 2

    v <-mut- v_ref
    

    borrow stack at/after line 3

    v <-mut- v_ref <-mut- v_slice_ref
    

    and at/after line 4

    v <-mut- v_ref <-mut- v_slice_ref <-mut- f_ref
                                     ^
                                     |
                                     +--mut- s_ref
    
    ↩︎
  3. I mean, there’s two ways to think about borrow-checking conflicts of course: Either, you access a borrowed value in a conflicting way, which kills a borrow, and then the killed borrow is later used again; or you determine that each borrow ends (and gets removed from the borrowing stack) on its own whenever it’s last used, and then the conflicting access is already the error. By the way – sticking with the first approach – killing a borrow would always, of course, also kill all it’s re-borrows to the right in the tree, recursively. ↩︎

  4. and the borrow checker must not consider implementation details, otherwise changing such implementation details would be a breaking change, and some properties of these “implementation details” would thus become part of the public API anyways ↩︎

  5. directed acyclic graph ↩︎

  6. I could create a code example to “prove” this if you want ↩︎

3 Likes

If it is not too much to ask, I would really like to hear about the magical parts of Rust that one should be aware of when thinking about borrows and lifetimes.

I am not sure what you mean by the API of IndexMut, but when I think of the APIs of functions and methods, I think of their signatures.

Signature of IndexMut is (&mut T, usize) -> &mut T::Output
Signature of split_at_mut is (&mut T, usize) -> (&mut T, &mut T)

So if returning a mut ref from another mut ref borrows the entire thing the first ref points to (which must be the case for IndexMut, if its API guarantees such behaviour), then the second signature does not signal that it is doing anything different.

This example is again eye-opening for the same reason. I would have said that if you pattern match on a mut slice, what you get is (at most) a mut ref to some part, and since that mut ref has to borrow the entire scrutinee, you cannot get another mut ref to any part of that scrutinee.

So, it seems that some language constructs returning references borrow the entire data (and I had expected this to be the only way), while others, like pattern matching, manage to borrow only parts. Is there again something "magical" going on? What are the other parts of the language that can introduce this kind of magic?

So what this amounts to is a difference between the "static model" and the "dynamic model".

If you have fn(_: &'a mut _) -> &'a _, the 'a lifetime means that the input reference is unusable until the returned reference expires and the 'a lifetime is allowed to end.

But it doesn't mean that the returned reference actually borrows from the input, just that it's allowed to. As an obvious example,

fn example<'a>(_: &'a mut i32) -> &'a str {
    let s: &'static str = "hi";
    s
}

Yet at the call site, the compiler still treats the returned reference as if it might borrow from the input:

error[E0499]: cannot borrow `*v` as mutable more than once at a time
  --> src/main.rs:9:5
   |
8  |     let r = example(&mut *v);
   |                     ------- first mutable borrow occurs here
9  |     &mut *v;
   |     ^^^^^^^ second mutable borrow occurs here
10 |     &*r;
   |     --- first borrow later used here

The output lifetime/reference is tied to the input one; any use of the output reference requires the input reference to be valid.

This applies identically to when multiple output references are produced, e.g. fn(_: &'a mut [T]) -> (&'a mut [T], &'a mut [T]). Nothing about the signature cares about where any of the references actually borrow from; the lifetimes just say where the references are allowed to borrow from.

In this case, both output references have permission to borrow from the input reference. Thus, the input reference is locked until both output references are no longer used again, releasing the lock. Whenever either reference tied to the 'a lifetime is used, it asserts that the lock is still active and valid.

But that doesn't force them to be actually referencing the entire input. Again, I could trivially write a function returning (&mut [], &mut []) with the 'static lifetime, and the signature would still force the caller to deal with the possibility that these references require the lock on the input reference, because the signature says they may.
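
For instance, something like this sketch compiles (it leans on the fact that &mut [] of an empty array is promoted to 'static):

// The signature *allows* the outputs to borrow from `_input`,
// but nothing forces them to: these are 'static empty slices.
fn fake_split<'a>(_input: &'a mut [i32]) -> (&'a mut [i32], &'a mut [i32]) {
    (&mut [], &mut [])
}

fn main() {
    let mut data = vec![1, 2, 3];
    let (a, b) = fake_split(&mut data);
    // From the caller's perspective, `a` and `b` *may* borrow `data`,
    // so `data` stays locked until both are no longer used.
    println!("{a:?} {b:?}");
    println!("{data:?}");
}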

As far as what they actually reference, it does not matter what the input reference borrows; all that matters is the standard rule that no two active &mut borrows may alias (and that references get permission from somewhere, either from the place they borrow or from a parent reference). Having two &mut to different parts of one allocation breaks no rules, so is allowed.

The relevant concept here is borrow splitting.

Given

struct Vec2f {
    pub x: f32,
    pub y: f32,
}

it's perfectly valid to split the borrow, e.g.

let v: &mut Vec2f = …;
let x = &mut v.x;
let y = &mut v.y;
dbg!(x, y);

despite the fact we're using v twice here. There are a few specific cases where primitive indexing is potentially allowed to do borrow splitting the same way, but it's essentially only through pattern matching as shown above, not separate expressions like field borrow splitting.

1 Like