Edit: I added a TLDR at the end of this post to summarize the points raised during the discussion. It might be helpful if you are looking for a quick guide to the topics related to the borrowchecker.
Whenever we write any piece of code, we rely on our mental models. We rely on our mental model of the hardware, and map the code we write onto hypothetical actions that this imagined hardware would perform. Obviously our mental model of the hardware is just that, a model. It does not correspond to any actual hardware: it does not include the weird edge cases that exist on real hardware, and it might include operations that do not exist on real hardware.
Our mental model of the hardware might stem from an abstract machine model that is defined by some programming language specification. We might think that some expressions are lvalues and can be addressed while others cannot. We rely on that model even when there is no address-of operator in the language we are writing, e.g. in any language without pointers.
But even when our mental model is derived from a specification, it does not include all the details of that specification. The specification might talk about expiring values (xvalues) and so on, but those might not be part of the mental model that we use in our day-to-day programming.
When we write code in a statically typed language, we rely on our mental model of the typechecker. For example, we think that if `f` is of type `fn<T,U>(T)->U` and `x` is of type `T`, the expression `f(x)` will be of type `U`. Or, read the other way around, `f(x)` is only well typed if `f` is a function with some type `fn<T,U>(T)->U` and `x` is of type `T`.
This mental model might also be derived from the specification of a particular typechecker or from an academic paper on type theory, but it does not include the exact algorithm by which such rules are checked, or how they are represented as data structures inside the typechecker.
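As a small sketch of that typing rule in Rust terms (the names `apply`, `widen` and `demo` are made up for illustration, they are not from any library):

```rust
// Hypothetical helper: takes a function of type fn(T) -> U and a T,
// and the call expression `f(x)` has type U.
fn apply<T, U>(f: fn(T) -> U, x: T) -> U {
    f(x)
}

// A concrete function of type fn(i32) -> i64.
fn widen(n: i32) -> i64 {
    i64::from(n) * 2
}

// Here the typechecker picks T = i32 and U = i64,
// so the whole expression is of type i64.
fn demo() -> i64 {
    apply(widen, 21)
}
```

Passing something that is not an `i32` as the second argument, for example `apply(widen, "21")`, is rejected, because no choice of `T` makes both arguments well typed.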
What is distinctive about Rust is its ownership (destructive moves), lifetime, and borrowing system. I assume that all proficient Rustaceans have a good mental model of the borrowchecker.
That mental model might be derived from your knowledge of the borrowchecker's implementation, from what you gathered from books and tutorials, or from your experience.
Even though I do not yet have such a good mental model myself, I assume that yours also does not include every nitty-gritty detail of the borrowchecker's implementation.
I would like to know your opinion: what is a good mental model of the borrow checker?
I presume it is more detailed than the following:
- each value has an owner.
- each variable has a lifetime
- the lifetime of a reference must not exceed the lifetime of the referent
- there can be many shared references simultaneously
- there can only be one mutable reference at a given time.
The list above might be a good mental model to begin with, but I assume that an experienced Rustacean has a more detailed, more accurate model.
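To make the basic rules listed above concrete, here is a minimal sketch (the function name `shared_then_mutable` is made up for illustration). It only shows the "many shared, one mutable" rule, assuming that a reference stops counting once it is no longer used:

```rust
fn shared_then_mutable() -> i32 {
    let mut value = 10; // `value` owns the integer

    // Any number of shared references may coexist.
    let r1 = &value;
    let r2 = &value;
    let sum = *r1 + *r2; // last use of r1 and r2

    // Only one mutable reference at a time. It is accepted here
    // because r1 and r2 are never used again.
    let m = &mut value;
    *m += 1;

    // Uncommenting the next line makes the compiler reject the function
    // (error E0502), because r1 would then overlap with `m`:
    // let _ = *r1;

    sum + value
}
```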
Let me mention my incipient mental model for shared references.
- every variable has a lifetime that is determined at compile time
- that lifetime starts when the variable is initialized and definitely ends when the scope of the variable ends or it is moved.
- if `a` is a variable with type `T` and lifetime `'a`, then `&a` returns a reference with type `&'b T` where `'a: 'b`, which means that the lifetime of the reference is bounded by the lifetime of the referent.
- A lifetime bound `'a: 'b` is satisfied when the lifetime `'b` lies completely within the lifetime `'a`.
- So I can scan the source code, think about the region that corresponds to the lifetime `'a` of `a` and the region that corresponds to the lifetime `'b`, and if I see that `'b` is totally within `'a`, I have a well-typed program.
This mental model might correspond to what the borrow checker actually does, but it does not concern itself with the algorithms that the borrow checker executes. It gives a rule of thumb that you can follow while you are writing code.
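The region-scanning rule of thumb can be tried out on a small sketch like the following (the function name `region_demo` and the variable names are made up; the region labels in the comments are my informal reading, not compiler output):

```rust
fn region_demo() -> usize {
    let a = String::from("referent"); // region of `a` starts here
    let len;
    {
        let r = &a;    // region of the reference starts here
        len = r.len(); // last use of `r`: its region ends inside the region of `a`
    }
    len
} // region of `a` ends here

// The opposite case is rejected (error E0597), because the referent's
// region ends before the reference is last used:
//
// let r;
// {
//     let short = String::from("short");
//     r = &short;
// } // `short` is dropped here
// println!("{r}");
```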
However, this model is limited to shared references. For example, it does not explain how a well-typed program is guaranteed never to have two mutable references to the same data at the same time. It does not explain the relationship between a reference to a struct and a reference to a field of that struct. It does not explain when a function (possibly an unsafe one) that accepts a raw pointer and returns two mutable references can be well typed.
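To illustrate the last of these gaps, here is a sketch of such a function (the names `split_pair` and `swap_via_split` are hypothetical). As far as I understand it, the compiler accepts this because it does not track borrows through raw pointers, so the burden of not creating overlapping mutable references falls on whoever writes and calls the unsafe code:

```rust
/// Hypothetical helper: splits a two-element array behind a raw pointer
/// into two mutable references to its two distinct elements.
///
/// # Safety
/// `pair` must be non-null, aligned, and point to a live `[i32; 2]`, and
/// nothing else may access that array while the returned references are used.
unsafe fn split_pair<'a>(pair: *mut [i32; 2]) -> (&'a mut i32, &'a mut i32) {
    let base = pair as *mut i32;
    // SAFETY: relies on the caller contract above; elements 0 and 1
    // never overlap, so the two references point to disjoint memory.
    unsafe { (&mut *base, &mut *base.add(1)) }
}

fn swap_via_split() -> [i32; 2] {
    let mut data = [1, 2];
    let ptr: *mut [i32; 2] = &mut data;
    // SAFETY: `ptr` comes from a live local array that is not
    // accessed in any other way while `first` and `second` are in use.
    let (first, second) = unsafe { split_pair(ptr) };
    std::mem::swap(first, second);
    data
}
```

The safe standard-library counterpart of this pattern is `split_at_mut` on slices, which hands out two non-overlapping mutable slices.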
What I would like to hear from you is your mental model of the borrowchecker: not how it is actually implemented, but what you assume about it when you write your code, and what heuristics you use when you read a piece of code and guess whether it will pass the borrowchecker or not.
This might sound like a huge, open-ended question, and questions on forums tend to get the most attention when there is a specific problem that two lines can answer. But I think this is an important question, because ownership, borrowing, and lifetimes are the most distinctive features of Rust, and all beginners, even veteran programmers coming from other languages, struggle with them. Providing detailed answers in forum posts, blog posts, and other long-form writing would be an invaluable contribution to the community. And hopefully some of you have a New Year's resolution to contribute to the Rust community.
I really hope to hear your mental model of the borrowchecker!!!
TLDR Roadmap to understand borrows and lifetimes
Later replies to this post make several good points and contain extremely helpful remarks and examples. Some concepts keep recurring, which suggests they are fundamental for understanding borrows and lifetimes. The following is my attempt to summarize those concepts and divide them into buckets so that it might be helpful for other Rustaceans. I might edit it further if it proves useful to others; tell me how it can be improved.
- beginner
  - shared and mutable references must live at most as long as the referent, and references lock the value they point to, i.e. once you have a shared reference, your interaction with the value must be through it (or you can drop the reference and keep interacting with the value directly), and similarly with mutable references.
  - Reborrows: if you have a shared reference, you can get new shared references through it, and if you have a mutable reference, you can get shared or mutable references from it, but the same rules of locking and interaction apply (i.e. if you get a mut ref `x` from a mut ref `y`, all your interactions with the data must be via `x` until you drop it).
- intermediate
  - Interior mutability: some types give you the ability to mutate the data even when you only have a shared reference to that type.
  - Field/borrow splitting: the borrowchecker understands some compound types well enough that you can have two mut references at the same time to disjoint parts of the compound type.
- advanced
  - Two-phase borrows: the borrowchecker can sometimes reinterpret the order in which a mut ref and a shared ref to the same data came into being, so that patterns like `data.set(data.get())` can be allowed.
- common pitfalls
  - dropping a reference has no effect on the lifetime of the reference itself.
  - common lifetime misconceptions: 1 (a classic that must be anthologized and widely read)
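The beginner, intermediate, and advanced entries above can each be sketched in a few lines. This is only my attempt at minimal examples; all function names and the `Point` struct are made up for illustration, while `Cell`, `Vec::push`, and `std::mem::swap` are the real standard-library items.

```rust
use std::cell::Cell;

// Reborrow: `inner` is a mutable reborrow of `outer`, and `outer`
// cannot be used until the last use of `inner`.
fn reborrow() -> i32 {
    let mut n = 1;
    let outer = &mut n;
    let inner = &mut *outer; // reborrow through `outer`
    *inner += 1;             // last use of `inner`
    *outer += 1;             // `outer` is usable again
    n
}

// Interior mutability: `Cell` allows mutation through a shared reference.
fn interior_mutability() -> i32 {
    let cell = Cell::new(1);
    let shared = &cell;
    shared.set(shared.get() + 1);
    cell.get()
}

struct Point {
    x: i32,
    y: i32,
}

// Field splitting: two mutable references to disjoint fields may coexist.
fn field_splitting() -> (i32, i32) {
    let mut p = Point { x: 1, y: 2 };
    let px = &mut p.x;
    let py = &mut p.y;
    std::mem::swap(px, py);
    (p.x, p.y)
}

// Two-phase borrows: `push` takes `&mut self` and `len` takes `&self`,
// yet the nested call is accepted.
fn two_phase() -> Vec<usize> {
    let mut v = vec![0];
    v.push(v.len());
    v
}
```

Note that `shared.set(shared.get() + 1)` in `interior_mutability` needs no two-phase borrow at all, since both methods of `Cell` take `&self`; the `Vec` example is the one that relies on it.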