What is a good mental model of the borrow checker?

That’s a lovely example for another round of the diagrams I started in an earlier post above.

#[derive(Debug)]
pub struct S {
    a: String,
    b: String,
}

impl S {
    pub fn foo(&mut self) {
        // (1)
        let a_ref = &mut self.a;
        // (2)
        let b_ref = &mut self.b;
        // (3)
        
        *a_ref += "a";
        *b_ref += "b";
        println!("{a_ref} {b_ref}");
        
        let a_ref_two = &self.a;
        // (4)
        *b_ref += "b";
        println!("{a_ref_two} {b_ref}");
        
        let s_ref = &*self;
        // (5)

        println!("{s_ref:?} {a_ref_two}");
    }
}

At (1)

// `self` is the variable of type `&mut S`, not the `S` itself
self

At (2)

// “*.a” stands for the place obtained by first dereferencing, then going to field `.a`
self <--mut @*.a-- a_ref

At (3)

// creating `b_ref` accesses only the “*.b” part of `self`, which is disjoint from the
// part mutably borrowed by `a_ref`, which is why `a_ref` can still exist
self <--mut @*.a-- a_ref
    ^
    |
    +---mut @*.b-- b_ref

At (4)

// creating `a_ref_two` accesses the “*.a” part, so `a_ref` must end/die,
// and the newly created shared reference `a_ref_two` now borrows immutably
// from `self` at “*.a”, hence the edge is labeled “shared” rather than “mut”
self             ( a_ref [now dead])
  ^ ^
  | |
  | +---mut @*.b-- b_ref
  |
  +--shared @*.a-- a_ref_two
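
To double-check that a_ref really has to die here, consider a hypothetical variation (method name made up, same struct S as above) where a_ref is used again after a_ref_two is created; it is rejected:

impl S {
    pub fn foo_err(&mut self) {
        let a_ref = &mut self.a;
        let a_ref_two = &self.a; // would have to kill `a_ref` ...
        *a_ref += "a";           // ... but `a_ref` is used again here: rejected
        println!("{a_ref_two}");
    }
}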

At (5)

// creating `s_ref` accesses `self` at `*` (i.e. its dereferencing target)
// which overlaps with both “*.a” and “*.b”. It’s a shared borrow
// so any overlapping mutable borrow conflicts and must die. This
// kills `b_ref`, but NOT `a_ref_two` since the latter has non-`mut`
// access to `self`. Note that we did NOT care about the type of `b_ref`
// here but merely about whether the arrow in the re-borrow diagram
// has a “mut” or “shared”.
self             ( a_ref [now dead])
^ ^
| |
| |              ( b_ref [now dead])
| |
+ +--shared @*.a-- a_ref_two
|
+----shared @*---- s_ref
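
Analogously, a made-up variation for (5): keeping b_ref alive across the creation of s_ref is rejected, while a_ref_two keeps working just fine:

impl S {
    pub fn foo_err_two(&mut self) {
        let b_ref = &mut self.b;
        let a_ref_two = &self.a;
        let s_ref = &*self;                // would have to kill `b_ref` ...
        *b_ref += "b";                     // ... but `b_ref` is used again here: rejected
        println!("{s_ref:?} {a_ref_two}"); // `a_ref_two` survives, shared borrows can overlap
    }
}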

And here’s an example with insert_then_get:

struct S(/* … */);
struct T(/* … */);

impl S {
    fn insert_then_get<'a>(&'a mut self, data: T) -> &'a T { todo!() /* ... */ }

    pub fn foo(&mut self, data: T) {
        // (1)
        let immut = self/*(2)*/.insert_then_get(data);
        // (3)

        let other_immut = &*self;
        // (4)

        (immut, other_immut); // compilation error
    }
}

At (1)

// self: &mut S
self

At (2): We implicitly create a &mut *self re-borrow here, which is passed to insert_then_get(data). [Just so no-one blames this re-borrow later: if we didn’t re-borrow, self would be moved, so that’s no good either, as we use it later.]

// I’m calling it `insert_then_get::self`, because it’s
// the value of the `self` variable inside the `insert_then_get` function
self <--mut@*-- insert_then_get::self
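
Spelled out (a rough sketch, not exact compiler output), the implicit re-borrow at the call site looks like this:

// roughly what `self.insert_then_get(data)` desugars to: an implicit
// re-borrow `&mut *self` is created and passed as the receiver
let immut = S::insert_then_get(&mut *self, data);

// without the re-borrow, i.e. passing `self` by value, the `&mut S` would
// be moved out of `self`, so the later `&*self` couldn’t work at all:
// let immut = S::insert_then_get(self, data);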

Now for (3), we need the right model for reasoning about function calls. I would agree with @quinedot that for this purpose replacing the values passed into the function makes a lot of sense. Note that we are replacing insert_then_get::self with the return value because the lifetimes in the signature dictate that the returned &'a T’s lifetime is the same as (and thus in particular outlived by) the &'a mut self’s lifetime. As mentioned before, more complicated function signatures might require replacing multiple inputs with a single output, which inherits all the edges in the borrow graph, so it’s a DAG then, no longer just a tree – or multiple return values[1] could replace a single input, possibly resulting in multiple parallel <-mut- edges that almost look as if they’re conflicting[2].
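
To make those footnote cases concrete, here are two made-up signatures (not from the code in this thread):

// one output that may borrow from either of two inputs: after the call,
// the return value has to be treated as borrowing from both, so it
// inherits an edge from each input (a DAG, not a tree)
fn longer<'a>(x: &'a mut String, y: &'a mut String) -> &'a mut String {
    if x.len() >= y.len() { x } else { y }
}

// two outputs replacing a single input: both returned references borrow
// from *p, giving two parallel <-mut- edges that don’t actually conflict,
// because they cover disjoint fields
struct Pair {
    a: String,
    b: String,
}

fn split<'a>(p: &'a mut Pair) -> (&'a mut String, &'a mut String) {
    (&mut p.a, &mut p.b)
}

At a call site like let r = longer(&mut s1, &mut s2);, both s1 and s2 stay borrowed for as long as r is in use.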

So at (3)

// the type of `immut` is `&T`, but types don’t matter, instead
// the existing `<-mut@*-` edge is simply kept in place
self <--mut@*-- immut

At (4)

self           ( immut [now dead])
    ^
    |
    +-shared@*-- other_immut

immut got killed since the shared @* conflicts with the mut @*.

The compilation error only happens because immut is accessed again afterwards, even though it is already dead at that point.
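
To see that it really is the access that triggers the error, here’s a hypothetical variation (same S, T and insert_then_get as above) in which immut is simply never used after other_immut exists; it compiles:

impl S {
    pub fn foo_ok(&mut self, data: T) {
        let immut = self.insert_then_get(data);
        let _last_use = immut; // last use of `immut`

        let other_immut = &*self; // kills `immut`; fine, since it’s never used again
        let _ = other_immut;
    }
}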


For comparison, here’s the equivalent code, but with an actual implementation:

struct S(Option<T>);
struct T(u8);

impl S {
    fn insert_then_get<'a>(&'a mut self, data: T) -> &'a T {
        (*self).0 = Some(data);
        let &mut S(Some(ref data_ref)) = self else { unreachable!() };
        data_ref
    }

    pub fn foo(&mut self, data: T) {
        let immut = self.insert_then_get(data);

        let other_immut = &*self;

        (immut, other_immut); // compilation error
    }
}

and if we now inline the function call

struct S(Option<T>);
struct T(u8);

impl S {
    pub fn foo(&mut self, data: T) {
        let immut = {
            let self_ = &mut *self; // first function argument
            let data_ = data; // second function argument

            (*self_).0 = Some(data_);
            let &mut S(Some(ref data_ref)) = self_ else { unreachable!() };
            data_ref
        };

        let other_immut = &*self;

        (immut, other_immut); // compilation error
    }
}

it still fails (with almost the same error message)! This demonstrates that the problem was not merely an artificial constraint from the function signature (something that can happen, too, if lifetimes are too restrictive for the caller); borrow checking while seeing the body doesn’t help either. Only if the re-borrow is eliminated will the code compile:

struct S(Option<T>);
struct T(u8);

impl S {
    pub fn foo(&mut self, data: T) {
        let immut = {
            // let self_ = &mut *self; // re-borrow no more!
            let data_ = data; // second function argument

            (*self).0 = Some(data_);
            let &mut S(Some(ref data_ref)) = self else { unreachable!() };
            data_ref
        };

        let other_immut = &*self;

        (immut, other_immut); // works!!!
    }
}

This makes for two more nice cases.


struct S(Option<T>);
struct T(u8);

impl S {
    pub fn foo(&mut self, data: T) {
        let immut = {
            let self_ = &mut *self; // first function argument
            let data_ = data; // second function argument

            (*self_).0 = Some(data_);
            // (1)
            let &mut S(Some(ref data_ref)) = self_ else { unreachable!() };
            // (2)
            data_ref
        };
        // (3)

        let other_immut = &*self;
        // (4)

        (immut, other_immut); // compilation error
    }
}

Let’s skip the most boring steps.

At (1)

self <---mut @*--- self_

At (2): Note that the match accesses self_ by dereferencing, then going into the only field of the S, then into the Some variant, and inside of that into its only field. This field is then borrowed immutably by the ref data_ref pattern. Let’s make up some ad-hoc syntax for enum variant access, e.g. .:Some.

self <---mut @*--- self_ <---shared @*.0.:Some.0--- data_ref

Before reaching (3), the inner block ends. This is an interesting new thing: we are dropping self_. There must be some borrow checking rules to determine what happens there. E.g. if some references returned from that block borrowed from self_ directly, that would be bad. However, data_ref borrows only things that are behind at least one level of indirection and thus the borrow-checker allows such references to keep existing even though self_ is gone. Logically, we still have a situation as follows

// `self_` is dropped, but the reference it used to contain is not dead:
// its lifetime, inherited (as an upper bound) on `data_ref`
// is still very much alive

self <---mut @*--- (self_ [dropped]) <---shared @*.0.:Some.0--- data_ref

however, since self_ is out of scope, nobody cares anymore about the details of the ways in which self_ was borrowed. References that re-borrow from self_ can no longer be killed by new accesses to the self_ variable (as self_ can no longer be mentioned in any code since it’s out of scope), so the only way they can still be killed is by any access that would have killed self_.
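
As a quick made-up illustration of the “bad” case mentioned above – a reference that borrows the local self_ itself, rather than something behind it, cannot leave the block:

// does not compile: `bad` would borrow the local `self_` directly,
// but `self_` is dropped at the end of the block
let bad = {
    let self_ = &mut *self;
    &self_ // error: `self_` does not live long enough
};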

For illustrative purposes let’s carry on with both interpretations though. Either we still have

At (3)

// `data_ref` was moved into `immut` by the way, which we’ll model by simply replacing
// the name in the diagram

self <---mut @*--- (self_ [dropped]) <---shared @*.0.:Some.0--- immut

Or we can reduce the “(self_ [dropped]) <---shared @*.0.:Some.0--- ” part and simplify to

self <---mut @*--- immut

At (4), the other_immut reference is created, and this operation needs (immutable) access to *self, which overlaps with the existing <---mut @*--- edge. So, depending on the interpretation, we get either

// `self_` was killed; killing a borrow also kills all its re-borrows recursively
self      (self_ [dropped, now also dead])     (immut [dead])
    ^
    |
    +--shared @*-- other_immut

or in the other interpretation

self              (immut [dead])
    ^
    |
    +--shared @*-- other_immut

Finally, the code that does work

struct S(Option<T>);
struct T(u8);

impl S {
    pub fn foo(&mut self, data: T) {
        let immut = {
            // let self_ = &mut *self; // re-borrow no more!
            let data_ = data; // second function argument

            (*self).0 = Some(data_);
            // (1)
            let &mut S(Some(ref data_ref)) = self else { unreachable!() };
            // (2)
            data_ref
        };
        // (3)

        let other_immut = &*self;
        // (4)

        (immut, other_immut); // works!!!
    }
}

At (1)

// so far, only one write access directly to `self` happened,
// no references were created
self

At (2)

self <---shared @*.0.:Some.0--- data_ref

At (3) (no data with lifetimes was dropped, so nothing to talk about w.r.t. dropping things)

// moved `data_ref` to `immut`
self <---shared @*.0.:Some.0--- immut

At this point, in the previous code, the shared access was not on self directly, but hidden deeper in the tree behind an outer <--mut …-- edge; that is the significant difference which results in immut surviving the next step here.

At (4), the creation of other_immut doesn’t conflict with immut since immut is only an immutable borrow.

self <---shared @*.0.:Some.0--- immut
    ^
    |
    +----shared @*------------- other_immut

  1. still ignoring that there is arguably an intermediate step of receiving a single tuple value and then taking it apart ↩︎

  2. but as you see above, the only time we actually check for conflicts is when creating new borrows, and kill existing borrows overlapping with that new borrow ↩︎
