Non-lexical-lifetime on mutable reference scope

I read the RFC about NLL. I know it's an internal thing, and I want to understand how rust can tell the below code violates the borrowing rule through MIR, instead of lexical scope.

fn capitalize(_: &mut [char]) {}

fn bar() {
    let mut data = vec!['a', 'b', 'c'];
    let slice = &mut data[..]; // <-+ 'lifetime
    data.push('d'); // ERROR!  //   |
    capitalize(slice);         //   |
} // <------------------------------+

Motivation

understanding borrowing rules: reborrow and NLL makes it hard to understand borrowing rules.

The lifetime can be defined in finer-grained regions in terms of MIR.

This RFC proposes a more flexible model for lifetimes. Whereas previously lifetimes were based on the abstract syntax tree, we now propose lifetimes that are defined via the control-flow graph. More specifically, lifetimes will be derived based on the MIR used internally in the compiler.


Context

The given example in the RFC:

fn foo() {
    let mut data = vec!['a', 'b', 'c']; // --+ 'scope
    capitalize(&mut data[..]);          //   |
//  ^~~~~~~~~~~~~~~~~~~~~~~~~ 'lifetime //   |
    data.push('d');                     //   |
    data.push('e');                     //   |
    data.push('f');                     //   |
} // <---------------------------------------+

fn capitalize(data: &mut [char]) {
    // do something
}

pseudo-MIR

let mut data: Vec<i32>;
let slice: &'slice mut i32;
START {
    data = ...;
    slice = &'borrow mut data;
    capitalize(slice);
    data.push('d');
    data.push('e');
    data.push('f');
}

The constraints generated will be as follows:

('slice: {START/2}) @ START/2
('borrow: 'slice) @ START/2

My understanding is the lifetime of 'borrow is only at point START/2;


Question

what about below? Just move data.push ahead of capitalize:

let mut data: Vec<i32>;
let slice: &'slice mut [char];
let tmp0: &'tmp0 mut Vec<i32>;

START {
/*0*/    data = ...; 
/*1*/    slice = &'borrow mut data[..];
/*2*/    ... // some dummy code, not using slice nor tmp0
/*3*/    tmp0 = &'tmp1 mut data;
/*4*/    Vec::push(tmp0, 'd');
/*5*/    capitalize(slice);
}

My intuition tells me, borrow should have scope from START/1 to START/5, Within borrow, there shouldn't exist any mutable reference to data.

But following the RFC, it seems, there's no overlap of 'tmp1 and 'borrow,

tmp1: {START/4}
borrow: {START/2, START/5}

There's no multiple mutable access to data and it should compile?

But it doesn't. playground

error[E0499]: cannot borrow `data` as mutable more than once at a time
 --> src/lib.rs:6:5
  |
5 |     let slice = &mut data[..]; // <-+ 'lifetime
  |                      ---- first mutable borrow occurs here
6 |     data.push('d'); // ERROR!  //   |
  |     ^^^^ second mutable borrow occurs here
7 |     capitalize(slice);         //   |
  |                ----- first borrow later used here

Could anyone tell me where I did wrong in following the algorithm proposed in that RFC?

Subtyping produces lifetime constraints, and the inference solver expands the set of borrow points until the constraints are satisfied. See example 3 under subtyping, for example.

The meaning of a constraint like ('a: 'b) @ P is that, starting from the point P, the lifetime 'a must include all points in 'b that are reachable from the point P.

...

To guarantee these properties, we must prevent actions that might affect the borrowed memory for all of the points between P (the borrow) and Q (the use).

So the borrow of data includes the push at 4, because the use at 5 extends the borrow to all points between.

1 Like

Thanks a lot!

I think this is not mentioned in the algorithm section, correct me if I'm wrong; I was under impression that the region (lifetime) of this MIR approach is a set of discrete points, instead of a set of intervals (represented by points).

Probably the liveness section vaguely mentioned what you quoted: slice is used at {2, 5}, which means 2-5.

It's described in the Solving Constraints section:

Once the constraints are created, the inference algorithm solves the constraints. This is done via fixed-point iteration: each lifetime variable begins as an empty set and we iterate over the constaints, repeatedly growing the lifetimes until they are big enough to satisfy all constraints.

The meaning of a constraint like ('a: 'b) @ P is that, starting from the point P, the lifetime 'a must include all points in 'b that are reachable from the point P. The implementation does a depth-first search starting from P; the search stops if we exit the lifetime 'b . Otherwise, for each point we find, we add it to 'a .

In this sense, the lifetime is an interval (or set of intervals) in the control flow graph. In a different sense though, such intervals are a set points in your program.

{2, 5} only "means" 2-5 in the example because the program flow is 2 -> 3 -> 4 -> 5.


If you want to read about another approach to the borrow checking, you can read about Polonius (part 2, part 3).

2 Likes

yeah, I was confused about the term reachable and how two points are considered connected under CFG (in order to define path or graph). I think that's the reason a DFS is used to find a set of points for a lifetime.

In this case, {2, 5}, in order to reach from point 2 to 5, it needs to pass {3, 4}.

adding to my reading list; Thanks for sharing!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.