Find-grained borrowing: why not?


#1

Hi.

Introduction

TL;DR: skip to “the actual question”.

There is something that I do not understand in the rules about borrowing, especially mutable borrowing.

If t has a v field holding a vector, then this is dangerous:

    let p = &t.v[2];
    t.v.push(42);

… because p may become dangling if v is reallocated. This is dangerous too:

    let p = &t;
    q[i] = t;

… because p points to t but t is no longer relevant.

On the other hand, this, unless I am sorely mistaken, is theoretically safe:

    let p = &t;
    t.v.push(42);

This one is obvious: t does not move, a pointer to it is not a problem.

    let p = &t.v[2];
    q[i] = t;

Here, t moves, but it only contains a pointer to the contents of the vector, the contents itself does not move, so the pointer does not become dangling.

In other words, “moving + direct pointer” and “mutating + deep pointer” are dangerous, but “moving + deep pointer” and “mutating + direct pointer” will never cause problem.

A I wrong yet?

Still, Rust does not allow the two later in “safe” code. Writing p = &t “borrows” t, that means taking a kind of lock on t, but not only on the small span of memory that is t itself but also on all memory pointed by t, recursively, stopping only at special objects implemented using unsafe blocks.

I suppose the reasoning is that a direct pointer on t can be used to retrieve a deep pointer, but I find this quite an annoying limitation.

This is especially a problem with mutable borrows: since mutable borrows can only be obtained from mutable borrow, and since Rust does not (yet?) understand partial borrows across function calls, obtaining one on a small bit of a data structure requires laying a giant lock on the whole data structure.

For example, I may have a small array of some kind of items in a data structure, and create a GUI widget for each item to edit the value of a field. The callback for the check box would need a mutable borrow of the item, which is not possible.

I gather the standard solution for that particular issue would be to use Cell or RefCell. But RefCell is dynamically checked. By using it, we are losing the help of the compiler to detect simple mistakes.

The actual question

Yet, I am three-quarters convinced that static checking of more fine-grained borrowing is possible, at least theoretically.

Basically, when the type inference for the whole program has been done, all variable binding has a concrete type, and it is therefore possible to mark, for each point in the program and each variable binding, what parts exactly of the data structure are borrowed by whom and how.

For example, if p is a reference to t, counting as a shallow borrow, then let q = &p.v[2] yields a shallow borrow of t.v[2].

Still, arrays would be a bit of a problem: the compiler can not be expected to prove that i is not equal to j to allow borrowing both cells. Fortunately, iterator loops can give safety to that; actually, this bit already works.

Recursive types may also be a problem; they are the reason I am only three-quarters convinced it would be possible. But I think it would work: lazily unroll recursive types to establish shallow borrows, and if a loop appears, consider the whole branch borrowed. I think this is workable because in my experience, recursive types are mostly used in low-level implementations of data structures like linked list, where unsafe would be acceptable, and for back-references (a class has an array of pointers to the students, each student has a back-pointer to his or her class), for which a specific API could be designed (it would probably be useful even right now).

So, eventually, my question is: Why does Rust not do that?

I am not prideful enough to believe that if I thought of that in a few days of thinking on it, the Rust developers did not in the years of work they put in it. So there must be a reason. For my intellectual satisfaction, I would really like to know that reason.

A few obvious explanations come to mind, but I can not decide which one is correct.

  1. They are working on it, but it takes time. Yay, keep up the good work, thanks!

  2. Actually, it does not work: trivial examples can be done by hand, but turning it into a practical algorithm is impossible, or almost always result in borrowing almost everything anyway.

  3. It is too expensive, it would take minutes for a normal developer’s computer to check a simple non-trivial program and years with all the NSA’s computing power to check a real program.

  4. It breaks separate compilation: building a whole project with its libraries takes an acceptable total time, but any small change requires redoing the expensive work for the whole project, which makes development very annoying.

  5. Type annotations for shallow and deep borrows become very complex, making it almost impossible to write even a simple function signature. (Closures with lifetimes already are quite verbose as it is, and this was also a problem in the RFC for partial borrows across function calls.)

  6. All of the above.

I would really appreciate if someone could answer that question.


#2

The main reason is that tracking components of borrows requires more type allocations than we already have.

Actually, the case for moving borrowing pointers (let x = &*t.ptr; q[i] = t) is stronger - it is just not implemented yet.


#3

I am sorry, but I do not understand what you mean. Can you give more details? What are “type allocations”?