Blog post series: After NLL -- what's next for borrowing and lifetimes?


#21

How about something like this:
The implementor of T (where T is either a type or a type template that has fields) mentally separates the fields of T into N disjoint sets and gives them distinct names (namespaced under T). The implementor can then use those names in free or associated function signatures to tell the outside world which kind of access to those named sets of fields any given function parameter (reference) requires. The notation for this function parameter “reference slicing” could be as follows:

&-a arg: T => arg uses shared references to the fields in set “a”
&mut-a arg: T => arg uses mutable references to the fields in set “a”
&-a-b arg: T => arg uses shared references to the fields in sets “a” and “b”
&-a mut-b arg: T => arg uses shared references to the fields in set “a” and mutable references to the fields in set “b”

In the following example the fields of MyStruct are sliced into 3 disjoint sets, named a, b, and c. The set assignment is shown in the comments of MyStruct declaration. The comments starting with “Requires:” specify what kind of requirements each use of the self parameter imposes on the signature of the function where it is being used.

struct MyStruct {
    counter: usize,         // a
    listener: Sender<()>,   // b
    widgets: Vec<MyWidget>, // c
    attempt: usize,         // c
}

// In each function signature, the "reference slicing" of the `self`
// parameter is a union of all the "requirements" in the function body
impl MyStruct {
    fn increment_counter(&mut-a self) {
        self.counter += 1; // Requires: &mut-a
    }

    fn signal_event(&-c mut-a-b self) {
        self.increment_counter(); // Requires: &mut-a

        self.listener
            .send(()) // Requires: &mut-b assuming fn send(&mut self, ..)
            .expect(&format!(
                "msg: {} {}",
                self.widgets.len(), // Requires: &-c
                self.attempt,       // Requires: &-c
            ));
    }

    fn check_widgets(&-c mut-a-b self) {
        for widget in &self.widgets { // Requires: &-c
            if widget.check() {       // Requires: &-c assuming fn check(&self)
                self.signal_event();  // Requires: &-c, &mut-a, and &mut-b
            }
        }
    }
}
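
For comparison, here is a minimal sketch of how roughly the same example can be made to compile in today's Rust, without any of the proposed syntax. It relies on the borrow checker already tracking disjoint field borrows inside a single function body, so the helper takes the individual fields instead of `self`. The `MyWidget` type, its `ok` field, and the free function `signal_event` are assumptions made for illustration, and `send` is used with its real `&self` signature from `std::sync::mpsc`.

```rust
use std::sync::mpsc::{channel, Sender};

// Hypothetical widget type; the post above does not define it.
struct MyWidget {
    ok: bool,
}

impl MyWidget {
    fn check(&self) -> bool {
        self.ok
    }
}

struct MyStruct {
    counter: usize,         // set "a"
    listener: Sender<()>,   // set "b"
    widgets: Vec<MyWidget>, // set "c"
    attempt: usize,         // set "c"
}

// Free function that borrows only the fields it needs (sets "a" and "b").
fn signal_event(counter: &mut usize, listener: &Sender<()>) {
    *counter += 1;
    listener.send(()).expect("receiver dropped");
}

impl MyStruct {
    fn check_widgets(&mut self) {
        // `self.widgets` is borrowed shared while `self.counter` is borrowed
        // mutably; these are disjoint fields, so the borrow checker accepts it.
        for widget in &self.widgets {
            if widget.check() {
                signal_event(&mut self.counter, &self.listener);
            }
        }
    }
}

fn main() {
    let (tx, rx) = channel();
    let mut s = MyStruct {
        counter: 0,
        listener: tx,
        widgets: vec![MyWidget { ok: true }, MyWidget { ok: false }, MyWidget { ok: true }],
        attempt: 0,
    };
    s.check_widgets();
    let events = rx.try_iter().count();
    println!("counter = {}, events = {}, attempt = {}", s.counter, events, s.attempt);
}
```

The cost is the one the proposal is trying to avoid: the field list leaks into every helper's parameter list.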

#22

I wonder how common it is to want a particular set for more than one fn.
I frequently find that my fns want to use different, overlapping sets of variables from self. One might use foo and bar, a second bar and baz, a third foo and baz, etc. If those use-cases are common, then it’s unclear that having labeled sets beyond being able to “use” individual fields helps much.


#23

But even if the functions’ access to fields is so overlapping that you end up having to have one set (of size 1) for each field, you still need those named sets so that renaming or adding fields later on isn’t a breaking change.


#24

Adding an observation to my earlier suggestion: it’s not only the implementor of a type that would benefit from having properly sliced references, but also other users of such type. For example, the following shows how a user of a library would seemingly be able to have two mutable references to a variable at the same time. In reality, of course, those mutable references are sliced in such a way that they are disjoint, which is guaranteed by those function signatures:

// In a library crate `foo`, for some struct `Foo`:
impl Foo {
    pub fn new() -> Foo {...}
    pub fn mutate_stuff<F: FnMut(u64) -> u64>(&mut-a self, mut f: F) {...}
    pub fn mutate_other(&mut-b self, x: u64) -> u64 {...}
}
// In your own crate:
let mut my_foo = foo::Foo::new();

my_foo.mutate_stuff(|a| {
    my_foo.mutate_other(a)
});
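
As a point of reference, something close to this can be written today by splitting the struct into sub-structs, one per field set. The sketch below is only an approximation under assumed names (`PartA`, `PartB`, and their fields are hypothetical, and the method bodies are made up since the originals are elided). The callback bound is `FnMut` because the closure mutates the state it captures.

```rust
// Hypothetical split of `Foo` into two sub-structs, one per field set.
#[derive(Default)]
struct PartA {
    total: u64,
}

#[derive(Default)]
struct PartB {
    calls: u64,
}

#[derive(Default)]
struct Foo {
    a: PartA,
    b: PartB,
}

impl PartA {
    // `FnMut` because the callback may mutate whatever it captures.
    fn mutate_stuff<F: FnMut(u64) -> u64>(&mut self, mut f: F) {
        let result = f(self.total + 1);
        self.total += result;
    }
}

impl PartB {
    fn mutate_other(&mut self, x: u64) -> u64 {
        self.calls += 1;
        x * 2
    }
}

fn main() {
    let mut my_foo = Foo::default();
    // Destructure once so that each half is mutably borrowed on its own.
    let Foo { a, b } = &mut my_foo;
    a.mutate_stuff(|x| b.mutate_other(x));
    println!("total = {}, calls = {}", my_foo.a.total, my_foo.b.calls);
}
```

The two mutable borrows coexist only because the split is visible in the type's layout, which is exactly the architectural cost that sliced references would remove.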

#25

So, is anyone willing to make a library solution for implementing views (as an experiment)? If not, I will start working on one next week.


#26

I feel like the heavier-weight the syntax is, the more it’s going to be seen as an “advanced” feature, and so fewer people are going to use it by default. If we have explicit views, like:

view Foo for Bar {
    // ...
}

impl Foo {
    // ...
}

then most libraries will just borrow the whole self because that’s by far the easiest solution. My personal choice is the "annotated self" proposal, like so:

fn foo(&mut self: Self { &a, &mut b })

or

fn foo(&Self { ref a, ref b }: &Self)

plus a clippy lint against using the entire struct when some fields aren’t used, which can be silenced by typing:

fn foo(&mut self: &mut Self)

or something like that.

I personally think that this provides the most explicitness while also gently pushing people towards using views when possible, since it is a strictly more ergonomic solution for downstream crates that should only be avoided if you want to give yourself leeway to use other fields in the future. It would be a shame if the majority of crates avoid using this feature because it’s too complex to learn or too much typing.
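
For readers who have not seen the explicit-view pattern, here is a rough sketch of what it looks like as plain library code in today's Rust. All names (`Foo`, `AbView`, `split`, `sum_into_b`) are hypothetical; the point is only that the view is a separate type that has to be written and maintained by hand.

```rust
struct Foo {
    a: Vec<u32>,
    b: u32,
    c: String,
}

// Hypothetical view over fields `a` and `b` only; `c` stays free to borrow.
struct AbView<'v> {
    a: &'v [u32],
    b: &'v mut u32,
}

impl Foo {
    // The borrow is split here, inside a single function body, where the
    // borrow checker can see that the fields are disjoint.
    fn split(&mut self) -> (AbView<'_>, &mut String) {
        (AbView { a: &self.a, b: &mut self.b }, &mut self.c)
    }
}

impl AbView<'_> {
    // Reads `a` and writes `b`, never touching `c`.
    fn sum_into_b(&mut self) {
        *self.b = self.a.iter().sum();
    }
}

fn main() {
    let mut foo = Foo { a: vec![1, 2, 3], b: 0, c: String::from("log:") };
    let (mut view, c) = foo.split();
    c.push_str(" summing");
    view.sum_into_b();
    println!("b = {}, c = {}", foo.b, foo.c);
}
```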


#27

I have hit this problem in the past, but not very often, and I’ve used one of the workarounds you have shown.

The problem discussed here is a subset of a more general problem (one that I think Rust will eventually need to face, just as Ada has had to): specifying the data that a function is allowed to read, write, or both read and write, besides its formal arguments. This exact and complete specification of the data flow of an impure function matters when you want to understand the code better, and it is necessary if you want to write a formal proof of correctness of a function (as is done in languages such as SPARK and Why3).

You see examples of this more general problem in C++ programs that prefix global variable names with “g_”, so if you see a variable named g_foo in a function, you know that the function is reading, writing, or both reading and writing a global variable. In a language that allows inner functions, such as D, a function can use a variable defined in any outer function and not just globals (in Rust you need closures for this). This makes the code harder to understand and harder to formally prove correct.

So the SPARK subset of Ada introduced annotations that let you specify exactly which variables from outer scopes a function uses, and whether it reads, writes, or both reads and writes each one of them:

http://docs.adacore.com/spark2014-docs/html/lrm/subprograms.html

Rust needs those annotations less often because (I think) global mutable variables and global static values are less common in Rust.

So specifying which self fields a method uses is just a special case of that kind of annotation (I think Rust will need such annotations once it wants to allow static verification of code):

#[inout(self.counter, self.listener)]
fn signal_event(&mut self) {
    self.counter += 1;
    self.listener.send(()).unwrap();
}

The variables or self fields go inside #[in()], #[out()], and #[inout()] annotations. Other examples with a static mutable variable:

#![allow(dead_code)]
static mut FOO: u32 = 5;

#[in(FOO)]
fn bar1() {
    unsafe {
        println!("{}", FOO);
    }
}

#[out(FOO)]
fn bar2() {
    unsafe {
        FOO = 2;
    }
}

#[inout(FOO)]
fn bar3() {
    unsafe {
        FOO += 1;
    }
}

fn main() {}
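
Without such annotations, the closest thing available in today's Rust is to pass the state explicitly, so that the reference type in the signature documents the data flow. The sketch below mirrors `bar1`, `bar2`, and `bar3` under that assumption; note that Rust has no write-only reference, so the `out` case can only be approximated with `&mut`.

```rust
// Read-only access: roughly `in`.
fn bar1(foo: &u32) -> String {
    format!("{}", foo)
}

// Overwrites the value without reading it: roughly `out`
// (there is no write-only reference, so `&mut` is used).
fn bar2(foo: &mut u32) {
    *foo = 2;
}

// Reads and then writes: roughly `inout`.
fn bar3(foo: &mut u32) {
    *foo += 1;
}

fn main() {
    let mut foo: u32 = 5;
    println!("{}", bar1(&foo));
    bar2(&mut foo);
    bar3(&mut foo);
    println!("{}", foo);
}
```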

#28

I don’t think that’s sufficient because of inner mutability. An in could mean a write, under that circumstance. I think this is only useful to give extra information to borrowck and is not useful for enforcing program correctness.
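
A small sketch of the concern, using `std::cell::Cell` (the `Stats` type and `record_hit` function are made up for illustration): the function below takes only a shared reference, so an `in`-style annotation would classify it as a read, yet the value observably changes.

```rust
use std::cell::Cell;

// Hypothetical type with an interior-mutable field.
struct Stats {
    hits: Cell<u32>,
}

// Only a shared reference is required, but the call still writes.
fn record_hit(stats: &Stats) {
    stats.hits.set(stats.hits.get() + 1);
}

fn main() {
    let stats = Stats { hits: Cell::new(0) };
    record_hit(&stats);
    record_hit(&stats);
    println!("hits = {}", stats.hits.get());
}
```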


#29

Is a fourth annotation like #[in_mut()] enough to solve that?


#30

Would you consider Rc::clone to be mutation? What about RefCell::borrow? They both use inner mutability but expose an immutable interface.
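
To make the question concrete, here is a short sketch showing that both calls change observable state through a shared reference. The two helper functions are hypothetical names; the `Rc` and `RefCell` calls are the standard library ones.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns the strong count before and after cloning through `&Rc`.
fn clone_changes_count(shared: &Rc<u32>) -> (usize, usize) {
    let before = Rc::strong_count(shared);
    let _extra = Rc::clone(shared);
    let after = Rc::strong_count(shared);
    (before, after)
}

// Returns whether a mutable borrow is refused while a shared borrow is held.
fn borrow_blocks_mut(cell: &RefCell<u32>) -> bool {
    let _guard = cell.borrow();
    cell.try_borrow_mut().is_err()
}

fn main() {
    let shared = Rc::new(7);
    println!("{:?}", clone_changes_count(&shared));
    let cell = RefCell::new(7);
    println!("{}", borrow_blocks_mut(&cell));
}
```

Neither call changes the wrapped value, which is why it is hard to say whether an annotation should count them as writes.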


#31

But how is that going to work with traits, where you don’t have any fields? What about when different implementing types use different underlying structures?


#32

That’s an excellent point actually, this interacts poorly with traits. You’d probably want to be able to implement a trait for a view. Hm.


#33

I’m pretty new to Rust (I used it for all of last year’s Advent of Code but that’s about it) and in my first attempt to use Rust for a real project, I ran into this exact issue immediately. After diagnosing the problem (which as you note is not so easy for a newcomer) my first inclination was to move to the “free variables” solution, but honestly it was a big enough speed bump that I put the project aside a couple of weeks ago and haven’t come back to it yet. So that’s a big vote in favor of addressing it in the language.


#34

We simply (?) make it so that a reference with restricted access privileges to T is a subtype of a normal reference to T, and we allow traits to be implemented by using these subtypes. A trait says that in this function, via this parameter, you can have access to the whole of type T, but the implementor of the trait says no thanks, because I can implement my stuff by using just a part of T, so he declares it so in the signature of his implementation of said function.

EDIT: I made a mistake. The subtyping relation actually goes in the other direction: it’s the reference to T that is a subtype of this theorized “restricted reference” to T. But I was correct in my intuitive explanation for why it would be fine for a type to implement a trait by using a restricted reference in place of a regular reference in the trait’s function signature.


#35

This doesn’t help generic code because it won’t know what relaxations a particular impl will allow. The borrow regions approach that @newpavlov linked above is more to the point, but it does introduce yet another dimension to trait design (i.e. designated borrow regions, almost like a form of associated types).
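
One way a trait can approximate designated regions today is to expose a single method that hands out the disjoint parts together, so generic code never depends on a particular implementation's fields. This is only a sketch of that workaround, not of the borrow regions proposal itself; the `Parts` trait, the `Widgets` type, and `count_even` are hypothetical.

```rust
// Hypothetical trait: generic code sees two disjoint "regions" at once.
trait Parts {
    fn parts(&mut self) -> (&mut usize, &[u32]);
}

struct Widgets {
    counter: usize,
    items: Vec<u32>,
}

impl Parts for Widgets {
    fn parts(&mut self) -> (&mut usize, &[u32]) {
        // Splitting is allowed here because the fields are visible.
        (&mut self.counter, &self.items)
    }
}

// Generic code can mutate one region while iterating over the other.
fn count_even<T: Parts>(value: &mut T) {
    let (counter, items) = value.parts();
    for item in items {
        if item % 2 == 0 {
            *counter += 1;
        }
    }
}

fn main() {
    let mut w = Widgets { counter: 0, items: vec![1, 2, 3, 4] };
    count_even(&mut w);
    println!("even items: {}", w.counter);
}
```

The drawback is the one raised above: the trait author has to decide the split up front, and changing it later is a breaking change.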


#36

I can see what you mean. And I can imagine it might be quite difficult for someone writing a trait to know exactly what the regions (or “slices” as I called them earlier) should be for each reference parameter in the trait’s functions. And to make matters worse, he can’t turn a regular reference into one with a region after having published his library without potentially breaking code that uses the trait. Maybe it’s not right to put the burden of this on the one who writes the trait. Perhaps it might be possible for generic code to have some way of specifying that its type parameter should, besides being bounded by the trait, be bounded by some, whatever, extra regions. I can also imagine that the syntax for specifying these extra “region bounds” might be horrible.


#37

I’m a beginner with Rust; I’ve written approximately 8000 lines of code so far. I hit this problem immediately, and it only got worse as I structured my code. I cited this problem as #1 in a list of Rust annoyances I posted on HN (https://news.ycombinator.com/item?id=17673711).

I have actually experimented with all three workarounds (views, splitting structs, and free functions); a few times they do work in a way that’s still clean in the overall design, but most of the time they are compromises that I need to live with.

I would be glad if Rust solved this problem in a way that doesn’t affect the architecture of the program (that is, the number of different concepts that a programmer reading the code needs to grasp to understand it). To me, views are not a good general solution to this problem, because they still force you to introduce a new “object” (a view) in addition to the original structure, just to make the borrow checker happy. That new “object” will have to be documented, maintained, and learnt by users of the code; it adds to the cruft and doesn’t simplify the code.

To clarify, I’m not saying that views are always wrong; I’m saying that using views to work around interprocedural conflicts is not a good solution. There are cases where views are useful per se; they are a basic pattern in programming. But using them for interprocedural conflicts still feels like a workaround in most cases.

A good solution would be something that doesn’t affect the architecture of the code. For example, I’m OK with typing more to explain things to the borrow checker.


#38

This comes up immediately for every new Rust programmer coming from an OO language.

Over time I’ve learned to stay far away from the god class design pattern, and I only sometimes have to write more “static” methods than I’d like, but it’s still an annoyance.

Any solution that requires any extra syntax (&mut self: Self {foo, bar} is not bad) wouldn’t help novice users that much: they’d still hit the problem, have to seek a solution, and remember even more syntax.

I really like the idea of automatically doing partial borrows across private (crate-private) methods. That may make the problem magically go away in many situations by doing a sensible thing exactly like users expect, which is a great win. And it’d reinforce the idea that the restriction is to make public API isolated from the implementation.


#39

It’s a bit too much magic for my liking. It breaks the nice property of Rust functions/methods that allows you to deduce how they will play with others just by looking at a signature. With your proposal I would have to look through the methods’ code to understand whether a small change will break everything.


#40

Typically people have one impl block per type, so chances are you’d have the relevant source code on your screen already.

But the idea is that you wouldn’t look, you’d just make the change, and it’d be less likely to fail than before.