Blog post series: After NLL -- what's next for borrowing and lifetimes?


#81

I think the tail of this discussion (between @skyschermer and @Pauan) gets at the meat of the issue: you might “know” the code is correct (and it indeed can be correct), but the compiler is unconvinced without seeing the disjointness explicitly. I agree with @skyschermer that proving things correct, in compiler’s language, is not going to come for free nor without some loss in ergonomics. The question is, of course, how to recover some of that cost and loss (and, as I mentioned earlier, this issue is more painful with traits/generic code - I’m somewhat surprised the conversation has stayed mostly about structs).

Also, it’s probably worth reminding ourselves that this issue comes up only when mutable borrows are in play. Obviously mutable borrows are common, but less so than immutable ones; the latter, of course, allow “overborrowing” without any issue. And that also leads to yet another approach (when possible), which is using interior mutability and allowing more aliasing freedom.


#82

For the specific case of interprocedural conflicts in public APIs, I’m very skeptical that new syntax/language features would really be a solution, because none of what’s been suggested so far eliminates the fundamental “speedbump”.

By that I mean: the biggest reason why non-lexical lifetimes, closures capturing disjoint fields and enabling nested method calls are relatively uncontroversial “pure wins” for ergonomics is that–once they’re implemented–the borrow checker errors that motivate them simply never happen again. Normally when you encounter a borrow check error, you have to spend some time and mental energy to understand what it means, figure out what API/architecture implications it has, and either actually redesign your API/architecture or implement a workaround, before you can get back to whatever you were originally doing. In the cases addressed by NLL and friends, that cycle completely evaporates and nobody ever has to think about them again.

But because interprocedural conflicts create a semver hazard, all possible solutions require some kind of explicit opt-in, which means we need to retain the process of getting a compiler error, understanding what workaround or language feature is needed to fix it, and implementing that workaround. To me, that seems like the real cost, and merely making the workaround easier to write doesn’t seem like a problem worth introducing new language features to solve (at least, I haven’t seen any indication in this thread that it’s ever impossible or infeasible to work around in practice).

But I do think there is a potential middle ground between “the code Just Works™ now, no strings attached” and “after suffering the context switch to figure out a borrow checker workaround, this sugar makes implementing the workaround marginally easier”. That middle ground is: we can’t accept this code as-is, but we can tell you exactly what changes will make it acceptable (and with RLS, let IDEs automagically make the change) so you don’t even have to think about it.

So, to be super clear: I do think a language feature/new syntax to address interprocedural conflicts would be worthwhile if and only if adding that feature/syntax would allow us to automagically suggest and apply workarounds that we were unable to suggest and apply without it.

My suspicion is that we should be able to automagically suggest and apply a lot of today’s existing workarounds. For instance, autosuggesting breaking a type into smaller types or using free functions seem doable to me (although I know nothing about rustc diagnostics so maybe this is all nonsense).


#83

While I completely agree with your observation that none of the proposed solutions get rid of that “speedbump” and your reasoning on why they cannot, I don’t think the existing workarounds are practical enough.

But I think your example of automating the workarounds would be an excellent metric of comparing different current and possible future workarounds. I’d argue a workaround that as an automated fix would have to move a method to a free function with extra parameters and adjust all call sites (and possibly their call sites and so on…) is less practical than a workaround that can be applied as a local fix.

In a sense this measures the difference between the programmers mental model of the code and what rust requires us to write. This is exactly where my frustration with the existing workarounds comes from even though I’ve become somewhat fluent in using them. Even with automation this would remain an issue for me, as the problem isn’t so much realizing the workaround in the first place, but continued updating of the code with the workaround in place.


#84

To play devil’s advocate a bit, have you thought about this hinting at your designs not being Rust-friendly? Are you carrying over an implementation approach that you’re overly accustomed to from a different language(s)?

I think in some ways one can draw the analogy to dynamic vs static typed languages; in the latter, you need to go to extra lengths to describe the types, which some fans of dynamic languages don’t like (“I KNOW this object has a field bar, just trust me compiler!”). I suppose having to prove the same thing about borrows/disjointness in Rust is quite similar, if unfamiliar since no other language requires this dimension.

I think it’s worth reflecting on this before we collectively jump to trying to “fix” things in the language.


#85

I have. Over the past 4 years I’ve repeatedly rewritten my SAT solver from scratch trying different approaches of structuring the code and data. Unfamiliarity with the language certainly was an issue in the beginning, but I don’t think that’s my issue now. I haven’t written a (CDCL) SAT solver in a different language before, so I’m not trying to copy an existing design. There just isn’t a clear decomposition into a tree of structures so that all data flow is going outwards and control flow only inwards. (This might be too short of a description, but is the main problem causing me issues with interprocedural conflicts)

A lot of the code is also critical for the performance of the solver, so replacing a certain data structure with a similar more rust friendly data structure isn’t an option if that’s slower. Or at least not if I want to stay competitive with solvers implemented in C++.

My background before having started this project consisted of Haskell, Scala, Ruby, Python and C++ (and some toying with dependently typed languages like Agda and Coq) so I’m used to working with different paradigms.

I have no problem with the fact that I have to prove the absence of data races, in fact that’s one of the main reasons for choosing Rust. I’ve implemented algorithms with many moving parts in C++ before and accidental modification of data that was still referred from a place I’ve overlooked was a common issue. My issue is with not being able to do this concisely in a few places, instead having to pollute most of the tricky parts with code that just passes on references that are not relevant for what the affected code does. Having a way to encode this just as part of the signature instead would solve most of my problems.

Of course I might have overlooked an elegant way to solve this all this time. But even then, given the amount of work I put into finding a solution without finding a satisfying one, I’d argue that rust could and should be improved in this area.


#86

If the references aren’t relevant, why are they being passed? I think it’d be great to have a somewhat reduced example of what you’re referring to - you mentioned graphs earlier in the thread, but I think it’d be good to talk about something more concrete. I think most people use indices to implement graphs anyway, and not references. Have you looked at how crates such as petgraph do graphs?

Passing explicit references in the signature is a way to encode this part, though - if you’re able to get those references to give to the callee, the compiler takes that as proof they don’t alias (for mutable borrows); that’s a form of encoding, no?


#87

That is a good analogy, but it goes the other way too: there are some statically typed languages where the static type system is poorly designed, or very limited, so you need to do weird and/or difficult workarounds just to make the compiler happy. So you spend more time on workarounds than on actually writing useful code.

And then there are really good statically typed languages (e.g. Rust, Haskell, F#, etc.) where you still have to put in some effort, but the effort is much less, and the benefits are much more, because the static type system is well designed and powerful, with fewer limitations.

Of course a solution to disjoint fields will require some changes to the compiler. Of course those changes will require complexity (both in the compiler implementation and also with new language features/syntax).

I don’t think anybody here expects a magical solution that will come at no cost, we all know there will be costs with the solution. The question is whether the benefits outweigh the costs or not. And that question can’t be answered until after the design space has been explored. So let’s start exploring that design space.


#88

They aren’t relevant in the sense that the function doesn’t access them but they are needed for other functions called inside that function. You could say they are an implementation detail of those other functions.

That misses the important parts of my statement, maybe I wasn’t clear enough, but I don’t consider this to be concise as every reference needs to be passed individually. If a view struct is used to bundle them you need to completely destruct and construct a new view struct whenever only a subset of references is passed on. It also affects all call sites and not only the function signatures.

I find it difficult to come up with a good reduced example, as the workarounds are somewhat fine for most small examples, while they turn into a big problem for larger programs I’m working on.

My graph example also uses indices. The references are to the different parts of a graph, i.e. the collection of vertices and the collection of edges and potential further collections of associated weights or other data. Different graph routines require different subsets of read only and different subsets of mutable references to these parts. Petgraph has the exact problem of my example and comes with a workaround, which is a good illustration on why I think this is a problem.

If you want to iterate over a graph structure while mutating weights (or any other associated data, that can be changed without changing the structure of the graph) you cannot use the neighbor iterators you would normally use but are referred to WalkNeighbors helper structure. This is basically an iterator, apart from the fact that you need to pass a reference to the graph for each next call, so it cannot be an iterator and you lose the language and std-lib support for iterators. You also cannot keep a mutable reference to the graph weights across a next call. Unlike iterators it also has to do bounds checks for each next call, as you could change the structure of the graph between two next calls.

Given current limitations of rust, I think this is as good as it gets for a use case like this. And I’m happy to see that petgraph offers this. It’s still one of many paper cuts for implementing algorithms that do a lot of manipulation of graphs and similar structures.

I understand that there are many useful programs that can be written in rust today without encountering this problem at all. If you happen to mostly write those programs it might be hard to see these problems. But different kinds of programs have different demands on the language. I mostly happen to write programs where this is a problem.


#89

I’d say the implementation detail is the fact that the immediate callee doesn’t touch some of these references but rather forwards them to other functions. From the standpoint of the callsite that initiates this call chain, it’s clear what references are needed by the initial callee signature - callee using helper functions doesn’t matter.

Is it “just” a concision issue for you? Some proposals call for the compiler to implicitly do these checks for overlap, at least within private APIs. What if there was some magic button on your keyboard that would autogenerate the callee signature and adjust callsites accordingly? I’m trying to see whether you’re pained by writing this code or reading this type of code or both.

Also, a quick note on the semver hazard that’s been pointed out - we already have a similar hazard with OIBITs (Send, Sync). People work around that by adding static assertions that a type is Send and/or Sync, but the hazard is there. This isn’t to say that should encourage adding more, but just noting that this wouldn’t be the first.

Don’t think you’ll find anyone claiming graphs are easy in Rust :slight_smile:. There’s intrinsic/inherent complexity in them when viewed from Rust’s ownership perspective, particularly if the APIs use exclusive borrows (ie &mut).

To be clear, I’m aware of this issue and have tried helping many people work around it, as well as having run into this myself. As mentioned, I’m playing devil’s advocate to keep us (and myself) honest about the situation.


#90

Fully agree. But, the only universal thing about all languages is that none of them are universally accepted as perfect :slight_smile:. It’s a bunch of tradeoffs. That’s not to say there isn’t room for improvement, but there will always be some things that feel less elegant or have more friction than others. Heck, disjoint borrows isn’t the only such thing in Rust.

I wouldn’t say “of course” because nobody knows what the solution is. As some have mentioned, it may be some form of automated code generation/IDE support, for example. We shouldn’t start out with the premise the compiler and language need to change.

Agreed!


#91

Writing the code isn’t much of an issue, reading it is already more annoying, but changing it is what pains me. Consider the petgraph example again. if I’m iterating over a graph and want to change the code to also update some weights while doing that I have to use a completely different and less ergonomic iteration API. If I’m using my own code such an API might not have been written yet because I didn’t have the need for that before.

I think the automatic adjustment wouldn’t cover cases like that even if there was a magic button. I can see it working for just free functions and passing references directly, but even where that is applicable it doesn’t scale. I have some functions that take view structs with 9 different references and macros to construct them. Passing those references individually is not an option for me.

But I’m claiming they should be easier and I’d like to discuss possible language changes that would make them easier.

I don’t find this very productive, certainly not at this stage. When there is a concrete proposal to change the language it might make sense to see how it compares with the existing workarounds, but without an alternative, to me this seems to be a discussion about whether the problems are real or not.

I’m certainly not starting with this as a premise, but rather as a conclusion of 4 years trying to work with the existing workarounds. Extending a language is always difficult, as language features don’t compose easily so you have to consider interactions with all current and possible future language features. Considering this, extending the language certainly isn’t my first choice for solving a problem, but in this case I exhausted the other choices.


#92

I’m not against other people experimenting with other solutions (like some sort of IDE magic), but I don’t think it will solve the problem.

Part of the problem is indeed generating the boilerplate for the free functions + borrows, but the bigger issue is maintaining all of this boilerplate when changes are needed. An IDE will be a lot less help with that.


I’m not sure if this is the right place to discuss this (maybe I should create a new thread), but here is my idea for one solution to this problem:

Similar to Fn, FnMut, and FnOnce, a new magical Struct trait is added to the language:

impl Foo {
    fn foo<A>(this: &mut A) where A: Struct { field1: u32, field2: bool } {
        this.field1 = ...;
        this.field2 = ...;
    }
}

With impl Trait, this can be made shorter:

impl Foo {
    fn foo(this: &mut impl Struct { field1: u32, field2: bool }) {
        this.field1 = ...;
        this.field2 = ...;
    }
}

The semantics are that the Struct trait is automatically implemented for all struct types (which have the same subset of fields and types), similar to how the Fn/FnMut/FnOnce traits are automatically implemented for closures. These implementations might not need to actually exist (the Rust compiler can special-case it), but at least conceptually they exist.

When calling the above function as Foo::foo(&mut bar), it verifies that bar implements Struct { field1: u32, field2: bool } (in other words, that it has the correct fields and types), and then it passes the entire &mut bar into foo. This is how generics work in general.

Even though it is passing the entire &mut bar into foo, the Rust compiler has special knowledge of the Struct trait, so it knows that foo can only access the field1 and field2 fields, so it can take that into account when doing borrow checking (this allows for disjoint field borrows).

This is basically a very limited form of structural typing (similar to how closures are structurally typed).

There are some significant advantages of this approach:

  • It reuses the existing trait infrastructure. That not only makes it much easier to implement and understand, but it also allows for many cool things, such as:

    • It can be used everywhere: as bounds to structs, enums, functions, methods, etc.

      That means it’s possible to create structs which wrap other structurally typed structs:

      struct Bar<A> where A: Struct { field1: u32, field2: bool } {
         ...
      }
      
    • It interacts well with other traits (e.g. you can have a Struct { ... } + Clone bound).

    • A function/method can return impl Struct { ... } (allowing for field narrowing)

    • Trait objects like Box<dyn Struct { foo: u32 }> allow for dynamically dispatched structural records.

  • Each function/method clearly marks what fields it is accessing, making everything easy to understand (and improving the compiler errors).

    This sort of explicitness is annoying, but it is limited to the type signature (unlike free functions which require boilerplate at each call site!)

  • It’s possible (with some compiler changes) to put commonly-accessed fields into a type alias:

    type MyView = Struct {
        field1: u32,
        field2: bool,
    };
    

    Now you can simply use impl MyView.

  • Because of the way that generics work, foo will not have any runtime cost, because it has been specialized for a specific struct. Which means Struct is zero-cost.

  • It makes it clear that adding new Struct fields to a public API is a breaking change, which is important for semver concerns.

    Obviously this doesn’t apply if you’re using Struct purely internally (in which case you can add new fields freely).


#93

Oh! One thing I forgot to mention: whether a struct implements Struct or not depends on the visibility of its fields.

In other words, if there is a Struct { foo: u32 }, you can only pass a struct where you have access to the foo field.

So you cannot use Struct to access private fields (unless you have access to those private fields).


#94

This seems somewhat similar to the “virtual fields in traits” approach that, IIRC, @nikomatsakis mentioned.

So suppose you have a structural trait bound, and a type matches. How do you actually associate access to those fields to the disjointness/structural property? Clearly we’re not going to allow direct field access, and the struct may have a getter/setter API that enforces some invariant(s), and we don’t want to bypass that.


#95

It’s not really similar at all (except for the fact that they both involve traits in some way).

You would access the fields as this.field1, this.field2, etc. just like a normal struct. Yes it would be direct access (exactly the same as if you passed in a concrete struct).

This is safe because it is using generics, so the function will be specialized for each struct type which is passed in.

Which means the codegen will be exactly the same (no performance cost, no indirection, no temporary “view structs” created, etc.)

In my second post I clarified that you can only pass in structs where you have access to the fields (you cannot use Struct to bypass the visibility system).

So if a struct wants to maintain invariants, it just makes the fields private (which is what it has to do now anyways).


#96

This seems like a lot of machinery for useful, but very limited gain because of direct field access.

But a lot of times you want both: expose something, but maintain invariants.

To me, I’d prefer the borrow regions approach if it was between these two because it allows one to talk about disjointness at a more abstract level, without requiring direct field access; it’s costly in that it also necessitates language changes and is yet more syntax/concepts to learn, but it’s strictly more flexible.


#97

Could you clarify that? The whole problem is the inability to access disjoint fields, my proposal allows you to access disjoint fields. It sounds like you have something bigger in mind.

Could you explain how regions differs from my proposal? What does it enable that cannot be done with my proposal?


#98

This thread has focused a lot on structs, but as I’ve mentioned a few times, the same problem is present with traits: if you have generic code working with some trait, you can run into the exact same problems as with structs (e.g. trait method returns a reference, and you want to hold that reference while calling another trait method that needs &mut self, say).

So the disjointness issue is more general than just structs.

I don’t recall all the details offhand, but in general, you can declare the notion of regions - think something like generic lifetime parameters or maybe associated type. Then, you can annotate methods with the regions they work with. For instance, a method’s return type indicates what region(s) is borrowed. Calling another method that’s declared to not use those regions internally is possible. The compiler then ensures that, if you’re using borrow regions in your APIs and impls, that you don’t overlap/cause a conflict. @newpavlov may be able to talk about that proposal in better and more detailed terms.


#99

I see, but it’s possible to use my proposal to fix that (at the cost of some verbosity and exposure of internal implementation):

impl Trait for Foo {
    fn foo(self: &impl Struct { field1: u32 }) -> &u32 {
        &self.field1
    }

    fn bar(self: &mut impl Struct { field2: bool }) {
        ...
    }
}

I think it will be hard to create a simpler system than this, since fundamentally all disjoint borrows are struct borrows (either internally or in the public API). My proposal at least allows you to expose the internal struct borrows.

I don’t see how that differs fundamentally from my proposal. You can of course annotate which fields a function/method needs:

fn foo(this: &mut impl Struct { field1: u32 }) {
    ...
}

And you can even return a Struct with narrowed fields:

fn foo(this: impl Struct { field1: u32, field2: bool }) -> impl Struct { field1: u32 } {
    this
}

Or provide a method which returns a view on a struct:

fn view(&self) -> impl Struct { field1: &u32, field2: &bool } {
    self
}

You get all of this for free, since it reuses the trait infrastructure.

To be clear: you’re not specifying all of the fields, you’re specifying only the fields your method needs, so that way it can be disjoint with other methods which need different fields.

The Rust compiler has knowledge about the Struct trait, so it knows that if one method takes Struct { field1: u32 } and another method takes Struct { field2: bool }, then they are disjoint and so it allows the borrow.


#100

Oh, I think I see the problem now. Since the visibility is based upon when the struct is passed into the function/method, that means this feature is only useful for internal private APIs.

For public APIs you would be forced to mark the fields as pub, which as you say doesn’t allow for enforcing invariants.

You could still workaround this, by doing something like this:

impl Foo {
    fn as_view(&self) -> impl Struct { field1: &u32, field2: &bool } {
        self
    }
}

Basically, you would provide a method which takes &self and returns a Struct which contains all of the fields for Self. This works because the as_view method always has access to the private fields.

Now it’s possible to expose private disjoint fields without giving mutable access to them (mutation would still be done with methods, since they need to maintain invariants).