How to rewrite this so that mutable borrows don't overlap?

Hi, in trying to write a nom parser that keeps a mutable state, I've come across a problem involving mutable borrows in closures which are executed one after the other, with no overlap. I've reduced my issue to the following simple snippet

// I.e. something like `f().or_else(g)`
fn one_of(mut f: impl FnMut() -> Option<u32>, mut g: impl FnMut() -> Option<u32>) -> Option<u32> {
    if let Some(x) = f() {
        Some(x)
    } else if let Some(y) = g() {
        Some(y)
    } else {
        None
    }
}

fn test() -> Option<u32> {
    let mut n = 0;
    one_of(
        || { n += 1; if n%2==0 { Some(n/2) } else { None }},
        || { n += 1; if n%2==0 { Some(n/2) } else { None }},
    )
}

which rightfully gives an error error[E0499]: cannot borrow n as mutable more than once at a time due to borrowing n in those two closures, which are passed as arguments to one_of. However, looking at the code it's clear that the two mutable borrows of n do not overlap: f() is run, then returns, releasing the borrow, and only after that is g() run.

My question is (1) why does rust not recognise this, but more importantly (2) how could I rewrite this to not run into this error? Let's assume for the sake of argument that one_of is a library function I can't edit :slight_smile:

They do overlap, because the borrows captured in the closures must be valid for the entire call to one_of. There's also nothing in the API of one_of that says f runs before g or vice-versa. In fact, as FnMut implementors, it could run both closures multiple times, interleaved anyway it likes.

More generally, function signatures declare an API contract between the caller and the callee. The callee can only work within the confines of the API (e.g. one_of cannot clone the closures as there is no Clone bound), but they can work within the full confines of the API (e.g. they could call the closures as many times and in any order they wish). And the caller must satisfy the bounds of the API, but cannot impose their own (like "you must call f before g"). This gives function authors the flexibility to change the contents of their function bodies without breaking downstream code [1], and the ability to reason about the validity of functions locally, and the ability to reason about the effects of calling a function local to the call site (without considering the function body) as well.

Even if that wasn't true and Rust did more of a global analysis to allow more programs at the cost of said flexibility, there's no mechanism to statically register a future mutable borrow (a &mut) in your closures -- some sort of borrow that activates only when it's called. You're creating the borrow when you create the closure. So even this doesn't work:

    let mut n = 0;
    let mut lambda_a = || { n += 1; if n%2==0 { Some(n/2) } else { None }};
    let mut lambda_b = || { n += 1; if n%2==0 { Some(n/2) } else { None }};
    lambda_a();
    lambda_b();

You have to instead do this:

    let mut n = 0;
    let mut lambda_a = || { n += 1; if n%2==0 { Some(n/2) } else { None }};
    lambda_a();
    let mut lambda_b = || { n += 1; if n%2==0 { Some(n/2) } else { None }};
    lambda_b();

As the only way lambda_b can validly capture a mutable borrow of n is if there are no other borrows of n still alive (no more uses of lambda_a). Perhaps think of the closures as ad-hoc structs which contain a &mut, created at the definition site.


However, you may be able to get away with some sort of run-time checking (like a RefCell) or other formed of shared mutability (a Mutex or such), though you run the risk of a runtime panic or deadlock depending on what exactly one_of does.

What are your actual types? Not u32 I'm guessing?


  1. i.e. making it no longer compile ↩︎

2 Likes

Generally the way to avoid this issue is to instead pass the mutable reference to the closures as a parameter (FnMut(&mut T) instead of FnMut()), but I'm not familiar enough with nom to know whether that's an option in your case

1 Like

If you're looking for ideas that may be less general or workarounds, here is an old school way of using functions instead of closures, plus a context param. It is less general in that both functions have to use the same context.

    fn one_of<Context>(
        context: &mut Context,
        f: fn(&mut Context) -> Option<u32>,
        g: fn(&mut Context) -> Option<u32>,
    ) -> Option<u32> {
        if let Some(x) = f(context) {
            Some(x)
        } else if let Some(y) = g(context) {
            Some(y)
        } else {
            None
        }
    }

    fn test() -> Option<u32> {
        let mut n = 0;
        one_of(
            &mut n,
            |n| {
                *n += 1;
                if *n % 2 == 0 {
                    Some(*n / 2)
                } else {
                    None
                }
            },
            |n| {
                *n += 1;
                if *n % 2 == 0 {
                    Some(*n / 2)
                } else {
                    None
                }
            },
        )
    }

It's not that great. But it may give you other ideas for your specific case.

1 Like

You get this error, because the analysis isn't smart enough to recognize non-overlapping usage. n is borrowed exclusively for the entire lifetime of its closure, and that's it. There's no fine-grained on-and-off lifetime analysis across functions.

Also Rust analyzes lifetimes across functions based only on function's interfaces, not function bodies. Lifetime errors are always for the worst possible most awful implementation that a function's signature allows in theory. This limitation is by design, because interfaces are supposed to be stable, and changes in implementation are not supposed to break user code.

You can avoid this by using let n = Cell::new(0). &Cell can be mutated and shared. It does have a bit awkward .set/.get syntax, but it compiles to the same exact code as a bare integer.

Wow, explaining like this makes perfect sense. Thanks!

No, the types in this case are Results, not Options, of a non-copy Ok type (tuple of T for the value just parsed and &str for the input still to be parsed) and a non-copy Err.

Re Mutex and RefCell: that is not a very satisfactory option, since we're imposing an unnecessary runtime cost and turning a compilation error into a possible runtime panic.


Actually, after some searching, I found an issue in nom which raises almost exactly the same problem (Thread state through nom combinators · Issue #1419 · Geal/nom · GitHub), even down to the specific use-case (parsing first-order terms where a hashtable from names to numeric identifiers is kept during parsing). The suggestion there is exactly that which @jumpnbrownweasel gave: provide a one_of_with_state function that takes the mut ref to some state and passes it to the closures as necessary. Unfortunately this would mean manually rewriting all combinator functions in nom into alternative X_with_state variants, which is quite cumbersome.

Maybe when I'm more proeficient with rust I'll send them a PR :wink: