Reusing mutable closure arguments


#1

I want an API that handles closures in a consistent manner. The closures need to be mutable, and all uses need to call the same closure (so we cannot clone or copy, or the mutation will not update the correct closure).

As such one of the first problems is re-using the same closure in a call, for example:

    pub fn and(a : bool, b : bool) -> bool {
        a & b
    }
    pub fn test1<R, A>(mut r : R, a : A, b : A) -> A
    where R : FnMut(A, A) -> A, A : Clone {
        r(test1(r, test1(r, a.clone(), b.clone()), a), b)
    }
    println!("{}", test1(and, true, true));

This doesn’t work because although ‘R’ called by mutable reference (FnMut trait), it is still moved as a function argument. So I need to pass by mutable reference, and also introduce some intermediate variables as there can only be one borrow of the function at a time. I also introduced some state to the passed function to check that works.

    pub fn and(a : bool, b : bool) -> bool {
        a & b
    }
    pub fn test1<R, A>(r : &mut R, a : A, b : A) -> A
    where R : FnMut(A, A) -> A, A : Clone {
        let y = test1(r, a.clone(), b.clone());
        let x = test1(r, y, a);
        r(x, b)
    }
    let mut s = true;
    println!("{}", test1(&mut (|a, b| {s = s & a; and(a, b)}), true, true));

However I still have a problem if I want to pass the same closure as two arguments of a function:

    pub fn and(a : bool, b : bool) -> bool {
        a & b
    }
    pub fn test1a<R, A>(r : &mut R, s : &mut R, a : A, b : A) -> A
    where R : FnMut(A, A) -> A, A : Clone {
        let x = s(a.clone(), b.clone());
        let y = s(b, a);
        r(x, y)
    }
    let mut s = true;
    let mut c = |a, b| {s = s & a; and(a, b)};
    println!("{}", test1a(&mut c, &mut c, true, true));

This seems safe to me, as there is no concurrency? Surely the restriction on multiple mutable borrows of the same variable is overly restrictive, and the type system only needs to ensure that a mutable borrow is not held by more than one thread?


Unboxing boxed closures
#2

Manish Goregaokar wrote a good blog post on why shared mutability could cause problems even in single threaded applications.


#3

I guess the issue is that I like Rust so far, as someone who programs both in C++ and Haskell I like a lot of the design choices. I like move semantics by default and making copies (or clones) explicit. I like the type system, but maybe miss side-effect control and effects (PureScript does a good job of creating an eager Haskell like language with effects). I definitely think static guarantees are the way to go.

In the particular example I gave, all accesses to the mutable closure data are through a single function, the calling of which must be serialised, so there is no way it can go wrong in a single thread. Rust really needs an equivalent of the Sync trait for single threaded multiple mutable references.


#4

I suggest you read the blog post I linked to and think a little harder about how it could go wrong in a single thread.

Remember, borrow checker doesn’t know anything about how the function in question is going to call your closures, nor what they are going to do. All it knows is that they both contain mutable references to the same environment. Using one of those mutable references could invalidate some state, including making a reference invalid, that is depended upon in the environment of the other call.

In Rust, the way to allow more than one reference that can alive at the same time to the same item, but rather than being statically checked will be dynamically checked that you don’t actually use it in any invalid way, is to use RefCell.


#5

Yes, I understand how it could go wrong, if the code in the closure were different. What I mean is Rust’s type system is not good enough to distinguish the safe cases from the unsafe cases, so it excludes some valid programs. It is just a shame that it happens to be some of the programs that I would like to write.

Nothing in that example requires dynamic checking, it could be proved statically safe with a better type system. Putting every closure in a RefCell and having dynamic checking would cause an unacceptable performance loss, especially for the majority of cases where the two closures passed are not the same.


#6

Yeah, this is true for every static type system :slight_smile: Good static type systems forbid less valid programs, but they still forbid some of them. There are plans for some improvements of the Rust borrow manager (Borrowck), but some valid programs will probably be still forbidden.

In such cases you can try to wrap your valid code in unsafe{} and do what you know is right. Or you can help the introduction of better and better type systems, dependent typing, proofs, and so on. But in my opinion the complexity of the type system grows faster than the gains.

In some cases I’d like Rust to allow me to write down a small proof that some index can’t be out of bounds, an integer cast is safe, a borrowing is OK, and so on :slight_smile:

If Rust keeps living for 15+ more years then probably some people will build some verifiers (the work has already started, but we’re still rather far from a dependently typed Rust, or something like a Liquid Rust :slight_smile: ).


#7

Could it be proved statically safe with a better type system? I think that’s a fairly big claim; it seems that it would require knowledge of both what the two closures are going to do with their mutable references, and what the function calling them is going to do with the closures. In which case, that’s using information beyond the type system about the behavior of the functions. But if there is an example of a type system that could represent this, I’d be interested in hearing about it.

Secondly, I’m a little bit unclear on whether you are just trying to provide a criticism of Rust, or if you are trying to solve a particular problem. Your introduction of the problem is a little abstract; could you expand on what you are actually trying to do?


#8

Sure, I am trying to implement the examples from chapter 6 of Stepanov’s “Elements of Programming”. I am looking for a better language for generic programming than C++. Rust’s use of traits (resembling Haskell type-classes) seems the right direction to me, and the eager evaluations and support for imperativeness seems good to me too. Basically it seems to be heading in the right direction taking the bits of C++ and Haskell that I would want in a generic programming language. However I am interested in defining canonical algorithms, that is generic definitions that are re-usable in as many situations a possible. This particular code is a reduction of the actual problem to remove the irrelevant bits, it comes from testing collections to see if they are relation-preserving.

As to whether it could be proved statically safe, the proof would lie in the serialisation of access to the mutable variable. The mutable variable is only changed by the closure function, and the closure function mutably borrows the access to its ‘closure struct’. If we can show that only one function at a time borrows this mutable reference then it is okay. In this case the proof would be that the borrows of the two arguments do not overlap in the function definition then it is safe to pass both references to the same data. Basically we can imagine we rewrite the function with a single argument and if that passes the borrow checker, then the two argument version should too.

Here is the function rewritten with a single argument:

    pub fn test1a<R, A>(r : &mut R, a : A, b : A) -> A
    where R : FnMut(A, A) -> A, A : Clone {
        let x = r(a.clone(), b.clone());
        let y = r(b, a);
        r(x, y)
    }
    println!("{}", test1a(&mut c, true, true));

So thats the proof :slight_smile:

Edit: Of course that proof is not using the type system, I just wanted to show a proof was possible. I don’t think it is possible to prove in Rusts type system because the borrow checker is already something outside the type system. My intuition is that it is possible to prove this in the type system, but maybe the types will become too complex to be useful.


#9

That proof does not work.

If you have a single argument, the borrow checker checking the outer function knows that the borrow checker checking the inner function will only have a single reference to the mutable closure, and can ensure that mutable reference is only used once at a time in a way that obeys all of the borrowing rules.

If you have two references to closures in the function signature, then the inner function does not know that they are the same, and so will allow you to use them in ways that it would not allow you to use a single mutable reference. Because the inner function could do this, the outer function can’t allow you to pass in those two mutable references. All of the information on what the functions are allowed to assume about each other is contained in their types, and allowing you to pass in two copies of a mutable reference would mean you could violate that contract.

To get a little more specific, here is the iterator invalidation example from the blog post referenced earlier, translated to passing &mut references to closures in an entirely single threaded example. Of course, this example is entirely contrived, made slightly more complicated by the need for a wrapper type to make it possible to define a closure that could take an argument allowing the closure to be called recursively, and there are actually two places in this example where you have errors due to not allowing multiple simultaneous mutable borrows. But it demonstrates that if you have two mutable closures, it is indeed possible for them to both be active simultaneously even in a single threaded case:

fn main() {
    struct Callable<'a>(&'a mut FnMut(bool, Callable));
    let mut v = vec![1, 2, 3];
    let mut c = |b, mut inner: Callable| {
        if b {
            v.push(1)
        } else {
            for i in &v { 
                inner.0(true, Callable(&mut inner.0))
            }
        }
    };
    fn test<R1, R2>(mut r: &mut R1, mut s: &mut R2)
        where R1: for <'a> FnMut(bool, Callable<'a>),
              R2: for <'a> FnMut(bool, Callable<'a>),
    {
        r(true, Callable(&mut s));
        s(false, Callable(&mut r));
    }
    test(&mut c, &mut c)
}

You can, however, use the same test function successfully with separate closures passed in:

fn main() {
    struct Callable<'a>(&'a mut FnMut(bool, Callable));
    let mut c3 = |b, mut _inner: Callable| println!("c3: {}", b);
    let mut c4 = |b, mut _inner: Callable| println!("c4: {}", b);
    let mut c1 = |b, mut inner: Callable| {
        if b {
            println!("done c1")
        } else {
            inner.0(false, Callable(&mut c3))
        }
    };
    let mut c2 = |b, mut inner: Callable| {
        if b { 
            println!("done c2")
        } else {
            inner.0(false, Callable(&mut c4))
        }
    };
    fn test<R1, R2>(mut r: &mut R1, mut s: &mut R2)
        where R1: for <'a> FnMut(bool, Callable<'a>),
              R2: for <'a> FnMut(bool, Callable<'a>),
    {
        r(true, Callable(&mut s));
        s(false, Callable(&mut r));
    }
    test(&mut c1, &mut c2)
}

You still haven’t quite explained what it is that you’re trying to do. Why is it that you feel like you need to pass in two copies of the same closure to the function, and the closure needs to be mutable? I don’t have “Elements of Programming” in front of me, so I can’t tell what examples you are trying to translate.


#10

I think you misunderstand the proof, which is: if the borrow checker can prove it safe with a single argument (IE we substitute all uses of ‘s’ in the function for ‘r’ and unify the arguments), then it is safe as a two argument function where both arguments are the same. If you rewrite your two arguments examples with a single argument, do they pass the borrow checker?:

    fn test<R1>(mut r: &mut R1)
    where R1: for <'a> FnMut(bool, Callable<'a>) {
        r(true, Callable(&mut r));
        r(false, Callable(&mut r));
    }
    test(&mut c1)

Only if this above function passess the borrow checker would it be safe to call the two argument version with the same mutable reference twice. The above does not pass as you cannot pass the mutable reference whilst calling r by mutable reference. This means it is not safe to pass the same argument twice by mutable reference to this function, but it is safe to do so for the function above. In other words this is a per-function proof that the borrow checker could apply to resolve if it is safe to pass the same mutable reference twice to a particular function (not a proof that it is safe for any function).

As to what I am trying to do, I am trying to declare higer-order functions like “compliment_of_converse” which modifies a relation. In this case I have a function that takes two relations and generates a third. This kind of operator occurs when trying to define something like generic parser combinators. For example have a look at my C++ parser combinator library https://github.com/keean/Parser-Combinators/blob/master/parser_combinators.hpp an example of a binary combinator would be something like combinator_sequence.


#11

Your proof presumes knowledge of the contents of the test function when checking whether an invocation of it is valid. The whole idea of the borrow checker, and type checkers in general, is that they only need to know the signature of the test function, not the body of it. Otherwise, you have to do full program static analysis to determine whether your program is valid, and that can run into the halting problem in the general case.

By the way, since you seem to be editing your post to get the formatting right, you can format code with highlighting by using code fences, and extension to Markdown that makes formatting code a lot easier:

```
fn main() {}
```

Becomes:

fn main() {}

#12

Thanks for the markdown tip, it looks a lot nicer.

Yes, I did say above that this was not a type system check. However i think it could be done in the type system, but that is complicated and not the kind of thing I can think up in such a short amount of time. Linear types probably don’t allow it, as they impose a linear ordering on the mutations that means that a single argument function is not the same as a two argument function where both arguments are the same. This is because they do not account for statement ordering.

One random idea would be to make multiple mutable reference safety a kind of trait, such that:

fn test<R, S>(r : &mut R, s : &mut S) 
where R : FnMut(...), S : FnMut(...), Disjoint(R, S) {
...
}

Where Disjoint is a special trait, that is only allowed if the function body complies with the substitute s for r (or r for s) test. There is precedent for special traits already like Copy.

Now the borrow checker knows it’s okay to pass the same mutable reference twice when they are linked by the Disjoint trait.

Edit: On reflection I think the special trait approach is a neat solution that avoids complicating the type system too much (by introducing Alias types for example), and should be easy to understand. This method does not require whole program analysis, or run into problems with non-termination.


#13

Not every type system, some admit invalid programs, like C and C++, but i do agree with that statement.

It is possible to type every untyped lambda expression in a few different ways with intersection types, my favourite being recursive universally quantified types with rank-0 intersection types. You can see a discussion about this on lambda the ultimate: http://lambda-the-ultimate.org/node/4954

I too would like to write small proofs in the type system, however I am not sure dependent types are the right answer for a language like Rust as they lose the distinction between static and runtime polymorphism. Some kind of refinement types are probably a better solution.


#14

Two quick points:

  • In your first example, the need to use intermediate variables should be removed in the future with MIR.

  • If you do go with RefCell, you don’t have to build it into the higher-order function and thus incur the cost when the closures are stateless or different. Instead you can use a wrapper:

use std::cell::RefCell;

fn main() {
    pub fn and(a : bool, b : bool) -> bool {
        a & b
    }
    pub fn test1a<R, A>(r : &mut R, s : &mut R, a : A, b : A) -> A
    where R : FnMut(A, A) -> A, A : Clone {
        let x = s(a.clone(), b.clone());
        let y = s(b, a);
        r(x, y)
    }
    let mut s = true;
    {
        let c = RefCell::new(|a, b| {s = s & a; and(a, b)});
        let rc = |a, b| { (&mut *c.borrow_mut())(a, b) };
        println!("{}", test1a(&mut &rc, &mut &rc, true, true));
    }
    println!("{}", s);
}

It’s a bit more arcane than I’d like - I’m not sure why (&mut *c.borrow_mut())(a, b) is needed rather than just c.borrow_mut()(a, b) (which produces a somewhat odd error). But it works.

Using RefCell allows rc to be called with an immutable reference; I still can’t actually get two mutable references to it, but I can pass a mutable reference to an immutable reference instead, using two anonymous immutable references (the function traits are automatically implemented on references). Of course, the same trick would work with a truly stateless closure, and if you just want to pass a regular fn, &mut and is enough.


#15

I think that’s an acceptable solution, as the generic function does not need to know its being passed a RefCel. It would be nice if the borrow checker could certify functions as multiple mutable reference safe using special traits though, as that clearly has less runtime overhead in the case you are passing a single closure.

Using specialisation on type bounds, you can provide specialised instances of the generic function with better performance if the closure is stateless, however I would always want to start with the most generic function possible and then specialise.

I have noticed another place where I seem to need an extra &mut, and I posted it in a linked thread. It would be great if someone could explain why the extra level of indirection via a mutable reference is required.


#16

A further thought about the safety of mutable reference arguments: It seems like the problem is with reading and writing to the same reference. In other words just like you can have multiple read references to an object safely, you should be able to have multiple write-only references to an object safely. Problems only occur when read and write references overlap. To me it would seem better to represent the mutability of a type as a trait. So you could have:

fn f<T, U>(t : &T, u : &U) where T : Readable, U : Writeable {
    ...
}

The function would be safe as long as T and U do not overlap (see examples above of substituting one argument for the other to check borrowing. Now I would argue that it doesn’t matter is something is internally mutable, as long as it does not implement Writeable (which provides a write-only dereference of the reference).

In other words it is safe to pass a closure that internally mutates its state as a read-only reference, providing it does not implement the Writable trait, and also the function it is passed two does not require Writeable.

I think by using separate Readable and Writeable traits, combined with the Disjoint trait discussed above, the borrow checker could admit a lot more valid programs without risking safety, and better facilitate generic programming by allowing more general functions to be written.