Borrowing `std::cell::RefMut` mutably and immutably

Hello!

I encountered a strange behavior of the borrow checker when using std::cell::RefMut. Here is the smallest example I can replicate.

use std::cell::{RefCell, RefMut};

fn compute(slice: &[i32]) -> i32 {
    slice.iter().sum()
}

#[derive(Debug)]
struct Something {
    refcell: RefCell<Vec<i32>>,
}

impl Something {
    fn new() -> Self {
        Self {
            refcell: RefCell::new(vec![1]),
        }
    }

    fn update(&self) {
        let mut vec: RefMut<Vec<i32>> = self.refcell.borrow_mut();
        vec.push(compute(&vec[..]));
    }
}

fn main() {
    let s = Something::new();
    for _ in 1..5 {
        s.update();
    }
    println!("{:?}", s);
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0502]: cannot borrow `vec` as immutable because it is also borrowed as mutable
  --> src/main.rs:21:27
   |
21 |         vec.push(compute(&vec[..]));
   |         --- ----          ^^^ immutable borrow occurs here
   |         |   |
   |         |   mutable borrow later used by call
   |         mutable borrow occurs here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0502`.
error: could not compile `playground`.

To learn more, run the command again with --verbose.

What is strange is that the immutable borrow &vec[..] should be released before vec.push acquires the mutable borrow.

Indeed, the code compiles if I rewrite it like this

fn update(&self) {
    let mut vec: RefMut<Vec<i32>> = self.refcell.borrow_mut();
    let tmp = compute(&vec[..]);
    vec.push(tmp);
}

or like this (cannot put another link, sorry; just copy-paste it on your own to try it out)

fn update(&self) {
    let vec: &mut Vec<i32> = &mut *self.refcell.borrow_mut();
    vec.push(compute(vec));
}

It's not too great of a deal, but I cannot understand why it doesn't work with RefMut.

This is probably a limitation of the special support for "two-phase borrows". Since there is an extra layer of indirection through the RefMut, the compiler doesn't see that the two-phase borrow rules apply.

1 Like

I found this. Maybe it's related.

In the linked RFC 2025 "nested method calls", there is the example of the code

vec.push(vec.len())

being translated into the MIR

/* 0 */ tmp0 = &'a mut vec;    // <-- mutable borrow starts here
/* 1 */ tmp1 = &'b vec;        // <-- shared borrow overlaps here
/* 2 */ tmp2 = Vec::len(tmp1);
/* 3 */ EndRegion('b);         // <-- shared borrow ends here
/* 3 */ Vec::push(tmp0, tmp2);
/* 5 */ EndRegion('a);         // <-- mutable borrow ends here

Now, I'm not an expert in MIR (actually, it's the first time I see any of it), but I find this translation very surprising. Indeed, since Rust is a language with eager evaluation, I expected function arguments to be evaluated before passing their result to the function itself. Therefore, I expected the MIR to be more similar to

/* 1 */ tmp1 = &'b vec;        // <-- shared borrow overlaps here
/* 2 */ tmp2 = Vec::len(tmp1);
/* 3 */ EndRegion('b);         // <-- shared borrow ends here
/* 0 */ tmp0 = &'a mut vec;    // <-- mutable borrow starts here
/* 3 */ Vec::push(tmp0, tmp2);
/* 5 */ EndRegion('a);         // <-- mutable borrow ends here

This is what

let len = vec.len();
vec.push()

would translate into (I hope!) and I believe the two Rust codes should be equivalent.

What is the problem of having this semantics? First evaluate all the arguments, then borrow the object for the method call.

Ok, I have seen that this section covers the reordering of evaluation. The included example is horribly wrong! How can a.foo({ a += 1; () }) return 3?!? The arguments of a function need to be evaluated before the function itself! This is the first time I encounter something that I find so tremendously wrong about the semantic of Rust... I really hope this detail about evaluation of arguments can be cleaned up in the near future. It would be a shame to be stuck forever with this wart in such a beautiful language

Rust is evaluated left to right, which means that the value for the method receiver will evaluated before the arguments. A simple rule that seems to have gone too far. (I don't like this desugarring either, but we can't change it now)

That's exactly why it works the way it does: because vec is an argument to the function! Personally I would be surprised if the method receiver were evaluated last, because that would mean that in

foo().bar(baz())

the order of evaluation would be baz first, then foo, then bar, which seems much more wrong to me than the contrived example you claim is horribly wrong. Anyway, even if you disagree with me on this point, hopefully you understand that reasonable people may come to different conclusions. Two-phase borrows are basically the easy case because the order of evaluation between vec and vec.len() doesn't really matter.

This reminds me of the whole "sequence point" situation in C, where

a = ++a;

is undefined behavior. The long and the short of it is it doesn't really matter. Code like this is ambiguous at best, and

a.foo(a += 1);

is no better. The only reason it compiles at all is that i32 is Copy -- it's quite different from vec.push(vec.len()). Anyway, I can't think of a reason to write code like that in the first place, so why would you?

1 Like

And if this was not the case, these two calls would have different semantics:

vec.push(vec.len()); // method call
Vec::push(&mut vec, vec.len()); // associated function
4 Likes

The C code is undefined behavior because the C specification is lazy. If the code is to be accepted, the only meaningful thing it could do is to evaluate the rhs first and then assign the result to the lhs. I agree that a=++a might be slightly ambiguous at first sight, but I don't see any hard reason (apart from preference) to make it undefined behavior. a = a++ is a different story though. The rhs evaluates to the old value of a, but then it's arbitrary whether a++ or a = old_value_of_a is executed first. Even in this case the C standard could have decided for one or the other. Making UB allows implementation do do what they prefer, including optimizing the whole statement to a no-op, but is not necessarily any better than defining it with some arbitrary (but reasonable) precedence rules.

That's enough for C, let's go back to Rust.

Regarding the example which is "horribly wrong", I'm ashamed to admit that I didn't pay enough attention to details: I didn't notice that there was a Copy involved; I wrongly assumed that it was like this example, which instead doesn't compile, luckily.

Ok, the whole situation seems more coherent to me now, but still there is something I'm not 100% satisfied with. I'll have to think a bit about it.

In any case, thanks you all for the answers and keep up the great work! :+1:

1 Like