Why can't you use a mutably borrowed value in-between mutable changes?

Let's consider the following code:

fn main() {
    let mut num = 5;
    let numRef = &mut num;
    
    let val = num;
    *numRef = 6;
}

(Playground)

Which will fail to compile with this error:

   Compiling playground v0.0.1 (/playground)
warning: unused variable: `val`
 --> src/main.rs:5:9
  |
5 |     let val = num;
  |         ^^^ help: if this is intentional, prefix it with an underscore: `_val`
  |
  = note: `#[warn(unused_variables)]` on by default

error[E0503]: cannot use `num` because it was mutably borrowed
 --> src/main.rs:5:15
  |
3 |     let numRef = &mut num;
  |                  -------- borrow of `num` occurs here
4 |     
5 |     let val = num;
  |               ^^^ use of borrowed `num`
6 |     *numRef = 6;
  |     ----------- borrow later used here

For more information about this error, try `rustc --explain E0503`.
warning: `playground` (bin "playground") generated 1 warning
error: could not compile `playground` due to previous error; 1 warning emitted

My question is: why is Rust designed this way? As far as I can tell, it should be safe to read (or immutably borrow) a mutable variable that is mutably borrowed, as long as there are no mutable changes to the variable during the immutable borrow's lifetime.

In this example, the value is read in order to copy it to a new variable. Why is reading a variable unsafe when a mutable reference on it exists? What should be unsafe is using that read value after a mutable change was done to the variable, which isn't the case here.

So why doesn't Rust allow reading from a mutably borrowed variable in between mutable changes? Is it because it would be too hard to properly check for this at compile time, or is there an actual scenario where this could lead to an error?

This is because &mut is not just mutable, but it primarily promises exclusive access.

Any time you get a &mut, you have a 100% guarantee that it is the only way to access the object, and that there is no other way anywhere else for as long as this reference is usable. This is by design, as such a strong guarantee is required for Rust's thread safety guarantees, prevention of iterator invalidation problems, etc.

It may seem silly and overly conservative in a simple function like that, but the rules make more sense as soon as you start passing these references down to functions and other threads. Completely unrelated code could rely on modifying &mut whenever it wants, and you shouldn't be able to observe these changes (possibly half-written) through num.

"Completely unrelated code could rely on modifying &mut whenever it wants"

I of course thought about that, but Rust already checks for that (kind of). If I create a mutable ref, use it, and then stop using it, I can then read the value or create immutable refs (or even a new mutable ref), so Rust already guarantees that no one will "mutate the object whenever they want" through the first ref.

TBH I never saw how multithreading is done in Rust, but consider the following scenario:

  • You create a mutable variable
  • You give a mutable ref to that variable to some function that doesn't return anything
  • You start reading the variable value or create immutable refs to the variable or to its data

This is fine, right? So how does Rust know that the function didn't start a new thread, didn't pass the mutable reference there and the new thread is now mutating the variable as you are reading it? Surely there is some mechanism to prevent this. Couldn't the same mechanism be used to ensure that a mutable ref isn't used to mutate an object while there is another active ref on it instead?

Yes, I understand that Rust is designed this way; I'm asking why it is designed this way. I'm sure there is a good reason for it, and I would like to educate myself and find out that reason.

You say "such a strong guarantee is required for Rust's thread safety guarantees", but without an example it is difficult to imagine why that is the case.

"[or] prevention of iterator invalidation problems"

Again, without an example, it's hard for me to imagine why someone else holding a mutable reference (and not using it to mutate the object) would cause an issue with an iterator. But I don't know how iterators are implemented. Maybe if iterators actually mutate the object (like a current index) even when behind a & reference, then sure, that would cause a problem, but so would having two iterators at once, and I don't know whether that is a problem or not.

I would like to understand this, but if you effectively just say "it is necessary because it is necessary" and don't include an example of how exactly things would go bad if Rust allowed this, then I can't understand why it is the way it is.

If you can guarantee that the variable isn't changed, then why don't you just split the borrows? Borrow it mutably when needed, end the mutable borrow, borrow it immutably twice simultaneously (which is allowed), then drop both immutable borrows, and borrow it mutably again.

I guess the compiler could somehow prove in some limited cases that the value behind a mutable borrow isn't in fact changed, but 1. that would probably require global analysis (which is practically unfeasible), and 2. special-casing this would lead to unclear code that doesn't express the writer's intent.
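
A minimal sketch of the split-borrow approach described above, applied to the integer from the original question. The function name is made up for illustration; the point is only that each borrow ends at its last use, so the phases never overlap.

```rust
// Returns (value read between the two mutations, final value).
fn split_borrows() -> (i32, i32) {
    let mut num = 5;

    // First mutable borrow; its last use is the next line.
    let num_ref = &mut num;
    *num_ref += 1;

    // The mutable borrow is over, so several shared borrows may coexist.
    let a = &num;
    let b = &num;
    let val = *a + *b;

    // `a` and `b` are not used past this point, so borrowing mutably
    // again is accepted.
    let num_ref = &mut num;
    *num_ref = 6;

    (val, num)
}

fn main() {
    let (val, num) = split_borrows();
    println!("val = {}, num = {}", val, num);
}
```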

Your particular example as well as some cases of modification of an array during iteration are possible to do safely via &Cell<T> as described here without any extra overhead. However, as partly described in the projections section in that blog post, the things you can do with a Cell are rather limited because it has to stop all of the several different ways it can go wrong. Some examples of how it can go wrong can be found here.
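
As a rough sketch of the Cell approach for the integer case, assuming only `std::cell::Cell` from the standard library (the function and variable names are illustrative): `Cell::from_mut` converts the exclusive borrow into a shared `&Cell<i32>`, which can be copied freely and still be written through.

```rust
use std::cell::Cell;

// Returns (value read before the write, value read after the write).
fn read_between_writes() -> (i32, i32) {
    let mut num = 5;

    // Turn the exclusive borrow into a shared `&Cell<i32>`.
    let num_cell: &Cell<i32> = Cell::from_mut(&mut num);
    // Shared references are `Copy`, so this is a second handle to the same cell.
    let alias: &Cell<i32> = num_cell;

    let before = alias.get(); // read through one handle
    num_cell.set(6); // write through the other
    let after = alias.get(); // the write is visible through the first handle

    (before, after)
}

fn main() {
    let (before, after) = read_between_writes();
    println!("before = {}, after = {}", before, after);
}
```

Note that `Cell` only hands out copies of the value (`get`) or replaces it wholesale (`set`); it never gives out a reference into its contents, which is part of what keeps this safe.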

That scenario is disallowed because non-'static references can't be sent across threads, where the borrow checker would be unable to reason about them. This can be seen in the 'static bound on the closure passed to thread::spawn().
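
To illustrate the 'static bound, here is a small sketch (the helper name is hypothetical). Because the closure must be 'static, it cannot capture a borrow of a caller's local variable; it has to own its data, which is then handed back through `join`.

```rust
use std::thread;

// Appends to `text` on another thread and returns the result.
fn append_in_thread(mut text: String) -> String {
    // `move` transfers ownership of `text` into the closure, which
    // satisfies the `'static` bound on `thread::spawn`.
    let handle = thread::spawn(move || {
        text.push_str(" world");
        text
    });

    // A closure that captured `&mut` to a caller's local variable instead
    // would not compile, because that borrow is not `'static`.
    handle.join().expect("worker thread panicked")
}

fn main() {
    let result = append_in_thread("Hello".to_string());
    println!("{}", result);
}
```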

No it isn't, and it wouldn't pass the borrow checker, because you can't read a variable (including the creation of immutable references to it) if some other user (function) has a mutable borrow to it.

The compiler knows about it from the lifetimes annotated on the reference. If the reference is annotated in the default way, then the function must not access it once it has returned. Alternative lifetime annotations on the function signature would allow the function to pass it to a different thread, but by virtue of being part of the function signature, the caller would know about this and the caller would be prevented from accessing the variable until the thread stops using it. As an example:

use rayon::Scope;

fn use_in_thread<'a>(value: &'a mut String, thread_scope: &Scope<'a>) {
    thread_scope.spawn(move |_| {
        *value = "Hello world!".to_string();
    });
    
    // Replacing the above with this would not compile because the new
    // thread is not associated with the lifetime of the reference.
    //
    // std::thread::spawn(move || {
    //     *value = "Hello world!".to_string();
    // });
}

fn main() {
    let mut my_string = "some value".to_string();
    
    rayon::scope(|scope| {
        use_in_thread(&mut my_string, scope);
        
        // This would fail to compile because the thread can still access it:
        // println!("{}", my_string);
        
        // returning from rayon::scope will sleep until all threads spawned
        // from the scope have returned
    });
    
    // This is fine:
    println!("{}", my_string);
}

In the above example, a mutable reference is passed to a new thread that outlives the function call, but the lifetime ties the reference to the scope of the thread, and once the thread is joined and hence guaranteed to no longer access the string, you can use it again in main.

It exists like this, because a guaranteed exclusive access (AKA no mutable aliasing) is a very useful property. Rust chose to have this limitation to have a useful tool for the borrow checker and safe constructs. It is essential for safety analysis. In lots of places in the compiler, in both safety checks and optimizations, it needs to consider "could anyone else have changed this value in the meantime?". In a do-whatever-you-want language this is a very difficult question to answer with certainty, and it's mostly the reason why static analysis tools for C don't have the same level of guarantees that Rust gives.

Basically, for safety it's a fundamental concept. Just like non-null types eliminate ambiguity about null values, or unsigned integers eliminate ambiguity about negative values, &mut eliminates ambiguity about concurrent access.
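
A small sketch of the "could anyone else have changed this value in the meantime?" question. The function below is invented for illustration. Because `total` is a `&mut`, it cannot alias `step`, so the writes to `*total` cannot change `*step`, and the compiler is free to read `*step` just once.

```rust
// `total` and `step` are guaranteed to refer to different memory.
fn add_twice(total: &mut i32, step: &i32) {
    *total += *step;
    *total += *step;
}

fn main() {
    let mut total = 1;
    let step = 2;
    add_twice(&mut total, &step);

    // Rejected by the borrow checker (E0502), since both arguments would
    // refer to `total`:
    // add_twice(&mut total, &total);

    println!("{}", total);
}
```

If the aliased call were allowed, the second addition would see a `step` that the first addition had already changed, and the result would depend on whether the compiler reloaded it.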

As for why the compiler is not smart enough to borrow exclusively, then forget the borrow, and then borrow again: that's because it's just incredibly hard to implement (in a way that's generally usable and can be proven not to have holes when used with complex code):

https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/

It already does something like this, but in very limited circumstances:

https://rustc-dev-guide.rust-lang.org/borrow_check/two_phase_borrows.html
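
A commonly cited case that two-phase borrows make work, shown as a sketch with an illustrative function name: the method call needs `&mut v`, yet the argument reads `v` through a shared borrow before the mutable borrow is actually used.

```rust
fn push_own_len() -> Vec<usize> {
    let mut v = vec![10, 20, 30];

    // `Vec::push` takes `&mut self`, but `v.len()` is evaluated first
    // through a shared borrow; the mutable borrow only becomes active
    // once the call itself happens.
    v.push(v.len());

    v
}

fn main() {
    println!("{:?}", push_own_len());
}
```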

Some other brief thoughts:

  • If you use a shared reference instead of ownership, that would result in aliased memory, and part of the exclusivity guarantee is that such aliasing is instant UB
    • rustc annotates LLVM variables as noalias for example, and who knows what LLVM will do with that
  • If you then give up on aliasing, but still allow shared references, it's pretty easy to create problems using interior mutability
    • E.g. get a mutable pointer to the underlying value, then use a shared reference to drop the value via interior mutability
  • Going back to ownership, you can't move the item or you'd have a dangling pointer, so your example as-is only works for Copy types
  • And creating a copy is probably considered to be equivalent to using a shared borrow anyway
  • The compiler could figure out simple cases and move your copy to before the borrow in this case, but compilers getting too clever in a shuffle-my-code-around-for-me way tends to lead to fragile code bases where seemingly innocuous changes fail to compile, or are just plain difficult to reason about
    • Better to keep the rules simple and consistent. No aliasing is a mindset you want to get in the habit of for Rust anyway.

It is interesting to me that it does compile with the following change. So apparently the compiler is simply checking that all access goes through num_ref while it holds the exclusive borrow. If so, that's an easy rule to remember.

    let mut num = 5;
    let num_ref = &mut num;

    //let val = num;
    let val = *num_ref;

    *num_ref = 6;

This pattern actually looks a whole lot like the iterator invalidation problem. Swap the primitive int for something more complex and the compiler's rejection of this code will make a whole lot more sense.
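
For instance, here is a sketch with a Vec in place of the integer (names are illustrative). With the commented-out line enabled, the push could reallocate the vector's buffer while `first` still points into the old one, which is exactly the kind of bug the rule prevents.

```rust
// Returns (first element, length) after the edits below.
fn first_after_push() -> (i32, usize) {
    let mut numbers = vec![1, 2, 3];

    let first = &mut numbers[0];
    // numbers.push(4); // rejected (E0499): `numbers` is already mutably borrowed
    *first = 10;

    // `first` is no longer used, so mutating the vector is fine again.
    numbers.push(4);

    (numbers[0], numbers.len())
}

fn main() {
    println!("{:?}", first_after_push());
}
```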

This presentation by Niko highlights it pretty well, in my opinion: C++Now 2017: Niko Matsakis "Rust: Hack Without Fear!" - YouTube

"If you can guarantee that the variable isn't changed, then why don't you just split the borrows?"

Yes of course, I thought about that, but what about this scenario: (playground)

fn choose_shorter<'a>(a: &'a mut String, b: &'a mut String) -> &'a mut String {
    if a.len() < b.len() {
        a 
    } else {
        b
    }
}

fn main() {
    let mut text1 = "Hello".to_string();
    let mut text2 = "From kajacx".to_string();
    
    let selected = choose_shorter(&mut text1, &mut text2);
    
    // mutate selected
    selected.push_str(" appended");
    
    // read text1 and text2
    println!("text1: {}, text2: {}", text1, text2);
    
    // mutate selected again
    selected.push_str(" appended again");
    
    // don't use the previously read value of text1 or text2
    // (that would be bad, because a or b might have been mutated since the value was read)
    // read text1 and text2 again instead
    println!("text1: {}, text2: {}", text1, text2);
}

Here you can't simply drop the reference and make a new one, because the lengths of the texts have changed, so you couldn't compute the same choice again. I guess you could save which text was shorter in a boolean and then reconstruct the reference from that, but that is additional overhead that wouldn't be necessary if Rust allowed reading from a value that has a mutable ref on it.
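
For reference, the "remember which one was chosen" workaround might look roughly like this. It is only a sketch: storing the two strings in an array and the helper names are assumptions made for this example, not part of the original code.

```rust
// Returns the index of the shorter string instead of a reference to it.
fn shorter_index(texts: &[String; 2]) -> usize {
    if texts[0].len() < texts[1].len() {
        0
    } else {
        1
    }
}

fn append_twice() -> [String; 2] {
    let mut texts = ["Hello".to_string(), "From kajacx".to_string()];

    // A plain index holds no borrow, so `texts` stays freely readable.
    let selected = shorter_index(&texts);

    texts[selected].push_str(" appended");
    println!("text1: {}, text2: {}", texts[0], texts[1]);

    texts[selected].push_str(" appended again");
    println!("text1: {}, text2: {}", texts[0], texts[1]);

    texts
}

fn main() {
    append_twice();
}
```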

Also note that Rust already allows reading values from mutable refs, even passing "readonly" refs to an object that has an active "unique" mutable ref on it: playground

fn read(a: &String) {
    println!("Value is: {}", a);
}

fn main() {
    let mut text = "Hello".to_string();
    let text_ref = &mut text;
    
    text_ref.push_str(" world");
    
    // 2 "readonly" refs can exist WHILE a mutable ref exists as well
    let read1: &String = text_ref;
    let read2: &String = text_ref;
    read(read1);
    read(read2);
    
    text_ref.push_str("!");
    
    // read(read1); Error: value was mutated between obtaining a readonly ref and using it
}

So if it isn't a problem to create a & ref from a mutable ref and even pass it to functions, why would it be a problem to create a & ref from the object that the mutable ref points to and pass it to functions as well, using the same rules (neither the mutable ref nor the object itself can be used to mutate the data while the borrowed & is active)? Rust already checks for this when you derive & refs from the mutable ref, and it doesn't require global analysis.

"you can't read a variable (including the creation of immutable references to it) if some other user (function) has a mutable borrow to it."

Actually you can, see here: Rust Playground

"No it isn't, and it wouldn't pass the borrow checker"

Yes it does, see here: Rust Playground

Right, you can if the immutable reference is derived from the mutable reference. However, you must create it from the mutable reference to do this, and the mutable reference cannot be used until you stop using the immutable references again.

H2CO3 was referring to your comments about multi-threading here.

The standard library currently doesn't provide a way to do this, but rayon does using scoped threads. In addition we want to add scoped threads back into the standard library.
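
For readers on newer toolchains: scoped threads have since been added to the standard library as `std::thread::scope` (stable since Rust 1.63). A sketch of the rayon example above rewritten with it, using an illustrative function name:

```rust
use std::thread;

fn set_in_scoped_thread() -> String {
    let mut my_string = "some value".to_string();

    thread::scope(|s| {
        // The closure mutably borrows `my_string`; no `'static` bound is
        // needed because the scope joins the thread before returning.
        s.spawn(|| {
            my_string = "Hello world!".to_string();
        });

        // Reading `my_string` here would be rejected, because the spawned
        // thread still holds the mutable borrow:
        // println!("{}", my_string);
    });

    // All scoped threads have finished, so the borrow is over.
    my_string
}

fn main() {
    println!("{}", set_in_scoped_thread());
}
```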

Disclaimer: I'm still quite new to Rust, so I don't have a deep understanding of how some internals work. But from what I have seen, Rust's borrowing system is sometimes more conservative than it (in theory) needs to be. That is because judging whether something is safe can become incredibly complex (or even impossible, due to the halting problem or Gödel's incompleteness theorems, if I understand it right).

From Splitting Borrows in the Rustonomicon:

The borrow checker understands some basic stuff, but will fall over pretty easily.

One example of the borrow checker being "improved" is non-lexical lifetimes (NLL). Before they were introduced, the compiler rejected certain programs that are accepted today.
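
As an illustration of what non-lexical lifetimes changed, here is a small sketch (the function name is made up for this example). With lexically scoped borrows, `first` counted as live until the end of the function, so the push was rejected; with NLL the borrow ends at its last use.

```rust
fn push_after_last_use() -> Vec<i32> {
    let mut v = vec![1, 2, 3];

    let first = &v[0];
    println!("first = {}", first); // last use of `first`

    // Accepted with NLL: the shared borrow held by `first` has ended.
    v.push(4);

    v
}

fn main() {
    println!("{:?}", push_after_last_use());
}
```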

I'm not sure whether that is a good comparison, as there might be further implications of aliasing that I'm missing here, which might require your program to be rejected (given the current implementation of the Rust compiler). But my point is: the borrow checker isn't a tool that accepts all correct programs (that would be impossible), but a tool that rejects all incorrect ones with regard to conflicting access. Therefore, it will always be a bit more conservative than it needs to be (in theory or in practice, just as it was before non-lexical lifetimes were introduced, for example).

Does that make sense?

Yes it does. I always knew that checking for access like this would be more complicated, but I don't think it would be much more complicated. However, there is a bigger problem, which I will describe in another reply.

"Your particular example as well as some cases of modification of an array during iteration are possible to do safely via &Cell<T> as described here without any extra overhead. However, as partly described in the projections section in that blog post, the things you can do with a Cell are rather limited because it has to stop all of the several different ways it can go wrong. Some examples of how it can go wrong can be found here."

Ok, so if I understand correctly, the holder of the mutable ref can lend the ref as immutable to someone, and they can mutate the data using Cell (if the data contains a Cell, of course), and the holder needs to keep in mind that the data may have been mutated. However, that is still manageable.

What isn't manageable is if someone else (say, the original owner of the object) lends an "immutable" ref to someone for them to modify it using Cell without informing the mutable ref holder (which is effectively what I was trying to do). Do I understand correctly?