Immutable references

fn main()
{
    let mut x = 10;
    let xr = &mut x;

    println!("{}", x); // Will not work
}

How come this is not allowed? I understand that I can do this

fn main()
{
    {
        let mut x = 10;
        let xr = &mut x;
    }

    println!("{}", x);
}

But why can't I modify the x variable if I have a mutable reference attached to it?

When you wrote

let xr = &mut x;

you expressed that xr should have exclusive access to x, i.e., xr is the only way to modify or even read anything from x. Later, when you want to print the value behind the variable x, you need to be able to read it, but as long as xr has exclusive access to x, you can't do that.

If you had written

let xr = &x;

instead, then your code would work. &x denotes read-only or shared access to the variable x. As long as you have read-only access to x, no one can get exclusive access via &mut x. However, you can have as many read-only references as you like, and println! doesn't need to modify the value it wants to display, so that's fine.
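
For illustration, a minimal compilable sketch of that shared-borrow version:

fn main() {
    let x = 10;
    let xr = &x;        // shared, read-only borrow of `x`
    println!("{}", x);  // fine: reading `x` directly is still allowed
    println!("{}", xr); // fine: any number of shared borrows may read it
}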

Your 2nd example can't possibly work. x will be dropped at the following } and cannot be used by println! anymore. When you wrap something in curly brackets, you create a new scope, and that scope (along with the lifetime of everything declared in it) ends at the closing curly bracket }. If you write

let mut x = 10;
{
    let xr = &mut x;
}
println!("{}", x);

instead, it'll work. The variable x is now in the same scope as its usage by println!, while the variable xr, and therefore the exclusive access to x, is dropped at the inner }.

I just realised I had a typo. Thanks for pointing that out.

Yes, I am well aware that this is what is happening and that I am not allowed to do this. However, my question is: why am I not allowed to still access x? What risk is involved?

& and &mut are designed to work across threads. While in single-threaded code there is not as much value in &mut being exclusive, in multi-threaded code it prevents data races, because when you have a &mut to something, no other thread can have a &mut to the same value.

If you need to be able to write to the data from multiple points in your program in the same thread, you should take a look at Cell, RefCell and UnsafeCell in std::cell. UnsafeCell is the only way to obtain a &mut from a &. Cell and RefCell use UnsafeCell internally and provide a safe wrapper around it. The reason this works is compiler magic: UnsafeCell has a language attribute on its type declaration which allows it to do what it does, because the compiler treats it differently from other types. If you try to obtain a &mut from a & in any other way, your program will have undefined behavior. There are no exceptions.
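
For illustration, a minimal single-threaded sketch of the RefCell route, assuming you just want to mutate an integer through a shared reference (the borrow is checked at runtime):

use std::cell::RefCell;

fn main() {
    let x = RefCell::new(10);
    let xr = &x;                 // shared reference to the RefCell
    *xr.borrow_mut() += 1;       // mutate through it; the borrow is checked at runtime
    println!("{}", *x.borrow()); // 11
}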

Did you actually try running your code in the playground? Because this actually does work just fine.

That's very wrong, to the point where numerous bugs and security vulnerabilities have unfolded in the past, stemming from simple, single-threaded shared mutability.

The prime example of why shared mutability is incorrect is the phenomenon of so-called "iterator invalidation" (the term comes from C++). Consider what would happen if the following code were allowed:

let mut v = vec![1, 2, 3];
let ptr_three = &v[2];
v.push(4);
println!("{}", *ptr_three);

Here, we obtain a pointer to an element of the vector. Then, we push an element onto the vector. This can cause the vector to reallocate its buffer (if it has exhausted its capacity), invalidating the previously-obtained pointer. Now dereferencing that pointer (by trying to print the value it points to) is undefined behavior.
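
For illustration, a sketch of one way to express the same intent safely, assuming you only need the element's position rather than a live reference across the push:

fn main() {
    let mut v = vec![1, 2, 3];
    let idx = 2;            // remember an index instead of holding a reference
    v.push(4);              // the vector may reallocate here; no borrow is held
    println!("{}", v[idx]); // re-index after the mutation, which is always valid
}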

A myriad of variations on this theme can be, and have been, written. Multi-threading is definitely not needed for the violation of the RWLock pattern to go horribly wrong.

Furthermore, borrowing semantics is not even sufficient to guarantee correctness in the presence of multi-threading, while also being insufficiently expressive to allow for many real-world use cases of shared memory. That's why we have additional abstractions, like the marker traits Send and Sync and lock types such as Mutex (the latter of which circumvents the borrowing system and uses unsafe under the hood).

Forgive me for not understanding, but what is wrong with having multiple &mut references to a normal integer variable, and what is wrong with modifying a normal variable's value? How does it cause a data race?

It actually doesn't work for me; maybe I have an old version of Rust or something, I'm not too sure.

Why can't the pointer's address just change then?

Well, why could it change? You have a pointer pointing somewhere. A pointer is, at a low-level representation, just something like an integer index into the address space of the program. Once you've obtained a pointer, it's not being "tracked" or anything, it's not magic. So the situation after the reallocation is akin to having an integer index into an array which doesn't exist anymore. It's as simple as that.
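
A small sketch of that, using Vec::as_ptr to peek at the buffer address before and after a push that forces a reallocation (the two addresses usually differ, though the allocator makes no guarantee either way):

fn main() {
    let mut v = Vec::with_capacity(1);
    v.push(1u32);
    let before = v.as_ptr(); // raw address of the current buffer
    v.push(2);               // capacity is exhausted, so Vec allocates a new buffer
    let after = v.as_ptr();
    // Nothing updates the old address; anything still holding it would now
    // point into memory the vector no longer owns.
    println!("before: {:p}, after: {:p}", before, after);
}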

Yeah, your crate needs to be on the Rust 2018 edition for this to work, unless you have Rust 1.40 (or 1.41, later today).

What do I type under dependencies to use Rust 2018?

This is not a dependency, but rather a package property - look here for reference.
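
For reference, a minimal sketch of where that goes in Cargo.toml (the name and version below are placeholders):

[package]
name = "my_crate"   # placeholder name
version = "0.1.0"   # placeholder version
edition = "2018"    # the edition goes here, not under [dependencies]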

With this example that you have provided, why can't this rule just be applied to vectors but not integers? Because if I have multiple mutable references to a mutable integer variable, can this still cause a data race, and if so, how?

Rust, as a language, offers two novel paradigms:

Now, you may ask why Rust was designed this way, and the answers can be pretty interesting, but it does not change the fact that Rust works this way. An exclusive reference, by design, must be exclusive.

To rephrase your question:

what is the point of exclusive references for simple types such as integers? More generally, how useful can it be to have a &mut reference on something that is Copy?

At that point, multi-threading is the answer: contrary to other languages, Rust opts into thread-safe paradigms by default (i.e., thread-safety is opt-out rather than opt-in). This raises two questions:

  • how is &mut needed for thread-safety / to avoid data-races?

    Well, you can use exclusive references to mutate data (that's why they are called "mutable references", in an overly simplifying manner, and why they carry mut in their name rather than the more apt uniq...) in an unsynchronized manner (e.g., imagine doing *xr += 1; in your example).

    So now imagine doing let xr1 = mutable_ref_to(x); (pseudo-code), and let xr2 = mutable_ref_to(x);
    If that were possible, then you would be able to use xr1 in one thread to do *xr1 += 1 and xr2 in another thread to do *xr2 += 1. This would result in a data race, meaning that the value of x "afterwards" would be undefined.

  • how do I opt-out of thread safety? How do I get non-exclusive mutable references to simple types such as integers (or other Copy types)?

    The answer is to stop using exclusive references, and to instead wrap your type (e.g., an integer) in a Cell and use a shared reference (& _) to that Cell:

    use std::cell::Cell;
    
    let x = Cell::new(10);
    let xr = &x; // "mutable but non-exclusive reference to `x`"
    println!("{}", x.get()); // 10
    xr.set(42); // mutate!
    println!("{}", x.get()); // 42
    
    • (For those wondering if &Cell contradicts what I said about the data race in my first point, know that it is not possible to Send a reference of type &Cell to another thread, because Cell is not Sync, i.e., precisely because Cell offers "shared mutation" (also called interior mutability) that is not Sync-hronized / thread-safe. For a thread-safe counterpart, see the Mutex sketch right after this list.)
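
For illustration, a minimal sketch of that thread-safe counterpart, where shared mutation across threads goes through an Arc<Mutex<_>> so every access takes a lock (the values are arbitrary):

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let x = Arc::new(Mutex::new(10));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let x = Arc::clone(&x);
            thread::spawn(move || {
                // Locking grants temporary exclusive access, so increments
                // from different threads cannot interleave.
                *x.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("{}", *x.lock().unwrap()); // 12
}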

Let's say that x = 1 and both xr1 and xr2 (they are both mutable references) are pointing to x, right? And I tried to do *xr2 += 1 and *xr1 += 1. You claim that x's value would be undefined; how come it won't equal 3?

If += is not atomic, instructions could be interleaved and create a value that is unpredictable. For example, if we decompose it as

_1 = load(x)
_2 = _1 + 1
store(x, _2)

We can observe the following concurrent execution:

_a1 = load(x) |                // _a1 = 1
_a2 = _a1 + 1 |                // _a2 = 2
              | _b1 = load(x)  // _b1 = 1
              | _b2 = _b1 + 1  // _b2 = 2
store(x, _a2) |                // x = 2
              | store(x, _b2)  // x = 2

This is a totally valid execution that results in one of the increments being eaten. It gets even worse if the operation could leave the value in a partially written state, potentially causing corruption.
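
For illustration, a sketch that mimics this decomposition in Rust: it uses an atomic integer purely so the program stays free of undefined behavior, but it deliberately splits the increment into a separate load and store instead of using fetch_add, so updates from the two threads can still get lost:

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let x = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let x = Arc::clone(&x);
            thread::spawn(move || {
                for _ in 0..100_000 {
                    let v = x.load(Ordering::Relaxed); // _1 = load(x)
                    let v = v + 1;                     // _2 = _1 + 1
                    x.store(v, Ordering::Relaxed);     // store(x, _2)
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Two threads doing 100_000 increments each "should" give 200_000,
    // but the split read-modify-write usually loses some of them.
    println!("final value: {}", x.load(Ordering::Relaxed));
}

Running this typically prints a value noticeably smaller than 200000.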

If you're really interested (and not just wanting to push your thoughts), check out Deadlock Empire for its collection of somewhat interactive examples. They are in C#, but they do illustrate the idea.

I did not say that &mut is useless. I just said it is not as useful as it is for multi-threaded code, and that is logically correct, due to the reasons I've already stated. However, I didn't elaborate on why it matters, so let me do that now.

Your example is good for showcasing what would happen if the immutable borrows we use today were able to mutate as well, and how that simply can't work. However, it leaves open the question of how a third, new type of borrow might behave. As I imagine it, it'd be one that inherits the property of being shared from & and the ability to mutate from &mut, but is neither Sync nor Send. & and &mut would still behave as they do today.

Before diving into that mysterious new borrow type, let's examine the problem &mut is trying to solve: that of changing lifetimes. Whenever you mutate a value that contains a unique pointer, the implicit lifetime of the old unique pointer is over and the implicit lifetime of the new unique pointer begins. Any non-unique pointers to the value the unique pointer was pointing to become dangling. This is what happens when you have a pointer to one of the elements of a vector and you push new elements to the vector: it has to reallocate memory and may invalidate the old memory.[1]

As already mentioned in this thread, the pointers do not automatically change their value to the new memory address. What &mut does is make sure that no other non-unique pointers exist at the time it is created. For that to work, we have to attach explicit lifetimes to pointers. This is the difference between & and *const.[2]

Without explicit lifetimes, and &mut in particular, Vec would be impossible to implement without a RefCell. That would, in turn, transform every lifetime error into a runtime error, while both consuming more memory and being slower. No one would want to work with a new abstract language that is just another worse-performing C without any outstanding benefits. We have plenty of those already.

This is where &mut comes into play. Due to its guarantee of being the only pointer to a piece of memory, you can reallocate memory without having to worry about leaving dangling pointers behind, unless you messed up your own implementation in an unsafe code block.
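
For illustration, a small sketch of how that guarantee surfaces in everyday code: Vec::push takes &mut self, so the compiler refuses to accept it while a shared borrow of an element is still alive:

fn main() {
    let mut v = vec![1, 2, 3];
    let first = &v[0]; // shared borrow into the vector's buffer
    // v.push(4);      // ERROR if uncommented: cannot borrow `v` as mutable
                       // because it is also borrowed as immutable
    println!("{}", first);
}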

This also implicitly answers the question above: what would be so bad about simply mutating some integer on the stack through an immutable borrow? The correct answer is: nothing. There is no reason for splitting borrows into immutable and mutable for this simple task, but we don't have & and &mut for the simple tasks, we have them for the difficult ones. Using Cell is a zero-cost abstraction for what you want to achieve, and we would rather add a bit of verbosity to the simple cases than have to deal with headaches for the complex ones. This is the trade-off Rust chose and, IMO, it's worth it.

Getting back to the mysterious new borrow type: a single-threaded version of &mut could work like RefCell in debug mode, but be proven to work correctly at compile time when building with optimizations for a stable build, replacing the RefCell (which includes runtime checks) with an UnsafeCell (no runtime checks). This would obviously not work for all situations; in those cases where it cannot be proven, the compiler would emit an error and you'd have to either use the &mut we all know or RefCell, if the compiler can't reason about that either.
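
For context, a tiny sketch of the runtime check that RefCell performs today, which is what such a borrow type would try to prove away at compile time:

use std::cell::RefCell;

fn main() {
    let x = RefCell::new(10);
    let first = x.borrow_mut();     // exclusive borrow, counted at runtime
    // let second = x.borrow_mut(); // would panic at runtime if uncommented
    drop(first);
    *x.borrow_mut() += 1;           // fine again once the first borrow is gone
    println!("{}", *x.borrow());    // 11
}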

Now, you may ask why this would only work for single-threaded, but not for multi-threaded, code. From a logical point of view, this could also work for multi-threaded code, but because multi-threading can shuffle code execution between threads in any possible way, the cases to test for correctness grow exponentially with code size and at some point become practically impossible to check during compilation.

In a nutshell, the advantage of the new borrow type is that it works like RefCell, but without the added runtime cost of RefCell. It'd help get rid of manual optimizations where RefCell is ditched in favor of UnsafeCell, without having to resort to unsafe code.

[1] Technically speaking, reallocation may return the same memory address if it finds enough free space after the already allocated space to grow to the new size. In practice, you have no choice but to assume that the worst case will happen, i.e. new memory is allocated with the new size, the values are copied from the old space to the new one, and the old memory is deallocated.

[2] After compilation is done, there is no difference between & and *const. Only during compilation are lifetimes tracked for &, but not for *const.

That's a non sequitur. There's no such type in Rust. Why would I need to worry about it, and why would I need to incorporate the explanation, and take into account the behavior, of a made-up type that does not exist, when pointing out that &mut does have its use, regardless of the number of threads in a program?

I'm done arguing about this.