Why can't I change a reference?

I am from a C++ background and have been reading this: A half-hour to learn Rust
(Don't worry this is not the only thing I use to learn Rust)

In there:
let mut x = 42;
let x_ref = &x;
x = 13;
println!("x_ref = {}", x_ref);

This will give: cannot assign to 'x' because it is borrowed

Why is that? I expected x_ref to be 13 afterwards, as it is just referencing the same memory as x.
Maybe my question is: Why is a reference different to borrowing?

How do I circumvent this? How can I create a variable that can be read by multiple other variables?

I'd say this is one of the fundamental properties of Rust's borrow checker: any piece of memory can either have one or more immutable references, or at most one mutable reference. It does not allow taking a mutable reference to, or mutating, any memory which is borrowed by an immutable reference.

As for what exactly this means, why rust does this, how to get around it, and how to program with this limitation, I have to defer to other resources. I've linked some at the bottom of this post.

I don't really understand this question, because I believe they are the same, at least in rust. Every reference always borrows.

Note that you can read from multiple variables. You're free to read x here while x_ref exists. Rust only disallows writing when multiple references exist.
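To make that concrete, here is a minimal sketch (the function name read_then_write is made up for illustration). It shows that reading x directly while x_ref is live is accepted, and that assigning to x is accepted as well once x_ref has been used for the last time:

```rust
/// Hypothetical helper: returns (value read directly, value read through
/// the reference, final value of `x`).
fn read_then_write() -> (i32, i32, i32) {
    let mut x = 42;
    let x_ref = &x;    // shared borrow of `x` starts here
    let direct = x;    // reading `x` directly while `x_ref` is live is fine
    let seen = *x_ref; // last use of `x_ref`, so the borrow ends here
    x = 13;            // accepted: no shared borrow is live any more
    (direct, seen, x)
}
```

Calling read_then_write() from main gives (42, 42, 13). Moving the x = 13 line above the last use of x_ref brings back the error from the original post.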

Might I recommend some other resources on this? I don't think I can explain the full situation very well myself.

If you have any questions about those articles, or have any trouble internalizing them, feel free to ask! I just think they can explain the situation better than I can.

5 Likes

Basically it's because an &i32 requires the target to be immutable for the duration of that reference, so it would be UB to modify x while an immutable reference to x exists.

There's a great blog post on this topic, which I think should be required reading for anyone learning the borrow rules for the first time: The Problem With Single-threaded Shared Mutability - In Pursuit of Laziness

The gist is that this design eliminates a class of undefined behavior caused by mutable aliasing. The concept is well known outside of Rust as well, for example in Java, though that language does not protect you from the kinds of mistakes explored here: Reading 9: Mutability & Immutability

2 Likes

If you are coming from C++, you surely are familiar with the concept of iterator invalidation:

#include <vector>
#include <iostream>

int main() {
    std::vector<int> v { 42 };
    int *ptr = &v[0];

    // This is OK:
    std::cout << "before reallocation: " << *ptr << '\n';

    // This is UB: assigning to `v` frees its old buffer, so `ptr` now dangles
    v = std::vector<int> { 13 };
    std::cout << "after reallocation: " << *ptr << '\n'; // undefined

    return 0;
}

The whole point of the ownership and borrowing system in Rust is to catch this kind of shared-mutability bug and turn it into a compile-time error, instead of letting it cause undefined behavior.

10 Likes

The same problem exists in e.g. Java, except you just get a ConcurrentModificationException when you modify a collection while iterating over it. Rust doesn't let you compile code that would throw that exception in Java.

8 Likes

Since we're listing examples, and there's some C++ background, another big one is references to things which are deallocated. For instance, this fails at compile time rather than accessing deallocated memory at runtime:

let mut v = vec![1, 2, 3]; // allocated with initial capacity of 3 i32s
let r = &v[0];
v.push(4); // forces reallocation; rejected because `r` still borrows `v`
println!("{}", r); // would read freed memory
3 Likes

The previous posts explain the motivation and reasoning behind Rust's design (which is indeed very interesting). But really, one must understand that, due to that motivation ("shared mutable state is bad") Rust offers a design with very specific rules, and so these rules must be followed, even if they may be overly restrictive in some cases. The rule here is the following:

  • & _ is a shared reference: the pointer can be copied around, so it is perfectly valid to have multiple such pointers pointing to the same value (we say the pointers are aliased). Such pointers / references are the ones used by "all" other languages.

  • &mut _ is a "Rust invention" / specific to Rust (and maybe to some other language; but the truth is most other languages do not have such a construct / reference): it is an exclusive / unique / unaliased reference / pointer. It could have been named &uniq.

    By very definition, by construction, there cannot be other potentially active / usable pointers pointing to the same value.

Back to your example, the other design choice of Rust is to make mutation, "by default", happen through these exclusive &uniq references (that's why such references are named mut; but, imho, that was a mistake, as this (recurring) thread proves).

When you have:

let mut x = 42;
let x_ref = &x;
x = 13;
println!("x_ref = {}", x_ref);

you are doing:

  1. let mut x ...
    You say that x is a value you may want to take exclusive references to (the main purpose of doing that being to mutate it in the default fashion, hence it being named mut).

  2. let x_ref = &x;
    You get a shared reference / "pointer" to x, which is held (and thus usable) up until the last time it is used:

    println!("x_ref = {}", *x_ref); /* read-deref */
    
  3. x = 13
    This is using the default mechanism to mutate x: that of going through a &mut exclusive reference. In other words, that line is sugar for:
    *(&mut x) = 13;

And there lies the problem: we are attempting to get an exclusive reference / pointer to x, despite there being another active / usable pointer around (x_ref). Hence the error.


Another way of seeing this is to imagine Rust's & / &mut system as a compile-time RwLock (multiple readers, single writer) that throws a compile-error whenever the code would deadlock:

let x_ref = &x; // acquire a read lock
*(
    &mut x // try to acquire a write / exclusive lock
) = 13;
println!("x_ref = {}", x_ref); // last usage of the read lock;
                               // the lock is only released afterwards

Now, you may have noticed that I have insisted on x = 13 being the default way to mutate a value, which hints at there being other ways to do it.

Indeed, if C / C++, for instance, which do not possess that &uniq reference abstraction, are able to do:

int x = 42;
int * x_ref = &x;
x = 13;
printf("x_ref = %d", *x_ref);

then Rust ought to also be able to express that, right? And indeed, it can; it's just that you won't be able to use the x = 13; mutation sugar, or the *x_ref read-dereference sugar, since those are reserved for the idiomatic by-exclusive-reference mutation and by-shared-reference read patterns respectively. Instead one must use more explicit constructs, mainly that of single-threaded by-value mutation, which is what the above C/C++ code is relying on:

// wrapper for single-threaded by-value mutation
use ::std::cell::Cell as SingleThreadedMutable;

let x = SingleThreadedMutable::new(42);
// shared reference (same as in C)
let x_ref = &x;
// by-value mutation through another shared reference (safe because thread-local)
x.set(13); /* sugar for (&x).set(13) */
println!("x_ref = {}", x_ref.get()); // outputs 13
10 Likes

I am always amazed by how much effort the rust community puts into helping each other.

However, the C++ examples with the vector don't really hit my point. It's true, and it makes sense to me, that it should be the same way in Rust because of reallocation.


But integers when assigned do not need to reallocate.
[image: memory_howIthinkitworks]
This is a different story with multiple threads, but this is a single thread example.
In general this is understandable with complicated types, that do reallocation when assigning new values, so old pointers start pointing to invalid memory.
But a simple integer is a bit restrictive isn't it? It doesn't change its location in memory once I overwrite it.
Of course the rules might be generalizing the problem. But then I think it would be better if it applies only to certain traits.

Maybe I didn't understand what you tried to tell me and need to learn more about borrowing to be able to understand the problems with my design. I now have a bunch of recommended references. Maybe the terms Rust uses do not convey what it does to memory under the hood?

1 Like

The point of borrowing and references is that they are supposed to guarantee safe shared access (in the case of &T) or unique access (in the case of &mut T). These types of reference are mutually exclusive and involve fairly strong guarantees that protect against various problems (as in the iterator invalidation example), but can also be used to safely permit optimisations (as with the const and restrict keywords in C, but with the conditions checked by the compiler to rule out undefined behaviour).

You are correct that it makes little sense for i32, but the answer is to just not use references with types like that (small types that implement Copy). A reference to something like i32 is rarely seen for this reason, and usually just occurs when using a generic function that doesn't necessarily know it is dealing with a Copy type (and there are functions such as Iterator::copied() to deal with such cases). Instead of using a reference, you could instead just copy the value for an i32. Mutability through shared references is provided for by things such as Cell and RefCell, but it isn't clear that you need that in this case.
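As a rough sketch of what "just copy the value" looks like (both function names are invented for this example): the first function takes a copy of x instead of a reference, so the later assignment is not a borrow conflict, and the second one uses Iterator::copied() to turn an iterator of &i32 into an iterator of i32.

```rust
/// Hypothetical helper: returns (the copy taken before the write, the new `x`).
fn snapshot_then_mutate() -> (i32, i32) {
    let mut x = 42;
    let x_copy = x; // i32 is Copy: this is an independent value, not a borrow
    x = 13;         // no reference to `x` exists, so this is accepted
    (x_copy, x)
}

/// Hypothetical helper: sums a slice by value rather than by reference.
fn sum_copied(values: &[i32]) -> i32 {
    // `iter()` yields `&i32`; `copied()` turns each item into a plain `i32`.
    values.iter().copied().sum()
}
```

Note that the copy keeps the old value 42. If you actually want to observe the new value through a second handle, that is the Cell case rather than this one.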

4 Likes

I beg to differ. Treating types uniformly is a massive win in terms of learnability, fewer bugs, and even things under the hood like optimizations and compiler complexity.

I literally never had the need to violate the RWLock pattern even with simple integers. It just doesn't come up unless you're doing something weird that you shouldn't be doing anyways. Granted, probably we could special-case some types, but why bother with all that complexity, when one could just learn the idioms of the language and write beautiful, regular code?

15 Likes

I think you're right-on here: it's a generalization. This helps learnability as @H2CO3 mentioned, but it also helps reading code. If you take a & reference to something, you know the thing underneath won't change (unless the type uses an escape hatch).

In practice, having rules that apply uniformly to everything makes understanding code simpler, and means you don't have to think differently when dealing with integers vs. allocated types. Rust actually focuses on this a lot: Box<T>, for instance, behaves exactly like a T in pretty much every way, except for the fact that it's heap allocated. It's created just like any other wrapper, and you don't have to worry about freeing it since it's owned just like the value inside it.

With that said, your point here is valid:

And this is why Rust provides escape hatches. @Yandros mentioned Cell earlier: it is essentially you telling rust "this type is simple, I don't want regular borrow rules".

It is truly zero-overhead: it's just a type change, no extra data is stored. The only difference is that it allows you to mutate simple things without messing with mutable borrows.
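One way to check the "no extra data is stored" part yourself is to compare sizes. This is only a small sketch, and the function names are hypothetical:

```rust
use std::cell::Cell;
use std::mem::size_of;

/// Hypothetical check: a `Cell<i32>` occupies exactly as many bytes as an `i32`.
fn cell_has_no_size_overhead() -> bool {
    size_of::<Cell<i32>>() == size_of::<i32>()
}

/// Mutating through a shared reference: note `&Cell<i32>`, not `&mut`.
fn bump(counter: &Cell<i32>) {
    counter.set(counter.get() + 1);
}
```

Calling bump(&c) twice on a Cell::new(0) leaves c.get() at 2, even though c was never declared mut.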

Just like you mentioned, too, Cell requires a specific trait: Copy - a trait for things which have no destructors and can be memcopied to create a new instance.

The only difference is that this is an explicit opt-in allowance. It's kind of like how Rust makes mutability opt-in: sure, there would be less typing if every variable was mutable, but it makes it harder to reason about what the code is meant to do. That extra code to mark a type as mutable, or as escaping-the-borrow-rules-due-to-a-simple-type means the reader has fewer possibilities to keep in their head at any given time.

6 Likes

Apparently, Cell doesn't actually require Copy. It's just that you can only call get on it if it contains copy data.

3 Likes

Yeah. For non-Copy types, you can use swap instead.
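For illustration, here is a small sketch with a Cell<String> (String is not Copy, so get is unavailable). It uses Cell::replace, and the usage note below also shows Cell::swap and Cell::take; the function name rename is made up for the example.

```rust
use std::cell::Cell;

/// Hypothetical helper: stores `new_name` in the cell and hands back the
/// previous contents, all through a shared reference.
fn rename(name: &Cell<String>, new_name: &str) -> String {
    // `replace` moves the new value in and the old value out,
    // so it works without `Copy`.
    name.replace(new_name.to_string())
}
```

With a = Cell::new(String::from("left")) and b = Cell::new(String::from("right")), calling a.swap(&b) exchanges the contents, rename(&a, "renamed") then returns "right", and a.take() returns "renamed" while leaving an empty String behind (take needs T: Default).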

2 Likes

The argument to treat single-threaded shared mutable state the same as multi-threaded shared mutable state was already made, and I provided a link to one such argument.

The real case against the code in the OP is that the compiler wants to optimize around the guarantee that data cannot be changed behind &T (except for UnsafeCell). RalfJung has written about optimizations and undefined behavior in the past; this article comes to mind: https://www.ralfj.de/blog/2017/07/14/undefined-behavior.html

And these posts on the LLVM blog are invaluable:

8 Likes

Now that I have all these reasons, I finally understand. I also see now that this is actually not that far from the vec example after all. It's not about whether the memory it points to is valid, but rather about what value it points to. It could be a pain to debug something like that.

I also agree, that steepening the learning curve is not really helpful at this point.

Meanwhile I have encountered the reverse case:

struct Obj{
    v: Vec<i32>,
}

impl Obj{
    fn bla(&self){
        println!("bla");
    }
}

fn main() -> (){
    let mut obj = Obj{v: vec![34, 53, 11, 4]};
    obj.v.iter_mut().for_each(|it| {
        println!("val: {}", it);
        obj.bla();
        *it += 1;
    });
}

The obj is being borrowed mutably for the iterator. Inside the closure the same obj is being used immutably. How do I tell the compiler that obj.bla() is safe to use? It won't affect the iterator in any way.
This is of course reduced to its base issue.
In a real world application v corresponds to a buffer of pixels and bla() would make sure that the passed pixel is within bounds.

As far as I understand Cell won't help here, because I would need a get(), but Copy is not implemented. It wouldn't make sense to clone the whole struct anyway, because it would then clone the entire vector as well.

1 Like

Calling instance methods within a mutable iterator is indeed challenging. In the contrived example, bla could be moved to a bare function, or even a static method and called with Obj::bla() so that it does not borrow self. There was some open discussion on this specific class of issues a while ago: Blog post series: After NLL -- what's next for borrowing and lifetimes?
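A minimal sketch of the "static method" variant, assuming bla really does not need any data from self (the wrapper function bump_all is invented here so the loop can be called on any Obj):

```rust
struct Obj {
    v: Vec<i32>,
}

impl Obj {
    // No `self` parameter: calling this does not borrow any `Obj`.
    fn bla() {
        println!("bla");
    }
}

/// Hypothetical wrapper around the loop from the question.
fn bump_all(obj: &mut Obj) {
    obj.v.iter_mut().for_each(|it| {
        println!("val: {}", it);
        Obj::bla(); // fine: nothing here touches `obj` while `obj.v` is borrowed
        *it += 1;
    });
}
```

This only helps when the method needs nothing from the instance. As soon as it reads a field, the borrow conflict is back and you need one of the other approaches.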

3 Likes

As you said, the Obj::bla() approach is only reasonable in that example.
So I wanted to give a more accurate representation of the situation.

struct Img{
    pixels: Vec<i32>,
}

impl Img{
    fn in_bounds(img: &Img, x: usize) -> bool{
        if x < img.pixels.len(){
            return true;
        }
        false
    }
}

fn main() -> (){
    let mut img = Img{pixels: vec![34, 53, 11, 4]};
    img.pixels.iter_mut().enumerate().for_each(|it| {
        let (x, val) = it;
        if Img::in_bounds(&img, x + 1) {
            println!("x + 1 is within bounds");
        }else {
            println!("x + 1 is out of bounds");
        }
        
        *val += img.pixels[x+1];
        println!("val: {}", val);
    });
}

I think this is one of the situations which really is a downside of rust's ownership model. Specifically, the most idiomatic solution to this is avoiding it through a different design.

Luckily, it is pretty much always possible to split data out in such a way that you don't have to mutate a struct you're also reading data from. Sometimes that means creating sub-structs for the portion of the data you need to access separately. If it's simple data you're reading, you can have a method which returns a different Copy structure which can be used separately. I think this is the best solution for your last post.
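Here is roughly what I mean by a separate Copy structure. This is only a sketch: the Bounds type, the bounds method and the function count_with_right_neighbour are all names invented for this example, not anything from your code.

```rust
/// Hypothetical snapshot of the data the bounds check needs.
#[derive(Clone, Copy)]
struct Bounds {
    len: usize,
}

impl Bounds {
    fn contains(self, x: usize) -> bool {
        x < self.len
    }
}

struct Img {
    pixels: Vec<i32>,
}

impl Img {
    /// Borrows `self` only for the duration of the call; the result is
    /// an independent value.
    fn bounds(&self) -> Bounds {
        Bounds { len: self.pixels.len() }
    }
}

/// Increments every pixel that has a right-hand neighbour and returns
/// how many pixels were changed.
fn count_with_right_neighbour(img: &mut Img) -> usize {
    let bounds = img.bounds(); // the shared borrow of `img` ends here
    let mut count = 0;
    for (x, val) in img.pixels.iter_mut().enumerate() {
        if bounds.contains(x + 1) {
            *val += 1;
            count += 1;
        }
    }
    count
}
```

This covers the bounds check only. Reading a neighbouring pixel inside the loop is a different matter, since that needs access to the very data being iterated mutably.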

But in general, the solution is to design your data differently so you don't need to do this. I would say it's one of the biggest ways Rust differs from classical OOP besides the lack of inheritance. In Rust, creating data structures can be just as much a pragmatic design decision as it is a decision to group similar data together.

For another example, this is why the ECS pattern is so widespread in Rust game development. The ownership and mutation rules make it almost impossible to use OOP-style-stored game data.

I could point to some arguments that the style rust enforces is better, or cleaner, or leads to fewer bugs. However, I think all of those reasons are second to the fact that rust's rules make it very inconvenient to work with this type of system. Rust strongly discourages having two pieces of data, which need simultaneous write or read/write access, stored in the same place.


Having said that, I don't want to discourage your usage of Rust! When I say the solution is to design differently, that's because a different design is almost always possible - and when it isn't, there are further workarounds available. I've used Rust enough that this kind of thing is natural, but it does take some time getting used to. And I'd definitely say the pain of changing design patterns is worth the reward of safety and clear ownership.

As a last note - changing the code up entirely isn't necessary very often. It might lead to a more idiomatic solution, but usually there's also a "quick fix" possible. Like in your last example, you can pull out the size before iterating. Or in the previous one, make the non-static method static. If you're designing a game engine, the best solution is to simply use ECS - but if you're writing a utility function which happens to access two bits of data, just getting both bits earlier is easier than creating two new structures for the two sides.

If you want to look at some of Rust's history with this debate, I recommend looking up "partial borrows". There have been a number of proposals to allow not exactly something which fixes your code, but something similar. I think this issue, rfcs/issues/1215, is one of the earliest discussions, and most other proposals are linked there.
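For context on what already works today: inside a single function the borrow checker tracks fields separately, so borrowing two different fields directly is accepted. The sketch below shows that, using a made-up gain field and apply_gain function. As far as I understand it, what the partial-borrow proposals are after is the case where one of those accesses goes through a method taking &self, which borrows the whole struct.

```rust
struct Img {
    pixels: Vec<i32>,
    gain: i32, // hypothetical extra field, only here for the example
}

/// Multiplies every pixel by `img.gain`.
fn apply_gain(img: &mut Img) {
    let gain = &img.gain;            // shared borrow of one field
    for p in img.pixels.iter_mut() { // exclusive borrow of a different field
        *p *= *gain;
    }
    // Replacing `&img.gain` with a call to a `fn gain(&self) -> &i32`
    // method would be rejected, because that borrows all of `img`.
}
```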

6 Likes

When exclusive-access mutation is too cumbersome, you can often opt out of its "power & responsibility" and go back to good old shared mutable state (in a single-threaded context).

use ::core::cell::Cell;

struct Img {
    pixels: Vec<Cell<i32>>,
}

impl Img {
    fn in_bounds (img: &'_ Img, x: usize)
      -> bool
    {
        x < img.pixels.len()
    }
}

fn main ()
{
    let img = Img {
        pixels: [34, 53, 11, 4].iter().copied().map(Cell::new).collect(),
    };
    img.pixels.iter().enumerate().for_each(|(x, val)| {
        if Img::in_bounds(&img, x + 1) {
            println!("x + 1 is within bounds");
            val.set(val.get() + img.pixels[x + 1].get());
        } else {
            println!("x + 1 is out of bounds");
        }
        println!("val: {}", val.get());
    });
}
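A related variant, in case changing the field type to Vec<Cell<i32>> is not an option: the standard library can view an exclusively borrowed slice as a slice of cells via Cell::from_mut and as_slice_of_cells. The sketch below applies that to a plain slice; the function name add_right_neighbour is hypothetical.

```rust
use std::cell::Cell;

/// Adds to each element the value of its right-hand neighbour, if any.
fn add_right_neighbour(pixels: &mut [i32]) {
    // View the exclusively borrowed slice as shared, individually mutable cells.
    let cells: &[Cell<i32>] = Cell::from_mut(pixels).as_slice_of_cells();
    for (x, val) in cells.iter().enumerate() {
        // `get(x + 1)` doubles as the bounds check.
        if let Some(next) = cells.get(x + 1) {
            val.set(val.get() + next.get());
        }
    }
}
```

On [34, 53, 11, 4] this leaves [87, 64, 15, 4]: each element reads its neighbour before that neighbour has been updated, and the last element has no neighbour.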

But as it's been mentioned, usually there are ways to tackle the issue and still be able to use exclusive access references. Here, for instance:

  • mixing direct usage of a struct's internals (a Vec here) with partial abstraction, such as Img::in_bounds(&img, x + 1), indeed works poorly in Rust, at least when exclusive (&mut) references are involved.

  • *val += img.pixels[x+1]; this line could actually lead to a data race, provided the iteration is refactored into happening in parallel, despite the functions having the same signatures: Playground (in Rust, an iterator is able to yield access to its different elements concurrently, so the borrowing restriction here is justified).

  • One good way to avoid issues with borrows that are too restrictive is to sometimes use indices instead of references:

    struct Img {
        pixels: Vec<i32>,
    }
    
    fn main ()
    {
        let mut img = Img {
            pixels: vec![34, 53, 11, 4],
        };
        let len = img.pixels.len();
        (0 .. len).for_each(|i| {
            if i + 1 < len {
                img.pixels[i] += img.pixels[i + 1];
            }
            println!("val: {}", img.pixels[i]);
        });
    }
    
4 Likes