Unsafe rust and the borrow checker (multiple mutable borrows)

I've read about unsafe rust and that it does not disable the borrow checker. I thought that implies that I cannot have multiple mutable references to the same memory location. Even if I use unsafe blocks, but I think I've managed to get two mutable borrows of the same memory location:

fn main() {
    let mut i1 = 1;
    let pointer = &mut i1;
    let pointer2: &mut i32;
    unsafe {
        pointer2 = &mut *(pointer as *mut i32);
    }
    println!("Set *pointer2 = 2");
    *pointer2 = 2;
    println!("==> Value of {:<8} is {}","pointer", pointer); // Outputs 2
    println!("==> Value of {:<8} is {}","pointer2", pointer2); // Outputs 2
    println!("Set *pointer = 5");
    *pointer = 5;
    println!("==> Value of {:<8} is {}", "pointer", pointer); // Outputs 5
    println!("==> Value of {:<8} is {}", "pointer2", pointer2); // Outputs 5
}

Playground-Link: Rust Playground

Did I tricked the borrow checker or is this a legal operation in unsafe rust? Are there any "legal" use cases for this?

Greetings Seb

Try running this with Miri (under the tools tab). You'll see that this isn't allowed like this.

2 Likes

What is miri? I've asked because I'm wondering why the compiler does not complain about multiple mutable borrows (as it usually does). If the borrow checker does his work, it should complain about it, shouldn't it?

What is meant by "unsafe doesn't disable the borrow checker" is, that anything that is available in safe Rust and doesn't compile there still won't compile in unsafe Rust. What unsafe Rust does is increase the available features and only those additional features aren't part of the feature set, that is borrow-checked.

3 Likes

Programming language rules form the basis of a contract between the programmer (you) and the compiler. When the programmer breaks the rules an optimizing compiler is permitted to make any transformation when compiling the program. Often at first the program appears to work as the programmer expected, but that's just luck and appearances; there's no guarantee that the rules violation hasn't introduced malware vulnerabilities or subtle timing bugs, or that recompilation after a code change or new compiler release won't introduce such misbehavior.

In safe Rust the compiler checks that the programmer has not violated the contract. unsafe does not permit you to violate your contract with the compiler; it merely tells the compiler that, in limited circumstances, you have taken on the responsibility for checking that the contract is upheld.

To understand this better, think of the compiler as providing an initial checking phase followed by a subsequent optimization phase. If you use unsafe and lie to the compiler, claiming that you have upheld the language contract when you have not, the compiler optimization phase can, and often will, rearrange and optimize your code in a way that causes it to malfunction. That's not the compiler's fault, because it was caused by your own failure to uphold the language contract after you had used unsafe to partially take on that responsibility.

2 Likes

Consider the following small change to your code: (Playground):

#[inline(never)]
fn set(p: &mut i32, q: &mut i32, o: i32, v: i32)
{
    while *p == o {
        *q = v
    }
}

fn main() {
    let mut i1 = 1;
    let pointer = &mut i1;
    let pointer2: &mut i32;
    unsafe {
        pointer2 = &mut *(pointer as *mut i32);
    }
    println!("Set *pointer2 = 2");
    *pointer2 = 2;
    println!("==> Value of {:<8} is {}","pointer", pointer);
    println!("==> Value of {:<8} is {}","pointer2", pointer2);
    println!("Set *pointer = 5");
    set(pointer, pointer2, 2, 5);
    *pointer = 5;
    println!("==> Value of {:<8} is {}", "pointer", pointer);
    println!("==> Value of {:<8} is {}", "pointer2", pointer2);
}

what would you expect the output of this program to be? (Hint: run with "Release" in the playground)

2 Likes

Ah, that makes sense, but I think the documentation is misleading.

I see the point (I think). The compiler assumes that *p never changes it's value because *p should be an exclusive mutable reference and therefore replaces the while loop with a loop {}. But I know that we have structures to create "multiple" mutable references by using constructs like Arc<Mutex<Box<T>>>. How is this possible if this should be against the rules? (I know only one thread at a time can access the value, but doesn't this make it possible to have multiple mutable references)?

This sounds a little bit like circular reasoning. If the compiler enforces me to always uphold the contract (it doesn't let me do unsound things) but allows me to do this in unsafe code while holding me responsible for not violating the contract at the same time, why should somebody use unsafe rust at all if he or she is not allowed todo more things than in safe rust.

So the question is what am I allowed to do in unsafe rust which I'm not allowed to do in safe rust while upholding the contract with the compiler?

1 Like

In an unsafe block you gain the following "superpowers":

Dereference a raw pointer
Call an unsafe function or method
Access or modify a mutable static variable
Implement an unsafe trait
Access fields of unions

All other rules of safe Rust are respected, but if you violate those rules via one of the above actions, it's on you and not the compiler.

2 Likes

Here are some answers that I've posted before in URLO, which address your last two posts, plus a great Quote of the Week from @alice.

2 Likes

Thank you, I will look at the posts. I'm just a little bit confused, because there is a split_at_mut(idx) function in the standard library and it looks like it violates these contracts, doing similiar things like my code, but it seems they do not break optimization:

...
fn main() {
let mut a = [1,2,3,4,5];
let (b,c) = a.split_at_mut(3);
...

You need to look at the implementation of split_at_mut() to see why this is so: here and here. This is a good example of a programmer using unsafe correctly to perform a function too complex for the Rust compiler front-end to analyze while still complying with all the input requirements of the LLVM backend.

1 Like

I've inspected the source and it takes &mut self pointer (the slice) converts it to a raw mutable pointer and calls:

which returns a reference which is dereferenced and then mutable referenced again. What's the difference? Why is it safe to do that in these functions but not in my code?

The soundness rules for Rust must always be followed, even in unsafe code. If they are not followed, your unsafe code is simply incorrect. However, there are cases where the borrow checker is too strict and rejects code that does follow the soundness rules.

In the case of split_at_mut, it is correct because the two halves do not overlap with each other.

2 Likes

These structures make sure that there references are not coexisting (i.e. one of them is destroyed before other is created). If they didn't, yes, they would be unsound.

It’s not forbidden to have two mutable references to the same target if one is a so-called “reborrow” of the other. This can be done in safe rust

fn main() {
    let mut i1 = 1;
    let pointer = &mut i1;
    let pointer2: &mut i32;
    pointer2 = &mut *pointer;
    println!("Set *pointer2 = 2");
    *pointer2 = 2;
    /* no good */ // println!("==> Value of {:<8} is {}","pointer", pointer); // Outputs 2
    println!("==> Value of {:<8} is {}","pointer2", pointer2); // Outputs 2
    /* but it’s okay here */ println!("==> Value of {:<8} is {}","pointer", pointer); // Outputs 2
    println!("Set *pointer = 5");
    *pointer = 5;
    println!("==> Value of {:<8} is {}", "pointer", pointer); // Outputs 5
    /* no good */ // println!("==> Value of {:<8} is {}", "pointer2", pointer2); // Outputs 5
}

the point to notice here is that pointer is not usable while pointer2 ist still alive. This is why the first commented-out line won’t work, because pointer2 is used in the following line, so it must be still alive, and also why the final commented-out line won’t work. If you try adding either of the lines in you’ll get an error from the borrow checker. Now, a mutable reference created indirectly through a pointer that you got from another mutable reference, i.e. what your original code does, is totally fine. It’s going to behave like a re-borrow. However, the intermediate step of having a pointer takes the borrow checker out of the picture. While this code

fn main() {
    let mut i1 = 1;
    let pointer = &mut i1;
    let pointer2: &mut i32;
    unsafe {
        pointer2 = &mut *(pointer as *mut i32);
    }
    println!("Set *pointer2 = 2");
    *pointer2 = 2;
    /* no good */ // println!("==> Value of {:<8} is {}","pointer", pointer); // Outputs 2
    println!("==> Value of {:<8} is {}","pointer2", pointer2); // Outputs 2
    /* but it’s okay here */ println!("==> Value of {:<8} is {}","pointer", pointer); // Outputs 2
    println!("Set *pointer = 5");
    *pointer = 5;
    println!("==> Value of {:<8} is {}", "pointer", pointer); // Outputs 5
    /* no good */ // println!("==> Value of {:<8} is {}", "pointer2", pointer2); // Outputs 5
}

is totally fine, uncommenting one of the two commented-out lines will not result in a compiler error, yet it’s still something you’re disallowed to do. [“Disallowed” means it’s UB (“undefined behavior”) i.e. the compiler can do any kinds of nonsensical “optimizations” to your code, as explained above in the thread.] You can test the above code with miri and see how it only complains once you uncomment at least one of the two commented out lines.

A way of phrasing the rules for raw pointers here is: If you use a mutable reference again (in this case the reference pointer) that was used to create a raw pointer (the pointer as *mut i32), then that pointer is invalidated. Once that raw pointer is invalidated, you must no longer use any mutable reference that was created from dereferencing it anymore either (i.e. pointer2 must not be used anymore after pointer is first accessed again after the creation of the pointer as *mut i32 raw pointer).

As you can see, rules for raw pointers (*mut T) in Rust are really complicated and not even 100% finalized yet, so using unsafe in Rust is kind-of hard and in general, especially for some high-level application programming it’s a good idea to not use it if you don’t have to (and you “have” to use it way less commonly than you might think). Also, if you’re taking the source-code of the standard library as an example for unsafe Rust code, the standard library sometimes does things that you must not do as a user. It’s not the case for this example of split_at_mut, but it’s a useful thing to know. The standard library has some extra power in the form of: It’s written by the authors of the compiler and distributed and versioned together with the compiler, so they can rely on implementation details that are subject to change in the future and write code that’s “officially” UB but the compiler doesn’t do any “bad optimizations” to it yet. (In those cases, if user code was to use the same kind of code, it could break and introduce miscompilation in future compiler versions; the code in the standard library would be replaced / adapted together with those kinds of compiler updated.)

3 Likes

Another important point is that only &mut T references must be unique. There are other types of references such as &Cell<T> or Arc<Mutex<T>> or *mut T that do not have to be unique even if they allow mutation to happen through them.

OFFTOPIC:
I think that is something that really bothers me about rust. Because in my personal opinion a language should make it possible to write an efficient Standard Library without the need for "marker traits" or special attributes. It should not be tightly coupled to the compiler (as far as I know you do not need any special handling on the compiler side to be able to compile the C std library).

To quote @alice reply, I just think this is some sort of special handling. I'm wondering why this structures aren't built in types of the language if they cannot be fully reimplemented by anyone.

I'm used to code in C and it's really easy to do worksplitting (if you know what you're doing) and makes me really stumble upon rust, that I'm not able todo work splitting without many "workarounds" (which can be unefficient or unsound) to make the compiler happy. This is some reason I think you often have to use unsafe (to write efficient code).

I think this example is well known* (and does what it should do without any data races) but is not allowed by the compiler and requires the developer to use something like scoped threads or some more boilerplate (like Arc and Mutex) to achieve the same goal. None of them is really needed to ensure
correctness and safety in this case. Sidenote: Just a minimal example, normally you would operate on disjunctive chunks of data and do heavy operations in the threads

use std::thread;

// Multi-Threaded: does not work
fn main() {
    let mut full_work = [1,2,3,4,5];
    let mut h = vec![];
    for i in &mut full_work {
        h.push(thread::spawn(move || {
            *i = 1;
        }));
    }
    for i in h {
        i.join().unwrap();
    }
    println!("Full work: {:?}", full_work);
}

/* // Single threaded: works
fn main() {
    let mut full_work = [1,2,3,4,5];
    
    for i in &mut full_work {
        *i = 1;
    }
    
println!("{:?}", full_work);
}
*/
  • meaning I'm not the first one to stumble upon this

Well, they are needed because the compiler doesn't understand in detail what code does. It can only do “so much” with static analysis, it cannot - for example - know where in your code you're guaranteed to have called .join on every element in a vector before leaving the function. Hence it rejects the program, because sound static analysis tools need to be conservative. Note that you can add unsafe code to your example in order to promise to the compiler that you aren't doing anything wrong. (E.g. you can extend the lifetime of i.)

Using unsafe code is always tricky. Even your “well known” “and does what it should do” example above doesn't do what it should do AFAICT. (Explanation follows, please correct me if I'm wrong.) When the OS fails to create a thread, then one of the thread::spawn might fail and panic with the effect that previously spawned threads aren't joined but detached while the main thread unwinds out of the main function. Any of the detached threads still running that and accessing i would be a case of use-after-free. The example would be even more clearly wrong if this wasn't in the main function, but I believe even for main there is a point in time in the execution where the stack frame of main is dropped but unwinding has not finished and terminated the program yet.

The reason why people prefer to use things like scoped threads is because we like Rustʼs feature of “fearless concurrency” which means not having to deal with subtle bugs like the one described above. The creators of crossbeam::scope can handle all the subtle corner cases for us and we can stay within the realm of safe Rust. (Also, scoped threads are arguably even easier to use because you don't need to handle the vector fo handles and the calls to join yourself.)