UnsafeCell implementation details

As far as I understand, UnsafeCell opts out of Rust's immutability guarantee for shared references by setting the noalias tag in LLVM. This has been re-enabled with the PR "Enable mutable noalias for LLVM >= 12" by nikic (Pull Request #82834, rust-lang/rust), but how did it work before? I can't find the code in the Rust compiler where this is handled, and Godbolt returns the same LLVM IR for these two examples:

use std::cell::UnsafeCell;

pub fn safe_cell_example() {
    let x = UnsafeCell::new(42);
    let ptr = x.get();
    unsafe {
        let ref1 = &*ptr;     // LLVM knows this memory might
        *ptr = 13;            // be aliased, so it preserves
        println!("{}", ref1); // operation order
    }
}

and

pub fn unsafe_example() {
    let mut x = 42;
    let raw = &mut x as *mut i32;
    unsafe {
        let ref1 = &*raw;     // LLVM might optimize/reorder these
        *raw = 13;            // operations because it assumes
        println!("{}", ref1); // no aliasing possible
    }
}

By my understanding I should see the !noalias attribute on the second example, but both produce the same LLVM IR for me.

I am not an expert in what Rust does with LLVM exactly, but I believe the primary place noalias is used is function parameters. Write functions with parameters and you will see that only the second one has noalias.

use std::cell::UnsafeCell;
use std::ptr::addr_of;

pub fn unsafe_cell_example(x: &UnsafeCell<i32>) {
    let ptr = x.get();
    unsafe {
        let ref1 = &*ptr;
        *ptr = 13;
        println!("{}", ref1);
    }
}
pub fn unsound_example(x: &i32) {
    let raw = addr_of!(*x) as *mut i32; // writing through this pointer is UB
    unsafe {
        let ref1 = &*raw;
        *raw = 13;
        println!("{}", ref1);
    }
}

The PR you're linking has to do with the annotations that apply to mutable references. However, UnsafeCell only changes anything for shared/immutable references. So the PR is not relevant.
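To illustrate what changes for shared references, here is a minimal sketch. The function names `bump_and_read` and `aliased_cells_demo` are made up for this example. Both parameters are shared references to an UnsafeCell, so they are allowed to point at the same memory, and a write through one has to be visible through the other:

```rust
use std::cell::UnsafeCell;

/// Increments through `a`, then reads through `b`.
/// The two shared references may alias, so the read must observe the write.
pub fn bump_and_read(a: &UnsafeCell<i32>, b: &UnsafeCell<i32>) -> i32 {
    unsafe {
        *a.get() += 1; // write through a raw pointer obtained from a shared reference
        *b.get()       // read through the other one; no &i32 or &mut i32 is ever created
    }
}

/// Passes the same cell for both parameters.
pub fn aliased_cells_demo() -> i32 {
    let c = UnsafeCell::new(1);
    bump_and_read(&c, &c)
}
```

With plain `&i32` parameters there would be no sound way to write this function at all, which is the guarantee the compiler is allowed to exploit.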

Also, your code is not okay. Run it under Miri and you get:

error: Undefined Behavior: trying to retag from <1569> for SharedReadOnly permission at alloc767[0x0], but that tag does not exist in the borrow stack for this location
    --> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2404:1
     |
2404 | fmt_refs! { Debug, Display, Octal, Binary, LowerHex, UpperHex, LowerExp, UpperExp }
     | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     | |
     | trying to retag from <1569> for SharedReadOnly permission at alloc767[0x0], but that tag does not exist in the borrow stack for this location
     | this error occurs as part of retag at alloc767[0x0..0x4]
     |
     = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1569> was created by a SharedReadOnly retag at offsets [0x0..0x4]
    --> src/main.rs:7:20
     |
7    |         let ref1 = &*ptr;     // LLVM knows this memory might
     |                    ^^^^^
help: <1569> was later invalidated at offsets [0x0..0x4] by a write access
    --> src/main.rs:8:9
     |
8    |         *ptr = 13;            // be aliased, so it preserves
     |         ^^^^^^^^^
     = note: BACKTRACE (of the first span):
     = note: inside `<&i32 as std::fmt::Display>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2394:71: 2394:78
     = note: inside `core::fmt::rt::Argument::<'_>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:177:76: 177:95
     = note: inside `std::fmt::write` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:1190:21: 1190:44
     = note: inside `<std::io::StdoutLock<'_> as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:1839:15: 1839:43
     = note: inside `<&std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:792:9: 792:36
     = note: inside `<std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:766:9: 766:33
     = note: inside `std::io::stdio::print_to::<std::io::Stdout>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1122:21: 1122:47
     = note: inside `std::io::_print` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1232:5: 1232:37
note: inside `main`
    --> src/main.rs:9:9
     |
9    |         println!("{}", ref1); // operation order
     |         ^^^^^^^^^^^^^^^^^^^^
     = note: this error originates in the macro `fmt_refs` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

The problem is that ref1 is a normal immutable reference to a type that doesn't involve UnsafeCell. Therefore, the value it references must not change between creation and last use of the reference. Since *ptr = 13 happens between the creation and last use of ref1, you have UB.
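For comparison, a minimal sketch of a variant that should not have this problem, assuming all you want is to write through the pointer and then read the value. The function name `write_then_read` is made up for this example. The shared reference is created only after the write, so nothing changes during its lifetime:

```rust
use std::cell::UnsafeCell;

/// Writes through the cell's raw pointer, then reads the value back.
pub fn write_then_read() -> i32 {
    let x = UnsafeCell::new(42);
    let ptr = x.get(); // *mut i32 pointing at the cell's contents
    unsafe {
        *ptr = 13;        // no &i32 to the contents is live at this point
        let ref1 = &*ptr; // shared reference created only after the write
        *ref1             // last use of ref1; the value did not change in between
    }
}
```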

For the second example this was my intention. What I am trying to see is how a raw pointer (*mut T) is different from an UnsafeCell. So, why is the second one UB, but the first one isn't?

Changed the examples to:

use std::cell::UnsafeCell;

pub fn main() {
    let x = UnsafeCell::new(42);
    let ptr = x.get();
    unsafe {
        let ref1 = &x;                 // LLVM knows this memory might
        *ptr = 13;                     // be aliased, so it preserves
        println!("{:?}", *ref1.get()); // operation order
    }
}

and

pub fn main() {
    let mut x = 42;
    let raw = &mut x as *mut i32;
    unsafe {
        let ref1 = &*raw;     // LLVM might optimize/reorder these
        *raw = 13;            // operations because it assumes
        println!("{:?}", ref1); // no aliasing possible
    }
}

It seems that in LLVM the ref1 of the first example is completely optimized away. Is that the "trick" that makes UnsafeCell work?

Godbolt: Compiler Explorer

In your original post, both are UB. Your updated examples are a different story.

Yeah, I goofed up in the first post; the &*ptr obviously has nothing to do with UnsafeCell anymore. My intention was to do something like in the second post, to see the difference between &UnsafeCell<T> and *mut T.

In the first example ref1 is a reference to the UnsafeCell itself, not to its contents, so creating it asserts nothing about the inner value. The actual access occurs only later with the *ref1.get().

In the second function, however, ref1 is a reference directly to the value, and while that reference is "live" you modify the value it points to.
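This is also how safe wrappers get built on top of UnsafeCell. A minimal sketch, assuming a hypothetical `TinyCell` type (not the standard library's Cell, just an illustration of the same idea): the methods take `&self`, go through `get()` for each access, and never hand out a reference to the contents.

```rust
use std::cell::UnsafeCell;

/// Hypothetical single-threaded cell for an i32, for illustration only.
pub struct TinyCell {
    value: UnsafeCell<i32>,
}

impl TinyCell {
    pub fn new(v: i32) -> Self {
        TinyCell { value: UnsafeCell::new(v) }
    }

    /// Mutates through a shared reference; allowed because of UnsafeCell.
    pub fn set(&self, v: i32) {
        unsafe { *self.value.get() = v; }
    }

    /// Copies the value out, so no reference to the contents escapes.
    pub fn get(&self) -> i32 {
        unsafe { *self.value.get() }
    }
}
```

Because the type contains an UnsafeCell it is not Sync, so the compiler already prevents it from being shared across threads.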

I would suggest that you stop thinking in terms of LLVM: what is UB is defined in terms of a Rust-specific abstract machine and might not be reflected in the LLVM output that rustc generates.

That actually makes sense, because they both run fine; it's just Miri that acts up. Your answer also explains why I couldn't see ref1 in the IR. Thanks everyone for the help!
