As far as I understand, UnsafeCell opts out of Rust's immutability guarantee for shared references through the noalias attribute in LLVM. This was re-enabled with "Enable mutable noalias for LLVM >= 12 by nikic · Pull Request #82834 · rust-lang/rust · GitHub", but how did it work before? I can't find the code in the Rust compiler where this is handled, and godbolt returns the same LLVM IR for these two examples:
use std::cell::UnsafeCell;

pub fn safe_cell_example() {
    let x = UnsafeCell::new(42);
    let ptr = x.get();
    unsafe {
        let ref1 = &*ptr; // LLVM knows this memory might
        *ptr = 13; // be aliased, so it preserves
        println!("{}", ref1); // operation order
    }
}
and
pub fn unsafe_example() {
    let mut x = 42;
    let raw = &mut x as *mut i32;
    unsafe {
        let ref1 = &*raw; // LLVM might optimize/reorder these
        *raw = 13; // operations because it assumes
        println!("{}", ref1); // no aliasing possible
    }
}
Based on my understanding, I should see the noalias attribute on the second example, but both produce the same LLVM IR for me.
I am not an expert in what Rust does with LLVM exactly, but I believe the primary place noalias is used is function parameters. Write functions with parameters and you will see that only the second one has noalias.
use std::cell::UnsafeCell;
use std::ptr::addr_of;

pub fn unsafe_cell_example(x: &UnsafeCell<i32>) {
    let ptr = x.get();
    unsafe {
        let ref1 = &*ptr;
        *ptr = 13;
        println!("{}", ref1);
    }
}

pub fn unsound_example(x: &i32) {
    let raw = addr_of!(*x) as *mut i32; // writing through this pointer is UB
    unsafe {
        let ref1 = &*raw;
        *raw = 13;
        println!("{}", ref1);
    }
}
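To make the function-parameter point observable without reading IR, here is a minimal sketch in safe Rust. It uses Cell, which is a safe wrapper built on UnsafeCell; the function names are made up for illustration. The comments describe what the compiler is allowed to assume, not what any particular rustc version actually emits.

```rust
use std::cell::Cell;

// Hypothetical helper: both parameters are shared references to a type that
// contains an UnsafeCell, so `a` and `b` may refer to the same cell and the
// second read of `a` has to observe the write made through `b`.
pub fn read_through_cells(a: &Cell<i32>, b: &Cell<i32>) -> i32 {
    let first = a.get();
    b.set(first + 1);
    a.get()
}

// Hypothetical helper: `a` is a plain shared reference and `b` is a mutable
// reference, so safe callers cannot make them overlap. The compiler is
// allowed to assume that `*a` is unchanged by the write through `b`.
pub fn read_through_plain_refs(a: &i32, b: &mut i32) -> i32 {
    let first = *a;
    *b = first + 1;
    *a
}
```

Calling read_through_cells(&c, &c) with the same cell for both arguments is legal and returns the updated value. The borrow checker rejects the equivalent aliasing call for read_through_plain_refs, which is why the aliasing assumption is sound there.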
The PR you're linking has to do with the annotations that apply to mutable references. However, UnsafeCell only changes anything for shared/immutable references. So the PR is not relevant.
Also, your code is not okay. Run it under Miri and you get:
error: Undefined Behavior: trying to retag from <1569> for SharedReadOnly permission at alloc767[0x0], but that tag does not exist in the borrow stack for this location
--> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2404:1
|
2404 | fmt_refs! { Debug, Display, Octal, Binary, LowerHex, UpperHex, LowerExp, UpperExp }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| trying to retag from <1569> for SharedReadOnly permission at alloc767[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of retag at alloc767[0x0..0x4]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1569> was created by a SharedReadOnly retag at offsets [0x0..0x4]
--> src/main.rs:7:20
|
7 | let ref1 = &*ptr; // LLVM knows this memory might
| ^^^^^
help: <1569> was later invalidated at offsets [0x0..0x4] by a write access
--> src/main.rs:8:9
|
8 | *ptr = 13; // be aliased, so it preserves
| ^^^^^^^^^
= note: BACKTRACE (of the first span):
= note: inside `<&i32 as std::fmt::Display>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2394:71: 2394:78
= note: inside `core::fmt::rt::Argument::<'_>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:177:76: 177:95
= note: inside `std::fmt::write` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:1190:21: 1190:44
= note: inside `<std::io::StdoutLock<'_> as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:1839:15: 1839:43
= note: inside `<&std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:792:9: 792:36
= note: inside `<std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:766:9: 766:33
= note: inside `std::io::stdio::print_to::<std::io::Stdout>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1122:21: 1122:47
= note: inside `std::io::_print` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1232:5: 1232:37
note: inside `main`
--> src/main.rs:9:9
|
9 | println!("{}", ref1); // operation order
| ^^^^^^^^^^^^^^^^^^^^
= note: this error originates in the macro `fmt_refs` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
The problem is that ref1 is a normal immutable reference to a type that doesn't involve UnsafeCell. Therefore, the value it references must not change between creation and last use of the reference. Since *ptr = 13 happens between the creation and last use of ref1, you have UB.
For the second example this was my intention. What I am trying to see is how a raw pointer (*mut T) is different from an UnsafeCell. So, why is the second one UB, but the first one isn't?
Changed the examples to:
use std::cell::UnsafeCell;

pub fn main() {
    let x = UnsafeCell::new(42);
    let ptr = x.get();
    unsafe {
        let ref1 = &x; // LLVM knows this memory might
        *ptr = 13; // be aliased, so it preserves
        println!("{:?}", *ref1.get()); // operation order
    }
}
and
pub fn main() {
    let mut x = 42;
    let raw = &mut x as *mut i32;
    unsafe {
        let ref1 = &*raw; // LLVM might optimize/reorder these
        *raw = 13; // operations because it assumes
        println!("{:?}", ref1); // no aliasing possible
    }
}
It seems that in LLVM the ref1 of the first example is completely optimized away. Is that the "trick" that makes UnsafeCell work?
Yeah, I goofed up in the first post: the &*ptr obviously has nothing to do with UnsafeCell anymore. My intention was to do something like in the second post, to see the difference between &UnsafeCell<T> and *mut T.
In the first example, ref1 is a reference to the UnsafeCell itself, not to its contents. The actual access to the contents occurs only later, with *ref1.get().
In the second function, however, ref1 is a plain reference to the pointed-to value itself, and while that reference is "live" you modify the value it points to.
I would suggest you stop thinking in terms of LLVM: what is UB is defined in terms of a Rust-specific abstract machine and might not be reflected in the LLVM output that rustc generates.
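For comparison, here is a sketch of how the second function could be rearranged so that no shared reference is live across the write. The function name is made up for illustration, and this is only one possible arrangement: every access before the write goes through the raw pointer, and the reference is created only afterwards.

```rust
// Hypothetical variant of the raw-pointer example from this thread.
pub fn raw_pointer_only() -> (i32, i32) {
    let mut x = 42;
    let raw = &mut x as *mut i32;
    unsafe {
        // Read through the raw pointer; no reference exists yet.
        let before = *raw;
        // Write through the raw pointer.
        *raw = 13;
        // The shared reference is created after the write, so the value
        // it points to does not change while the reference is in use.
        let ref1 = &*raw;
        (before, *ref1)
    }
}
```

The only difference from the version above is the order of operations: the lifetime of ref1 no longer spans the write through raw.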
That actually makes sense, because they both run fine; it's just Miri that complains. Your answer also explains why I couldn't see ref1 in the IR. Thanks everyone for the help!