This code is definitely unsound; I know very well I shouldn't ever write anything like it, and I won't:
fn main() {
    let atomic = std::sync::atomic::AtomicU8::new(42);
    std::thread::scope(|scope| {
        scope.spawn(|| {
            let value = unsafe {
                // non-atomic read racing with the atomic store below: UB
                *(&atomic as *const _ as *const u8)
            };
            dbg!(value);
        });
        atomic.store(42, std::sync::atomic::Ordering::Relaxed);
    })
}
However, I wonder why it is specified as undefined. I know undefined behavior enables optimizations, but I can't imagine an optimization that could take advantage of this specific UB in the specific case when the value being written is the same as the value already in memory.
The reading code cannot simply skip the read operation, because of prior synchronization (established here by calling spawn, but it could just as easily be an Acquire on a different atomic). The writing code will not corrupt the value, because the value is the same.
In other words, if someone were to propose to make the behavior defined what would be an argument against doing so?
And if you're wondering when this would be useful in practice: in no_std code it could be used as a poor man's Once. All threads that attempt to initialize it simply compute the same value and store it, and only after at least one thread has stored it completely is the value marked as valid for reading. This still wastes some computation (multiple threads doing the same thing), but it doesn't have the priority-inversion problem, which makes it much less bad. One way to work around the UB is to read the bytes using atomics every time, but that is wasteful for performance and may prevent passing a pointer to the value to foreign code.
If a value is written using an atomic and the value being overwritten is the same, then concurrently reading it non-atomically should just return that same value, provided the read is somehow synchronized with whatever put that value into the atomic before. Basically, the duplicate write is a no-op.
“provided the read is somehow synchronized” – how exactly do you propose to guarantee that it's synchronized? Again, x86 is a weirdo, but even it allows one to read things in a very strange order.
Most other architectures are worse.
Some embedded CPUs provide such guarantees, but most “big” CPUs don't.
It would be really strange to add to the language something that couldn't be used on most platforms where Rust is used today.
Since you are hitting a non-portable corner case, the safest way to do this is with asm. Then you are bound by your CPU's rules and not by Rust's rules.
“I can't imagine an optimization that could take advantage of this”
Every write to every byte through any pointer takes advantage of this!
If writing through a non-atomic pointer type had to do something specific in relationship to other atomics and other threads, then the optimizer could not reorder or simplify any code that uses pointers.
But because the optimizer can assume that no other thread is actively reading or writing memory accessible through a non-atomic pointer, it only needs to reason about the local, single-threaded side effects of using the pointer.
*x = 1;
if *x != 1 {
    // UB allows the compiler to assume this is dead code
}

*x = 1; // UB allows removing this write and the next one,
*x = 2;
*x = 3; // because nothing else may observe them before this store