Correctly aliasing a *mut T

My friend had some unsafe code that wrote pixels into a buffer on one thread and displayed the buffer on another thread, and they are fine with data races / incomplete writes causing glitchiness. But the code aliased the buffer as a Vec in both threads, which seems like a violation of the aliasing rules. Out of curiosity, I tried to make a version that doesn't break the rules.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=70b339e4991e0791808955b6f4a5dbfa

Is it ok to mutate a *mut ptr from a &WriteBuffer or should there be an UnsafeCell involved? Did I mess up any other rules?

  1. ReadBuffer/WriteBuffer allow use-after-free access to the underlying buffer from safe code. It's always a bug when unsafe code makes it possible to violate the memory guarantees of the Rust language from safe code.

  2. A data race is UB, and UB means far more than runtime glitches. Unless told otherwise, the compiler aggressively optimizes memory accesses under the assumption that the memory region is not used by any other thread. This allows huge optimizations like merging redundant reads/writes, or eliminating them entirely if the value can be calculated upfront.

  3. Modifying data behind a &T without going through UnsafeCell is insta-UB. There's no exception to this rule.

In conclusion, your buffer types should contain &[AtomicI32] or Arc<[AtomicI32]>, without any unsafe involved. Since you mentioned you don't care about the glitches here, I recommend using atomic::Ordering::Relaxed for maximum performance.
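A minimal sketch of that suggestion, assuming a buffer of i32 pixels shared through an Arc<[AtomicI32]>. The names new_buffer, fill, snapshot and demo are hypothetical and not taken from the playground code; the point is only that every access goes through a Relaxed atomic load or store, so a half-written frame is a visual glitch rather than UB.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: a shared pixel buffer where every pixel is its own atomic.
fn new_buffer(len: usize) -> Arc<[AtomicI32]> {
    (0..len).map(|_| AtomicI32::new(0)).collect()
}

/// Writer side: store `value` into every pixel with Relaxed ordering.
fn fill(buf: &[AtomicI32], value: i32) {
    for px in buf {
        px.store(value, Ordering::Relaxed);
    }
}

/// Reader side: copy the pixels out. If a writer is running concurrently,
/// the copy may mix old and new pixels, but each load is well defined.
fn snapshot(buf: &[AtomicI32]) -> Vec<i32> {
    buf.iter().map(|px| px.load(Ordering::Relaxed)).collect()
}

/// Runs the writer on another thread and returns the buffer contents
/// as seen after the writer has been joined.
fn demo(len: usize, value: i32) -> Vec<i32> {
    let buf = new_buffer(len);
    let writer = {
        let buf = Arc::clone(&buf);
        thread::spawn(move || fill(&buf, value))
    };
    // A concurrent read is allowed here; it may observe a partial frame.
    let _maybe_partial = snapshot(&buf);
    // Joining the thread makes all of its stores visible to us.
    writer.join().unwrap();
    snapshot(&buf)
}
```

Calling demo(4, 7) spawns the writer, reads once while it may still be running, and then reads again after the join, at which point every pixel holds the written value.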


I removed my answer. Data races are UB. https://doc.rust-lang.org/reference/behavior-considered-undefined.html

You always need some synchronization primitive to write and read the same memory location from different threads.

Thanks to matklad.

Data races are still UB though, you can't really not care about them. You must use AtomicU32 and Relaxed loads/stores. If I am not missing something, the code in the playground contains UB.


I'll relay the atomics advice, but I'm curious about how to use *mut T correctly. I expected it would be possible for the compiler to re-order reads/writes, but I'm surprised by:

This allows huge optimizations like merging redundant read/write, or eliminate them all if the value can be calculated upfront.

I can imagine the compiler reasoning that writes to a Unique<T> can be elided if the pointer is dropped before being read. But the nomicon seems to state that aliasing *mut T is allowed so I wouldn't expect the compiler to ever assume that a bare *mut T is unaliased.

Threads probably complicate the question, so suppose I have a *mut u32 and send a copy to a C library that runs on the same thread. Is the Rust compiler free to reorder writes to that pointer? Can it reorder them past FFI calls? Can it elide writes entirely if it thinks they will never be seen? What is the correct way to alias a pointer in a single-threaded setting to guarantee sane behaviour?

And given that, from the point of view of the compiler looking at a single function, is there a difference between a pointer being aliased in a C library vs being aliased by another thread (in terms of code generation at least - obviously without atomics there will be hardware-level store/load reordering)?


If the compiler can prove that the *mut T came from a Unique<T>, and that it hasn't leaked outside the current context since the conversion from the Unique<T>, it is still free to make those optimizations.


IIRC, the way this one is handled is that if you send a *mut T to a function of unknown implementation, the compiler will assume that anything can happen to that pointer during execution of that function, and that it may be aliased from that point onwards. After all, the unknown code could have maliciously stored a copy of the pointer in a global variable, and subsequently try to use that copy in a different opaque function.

No-aliasing optimizations on *mut T will only trigger when the compiler can "see" every use of a pointer sufficiently well to prove that playing with pointer accesses and transforming the code accordingly is 100% invisible to the current thread.

As an aside, note that it is surprisingly hard to get to the point where a function is 100% guaranteed opaque to the compiler these days:

  • You cannot assume that written in C == opaque, because cross-language LTO is a thing nowadays. You need to control the build toolchain or use dynamic linking to be able to reach this conclusion.
  • You cannot assume that Rust's #[inline(never)] == opaque, because the compiler might add some internal pointer aliasing metadata to the function while processing the program in order to perform those sorts of optimizations without inlining.
  • Even with inline assembly, if you don't use absurdly broad clobbers like GCC's "memory" clobber, the compiler can still assume a lot about what's going on inside of the assembly snippet.

Basically, if the compiler knows about all uses of a pointer in the current thread, then it assumes that it knows about all uses of said pointer in all threads unless atomics are involved (volatile also has this effect, though its semantics differ in other respects).


This article seems relevant, they don't name Rust but I assume it still applies.


If you really want to go unsafe, as long as the pointer originated from a shared / aliased reference (&_) to an UnsafeCell<T>, then you can cast the *const UnsafeCell<T> to a *mut T and have this pointer be aliased and also allowed to write to the pointee. But if a write happens while another thread accesses it in parallel, then that is still UB. Relaxed atomics seem like the most sensible thing to try first.


As a fun "data races are never okay" Rust story, I once wrote an abstraction that purposely triggered an observable data race if accessed concurrently in order to check that some synchronization primitives that I wrote did protect against data races as intended. One test made sure that the data race was triggered on an unprotected variable of that type, another made sure that the data race was not triggered on a variable of that type protected by my synchronization primitive.

Net result: the compiler "optimized" the test that made sure that the data race could occur into an infinite loop, under the assumption that the data race could not happen.

Since then, I've redesigned that abstraction to write to two atomic variables spread very far apart in memory and check that the two atomics have consistent values upon readout. Works like a charm.
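A rough sketch of what such a two-atomics checker could look like. This is not the original code: RaceCell, its padding size, and protected_run_is_consistent are assumptions made up for illustration, and a Mutex stands in for the custom synchronization primitive being tested.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical race detector: a writer updates two atomics one after the
/// other, and a reader checks that they agree.
#[repr(C)]
struct RaceCell {
    first: AtomicUsize,
    // Assumption: padding to keep the two atomics far apart in memory.
    _pad: [u8; 4096],
    second: AtomicUsize,
}

impl RaceCell {
    fn new() -> Self {
        RaceCell {
            first: AtomicUsize::new(0),
            _pad: [0; 4096],
            second: AtomicUsize::new(0),
        }
    }

    /// Writes the same value to both atomics, in two separate steps.
    fn set(&self, v: usize) {
        self.first.store(v, Ordering::Relaxed);
        self.second.store(v, Ordering::Relaxed);
    }

    /// Returns true if both atomics currently hold the same value.
    fn is_consistent(&self) -> bool {
        self.first.load(Ordering::Relaxed) == self.second.load(Ordering::Relaxed)
    }
}

/// Hammers a Mutex-protected RaceCell from several threads and reports
/// whether every readout was consistent.
fn protected_run_is_consistent(threads: usize, iters: usize) -> bool {
    let cell = Arc::new(Mutex::new(RaceCell::new()));
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let cell = Arc::clone(&cell);
            thread::spawn(move || {
                let mut ok = true;
                for i in 0..iters {
                    let guard = cell.lock().unwrap();
                    guard.set(t * iters + i);
                    ok &= guard.is_consistent();
                }
                ok
            })
        })
        .collect();
    // Join every thread and combine the per-thread results.
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .fold(true, |acc, ok| acc && ok)
}
```

With the lock held around each set and check, no thread can see the two atomics disagree. Sharing a bare RaceCell across threads instead may let a reader catch the window between the two stores, and because every access is atomic, that inconsistency is observable without being UB.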


IIRC, the way this one is handled is...

Ah, that's totally clear now. Thanks :+1: