use std::thread;
use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering::Relaxed;
static mut DATA: u64 = 0;
static READY: AtomicBool = AtomicBool::new(false);
fn main()
{
let t1 = thread::spawn(||
{
unsafe { DATA = 123 };
READY.store(true, Relaxed);
});
let t2 = thread::spawn(||
{
let id = thread::current().id();
while !READY.load(Relaxed)
{
println!("waiting....from the thread {:?}", id);
}
println!("{} is loaded from the thread {:?}", unsafe{DATA}, id);
});
let t3 = thread::spawn(||
{
let id = thread::current().id();
while !READY.load(Relaxed)
{
println!("waiting....from the thread {:?}", id);
}
println!("{} is loaded from the thread {:?}", unsafe{DATA}, id);
});
t1.join().expect("Thread is panicked!");
t2.join().expect("Thread is panicked!");
t3.join().expect("Thread is panicked!");
}
I think this code snippet has data races.
Here is my reasoning: There is no way the t2 and t3 threads can access the DATA without t1 completing its execution. When t1 completes its execution READY becomes true, then the t2 and t3 threads can access the DATA. Both threads can access the DATA at the same time.
This is racy. You can ask miri on this by the way. If you are on the playground, in the upper right hand corner you have a dropdown button "Tools" where you find miri. When you run it, it will give you an error message saying there is a race condition on DATA.
Your reasoning has a gap in it; t1 executes READY.store(true, Relaxed) before it completes. At some time after t1 executes READY.store, but before t1.join() returns, all of the work done by t1 will complete.
However, it's permissible in the Rust abstract machine's memory model for READY.store(true, Relaxed) to be visible to t2 and t3 while t1 is still making DATA = 123 globally visible, and thus for the reads of DATA to race with t1's write to DATA.
If you use a Release ordering store, then the model of the abstract machine guarantees that after a later Acquire ordering load from the same atomic, all previous writes done in the thread that used a Release ordering store will be visible to the thread performing an Acquire ordering load.
And, indeed, if I make that change to jofas's Playground, you'll see that Miri is happy that the race has gone away.
When working with multiple threads and atomics you have to be careful because you might see old values when you read a memory location. Think of it like this: every byte of memory has a history of values that is written to it. Every write adds something to the history. The thing is, with atomics, there's no guarantee that you see the latest value. You could see some earlier value in the history. The only guarantee is that you can't go backwards in the history, so if you see a new value, then later reads on the same thread won't see an older value afterwards.
But this guarantee about not going backwards is specific to one memory location. When you include more than one, it falls apart. You might read X and see a new value, then read Y and see an older value. As long as X and Y are different memory locations, this is possible. In practice this can happen if the compiler reorders the read of X and Y, or if the CPU is reading from a cache and happens to read Y from a cache that still has an old value, but read X straight from main memory.
All of this also applies to non-atomic reads. It's just that if there's more than one possible choice in the history, then the read is a data race, which is immediately UB.
Let's take your code as example. Yes, the last thread might see the new value of READY. But what stops it from seeing an old value of DATA? Nothing. Because your code has two allowed choices when picking from the history of DATA, it's a data race.
If you change the atomic to use acquire/release, then it's okay, because if an acquire load sees the value written by a release store, then anything after the acquire load sees anything that happened before the release store. I'm this case, the write to DATA happened before the store, so the read of DATA must see the write. Therefore, there's now only one possible choice in the history, meaning there's no data race.
After t1 execution with Release ordering on READY, READY is now true, and DATA is 123. All other threads that access these two variables will observe the latest writes/updates.
The DATA is not atomic nor mutex, and two threads-- t2, t3--can now access it at the same time. Accessing the DATA by t2 and t3 exactly at the same is unlikely here as number of threads is too low, but it is likely when number of threads is many. Don't you think it is a race as two threads race to access it?
As afetisov comment mentioned that simple load of aligned data type are atomic on most CPU architecture so above mentioned race(if it is indeed race) is okay.
A mutex internally uses an atomic operation called compare-and-swap. Basically, you give it an old value and a new value, then it updates the value only if the old value is correct. If many threads call compare-and-swap in parallel, then only one of them will successfully perform a swap, and the others will fail because the old value will be wrong.
There's a variety of algorithms for mutual exclusion, but the most common way to implement it is to use compare_exchange atomics with the right ordering, plus OS support for sleeping.
For the simplest possible approach (real mutexes use more complicated approaches), you have a simple AtomicBool, set to false when the mutex is unlocked, true when it's locked. Locking the mutex is done via:
This uses primitive operations on atomics to get the behaviour we want, plus the spin_loop hint to tell the compiler that someone else should get the CPU if we fail to acquire the lock.
A real mutex does a few more complicated things:
Instead of using the spin_loop hint, you'll park your thread and enqueue yourself for unparking on unlock.
The lock state is now more complicated; typically, you track both locked/unlocked and someone parked or not in a single atomic.
Unlock will unpark a thread upon unlocking if there's someone parked.
You may have a fairness mechanism that ensures that someone can't grab the lock if unlock is going to unpark a thread.
This extra complexity exists so that you don't use CPU busy-waiting for the lock, but you also have a cheap lock if there's no other threads contending for the same lock; for the best case, where one thread takes the lock, does work, then releases the lock before another thread tries to take it, you want locking to be as close to free as possible, but you don't mind paying some costs if two threads want the same lock concurrently.