Unfortunately, explaining atomics in a way that is both accurate even in edge cases and simpler than the full standardese (or, if you're both unlucky and picky, a CPU diagram) is a difficult problem, and people who put the effort into writing such learning material generally want to get paid for the work.
Now we have two hard problems 
Typically, the preferred answer is to provide wrappers that implement the patterns at a higher level. For those, the Rust API docs are sufficient, and a wrapper that doesn't expose the complexity of atomic ordering is fundamentally trivial (which is probably why nobody actually provides this as a specialized type).
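To give an idea of how small such a wrapper can be, here is a minimal sketch of a one-shot flag that never asks the caller to choose an ordering. The type name Flag and its methods are made up for illustration; this is not a type from std or from any particular crate.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical one-shot flag: callers never pick a memory ordering.
pub struct Flag {
    set: AtomicBool,
}

impl Flag {
    pub const fn new() -> Self {
        Flag { set: AtomicBool::new(false) }
    }

    /// Publish: writes made before this call are visible to any thread
    /// that later sees `is_set()` return true.
    pub fn set(&self) {
        self.set.store(true, Ordering::Release);
    }

    /// Observe: this Acquire load pairs with the Release store in `set`.
    pub fn is_set(&self) -> bool {
        self.set.load(Ordering::Acquire)
    }
}
```

The Release/Acquire pairing is fixed inside the type, so the only thing a user can get wrong is forgetting to call set().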
You don't want to use a bare AtomicBool, because then the waiting thread has no way to wait other than spinning, and spinlocks are bad (same link as previous). But there's a library for that:
This allows you to do condvar/futex wait/notify on an AtomicU32 rather than a Mutex.
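For comparison, here is a std-only sketch of "sleep until the flag is set" without spinning. Note that it swaps in a different mechanism: it uses thread::park and Thread::unpark instead of a futex-style wait/notify on the atomic itself, so it is not the library's API, only an illustration of the waiting thread being put to sleep. The function name wait_for_flag is hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: spawns a waiter that sleeps until the flag is set,
/// then sets the flag and wakes it. Returns what the waiter returned.
pub fn wait_for_flag() -> bool {
    let ready = Arc::new(AtomicBool::new(false));
    let ready_for_waiter = Arc::clone(&ready);

    let waiter = thread::spawn(move || {
        // park() may return spuriously, so always re-check the flag.
        while !ready_for_waiter.load(Ordering::Acquire) {
            thread::park();
        }
        true
    });

    // Set the flag first, then wake the waiter. If unpark() runs before
    // the waiter parks, the next park() returns immediately.
    ready.store(true, Ordering::Release);
    waiter.thread().unpark();

    waiter.join().unwrap()
}
```

The limitation is that the notifier needs a handle to the specific thread it wants to wake, which is exactly what a wait/notify operation keyed on the atomic avoids.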
In general, you only have synchronization from stores to loads anyway. The other three combinations (store to store, load to load, load to store) fundamentally don't make any sense, because they have no way to tell whether the previous operation that they'd theoretically be synchronized with has happened. (fetch-op/compare-and-swap/etc. style operations are logically both a load and a store.)
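The store-to-load direction is the usual "message passing" pattern, sketched below. The function name and the value 42 are made up for illustration, and the reader yields in a loop only to keep the sketch short; it is not a recommendation to wait that way.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical demo: returns the value the reader observes after it has
/// seen the `ready` flag.
pub fn message_passing() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        d.store(42, Ordering::Relaxed); // the payload
        r.store(true, Ordering::Release); // the store side of the pair
    });

    // The load side of the pair. Once this Acquire load sees `true`,
    // everything the producer wrote before its Release store is visible.
    while !ready.load(Ordering::Acquire) {
        thread::yield_now();
    }
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}
```

The synchronization only exists because the load actually observed the value written by the store; that is the piece of information the other combinations lack.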
At least from my understanding of atomics, anything that isn't satisfied by AcqRel takes PhD-level complexity to show that the synchronization used is actually sufficient. (That said, SeqCst is still a safer default, because it means the reordering model of atomics actually works, whereas that isn't the case for AcqRel, since threads can observe synchronized events in different orders.)
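One small case where the difference shows up is the classic "store buffering" litmus test, sketched here under the assumption that each thread stores to its own flag and then loads the other one. The function name is hypothetical. With SeqCst on every operation, all four operations fall into one total order, so at least one thread must see the other's store.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical litmus test: each thread sets its own flag, then reads
/// the other thread's flag. Returns what each thread read.
pub fn store_buffering() -> (bool, bool) {
    let x = Arc::new(AtomicBool::new(false));
    let y = Arc::new(AtomicBool::new(false));

    let (x1, y1) = (Arc::clone(&x), Arc::clone(&y));
    let t1 = thread::spawn(move || {
        x1.store(true, Ordering::SeqCst);
        y1.load(Ordering::SeqCst)
    });

    let (x2, y2) = (Arc::clone(&x), Arc::clone(&y));
    let t2 = thread::spawn(move || {
        y2.store(true, Ordering::SeqCst);
        x2.load(Ordering::SeqCst)
    });

    // With SeqCst, (false, false) is impossible: whichever store comes
    // first in the single total order is visible to the other thread's load.
    (t1.join().unwrap(), t2.join().unwrap())
}
```

If the stores were Release and the loads Acquire instead, the memory model would permit both threads to read false, because nothing forces the two threads to agree on a single order of the two stores.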
Mixing Relaxed with AcqRel is probably achievable, but I'd stay away without a specific reason, not least because x86_64 concretely doesn't offer a weaker memory ordering than AcqRel.