What @dodomorandi outlined is essentially correct.
The important bit to understand is that the compiler is allowed to reorder certain operations or to condense multiple instructions, or unroll loops, etc. This is, in part, how optimization works. The only rule is that it cannot change the ‘end result’ for some limited area of code (I’m being hand-wavy on purpose, the details are far too much).
The CPU is allowed to do the same sorts of things. What the Ordering enum does is tell the compiler and the CPU which rules apply to which operations. Understand that this is very difficult to wrap one’s brain around. I always go back to the references before deciding which ordering is necessary.
SeqCst is the strongest ordering. A SeqCst read behaves like an Acquire read, a SeqCst write behaves like a Release write, and a SeqCst read-modify-write behaves like AcqRel. On top of that, all SeqCst operations in the program take part in a single total order that every thread agrees on, so two threads can never observe SeqCst-marked writes happening in conflicting orders.
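To make the "single order that every thread agrees on" idea concrete, here is a minimal sketch of the classic store-buffering test. The function name `store_buffering` and the variables `x` and `y` are made up for illustration. Each thread sets its own flag and then reads the other thread's flag; with SeqCst on all four operations, at least one of the two threads must see the other's flag set.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: runs the store-buffering test once and returns
/// what each thread read from the *other* thread's flag.
fn store_buffering() -> (bool, bool) {
    let x = Arc::new(AtomicBool::new(false));
    let y = Arc::new(AtomicBool::new(false));

    let t1 = {
        let (x, y) = (Arc::clone(&x), Arc::clone(&y));
        thread::spawn(move || {
            x.store(true, Ordering::SeqCst); // set my flag
            y.load(Ordering::SeqCst) // then look at the other one
        })
    };
    let t2 = {
        let (x, y) = (Arc::clone(&x), Arc::clone(&y));
        thread::spawn(move || {
            y.store(true, Ordering::SeqCst); // set my flag
            x.load(Ordering::SeqCst) // then look at the other one
        })
    };

    (t1.join().unwrap(), t2.join().unwrap())
}
```

With weaker orderings on these operations, the memory model would allow both loads to return false. With SeqCst, whichever store comes first in the global order is guaranteed to be seen by the other thread's later load.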
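Acquire applies to reads. It ensures that reads and writes that are semantically after the Acquire-marked instruction cannot be moved before it (but it forms no barrier against earlier operations moving after it). If the read sees a value written by a Release- or SeqCst-marked write in another thread, everything that thread wrote before that write will be visible too.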
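Release applies to writes. It ensures that reads and writes that are semantically before the Release-marked instruction cannot be moved after it (but it forms no barrier against later operations moving before it). It also ensures that everything this thread wrote beforehand will be visible to any thread whose Acquire- or SeqCst-marked read sees this write.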
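AcqRel combines Acquire and Release, which is very similar in effect to SeqCst, but not the same. It is meant for read-modify-write operations: the read half is Acquire and the write half is Release, so in effect it forms a barrier in both directions - earlier operations cannot move after the marked instruction, and later operations cannot move before it. It ensures that the operation sees what other threads published with their prior Release- or SeqCst-marked writes, and that its own write is visible to their Acquire- or SeqCst-marked reads. What it does not give you is the single global order that SeqCst provides.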
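The way an Acquire read pairs up with a Release write is easiest to see in a small message-passing sketch. The names here (`message_passing`, `data`, `ready`) are invented for the example, and a spin loop stands in for whatever waiting strategy real code would use.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: one thread publishes a value, the caller waits for it.
fn message_passing() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            // The payload itself can be written with Relaxed...
            data.store(42, Ordering::Relaxed);
            // ...because this Release store keeps the write above from
            // moving after it.
            ready.store(true, Ordering::Release);
        })
    };

    // Acquire: once this load sees `true`, everything the producer wrote
    // before its Release store is visible to this thread.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}
```

Calling `message_passing()` returns 42. If both operations on `ready` were Relaxed instead, the memory model would permit the final read of `data` to return 0.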
Relaxed applies no ordering rules to the operation. The only advantage of the atomic type here is that the read or write is guaranteed to complete atomically - no thread can ever observe a half-written value (which matters for values that are not loaded or stored in a single instruction, which depends on the platform). It is possible that Relaxed-marked operations do not see other threads’ changes, or that other threads do not see Relaxed-marked changes, until ‘eventually’. It is also possible for the compiler or the CPU to reorder Relaxed operations on different memory locations within a single thread with respect to one another.
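A plain counter is the typical case where Relaxed is enough, since only the atomicity of each increment matters and nothing else is being published through it. A minimal sketch, with `count_relaxed` and its parameters being names chosen for this example:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: `threads` workers each bump a shared counter
/// `per_thread` times using Relaxed increments.
fn count_relaxed(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Atomic, so no increment is ever lost, but no ordering
                    // is imposed on surrounding memory operations.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Joining a thread synchronizes with everything it did, so the final
    // Relaxed load below sees every increment.
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}
```

For example, `count_relaxed(4, 10_000)` returns 40000. The total is exact because each `fetch_add` is atomic; the ordering only affects what else is guaranteed to be visible around it.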
What all this means is that the CPU is limited in how it is allowed to optimize your code to ensure that your code operates as expected. You’re just providing the rules.
In general, however, for x86-based systems (including x86_64), because of the rules provided by the architecture, Acquire is free on reads, and Release is free on writes. There is no performance penalty paid, because the architecture will behave following those rules anyway.
If there is any doubt, SeqCst is the best bet, but in general: Acquire for reads, Release for writes, and AcqRel for read-modify-write (like fetch_add). That will nearly always handle what you need to happen.
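As a rough illustration of the read-modify-write case, here is a sketch in which each worker fills in its own slot and then bumps a shared counter with fetch_add using AcqRel; whichever worker finishes last adds up all the slots. Everything named here (`last_one_sums`, `WORKERS`, `slots`, `done`) is hypothetical and only serves the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

const WORKERS: usize = 4;

/// Hypothetical helper: returns the sum computed by the last worker to finish.
fn last_one_sums() -> usize {
    let slots: Arc<Vec<AtomicUsize>> =
        Arc::new((0..WORKERS).map(|_| AtomicUsize::new(0)).collect());
    let done = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..WORKERS)
        .map(|i| {
            let (slots, done) = (Arc::clone(&slots), Arc::clone(&done));
            thread::spawn(move || {
                slots[i].store(i + 1, Ordering::Relaxed);
                // Release half: publishes this worker's slot write.
                // Acquire half: makes the slot writes of every worker that
                // incremented `done` earlier visible to this worker.
                let finished_before = done.fetch_add(1, Ordering::AcqRel);
                if finished_before + 1 == WORKERS {
                    // Last one out: all slots are guaranteed to be visible.
                    let total: usize =
                        slots.iter().map(|s| s.load(Ordering::Relaxed)).sum();
                    Some(total)
                } else {
                    None
                }
            })
        })
        .collect();

    // Join every worker; exactly one of them returns Some(total).
    let results: Vec<Option<usize>> =
        handles.into_iter().map(|h| h.join().unwrap()).collect();
    results
        .into_iter()
        .flatten()
        .next()
        .expect("exactly one worker finishes last")
}
```

With four workers writing 1 through 4, `last_one_sums()` returns 10. Using only Release here would publish each slot but would not let the last worker see the others' writes, and using only Acquire would do the reverse, which is why the read-modify-write gets both.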
If that was in any way unclear, refer to the links dodo provided - they’re excellent! (though at least as dense as this - they’re the technical references I used to check myself here)
Normally (for other languages), I’d write a disclaimer about other, non-atomic, types here, but Rust’s guarantees do a pretty good job of preventing you from messing it up outside of