I am a little confused about memory reordering when it comes to SeqCst. The documentation for Ordering (std::sync::atomic::Ordering) says that SeqCst behaves like Acquire/Release/AcqRel, along with extra features.
This would lead me to think that the following is true. Suppose in one thread T we have a Relaxed store of 666 to an AtomicU64 A, followed by a SeqCst store of 777 to an AtomicU64 B.
Suppose in another thread U, we do a SeqCst load on B and get the same 777 value. We subsequently do a Relaxed load on A.
Are we guaranteed to see 666? If not, why?
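For concreteness, here is a minimal sketch of what I mean (the statics A and B and the functions t and u are just placeholders for illustration):

use std::sync::atomic::{AtomicU64, Ordering::{Relaxed, SeqCst}};

static A: AtomicU64 = AtomicU64::new(0);
static B: AtomicU64 = AtomicU64::new(0);

// thread T
fn t() {
    A.store(666, Relaxed);
    B.store(777, SeqCst);
}

// thread U
fn u() {
    if B.load(SeqCst) == 777 {
        // the question: is this guaranteed to read 666?
        let _a = A.load(Relaxed);
    }
}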
ChatGPT suggests using a Release store on A in T, and an Acquire load on A in U. I am not convinced that this would make any difference.
If ChatGPT is correct, why would Acquire/Release prevent reordering after/before the SeqCst operations, and why would the Relaxed operations potentially be reordered?
If, as in the linked documentation, SeqCst works like Acquire/Release/AcqRel, then why would we need Release/Acquire on A instead of Relaxed?
The reason why I'm asking this is that I am storing (Relaxed) to an atomic before calling send on a tokio watch sender, and loading (Relaxed) from the same atomic after the changed method on the watch returns. I know that tokio watch uses SeqCst internally. Therefore I would hope that the Relaxed store to the atomic before calling send on the watch would be visible to the Relaxed load performed after changed has returned.
If what I am currently doing does not solve this, what is the best way to enforce memory ordering on atomics when using tokio watch?
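To make the pattern concrete, this is roughly what I am doing, heavily simplified; FLAG, produce and consume are made-up names, and the watch payload here is just ():

use std::sync::atomic::{AtomicU64, Ordering::Relaxed};

static FLAG: AtomicU64 = AtomicU64::new(0);

// producer side
fn produce(tx: &tokio::sync::watch::Sender<()>) {
    FLAG.store(666, Relaxed);
    let _ = tx.send(()); // notify the watch receivers
}

// consumer side
async fn consume(rx: &mut tokio::sync::watch::Receiver<()>) {
    if rx.changed().await.is_ok() {
        // the load I hope is guaranteed to see 666
        let _v = FLAG.load(Relaxed);
    }
}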
Yes, you are right, the result is the same, as long as A and B are the only concern. The store of 666 to A and the load from A are synchronized via B, so both the load and the store can be Relaxed. I think the AI just isn't smart enough to understand the context.
The memory ordering is fine, but why don't you just send the value through the watch channel directly?
This is guaranteed, but it has nothing to do with the extra properties provided by SeqCst, and it doesn't even have anything to do with the access to A being atomic. The same thing can be achieved with an ordinary write of 666 to a u64 (in an UnsafeCell, or behind a shared *mut u64 pointer), followed by a Release write of 777 to B: AtomicU64. If the other thread does an Acquire load on B (and makes sure not to touch A in case the 777 isn't there[1]) and then reads A, that's completely fine (i.e. not UB), and guaranteed to give the 666 as far as I'm aware.
[1] With a Relaxed atomic used for A instead, this condition/restriction becomes unnecessary, i.e. you don't get UB if you load from A unconditionally, and you can reason about the "if one read gave 777 then the other must have given 666" implication afterwards.
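For illustration, a minimal sketch of that non-atomic variant (the Shared wrapper and its unsafe impl Sync are just scaffolding I'm assuming here to make the static compile):

use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicU64, Ordering::{Acquire, Release}};

// Plain, non-atomic storage shared between threads.
struct Shared(UnsafeCell<u64>);
unsafe impl Sync for Shared {}

static A: Shared = Shared(UnsafeCell::new(0));
static B: AtomicU64 = AtomicU64::new(0);

fn writer() {
    unsafe { *A.0.get() = 666 }; // ordinary (non-atomic) write
    B.store(777, Release);       // publishes the write above
}

fn reader() {
    if B.load(Acquire) == 777 {
        // The Acquire load synchronized with the Release store, so this
        // non-atomic read is not a data race and must see 666.
        let a = unsafe { *A.0.get() };
        assert_eq!(a, 666);
    }
    // Reading A unconditionally here could be a data race (UB).
}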
Thanks for the explanation. Sending the value through the watch is a good idea, but the atomic is shared elsewhere. It is easiest just to use an atomic currently.
SeqCst is stronger than AcqRel, so yes, it does also act as Release and Acquire do, for stores and loads respectively. The exact extra guarantee it gives is a bit subtle to describe once you get into how it interacts with other non-SeqCst operations. Generally, you can think of it like this: if you really need the simple "everything happens in a single global order across multiple threads and across multiple atomic variables" model, then all the operations you want to reason about in that way need to be SeqCst.
(In particular, if you only ever use SeqCst on a single atomic variable, then it doesn't give any extra guarantees to begin with; this is because for each atomic variable individually, there always exists a single global sequential order in which all operations on that variable occur, including even the Relaxed ones.)
As a side note, one way to think about this "extra guarantee" that stuck with me:
SeqCst is the same as (Acquire for loads/Release for stores) with one additional guarantee: SeqCst stores won't be reordered with subsequent SeqCst loads.
In other words, when thinking "Do I need SeqCst or is Acquire/Release enough?" you need to check if you need to prevent stores reordering with subsequent loads. If yes, use SeqCst. Otherwise Acquire/Release is enough. (As you mentioned, this is only relevant when considering stores/loads to different atomic variables. When dealing with one, SeqCst is never necessary.)
I'm far from expert on this, so I'd appreciate a reality check on my thoughts.
I can't really judge this statement, as I have no idea what you mean by "subsequent" here, or by "be reordered".
I can't really come up with any way of defining these terms that would make your description all that accurate, though, so maybe if we talk about it in more depth, your "reality" might indeed end up being "checked".
I mean, your description is definitely right, though, in emphasizing that it's about the interaction of multiple atomics - and I would add: multiple atomics that are both/all used for synchronization purposes. E.g. Arc, with its weak counter and strong counter, is actually quite a pain to reason about in its implementation, because there are two atomics and no SeqCst.
what you mean with "subsequent" here, or with "be reordered"
Looking at this code:
use std::sync::atomic::{AtomicUsize, Ordering::{Acquire, Release}};

static A: AtomicUsize = AtomicUsize::new(0);
static B: AtomicUsize = AtomicUsize::new(0);

fn t1() -> usize {
    A.store(1, Release);
    let lb = B.load(Acquire);
    lb
}

fn t2() -> usize {
    B.store(1, Release);
    let la = A.load(Acquire);
    la
}
By subsequent I mean: in the source code of both threads, the load comes after the store. There is a store, and then a subsequent load. I'm not a native speaker (obviously), so I'm not really sure what the best, unambiguous word to describe this is.
By reordered I mean: for t1, the memory model allows reading from B before storing to A (and for t2, reading from A before storing to B). Which means we can end up in a situation where both la and lb are 0.
And replacing Release/Acquire with SeqCst prevents "stores from being reordered with subsequent loads", which in turn means that la and lb cannot both be 0.
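For completeness, a self-contained sketch of the SeqCst variant (the thread spawning and the assertion are only added to make it runnable; la and lb correspond to the loads above):

use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};
use std::thread;

static A: AtomicUsize = AtomicUsize::new(0);
static B: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let h1 = thread::spawn(|| {
        A.store(1, SeqCst);
        B.load(SeqCst) // lb
    });
    let h2 = thread::spawn(|| {
        B.store(1, SeqCst);
        A.load(SeqCst) // la
    });
    let lb = h1.join().unwrap();
    let la = h2.join().unwrap();
    // SeqCst puts all four operations into one total order, so at least
    // one thread must observe the other's store: never la == lb == 0.
    assert!(la == 1 || lb == 1);
}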
quite a pain to reason about
Yeah, I didn't mean to suggest that reasoning about shared memory is simple. Even when just using SeqCst, considering all possible interleavings can be very challenging. With weaker orderings, the brain can very quickly start to melt.
Ah, thanks for elaborating! Only this added detail and concrete example allowed me (and presumably others) to really understand your point. I guess this captures a good mental model for some of the strengths of SeqCst. It looks like I didn't quite notice the (important) detail that your statement was specifically about store-then-load.
I suppose your characterization, rather than stating all (or some) of what's necessary to go from "mere AcqRel" up to SeqCst, is more fitting for going from "full SeqCst" down towards the full Acquire/Release subtleties and weirdness, while only partially getting there. And this makes sense - assuming memory is always some single global thing, and merely explaining "reordering" of source code, is probably a much easier mental model to reason with. It's still an operational model, whereas the real definitions from the C++ spec read more like a set of axioms, which one then uses to derive the desirable properties of the abstraction one is trying to build, like Mutex or Arc, which use atomics internally.
(Still, I would be surprised if it's actually sufficient for arguing that some Acquire/Release-using code is correct. I.e. there might be examples where SeqCst isn't sufficient but the mental model of "SeqCst + reordering by compiler optimizations" fails to demonstrate the problem; though I don't have such an example ready off the top of my head, unfortunately. I'm also curious whether there are any complete, operational-style descriptions of what the abstract C++ memory model really "looks like".)