Ordering::Relaxed vs Acquire+Release?

So far I've essentially just been using Ordering::Relaxed for all my AtomicBool needs without much thought. Would using Acquire with Release be a better option, since they seem to be slightly stricter about how loads and stores are handled?

This is not an easy question to answer, because this stuff is complicated. But you're doing the right thing by going into it and trying to learn more.

  • No single ordering can be recommended as the always-correct one.
  • That said, many Rust coders are very careful/defensive when picking atomic orderings, preferring the "strongest" one, SeqCst, when in doubt. Picking the weakest one would be the least careful choice! (With that said, not everyone agrees that it's smart to single out one ordering as the strongest.)

The Ordering is important because it deals with the question of synchronization: what is the value of everything else in relation to your atomic? What can you know for sure about the state of other variables (the state of your program) when you read true or false from your AtomicBool?

I imagine you have a model of expectations for what true and false in this storage location mean for your program. I mean, when you read true, what can you conclude about what's happened with the other variables?

Using the orderings correctly means that you can have a correct model that can answer that question - if you use the orderings incorrectly, then synchronization in your program does not work as intended.

There are examples of when Relaxed is the correct ordering. That would be when you're not synchronizing with anything. Maybe you just want to count the number of times something happens and not make decisions based on that. Then Relaxed will work fine.
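For instance, a pure statistics counter can use Relaxed safely, since no decision is made based on other data it might "guard". A minimal sketch (the counter name and thread/iteration counts are made up for illustration):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical global event counter: we only care about the total,
// never about its ordering relative to other memory.
static EVENTS: AtomicUsize = AtomicUsize::new(0);

fn run_counters() -> usize {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..1000 {
                    // Relaxed: the increment itself is still atomic; we just
                    // don't synchronize any *other* data with it.
                    EVENTS.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    EVENTS.load(Ordering::Relaxed)
}

fn main() {
    // No increment is ever lost, even with Relaxed.
    assert_eq!(run_counters(), 4000);
}
```

Note that Relaxed never makes the counter itself racy; it only withholds guarantees about other memory.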

5 Likes

Thanks for the insightful reply.

I am making decisions based on the state of the bool, more specifically whether a loop should break or not in a separate thread (GUI sets, loop breaks and restarts with new values from GUI). Other than the option chosen in a menu changing there's really no particular state I can guess the application is in. Everything else is read again and then the loop starts.

We can't really review your code here; even if it were posted, it's not necessarily something people would want to do. It's rare to get a voluntary, definite sign-off from someone on such a complicated question. :slightly_smiling_face:

I would suggest upgrading all the orderings you use there to SeqCst. If your program is moving at the speed of a GUI, surely there is no performance to lose. Then, after that, read more in the documentation and what it links to.
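A sketch of the pattern you described, using SeqCst throughout (all names here are made up; the spin loop stands in for your effect loop's real work):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical GUI/worker pattern: the GUI thread stores `true`,
// the worker loop polls the flag and breaks.
fn stop_demo() -> bool {
    let stop = Arc::new(AtomicBool::new(false));
    let worker = {
        let stop = Arc::clone(&stop);
        thread::spawn(move || {
            while !stop.load(Ordering::SeqCst) {
                std::hint::spin_loop(); // stand-in for the effect loop's work
            }
            true // the loop observed the stop request and broke out
        })
    };
    // "GUI" side: request the loop to break.
    stop.store(true, Ordering::SeqCst);
    worker.join().unwrap()
}

fn main() {
    assert!(stop_demo());
}
```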

The lock metaphor is quite useful for understanding Acquire/Release (you can find more about that in the nomicon). And locks - mutexes - are also synchronization primitives; if you use those, they come with related guarantees, and they are used to ensure synchronization.

Then, something to ponder: what values do you actually read? The synchronization situation determines which values you can actually observe. The two threads might have different ideas of the state of these variables; you need to use your model, the one that is supported by the synchronization primitives :slight_smile:
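To sketch the lock metaphor concretely (names made up for illustration): a Release store "publishes" everything written before it, like unlocking a mutex, and a matching Acquire load that sees the stored value "imports" those writes, like locking it.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn publish() {
    DATA.store(42, Ordering::Relaxed); // write the payload first
    // Like releasing a lock: everything written before this store becomes
    // visible to whoever Acquire-loads `READY` and observes `true`.
    READY.store(true, Ordering::Release);
}

fn consume() -> Option<u32> {
    // Like acquiring a lock: if we see `true`, the Release store (and
    // everything sequenced before it) happened-before this point.
    if READY.load(Ordering::Acquire) {
        Some(DATA.load(Ordering::Relaxed))
    } else {
        None
    }
}

fn main() {
    publish();
    assert_eq!(consume(), Some(42));
}
```

Run single-threaded here only to keep the sketch deterministic; the Acquire/Release guarantee is what makes the same reasoning hold across two threads.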

3 Likes

To expand on this, I think the two schools of thought are, first:

Always pick SeqCst unless benchmarks prove that a weaker ordering is indeed faster. The reasoning is that it is always correct to replace a weaker ordering with SeqCst, so by picking SeqCst you are more likely to arrive at a correct program.

second:

Always pick the weakest ordering that makes the program correct. The reasoning is that arriving at a correct concurrent program by chance is highly unlikely. To write a correct program, and to keep it correct under modification, you need to understand why it is correct and to have an informal proof of correctness. It is easier to prove that things are correct when the laxest orderings are used: to justify Relaxed, you argue that no synchronization is necessary at all (*); for Release/Acquire, you need to check that a pair of potentially racy operations on two threads synchronize via a specific release-store / acquire-load pair of operations; and SeqCst is notoriously hard to prove, as it establishes a global property, and only relatively tricky algorithms rely on it.

(*) Relaxed is easy for informal reasoning, but pretty hard for formal reasoning, as there's the issue of out-of-thin-air values.

6 Likes

Maybe you're using a Mutex here or something, and then the whole situation is much simpler. The necessary synchronization should then already be there.

Is this a "then just draw the rest of the owl" situation? :slightly_smiling_face:

There is a nice blog post about it: the summary at the bottom is a good "first approach" to understanding atomics.

  • (Memory acts as if each thread (CPU) had a local, transient "buffer" it needs to flush or update (the CPU cache): the "cheating" part in the blog. Atomics would not go through that cache, though even then the "timeline" of each atomic does not have to be consistent with that of another; that's where this model falls apart a bit… In order to deduce the state of other non-atomic data (or other Relaxed-atomic data), you need the read/write on the current atomic to update/flush the cache => hence the purpose of the Acquire and Release orderings.)

In that very blog post, we have:

  • with the information being fully contained within the atomic, may I add; in this instance, the stop unit signal itself (bool ~ Option<()>, so you can use an AtomicBool with Relaxed memory ordering to send a () signal to another thread).

Let's see a counter-example, now:

// thread that "sends the `Stop`" signal
RETRY_COUNT.fetch_add(1, atomic::Ordering::Relaxed);
MUST_RETRY.store(true, atomic::Ordering::Relaxed); // send the stop signal
// UI thread
let cached_retry_count = RETRY_COUNT.load(…);
assert_eq!(MUST_RETRY.load(…), false); // cached state is "old enough"

// …

if MUST_RETRY.swap(false, atomic::Ordering::Relaxed) { // if `MUST_RETRY` changed…
    let new_retry_count = RETRY_COUNT.load(atomic::Ordering::Relaxed);
    assert!(new_retry_count >= cached_retry_count + 1);
}

I believe, IIUC, that the assertion there could fail: while the reads of RETRY_COUNT themselves are well-defined to observe a monotonically increasing counter, nothing guarantees that the thread will necessarily read the "latest snapshot" of that value.

RETRY_COUNT would here be a "channel" through which counter increments are sent, and since the "counter increment" info is not part of the MUST_RETRY "channel" of () / unit stop signals, Relaxed memory operations on them give no guarantee w.r.t. the relative progress of these (thus concurrent) channels.

But there is a hierarchy in my example: it only makes sense to read RETRY_COUNT if a MUST_RETRY signal was sent; thus, it would make sense to make MUST_RETRY force the threads to "synchronize their watches" w.r.t. the RETRY_COUNT timeline (and any other timelines, for that matter):

  // thread that "sends the `Stop`" signal
  RETRY_COUNT.fetch_add(1, atomic::Ordering::Relaxed);
- MUST_RETRY.store(true, atomic::Ordering::Relaxed); // send the stop signal
+ MUST_RETRY.store(true, atomic::Ordering::Release); // send the stop signal
  • any past mutations in this thread are "flushed" into main memory, as well as their "timelines" (e.g., RETRY_COUNT increment)
  // UI thread

- if MUST_RETRY.swap(false, atomic::Ordering::Relaxed) {
+ if MUST_RETRY.swap(false, atomic::Ordering::Acquire) {
      let new_retry_count = RETRY_COUNT.load(atomic::Ordering::Relaxed);
      assert!(new_retry_count >= cached_retry_count + 1);
  }
  • "refreshes" / "reloads" the "cache" of such mutations / "timelines", making following reads (such as that of RETRY_COUNT) showcase the same causality / ordering consistency as that in the other thread when it sent the signal.
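Assembled into a minimal runnable sketch (polling here stands in for the real UI loop; note that the final join would also synchronize on its own, so this is only an illustration of the Release/Acquire pair):

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

static RETRY_COUNT: AtomicUsize = AtomicUsize::new(0);
static MUST_RETRY: AtomicBool = AtomicBool::new(false);

fn demo() -> bool {
    let cached_retry_count = RETRY_COUNT.load(Ordering::Relaxed);
    // Thread that "sends the `Stop`" signal.
    let sender = thread::spawn(|| {
        RETRY_COUNT.fetch_add(1, Ordering::Relaxed);
        MUST_RETRY.store(true, Ordering::Release); // send the stop signal
    });
    // "UI" thread: poll until the signal arrives.
    loop {
        if MUST_RETRY.swap(false, Ordering::Acquire) {
            // The Acquire swap synchronizes with the Release store, so we are
            // guaranteed to observe at least the increment made before it.
            let new_retry_count = RETRY_COUNT.load(Ordering::Relaxed);
            assert!(new_retry_count >= cached_retry_count + 1);
            break;
        }
        std::hint::spin_loop();
    }
    sender.join().unwrap();
    true
}

fn main() {
    assert!(demo());
}
```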

Back to your use case

This.

If you don't like the ergonomics / noise of using a Mutex, then use a more appropriate synchronization primitive than a Mutex. As I've mentioned at great length in my post, your AtomicBool is actually an AtomicCell<Option<()>>, which, with

struct RetrySignal;
  • or type RetrySignal = ();

reads as AtomicCell<Option<RetrySignal>>.

You are then storing a Some(RetrySignal) inside it, only for the UI thread to do if let Some(RetrySignal) = <load…>.
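Using only std (i.e., without reaching for crossbeam's AtomicCell), that view could be made explicit with a tiny wrapper; all names below are made up for illustration:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

struct RetrySignal;

// Hypothetical wrapper spelling out the `AtomicCell<Option<RetrySignal>>`
// view on top of a plain AtomicBool.
struct SignalCell(AtomicBool);

impl SignalCell {
    const fn new() -> Self {
        SignalCell(AtomicBool::new(false))
    }

    // Store `Some(RetrySignal)`.
    fn send(&self) {
        self.0.store(true, Ordering::Release);
    }

    // Take the current value, leaving `None` behind.
    fn take(&self) -> Option<RetrySignal> {
        self.0.swap(false, Ordering::Acquire).then_some(RetrySignal)
    }
}

fn main() {
    let cell = SignalCell::new();
    assert!(cell.take().is_none()); // nothing sent yet
    cell.send();
    cell.send();                    // a second send coalesces into the first
    assert!(cell.take().is_some()); // `if let Some(RetrySignal) = …` would fire
    assert!(cell.take().is_none()); // both sends were consumed as one signal
}
```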

With this view,

what you have here is a (coalescing) channel

So you could use an actual (non-coalescing) channel, if you don't mind that coalescing difference:

// thread that sends the stop:
let _ = sender_part_of_the_channel.send(RetrySignal);
// UI thread
match receiver_part_of_the_channel.try_recv() {
    Ok(RetrySignal) => { /* retry … */ },
    Err(TryRecvError::Empty) => {},
    Err(TryRecvError::Disconnected) => { /* handle this case as well */ },
}

EDIT: I think the idea was to restart, not to retry, so feel free to imagine all my mentions to TRY / Try are actually mentions to START / Start :sweat_smile:

5 Likes

Mutexes aren't used in my program, unless that refers to something I'm glossing over in the AtomicBool.

For more context, the program I'm writing is used to control effects on a keyboard. The data (relevant to restarting the thread) is acquired here, and is essentially just a [u8; 12] RGB array plus an extra u8 for the speed. Given that they are only read after either a Message::UpdateEffect or a Message::Refresh plus the current loop exiting, I believe it's safe to assume both threads will hold the same data, since any other change will also trigger a new reading of the values.

The loops relating to custom effects (aka spamming "please change color" signals) in the program are while loops with some probably-redundant if checks, managed by the AtomicBool set as mentioned above.

Hopefully that clears some things up. Still want to thank everyone in this thread, I'm gonna have some homework to keep me entertained for a few days :eyes:.

I get the using a channel part, but what does "Coalescing" mean? Otherwise thanks for the in-depth answer as well.

It's a nuance I realized right when finishing my first version of my post, which I think was important enough to deserve at least that parenthesized mention.

The difference may or may not matter in your case depending on how the UI and everything else works.

Let's imagine that one can "click", on the UI, twice on the "reset" button (or whatever it is), fast enough for the second click to happen before the reset / restart actually happens (this may be impossible or not depending on how the UI works).

  • If that button only does MUST_RESTART.store(true, …);, then the second call is a "no-op"1 / the operation is "idempotent" until the restart happens (and clears MUST_RESTART).

    1 modulo memory orderings / fences side-effects

  • But if that button does send an actual MustRestart signal across a channel, then the second click would enqueue a second such signal. The UI thread, in its .try_recv() call, would only dequeue / pop / consume one of those two signals, so one would remain in the queue, only to be triggered very quickly after the UI restarts and calls .try_recv() again. This would be behavior whereby the signals do not coalesce / merge into one, and, if the UI allowed it, it could lead to minor UI bugs (e.g., the double restart could cause some stuttering).

    The AtomicBool implementation does not suffer from this problem because the boolean state is not an actual queue (and thus not an actual channel): while it could conceptually be viewed as a channel, it would be one where the messages coalesce.
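The difference can be demonstrated directly. This sketch is single-threaded purely to make the counts deterministic, and all variable names are made up:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;

fn demo() -> (usize, usize) {
    // Coalescing: two "clicks" merge into a single pending signal.
    let must_restart = AtomicBool::new(false);
    must_restart.store(true, Ordering::SeqCst); // first click
    must_restart.store(true, Ordering::SeqCst); // second click (a no-op)
    let mut atomic_signals = 0;
    while must_restart.swap(false, Ordering::SeqCst) {
        atomic_signals += 1; // only observed once
    }

    // Non-coalescing: two sends enqueue two distinct signals.
    let (tx, rx) = mpsc::channel();
    tx.send(()).unwrap(); // first click
    tx.send(()).unwrap(); // second click
    let channel_signals = rx.try_iter().count(); // both are delivered

    (atomic_signals, channel_signals)
}

fn main() {
    assert_eq!(demo(), (1, 2));
}
```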


See also the docs for ::tokio's Notify and its .notify_one() method (I obviously thought of this abstraction for your use case, but it's an async-targeted one; I would be curious to know of a non-async variant of it).

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.