Understanding Acquire and Release memory ordering

Can someone please explain how this differs from when we would have used Relaxed? Thank you!

#[test]
fn test() {
    use std::sync::atomic::AtomicUsize;
    use std::sync::atomic::Ordering::{Release, Acquire};

    let x: &'static _ = Box::leak(Box::new(AtomicUsize::new(0)));
    let y: &'static _ = Box::leak(Box::new(AtomicUsize::new(0)));

    let h1 = std::thread::spawn(|| {
        let r1 = y.load(Acquire);
        x.store(r1, Release);
        r1
    });

    let h2 = std::thread::spawn(|| {
        let r2 = x.load(Acquire);
        y.store(42, Release);
        r2
    });

    let r1 = h1.join().unwrap();
    let r2 = h2.join().unwrap();

    println!("r1: {} r2: {}", r1, r2);
}

this example has many problems.

the "synchronizes-with" relation only exists when an Acquire load actually reads the value written by a Release store. in your code, you never check which value was loaded.

here's a minimal example of a Release/Acquire pair:

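use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering::{Acquire, Relaxed, Release};
use std::thread;

// lowercase names kept to match the discussion (rustc will warn about the style)
static x: AtomicUsize = AtomicUsize::new(0);
static y: AtomicUsize = AtomicUsize::new(0);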

// thread 1
thread::spawn(|| {
    if y.load(Acquire) == 42 {
        // synchronized with thread 2 in this branch
        let x1 = x.load(Relaxed); // whatever memory order, doesn't matter
        assert_eq!(x1, 1);
    } else {
        // no synchronization **at all** inside this branch
        let x2 = x.load(Relaxed);
        // x2 can be either 0 or 1
    }
});

// thread 2
thread::spawn(|| {
    x.store(1, Relaxed);
    y.store(42, Release);
});

here's what happens in this example:

  • in thread 2, the store to x "happens-before" the Release store of 42 to y because of the Release order;

  • in thread 1, the Acquire load of 42 from y "happens-before" the load operation of x1 because of the Acquire order;

  • the Release store of 42 in thread 2 "synchronizes-with" the Acquire load of the same value in thread 1;

  • transitively, the store to x in thread 2 "happens-before" the load operation of x1 from x in thread 1 (the memory order of these two operations does not matter, Relaxed is just fine).
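
the snippet above is only a fragment, so here is a runnable sketch of the same pattern. the helper name `message_passing_once` and the use of `thread::scope` are my own choices for illustration, not part of the original example. running it in a loop cannot prove the guarantee, it only shows that the assertion never fires in practice:

```rust
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering::{Acquire, Relaxed, Release};
use std::thread;

// Hypothetical helper: runs both threads once and returns what the reader
// observed as (value loaded from y, value loaded from x).
fn message_passing_once() -> (usize, usize) {
    let x = AtomicUsize::new(0);
    let y = AtomicUsize::new(0);

    thread::scope(|s| {
        // "thread 1": reads the flag y first, then the data x.
        let reader = s.spawn(|| {
            let seen_y = y.load(Acquire);
            let seen_x = x.load(Relaxed);
            (seen_y, seen_x)
        });

        // "thread 2": writes the data x, then publishes it through y.
        s.spawn(|| {
            x.store(1, Relaxed);
            y.store(42, Release);
        });

        reader.join().unwrap()
    })
}

fn main() {
    for _ in 0..1_000 {
        let (seen_y, seen_x) = message_passing_once();
        if seen_y == 42 {
            // synchronized with the writer, so its store to x must be visible
            assert_eq!(seen_x, 1);
        }
        // if seen_y == 0, seen_x may be either 0 or 1
    }
    println!("no violation observed");
}
```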

in your original example however:

  • in thread 1, the Acquire load of r1 from y "happens-before" the store to x;

  • in thread 2, the load from x "happens-before" the Release store of 42 to y;

  • if thread 1 saw the value 42 in r1, then the Release store in thread 2 "synchronizes-with" the Acquire load in thread 1;

    • in this case, x.load() "happens-before" y.store(42) in thread 2, which "synchronizes-with" r1 = y.load() in thread 1, which "happens-before" x.store(r1) in thread 1
    • so the load of r2 cannot see the later store to x, and the result would be r1: 42 r2: 0
  • if thread 1 saw the value 0 in r1, then the Acquire load does NOT synchronize-with thread 2;

    • so it must have read the initial value 0, which was written before either thread was spawned
    • thread 1 then stores 0 to x, so the result would be r1: 0 r2: 0

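in your example, the memory order for x does not matter, it will not change the outcome. it is the Release/Acquire pair on y that matters: if every operation were Relaxed, the memory model would also permit r1: 42 r2: 42, even though you are unlikely to observe it on common hardware.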
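
if you want to check the two possible outcomes yourself, here is a small sketch that runs your original test body repeatedly. the helper name `load_buffering_once` is hypothetical, and I swapped `Box::leak` for `thread::scope` so that repeated runs don't leak memory. a loop like this can only fail to find a counterexample, it is not a proof:

```rust
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering::{Acquire, Release};
use std::thread;

// Hypothetical helper: one run of the test from the question, returning (r1, r2).
fn load_buffering_once() -> (usize, usize) {
    let x = AtomicUsize::new(0);
    let y = AtomicUsize::new(0);

    thread::scope(|s| {
        let h1 = s.spawn(|| {
            let r1 = y.load(Acquire);
            x.store(r1, Release);
            r1
        });

        let h2 = s.spawn(|| {
            let r2 = x.load(Acquire);
            y.store(42, Release);
            r2
        });

        (h1.join().unwrap(), h2.join().unwrap())
    })
}

fn main() {
    for _ in 0..1_000 {
        let (r1, r2) = load_buffering_once();
        // r1 is either the initial 0 or the 42 stored by the second thread
        assert!(r1 == 0 || r1 == 42);
        // r2 is 0 in both cases discussed above
        assert_eq!(r2, 0);
    }
    println!("only the expected outcomes were observed");
}
```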

I think @nerditation answered your question elaborately.

I'm not trying to answer your question directly, but I've seen a strikingly similar example to yours in "Rust Atomics and Locks", in chapter 3, at the end of the section on Relaxed ordering. I recommend reading this chapter, and the whole book in general, to get a better understanding of atomics, processors, and OS primitives.

Hello, thank you for answering my question. In the 4th point of your example you mention that the load of x1 from x in thread 1 happens before the store to x. But then the assert would fail, wouldn't it, since the value of x1 would be 0?

Oo thanks a lot for the resource!

Also, what a clear explanation! I think I understand it now. Thank you once again!
Just one more question: how is the combination of these two orderings different from SeqCst?

sorry, it was a typo. I edited my previous post.

if you apply point 1 through 3 transitively, you would get: "the store to x in thread 2 'happens-before' the load from x in thread 1", not the other way around.

a Release/Acquire pair is a little bit weaker than SeqCst, at least in theory.

for an atomic store, if the memory order is Release, then all memory operations before it must be visible to any thread that "synchronizes-with" the store through a paired Acquire load. the compiler (and the CPU) may still move later memory operations ahead of the store, provided there is no data dependency or other synchronization. a SeqCst store is a Release store that additionally takes part in a single total order of all SeqCst operations, which every thread agrees on; in particular, it cannot be reordered with a later SeqCst load.

similarly, for an atomic load, Acquire prevents operations after the load from being moved before it, but operations before the load may still be moved after it. a SeqCst load is an Acquire load that also takes part in that same total order.
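
one pattern where the two visibly differ is the so-called "store buffering" test: each thread stores to one variable and then loads the other. with SeqCst, at least one thread must see the other's store. with Release stores and Acquire loads, both threads reading 0 is a permitted outcome. here is a sketch of the SeqCst version (the helper name `store_buffering_once` is hypothetical, and a loop like this illustrates the guarantee rather than proving it):

```rust
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering::SeqCst;
use std::thread;

// Hypothetical helper: one run of the store-buffering test, returning
// (what thread A loaded from y, what thread B loaded from x).
fn store_buffering_once() -> (usize, usize) {
    let x = AtomicUsize::new(0);
    let y = AtomicUsize::new(0);

    thread::scope(|s| {
        let a = s.spawn(|| {
            x.store(1, SeqCst);
            y.load(SeqCst)
        });

        let b = s.spawn(|| {
            y.store(1, SeqCst);
            x.load(SeqCst)
        });

        (a.join().unwrap(), b.join().unwrap())
    })
}

fn main() {
    for _ in 0..1_000 {
        let (ra, rb) = store_buffering_once();
        // whichever store comes first in the total order is visible
        // to the load in the other thread
        assert!(ra == 1 || rb == 1);
    }
    println!("never observed both loads returning 0");
}
```

if you replace SeqCst with Release for the stores and Acquire for the loads, the assertion is no longer guaranteed to hold.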

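in practice, how much the difference costs depends on the architecture, because the compiler may need to emit barrier instructions to enforce the order written in the source code. on weakly ordered architectures such as ARM this affects most orderings; on x86, Acquire loads and Release stores compile to plain loads and stores, but a SeqCst store still needs a full barrier (typically an xchg instruction).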

memory barriers are usually expensive compared to "normal" memory instructions (which typically only touch the local cache): roughly speaking, the core may have to stall until its pending stores have drained and become visible to the other cores through the cache coherence protocol.

understood, thanks a lot!