Can Acquire load to atomic be reordered before Release store to unrelated atomic?

There are atomics V1 and V2 with values 0 each.

Thread T1 does this:

// Step S1_1
V1.store(1, Release);

// Step S1_2
if V2.load(Acquire) == 1 {
   // Step S1_3
   println!("Thread 1")
}

Thread T2 does this

// Step S2_1
if V1.load(Acquire) == 0 {
    // Step S2_2
    V2.store(1, Release);
    // Step S2_3
    if V1.load(Acquire) == 0 {
        // Step S2_4
        println!("Thread 2")
    }
}

I want to get any of this results:

  1. print both Thread 1 and Thread 2
  2. print only Thread 1
  3. print nothing.

I want to avoid this situation: printed only Thread 2.

In other words, if step S2_4 happens, step S1_3 must happen as well.

Can architectures with weak memory model move load of V1 in step S2_3 before store of V2 in step S2_2? From single-threaded model of execution, this would not change the outcome so I fear of such case.

If that reordering possible, how it can be prevented? Do I need to use SeqCst memory fence between S2_2 and S2_3?

2 Likes

Let's test it! Open the following in the playground. Then run it using miri by pressing Tools->Miri.

use std::sync::atomic::{AtomicU8, Ordering::*};

static V1: AtomicU8 = AtomicU8::new(0);
static V2: AtomicU8 = AtomicU8::new(0);

fn main() {
    for _ in 0..100 {
        println!("====");
        V1.store(0, Relaxed);
        V2.store(0, Relaxed);
    
        let j1 = std::thread::spawn(|| {
            // Step S1_1
            V1.store(1, Release);
            
            // Step S1_2
            if V2.load(Acquire) == 1 {
               // Step S1_3
               println!("Thread 1")
            }
        });
        
        // Step S2_1
        if V1.load(Acquire) == 0 {
            // Step S2_2
            V2.store(1, Release);
            // Step S2_3
            if V1.load(Acquire) == 0 {
                // Step S2_4
                println!("Thread 2")
            }
        }
        
        j1.join().unwrap();
    }
}
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
Thread 1
====
Thread 2
====
Thread 2
====
Thread 2
Thread 1

As is clear from this output, printing only "Thread 2" is possible.

One possible fix is this:

// Step S1_1
V1.store(1, Relaxed);

// Step S1_2
if V2.fetch_add(0, Release) == 1 {
   // Step S1_3
   println!("Thread 1")
}
// Step S2_1
V2.swap(1, Acquire);
// Step S2_2
if V1.load(Relaxed) == 0 {
    // Step S2_3
    println!("Thread 2")
}

Here, we use fetch_add(0) in place of load() and swap(1) in place of store(1). This is so that we can use Acquire semantics on the store and release semantics on the load (normally this is not possible).

Now, to prove that the above works, we must show that in all cases, we either print "Thread 1", or we don't print "Thread 2". Let us do that. First, note that either V2.fetch_add or V2.swap comes first in the modification order for V2:

  • If V2.swap comes first, then V2.fetch_add returns 1, so we print "Thread 1". In this case, we are okay.
  • If V2.fetch_add comes first, then the load in V2.swap sees the store from V2.fetch_add, and due to the acquire-release semantics, anything that comes before V2.fetch_add is visible after V2.swap, so the load on V1 sees the store on V1, thus it will not print "Thread 2". In this case, we are okay.

Since it is okay in both cases, we are done.

9 Likes

Excellent, thank you!

Is it often to abuse modification operations for release reads and acquire stores?

No, I don't think this trick is especially common. I think we only do it in one place in Tokio, and Tokio uses a lot of atomics.

Usually, it is easier to combine the two atomics into one, which also solves the problem.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.