Instruction reordering in Relaxed memory orderin

Rust code snippet:

fn a1() {
    X.fetch_add(5, Relaxed);
}

fn a2() {
    X.fetch_add(10, Relaxed);
}

Is compiler reordering possible for this code snippet so that function a2() executes first when a single thread executes both functions?

My understanding is that reordering of functions is not possible because Relaxed memory ordering ensures the happens-before relationship in a single-threaded program. Both functions contain an atomic operation with Relaxed memory ordering which ensures that adding 5 happens before the addition of 10 to the X.

However, my confusion is that both atomic additions are inside the functions, which might play a role in reordering them if anyway compiler finds that executing a2() might make the program first or the CPU finds the a2() at store buffer or registers before the a1() somehow. Also, the compiler or CPU might not deal with these separate atomic operations directly as they are inside of functions that have their stack frames and other properties.

There are two rules that restrict arbitrary reordering:

  1. It must be impossible to detect, from within the current thread only (not communicating with other threads), that reordering of code belonging to the current thread has happened.
  2. No "happens-before" relations between code running on two different threads are broken by the ordering.

As long as these two rules are met, the compiler and CPU are both free to reorder your code as much as they want to.

So, looking at the case you described, with the following as the thread code:

fn thread_1() {
    check_x();
    a1();
    check_x();
    a2();
    check_x();
}

The three calls to check_x will always see X modified in order - a1 first, then a2. Nothing else is permitted, since any other order for modifications to X will result in thread_1 being able to see that it's being executed out of order.

However, given a second thread:

fn thread_2() {
    check_x();
    check_x();
    check_x();
}

It is permitted for thread_2 to observe a2 executing first, then a1 executing second, even though within thread_1, they appear to execute in the other order.

4 Likes

The happens-before relation is an abstract model, the generated code is still subject to the as-if rule. Also, on a single thread accesses are always ordered as in source code, Relaxed and atomics are not required. Atomics and ordering matters only for inter-thread synchronization of accesses.

In particular, the compiler can not only reorder your fetch_adds, but also fuse them or even eliminate them, as long as it doesn't change observable behaviour (the latter is very unlikely if the variable has cross-thread accesses, but the first two are entirely possible).

Instruction reordering is the wrong way to think about it. It's confusing and unhelpful. It's much more helpful to think about when writes become visible. Consider these two threads:

// thread 1
println!("{}", X.load(Relaxed)):
println!("{}", Y.load(Relaxed)):
// thread 2
println!("{}", X.load(Relaxed)):
println!("{}", Y.load(Relaxed)):

On a third thread, you first write to X and then to Y. Imagine that thread 1 and 2 are both calling the same function so we know they have the same instructions in the same order. However, it's possible for thread 1 to think that X was updated, but Y wasn't, and for thread 2 to think that Y was updated, but X wasn't. So did we write to X or Y first? The question is not useful. Forget about it. In reality, thread 1 simply saw the write to X before it saw the write to Y, and thread 2 saw the write to Y before it saw the write to X. This kind of delay where writes are not immediately visible is a much better way to think about it.

So in your example, the real question is "could a thread see the write in a2 before it sees the write in a1?". In this case the answer is no, because both functions access the same atomic, and all threads are guaranteed to see changes to the same memory location in the same order. But if a2 had instead written to Y, then it absolutely can happen that other threads sees a2 before a1, even though you executed a1 first.

9 Likes

There is a nuance here: atomic accesses on their own are not considered observable effects, and the code as written in OP doesn't do anything else with the read values.

Disregarding synchronization, the situation is the same as reading and writing ordinary variables in a single thread. If you don't do something to actually observe the intermediate values (have some data depend on them, or use them in I/O), the compiler is free to reorder, merge, split, duplicate, or modify them in any other non-observable way.

2 Likes

How does the compiler know that an atomic variable has cross-thread access in Rust(given that it uses Relaxed memory ordering) so that it can be more conservative regarding reordering?

The compiler can see all the source code it's working on, and can make a decision based on what it can see; if it knows, following the rules of the Rust abstract machine, that it can see all the code that can access a given atomic variable, it is allowed to optimize based on the knowledge it gains from knowing what all the atomic accesses are.

It's probably worth pointing out at this point that the orderings are part of a memory model; to make concurrency something we can reason about, we've come up with a model of how the world "should" work that we expect our languages and hardware to implement. They don't directly represent what happens - they represent the rules that tell the Rust compiler what our code "means", and the compiler is required, by the Rust language, to translate our code into something that respects those rules.

1 Like