Rust atomics and locks (chapter 3 example clarification)

I'm reading the excellent Rust Atomics and Locks book by Mara Bos, and I am currently on chapter 3. I am slightly confused by an (allegedly contrived) example provided in the Relaxed Ordering subsection called "Out-of-thin-air Values". The example provided is:

use std::sync::atomic::AtomicI32;
use std::sync::atomic::Ordering::Relaxed;
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);
static Y: AtomicI32 = AtomicI32::new(0);

fn main() {
    let a = thread::spawn(|| {
        let x = X.load(Relaxed);
        Y.store(x, Relaxed);
    });
    let b = thread::spawn(|| {
        let y = Y.load(Relaxed);
        X.store(y, Relaxed);
    });
    a.join().unwrap();
    b.join().unwrap();
    assert_eq!(X.load(Relaxed), 0); // Might fail?
    assert_eq!(Y.load(Relaxed), 0); // Might fail?
}

The author states:

It seems easy to conclude that the values of X and Y will never be anything other than zero, since the store operations only store values that were loaded from these same atomics, which are only ever zero.

If we strictly follow the theoretical memory model, however, we have to come to terms with our cyclic reasoning, and come to the scary conclusion that we might be wrong. In fact, the memory model technically allows for an outcome where both X and Y are 37 in the end, or any other value, making the assertions fail.

I am still not sure how the values could be anything but 0 here. I've reread the explanation provided but I'm none the wiser.

The explanation provided in the book is:

Due to the lack of ordering guarantees, the load operations of these two threads might both see the result of the store operation of the other thread, allowing for a cycle in the order of operations: we store 37 in Y because we loaded 37 from X , which was stored to X because we loaded 37 from Y , which is the value we stored in Y .

Is it implying that the value 37 could be stored by another thread?

If I understand correctly, this isn't something that can happen, but only something that cannot be proved to never happen (although in fact it never does):

the possibility of such out-of-thin-air values is universally considered to be a bug in the theoretical model, and not something you need to take into account in practice. The theoretical problem of how to formalize relaxed memory ordering without the model allowing for such anomalies is an unsolved one.

I haven't read that section (yet), but you might also have a look at this wiki page on C++ memory order, which aims to reflect what the C++ standard says (and Rust adheres to the same model). In particular:

(since C++14)

Even with relaxed memory model, out-of-thin-air values are not allowed to circularly depend on their own computations, […]
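
To make the circularity concrete: the example on that page, translated here to Rust to match the book's style (my translation; the value 42 is as arbitrary as the book's 37), adds a condition to each thread:

use std::sync::atomic::AtomicI32;
use std::sync::atomic::Ordering::Relaxed;
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);
static Y: AtomicI32 = AtomicI32::new(0);

fn main() {
    let a = thread::spawn(|| {
        let r1 = X.load(Relaxed);
        if r1 == 42 {
            // This store only happens if the load from X returned 42.
            Y.store(r1, Relaxed);
        }
    });
    let b = thread::spawn(|| {
        let r2 = Y.load(Relaxed);
        if r2 == 42 {
            // This store only happens if the load from Y returned 42.
            X.store(42, Relaxed);
        }
    });
    a.join().unwrap();
    b.join().unwrap();
}

The outcome r1 == r2 == 42 would require each store to justify the other; that circular self-dependence is exactly what the quoted rule forbids.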


While Mara Bos writes,

Fortunately, the possibility of such out-of-thin-air values is universally considered to be a bug in the theoretical model, and not something you need to take into account in practice.

the C++ standard (since C++14) seems to explicitly forbid these out-of-thin-air values (which, if I understand correctly, is an extra rule added by the standard).

I think this is exactly the point: this kind of restriction cannot (yet) be enforced "from first principles"; it can only be explicitly forbidden.


Take the following pseudocode:

x = 0, y = 0

Thread 1:
    local = x
    y = local

Thread 2:
    local = y
    x = local

We would intuitively expect both x and y to end up as 0, and regardless of the order in which the threads are executed, any reads from these variables at any point would always return 0.

However, as pointed out above, if we use relaxed memory ordering, the C++/Rust memory model previously allowed the machine to invent numbers out of thin air, i.e. it could compile down to something like this:

x = 0, y = 0

Thread 1:
    local = x
    y = local

Thread 2:
    local = 37
    x = local
    local = y
    if local != 37:
        x = local

Now this raises a few questions:

  1. Why would the compiler do something like this?
  2. Why is it allowed?
  3. Does it cause problems in theory?
  4. Does it cause problems in practice?

The answers as I understand them are:

  1. The compiler at compile-time or the hardware at run-time is allowed to speculate on the value the load of y will return if it has some reason to believe that a particular value is likely, e.g. based on analysis of past values, and to perform the dependent store to x early. This is a type of optimization that can potentially speed up the code's run-time performance.

  2. This optimization used to be allowed because it didn't violate the theoretical model of relaxed memory ordering, which only requires that all threads see the same order of changes to each of x and y individually (see the sketch after this list). The "optimized" code doesn't break that rule, since both threads consistently see either 37 or 0.

  3. Therefore, we have a "bug" in the theory, because this simple definition of relaxed memory ordering permits us to produce an obviously wrong result, but we also don't want to throw away the nice, simple definition because it's still useful in many ways. There hasn't been any agreement on a replacement definition.

  4. In practice, this kind of bug doesn't occur because compiler writers are told to explicitly check for and forbid these circular read dependencies, even though they are permitted in theory.
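
To contrast with the pathological case, here is a minimal sketch, along the lines of the book's own examples, of the guarantee that answer 2 relies on: even with Relaxed, all threads agree on a single modification order for each individual atomic.

use std::sync::atomic::AtomicI32;
use std::sync::atomic::Ordering::Relaxed;
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);

fn main() {
    let writer = thread::spawn(|| {
        X.fetch_add(5, Relaxed);  // X: 0 -> 5
        X.fetch_add(10, Relaxed); // X: 5 -> 15
    });
    let reader = thread::spawn(|| {
        let a = X.load(Relaxed);
        let b = X.load(Relaxed);
        let c = X.load(Relaxed);
        let d = X.load(Relaxed);
        // The only modification order of X is 0 -> 5 -> 15, so this can
        // print e.g. "0 0 5 15" or "0 15 15 15", but never "0 5 0 15"
        // (going backwards) and never anything containing a 10.
        println!("{a} {b} {c} {d}");
    });
    writer.join().unwrap();
    reader.join().unwrap();
}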

The whole area of memory ordering guarantees is tricky because of the plethora of different compiler systems and hardware architectures available, all of which have subtly different behaviors. This paper and this one both have some useful discussion of the topic.


I don't think this is correct. As indicated above by @Cerber-Ursi, the problem is not that the compiler would in fact emit the incorrect (/unexpected) code without some explicit checks; naïve code generation would definitely not introduce spurious/arbitrary values. The problem is that the model is incomplete and the fact that this is not happening cannot be proven simply from the already existing axioms.


My phrasing could have been clearer, perhaps.

I don't mean to suggest that the kind of faulty code generation given in my example would ever happen in a serious compiler, but as others have already mentioned, because this is not provably forbidden by the theoretical model of relaxed memory ordering, a compiler that were to produce that kind of code could still (according to previous language standards) claim to be compliant with the memory ordering guarantees.

In particular, the link to the C++ wiki makes the same point:

the store of 42 to y is only possible if the store to x stores 42, which circularly depends on the store to y storing 42. Note that until C++14, this was technically allowed by the specification, but not recommended for implementors.

Thanks @hax10, this was the missing bit in my mental model.
So to confirm my understanding, the code example in the Atomics book could theoretically result in the compiler speculatively assigning a "random" value to y if it believes that value to be likely based on some prior analysis. And this kind of optimisation is done to improve performance.
In practice compiler writers don't do this and therefore the code in the question is "harmless" (so to say).
Is that correct?

That's basically my understanding of the situation too, but others on the thread may offer corrections.

In practice compiler writers don't do this

To a first approximation, I think it's true to say this, but bugs and misunderstandings can creep in, so even if compiler writers try their best to prevent speculative data reads from changing the semantics of your program or violating theoretical guarantees, problems may still occur. The first paper I linked to has an interesting section on this topic:

All common hardware disallows such results by prohibiting a store from speculatively executing before a load on which it "depends". Unfortunately, as we will see below, the hardware notion of "depends" is routinely invalidated by many important compiler transformations, and hence this definition does not carry over to compiled programming languages.

Although such results are not believed to occur in practice, the inability to precisely preclude them has serious negative effects. We have no usable and precise rules for reasoning about programs that use memory_order_relaxed. This means we have no hope of formally verifying most code that uses memory_order_relaxed (or memory_order_consume). "Formal verification" includes partial verification to show the absence of certain exploits. Since we don't have a precise semantics to base it on, informal reasoning about memory_order_relaxed also becomes harder. Compiler optimization and hardware rules are unclear.


Nice one, thanks for the confirmation and clarification and links to those papers (should be a fun read).


Technically… maybe? I mean, it is pretty clear that this is not the intention of the designers of the memory model, and they consider it to be a bug. Thus, I would still call a compiler non-conforming if it emitted intended-to-be-incorrect code that accidentally happens to conform to the model due to a bug in the model.

For example, there was a famous bug in the Rust standard library whereby an incorrectly-typed function handed out a mutable slice pointing into an immutable &Vec or something like this. The standard library used unsafe so the compiler was unable to catch the bug. If you used this method, you could cause undefined behavior from safe code, which is a soundness bug. But the compiler didn't complain. So while the code was accepted by the compiler, it was due to a bug (in the standard library). I wouldn't call that code "correct" even though it was in safe code and the compiler accepted it.


I'm unsure how this would be "informal". Maybe it cannot be formalized as part of a certain model/ruleset/formalism? If so, which one? I don't really understand the mathematical foundations behind it. But I feel like Mara Bos's book is slightly misleading here, as the standard (which Rust adheres to) does seem to explicitly forbid it, and that's not made clear in that chapter. The chapter/book thus seems to refer to an incomplete (simpler) ruleset that allows these effects, while the C++14 standard does not.


My point is: The C++ standard just forbids it. I don't see how this wouldn't be part of the "formal" rules then. Maybe it's not "mathematical" enough? :man_shrugging:

Technically… maybe? I mean, it is pretty clear that this is not the intention of the designers of the memory model, and they consider it to be a bug.

Yeah, we agree on this point. Your Rust example and my pseudocode are both cases where the result clearly violates the intent of the programmer and shouldn't be allowed (it absolutely is a bug), but unfortunately it's difficult (if not impossible) to design a logical system, built from a small number of axioms, that always enforces human intent and produces the desired result.

Hence the need to add exceptions and "escape hatches" such as unsafe code (in Rust) or explicitly forbidding circular dependencies (for relaxed memory ordering).

Exactly. Something easy to express informally might be surprisingly hard to formalize.

