Is there a way to Send without enforcing happens-before semantics?

I am writing a data structure that relies on acquire/release atomics to do something complicated.

I would like to know if my data structure should be marked as !Send.
My problem boils down to this: is there a way to send a value from one thread to another without enforcing a happens-before relationship?

In other words,

// thread 1
atomic1.store(1, Ordering::Relaxed); // A
atomic1.store(2, Ordering::Relaxed); // B
atomic2.store(1, Ordering::Release); // S (the stored version number, here 1, is a placeholder)
// thread 2
let version = atomic2.load(Ordering::Acquire); // L
somehow_send_to_thread3(version); // SD
// thread 3
let version = somehow_receive_sent_version(); // RCV
let val = atomic1.load(Ordering::Acquire);
// Assuming we got the version sent in SD,
// are we certain that assert_eq!(val, 2) holds?

My understanding is that we have

A happens-before B
B happens-before S
S happens-before L
L happens-before SD

However, I don't see any argument that justifies a happens-before relationship in thread 3.
Could someone implement a strange channel based on relaxed atomics and end up
with the value 1 in val?

Is the "causality" implied by the sheer fact of holding that value of "version" in our hands sufficient to ensure the happens-before relationship?

Extra related question: is atomic ordering, in theory and in practice, just about instruction reordering, or is the memory model broader than this, so that it could in theory be used for cache coherence strategies, for instance?

Sending an arbitrary Send value to another thread without happens-before is unsound. For example, if you were to send a Box<u8> whose content you changed immediately before sending it, then reading it immediately after receiving it would be a data race (and thus UB) if not for the happens-before relationship between sending and receiving. In concrete cases (like sending integers, or types that guarantee they don't contain references or refer to global state) it can be sound to send them without a happens-before relationship. But in your case you simply cannot promise that it is sound.
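
To make the Box<u8> case concrete, here is a minimal sketch using std::sync::mpsc. The function name send_box_through_channel is made up for illustration. It assumes, as the argument above requires, that the standard channel establishes happens-before between send and recv, which it has to do in order to be sound for arbitrary Send types.

```rust
use std::sync::mpsc;
use std::thread;

/// Mutates a boxed byte, sends the box to another thread, and returns
/// what that thread read from it. (Hypothetical helper for illustration.)
fn send_box_through_channel() -> u8 {
    let (tx, rx) = mpsc::channel::<Box<u8>>();

    let reader = thread::spawn(move || {
        // recv() synchronizes with the matching send(), so the plain write
        // of 42 below happens-before this plain read.
        let received = rx.recv().expect("sender dropped without sending");
        *received
    });

    let mut boxed = Box::new(0u8);
    *boxed = 42; // non-atomic write right before sending
    tx.send(boxed).expect("receiver hung up");

    reader.join().expect("reader thread panicked")
}
```

Calling send_box_through_channel() returns 42. With a "channel" that handed over the pointer without any synchronization, the same write and read would be a data race.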

Even with a proper channel implementation that uses acquire/release operations I don't think you're guaranteed to read 2, because thread 1 and thread 3 never synchronized with each other. However it should work if all acquire/release pairs (including the ones in the channel) were SeqCst instead.

Generally, when people say that they send something, they mean that a happens-before relationship is established. Of course, it's still possible to have "channels" that do not establish a happens-before relationship, but you would not call it a channel without a big warning sign.

Yes, if no happens-before relationship is established, then val may end up reading the value 1.

It's definitely not just about instruction reordering. In fact, forget all about instruction reordering because it misses the point. With relaxed atomics, there are possibilities that do not arise from any reordering, so even if you think through every single possible reordering, you might still miss a possibility.

The way to think about atomics is this:

  • Every single memory location has a "version history".
  • When you do an atomic load, you might get an old value.
  • Atomic orderings help ensure that you don't get a value that is "too old".

And so the rules look something like this:

  • On a single thread, if you read a value A, then all future reads will either read A or something newer in the history for that location.
  • When an acquire load L reads a value written by a release store S, then the history is "transferred". That is, any entry in the history that is visible to the thread performing S at the time of the store also becomes visible to the thread performing L.
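
The second rule can be sketched with two threads. This is only an illustration: the names publish_and_read, data and ready are made up, and the spin loop is there so that the acquire load is known to have read the value written by the release store.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Writer publishes `data` with a release store; the reader picks it up
/// with an acquire load. (Hypothetical helper for illustration.)
fn publish_and_read() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let writer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(1, Ordering::Relaxed);
            data.store(2, Ordering::Relaxed);
            ready.store(true, Ordering::Release); // S
        })
    };

    // L: loop until the acquire load reads the `true` written by S.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    // The writer's history for `data` was transferred by S -> L, so even a
    // relaxed load cannot return 0 or 1 here.
    let seen = data.load(Ordering::Relaxed);

    writer.join().expect("writer thread panicked");
    seen
}
```

publish_and_read() returns 2, because 2 is the last entry in the history of data and the reader is no longer allowed to see anything older.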

This is not true. Synchronization is transitive. With a proper channel, atomic1.store(2) happens-before atomic2.store(Release) happens-before atomic2.load(Acquire) happens-before somehow_send_to_thread3() happens-before somehow_receive_sent_version() happens-before atomic1.load().

Therefore, atomic1.store(2) happens-before atomic1.load(). Therefore, it's guaranteed that the load will see the value 2. (Or a newer value in the modification history for atomic1.)
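
As a sketch of that chain, here is the original three-thread example with std::sync::mpsc standing in for somehow_send_to_thread3 / somehow_receive_sent_version. The function name three_thread_chain and the version value 7 are assumptions made for illustration, and thread 2 spins so that its acquire load is known to have read the value stored by S.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

/// Returns (version received by thread 3, value thread 3 read from atomic1).
fn three_thread_chain() -> (u32, u32) {
    let atomic1 = Arc::new(AtomicU32::new(0));
    let atomic2 = Arc::new(AtomicU32::new(0));
    let (tx, rx) = mpsc::channel::<u32>();

    let t1 = {
        let (a1, a2) = (Arc::clone(&atomic1), Arc::clone(&atomic2));
        thread::spawn(move || {
            a1.store(1, Ordering::Relaxed); // A
            a1.store(2, Ordering::Relaxed); // B
            a2.store(7, Ordering::Release); // S
        })
    };
    let t2 = {
        let a2 = Arc::clone(&atomic2);
        thread::spawn(move || {
            // L: wait until the acquire load reads the value stored by S.
            let mut version = a2.load(Ordering::Acquire);
            while version == 0 {
                std::hint::spin_loop();
                version = a2.load(Ordering::Acquire);
            }
            tx.send(version).expect("thread 3 hung up"); // SD
        })
    };
    let t3 = {
        let a1 = Arc::clone(&atomic1);
        thread::spawn(move || {
            let version = rx.recv().expect("thread 2 hung up"); // RCV
            // B -> S -> L -> SD -> RCV -> this load, so relaxed is enough.
            let val = a1.load(Ordering::Relaxed);
            (version, val)
        })
    };

    t1.join().expect("thread 1 panicked");
    t2.join().expect("thread 2 panicked");
    t3.join().expect("thread 3 panicked")
}
```

three_thread_chain() returns (7, 2): thread 1 and thread 3 never touch a common synchronization variable, yet the load in thread 3 cannot observe 0 or 1.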

How does that work with mixed-size atomic accesses? I believe they do work in hardware on some architectures, and I read somewhere that the Linux kernel even uses them.

For example, say I have two atomic u8 values (consecutive in memory) that are written by two different threads, and a third thread then does an atomic u16 load of that address. With acquire/release, will the histories of the two halves get merged?

There isn't really a good answer to that. Rust inherits the C++ memory model for atomics, and the C++ model says that mixed size atomics are illegal.

Some discussion on this topic can be found here:

With relaxed atomics, there are possibilities that do not arise from any reordering

Thank you! This was my understanding but having it spelled out is really helpful.

Generally, when people say that they send something, they mean that a happens-before relationship is established.

I am not in the position of the user of such a channel, but just writing a library.
My data structure will return wrong results if it is sent to another thread without acquire/release ordering, so I will mark it as not Send.
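
For reference, one common way to opt out of Send on stable Rust is a PhantomData marker holding a raw pointer type. This is a minimal sketch; VersionedView and its field are hypothetical stand-ins, not the actual data structure discussed here.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a data structure that must stay on its thread.
struct VersionedView {
    version: u32,
    // Raw pointers are neither Send nor Sync, so this zero-sized marker
    // removes both auto traits from the struct.
    _not_send: PhantomData<*const ()>,
}

impl VersionedView {
    fn new(version: u32) -> Self {
        Self { version, _not_send: PhantomData }
    }

    fn version(&self) -> u32 {
        self.version
    }
}

// Uncommenting the lines below should fail to compile, because
// `VersionedView` does not implement `Send`:
// fn assert_send<T: Send>() {}
// fn check() { assert_send::<VersionedView>(); }
```

Note that this marker also removes Sync, which may or may not be what you want. The marker is zero-sized, so it does not change the layout of the struct.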

BTW, I recommend reading an article that shows that these things actually happen in hardware.

That is, outcomes that look “impossible” are not just allowed by the relaxed atomics model; they really do occur at the hardware level.

Compare to this:

Existing hardware doesn't have UB when it is used with mixed-size atomics… but if C++ and Rust don't add them to the language, hardware developers may decide to play tricks that make them actually misbehave.

Contracts are important: today's hardware doesn't behave like yesterday's hardware and hardware of tomorrow may behave in yet another fashion.

In this particular case I believe it is unlikely to change, as the Linux kernel apparently uses these in central code, and the Linux kernel is big enough that nobody wants to break it.

There are lots of embedded platforms that are not designed for Linux, yet are used with Rust these days.

But I agree that it's unlikely that something like that would happen on a mainstream platform… but then… the Xbox 360 included an instruction that may break your code even if it's not executed… who knows what kind of craziness may be added next…

Without considering thread 3, I don't think we even have A happens-before B. A and B both use relaxed memory ordering, which compiles to the plainest store instructions on the target platform, so it provides no "atomic" guarantee, even though they operate on atomic values.

I found https://marabos.nl/atomics/hardware.html#load-store-ops to be super helpful to understand what is going on under the hood when we use various memory orderings.

You have A happens-before B because they are on the same thread.

Thread 2 doesn't need any memory ordering as synchronizing thread 2's memory state won't affect the memory states of other threads.

In thread 3, somehow_receive_sent_version needs acquire semantics, or you need an explicit acquire fence immediately after it. The atomic1.load in thread 3 can be changed to relaxed; the acquire ordering there doesn't do anything.
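
The fence variant can be sketched like this. The names receive_with_fence, data and slot are made up, and slot stands in for whatever the "channel" writes to: the receive itself is a relaxed load, and the acquire fence after it provides the synchronization.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Receives a version with a relaxed load followed by an acquire fence,
/// then reads `data` with a relaxed load. (Hypothetical helper.)
fn receive_with_fence() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let slot = Arc::new(AtomicU32::new(0)); // 0 means "nothing sent yet"

    let sender = {
        let (data, slot) = (Arc::clone(&data), Arc::clone(&slot));
        thread::spawn(move || {
            data.store(2, Ordering::Relaxed);
            slot.store(7, Ordering::Release); // the "send"
        })
    };

    // The "receive": a relaxed load on its own does not synchronize.
    while slot.load(Ordering::Relaxed) == 0 {
        std::hint::spin_loop();
    }
    // The acquire fence after a relaxed load that read the released value
    // synchronizes with the release store.
    fence(Ordering::Acquire);
    let val = data.load(Ordering::Relaxed);

    sender.join().expect("sender thread panicked");
    val
}
```

receive_with_fence() returns 2. Without the fence (and with only relaxed loads), reading 0 from data would be allowed.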

The documentation[1] gives a formal description of what conditions the hardware must meet, without specifying how it will do so.

The first thing to know about atomics is modification order: "All modifications to any particular atomic variable occur in a total order that is specific to this one atomic variable."

You can read the definition of happens-before (since C++26) there. I am only just reading it myself, and I expect it is new to many.

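A happens-before B: yes, due to sequenced-before. "Write-write coherence" also applies.

B happens-before S: due to sequenced-before.

S happens-before L: only if L "synchronizes-with" S, which is the case when L reads the newer value stored by thread 1. Then the "Release-Acquire ordering" rules apply.

L happens-before SD: due to sequenced-before.

Could val end up as 1 with a channel based on relaxed atomics? Yes (or 0, assuming atomic1 started as 0 before the first store).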

If somehow_send_to_thread3 performed an atomic3.store(Ordering::Release) and somehow_receive_sent_version read the new value with an atomic3.load(Ordering::Acquire), then val is 2.
This is due to:
"All memory writes (including non-atomic and relaxed atomic) that happened-before the atomic store from the point of view of thread A, become visible side-effects in thread B."
All the writes in thread 1 (up to and including the value of atomic2 that was loaded) happen-before the atomic3.store(Ordering::Release) in thread 2. Thread 3 then synchronizes-with that store.
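
A sketch of that atomic3 variant, using std::thread::scope so the atomics can be borrowed directly. The function name chain_through_atomic3 and the version value 7 are assumptions for illustration, and both waiting threads spin until they have actually read the released value.

```rust
use std::hint::spin_loop;
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Thread 2 forwards the version through atomic3 with release/acquire.
/// Returns the value thread 3 reads from atomic1. (Hypothetical helper.)
fn chain_through_atomic3() -> u32 {
    let atomic1 = AtomicU32::new(0);
    let atomic2 = AtomicU32::new(0);
    let atomic3 = AtomicU32::new(0);

    thread::scope(|s| {
        // thread 1
        s.spawn(|| {
            atomic1.store(1, Ordering::Relaxed); // A
            atomic1.store(2, Ordering::Relaxed); // B
            atomic2.store(7, Ordering::Release); // S
        });
        // thread 2
        s.spawn(|| {
            while atomic2.load(Ordering::Acquire) == 0 {
                spin_loop(); // L: wait for the value stored by S
            }
            atomic3.store(7, Ordering::Release); // SD
        });
        // thread 3
        let t3 = s.spawn(|| {
            while atomic3.load(Ordering::Acquire) == 0 {
                spin_loop(); // RCV: wait for the value stored by SD
            }
            atomic1.load(Ordering::Relaxed)
        });
        t3.join().expect("thread 3 panicked")
    })
}
```

chain_through_atomic3() returns 2. If the store and load on atomic3 were both relaxed, thread 3 would be allowed to read 0 or 1 instead.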


  1. std::memory_order - cppreference.com