Is the following impl of a spin lock wrong?

    let x = std::sync::atomic::AtomicI16::new(1);
    // 1 = free
    // 0 = locked

    let mut b = 2;

    loop {
        let t = x.compare_exchange(1, 0, Ordering::Relaxed, Ordering::Relaxed);
        if t.is_ok() {
            break;}}

    b = 3;

    x.compare_exchange(0, 1, Ordering::Relaxed, Ordering::Relaxed);

@droundy @simonbuchan (continuing from other thread)

Ignoring the fact that (1) the above is single threaded and (2) Rust's borrow checker ensures only thread can write to b: Is my above implementation of a spin lock wrong ?

The reasoning being: Relaxed does not guarantee anything, besides that then given op is atomic. Therefore, the compiler is free to reorder the above into:



    b = 3;


    loop {
        let t = x.compare_exchange(1, 0, Ordering::Relaxed, Ordering::Relaxed);
        if t.is_ok() {
            break;}}


    x.compare_exchange(0, 1, Ordering::Relaxed, Ordering::Relaxed);

and that what I really need to do is to use Ordering:: constraints that forces the b=3 to be sandwiched between the two ?

It's long, but jonhoo's video on atomics is really helpful

2 Likes

In general it's definitely "wrong" (won't behave like you want) according to the language spec and on any platform with a weak memory model, yes, but I assume you knew that much!

Whether rustc currently or in the future could generate logically incorrect code on Intel depends on if:

  • Whether LLVM can "see" the Ordering value while optimizing, and thus knows it can "optimize" according to the language specification - a good question!
  • The rest of the code in the program being built, which can change the decisions the optimizer can take, which makes it tricky to produce examples demonstrating the issue.

The LLVM docs on atomics may be helpful, here's how it describes the underlying treatment of relaxed: LLVM Atomic Instructions and Concurrency Guide — LLVM 16.0.0git documentation - specifically:

In addition, it is legal to reorder non-atomic and Unordered loads around Monotonic loads. CSE/DSE and a few other optimizations are allowed, but Monotonic operations are unlikely to be used in ways which would make those optimizations useful.

Interestingly, Rust currently generates lock cmpxchg with Relaxed ordering for compare_exchange, but only movzx for load: Compiler Explorer

1 Like

I think I over estimated my understanding of the "memory model"; especially since racey code can appear to "work perfectly fine".

Well the first good thing is the Rust borrow system make it pretty easy to very aggressively and safely write threaded code without any chance of racing, and without sacrificing performance in anything but very weird situations: knowing how to write, say, a channel let's you appreciate the edge cases that crates like crossbeam deal with, but you're not likely to beat them unless you have very particular knowledge.

The second is that if you do use atomic, you generally only have two cases:

  • Dumb counters, the only thing you care about is that none get dropped, and something else is handling ordering: you can use Relaxed
  • You're doing something "mutex like", where you are making a bunch of writes on one thread visible to another only when they're all done. Use stores with Release (like the end of a mutex) on the writing thread to make them visible, and load with Aquire (like the start of a mutex) on the other to see if you can read.
  • When in doubt and you just want to get it working, SeqCst as the strongest.

The third is that Rust had a tool called MIRI you can use to test multithreaded code. I've not used it myself yet, but it looks very cool.

1 Like

You need the acquire ordering when you acquire the lock and the release ordering when you release the lock.

4 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.