A better term than "Thread safe"?

Hello,

The term "thread safe" has been around a long time and means a lot to programmers from other systems languages, but I feel like it's not ideal to talk about types in Rust.

The opposite of "thread safe" would be "thread unsafe" or "thread dangerous". In Rust, we have the Send and Sync marker traits, and if you have them on a type then it's definitely "Thread safe". But without them, it's still safe, just not as flexible. So it doesn't make sense to describe a type that isn't Send or Sync as "not thread safe", the way that description would come in bold flashing red letters in a README or .h file for a C project.

If I were documenting things for a Rust audience, then saying "Send + Sync" would be good enough, but that doesn't immediately convey the associations like "thread safe" to systems programmers broadly.

"Thread-flexible" doesn't exactly roll off the tongue. "reentrant" is a little too specific and not always a correct synonym of thread safe.

Does anyone have a word or succinct phrase they like, to convey the concept of Send + Sync to a broad audience? Or alternatively, a word or phrase that means "Not Send" or "Not Sync"?

Thank you.

3 Likes

Perhaps you are looking for phrases like "requires synchronization"?

Actually, I think it makes perfect sense. A type that is neither Send nor Sync cannot ever be accessed in any way from any thread other than the one it was created on. Sounds pretty "not thread safe" to me.

9 Likes

Speaking as someone who first programmed multi-threaded and multi-processor systems back in the early 1980s, in C, C++ and other languages, "thread safe" and "not thread safe" are exactly the terms I would expect to see used for the same thing in Rust.

3 Likes

I'd emphasize the "not always" part and say that "reentrant" is very much not a synonym of parallelism, but a super-set of it!

Rust expresses concurrency, in a general form, as shared / & access.
It expresses parallelism by adding Sync into a concurrency context.

So, there exist things that are &-safe / shared-safe / concurrent-safe, whilst not necessarily being Sync, since not parallelism-safe. In that case, such things would be re-entrant safe, but very much not thread safe.

  • Anything immutable-when-shared / without shared mutability / without UnsafeCells somewhere deep inside it (or raw pointer indirection) is necessarily[1] generally concurrency-safe / reentrant-safe already (and, modulo minor weird APIs, this ought to lead to being parallelism-safe as well).
  • RefCell would be the most canonical thing with shared mutability that remains reentrant-safe while not being parallelism-safe. Indeed, thanks to its runtime guard, it can detect re-entrant misuse.

    • Cell deserves a special mention. While allowing concurrent mutations, it may be one of the few instances of such that would be guaranteed never to feature re-entrancy, by virtue of never lending inner references (hence featuring "register semantics", we could say).
  • ReentrantMutex would be a wrapper type that is thread-safe by giving an illusion of parallelism (& / concurrent access from within multiple threads is safe by virtue of blocking it at runtime, should it ever happen) even if the inner wrapper isn't really parallelism-safe.
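To make the RefCell point concrete, here is a minimal std-only sketch. The function name `refcell_reentrancy_demo` is made up for illustration; `RefCell::try_borrow_mut` is the non-panicking sibling of `borrow_mut`.

```rust
use std::cell::RefCell;

/// Returns `(rejected_while_held, accepted_after_release)`.
/// Hypothetical helper, only meant to show the runtime guard at work.
pub fn refcell_reentrancy_demo() -> (bool, bool) {
    let cell = RefCell::new(vec![1, 2, 3]);

    // First exclusive borrow: think of it as "we are in the middle of a mutation".
    let outer = cell.borrow_mut();

    // Re-entrant attempt while the first borrow is still alive:
    // the runtime guard refuses it instead of handing out a second `&mut`.
    let rejected_while_held = cell.try_borrow_mut().is_err();

    // Once the first borrow is gone, exclusive access is available again.
    drop(outer);
    let accepted_after_release = cell.try_borrow_mut().is_ok();

    (rejected_while_held, accepted_after_release)
}
```

Both components come back `true`. With plain `borrow_mut` the refused attempt would panic rather than return an error, which is still memory-safe.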

All that to say that Rust does precisely this great job at decoupling some of these notions that have been, historically, quite tangled for a while; and for those used to that environment with everything muddied, it can be a bit hard to take a step back and rethink these distinctions that Rust makes.

In this instance, the point is that re-entrancy is a main "danger" of concurrency, but parallelism is not the only form of concurrency. Even more so, I am personally used to talking of re-entrancy in opposition to parallelism, as in "single-threaded re-entrancy".

Here is a diagram showcasing some of the things I mean (I apologize for the colors and the font; I didn't want to spend too much time making it so I just went for one of the very first[2] available online tools out there :sweat_smile:):


Now, regarding the naming, I like to say that something is "thread-safe" if all of its API is, which, as the diagram showcases, very often involves being Send and Sync when both shared and exclusive APIs are showcased.

  • Sync thus expresses the idea of being safe to use in parallel / "parallelism-safe";
  • Send is actually quite hard to put into words in a summarized fashion, but the gist of it would be about being safe to use across multiple threads sequentially / in a non-parallel fashion.

All in all, Rust's official terminology for Sync is quite neat: "safe to share across threads" :slightly_smiling_face:.

  • The official terminology for Send, however, is not as good: "safe to send across threads". What does send mean? It's either a quite informal definition (does &mut access count as Sending? I'd say that in English it doesn't, but in Rust's model it does), or worse, a tautological definition (something Send is safe to send…).
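A small sketch of the two summaries above, using only std. All three function names are hypothetical. The first shows Sync (shared access from several threads at once), the other two show Send (access from several threads, one at a time), including the `&mut` case that the "send" wording does not obviously cover.

```rust
use std::cell::Cell;
use std::thread;

/// `Sync`: two threads read the same slice at the same time.
pub fn share_across_threads(data: &[i32]) -> (i32, Option<i32>) {
    thread::scope(|s| {
        let sum = s.spawn(|| data.iter().sum::<i32>());
        let max = s.spawn(|| data.iter().copied().max());
        (sum.join().unwrap(), max.join().unwrap())
    })
}

/// `Send`: a `Cell<i32>` (Send, not Sync) is used by one thread, then by another.
pub fn hand_over_between_threads() -> i32 {
    let cell = Cell::new(0);
    cell.set(cell.get() + 1); // used on the current thread first
    let cell = thread::spawn(move || {
        cell.set(cell.get() + 1); // then on another thread, never in parallel
        cell
    })
    .join()
    .unwrap();
    cell.get()
}

/// Also `Send`: nothing is moved, but `&mut Cell<i32>` crosses the thread boundary.
pub fn lend_exclusively() -> i32 {
    let mut cell = Cell::new(0);
    let exclusive = &mut cell;
    thread::scope(|s| {
        // `move` hands the exclusive reference itself to the other thread.
        s.spawn(move || exclusive.set(exclusive.get() + 1));
    });
    cell.get()
}
```

Dropping the `move` in `lend_exclusively` should make the closure capture a shared `&Cell<i32>` instead, which the compiler rejects because `Cell<i32>` is not Sync.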

Bonus: Exercise

What is the correct impl to express the thread-safety of the aforementioned ReentrantMutex?

So, what should Bounds… and ThreadSafetyMarker stand for in the following impl:

unsafe
impl<T> ThreadSafetyMarker for ReentrantMutex<T>
where
    T : Bounds…,
{}
  • When ThreadSafetyMarker = Send?

    • Bounds… = Send?
    • Bounds… = Sync?

    Answer

    Given the aforementioned &mut RM<T> -> &mut T, for &mut RM<T>[3] to be safe to cross thread boundaries, it is necessary for &mut T to already be safe to cross thread boundaries. In other words, we have to have:

    RM<T> : Send => T : Send
    

    It turns out the other direction holds as well: the only time it doesn't is when shared ownership enters the equation (e.g., Arc), which ReentrantMutex<T> has nothing to do with.

    So RM<T> : Send <=> T : Send, i.e.,

    unsafe
    impl<T> Send for ReentrantMutex<T>
    where
        T : Send,
    {}
    
  • When ThreadSafetyMarker = Sync?

    • Bounds… = Send?
    • Bounds… = Sync?

    Answer

    This one is quite subtle. We already have kind of a tautological answer:

    • if T : Sync, then there is literally no reason for RM<T>[4] not to be Sync as well. Indeed, "sequential shared accesses from within multiple threads" is a subset / more strict than "shared accesses from within multiple threads". If the latter is safe (Sync), then the former must be safe as well (RM<T> : Sync (because the locking method is &-based)).

      /// It would be sound to have this
      unsafe
      impl<T> Sync for ReentrantMutex<T>
      where
          T : Sync,
      {}
      

      That being said, it's a rather useless property: when T : Sync, we didn't really need the ReentrantMutex to begin with! This impl is basically sound because when T : Sync, RM<T> is useless, and thus, harmless. And there isn't much point in a useless API, is there?

    • so this kind of leaves T : Send as a candidate? Well, indeed, if you think about what I mentioned above in this very post, I summarized Send as the property of sequential multi-threaded accesses being safe.
      That is, "sequential shared accesses from within multiple threads" is also a subset / more strict than "sequential exclusive accesses from within multiple threads" (since we can always loosen an exclusive access (&mut) down to a shared one (&)).

      So this gives us:

      /// More useful impl
      unsafe
      impl<T> Sync for ReentrantMutex<T>
      where
          T : Send,
      {}
      

      This is now genuinely useful, since it makes something such as ReentrantMutex<RefCell…> or ReentrantMutex<Cell…> become Sync, when the non-wrapped variants weren't :slight_smile:

    Now, currently, the coherence checker will prevent having both impls, since they would overlap for Send + Sync types, even if overlapping a marker trait ought to be fine. In the future, hopefully, the marker_trait_attr feature will be stabilized, precisely to express that certain traits are guaranteed not to have contents / to only act as a marker, thereby allowing overlapping impls. With it, we could have a SendOrSync kind of bound.

    So the final correct answer, in future Rust, would be:

    • Bounds… = Send or Sync:

      unsafe
      impl<T> Sync for ReentrantMutex<T>
      where
          T : Send or Sync,
      {}
      

This exercise is relevant to my original remark, since when considering ReentrantMutex<RefCell<T>>, we have:

  • the type is Sync thanks to ReentrantMutex being Sync in this case, i.e., safe to share across threads / parallel access to the mutex can be attempted;

  • the type is also re-entrant safe, since RefCell<T> features a runtime check to guard against it, even if it features Shared Mutability.

  • and yet the type involves no actual parallelism at runtime: the very design of ReentrantMutex's locking logic prevents it! So it is safe to expose to parallel accesses (attempts), since the mutex is able to guard against those, (b)locking the extra threads so that exactly one gets actual access to the inner value.
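The "T : Send is enough for the wrapper to be Sync" idea can be tried out with std alone. The sketch below uses `std::sync::Mutex` as a stand-in, since it carries the same kind of bound (`Mutex<T> : Sync` when `T : Send`); it is not re-entrant, so this is an analogy rather than the ReentrantMutex discussed above. `assert_sync` and `locked_cell_counter` are made-up names.

```rust
use std::cell::Cell;
use std::sync::Mutex;
use std::thread;

/// Compiles only if `T` is `Sync`; does nothing at runtime.
fn assert_sync<T: Sync>(_: &T) {}

/// Increments a `Cell<i32>` from several threads, always under the lock.
pub fn locked_cell_counter(threads: usize, per_thread: usize) -> i32 {
    // `Cell<i32>` alone is not Sync, but it is Send, which is all `Mutex` asks for.
    let shared = Mutex::new(Cell::new(0_i32));
    assert_sync(&shared);

    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    // Only one thread at a time gets past this line.
                    let guard = shared.lock().unwrap();
                    guard.set(guard.get() + 1);
                }
            });
        }
    });

    shared.into_inner().unwrap().get()
}
```

The threads attempt parallel access, the lock serializes them, and the non-Sync `Cell` inside only ever sees one thread at a time.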


  1. unless it relies on mutating global storage: it wouldn't then technically have a reference / pointer to such mutated global storage, but conceptually it would, so my point still applies on the conceptual level. ↩︎

  2. if anyone knows of a tool almost as easy to use as Free Venn Diagram Generator, but nicer-looking / a bit more customizable, I'm all :ear:s! ↩︎

  3. shorthand for ReentrantMutex<T> ↩︎

  4. shorthand for ReentrantMutex<T> ↩︎

15 Likes

Maybe I'm missing the point, but I'm not seeing the difference between "concurrency" and "parallelism" here.

One can have two threads on two processors accessing the same data structure. Neither thread knows what the other is doing, so if they are in the middle of mutating/reading some shared data structure they will cause chaos.

Or one can have two threads running on a single processor. They each get scheduled by some timer or other interrupt to the OS kernel. Neither thread knows what the other is doing, so if they are in the middle of mutating/reading some shared data structure they will cause chaos.

What difference does "concurrent" vs "parallel" make when it comes to requiring Send and/or Sync to keep things in order?

I think the main point of the OP is that Rust protects us from accessing a "not thread safe" API in an unsafe fashion (unless we use unsafe), so in safe Rust, everything is "safe" regarding threads (except race-conditions and deadlocks).

I think the terms searched for could be something like "thread-sharable" for Sync and "thread-boundary-traversable" for Send. Arguably my proposal isn't very catchy though :sweat_smile:.

3 Likes

You could refer to them like this:

  1. !Send + !Sync is "not thread-safe at all".
  2. Send + !Sync is "requires synchronization, even for immutable access".
  3. !Send + Sync is "it's a long story".
  4. Send + Sync with &mut self methods is "requires synchronization when writing"
  5. Send + Sync with only &self methods is "thread-safe"
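For anyone who wants to see which std types land in which bucket, here is a sketch that probes the marker traits. The `implements!` macro and `marker_table` are hypothetical names; the macro relies on an inherent associated constant taking precedence over a trait-provided one, so it only works on concrete types, not inside generic code.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::atomic::AtomicI32;
use std::sync::MutexGuard;

/// Evaluates to `true` if the concrete type implements the given marker trait.
macro_rules! implements {
    ($ty:ty: $marker:ident) => {{
        #[allow(dead_code)]
        struct Probe<T: ?Sized>(::std::marker::PhantomData<T>);
        // Fallback answer, available for every `Probe<T>`.
        trait Fallback {
            const YES: bool = false;
        }
        impl<T: ?Sized> Fallback for Probe<T> {}
        // Inherent constant, only available when the bound holds; it wins if present.
        #[allow(dead_code)]
        impl<T: ?Sized + $marker> Probe<T> {
            const YES: bool = true;
        }
        <Probe<$ty>>::YES
    }};
}

/// Rows are `(type name, is Send, is Sync)`.
pub fn marker_table() -> [(&'static str, bool, bool); 4] {
    [
        ("Rc<i32>", implements!(Rc<i32>: Send), implements!(Rc<i32>: Sync)),
        ("Cell<i32>", implements!(Cell<i32>: Send), implements!(Cell<i32>: Sync)),
        (
            "MutexGuard<i32>",
            implements!(MutexGuard<'static, i32>: Send),
            implements!(MutexGuard<'static, i32>: Sync),
        ),
        ("AtomicI32", implements!(AtomicI32: Send), implements!(AtomicI32: Sync)),
    ]
}
```

The four rows correspond to cases 1, 2, 3 and 5 of the list: neither, Send only, Sync only, and both.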
20 Likes

I like that :sweat_smile: For those wondering, the canonical example is MutexGuard, which is "obviously" Sync because we just locked the mutex, but at the same time !Send, because you have to unlock the mutex on the same thread which acquired it.

My hunch is that the essence of this weird behavior is that a mutex guard is not an ordinary "no-op" wrapper: it has its own, very active business to do with Send – unlocking the lock –, while its Sync-ness has to do with it only giving by-ref (but not by-value) access to the wrapped value. Dumb no-op wrappers, e.g. newtypes that always contain the wrapped type by value and simply forward everything to it are far more frequent, and so !Send + Sync is far less frequent.
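A short sketch of the Sync half of that story: a reference to a live guard is handed to another thread, which reads through it, while the guard itself stays on (and is dropped by) the locking thread. `read_through_shared_guard` is a made-up name.

```rust
use std::sync::Mutex;
use std::thread;

/// Locks on the current thread, lets another thread read via `&MutexGuard`.
pub fn read_through_shared_guard() -> i32 {
    let mutex = Mutex::new(41);
    let mut guard = mutex.lock().unwrap();
    *guard += 1;

    let seen = {
        // `&MutexGuard<i32>` may cross threads because `MutexGuard<i32>` is Sync.
        let guard_ref = &guard;
        thread::scope(|s| s.spawn(move || **guard_ref).join().unwrap())
    };

    // The unlock still happens here, on the thread that took the lock.
    drop(guard);
    seen
}
```

Moving `guard` itself into the spawned closure is the part the compiler should refuse, since `MutexGuard` is not Send.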

10 Likes

I guess the question is then about the meaning of the word safe, or safety. In the context of Rust, this has a quite narrow scope, perhaps surprisingly narrow, whereby the safe vs. unsafe dichotomy is about memory safety or, equivalently, defined behavior (or lack thereof: UB).

Some examples:

  • ::std::fs::remove_dir_all("/") is a non-unsafe operation in Rust, which could thus be labelled as safe, as in, (process-)memory-safe. Even though it's attempting to nuke all your data on permanent storage :sweat_smile:

  • struct Troll();
    impl Hash for Troll { … random() }
    
    let instance = Troll();
    let mut set = HashSet::new();
    set.insert(&instance);
    assert!(set.contains(&instance)); // will probably fail!
    

    Here we have an example of API misuse and logic bugs due to a memory state that does not match the program(mer)'s expectation. It's still memory-safe, in Rust parlance.
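A compilable variant of the Troll idea, as a sketch: the elided `random()` is replaced by a global counter (an assumption made here so that no external crate is needed), and `hash_of` / `lookup_after_insert` are hypothetical helper names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Stand-in for `random()`: every call to `hash` feeds a different number.
static CALLS: AtomicU64 = AtomicU64::new(0);

#[derive(PartialEq, Eq)]
pub struct Troll;

impl Hash for Troll {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(CALLS.fetch_add(1, Ordering::Relaxed));
    }
}

/// Hashes a value with a fresh `DefaultHasher`.
pub fn hash_of(troll: &Troll) -> u64 {
    let mut hasher = DefaultHasher::new();
    troll.hash(&mut hasher);
    hasher.finish()
}

/// Inserts a `Troll` and immediately looks it up again.
/// The result is not something to rely on: that is the logic bug.
pub fn lookup_after_insert() -> bool {
    let mut set = HashSet::new();
    set.insert(Troll);
    set.contains(&Troll)
}
```

Hashing the same value twice gives two different hashes, so the set will most likely look in the wrong place. No `unsafe` is involved anywhere.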

So, back to concurrency vs. parallelism, and in the context of the question above about two threads touching the same data structure:

Let's take an example: incrementing a number concurrently.

/// `::syn` can't parse this, btw…
use {
    ::core::{
        cell::Cell as Mut,
    },
    ::futures::{
        executor,
        future::join,
    },
};
    
fn main ()
{
    let state: Mut<i32> = 0.into();
    let task = || async {
        let value = state.get();
        let () = stuff().await;
        state.set(value + 1);
    };
    // You can write Rust and draw faces at once 🙃
    ((..) , (..)) = executor::block_on(join(task(), task()));
    assert_eq!(state.get(), 2);
}

This runs that task twice, concurrently. Thus, depending on the behavior of stuff() and executor::block_on[1], the assertion will fail or pass.

So we have a "re-entrancy bug", we could say (depends on the point of view, to be honest), albeit a memory-safe one: no unsafe was written for this demo.

  • If stuff() were to yield based on some timer, for instance, directly or indirectly, then this bug could be labelled under the race condition category, but it would nonetheless not be a data race, in the hardware sense at least.
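A std-only sketch of the same interleaving, assuming `stuff()` suspends exactly once. `YieldOnce`, `NoopWake`, `task` and `run_two_tasks` are hypothetical names, and the hand-rolled polling loop merely stands in for `executor::block_on(join(…))`; it is not how the futures crate does it.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for `stuff()`: pending on the first poll, ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            Poll::Pending
        }
    }
}

/// Waker that does nothing; the loop below polls unconditionally.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

async fn task(state: &Cell<i32>) {
    let value = state.get();
    YieldOnce { yielded: false }.await; // the other task runs here
    state.set(value + 1);
}

/// Runs two tasks on one thread, alternating polls, and returns the final state.
pub fn run_two_tasks() -> i32 {
    let state = Cell::new(0);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut first = pin!(task(&state));
    let mut second = pin!(task(&state));
    let (mut first_done, mut second_done) = (false, false);
    while !(first_done && second_done) {
        if !first_done {
            first_done = first.as_mut().poll(&mut cx).is_ready();
        }
        if !second_done {
            second_done = second.as_mut().poll(&mut cx).is_ready();
        }
    }
    state.get()
}
```

With this polling order both tasks read 0 before either writes, so the function returns 1 rather than 2: a lost update on a single thread, with no data race involved.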

Now, if instead of the above we were to write, using some unsafe, the following:

fn main ()
{
    static mut STATE: i32 = 0;
    let task = || unsafe {
        let state = ::core::ptr::addr_of_mut!(STATE);
        // same as `*state += 1;`
        let value = *state; *state = value + 1;
    };
    ((), ()) = ::rayon::join(task, task); // UB!
}

whereby we are still featuring concurrency through join, but this time the single-threaded (and thus, parallelism-free) ::futures::executor::block_on(join(…)) has been replaced with the allowed-to-be-run-in-parallel ::rayon::join.
This means that while a thread is write-accessing *state in *state = …, the other thread may be accessing that same *state as well, either for reading or for writing. This is the textbook example of a data race, and it can lead to memory unsafety:

  • value is no longer guaranteed to be 0 or 1, it could be any arbitrary bit pattern (this is quite concerning given that value could have been typed, in Rust, as a bool, which makes observing a bit-pattern that is neither 0 nor 1 already UB), and the same applies to *state, i.e., to STATE.

Now, in order to simplify this example of UB, I have used a static mut, so that the closures are officially not capturing anything, and thus get to be Send, letting that example compile fine. But this is actually rather illustrating how dangerous static mut can be, even if I've taken the care of using addr_of_mut! rather than &mut to avoid other aliasing concerns that static mut has (and which would otherwise have been a source of UB before the data race even occurred).

But if we were to go back to my cell::Mut example, so as not to use unsafe:

use ::core::cell::Cell as Mut;

fn main ()
{
    let state: Mut<i32> = 0.into();
    let task = || {
        let value = state.get();
        state.set(value + 1);
    };
    ((..) , (..)) = ::rayon::join(task, task); // UB!
}
  • (modulo the stack vs. global storage distinction for state, this snippet would have the exact same semantics w.r.t. what the tasks are doing)

this fails to compile :relieved:, with:

error[E0277]: `Cell<i32>` cannot be shared between threads safely
  --> src/main.rs:10:21
   |
10 |     ((..) , (..)) = ::rayon::join(task, task); // UB!
   |                     ^^^^^^^^^^^^^ `Cell<i32>` cannot be shared between threads safely
   |
   = help: the trait `Sync` is not implemented for `Cell<i32>`
   = note: required because of the requirements on the impl of `Send` for `&Cell<i32>`
   = note: required because it appears within the type `[closure@src/main.rs:6:16: 9:6]`
note: required by a bound in `rayon::join`

So there we have it. A comparison of single-threaded and parallel concurrency w.r.t. incrementing a shared counter, which is as basic as a shared data structure can get :slight_smile:

We can see how Cell<i32> is a concurrently-mutable integer[2], which is nevertheless (memory-)unsafe to mutate in parallel.
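For contrast, a sketch of a counter that is fine to bump in parallel. It swaps `Cell<i32>` for `AtomicI32` and `::rayon::join` for `std::thread::scope`, purely so that it needs no external crate; `parallel_increment` is a made-up name.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

/// Runs the same task on two threads and returns the final counter value.
pub fn parallel_increment() -> i32 {
    let state = AtomicI32::new(0);
    let task = || {
        // One indivisible read-modify-write instead of a separate get and set.
        state.fetch_add(1, Ordering::Relaxed);
    };
    thread::scope(|s| {
        // The closure only captures `&AtomicI32`, so it is `Copy` and `Send`.
        s.spawn(task);
        s.spawn(task);
    });
    // Both threads have been joined by the time `scope` returns.
    state.load(Ordering::Relaxed)
}
```

This compiles because `AtomicI32` is Sync, and the result is always 2 because the increment cannot be split by the other thread.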


  1. basically, whether the second call to state.get() happens before the first call to state.set() ↩︎

  2. to put things into perspective, i32 alone, on the other hand, would not be a concurrently-mutable integer, but a sequentially-mutable one, since it requires exclusive/unique references (&mut) in order for the mutation to happen. ↩︎

4 Likes

To me, it feels like the problem is that the most intuitive term is already taken by the thing you're trying to contrast against.

Safe Rust (the language itself, rather than individual APIs) is thread safe in the same way that Java and Python are memory safe.

Maybe one could say Rust "has pervasive thread safety"?

1 Like

That's why I think we should refrain from saying things like "!Send + !Sync = not thread safe". We can't provoke UB by passing values of these types (or references to them, respectively) to other threads in safe Rust. Instead, we get a compiler error. Everything is safe. Always. (Unless we type unsafe.)

Send and Sync are not about safety. They indicate whether we are able to pass values (or shared references to these values, respectively) to other threads.

So it's more about "thread ability" than "thread safety".

3 Likes

The problem is that now you're drawing a terminological distinction that I don't believe exists.

It's not "memory ability" just because Python and Java refuse to let you take a pointer to something. It's "pervasive memory safety" or "compiler-enforced memory safety" or some similar term.

We could do something similar in reverse and talk about whether "type safety" is the wrong term to use in languages with strict type systems so long as at least one macro assembler exists which is to pervasive type safety as mutexes are to thread safety under C.

Safe Rust blocks you from calling std::mem::transmute at compile time, so does that mean we should be calling it "type ability" rather than "type safety" in safe Rust?

3 Likes

Good point, but I'm not sure that comparison holds, because we normally don't use the term "thread" when we have a single-threaded program, but we use the term "memory" for both manual and automatic (safe) memory management. Python and C as well as C64-Basic, Lua, or JavaScript all use "memory", but C64-Basic, Lua, or JavaScript don't "use threads", I would say? But I understand your point anyway.

Maybe it's best to say Send means a type is "send" and Sync means a type is "sync" :stuck_out_tongue_winking_eye:.

1 Like

Since the point here is bikeshedding, I don't feel too bad about suggesting:

  • Send + Sync: Thread shareable
  • Send + !Sync: Thread transferable
  • !Send + Sync: Thread referencable
  • !Send + !Sync: Thread local, crap. Uh. Thread locked? Unthreaded?
6 Likes

"Thread sensitive" is a great description for std::sync::MutexGuard.

4 Likes

Love it! Currently, my favorite terms are "Thread Portable" and "Thread Sensitive" to describe their respective concepts.

I want to specifically thank @Yandros for his awesome write-ups. I feel like these replies should be preserved as a blog post or as part of a guide somewhere. I'm hesitant to mark the solution however because I don't want to curtail the continuing discussion.

Thanks everyone!

3 Likes

I like "Thread sensitive" just in abstract, but I'm not sure it sells to me exactly what it means? To me it sounds most like any !Send, Sync or not.

Also, you've quoted !Send + !Sync but provided the example of MutexGuard, which I'm led to believe is !Send + Sync. I'm a little confused.

Ah, oops. MutexGuard is only thread-sensitive on drop. If the &self functions of a type inspect the thread identity, then that would make it !Send.

My understanding is you can pass a ref to a MutexGuard across threads (presumably to prove you definitely have exclusive access by some chain-of-custody), so it's Sync, but the implementation itself assumes that the unlock is on the same thread as the lock, so it's !Send.

Honestly, @alice is probably most accurate here in that you don't really want to be messing around with short-hand for mixed Send and Sync anyway, they can be pretty confusing.

Fortunately, in my experience "Send" and "Sync" on the usage side is pretty straightforward. Use Send if you're sending it, use Sync if you need it to handle unsynchronized access.
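On the usage side, that rule of thumb could look like the following sketch. Both helper names are hypothetical, and the bounds are simply what `std::thread::spawn` and `std::thread::scope` end up requiring here.

```rust
use std::thread;

/// The value moves to another thread, so the signature asks for `T: Send`.
pub fn process_elsewhere<T, R, F>(value: T, work: F) -> R
where
    T: Send + 'static,
    R: Send + 'static,
    F: FnOnce(T) -> R + Send + 'static,
{
    thread::spawn(move || work(value)).join().unwrap()
}

/// Two threads look at the value at once, so the signature asks for `T: Sync`.
pub fn inspect_in_parallel<T, F>(value: &T, inspect: F) -> (usize, usize)
where
    T: Sync,
    F: Fn(&T) -> usize + Sync,
{
    thread::scope(|s| {
        let a = s.spawn(|| inspect(value));
        let b = s.spawn(|| inspect(value));
        (a.join().unwrap(), b.join().unwrap())
    })
}
```

For instance, `process_elsewhere(vec![1, 2, 3], |v| v.len())` gives 3, and `inspect_in_parallel(&String::from("abc"), |s| s.len())` gives `(3, 3)`.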

One of my problems with the name of Send (and suggestions like "thread transferable") is that Send isn't really about the validity of moves. It has more to do with when mutable access is safe.

There are also many situations where it's unclear which thread owns the value in the first place, e.g. what about values stored in global variables?
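The global-variable case can be sketched like this. `LOG` and `record_from_threads` are made-up names; the relevant fact is that a plain `static` has to be Sync, and no thread ever "receives" it.

```rust
use std::sync::Mutex;
use std::thread;

/// No thread owns this value; every thread can name it directly.
static LOG: Mutex<Vec<u32>> = Mutex::new(Vec::new());

/// Spawns `n` threads that each push their id, then returns the sorted log.
pub fn record_from_threads(n: u32) -> Vec<u32> {
    let handles: Vec<_> = (0..n)
        .map(|id| {
            thread::spawn(move || {
                LOG.lock().unwrap().push(id);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Take the contents out so the global is empty again afterwards.
    let mut seen = std::mem::take(&mut *LOG.lock().unwrap());
    seen.sort_unstable();
    seen
}
```

Nothing is ever moved between threads here, yet `Vec<u32>` still has to be Send for `Mutex<Vec<u32>>` to be Sync and thus usable as a `static`.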

5 Likes