What is Send and Sync exactly?

Hi folks, you're the best community I have ever seen. I asked my question in the title, but let me explain what exactly I'm looking for.

1- There is nothing in the source code of these traits. What is inside the Send and Sync traits?

pub unsafe auto trait Send {
    // empty.
}

pub unsafe auto trait Sync {
    // FIXME(estebank): once support to add notes in `rustc_on_unimplemented`
    // lands in beta, and it has been extended to check whether a closure is
    // anywhere in the requirement chain, extend it as such (#48534):
    // ```
    // on(
    //     closure,
    //     note="`{Self}` cannot be shared safely, consider marking the closure `move`"
    // ),
    // ```

    // Empty
}

2- I can't use the derive macro for Send and Sync, but I can write impl blocks for them. Probably the derive macro can't generate unsafe impls. This macro's source code is empty too. What exactly does this macro do? Why can't it create unsafe implementations?

    pub macro derive($item:item) {
        /* compiler built-in */
    }

....

// I can't do that:
#[derive(Debug, Send, Sync)]
pub struct SendSyncExample {
    pub counter: Arc<Mutex<i32>>,
}

// but I can do that:
#[derive(Debug)]
pub struct SendSyncExample {
    pub counter: Arc<Mutex<i32>>,
}

unsafe impl Send for SendSyncExample {}
unsafe impl Sync for SendSyncExample {}

Thanks to all brothers.


Nothing. They are so-called "marker traits": they don't have any methods, but serve as a promise from the implementor that the type they're implemented on satisfies the corresponding properties.

The technical reason is not the unsafety, actually: derive has an explicitly limited list of traits it can generate (Display, for example, cannot be derived, yet it is perfectly safe). The reason Send and Sync are not in that list is indeed that they are unsafe; that is, they make a soundness promise, which must be checked by the implementor and explicitly marked as such.


What do you mean by "implementor"? Who is it? The Rust developer, or whoever develops the compiler?

The creator of the data structure that implements the traits.

You probably didn't understand what I'm asking. I already gave an example of implementing these traits: unsafe impl Send for SendSyncExample {}. I'm asking what is going on on the compiler side.

Nothing different from any other unsafe trait. And, aside from the check for the unsafe keyword, nothing different from any other trait.

The important part is that thread::spawn() and similar APIs require Send for their parameters. This then cascades into the types your closure captures. A channel, for example, then requires Send for the values transmitted through it.

But the only thing the compiler knows is that thread::spawn() requires Send, because the signature says so.
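To illustrate how the bound in the signature does all the work, here is a minimal sketch. `require_send` and `spawn_and_sum` are hypothetical names made up for this example; the only real API involved is `std::thread::spawn`, whose signature requires the closure and its return value to be `Send + 'static`.

```rust
use std::thread;

// Hypothetical helper: compiles only for types that are `Send`.
// The body is empty; the trait bound is the whole check.
fn require_send<T: Send>(_value: &T) {}

// Hypothetical helper: moves a Vec into a new thread and sums it there.
fn spawn_and_sum(values: Vec<i32>) -> i32 {
    require_send(&values); // Vec<i32> is Send, so this compiles.
    // The closure captures `values`, so the closure is Send only
    // because Vec<i32> is Send.
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("sum = {}", spawn_and_sum(vec![1, 2, 3]));

    // Swapping the Vec for an Rc makes the program fail to compile,
    // because Rc<i32> is not Send:
    // let rc = std::rc::Rc::new(1);
    // thread::spawn(move || println!("{rc}"));
}
```

Nothing in `require_send` runs any check at runtime; the rejection of the `Rc` version happens entirely during type checking.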


The only special thing about the Send and Sync traits is that they're auto traits. Send and Sync are automatically implemented on a struct type or enum type if all of its fields' types also implement Send/Sync. In addition to these automatic implementations, you can also manually add more implementations on your own, like a normal trait. (The fact that Send and Sync are unsafe traits is not special. You can define your own unsafe traits.)
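A small sketch of both points, under the assumption that all names here (`is_send`, `is_sync`, `Plain`, `Nested`, `ZeroIsValid`, `zero_is_valid`) are hypothetical and exist only for this illustration: the structs get Send and Sync without any impl being written, and a user-defined unsafe trait works like any other trait apart from the keyword.

```rust
// Hypothetical helpers: each compiles only if `T` satisfies the bound.
fn is_send<T: Send>() -> bool { true }
fn is_sync<T: Sync>() -> bool { true }

// Every field is Send + Sync, so the compiler implements both
// traits for this struct automatically.
#[allow(dead_code)]
struct Plain {
    id: u32,
    name: String,
}

// The rule is applied recursively: Plain and Vec<String> are both
// Send + Sync, so Nested is too.
#[allow(dead_code)]
struct Nested {
    inner: Plain,
    tags: Vec<String>,
}

// A hypothetical user-defined unsafe marker trait. The implementor
// promises a property the compiler cannot check.
unsafe trait ZeroIsValid {}

// SAFETY: the all-zero bit pattern is a valid u32 (the value 0).
unsafe impl ZeroIsValid for u32 {}

fn zero_is_valid<T: ZeroIsValid>() -> bool { true }

fn main() {
    println!("Plain: Send = {}, Sync = {}", is_send::<Plain>(), is_sync::<Plain>());
    println!("Nested: Send = {}, Sync = {}", is_send::<Nested>(), is_sync::<Nested>());
    println!("u32: ZeroIsValid = {}", zero_is_valid::<u32>());
}
```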

Implementing these traits does absolutely nothing on its own. However, some functions will only let you call them with types that implement Send and/or Sync. That's all Send and Sync do.


It doesn't do anything actively. It doesn't add any code. These don't exist in any shape or form in the running program.

It tells the compiler "you can allow usage of this struct from other threads". Without this, users of multi-threaded code would get a compilation error if they tried to use this struct.

Rust uses Send/Sync as a way to track (at compile time) what data is permitted to be used in multi-threaded code, and under what circumstances. Types that don't have these markers are considered non-thread-safe, and only allowed in single-threaded code.

This being a universal marker allows checking a huge amount of code across different libraries, and automatically ensuring that it all works safely together.

Most of the time the compiler can automatically guess whether a struct is Send or Sync, but sometimes the programmer has to manually override that (e.g. when implementing a Mutex).
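As a rough sketch of such a manual override, here is a toy lock-protected counter. `SpinCounter` and `count_with_threads` are hypothetical names invented for this example, and this is not how `std::sync::Mutex` is implemented; it only shows the shape of the situation. The `UnsafeCell` field makes the type non-Sync by default, and the `unsafe impl` is the programmer's promise that the locking makes shared access sound.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical toy type: a counter guarded by a spin lock.
struct SpinCounter {
    locked: AtomicBool,
    value: UnsafeCell<u64>, // UnsafeCell is not Sync, so neither is the struct.
}

// SAFETY: `value` is only read or written while `locked` is held,
// so two threads never access it at the same time.
unsafe impl Sync for SpinCounter {}

impl SpinCounter {
    fn new() -> Self {
        SpinCounter { locked: AtomicBool::new(false), value: UnsafeCell::new(0) }
    }

    // Spin until the flag flips from false to true.
    fn lock(&self) {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }

    fn increment(&self) {
        self.lock();
        // SAFETY: we hold the lock, so we have exclusive access.
        unsafe { *self.value.get() += 1 };
        self.unlock();
    }

    fn get(&self) -> u64 {
        self.lock();
        // SAFETY: we hold the lock, so we have exclusive access.
        let current = unsafe { *self.value.get() };
        self.unlock();
        current
    }
}

// Hypothetical driver: several threads increment one shared counter.
fn count_with_threads(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(SpinCounter::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    shared.increment();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.get()
}

fn main() {
    println!("total = {}", count_with_threads(4, 1000));
}
```

Removing the `unsafe impl Sync` line makes `thread::spawn` reject the closure, because `Arc<SpinCounter>` is only Send when `SpinCounter` is both Send and Sync.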


So the compiler is probably adding the necessary assembly code to the executable itself, right?

I would say it's exactly the opposite. These are marker traits, and since they don't have any effect at runtime, I'm pretty sure they don't end up in the compiled binary.


No. Send and Sync translate to zero assembly instructions.
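One way to get a feel for this, short of reading the generated assembly, is to compare two otherwise identical types, one with the marker impls and one without. The sketch below uses hypothetical types `Unmarked` and `Marked` and a hypothetical helper `layouts_match`; it only shows that the impls add no data to the type, which is consistent with (but does not by itself prove) the claim about generated code.

```rust
use std::mem::{align_of, size_of};

// Hypothetical type: a raw pointer field makes it neither Send nor Sync.
#[allow(dead_code)]
struct Unmarked {
    ptr: *const u8,
}

// Same field, but with the marker traits implemented by hand.
#[allow(dead_code)]
struct Marked {
    ptr: *const u8,
}

// SAFETY (illustration only): this type exposes no way to dereference
// the pointer, so moving or sharing it across threads cannot race.
unsafe impl Send for Marked {}
unsafe impl Sync for Marked {}

// The impls are pure compile-time facts: size and alignment are unchanged.
fn layouts_match() -> bool {
    size_of::<Marked>() == size_of::<Unmarked>()
        && align_of::<Marked>() == align_of::<Unmarked>()
}

fn main() {
    println!("Unmarked: {} bytes", size_of::<Unmarked>());
    println!("Marked:   {} bytes", size_of::<Marked>());
    println!("layouts match: {}", layouts_match());
}
```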

Are you suggesting that I study how to write multithreaded code in assembly, to better understand the relation between multithreading and the Send/Sync traits?

Disclaimer: I haven't yet grokked the Send vs Sync rules fully, so I'm lumping them together into one bucket for this illustration.

  1. Plain old data types are Send and Sync. Any structs that contain only plain old data are also Send and Sync, because all of the contained types are Send and Sync.

  2. As a layer on top of this, there are some types (UnsafeCell, and I think others that use thread local storage?) that explicitly do not implement Send and/or Sync because they are not safe to send or use across thread boundaries. Including these types in your struct will mean your struct no longer implements Send and/or Sync by the first rule (no longer plain old data).

  3. As another layer, some types unsafe impl one or both of the traits, because they contain a non-Send or non-Sync type from (2) above, but they add extra mechanisms on top of their type to ensure that it is safe to use across threads.

Generally, any type that uses thread-local storage is not going to implement Send. This is why you can't send an Rc across threads.
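The three layers above can be checked with standard library types. This is only a sketch: `is_send` and `is_sync` are hypothetical helpers, and the choice of `Cell` for layer 2 and `Mutex` for layer 3 is mine, not the poster's. `Cell<u32>` is Send but not Sync, and `Mutex<T>` is Sync whenever `T` is Send, so wrapping the `Cell` in a `Mutex` restores Sync.

```rust
use std::cell::Cell;
use std::sync::Mutex;

// Hypothetical helpers: each compiles only if `T` satisfies the bound.
fn is_send<T: Send>() -> bool { true }
fn is_sync<T: Sync>() -> bool { true }

fn main() {
    // Layer 1: plain old data is both Send and Sync.
    println!("u32: Send = {}, Sync = {}", is_send::<u32>(), is_sync::<u32>());

    // Layer 2: Cell allows mutation through a shared reference without
    // any synchronization, so it is Send but deliberately not Sync.
    println!("Cell<u32>: Send = {}", is_send::<Cell<u32>>());
    // is_sync::<Cell<u32>>(); // does not compile: Cell<u32> is not Sync

    // Layer 3: Mutex adds locking on top, and the standard library
    // marks Mutex<T> as Sync for any T that is Send.
    println!("Mutex<Cell<u32>>: Sync = {}", is_sync::<Mutex<Cell<u32>>>());
}
```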


No, the opposite. You will not see them in assembly, because they don't exist there. They are an abstract theoretical concept, not real code.

Send/Sync, as well as lifetimes and unsafe{} blocks do not alter the program in any way. They only decide what programs are allowed to exist (i.e. invalid programs breaking these checks can't exist).


Only if you're kind of person that would enjoy eating a bowl of soup with a toothpick.

Assembly is a human-readable spelling of machine code: a codified set of 0's and 1's which instructs the hardware itself to copy other sets of 0's and 1's from one location in memory to others. That's all it can and all it's allowed to do. Analogy: the individual nerve signals coordinating the whole of your experience as a human being, from thinking and feeling down to moving your limbs and scratching your toes.

Machine code is (surprisingly) machine-specific. Projects like LLVM do what they can/want to abstract over the differences of individual hardware in order to allow you to write one set of instructions and have the toolchain itself compile it into raw asm code for any given architecture.

example
define i32 @add1(i32 %a, i32 %b) {
entry:
  %tmp1 = add i32 %a, %b
  ret i32 %tmp1
}

define i32 @add2(i32 %a, i32 %b) {
entry:
  %tmp1 = icmp eq i32 %a, 0
  br i1 %tmp1, label %done, label %recurse

recurse:
  %tmp2 = sub i32 %a, 1
  %tmp3 = add i32 %b, 1
  %tmp4 = call i32 @add2(i32 %tmp2, i32 %tmp3)
  ret i32 %tmp4

done:
  ret i32 %b
}

// in C 

unsigned add1(unsigned a, unsigned b) {
  return a+b;
}

// Perhaps not the most efficient way to add two numbers.
unsigned add2(unsigned a, unsigned b) {
  if (a == 0) return b;
  return add2(a-1, b+1);
}

// in Rust

fn add1(a: u32, b: u32) -> u32 {
    a + b
}
// Perhaps not the most efficient way to add two numbers.
fn add2(a: u32, b: u32) -> u32 {
    if a == 0 {
        return b;
    }
    add2(a - 1, b + 1)
}

// taken from: https://aosabook.org/en/v1/llvm.html

Different programming languages can either reinvent the wheel from scratch in order to compile into raw asm instructions themselves, or compile / transpile / engineer their own infrastructure around pre-existing toolchains like GCC or LLVM itself. What's left is (absolutely) trivial: just some casual lexing and parsing and type checking and dependency management and testing and docs. Some meta-programming capability with declarative and/or procedural macros is a plus.

Maybe tracking individual variable lifetime/s with respect to the memory they own vs share vs have an exclusive access to; or enforcing sensible thread-safety requirements in order to prevent some of the worst bugs you might come across. Send and Sync are quite good at that last part.

unsafe impl is an impl for a trait explicitly marked as unsafe. In the context of Send and Sync markers: some types might be safe to hand over / Send or share a reference to / Sync with another thread even though some of the data they hold doesn't seem safe to Send or Sync.

For such cases: you are allowed to unsafe impl Send/Sync for T {} on your own. The unsafe portion is there to tell the compiler "I know what I'm doing here". It might not trust you otherwise.
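A minimal sketch of such a case, with every name (`OwnedPtr`, `read_on_other_thread`) made up for the illustration: the struct holds a raw pointer, which is not Send, yet it is the unique owner of the allocation (much like `Box<i32>`), so handing it to another thread is fine. The compiler cannot see that from the pointer type alone, hence the explicit `unsafe impl`.

```rust
use std::thread;

// Hypothetical type: uniquely owns one heap-allocated i32 via a raw pointer.
struct OwnedPtr {
    ptr: *mut i32,
}

impl OwnedPtr {
    fn new(value: i32) -> Self {
        OwnedPtr { ptr: Box::into_raw(Box::new(value)) }
    }

    fn get(&self) -> i32 {
        // SAFETY: `ptr` came from Box::into_raw and is freed only in Drop.
        unsafe { *self.ptr }
    }
}

impl Drop for OwnedPtr {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from Box::into_raw and is dropped exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) };
    }
}

// SAFETY: OwnedPtr is the only owner of the allocation and never hands
// out copies of the pointer, so moving it to another thread is sound.
unsafe impl Send for OwnedPtr {}

// Hypothetical helper: moves the value into a new thread and reads it there.
fn read_on_other_thread(value: i32) -> i32 {
    let owned = OwnedPtr::new(value);
    thread::spawn(move || owned.get())
        .join()
        .expect("worker thread panicked")
}

fn main() {
    println!("read {}", read_on_other_thread(7));
}
```

Without the `unsafe impl Send` line, `thread::spawn` rejects the closure, because `*mut i32` is not Send.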

Macros are a whole different topic. /* compiler built-in */ means exactly what it says there. You can implement your own #[derive(SomeTrait)] logic as necessary for most types. Some of the traits come built-in with the std and it made more sense to bake them right into rustc.


#[derive] is not a macro that makes trait implementations. Rather, it is a built-in macro that calls other macros — derive macros. Each derive macro produces a specific trait implementation — for example, there is a derive macro for PartialEq that generates PartialEq implementations. The reason that #[derive(Send)] doesn't work is that there is no derive macro named Send. The reason there is no derive macro named Send is that the auto trait mechanism serves the role that a derive macro usually would, implementing Send for most types automatically in a systematic fashion.
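Applied to the struct from the original question, this means neither a derive nor a manual `unsafe impl` is needed: `Arc<Mutex<i32>>` is already Send and Sync, so the auto trait mechanism covers `SendSyncExample`. The sketch below reuses that struct; `require_send_sync` and `bump_from_threads` are hypothetical helpers added for the demonstration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// The struct from the question, with no Send/Sync impls written by hand.
#[derive(Debug)]
pub struct SendSyncExample {
    pub counter: Arc<Mutex<i32>>,
}

// Hypothetical helper: compiles only if `T` is both Send and Sync.
fn require_send_sync<T: Send + Sync>(_value: &T) {}

// Hypothetical helper: `n` threads each add 1 to the shared counter.
fn bump_from_threads(n: usize) -> i32 {
    let example = SendSyncExample { counter: Arc::new(Mutex::new(0)) };

    // Compiles, which shows the auto impls are already in place.
    require_send_sync(&example);

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&example.counter);
            thread::spawn(move || {
                *counter.lock().expect("mutex poisoned") += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    let total = *example.counter.lock().expect("mutex poisoned");
    total
}

fn main() {
    println!("counter = {}", bump_from_threads(3));
}
```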

Derive macros producing unsafe impls is rare, but does happen sometimes; for example, bytemuck provides such derive macros to implement its unsafe traits.


The raw pointer types *mut T and *const T (and similar types like NonNull) are also neither Send nor Sync, because they carry no guarantees about access to what is behind the pointer. On the other hand, LocalKey is Send, because it is just a handle that accesses the local storage of whatever thread it happens to be on, and the value is simply initialised afresh for each thread (see an example here).
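The per-thread behaviour of a thread-local can be seen in a few lines. This is a sketch with hypothetical names (`COUNTER`, `counter_seen_by_threads`): the parent thread changes its copy, and a freshly spawned thread still sees the initial value because it gets its own copy.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Each thread lazily gets its own COUNTER, starting at 0.
    static COUNTER: Cell<u32> = Cell::new(0);
}

// Hypothetical helper: returns (value seen by parent, value seen by child).
fn counter_seen_by_threads() -> (u32, u32) {
    // Change the calling thread's copy.
    COUNTER.with(|c| c.set(5));

    // The new thread accesses the same key but its own storage.
    let in_child = thread::spawn(|| COUNTER.with(|c| c.get()))
        .join()
        .expect("worker thread panicked");

    let in_parent = COUNTER.with(|c| c.get());
    (in_parent, in_child)
}

fn main() {
    let (parent, child) = counter_seen_by_threads();
    println!("parent sees {parent}, child sees {child}");
}
```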

The reason Rc isn't Send isn't that it uses thread-local storage as such; it is that when it is cloned or dropped, the reference count is updated with a non-atomic operation, which is unsound if two threads do it at once. Since all clones of an Rc point to the same allocation and share one reference count, that precludes it from being Send.
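For comparison, here is a sketch using `Arc`, which updates its reference count atomically and is therefore Send (when its contents are Send + Sync). The function name `share_across_threads` is hypothetical. Replacing `Arc` with `Rc` in this code makes it fail to compile at the `thread::spawn` call.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: `n` threads each read the length of a shared Vec.
fn share_across_threads(n: usize) -> usize {
    let data = Arc::new(vec![1, 2, 3]);

    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Atomic increment of the shared reference count.
            let local = Arc::clone(&data);
            // `local` is dropped on the worker thread (atomic decrement).
            thread::spawn(move || local.len())
        })
        .collect();

    let total: usize = handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .sum();
    total
}

fn main() {
    println!("total length seen = {}", share_across_threads(4));

    // With Rc the same pattern is rejected at compile time:
    // let data = std::rc::Rc::new(vec![1, 2, 3]);
    // thread::spawn(move || data.len()); // error: Rc<Vec<i32>> is not Send
}
```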

From the discussion it starts looking more and more like you are making the exact same mistake that the “we code for the hardware” people (mostly C, although sometimes C++, developers) make: assuming that assembler and machine code matter, and that everything else is of little importance.

The end result: compilers destroy their “perfectly valid programs”, and then they publish blog posts and petitions, throw temper tantrums, and do various other silly things in an attempt to extract what they perceive as their inalienable right: to get, from the compiler makers, some description of how to predict the way an arbitrary program would behave after compilation.

The sad truth that their petitions, demands, and blog posts ignore is the following: what they want simply couldn't exist.

Not as in “it would require superintelligence, and if we built a few terawatts of datacenters then a super AI might do it”, but as in “no matter how many resources we throw at the problem… it would still be unsolved”.

It's simply mathematically impossible to reliably get an answer to the simple question “does that machine code work as the source code of the Rust program intended or not”.

And that is where the dreaded “undefined behavior” springs from: if it's flat out impossible to create a compiler that both accepts all “good” programs and rejects all “bad” programs (note that I haven't specified what are “good” programs and what are “bad” programs… I don't need to, the impossibility would happen with any non-trivial definition of “good” and “bad” programs), then we have only one choice left: accept the fact that some programs that the compiler accepts… would be miscompiled.

And these traits, Send and Sync, are part of the solution to that sad dilemma. Note that I highlighted “all” in the previous paragraph. It's very important. If we are ready to reject some “good” programs, or, alternatively, accept some “bad” programs… either accept a bit of chaff or lose a bit of wheat… then the hard block on the mathematically proven “impossible” path disappears.

And that's where Send and Sync traits become important. These traits don't actually do anything. They just mark certain types as “good” (in two different senses).

And, most of the time, the compiler can make that decision without your help.

But sometimes the compiler cannot be sure that certain types are “good”. Then you need to verify certain properties of the types involved and implement these traits manually.

This happens rarely but it does happen.

You can't derive them because, most of the time, the compiler provides implementations of these traits for you without you doing anything; there is no need to derive.

And in cases where you need to implement them manually, you first need to understand what requirements apply to types that are Send or Sync, then understand why the compiler hasn't implemented these traits automatically, and then explain, to yourself and to the reviewers, why you think your type is actually “good”… and only then should you implement these traits manually.

But because these traits are used only to separate “good” programs, which should be accepted and translated to machine code, from “bad” programs, which should be rejected… nothing changes in the compiled program whether or not you implement these traits.

They are only there to better separate chaff from wheat…
