*mut/*const and proper Send/Sync on stable rust


#1

Hi folks,

I’m working on a Rust binding for a C library (https://github.com/daschl/hwloc-rs), and while the single-threaded version works fine, I’m now trying to make it work in multithreaded contexts.

I have a struct which is my main entry point and it contains the main references to the underlying c structures (https://github.com/daschl/hwloc-rs/blob/master/src/lib.rs#L25):

pub struct Topology {
    topo: *mut ffi::HwlocTopology,
    support: *const TopologySupport,
}

I know I need to implement Send and Sync like this:

unsafe impl Send for Topology {}
unsafe impl Sync for Topology {}

But then of course the question comes up how to actually synchronize access to *mut and *const. I know there are wrappers like Unique, but they don’t work on stable rust right now and this is what I want to target.

So my next thoughts were: Mutex? RwLock? But if I understood it correctly they all need to be instantiated to wrap the type, but how can I wrap a *mut or *const? As the pointers indicate, this TopologySupport thing is immutable, there is no way to change it. Do I need to synchronize access to it at all? My *mut HwlocTopology definitely needs to be synchronized.

I tried looking in the std collections, but that didn’t help much: I saw lots of Unique wrappers there, and I have no idea why they work in the stdlib on stable rust when I can’t use them. :wink:


#2

The stdlib is special in this regard. It can be stable but use unstable APIs, because breakages will be fixed at the same time as the introduction of the breaking changes. Stability guarantees are purely for user crates: if you use only stable APIs, your crate is 99% guaranteed to build on all future versions of Rust < 2.0, so you won’t be forced to go back and fix it later on.

As for requiring synchronization, it’s easy as long as you verify the FFI types you’re using are thread-safe. On your Topology struct, make all methods that modify topo take &mut self. This will make sure that there’s always a unique reference before topo can be modified. Then Topology can go into a regular old Mutex or RwLock.
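
A minimal sketch of that pattern, under stated assumptions: RawTopology is a hypothetical stand-in for the real C structure (allocated with Box here so the example is self-contained), bump and count are made-up methods, and the Send impl is only sound if the C library really can be called from any thread.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the C structure; the real one lives behind FFI.
struct RawTopology {
    counter: u32,
}

pub struct Topology {
    topo: *mut RawTopology,
}

// Assumption: the underlying library does not tie the handle to one thread.
unsafe impl Send for Topology {}

impl Topology {
    pub fn new() -> Topology {
        let raw = Box::into_raw(Box::new(RawTopology { counter: 0 }));
        Topology { topo: raw }
    }

    // Mutation goes through &mut self, so the compiler guarantees that
    // no other reference to this Topology exists during the call.
    pub fn bump(&mut self) {
        unsafe { (*self.topo).counter += 1 }
    }

    // Read-only access only needs &self.
    pub fn count(&self) -> u32 {
        unsafe { (*self.topo).counter }
    }
}

impl Drop for Topology {
    fn drop(&mut self) {
        // Reclaim the allocation made in new().
        unsafe { drop(Box::from_raw(self.topo)) }
    }
}

// The user, not the binding, picks the lock: here a plain Mutex in an Arc.
pub fn bump_from_threads(n: u32) -> u32 {
    let shared = Arc::new(Mutex::new(Topology::new()));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || shared.lock().unwrap().bump())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = shared.lock().unwrap().count();
    total
}
```

Note that only Send is needed here: Mutex<T> is Sync whenever T is Send, so the wrapper itself supplies the sharing guarantee.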


#3

Make that 99.99%. We have crater for that. https://github.com/brson/taskcluster-crater


#4

NB. Unique and Shared do not actually do any synchronisation: in this respect, they are assertions about the semantics of the type they contain, abbreviating various pieces of boilerplate (such as the Send and Sync implementations).

They’re relying on the sharing/mutation implications of their own definitions, and the “definitions” of the concurrency traits:

  • Send is for types that can be safely moved between threads (basically, as long as the C API doesn’t require that functions are only called on a single thread, and that you control sharing, this should be fine without synchronisation)
  • Sync is for types that can be safely used in a shared fashion (that is, via &T pointers) from multiple threads concurrently

In particular, if you ensure that data can only be mutated via &mut pointers and don’t have any internal sharing, then you won’t need any synchronization.
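
As a rough illustration of the no-synchronisation case, here is a sketch of a read-only handle shared across threads with nothing but an Arc. RawSupport and its cpubind field are hypothetical stand-ins for the immutable C data, and the Send/Sync impls assume that nothing (including the C side) ever writes to it after construction.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for an immutable C structure.
struct RawSupport {
    cpubind: bool,
}

pub struct Support {
    raw: *const RawSupport,
}

// Assumption: the data behind `raw` is never mutated after construction,
// so concurrent reads through &Support cannot race.
unsafe impl Send for Support {}
unsafe impl Sync for Support {}

impl Support {
    pub fn new(cpubind: bool) -> Support {
        let raw = Box::into_raw(Box::new(RawSupport { cpubind }));
        Support { raw: raw as *const RawSupport }
    }

    // Only &self methods exist, and none of them write.
    pub fn cpubind(&self) -> bool {
        unsafe { (*self.raw).cpubind }
    }
}

impl Drop for Support {
    fn drop(&mut self) {
        // Reclaim the allocation made in new().
        unsafe { drop(Box::from_raw(self.raw as *mut RawSupport)) }
    }
}

// Every thread reads through a shared reference; no Mutex is involved.
pub fn read_from_threads(n: usize) -> usize {
    let shared = Arc::new(Support::new(true));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || shared.cpubind())
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .filter(|&seen| seen)
        .count()
}
```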

If a type is Send but not Sync it is more usual to not do any synchronisation and let users do it themselves, by, e.g. wrapping it in a Mutex. This avoids the cost of synchronisation when it’s not needed (single threaded, or just using message passing), and allows users to choose the appropriate level of coarse/fine-grained synchronisation for their application.
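
For the message-passing case, a sketch along these lines shows a Send-but-not-Sync handle moving to a worker thread and back with no lock at all. Handle, RawHandle, set and get are hypothetical names for illustration, and the Send impl again assumes the C side has no thread affinity.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for a C structure owned through a raw pointer.
struct RawHandle {
    value: u64,
}

pub struct Handle {
    raw: *mut RawHandle,
}

// Send only: the handle may change threads, but is never shared between them.
unsafe impl Send for Handle {}

impl Handle {
    pub fn new(value: u64) -> Handle {
        Handle { raw: Box::into_raw(Box::new(RawHandle { value })) }
    }

    pub fn set(&mut self, value: u64) {
        unsafe { (*self.raw).value = value }
    }

    pub fn get(&self) -> u64 {
        unsafe { (*self.raw).value }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Reclaim the allocation made in new().
        unsafe { drop(Box::from_raw(self.raw)) }
    }
}

// Ownership moves into the worker and comes back over a channel,
// so exactly one thread can touch the handle at any time.
pub fn round_trip(start: u64) -> u64 {
    let (tx, rx) = mpsc::channel();
    let mut handle = Handle::new(start);
    let worker = thread::spawn(move || {
        let next = handle.get() + 1;
        handle.set(next);
        tx.send(handle).unwrap();
    });
    worker.join().unwrap();
    let handle: Handle = rx.recv().unwrap();
    handle.get()
}
```

Because the type is not Sync, trying to put it in a bare Arc and use it from two threads is rejected at compile time; a user who needs sharing has to add a Mutex or similar explicitly.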

(Eh, our stability story is good, but I think you’re being a little over-enthusiastic. That link shows a number much less than 99.99% and I’d definitely expect that there will be future bug-fixes that affect more than one line in ten thousand, or one crate in ten thousand, or whatever measure you choose.)


#5

Thanks everyone for your replies - so to rephrase that: as long as I only modify stuff behind a *mut pointer in methods that take &mut self, then all is well and the user just needs to place my Topology behind a Mutex?

So your second sentence is the right way to fulfil the first one?

Are there any examples when this does not hold true or are there any edge cases where I need to be careful? So for example what would happen if my library underneath changes some data without me knowing and then I read (potentially stale) data? I apologize if my questions don’t make much sense, but I’m trying to grasp the concepts and in this particular area the available documentation has not enlightened me enough :smile:


#6

Interesting, I did not know that - but it makes sense!


#7

For those who are interested, this is what I ended up with as an example of binding threads to cores: https://github.com/daschl/hwloc-rs/blob/master/examples/bind_threads.rs


#8

This is frankly too optimistic. We know crater can’t and doesn’t cover all rust code that people use and rely on. It’s a cool tool, but I’m on the side of always cautioning against relying on it as an oracle.