How do I - Rust + C lib + threads

Hello. I'm new to Rust. It fries my brain when I try to understand it. Here's some code I'd like to make properly idiomatic (and by extension, actually compile..) All feedback welcome, including "take off and nuke it from orbit" if it's really that bad.

Note: The code will not compile as is. It needs.. help..

#[allow(unused)]
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::collections::VecDeque;
use std::time::Duration;
#[allow(unused)]
use std::ptr;

// a wrapper for a pointer to a Thing allocated by a C library
struct LibThing<T>(*mut T);
// or possibly..?
//struct LibThing<T>(Mutex<*mut T>);

impl<T> LibThing<T> {
    pub fn new() -> Self {
        // call C lib to allocate a new Thing
        Self(ptr::null_mut())
    }
    
    // Thing should never be accessed by two threads at once
    // do we do this here, or externally using Mutex<LibThing> ?
    pub fn lock() {}
    
    // requires exclusive access for duration
    pub fn transform(&mut self) {
        // call C lib to mutate the Thing in some way
    }
    
    #[allow(unused)]
    // Source and Dest should be exclusively accessed by a single thread for duration
    pub fn copy_to(&self, dst: &mut LibThing<T>) {
        // call C lib to copy data from Thing A to Thing B
    }
}

// Thing can be safely freed by the thread that allocated it
impl<T> Drop for LibThing<T> {
    fn drop(&mut self) {
        // call C lib to free a Thing
    }
}

// !Send: it is not desirable for any other thread to free a Thing than the thread that allocated it
// Sync: a source Thing will be shared with multiple worker threads,
//  but each worker must have exclusive access while reading from it
//unsafe impl<T> Sync for LibThing<T> {}

fn main() {
    const max_threads: u8 = 10;
    let mut threads: VecDeque<JoinHandle<()>> = VecDeque::new();
    // simulate loading a source Thing from a file
    let source_thing = LibThing::<i8>::new();
    let mutex = Mutex::new(source_thing);
    let shared_mutex = &mutex;
    
    for i in 0..max_threads {
        let handle = thread::spawn(move || {
            println!("t{i} spawned");
            println!("t{i} get source Thing mutex");
            let guard = shared_mutex.lock().expect("mutex poisoned");
            println!("t{i} got mutex, simulate work..");
            let mut dst = LibThing::<i8>::new();
            guard.copy_to(&mut dst);
            thread::sleep(Duration::from_secs(1));
            println!("t{i} release mutex");
            //drop(guard);
        });
        threads.push_back(handle);
    }
    
    for i in threads.into_iter() {
        let _ = i.join();
    }
}

Do you mean that the C library in fact requires that the Thing is deallocated on the same thread that it was allocated, or just that you want to make sure there is no early freeing?

If that is required, then you cannot also use &mut LibThing for exclusive access, because &mut LibThing implies the ability to swap two LibThings, so sending a &mut LibThing is potentially equivalent to sending the LibThing. In this case, you have to either:

  • design the API so that &LibThing is sufficient, and you can then share the &LibThing with other threads, given unsafe impl Sync; or
  • stick to access from a single thread.

If deallocating on the same thread is not required, then LibThing can and should be Send.
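To make the point about &mut concrete, here is a minimal sketch. `Tagged` is a hypothetical stand-in that only records the thread it was created on; it is not the real LibThing. Handing a `&mut Tagged` to a scoped thread lets that thread swap in its own value, so the original ends up owned (and dropped) by the worker.

```rust
use std::thread::{self, ThreadId};

// Hypothetical stand-in: remembers which thread created it.
struct Tagged {
    created_on: ThreadId,
}

impl Tagged {
    fn new() -> Self {
        Tagged { created_on: thread::current().id() }
    }
}

// Returns true if the value the caller owns afterwards was created on another thread.
fn swap_across_threads() -> bool {
    let caller_id = thread::current().id();
    let mut mine = Tagged::new();
    thread::scope(|s| {
        let borrowed = &mut mine;
        s.spawn(move || {
            let mut theirs = Tagged::new();
            // After the swap, `theirs` holds the caller's original value...
            std::mem::swap(borrowed, &mut theirs);
            // ...and it is dropped here, on the worker thread.
        });
    });
    mine.created_on != caller_id
}
```

This only compiles because `Tagged` is Send; for a type that is not Send, the compiler rejects sending the `&mut` for exactly this reason.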

It's required by the C library that the Thing is freed on the same thread that allocated it.

When shared between threads, the only expectation is that Thing can be read/copied, not mutated.

The destination Thing (for copying) is always created in the worker thread. All mutations are done on the thread that created the Thing.

Then why put it in a mutex?

The C lib requires that only one thread reads the Thing at a time.

So even &Thing is dangerous if multiple threads try to read at the same time. I don't know why. It's the first time I'm using this lib but that's what it says.

This suggests something is being mutated in the lib when Thing is read, probably either Thing itself (maybe a counter of some sort) or some global state. In any case, you could make a type SharedLibThing(&LibThing) that implements Send and only allows copy_to with &mut self, then put that in a mutex. I tried to figure out a solution with ReentrantMutex or Exclusive but I couldn't quite make it work.
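A rough sketch of the SharedLibThing idea, under assumptions: `LibThing` here is a stand-in holding an `i32` instead of a C pointer, and the `unsafe impl Send` is only justified if the C library really does tolerate non-overlapping reads from other threads. The helper `copy_on_workers` is made up for the illustration.

```rust
use std::marker::PhantomData;
use std::sync::Mutex;
use std::thread;

// Stand-in for the C-backed type; the PhantomData makes it !Send and !Sync.
struct LibThing {
    value: i32, // pretend this lives behind the C pointer
    _not_send: PhantomData<*mut ()>,
}

impl LibThing {
    fn new(value: i32) -> Self {
        LibThing { value, _not_send: PhantomData }
    }

    fn copy_to(&self, dst: &mut LibThing) {
        dst.value = self.value;
    }
}

// Borrowed handle: it can cross threads but can never free the source.
struct SharedLibThing<'a>(&'a LibThing);

// SAFETY (assumption): the library allows reads from any thread as long as
// they never overlap. `copy_to` below takes `&mut self`, so a caller needs
// exclusive access to the handle, which the Mutex provides.
unsafe impl Send for SharedLibThing<'_> {}

impl SharedLibThing<'_> {
    fn copy_to(&mut self, dst: &mut LibThing) {
        self.0.copy_to(dst);
    }
}

// Hypothetical driver: each worker makes its own destination and copies into it.
fn copy_on_workers(workers: usize) -> Vec<i32> {
    let source = LibThing::new(42);
    let shared = Mutex::new(SharedLibThing(&source));
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|_| {
                s.spawn(|| {
                    let mut dst = LibThing::new(0); // created and freed on the worker
                    let mut guard = shared.lock().expect("mutex poisoned");
                    guard.copy_to(&mut dst);
                    drop(guard);
                    dst.value
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

The source itself never leaves the thread that created it, so it is still freed there.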

Another solution is to move the problem to runtime. On creation, store a ThreadId, and check it on drop. If the current thread is not the original thread, panic.
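The runtime check could look roughly like this. It is a sketch with a stand-in `LibThing` that holds no pointer at all; the `owner` field and the `on_owner_thread` helper are names invented for the example.

```rust
use std::thread::{self, ThreadId};

struct LibThing {
    owner: ThreadId, // thread that allocated the Thing
}

impl LibThing {
    fn new() -> Self {
        // the call that allocates the Thing in the C library would go here
        LibThing { owner: thread::current().id() }
    }

    fn on_owner_thread(&self) -> bool {
        thread::current().id() == self.owner
    }
}

impl Drop for LibThing {
    fn drop(&mut self) {
        if !self.on_owner_thread() {
            panic!("LibThing dropped on a different thread than the one that created it");
        }
        // the call that frees the Thing in the C library would go here
    }
}

// Moves a LibThing to another thread and reports whether dropping it there panicked.
fn dropped_on_other_thread_panics() -> bool {
    let thing = LibThing::new();
    thread::spawn(move || drop(thing)).join().is_err()
}
```

One caveat: if the drop runs while the thread is already unwinding from another panic, the second panic aborts the process, so this is a debugging aid more than a recovery mechanism.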

And FYI, you need scoped threads in order to use shared refs from the main thread in spawned threads.
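For reference, a minimal scoped-threads sketch. It uses a `Mutex<Vec<u8>>` as a stand-in for the shared source, and `run_workers` is a made-up name. `std::thread::scope` joins every spawned thread before it returns, which is what lets the workers borrow `mutex` from the enclosing stack frame.

```rust
use std::sync::Mutex;
use std::thread;

fn run_workers(max_threads: u8) -> Vec<u8> {
    let mutex = Mutex::new(Vec::new());
    thread::scope(|s| {
        for i in 0..max_threads {
            let shared = &mutex; // a plain borrow, no Arc needed
            s.spawn(move || {
                shared.lock().expect("mutex poisoned").push(i);
            });
        }
        // all workers are joined when the scope ends
    });
    let mut seen = mutex.into_inner().expect("mutex poisoned");
    seen.sort();
    seen
}
```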

Is there a downside to doing struct LibThing(Mutex<*mut T>) ? Then every operation - copy_to and transform, etc - can simply be blocking until the Mutex is locked?

Then I can impl Sync for LibThing(Mutex<*mut T>) ?

Or am I heading in the wrong direction there?

That should work, just with the obvious downsides: the struct is bigger and you have to lock a mutex for every operation. But those aren't too bad. If you need to do several operations in a row, you can always make a scope method or guard.
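A sketch of what a scope method might look like, assuming an `i32` stands in for the state behind the C pointer. `Locked` and `with_lock` are hypothetical names.

```rust
use std::sync::Mutex;

struct LibThing {
    inner: Mutex<i32>, // stand-in for state owned by the C library
}

// Operations that are only reachable while the lock is held.
struct Locked<'a>(&'a mut i32);

impl Locked<'_> {
    fn transform(&mut self) {
        *self.0 += 1;
    }

    fn read(&self) -> i32 {
        *self.0
    }
}

impl LibThing {
    fn new(value: i32) -> Self {
        LibThing { inner: Mutex::new(value) }
    }

    // Single operation: lock, act, unlock.
    fn transform(&self) {
        self.with_lock(|t| t.transform());
    }

    // Several operations in a row under one lock.
    fn with_lock<R>(&self, f: impl FnOnce(&mut Locked<'_>) -> R) -> R {
        let mut guard = self.inner.lock().expect("mutex poisoned");
        let mut locked = Locked(&mut *guard);
        f(&mut locked)
    }
}
```

Usage would be something like `thing.with_lock(|t| { t.transform(); t.read() })`, which takes the lock once instead of twice.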


I just realised I can't do that anyhow. LibThing(Mutex<*mut T>) won't work because the inner of a Mutex has to be Send, lol.

But I could possibly have

struct LibThing<T> {
    ext_ptr: *mut T,
    mutex: Mutex<PhantomData<T>>,
}

The trouble is then, if I use a 'fake' Mutex, I don't know if compiler optimisations would make accesses to the pointer out-of-order, i.e. before I'd locked the Mutex member.

So then I'd need to wrap the pointer..?

struct LibPtr<T>(*mut T);
unsafe impl<T> Send for LibPtr<T> {}

struct LibThing<T>(Mutex<LibPtr<T>>);

Ugh! I don't think that's correct, is it. This all feels wrong, like I'm fundamentally misunderstanding something :stuck_out_tongue:

I think it’s worth seeing what other bindings to C code do, like the bindings to Zstandard (from their lib.rs source):

// Non thread-safe methods already take `&mut self`, so it's fine to implement Sync here.
unsafe impl Sync for DCtx<'_> {}

Hmmm though their case isn’t quite the same as yours, since they can implement Send, meaning an external Mutex is sufficient for their code to be useful.

Reordering isn’t a concern: the lock (and its guard) use sufficient memory orderings to make sure accesses to the pointer can’t move outside the locked region. Internally, a Mutex consists of a mutex implementation stored next to the data it’s protecting; basically the same as what you want to do.

Just make sure to hold the guard for the entirety of the time you use the data. Explicitly calling drop(guard) when you’re done is probably a good idea, just to help catch any problems.

If you ever need to have a function that returns a reference to the data, you’d need to make some WrappedReferenceGuard type that stores both the data reference and the MutexGuard.
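One possible shape for such a guard, as a sketch. The `name` field is a stand-in for data the C library owns, and `is_locked` exists only to make the behaviour observable.

```rust
use std::ops::Deref;
use std::sync::{Mutex, MutexGuard};

struct LibThing {
    name: String, // stand-in for data owned by the C library
    lock: Mutex<()>,
}

// Bundles the reference with the lock guard, so the lock is held
// for exactly as long as the reference is usable.
struct WrappedReferenceGuard<'a> {
    name: &'a str,
    _guard: MutexGuard<'a, ()>,
}

impl Deref for WrappedReferenceGuard<'_> {
    type Target = str;

    fn deref(&self) -> &str {
        self.name
    }
}

impl LibThing {
    fn name(&self) -> WrappedReferenceGuard<'_> {
        let guard = self.lock.lock().expect("mutex poisoned");
        WrappedReferenceGuard { name: &self.name, _guard: guard }
    }

    // Demo helper: true while some guard is alive.
    fn is_locked(&self) -> bool {
        self.lock.try_lock().is_err()
    }
}
```

Dropping the returned guard releases the lock, the same way a plain MutexGuard would.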

It almost seems to me that there should be another Mutex type, one that doesn't require Send.

The requirement for Send is explained with the example of Mutex<Rc<T>>, where Rc clones reference the same (mutable) memory internally.

If your type does no such thing, but still doesn't want to be Send, I don't see why that would necessarily cause any issues.

Do you want multiple threads to be able to call this as well? (If so, make it mutex-protected and take &self just like the other function.)

I think it’s not included in the standard library because it’s a relatively rare use case, and isn’t too complicated to implement ad-hoc. (There are just too many permutations of things people might want for std to support them all.) Even owned mutex guards (e.g., lock an Arc<Mutex<T>> and get back a guard that holds a reference count rather than using a lifetime) aren’t supported. There are still enough tools available that it only takes a small amount of unsafe code to build such a type on top of the standard library.

Cheers for the replies. I just want to be absolutely clear about this bit:

struct LibThing<T> {
    ext_ptr: *mut T,
    mutex: Mutex<PhantomData<T>>,
}

...

impl<T> LibThing<T> {
    fn do_something(&self) {
        let _guard = self.mutex.lock();
        // call C lib
        some_lib_func(self.ext_ptr);
        drop(_guard);
    }
}

This is completely safe? The external pointer ext_ptr isn't inside the Mutex, but we only use the pointer once we've got a lock from the Mutex.

Yes, that’s sound (though shouldn’t be “safe”). (some_lib_func should be an unsafe function, and there should be a SAFETY comment explaining why calling it there is sound.)
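Spelled out, that might look like the sketch below. Everything in it is a stand-in: `some_lib_func` is an ordinary Rust function pretending to be the foreign call, the "allocation" is a leaked `Box<i32>`, and the lock is a `Mutex<()>` rather than `Mutex<PhantomData<T>>`. The `unsafe impl Sync` rests on the assumption that every use of the pointer through `&self` happens under the lock.

```rust
use std::sync::Mutex;

// Stand-in for the foreign function; real bindings would declare it in an extern "C" block.
unsafe fn some_lib_func(ptr: *mut i32) -> i32 {
    // SAFETY: the caller guarantees `ptr` is valid and not accessed concurrently.
    unsafe {
        *ptr += 1;
        *ptr
    }
}

struct LibThing {
    ext_ptr: *mut i32, // the raw pointer keeps this type !Send
    mutex: Mutex<()>,
}

// SAFETY (assumption): `ext_ptr` is only dereferenced while `mutex` is held.
unsafe impl Sync for LibThing {}

impl LibThing {
    fn new() -> Self {
        LibThing { ext_ptr: Box::into_raw(Box::new(0)), mutex: Mutex::new(()) }
    }

    fn do_something(&self) -> i32 {
        let guard = self.mutex.lock().expect("mutex poisoned");
        // SAFETY: `ext_ptr` stays valid until `drop`, and holding `guard`
        // means no other thread is using the pointer right now.
        let result = unsafe { some_lib_func(self.ext_ptr) };
        drop(guard);
        result
    }
}

impl Drop for LibThing {
    fn drop(&mut self) {
        // SAFETY: the pointer came from Box::into_raw in `new` and is freed exactly once.
        unsafe { drop(Box::from_raw(self.ext_ptr)) };
    }
}
```

Because the type is Sync but not Send, other threads can borrow it (for example from scoped threads) while the value itself stays on, and is freed by, the thread that created it.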

Ah yes, sorry. I didn't mean "Rust safe", rather "not going to catch fire" safe :stuck_out_tongue:

There's a simpler explanation: a thread with a &Mutex<U> can get a &mut U and swap in a different U, and now the original U has effectively been sent across threads.

See this doesn't make any sense to me.

There's no reason I can think of to Send a Mutex to another thread. Which you would have to do afaict to call get_mut().

A Mutex inside an Arc is Sync and not Send (right?), so you can't call get_mut() in that context anyhow, because you only have &Mutex.

This is my understanding right now.

Whenever you make Arc<Mutex<U>> and send clones of the Arc to other threads, the Arc creates joint ownership from all threads, so in a sense it has already been sent. Even more so, when some thread drops the last Arc, the Mutex and the U are dropped on that thread, so it has definitely been sent to that thread.
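A small sketch of the "last Arc" case. `ReportsDropThread` is a hypothetical type whose destructor reports which thread ran it over a channel.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread::{self, ThreadId};

// Hypothetical type: its Drop sends the id of the thread that ran the destructor.
struct ReportsDropThread(mpsc::Sender<ThreadId>);

impl Drop for ReportsDropThread {
    fn drop(&mut self) {
        let _ = self.0.send(thread::current().id());
    }
}

// Returns true if the value created on the calling thread was dropped on the worker.
fn dropped_on_worker() -> bool {
    let (tx, rx) = mpsc::channel();
    let shared = Arc::new(Mutex::new(ReportsDropThread(tx)));
    let worker_copy = Arc::clone(&shared);
    drop(shared); // the creating thread gives up its handle first

    let worker = thread::spawn(move || {
        let id = thread::current().id();
        drop(worker_copy); // last Arc: the Mutex and its contents are dropped here
        id
    });
    let worker_id = worker.join().expect("worker panicked");
    rx.recv().expect("destructor never ran") == worker_id
}
```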

@quinedot isn’t talking about using Mutex::get_mut(), but the &mut U that you get from dereferencing a MutexGuard in normal usage of the Mutex. Once you have that &mut U, you can std::mem::replace() it with a different U, and thus the first U has been sent.
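Here is roughly what that looks like, again with a hypothetical `Tagged` type that records the thread it was created on. No `get_mut()` is involved, only an ordinary `lock()`.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, ThreadId};

// Hypothetical stand-in: remembers which thread created it.
struct Tagged {
    created_on: ThreadId,
}

impl Tagged {
    fn new() -> Self {
        Tagged { created_on: thread::current().id() }
    }
}

// Returns (worker took the caller's value, mutex now holds a worker-made value).
fn replace_through_guard() -> (bool, bool) {
    let caller_id = thread::current().id();
    let shared = Arc::new(Mutex::new(Tagged::new()));
    let worker_copy = Arc::clone(&shared);

    let took_callers_value = thread::spawn(move || {
        let mut guard = worker_copy.lock().expect("mutex poisoned");
        // Dereferencing the guard yields &mut Tagged, which is enough to move the value out.
        let original = std::mem::replace(&mut *guard, Tagged::new());
        // `original` is now owned by this thread and will be dropped here.
        original.created_on == caller_id
    })
    .join()
    .expect("worker panicked");

    let holds_workers_value = shared.lock().expect("mutex poisoned").created_on != caller_id;
    (took_callers_value, holds_workers_value)
}
```

Both halves come out true: the original value was effectively sent to the worker, which is why Mutex<U> needs U: Send before it can be shared.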

Arc can be used as a means of sending, as I described above, so Arc<T> only implements Send or implements Sync if T implements both Send and Sync. If the Mutex wasn’t Send, then the Arc would be neither Sync nor Send.

If you have a value of some type T that is Sync but not Send, then the only way to share that value with another thread is to send &T (or something no more powerful than that) rather than sending an Arc. You can only send something that doesn’t allow gaining ownership of the T, because another thread gaining ownership is exactly what sending is.
