Mutability of bindgen's generated code

My company has been using Rust to build an ecosystem of services that process input streams. We use Rust libraries to build thread pools of concurrent services, and it works fine.

I'm currently writing bindings around an old C++ library that we want to keep. My project runs bindgen on a C++ class called G, with the standard build.rs script, to generate the grammar bindings, and it looks like it is working. G uses some other classes in the same library.

Let me summarize the situation:

  1. All our Rust services are implemented using traits that require immutability (so the trait functions take &self instead of &mut self).
  2. G is created and stored inside one of these services for subsequent use.
  3. The binding of G has some Rust methods that take G mutably, so you need a mutable reference to actually use the object for anything.
  4. I have wrapped G in a Rust Cell<> to handle this and included the binding with the rest of the services.
  5. I don't have the technical capacity (or time) to rewrite the C++ library to address this mutability issue, if that is even possible.

The mutable methods of G created by bindgen are like this:

pub unsafe fn apply(&mut self, input: *const ::std::os::raw::c_char) -> *mut ::std::os::raw::c_char {
    G_apply(self, input)
}

So far the binding with the C++ library works fine and I'm very happy, but the real problem starts when I try to pool this object, for example with the rayon crate, and I get the expected error:

`std::cell::Cell<grammar::root::G>` cannot be shared between threads safely
[...]  the trait `std::marker::Sync` is not implemented for `std::cell::Cell<grammar::root::G>`

And also this unexpected error:

`*mut grammar::root::File` cannot be sent between threads safely
 note: required because it appears within the type `grammar::root::G`

So not only is G unsafe, but one of the objects used inside G is also unsafe in its own way. (After some years in production I'm confident that the grammar library is thread-safe, but that doesn't matter here.)

So my questions are:
Is there a way other than Cell to handle this situation and allow me to pass G to threads?
What should I do with grammar::root::File? Is there a way of fixing this problem from Rust? Or a way to hide it and tell the compiler "just use this library, it won't crash"?
I understand that possible solutions involve either making G immutable or implementing Send/Sync, but I can't find how to do either.

Thanks!

Rust assumes that all structs containing raw pointers are not thread-safe just because it doesn't know what's behind them.

  • If the C++ class is thread-safe, then you can implement Send and Sync for your wrapper (unsafe impl Send for G {})

  • If the C++ class isn't thread-safe, then Rust is right, and just did its job preventing data corruption. You can safely make a non-Sync type Sync by wrapping it in a Mutex, provided the type is Send (Mutex<T> is Sync only when T is Send).
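
A minimal sketch of how the two bullets can combine, assuming a hypothetical stand-in for G (the real type comes from bindgen; the `file` field, `SendG`, and `run_pool` are made up for illustration). The raw pointer makes the stand-in neither Send nor Sync, so a newtype asserts Send and a Mutex supplies the exclusive access that `&mut self` methods need:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the bindgen-generated type. The raw pointer
// field makes it neither Send nor Sync by default.
struct G {
    file: *mut u32,
}

impl G {
    fn new() -> Self {
        G { file: Box::into_raw(Box::new(0)) }
    }

    // Same shape as the generated method: it needs `&mut self`.
    fn apply(&mut self, n: u32) -> u32 {
        unsafe {
            *self.file += n;
            *self.file
        }
    }
}

impl Drop for G {
    fn drop(&mut self) {
        // Free the allocation made in `new`.
        unsafe { drop(Box::from_raw(self.file)) }
    }
}

// Newtype so the unsafe assertion lives in one place.
struct SendG(G);

// SAFETY (assumption for this sketch): the pointee is owned exclusively by
// this G, so moving it to another thread is fine. For a real C++ class this
// has to be checked against the library itself.
unsafe impl Send for SendG {}

// Mutex<SendG> is Sync because SendG is Send, so it can sit in an Arc.
fn run_pool(threads: u32) -> u32 {
    let shared = Arc::new(Mutex::new(SendG(G::new())));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // The lock hands out the `&mut` access that `apply` requires.
                let mut guard = shared.lock().unwrap();
                guard.0.apply(1);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = shared.lock().unwrap().0.apply(0);
    total
}

fn main() {
    println!("total = {}", run_pool(4));
}
```

The cost is that every call is serialized through the lock, so this fits best when calls into G are short or rare.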


Cell is never Sync, regardless of the contained type. Raw pointers are also never automatically Sync, since Rust doesn't know how you're going to use them. You would have to wrap UnsafeCell<G> and use unsafe impl Sync for that wrapper type to assert its safety.

If these actually require an exclusive &mut reference to G and you want to call them while G is shared by multiple threads, you'll need to wrap G in a Mutex rather than a Cell.

On the other hand, if they are already thread-safe because the C++ library is doing its own locking, and so it's safe to call them simultaneously on multiple threads, then you could make Rust functions that take a shared &G reference, and cast it to *const G and then to *mut G.

Wouldn't you have to use an &UnsafeCell<G> here to signal to the compiler that something recursively accessible through G is modifiable?
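
A rough sketch of the UnsafeCell approach, assuming the library really does its own synchronization. Everything here is a hypothetical stand-in: `G` holds an atomic to play the role of internally synchronized C++ state, `g_apply` plays the role of the foreign function taking a `*mut G`, and `SharedG` is the wrapper type:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Hypothetical stand-in for the C++ object. The atomic simulates state that
// the library synchronizes internally.
struct G {
    hits: AtomicU32,
}

// Hypothetical stand-in for the foreign function, which takes `*mut G`.
unsafe fn g_apply(this: *mut G, n: u32) -> u32 {
    unsafe { (*this).hits.fetch_add(n, Ordering::SeqCst) + n }
}

// UnsafeCell tells the compiler that the contents may change behind `&self`.
struct SharedG(UnsafeCell<G>);

// SAFETY (assumption for this sketch): every access goes through `g_apply`,
// which is thread-safe on its own.
unsafe impl Sync for SharedG {}

impl SharedG {
    fn new() -> Self {
        SharedG(UnsafeCell::new(G { hits: AtomicU32::new(0) }))
    }

    // Takes `&self`, so it fits traits that require immutability.
    fn apply(&self, n: u32) -> u32 {
        // `UnsafeCell::get` yields a `*mut G` without casting away constness.
        unsafe { g_apply(self.0.get(), n) }
    }
}

fn run(threads: u32) -> u32 {
    let g = SharedG::new();
    // Scoped threads may borrow `g` because SharedG is Sync.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                g.apply(1);
            });
        }
    });
    g.apply(0)
}

fn main() {
    println!("hits = {}", run(8));
}
```

Unlike the Mutex variant, calls are not serialized on the Rust side, so the whole safety argument rests on the library.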


Thanks for all your replies. The solution was as follows.

First, I wrapped G inside a new struct and implemented Send and Sync for that struct. This allowed me to pool copies of G but had a drawback: since G was held in a Cell, I needed to explicitly take() and replace the value to use it, which required a mutex to avoid exposing instances of G with the default empty memory. So this solution defeats the purpose of concurrency.

The real solution came from learning that in C++ you can declare that a method does not modify the class attributes by adding a trailing const qualifier:

int apply(const char *input) const;

This generates a Rust function that doesn't take a mutable reference to self. Then you don't need Cell, and the pooling mechanism works fine.
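
For illustration, a sketch of what the const-qualified version might look like on the Rust side. `G` and `G_apply` are hypothetical stand-ins written in Rust so the example runs without the C++ library (here `G_apply` just returns the input length), and `Grammar` and `apply_all` are made-up names. The wrapper still asserts Send and Sync because of the raw pointer inside G:

```rust
use std::ffi::{c_void, CStr, CString};
use std::os::raw::{c_char, c_int};
use std::thread;

// Hypothetical stand-in for the generated struct, raw pointer included.
#[repr(C)]
struct G {
    file: *mut c_void,
}

// Hypothetical stand-in for the extern function behind a const method:
// it receives `*const G`, not `*mut G`.
#[allow(non_snake_case)]
unsafe fn G_apply(_this: *const G, input: *const c_char) -> c_int {
    unsafe { CStr::from_ptr(input).to_bytes().len() as c_int }
}

impl G {
    // Shape of the method for `int apply(const char *input) const;`:
    // `&self` instead of `&mut self`.
    pub unsafe fn apply(&self, input: *const c_char) -> c_int {
        unsafe { G_apply(self, input) }
    }
}

struct Grammar(G);

// SAFETY (assumption): the C++ library is thread-safe, as stated in the
// question. The compiler cannot verify this.
unsafe impl Send for Grammar {}
unsafe impl Sync for Grammar {}

impl Grammar {
    // Safe `&self` method; returns None if the input has an interior NUL.
    fn apply(&self, input: &str) -> Option<c_int> {
        let c_input = CString::new(input).ok()?;
        Some(unsafe { self.0.apply(c_input.as_ptr()) })
    }
}

// Runs one call per input on its own scoped thread, sharing `&Grammar`.
fn apply_all(g: &Grammar, inputs: &[&str]) -> Vec<Option<c_int>> {
    thread::scope(|s| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|&input| s.spawn(move || g.apply(input)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let g = Grammar(G { file: std::ptr::null_mut() });
    println!("{:?}", apply_all(&g, &["ab", "cde"]));
}
```

No Cell or Mutex is involved, so calls from different threads can overlap.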
