API Design - Interior Mutability of Opaque Pointer

I'm working on a wrapper for a C API and I'm trying to use Rust's type system and lifetimes to make it easier to use than the default flat API.

Behind the opaque pointer there are sub-items (Channels below) that can be accessed. I could just provide them a non-mutable copy of the pointer but really the semantics are that we are mutating the task.

So I want users to be able to get a channel struct for the time that they are using it and ensure its lifetime is restricted to avoid parallel access.

I'm torn between two APIs:

  1. Get a channel which holds a mutable reference. If the channel outlives the next task access it will fail to compile. This may feel different to previous users of this API in C and could confuse new users still learning lifetimes.
  2. Create a function which takes a closure for them to complete configuration in a defined scope.

As I try to explain this I think 1 is better. I think it is more idomatic and the compiler should guide the users.

An example of the two methods are below. Can anyone give any guidance on whether I'm on the right path?

struct Task (*mut std::os::raw::c_void);

/// Technically we dont need mut since the c_void has
/// interior mutability - but seems clearer to take as mut?
struct Channel<'a> {
    task: &'a mut Task
}

impl<'a> Channel<'a> {
    pub fn do_something(&self) {
        todo!()
    }
}

impl Task {

    /// Get the channel with a linked lifetime. Up to user to understand
    /// when it goes out of scope
    fn get_channel(&mut self, name: &str) -> Channel {
      //ffi call
      Channel {
          task: self
      }
        
    }
    
    /// Closure giving clear bounds.
    fn configure_channel(&mut self, name: &str, configuration: fn(Channel) -> ()) {
        configuration(self.get_channel(name));
    }
}

fn main() {
    let mut task = Task(std::ptr::null_mut()); //so I can check compilation
    
    let channel = task.get_channel("test");
    channel.do_something();
    
    //don't hold onto channel for too long!
    task.configure_channel("test2", |channel| {
        channel.do_something();
    })
    
}

(Playground)

I would go with the lifetime. If they're using rust, they're going to have to learn lifetimes either way. And closures can sometimes cause confusing errors especially if you try and use them the same way closure can be used in other languages

Your sample actually looks like it's taking a function pointer and not a closure though, which might avoid some of the problems with closures at the cost of less flexibility.

Even in the function pointer case you still really need Channel to have a lifetime. Otherwise a user could stick it in a static or something which might cause issues

2 Likes

Speaking of learning lifetimes, it's considered more clear to use Channel<'_> instead of Channel to make it more clear that a lifetime-carrying struct is borrowing something. You can force yourself to do this by using !#[deny(elided_lifetimes_in_paths)].

If you do supply some sort of scoped access, you probably want

fn configure_channel<F: FnOnce(Channel<'_>)>(&mut self, name: &str, configuration: F) {

as opposed to your OP that used a function pointer (fn(Channel<'_>)). With the FnOnce version, the user of the API could reference local variables in their closure, bringing it closer in functionality to the Channel<'_>-returning approach. (Closures can only be coerced to function pointers if they don't capture anything.)

4 Likes

Thanks for the replies and the tips, especially on the functions and lifetimes - I wasn't aware of those changes.

I've been playing with the API this evening and like the lifetimes method more too. I guess this will come with experience - the typesystem in rust is powerful but I find it means there is normally 50 ways to design the API!

I've worked with *-sys crates that take this approach and would strongly recommend against it because it leads to some unwieldy APIs.

On the surface, this feels like an excellent place to use Rust's lifetime and borrowing rules. You see that the C code has an invariant that one object can't outlive another and decide to encode this in the type system, excellent, now it's impossible to misuse the API and run into a use-after-free.

However! What you often find is that this Channel is intended to be a long-lived object with a dynamic lifetime - i.e. you'll pass it around and store it somewhere instead of just leaving it as a local variable in a single function - so the Rust user will treat it as such by putting it in long-lived business objects or having a Vec<Channel<'task>>.

struct App<'task> {
  channel: Channel<'task>,
  ...
}

Now, what we've done is infect the rest of our app with our 'task lifetime, littering the codebase with lifetime annotations and making it so common APIs like tokio::spawn() or std::thread::spawn() can't be used. It also makes setting up your business objects more difficult and exposes your users to frustrating lifetime issues (especially if you use &'task mut Task instead of &'task Task because of variance).

It also means you can't encapsulate App::new() in a single function because you need to pass in a reference to your task. That makes people want to store the Task in the same object as the Channel to make managing the object's lifetime easier.

struct App {
  task: Task,
  channel: Channel<'???>,
}

impl App {
  fn new() -> Self {
    let task = Task::new();
    let channel = task.get_channel("...");
    App { task, channel }
  }
}

As seasoned Rust veterans will tell you, this is a "self-referential struct" and if you're probably gonna have a bad time[1].

Instead, I would suggest using something based on reference counting that ensures your Channel can't outlive the *mut c_void pointer to the underlying task.

struct RawTask(*mut c_void);

impl Drop for RawTask { 
  fn drop(&mut self) { /* call your library's destructor function */ }
}

pub struct Task(Rc<RawTask>);

impl Task {
  pub fn get_channel(&self, name: &str) -> Channel {
    let task = Rc::clone(&self.0);
    ...
    Channel(task)
  }
}

pub struct Channel(Rc<RawTask>);

impl Channel {
  pub fn do_something(&self) { ... }
}

C APIs are generally meant to work on shared, non-threadsafe data, so I would actually make all methods work on &self instead of &mut self and use Rc instead of Arc.

Of course, like everything, this all depends on your intended use case. If your Channel is only meant as a temporary thing that users hold onto for the context of a single function call, then feel free to ignore me and use one of your original methods (I'd prefer #1). Using a closure is equivalent to having a get_channel() -> Channel<'_> (both give the user an object with a temporary lifetime) except you can enforce that any cleanup work can be done at the expense of being more annoying to use.


  1. For example, how can I return an App from App::new()? We've moved the task variable into App { task, channel }, but that's not allowed because channel still has a reference to task. There are other ways of constructing App safely, but they have the problem that your app variable can now never be moved. ↩ī¸Ž

7 Likes

This is great reply - thank you - and definitely given me something to think about.

In this specific case I think these will just be short lived, they are kind of just a configuration access but I've definitely got other projects where I can see this problem coming up!

(And now I've said that I bet this will be one of the first bits of feedback from users!)

1 Like

Yeah, I had a feeling that might be the case because of your "I want users to be able to get a channel struct for the time that they are using it" comment, but I still think it's something useful to keep in mind, especially for the higher-level components (i.e. Task).

When it comes to FFI and lifetimes, I normally ask myself whether it would make sense for a pure Rust API to use lifetimes or smart pointers in a similar context, and use that to inform the design.

Yep - I think I will be putting the pointer behind an Arc anyway because as you say, this is semantically what the C API is wanting to do. (In this case it is "thread-safe")

1 Like

This a hundred times. The usage of SQLite transactions in Rust can get annoying because the transaction references (but can't keep alive) the connection. I have written a custom SQLite wrapper before, and I used exactly this design – the transaction holds an Arc to the connection in order to fulfill the lifetime requirment while hiding the (dynamic) lifetime itself.

3 Likes