Stateful closure for FFI callback

I’m writing a Rust binding for a C library. Basically, the library provides functions that work on an underlying object through a raw pointer handle:

mod ffi {
    extern "C" {
        pub fn new() -> *const u64;
        pub fn delete(handle: *const u64);

        pub fn get_state(handle: *const u64) -> u64;
        pub fn exec(handle: *const u64);

        pub fn register_callback(
            handle: *const u64,
            data: *const u64,
            callback: extern "C" fn(handle: *const u64, data: *const u64),
        );
    }
}

There are functions to initialize the underlying object, release its resources, get its state, add callbacks, and execute it. During execution, callbacks can query the state of the object, which is why the callback in register_callback takes handle as one of its arguments.

In the current design, I’m wrapping the raw pointer in a FatPointer:

use std::marker::PhantomData;

struct FatPointer<'a> {
    handle: *const u64,
    phantom: PhantomData<&'a u64>,
    callbacks: Vec<Box<dyn 'a + Fn(&FatPointer)>>, // store registered callbacks
}

and some functions to create, drop, etc. the pointer:

impl<'a> FatPointer<'a> {
    fn new() -> Self {
        let handle = unsafe { ffi::new() };
        FatPointer {
            handle,
            phantom: PhantomData,
            callbacks: vec![],
        }
    }

    pub fn state(&self) -> u64 {
        unsafe { ffi::get_state(self.handle) }
    }

    pub fn exec(&self) {
        unsafe { ffi::exec(self.handle) };
    }

    // pub fn add_callback(...
}

impl Drop for FatPointer<'_> {
    fn drop(&mut self) {
        unsafe { ffi::delete(self.handle) };
    }
}

For registering callbacks, I think that FatPointer should accept closures, so I’ve used this technique (it’s the only technique I’m aware of; are there other, similar ones?):

impl<'a> FatPointer<'a> {
    // pub fn exec(...
    
    pub fn add_callback<F: 'a + Fn(&FatPointer)>(&mut self, callback: F) {
        let callback = Box::new(callback);

        let data = &*callback as *const _ as *const u64;
        unsafe { ffi::register_callback(self.handle, data, adapter::<F>) };

        extern "C" fn adapter<F: Fn(&FatPointer)>(handle: *const u64, data: *const u64) {
            let cb = data as *const F;
            // unsafe { (*cb)(transmute(handle)) } !?
        }

        self.callbacks.push(callback);
    }
}

So any callback is of type 'a + Fn(&FatPointer) (while a C callback is of type fn(handle: *const u64, data: *const u64)):

  • it has lifetime 'a because I don’t want it to access anything that lives for a shorter time than the fat pointer.
  • it requires an argument of type &FatPointer because it may need to get some state of the underlying object (e.g. by calling state(&self))

The idea of passing a Rust callback is to first transform it into raw data, then restore it in an adapter:

...
let data = &*callback as *const _ as *const u64;
...

extern "C" fn adapter<F: Fn(&FatPointer)>(handle: *const u64, data: *const u64) {
    let cb = data as *const F;
    ...

and this adapter will be used as a C callback

...
unsafe { ffi::register_callback(self.handle, data, adapter::<F>) };
...

But I’m struggling with how to pass arguments to the Rust callback in the adapter, because this callback needs a &FatPointer, which is not available. For example, the following code is obviously unacceptable:

    extern "C" fn adapter<F: Fn(&FatPointer)>(handle: *const u64, data: *const u64) {
        let cb = data as *const F;
        unsafe { (*cb)(transmute(handle)) } // wrong: handle is not &FatPointer
    }

Another attempt is to pack both the callback and &self into data, and restore them in the adapter, e.g.

let (cb, fp): (F, &FatPointer) = data as ....
(*cb)(fp);

but it does not work either, because the FatPointer behind &self may have been moved by the time the callback is executed (so the callback would access invalid memory).

Pinning &self should work, e.g. by always boxing FatPointer

impl<'a> FatPointer<'a> {
    fn new() -> Box<Self> {
        let handle = unsafe { ffi::new() };
        Box::new(FatPointer {
            handle,
            phantom: PhantomData,
            callbacks: vec![],
        })
    }
}

but it seems overkill because the essential part of FatPointer is just a raw pointer, which is obviously movable and even copyable.
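To make the trade-off concrete, here is a minimal sketch of what boxing buys. The `Wrapper` type and the helper functions are hypothetical stand-ins, not part of the binding. Moving a `Box` moves only the pointer, so the heap allocation it owns keeps its address, which is the property a `data` pointer handed to C would rely on.

```rust
/// Hypothetical stand-in for `FatPointer`: just a raw handle.
pub struct Wrapper {
    handle: *const u64,
}

/// Address of the `Wrapper` value itself.
fn addr_of(wrapper: &Wrapper) -> usize {
    wrapper as *const Wrapper as usize
}

/// Moves a boxed `Wrapper` into a `Vec` and back out again, returning the
/// address of the heap allocation before and after the moves.
pub fn addrs_across_moves() -> (usize, usize) {
    let boxed = Box::new(Wrapper { handle: std::ptr::null() });
    assert!(boxed.handle.is_null());
    let before = addr_of(&boxed);

    let mut holder = Vec::new();
    holder.push(boxed); // moves the Box (a pointer), not the Wrapper
    let boxed = holder.pop().expect("just pushed");

    let after = addr_of(&boxed);
    (before, after)
}
```

An unboxed `Wrapper` gives no such guarantee: each move may place it at a new address, so a pointer taken earlier may dangle.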

I wonder whether it is possible to keep the callback type 'a + Fn(&FatPointer) without making &FatPointer stable, or whether the type of the callback should be changed.

Many thanks for any help.


Improving extern declarations

First of all, I like improving FFI declarations just by “transmuting” their signatures (changing the API without changing the ABI), so as to be able to explicitly state where pointers may or may not be null. This in turn makes it possible to use references, once the implicit ownership / borrowing patterns of the FFI become clearer:

use ::std::{*,
    ptr::NonNull,
};
use ::libc::{
    c_void,
};

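mod ffi {
    // The items below need these names in scope inside the module.
    use ::std::ptr::NonNull;
    use ::libc::c_void;

    #[repr(C)]
    pub
    struct Handle {
        opaque: [u8; 0],
    }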

    extern "C" {
        pub
        fn new () -> Option<NonNull<Handle>>;

        pub
        fn delete (handle: &'_ mut Handle);

        pub
        fn get_state (handle: &'_ Handle) -> u64;

        pub
        fn exec (handle: &'_ Handle);

        pub
        fn register_callback (
            handle: &'_ mut Handle,
            data: Option<NonNull<c_void>>,
            callback: unsafe extern "C" fn (handle: &Handle, data: Option<NonNull<c_void>>),
        );
    }
}

fn main ()
{
    unsafe {
        let mut handle: NonNull<ffi::Handle> = ffi::new().expect("new() failed");
        dbg!(handle);
        let handle: &mut ffi::Handle = handle.as_mut();
        ffi::register_callback(
            &mut *handle,
            None,
            {
                // The declaration takes a plain (non-optional) function pointer.
                unsafe extern "C" fn cb (handle: &ffi::Handle, data: Option<NonNull<c_void>>) {
                    println!("handle = {:p}", handle);
                    println!("data = {:?}", data);
                }
                cb
            },
        );
        println!("state = {}", ffi::get_state(&*handle));
        ffi::exec(&*handle);
        ffi::delete((move || handle)()); // prevent reborrow
    }
}
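The “API without ABI change” claim rests on the null-pointer optimization: `Option<NonNull<T>>`, `Option<&T>` and `Option<fn>` are guaranteed to have the same size as the corresponding raw pointer, with `None` represented as null. A minimal sketch that checks this (the `layouts_match` helper is a hypothetical name, and `Handle` is redeclared here only to keep the snippet self-contained):

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Opaque type, declared the same way as `ffi::Handle` above.
#[repr(C)]
pub struct Handle {
    _opaque: [u8; 0],
}

/// Returns true if the "improved" Rust types are as wide as the raw
/// pointers the C side actually passes around.
pub fn layouts_match() -> bool {
    size_of::<Option<NonNull<Handle>>>() == size_of::<*mut Handle>()
        && size_of::<Option<&Handle>>() == size_of::<*const Handle>()
        && size_of::<Option<unsafe extern "C" fn(&Handle)>>()
            == size_of::<unsafe extern "C" fn(&Handle)>()
}
```

This only checks sizes; whether a given parameter may really be declared as a reference still depends on what the C library promises about null and aliasing.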

The problem at hand

Now, you wish to wrap all this in a safer wrapper, while also storing the boxed closures in a vec in order to correctly free them. The issue is that you are doing both things at once, which is problematic.

I suggest you first wrap all the functionality but the callback-registering in a first wrapper, and then create a struct stitching together both the wrapper and the vec. By making the struct deref to the wrapper, you get the same ergonomics, and you just have to add the callback registering to the struct. This is one way to solve your problem:

#[derive(Debug)]
#[repr(transparent)]
struct MyHandle (
    NonNull<ffi::Handle>,
);

impl MyHandle {
    #[inline]
    pub
    fn new () -> MyHandleWithCallbacks
    { unsafe {
        MyHandleWithCallbacks {
            handle: Self(ffi::new().expect("new() failed")),
            callbacks: vec![],
        }
    }}

    #[inline]
    pub
    fn state (self: &'_ Self) -> u64
    { unsafe {
        ffi::get_state(self.0.as_ref())
    }}

    pub
    fn exec (self: &'_ Self)
    { unsafe {
        ffi::exec(self.0.as_ref())
    }}
}

impl Drop for MyHandle {
    fn drop (self: &'_ mut Self)
    { unsafe {
        ffi::delete(self.0.as_mut());
    }}
}

struct MyHandleWithCallbacks {
    handle: MyHandle,
    callbacks: Vec<Box<dyn Fn (&MyHandle) + 'static>>,
}

impl ops::Deref for MyHandleWithCallbacks {
    type Target = MyHandle;
    
    #[inline]
    fn deref (self: &'_ Self) -> &'_ Self::Target
    {
        &self.handle
    }
}

impl ops::DerefMut for MyHandleWithCallbacks {
    #[inline]
    fn deref_mut (self: &'_ mut Self) -> &'_ mut Self::Target
    {
        &mut self.handle
    }
}

impl MyHandleWithCallbacks {
    pub
    fn add_callback<Closure> (
        self: &'_ mut Self,
        boxed_closure: Box<Closure>,
    )
    where
        Closure : 'static,
        Closure : Fn(&MyHandle),
    {
        unsafe extern "C"
        fn c_callback<Closure> (
            ffi_handle: &'_ ffi::Handle,
            data: Option<NonNull<c_void>>,
        )
        where
            Closure : Fn(&MyHandle) + 'static,
        {
            ::scopeguard::defer_on_unwind! {{
                eprintln!("Caught Rust unwinding across FFI, aborting...");
                ::std::process::abort();
            }}
            let closure =
                data.expect("Error, got NULL data")
                    .cast::<Closure>()
            ;
            let at_my_handle = mem::transmute::<
                & &ffi::Handle,
                & MyHandle, // thanks to #[repr(transparent)] 
            >(&ffi_handle);
            closure.as_ref()(at_my_handle);
        }

        let data = Some(NonNull::cast::<c_void>(
            NonNull::from(&*boxed_closure)
        ));
        unsafe {
            ffi::register_callback(
                self.handle.0.as_mut(),
                data,
                c_callback::<Closure>,
            );
        }
        self.callbacks.push(
            boxed_closure
            /* as Box<dyn Fn(&MyHandle) + 'static> */
        );
    }
}

fn main ()
{
    let mut handle = MyHandle::new();
    let closure = {
        let x = cell::Cell::new(0);
        move |handle: &MyHandle| {
            if x.replace(dbg!(x.get()) + 1) < 5 {
                dbg!(handle.state());
                handle.exec();
            }
        }
    };
    handle.add_callback(Box::new(closure));
    handle.exec();
}

  • if you are to always box an input, asking for it to be boxed beforehand avoids unneeded boxing (imagine someone already having a boxed closure);

  • unless the callback code is refactored into CPS (continuation-passing style, e.g., using nested closures instead of the classic procedural style), you cannot enforce that the closure’s borrows last as long as the Handle; hence the 'static requirement. So you will need move closures.
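As an illustration of the second point, here is a minimal sketch of how a `move` closure can still share state with its caller while satisfying `'static`. The `Registry` type and `count_calls` function are hypothetical and only mimic the `Vec<Box<dyn Fn ...>>` storage used above.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical registry that, like `MyHandleWithCallbacks`, stores boxed
/// `'static` closures.
pub struct Registry {
    callbacks: Vec<Box<dyn Fn() + 'static>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { callbacks: Vec::new() }
    }

    pub fn add(&mut self, callback: Box<dyn Fn() + 'static>) {
        self.callbacks.push(callback);
    }

    pub fn run_all(&self) {
        for callback in &self.callbacks {
            callback();
        }
    }
}

/// Registers one counting closure, runs it twice, and returns the count.
pub fn count_calls() -> u32 {
    let counter = Rc::new(Cell::new(0u32));
    let mut registry = Registry::new();

    // The closure owns its own `Rc` clone, so it borrows nothing from this
    // stack frame and therefore satisfies the `'static` bound.
    let shared = Rc::clone(&counter);
    registry.add(Box::new(move || shared.set(shared.get() + 1)));

    registry.run_all();
    registry.run_all();
    counter.get()
}
```

A closure capturing `&counter` instead would be rejected by the `'static` bound, which is exactly the restriction described above.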


NB: the extern "C" fn adapter<F> is a pretty neat pattern, well found!
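For readers who want the pattern in isolation, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `mock_c_exec` is written in Rust and merely stands in for a C function that calls back, and `trampoline`, `call_through_c` and `demo` are hypothetical names. The panic guard uses `std::panic::catch_unwind` from the standard library in place of the scopeguard macro used above.

```rust
use std::ffi::c_void;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// What the (mock) C side sees: a function pointer plus an opaque pointer.
type CCallback = unsafe extern "C" fn(value: u64, data: *mut c_void);

/// Stand-in for a C function that invokes the registered callback.
unsafe extern "C" fn mock_c_exec(value: u64, callback: CCallback, data: *mut c_void) {
    unsafe { callback(value, data) }
}

/// One monomorphized trampoline per closure type `F`: it restores the
/// closure from `data` and calls it.
unsafe extern "C" fn trampoline<F: Fn(u64)>(value: u64, data: *mut c_void) {
    // Assumed contract: `data` points to a live value of type `F`.
    let closure = unsafe { &*(data as *const F) };
    // A panic must not unwind into C code, so turn it into an abort.
    if catch_unwind(AssertUnwindSafe(|| closure(value))).is_err() {
        std::process::abort();
    }
}

/// Passes `closure` through the C-style interface and lets it be called.
pub fn call_through_c<F: Fn(u64)>(closure: &F, value: u64) {
    let data = closure as *const F as *mut c_void;
    // The closure outlives this call, so `data` stays valid throughout.
    unsafe { mock_c_exec(value, trampoline::<F>, data) };
}

/// Runs a stateful closure through the trampoline and returns what it saw.
pub fn demo() -> u64 {
    let seen = std::cell::Cell::new(0u64);
    call_through_c(&|value: u64| seen.set(value + 1), 41);
    seen.get()
}
```

Here the closure is only borrowed for the duration of one call; when the C library keeps `data` around, as `register_callback` does, the closure has to be kept alive separately, which is what the `callbacks` vec is for.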


Many thanks @Yandros,

Your answer, as usual, is full of gems for me. The solution is excellent (it’s beyond my imagination, given that I’ve been struggling with this for several days). There are some minor details that I didn’t understand, sorry.

In the FFI “interface” (though your real implementation is different), you’ve used:

mod ffi {
  #[repr(C)]
  pub struct Handle {
    opaque: [u8; 0],
  }
  ...
}

Using a zero-sized opaque type is a very new technique for me; what is its advantage over the basic *const u64?

And you’ve used the anonymous lifetime almost everywhere, e.g.

pub fn delete (handle: &'_ mut Handle);

Is it more “idiomatic” than simply writing the following?

pub fn delete (handle: &mut Handle);

The technique for preventing reborrow

ffi::delete((move || handle)());

is very neat, I’ve never thought about this.
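To check my understanding of the trick, here is a minimal sketch (the `bump` and `reborrow_vs_move` functions are hypothetical): passing a `&mut` binding directly reborrows it, whereas routing it through a `move` closure moves the reference itself.

```rust
pub fn bump(counter: &mut u32) {
    *counter += 1;
}

pub fn reborrow_vs_move() -> u32 {
    let mut value = 0;
    let handle: &mut u32 = &mut value;

    // Passing `handle` directly is an implicit reborrow (`&mut *handle`),
    // so `handle` stays usable for the next call.
    bump(handle);
    bump(handle);

    // The `move` closure takes ownership of the reference and returns it;
    // using `handle` after this line would be a compile error.
    bump((move || handle)());

    value
}
```

So after the `(move || handle)()` call the reference can no longer be used by accident, which seems to be the point for `delete`.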

I may not have thought carefully enough, but would it be straightforward to modify your solution to make MyHandle private (and only MyHandleWithCallbacks public)? E.g.:

mod foo {
   struct MyHandle {
     ...
   }

   impl MyHandle {
     // move pub fn new() -> MyHandleWithCallbacks
     // to MyHandleWithCallbacks
   }
}

struct MyHandleWithCallbacks {
  ...
}

impl MyHandleWithCallbacks {
   
   pub fn new() -> Self {
     ...
   }
}

Thank you again for a very detailed and beautiful solution.

The zero-sized opaque struct has the advantage of making the different opaque types incompatible with each other, even when behind a pointer (an explicit conversion with as is required to use one in place of the other):

typedef struct peach peach_t;
peach_t * new_peach (void);

typedef struct apple apple_tv;
apple_tv * new_apple (void);

void foo (apple_tv *, peach_t *);

If *mut u8 (or *mut (), or usize, etc.) is used to represent apple_tv * and peach_t *,

extern "C" {
    fn new_apple () -> *mut ();
    fn new_peach () -> *mut ();

    fn foo (apple: *mut (), peach: *mut ());
}

then one can accidentally call it with the arguments swapped:

unsafe {
    foo(new_peach(), new_apple());
}

A declaration using opaque types would not have this kind of problem.

#[repr(C)]
struct Apple {
    _opaque: [u8; 0],
}
#[repr(C)]
struct Peach {
    _opaque: [u8; 0],
}
extern "C" {
    fn new_apple () -> *mut Apple;
    fn new_peach () -> *mut Peach;

    fn foo (apple: *mut Apple, peach: *mut Peach);
}

Although such a “hack” is far from perfect, and having to use it is quite saddening, it is still one of the best solutions currently available, until extern types land on stable.


As for the anonymous lifetimes: that’s just me being obsessed with writing lifetimes explicitly, even when they could be elided. Here, given that these are FFI functions, paying attention to lifetimes is not worth it and may lead to a false sense of security, so I would almost advise against it in the general case.

However, they can be useful when you spot a borrowing pattern.

  • Example:

    typedef struct thingy thingy_t;
    typedef struct field field_t;
    
    field_t const * get_field (thingy_t const *);
    

    This is clearly a borrow; even if C cannot enforce it. So, in this case, we can enforce that Rust users of this extern "C" function do respect it:

    #[repr(C)]
    pub struct Thingy {
        _opaque: [u8; 0],
    }
    #[repr(C)]
    pub struct Field {
        _opaque: [u8; 0],
    }
    
    extern "C" {
        fn get_field (_: &'_ Thingy) -> Option<&'_ Field>; // you can get rid of the Option if the C lib specifies that the borrow cannot fail
    }
    

More generally (i.e., outside ffi), I do think that explicitly elided lifetimes are really beneficial for the type of the return value, since whether a function borrows from its input or not drastically changes how such function can be called.
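A minimal sketch of that last point outside FFI (the `Config` and `NameView` types are hypothetical): without the `<'_>` in the return type, the signature of `view` would give no hint that the result borrows from `self`.

```rust
pub struct Config {
    name: String,
}

/// A borrowed view into a `Config`.
pub struct NameView<'a> {
    text: &'a str,
}

impl Config {
    pub fn new(name: &str) -> Self {
        Config { name: name.to_owned() }
    }

    /// `NameView<'_>` makes the borrow visible: the view cannot outlive
    /// the `Config` it was created from.
    pub fn view(&self) -> NameView<'_> {
        NameView { text: &self.name }
    }
}

impl NameView<'_> {
    pub fn as_str(&self) -> &str {
        self.text
    }
}

/// Builds a config, borrows a view from it, and measures the viewed text.
pub fn demo_len() -> usize {
    let config = Config::new("thingy");
    let view = config.view();
    view.as_str().len()
}
```

Writing `-> NameView` compiles as well, but then the reader has to look up the type to learn that `config` must stay alive while `view` is in use.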


Making MyHandle private is not easy:

  • if MyHandle were “private” (pub, but within a non-pub mod), then, since the callbacks you add take a &MyHandle, users would not be able to use it / interact with it;

  • if MyHandleWithCallbacks were the one hidden, users would not be able to add a callback.

(You can, however, use #[doc(hidden)] if you want something to be usable but not appear on the documentation.)
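A minimal sketch of the `#[doc(hidden)]` option, with simplified, hypothetical versions of the two types (no FFI involved, and the `handle` accessor and `read_state` function are made up for the example):

```rust
pub mod binding {
    /// Still nameable and usable by downstream code, but left out of the
    /// generated documentation.
    #[doc(hidden)]
    pub struct MyHandle {
        state: u64,
    }

    impl MyHandle {
        pub fn state(&self) -> u64 {
            self.state
        }
    }

    /// The type users are meant to discover in the docs.
    pub struct MyHandleWithCallbacks {
        handle: MyHandle,
    }

    impl MyHandleWithCallbacks {
        pub fn new() -> Self {
            MyHandleWithCallbacks { handle: MyHandle { state: 7 } }
        }

        pub fn handle(&self) -> &MyHandle {
            &self.handle
        }
    }
}

/// Downstream-style usage: the hidden type is reachable through the
/// documented one.
pub fn read_state() -> u64 {
    let wrapper = binding::MyHandleWithCallbacks::new();
    wrapper.handle().state()
}
```

The attribute only affects rustdoc output; it does not change visibility, so this is a documentation choice and not an encapsulation mechanism.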
