Safe interface to ffi wrapper - one context multiple callbacks

ua-kxie · March 25, 2023, 10:26am

Hello,

I'm lost trying to figure out how to expose a safe interface to a C function which takes and keeps a pointer and returns it in several callbacks, invoked at arbitrary times.

int  ngSpice_Init(SendChar* printfcn, SendStat* statfcn, ControlledExit* ngexit,
                  SendData* sdata, SendInitData* sinitdata, BGThreadRunning* bgtrun, void* userData);

After studying various examples online, I ended up with the following, with the intention that higher level users would impl the PkSpiceManager trait for their own manager type.

    pub fn call<T>(&self, manager: &T) -> i32 where T: PkSpiceManager {
        unsafe {
            (self.fnptr)(
                cbw_send_char::<T>, 
                cbw_send_stat::<T>, 
                cbw_controlled_exit::<T>, 
                cbw_send_data::<T>, 
                cbw_send_init_data::<T>, 
                cbw_bgthread_running::<T>, 
                manager as *const _ as *const c_void,
            )
        }
    }
-- snip --
unsafe extern fn cbw_bgthread_running<T>(finished: bool, id: c_int, user: *const c_void) -> c_int where T: PkSpiceManager{
    unsafe {
        <T as PkSpiceManager>::cb_bgt_state(&mut *(user as *mut T), finished, id);
    }
    0

But I've discovered that this isn't safe, if manager changes address in memory. I've tried several times to have a callback handler in a pinned box, but I get lost trying to figure out how I should expose an interface in a clean way.

Could someone point me in the correct direction, or better yet, point me to some example code that does this well?

Yandros · March 25, 2023, 4:42pm

So, you have multiple options, depending on how you want to model it.

Ownership-based

The simplest approach is to let the C stuff take ownership of your thing: that is, you'd be operating with a Box<T>, for instance, which would be Box::into_raw()ed before giving the resulting raw pointer to C.

Then, you'd return some newtype wrapping this raw pointer which could expose a back_to_box kind of functionality, by virtue of first unregistering the pointer w.r.t. the C API (e.g., through some ngSpice_Destroy function), before calling Box::from_raw() back on the wrapped pointer.

extern "C" {
   fn register(_: *mut c_void);
   fn unregister(_: *mut c_void);
}

pub
struct Registered<T> /* = */ (
    *mut T,
);

impl<T> From<Box<T>> for Registered<T> {
   fn from(b: T)
      -> Registered<T>
    {
        let ptr = Box::into_raw(b);
        unsafe { register(ptr.cast()); }
        Self(ptr)
    }
}

impl<T> Registered<T> {
    pub
    fn into_inner(self)
      -> Box<T>
    {
        unsafe {
            unregister(self.ptr.cast());
            Box::from_raw(self.ptr)
       }
    }
}

Note: the Box here, in and of itself, is not paramout; any owning pointer, even shared ones if you only lend & access to the inner value, would be fine. So Arc<T> and/or Weak<T> would be a convenient way for you to keep access to the T state while it is being used by the C library.

Borrowing/lifetime-based by using a scoped API

The other approach is to use the previous ownership-based method but for some unrelated-to-pointers data structure, that shall, symbolically/conceptually, represent the FFI state. I'll call it &mut CLibrary.

Its drop glue would unregister all the registered pointers, somehow:

extern "C" { fn unregister_all(...); }

Then, you'd be able to use the inherent 'c_library as input to a method, at which point you'd be guaranteed that &'c_library T would be a legitimate non-dangling pointer to use:

pub
struct CLibrary<'c_library> {
    /* whatever necessary state to interact with your C library */,
    _invariant: PhantomData<*mut &'c_library ()>,
}

impl Drop for CLibrary<'_> {
    fn drop(&mut self)
    {
        unsafe {
            unregister_all(self....);
        }
    }
}

impl<'c_library> CLibrary<'c_library> {
    pub
    fn register(
        &mut self,
        value: &'c_library T,
    )
    {
        unsafe { register(<*const _>::cast(value)); }
    }
}

Now, this seems fine: 'c_library must span beyond the point of last usage of the : CLibrary owned instance, therefore spanning beyond its drop glue, i.e., the point where the registered pointers are unregistered, and we know that our &'c_library reference and thus pointer won't dangle for that duration/lifetime.

Except... "beyond last usage [...] therefore spanning beyond its drop glue" is WRONG

Indeed, mem::forget(c_library_instance); is a thing safe code owning the instance could write. That would result in a last usage of the c_library_instance which would skip the drop glue, and thus skip the necessary-for-soundness unregistering of the pointers.

The solution to this is to never lend ownership of your CLibrary instance to user code. Which can be done, but only through a scoped / callback-based API;

impl CLibrary<'_> {
    pub
    fn with_new<R>(scope: impl FnOnce(_: &'_ mut CLibrary<'_>) -> R)
      -> R
    {
        /* only this local function, which we control, owns the `CLibrary` instance, and we know we don't `forget(it)` */
       let it = CLibrary { ... };
       scope(&mut CLibrary)
    }
}

Using `Pin` guarantees

Now, a scoped API can be quite cumbersome to use; and we only did that because of the need for something to unregister stuff before it is dropped, without room for forget shenanigans.

And it turns out that this is very related to what Pin wants to achieve.

the exact semantics of Pin are very and incredibly subtle so ask for code review in this forum if going down this route.

The exact property that we can have, with Pin, some Pointee<T> type, and some Ptr<Pointee<T>> such as &'_ Pointee<T>, is that:

unless Pointee<T> : Unpin / if Pointee<T> : !Unpin;
and if Pointee<T> has drop glue (e.g., Pointee<T> : Drop);
and if we get our hands onto some Pin<&'does_not_matter Pointee<T>> "witness" instance,

then, and only then, we know that:

either the Pointee is never moved or deallocated, i.e., the 'does_not_matter was kind of 'static;
or the Pointee's aforementioned drop glue is to be run before the value is moved or deallocated.

Moreover, if Pointee happens to hold a value of type T inside it, then a fortiori, that value will also not have moved, provided we never expose an API/accessor to that value.

From all this, we can thus write:

use ::core::marker::Unpin as PinSensitive;

type Pointer<'lt, Pointee> = &'lt mut Pointee;

pub
struct Unregisterer<T> {
    value: T,
    _pin_sensitive: PinSensitive,
}

impl<T> From<T> for Unregisterer<T> { ... }

impl<T> Unregisterer<T> {
    pub
    fn register(
        self: Pin<Pointer<'_, Self>>,
    )
    {
        unsafe { register(ptr::addr_of!(self.value).cast()); }
    }
}

impl<T> Dropf or Unregisterer<T> {
    fn drop(&mut self)
    {
        // TODO: ensure this function is ok to run for an unregistered thing
       // or add a boolean to check for registration
        unsafe { unregister(ptr::addr_of!(self.value)); }
    }
}

ua-kxie · March 26, 2023, 3:43am

Thank you very much for your detailed reply.

I'm not sure if this changes anything (and I'm sorry for not including this in the original question if it does), the CLib I'm working with says that no memory is freed across the FFI boundary. Nothing is done with the context pointer except passing it back "as is" in the callbacks. I don't see anyway to unregister anything except to not use the context pointer inside the callbacks, or calling the register function again with null pointers. The CLib also accepts a "quit" string as a command which "unloads" the dll, according to the documentation.

In the ownership based solution, am I giving Box to the user? wouldn't this allow them to move/drop it?

Would it work if I just chose to accept only a Pin<Box> in the constructor? like:

impl<T> From<Pin<Box<T>>> for Registered<T>

I definitely intend to ask for a code review! But right now, I feel like I need to do so much more just to satisfy myself

Yandros · March 26, 2023, 5:06pm

So, if there is no way to free/unregister (I was afraid of that, ), then you'll have to go with the owning-based API, and forgo the .back_into_box() API (so just don't return a handle).

Remember you don't need a Box<T> strictly-speaking: if you can afford only having &T access within your callbacks, then an Arc<T> or a sync::Weak<T> would also do:

unsafe // Safety: the pointer must be sound to `&*` dereference into a `&Self::Pointee`,
       // at any time.
trait IntoRaw {
    type Pointee;
    fn into_raw(self) -> *const Pointee;
}

unsafe
impl<T> IntoRaw for Box<T> {
    type Pointee = T;
    fn into_raw(self) -> *const Pointee { Self::into_raw(self) }
}

unsafe
impl<T> IntoRaw for Arc<T> {
    type Pointee = T;
    fn into_raw(self) -> *const Pointee { Self::into_raw(self) }
}

unsafe
impl<T> IntoRaw for sync::Weak<T> {
    type Pointee = T;
    fn into_raw(self) -> *const Pointee { Self::into_raw(self) }
}

Note that by never calling from_raw back on these things, you'd always end up leaking memory, although in the Weak case, at least, you'd only be leaking the direct backing memory for T (that is, if T, itself, were to manage other memory, then that memory would be freed when all the strong Arc<T> instances are dropped, independently of the outstanding Weaks.

ua-kxie · March 27, 2023, 2:38am

Ok, I think I get the gist of it (for the ownership-based solution, please correct if wrong):

The idea is that higher level user loses ownership of the manager type while the CLib is holding a pointer to it, thus allowing the wrapper to guarantee that the manager type does not 1. get moved and 2. get dropped. When higher level users want access to the manager, we give it after unregistering it from the CLib so that even if it does get moved/dropped, it does not matter.

If so I think calling the register functions again with null pointers will work fine?

Would this force users to poll for changes, or would they still be able to react to callbacks as they come?

Thank you for your feedback.

system · June 25, 2023, 2:39am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Stateful closure for FFI callback help	4	2086	September 24, 2019
C FFI Questions: rust methods in callbacks help	6	936	August 20, 2019
Should FFI callbacks into Rust be marked `unsafe`?	6	1478	January 12, 2023
Callbacks and Function Pointers across the FFI boundary [Feedback Wanted]	6	1531	January 12, 2023
Is there an established technique for FFI callbacks? help	4	945	January 24, 2021

Safe interface to ffi wrapper - one context multiple callbacks

Ownership-based

Borrowing/lifetime-based by using a scoped API

Using Pin guarantees

Related Topics

Using `Pin` guarantees