Unwanted copies of values when using unsafe pointers for FFI bindings

I am creating a Rust API based on INDIGO using bindgen and FFI. Indigo is fully asynchronous and requires native C clients to implement six callback functions.

After a lot of trial and error and quite a bit of googling, I landed on a design based on a CallbackHandler trait defining safe versions of the callback functions and a generic Client.

So far, this seems to work in all aspects except that for some reason a the client ends up using a copy of the handler, i.e. state is not persistent.

I would be super grateful for any of the following:

  • Any insight on why the handler state is not persisted / the code is copying the TestHandler handler struct.
  • Suggestions on an alterative approach for implementing the six, safe callback handlers required by the native API (only one shown).
  • Improvements in general - I am a rust beginner.

I have boiled down to an (unfortunately) rather large test case in order to illustrate the problem.

Please find some detailed notes and explanations below!

/Chris
PS This is a rather long request for help and for this I apologise, but I feel the details of the FFI bindings etc are necessary to fully capture the problem.

Notes

Bindgen generates a native c client similar to:

pub struct c_client {
    /// Arbitrary data structure holding client data.
    pub context: *mut ::std::os::raw::c_void,
    /// System callback called when client is attached to the bus.
    pub attach:
        ::std::option::Option<unsafe extern "C" fn(client: *mut c_client) -> c_result>,
    // Additional unsafe callbacks elided for brevity.
}

Attaching a c_client is done through a native FFI function attach_client:

extern "C" {
     pub fn attach_client(client: *mut c_client) -> c_result;
}

My matching safe Client contains a reference to the FFI c_client to be used as an argument for the native c_client of the attach_client and other native calls (elided for brevity):

#[derive(Debug)]
pub struct Client<T:CallbackHandler> {
    sys: c_client,
    handler: T,
}

The CallbackHandler trait looks like the following:

pub trait CallbackHandler {
    fn on_client_attach(&mut self, client: &mut Client<impl CallbackHandler>) -> Result<(),String>;
    // Additional safe callbacks elided for brevity.
}

I would really like to provide a mutable reference to Client as it greatly facilitates the CallbackHandler trait implementations to make additional calls using the Client API.

To connect the native API to the safe Client and to find the callback handler, I use the context field of the native c_client struct. This creates a circular reference that is easily circumvented:

/// Create a new `Client` for the provided `CallbackHandler`.
pub fn new(handler: T) -> Self {
    let sys = c_client {
        context: core::ptr::null_mut(),
        attach: Some(Self::on_attach),
    };
    let mut client = Client { sys, handler };

    // save a reference to the client sys context
    let ptr = core::ptr::addr_of_mut!(client) as *mut c_void;
    client.sys.context = ptr;

    client
}

However, the only way (that I have found) of using that circular reference in e.g. in the native on_attach is to is to create two unsafe raw pointers to the same data structure:

unsafe extern "C" fn on_attach(client: *mut c_client) -> c_result {
    // rehydrate the safe client of the Rust API
    let ptr = (*client).context;
    // create two separate clients pointing to the same memory to circumvent circular dependency
    let c1: &mut Client<T> = &mut *(ptr as *mut Client<T>);
    let c2: &mut Client<T> = &mut *(ptr as *mut Client<T>);

    // call the handler callback with the rehydrated safe client
    if let Err(e) = c1.handler.on_client_attach(c2) {
        println!("Client handler error: {}", e);
        return c_result_FAILED;
    }
    c_result_OK
}

Finally, the failing unit test where I fake a native system API call to illustrate the problem:

#[test]
fn test() {
    let t = TestHandler { visited: false, };
    let mut c = Client::new(t);

    c.attach().unwrap();

    // simulate callback from system API
    let ptr = &core::ptr::addr_of_mut!(c.sys);
    unsafe {
        let _ = Client::<TestHandler>::on_attach(*ptr);
    }

    assert!(c.handler.visited, "Handler not visited")
}

The full compiling but failing code is available in the playground.

Regarding the approach of using a trait for specifying the callback handlers, using closures might be a more idiomatic way in rust, but I don't see a good way to provide them to the Client. Perhaps optional fields and setters could be used but it seems more messy. This said, I have not yet explored this option in detail.

You are leaking a stack pointer in your Client::new.

Because your struct is self referential the address of the client struct changes after the value is returned (and moved). Add debug prints of the address of the struct inside new and the address of the returned struct to see this. One solution is to return a boxed struct so the address you use is heap allocated and won't change.

    pub fn new(handler: T) -> Box<Self> {
        let sys = c_client {
            context: core::ptr::null_mut(),
            attach: Some(Self::on_attach),
        };
        let mut client = Box::new(Client { sys, handler });

        // save a reference to the client sys context
        client.sys.context = (&mut *client) as *mut Self as *mut c_void;

        client
    }
2 Likes

Thanks a lot for the proposed solution, it does indeed resolve the problem.

Also the (now obvious) tip of printing the pointers was very useful.

Printing the debug addresses it was clear that client data structure did change, but I believe this to be because it pointed to more or less random areas in memory. The visited field was initialised to true on most occasions except a few where it got set to false. This happened even after implementing Default, so no default initialisation took place and given that the value was false there was no copying going on either.

Given that my understanding is correct, I am surprised that my initial test code did not blow up with a seg fault. (Actually, this did happen with the real code base, but so far not with the smaller test case).

Note that the printed addresses of the client still changes with the Box solution though the struct fields now remain unchanged. I understand this to be because we the references sent between the calls are passed by value (and thus have different addresses) but they all point to the same heap structure.

When first trying your solution I missed dereferencing the client which of course did not work as I got the pointer to the box instead of the client... :slight_smile:

Thanks again for the help!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.