Closure for C callback

Hello,

I am desperately trying to set the callback of the GLib g_dbus_connection_signal_subscribe function. I have tried several approaches, but I always get a crash.

I wrote the following code:

let property_change_callback = |property_name: &str| { ... }

proxy.on_property_change(property_change_callback);

let g_main_loop = unsafe { glib_sys::gio::g_main_loop_new(std::ptr::null_mut(), 1) };
unsafe { glib_sys::gio::g_main_loop_run(g_main_loop) };

Here is the on_property_change function definition:

pub fn on_property_change<C>(&mut self, callback: C)
    where
        C: Fn(&str, &DBusValue) + 'static,
    {
        const MEMBER_NAME: &str = "PropertiesChanged";

        let connection = unsafe { glib_sys::gio::g_dbus_proxy_get_connection(self.handle) };
        let sender = std::ptr::null_mut::<glib_sys::gio::gchar>();
        let member =
            unsafe { std::ffi::CStr::from_ptr(MEMBER_NAME.as_ptr().cast::<std::ffi::c_char>()) };
        let interface_name = std::ptr::null_mut::<std::ffi::c_char>();

        let callback_box = Box::new(callback);
        self.callback_list.push(callback_box);
        let callback = self.callback_list.last_mut().unwrap() as *mut Box<dyn Fn(&str, &DBusValue)>;

        let object_path = std::ffi::CString::new(self.object_path.as_str()).unwrap();

        let flags = glib_sys::gio::GDBusSignalFlags_G_DBUS_SIGNAL_FLAGS_NONE;

        let arg0 = std::ptr::null::<std::ffi::c_char>();

        unsafe {
            glib_sys::gio::g_dbus_connection_signal_subscribe(
                connection,
                sender,
                interface_name,
                member.as_ptr(),
                object_path.as_ptr(),
                arg0,
                flags,
                Some(Self::on_property_changed),
                callback.cast::<std::ffi::c_void>(),
                None,
            );
        }

self.callback_list is of type Vec<Box<dyn Fn(&str, &DBusValue) + 'static>>

The Self::on_property_changed function definition is:

extern "C" fn on_property_changed(
        _connection: *mut glib_sys::gio::GDBusConnection,
        _sender_name: *const glib_sys::gio::gchar,
        _object_path: *const glib_sys::gio::gchar,
        _interface_name: *const glib_sys::gio::gchar,
        _signal_name: *const glib_sys::gio::gchar,
        parameters: *mut glib_sys::gio::GVariant,
        user_data: glib_sys::gio::gpointer,
    ) {
        let callback = unsafe { &mut *user_data.cast::<Box<fn(&str, &DBusValue)>>() };

        let dbus_value = DBusValue::from_gvariant_raw_ptr(parameters);
        if let Some(dbus_value) = dbus_value
            && let Ok(dbus_struct) = TryInto::<DBusStruct>::try_into(dbus_value)
            && let Ok(dbus_array) =
                TryInto::<DBusArray>::try_into(dbus_struct.iter().nth(1).unwrap())
        {
            for property_changed in dbus_array.iter() {
                if let Ok(property_changed) = TryInto::<DBusDictEntry>::try_into(property_changed)
                    && let Ok((property_name, property_value)) =
                        TryInto::<(String, DBusValue)>::try_into(property_changed.clone())
                {
                    callback(&property_name, &property_value);
                }
            }
        }
    }

I always get a crash at callback(&property_name, &property_value);. For the current code I get:

Signal: SIGSEGV (signal SIGSEGV: address not mapped to object (fault address: 0x1))

The callback can be called multiple times.

Thank you very much for any help.

C doesn't support fat pointers.

A pointer to a dyn Trait is 128 bits wide on a 64-bit system. Passing it to a C API will truncate it to 64 bits and turn it into crashy gibberish.

In Rust all the references and pointers have the same syntax, but the "unsized" ones like slices and dyn are a different type that would be a struct passed by value in C.

You can use thin fn() types (that don't support closures with captured context) to have C-compatible callback.

Otherwise you will need to use C APIs that can supply an additional user-provided pointer. Set the callback to a function that takes the user-data pointer, casts it to the right type, and calls the closure through it. That user pointer needs to double-Box the closure to have a C-compatible thin pointer to a Rust fat pointer to dyn Trait. If you use concrete types (even generic ones, just not dyn), you can avoid double boxing. You still need user-data to support closures that capture.
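To make the size difference tangible, here is a small sketch that measures the pointer types involved. The function name `pointer_sizes` is made up for this illustration, and the "two words" layout of fat pointers is what current rustc produces on mainstream targets rather than a formal language guarantee.

```rust
use std::mem::size_of;

/// Returns the sizes, in bytes, of (thin pointer, fat pointer, double box).
fn pointer_sizes() -> (usize, usize, usize) {
    // A thin pointer is one machine word, like C's `void *`.
    let thin = size_of::<*const u8>();
    // A pointer to `dyn Trait` carries a data pointer plus a vtable pointer.
    let fat = size_of::<Box<dyn Fn()>>();
    // Boxing the fat pointer again gives a thin pointer to it.
    let double_box = size_of::<Box<Box<dyn Fn()>>>();
    (thin, fat, double_box)
}
```

Calling `pointer_sizes()` on a typical 64-bit target should show the fat pointer at twice the size of the thin one, while the double box is back to a single word, which is why only the latter survives a trip through `void *`.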


Thank you for your quick response.

A pointer to a dyn Trait is 128 bits wide on a 64-bit system. Passing it to a C API will truncate it to 64 bits and turn it into crashy gibberish.

In Rust all the references and pointers have the same syntax, but the "unsized" ones like slices and dyn are a different type that would be a struct passed by value in C.

Thank you for this information.

You can use thin fn() types (that don't support closures with captured context) to have C-compatible callback.

So I suppose this is what I need to use ? Closures are not possible I suppose ?

Otherwise you will need to use C APIs that can supply an additional user-provided pointer. Set the callback to a function that takes the user-data pointer, casts it to the right type, and calls the closure through it. That user pointer needs to double-Box the closure to have a C-compatible thin pointer to a Rust fat pointer to dyn Trait. If you use concrete types (even generic ones, just not dyn), you can avoid double boxing. You still need user-data to support closures that capture.

In my case the callback can supply additional user provided pointer. So I suppose it can work with closures ? Can you please demonstrate with an example ?

// user_data is Box<Box<dyn Fn()>>
unsafe extern "C" fn callback_for_c(user_data: *mut c_void) {
    let rust_callback: &Box<dyn Fn()> = &*user_data.cast(); 
    (rust_callback)();
}

or

// user_data is Box<F> where F: Fn()
unsafe extern "C" fn callback_for_c<F: Fn() + Sized>(user_data: *mut c_void) {
    let rust_callback: &F = &*user_data.cast();
    (rust_callback)();
}

I'm really sorry but I still get a SIGSEGV crash:

Signal: SIGSEGV (signal SIGSEGV: address not mapped to object (fault address: 0x28))

Here is the on_property_change code:

pub fn on_property_change<C>(&mut self, callback: C)
    where
        C: Fn(&str, &DBusValue) + 'static,
    {
        const MEMBER_NAME: &str = "PropertiesChanged";

        let connection = unsafe { glib_sys::gio::g_dbus_proxy_get_connection(self.handle) };
        let sender = std::ptr::null_mut::<glib_sys::gio::gchar>();
        let member =
            unsafe { std::ffi::CStr::from_ptr(MEMBER_NAME.as_ptr().cast::<std::ffi::c_char>()) };
        let interface_name = std::ptr::null_mut::<std::ffi::c_char>();
        
        // double Box
        let callback = Box::new(Box::new(callback));

        let object_path = std::ffi::CString::new(self.object_path.as_str()).unwrap();

        let flags = glib_sys::gio::GDBusSignalFlags_G_DBUS_SIGNAL_FLAGS_NONE;

        let arg0 = std::ptr::null::<std::ffi::c_char>();

        unsafe {
            glib_sys::gio::g_dbus_connection_signal_subscribe(
                connection,
                sender,
                interface_name,
                member.as_ptr(),
                object_path.as_ptr(),
                arg0,
                flags,
                Some(Self::on_property_changed),
                Box::into_raw(callback).cast::<std::ffi::c_void>(),
                None,
            );
        }
    }

Here is the Self::on_property_changed function:

extern "C" fn on_property_changed(
        _connection: *mut glib_sys::gio::GDBusConnection,
        _sender_name: *const glib_sys::gio::gchar,
        _object_path: *const glib_sys::gio::gchar,
        _interface_name: *const glib_sys::gio::gchar,
        _signal_name: *const glib_sys::gio::gchar,
        parameters: *mut glib_sys::gio::GVariant,
        user_data: glib_sys::gio::gpointer,
    ) {
        let callback = unsafe { &*user_data.cast::<Box<dyn Fn(&str, &DBusValue)>>() };

        let dbus_value = DBusValue::from_gvariant_raw_ptr(parameters);
        if let Some(dbus_value) = dbus_value
            && let Ok(dbus_struct) = TryInto::<DBusStruct>::try_into(dbus_value)
            && let Ok(dbus_array) =
                TryInto::<DBusArray>::try_into(dbus_struct.iter().nth(1).unwrap())
        {
            for property_changed in dbus_array.iter() {
                if let Ok(property_changed) = TryInto::<DBusDictEntry>::try_into(property_changed)
                    && let Ok((property_name, property_value)) =
                        TryInto::<(String, DBusValue)>::try_into(property_changed.clone())
                {
                    callback(&property_name, &property_value);
                }
            }
        }
    }

Thank you very much in advance for any help

You have a dangerous mix of type inference and casts. Your code mixes up the generic type C and dyn Fn. If you box C, you get Box<C>, not Box<dyn Fn>. Without explicit types, the compiler is never told to convert to dyn Fn.

Since you're taking a generic C, don't bother with double boxing and dyn Fn. Make on_property_changed generic too, and call the closure of type C in it. Generic functions work fine with the C language: the compiler generates a separate plain function for every concrete type.
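As a rough sketch of that suggestion (not the actual GLib binding): `fake_c_subscribe` below is a hypothetical stand-in for the C API, the `i32` argument replaces the real signal parameters, and `trampoline`, `drop_box`, `subscribe` and `delivered_value` are names invented for the example. The point is that the closure is boxed once as `Box<C>` and the trampoline casts the user-data pointer back to the same `C`.

```rust
use std::cell::Cell;
use std::ffi::c_void;
use std::rc::Rc;

// Hypothetical stand-in for a C function that stores a callback and user data.
// For the demo it calls the callback once, then runs the destroy function.
unsafe extern "C" fn fake_c_subscribe(
    callback: unsafe extern "C" fn(value: i32, user_data: *mut c_void),
    user_data: *mut c_void,
    destroy: unsafe extern "C" fn(user_data: *mut c_void),
) {
    unsafe {
        callback(7, user_data);
        destroy(user_data);
    }
}

// Generic trampoline: one concrete function is compiled per closure type `C`.
unsafe extern "C" fn trampoline<C: Fn(i32)>(value: i32, user_data: *mut c_void) {
    // `user_data` came from `Box::<C>::into_raw`, so it is a thin `*mut C`.
    let callback: &C = unsafe { &*user_data.cast::<C>() };
    callback(value);
}

// Rebuilds the `Box<C>` and drops it, which frees the closure.
unsafe extern "C" fn drop_box<C>(user_data: *mut c_void) {
    drop(unsafe { Box::from_raw(user_data.cast::<C>()) });
}

fn subscribe<C: Fn(i32) + 'static>(callback: C) {
    // Single box: `Box<C>` is a thin pointer because `C` is a sized type.
    let user_data = Box::into_raw(Box::new(callback)).cast::<c_void>();
    unsafe { fake_c_subscribe(trampoline::<C>, user_data, drop_box::<C>) };
}

// Subscribes a capturing closure and reports the value it received.
fn delivered_value() -> i32 {
    let seen = Rc::new(Cell::new(0));
    let seen_in_callback = Rc::clone(&seen);
    subscribe(move |value| seen_in_callback.set(value));
    seen.get()
}
```

With a real C API the destroy function would go into the slot that accepts a free function for the user data, so the box is released when the library no longer needs it.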


Elaborating what @kornel said, this line creates a Box<Box<C>>. You’d need to give an explicit type hint, like let callback: Box<Box<dyn Fn(…)>> = _, to get the callback to coerce into a dyn Fn (and maybe it’d require an as Box<dyn Fn(…)> cast inside the outer Box::new, I’m not entirely sure without trying). Though, as mentioned, it’d be better to take advantage of the C generic and just do a single Box, in which case an explicit type hint for callback: Box<C> is unnecessary.

A modified version of the code using the C generic would then want to cast user_data to a pointer to C rather than a pointer to Box<dyn Fn(…)>.
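If you do want to keep `dyn Fn`, here is a minimal sketch of the explicit type hints, done in two steps so there is no doubt about where the coercion happens. The alias `DynCallback` and the helpers `into_user_data`, `call`, `free` and `demo` are invented for the illustration, and the closure returns a `usize` only to make the result observable.

```rust
use std::ffi::c_void;

// Hypothetical alias for the trait-object callback type.
type DynCallback = Box<dyn Fn(&str) -> usize>;

fn into_user_data<C>(callback: C) -> *mut c_void
where
    C: Fn(&str) -> usize + 'static,
{
    // The annotation forces the unsizing coercion `Box<C>` -> `Box<dyn Fn(..)>`.
    let inner: DynCallback = Box::new(callback);
    // The outer box is a thin pointer to the fat inner box.
    let outer: Box<DynCallback> = Box::new(inner);
    Box::into_raw(outer).cast::<c_void>()
}

/// # Safety
/// `user_data` must come from `into_user_data` and must not have been freed.
unsafe fn call(user_data: *mut c_void, name: &str) -> usize {
    // Cast back to exactly the type that was boxed: `Box<dyn Fn(..)>`.
    let callback: &DynCallback = unsafe { &*user_data.cast::<DynCallback>() };
    callback(name)
}

/// # Safety
/// Same as `call`, and it must be called at most once per pointer.
unsafe fn free(user_data: *mut c_void) {
    drop(unsafe { Box::from_raw(user_data.cast::<DynCallback>()) });
}

fn demo() -> usize {
    let prefix = String::from("changed: ");
    let user_data = into_user_data(move |name: &str| prefix.len() + name.len());
    let total = unsafe { call(user_data, "Volume") };
    unsafe { free(user_data) };
    total
}
```

The type written on the creating side and the type in the cast on the callback side must match exactly; a mismatch between `Box<C>` and `Box<dyn Fn(..)>` is not caught by the compiler once everything goes through `*mut c_void`.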


I wrote down a little demo as a playground:

// example of passing something like
// a Box<dyn FnMut(i32)> to a C function

// the C function (let’s pretend this function would actually be defined in C somehow)
pub unsafe extern "C" fn subscribe(
    callback: unsafe extern "C" fn(arg: i32, data: *mut ()),
    data: *mut (),
    destroy_data: Option<unsafe extern "C" fn(*mut ())>,
) {
    // for demo purposes, let’s just make this "subscribe" function call the callback twice, then destroy it:
    unsafe {
        callback(42, data);
        callback(1337, data);
        if let Some(destructor) = destroy_data {
            destructor(data);
        }
    }
}

////////////////////////////////////////////////////////////////////////////////

// now let’s create a Rust binding for this!

// ideally, we’ll end up with something straightforward like
// fn subscribe(callback: impl FnMut(i32)) { … }

// let's start with something like this, because these function pointers + data pointer
// really kinda belong together, anyway:
struct Callback {
    callback: unsafe extern "C" fn(arg: i32, data: *mut ()),
    data: Data,
}
// we’ll separate out the `data`-related fields because we don’t need the `callback` function for destruction
struct Data {
    data: *mut (),
    destroy_data: Option<unsafe extern "C" fn(*mut ())>,
}
// these types (Callback & Data) own the data behind `data`, so they’ll destroy it on `Drop`:
impl Drop for Data {
    fn drop(&mut self) {
        unsafe {
            // same idea as in the stub "C code" above
            if let Some(destructor) = self.destroy_data {
                destructor(self.data);
            }
        }
    }
}

// the `Callback` struct already allows us a safe wrapper, as follows

use std::mem::ManuallyDrop;
fn subscribe_safe(callback: Callback) {
    // we’ll delegate responsibility for destruction to the "C code"
    // so we must not do the destruction ourselves then
    let callback = ManuallyDrop::new(callback);
    unsafe {
        subscribe(
            callback.callback,
            callback.data.data,
            callback.data.destroy_data,
        );
    }
}

// now how to transform this into a more Rust-like interface type?

// first, simple approach:
fn subscribe_rust_ver_1(callback_box: Box<dyn FnMut(i32)>) {
    // let's skip generics for simplicity for now

    // how to transform Box<dyn FnMut(i32)> into a `Callback`?

    // first idea: we can have the data pointer `*mut ()` be a cast version of
    // the `Box<dyn FnMut(i32)>`?
    // problem: `Box<dyn FnMut(i32)>` is a fat pointer.
    // straightforward solution: add another `Box`:

    let boxed: Box<Box<dyn FnMut(i32)>> = Box::new(callback_box);
    let boxed_raw: *mut Box<dyn FnMut(i32)> = Box::into_raw(boxed);
    let data: *mut () = boxed_raw.cast(); // cast `*mut Box<dyn FnMut(i32)>` to `*mut ()`
    // to destroy the `boxed` value, we’ll need to cast back, rebuild the `Box`, then just drop it
    unsafe extern "C" fn destroy_data(data: *mut ()) {
        let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
        let boxed: Box<Box<dyn FnMut(i32)>> = unsafe { Box::from_raw(boxed_raw) };
        drop(boxed);
    }
    let data = Data {
        data,
        destroy_data: Some(destroy_data),
    };

    // now finally, the `Callback` will need some code to be able to _call_ that
    // wrapped-up `dyn FnMut(i32)`. For this, we don’t turn `*mut Box<dyn FnMut(i32)>`
    // into an owned `Box<Box<dyn FnMut(i32)>>` because we don’t want to destroy it when the callback ends,
    // but instead just make a reference to the inner box,
    // of type `&mut Box<dyn FnMut(i32)>`, out of it
    unsafe extern "C" fn callback(arg: i32, data: *mut ()) {
        let reference_to_box: &mut Box<dyn FnMut(i32)>;
        unsafe {
            let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
            reference_to_box = &mut *boxed_raw;
        };
        // now we’re in Rust land, and can just call it:
        reference_to_box(arg);
    }

    let callback = Callback { callback, data };

    subscribe_safe(callback);
}

// let's run this with miri to see that everything works properly, no UB, no leaks
fn main1() {
    // some data with ownership to capture:
    let s: String = "Hello, number is: ".into();
    subscribe_rust_ver_1(Box::new(move |n: i32| {
        println!("{s}{n}");

        // recent-ish Rust versions also nicely handle
        // properly disallowing panics to cross "C" function/interface
        // boundaries (the program aborts instead).
        // If a C api *can* handle error cases, you’d have to e.g.
        // `catch_unwind` the error before going back to "C", and then probably
        // report / pass / whatever… the error somewhere appropriately

        // you can test it by uncommenting
        // if n == 1337 {
        //     panic!("woah!");
        // }
    }));
}

// One adjustment for a `subscribe`-like function: if the callback can be
// called *after* `subscribe` returns we’ll want to add a `+ 'static` bound
// on the `FnMut`. If the callback can be called from a different thread,
// we’ll want to add a `+ Send` bound on the `FnMut`.
// If it can be called concurrently or re-entrantly
// we may need to switch to `Fn` and then also add a `+ Sync` bound.

fn subscribe_rust_ver_2(callback_box: Box<dyn FnMut(i32) + Send + 'static>) {
    subscribe_rust_ver_1(callback_box);
}

// Problems so far:
// * we don’t use the flexibility of `callback` being an arbitrary `fn` pointer
// * we add 2 layers of indirection
// * we don’t ever make use of the possibility of `None` destructors

// let's start at the end:
// we already know how to construct a `Data` struct for owning
// a `Box<Box<dyn FnMut(i32)>>`.
/*
    let boxed: Box<Box<dyn FnMut(i32)>> = Box::new(callback_box);
    let boxed_raw: *mut Box<dyn FnMut(i32)> = Box::into_raw(boxed);
    let data: *mut () = boxed_raw.cast(); // cast `*mut Box<dyn FnMut(i32)>` to `*mut ()`
    // to destroy the `boxed` value, we’ll need to cast back, rebuild the `Box`, then just drop it
    unsafe extern "C" fn destroy_data(data: *mut ()) {
        let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
        let boxed: Box<Box<dyn FnMut(i32)>> = unsafe { Box::from_raw(boxed_raw) };
        drop(boxed);
    }
    let data = Data {
        data,
        destroy_data: Some(destroy_data),
    };
*/

// The same code works more generically
// for any `Box<T>` actually:

impl Data {
    fn from_box<T>(boxed: Box<T>) -> Self {
        let boxed_raw: *mut T = Box::into_raw(boxed);
        let data: *mut () = boxed_raw.cast(); // cast `*mut T` to `*mut ()`
        // to destroy the `boxed` value, we’ll need to cast back, rebuild the `Box`, then just drop it
        unsafe extern "C" fn destroy_data<T>(data: *mut ()) {
            let boxed_raw: *mut T = data.cast();
            let boxed: Box<T> = unsafe { Box::from_raw(boxed_raw) };
            drop(boxed);
        }
        // we can add one improvement:
        // boxes to zero-sized data don’t own memory,
        // and if the target type is also without destructors
        // we don’t *need* to destroy anything actually.
        let needs_drop = std::mem::needs_drop::<T>() || std::mem::size_of::<T>() > 0;

        // Thus, we make the `destroy_data` be `None` if nothing needs to happen:
        Data {
            data,
            destroy_data: needs_drop.then_some(destroy_data::<T>),
        }
    }
}

// you can run test with miri to see there are no leaks
#[test]
fn test_data_none() {
    struct Foo;
    struct Bar(u8);
    struct Baz;
    impl Drop for Baz {
        fn drop(&mut self) {
            println!("dropping");
        }
    }
    let boring_type = Foo;
    let non_zero_sized = Bar(123);
    let has_destructor_zero_size = Baz;

    let data1 = Data::from_box(Box::new(boring_type));
    assert!(data1.destroy_data.is_none());

    let data2 = Data::from_box(Box::new(non_zero_sized));
    assert!(data2.destroy_data.is_some());

    let data3 = Data::from_box(Box::new(has_destructor_zero_size));
    assert!(data3.destroy_data.is_some());
}

// also let's have a `Data` accessor
impl Data {
    // # Safety
    // must be called with the same `T` as in the `Box<T>`
    // that this `Data` was constructed from
    unsafe fn inner_mut<T>(&mut self) -> &mut T {
        unsafe {
            // we never construct `Data` with a null pointer, anyway
            // so just dereferencing is fine if the type is correct
            &mut *self.data.cast::<T>()
        }
    }
}

// let's revisit `callback`

// the first approach of constructing `Callback` from above
// can now be written as follows
impl Callback {
    fn from_dyn_fn(callback_box: Box<dyn FnMut(i32) + Send + 'static>) -> Self {
        let data = Data::from_box(Box::new(callback_box));

        unsafe extern "C" fn callback(arg: i32, data: *mut ()) {
            let reference_to_box: &mut Box<dyn FnMut(i32)>;
            unsafe {
                let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
                reference_to_box = &mut *boxed_raw;
            };
            // now we’re in Rust land, and can just call it:
            reference_to_box(arg);
        }
        Callback { callback, data }
    }
}

// now, we *could* use this (or the previous equivalent version anyway) to
// create a convenient api that accepts any kind of `impl FnMut` parameter:
fn subscribe_rust_ver_3(callback: impl FnMut(i32) + Send + 'static) {
    let callback = Callback::from_dyn_fn(Box::new(callback));
    subscribe_safe(callback);
}
// but as mentioned before *now* we’re *doubly boxing* the closure,
// and still not making use of the possibility of passing different
// function pointers into the `unsafe extern "C" fn callback(i32, *mut ())` field

// so let’s fix that.


// The approach here basically just generalizes the exact same code from above
// to any kind of type `F` in place of `Box<dyn FnMut(i32)>` now.
// We have actually already seen this kind of thing in the previous section,
// but it’s worth highlighting this: `unsafe extern "C" fn`s can be made
// generic functions without any issue (and you saw this on `destroy_data<T>` above)
// and this does actually simply amount to compiling down to different concrete
// function pointers to be passed to `C` code in different monomorphized instances
// of a calling generic Rust function. 
//
// Hence the same is all that we need to make good use of the flexibility in
// being able to use arbitrary callback function pointers:

// Basically, just replace `Box<dyn FnMut(i32) + Send + 'static>`
// with some generic `F` in the same code as `from_dyn_fn` above:
impl Callback {
    fn from_generic_fn_mut<F>(callback_closure: F) -> Self
    where
        F: FnMut(i32) + Send + 'static,
    {
        let data = Data::from_box(Box::new(callback_closure));

        unsafe extern "C" fn callback<F>(arg: i32, data: *mut ())
        where
            F: FnMut(i32) + Send + 'static,
        {
            let reference_to_closure: &mut F;
            unsafe {
                let boxed_raw: *mut F = data.cast();
                reference_to_closure = &mut *boxed_raw;
            };
            // now we’re in Rust land, and can just call it:
            reference_to_closure(arg);
        }
        Callback {
            callback: callback::<F>,
            data,
        }
    }
}

// since we’re always ensuring `Callback` is constructed from `Send` closures,
// we can officially allow it to be sent between threads
unsafe impl Send for Callback {}

// since `Callback` doesn’t actually have *any* APIs that allow you to do anything with it
// through only an immutable reference, anyway, we can also implement `Sync`
unsafe impl Sync for Callback {}

// (for the above impls and comments to make much sense,
// let's imagine that in reality `Callback` was actually a `pub struct`,
// but still with private fields otherwise there’s no encapsulation.)

// finally, putting it all together:

fn subscribe_rust_final_version(callback: impl FnMut(i32) + Send + 'static) {
    let callback = Callback::from_generic_fn_mut(callback); // yay, no more extra `Box::new`
    subscribe_safe(callback);
}

// let's run this with miri to see that everything works properly, no UB, no leaks
fn main2() {
    // some data with ownership to capture:
    let s: String = "Hello again, one of the numbers still is: ".into();
    // pass the closure directly; wrapping it in `Box::new` would reintroduce a second box
    subscribe_rust_final_version(move |n: i32| {
        println!("{s}{n}");
    });
}

// uncomment `main2` if you want to run it ;-)
fn main() {
    main1();
    // main2();
}

(playground)

I would go on and help adapting this to g_dbus_connection_signal_subscribe but then I’m looking at documentation text such as this section

If user_data_free_func is non-NULL, it will be called (in the thread-default main context of the thread you are calling this method from) at some point after user_data is no longer needed. (It is not guaranteed to be called synchronously when the signal is unsubscribed from, and may be called after connection has been destroyed.)

As callback is potentially invoked in a different thread from where it’s emitted, it’s possible for this to happen after g_dbus_connection_signal_unsubscribe() has been called in another thread. Due to this, user_data should have a strong reference which is freed with user_data_free_func, rather than pointing to data whose lifecycle is tied to the signal subscription. For example, if a GObject is used to store the subscription ID from g_dbus_connection_signal_subscribe(), a strong reference to that GObject must be passed to user_data, and g_object_unref() passed to user_data_free_func. You are responsible for breaking the resulting reference count cycle by explicitly unsubscribing from the signal when dropping the last external reference to the GObject. Alternatively, a weak reference may be used.

and I honestly don’t want to try to understand all these details right now. (This kind of shit is why Rust users would prefer to have this stuff just enforced by the compiler. In light of the surprisingly vague explanations about “threads” I’m even starting to doubt whether or not the existing safe-Rust bindings in this crate are completely sound, as I see no Send or Sync restrictions in there [but neither do I understand the whole context around “thread-default main context of the thread you are calling this method from” and whatnot]. On the other hand, looking at these existing bindings, I’m also noticing they are using Fn instead of FnMut, so perhaps there might be a good reason for choosing Fn, or perhaps not.[1])


  1. I wouldn’t know which it is; the gtk.org documentation you had linked kinda fails to mention whether or not concurrent or re-entrant calls to the callback are possible, but failure to even properly mention all the relevant basic memory-safety-critical preconditions on unsafe APIs seems to be a somewhat common theme in unsafe languages like C, unfortunately.


Thank you for the complete example.

So if I understand well, you create the unsafe extern "C" function on the fly which lets you have a concrete function every time you call subscribe_rust_ver_xx or from_generic_fn_mut ? That's the "secret" I suppose ?

In every subscribe_rust_ver_xx and from_generic_fn_mut you have something like:

unsafe extern "C" fn callback(arg: i32, data: *mut ()) { ... }

Why does from_generic_fn_mut take a FnMut ? Can't Fn be called multiple times ? Or is it due to the fact that it can modify the data ?

I like your subscribe_rust_final_version since the Callback struct can be private (despite your comment saying it can be pub).

I'm really sorry but I still have some questions about subscribe_rust_ver_1:

fn subscribe_rust_ver_1(callback_box: Box<dyn FnMut(i32)>) {
    // let's skip generics for simplicity for now

    // how to transform Box<dyn FnMut(i32)> into a `Callback`?

    // first idea: we can have the data pointer `*mut ()` be a cast version of
    // the `Box<dyn FnMut(i32)>`?
    // problem: `Box<dyn FnMut(i32)>` is a fat pointer.
    // straightforward solution: add another `Box`:

    let boxed: Box<Box<dyn FnMut(i32)>> = Box::new(callback_box);
    let boxed_raw: *mut Box<dyn FnMut(i32)> = Box::into_raw(boxed);
    let data: *mut () = boxed_raw.cast(); // cast `*mut Box<dyn FnMut(i32)>` to `*mut ()`
    // to destroy the `boxed` value, we’ll need to cast back, rebuild the `Box`, then just drop it
    unsafe extern "C" fn destroy_data(data: *mut ()) {
        let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
        let boxed: Box<Box<dyn FnMut(i32)>> = unsafe { Box::from_raw(boxed_raw) };
        drop(boxed);
    }
    let data = Data {
        data,
        destroy_data: Some(destroy_data),
    };

    // now finally, the `Callback` will need some code to be able to _call_ that
    // wrapped-up `dyn FnMut(i32)`. For this, we don’t turn `*mut Box<dyn FnMut(i32)>`
    // into an owned `Box<Box<dyn FnMut(i32)>>` because we don’t want to destroy it when the callback ends,
    // but instead just make a reference to the inner box,
    // of type `&mut Box<dyn FnMut(i32)>`, out of it
    unsafe extern "C" fn callback(arg: i32, data: *mut ()) {
        let reference_to_box: &mut Box<dyn FnMut(i32)>;
        unsafe {
            let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
            reference_to_box = &mut *boxed_raw;
        };
        // now we’re in Rust land, and can just call it:
        reference_to_box(arg);
    }

    let callback = Callback { callback, data };

    subscribe_safe(callback);
}

There is this code:

let data: *mut () = boxed_raw.cast(); // cast `*mut Box<dyn FnMut(i32)>` to `*mut ()`

Why is there a cast to *mut () ? And then in destroy_data and callback the data is cast back to Box<dyn Fn>.

I suppose the double Box is to keep a "persistent" memory address / pointer to the callback when calling Box::into_raw ?

In your code Data is just some kind of container for the callback that destroys the data and a pointer to the callback if I understand well ?

Sorry for all these questions.

Thank you for your patience and help.

FnMut gives the caller of the API more flexibility since they’ll gain mutable access to captured variables; but it must only be used if the C code never calls the function concurrently (or in a re-entrant manner), i.e. multiple times at the same time. Fn can be called multiple times, too; but Fn only needs shared-reference access to the captured closure data. So yes, you can use Fn, too, and it’ll actually be the “safer” choice (since, as I mentioned, I don’t really have a complete understanding of how this callback function is going to be called on the C side for your actual use-case). The only downside is that it gives you less flexibility when actually using your new wrapper API: if you still want the callback to mutate some state, then you’ll have to use a Mutex or something like that in order to be able to do it from the Fn callback :wink:
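For illustration, here is a minimal sketch of mutating state from an `Fn` callback through a `Mutex`. The wrapper `call_three_times` is a hypothetical stand-in for a binding that only accepts `Fn`; it is not part of any real crate.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical wrapper that only accepts `Fn`, as a cautious binding might.
fn call_three_times(callback: impl Fn(i32) + Send + Sync + 'static) {
    for n in 1..=3 {
        callback(n);
    }
}

fn sum_via_fn_callback() -> i32 {
    let total = Arc::new(Mutex::new(0));
    let total_in_callback = Arc::clone(&total);
    call_three_times(move |n| {
        // The closure only needs `&self`: the mutation goes through the lock.
        *total_in_callback.lock().unwrap() += n;
    });
    let result = *total.lock().unwrap();
    result
}
```

Here `sum_via_fn_callback()` adds up 1, 2 and 3 even though the closure is only an `Fn`. For a plain counter, an atomic integer would do the same job without a lock.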

Same idea as a void pointer in C. It’s to “erase” the concrete type information; from the C implementation’s point of view, the data pointer is just some abstract pointer to “whatever” kind of thing, and all it needs to do is pass it on to the callback function pointer, and the destructor. For your actual use-case you may of course directly cast to *mut c_void and/or to the destroy_data one.

That’s of course also why inside of these implementations the thing is converted right back into what it actually is supposed to be.
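A tiny sketch of that erase-and-restore round trip, using `c_void` as the erased type; the function name `erase_and_restore` is made up for the example.

```rust
use std::ffi::c_void;

fn erase_and_restore() -> u32 {
    let original: Box<u32> = Box::new(1337);
    // Erase the type: C would only ever see an opaque `void *`.
    let erased: *mut c_void = Box::into_raw(original).cast();
    // Restore the type: cast back to exactly what the pointer really is.
    let restored: Box<u32> = unsafe { Box::from_raw(erased.cast::<u32>()) };
    *restored
}
```

The cast itself changes nothing at run time; it only works because the code on both ends agrees on the real type behind the pointer.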

The double-box is because Box<dyn FnMut(i32)> is actually a fat pointer in Rust (consisting of 2 pointers at run-time, one to the allocated data, and another one to a static vtable). And there’s no way to convert that into a *mut () pointer and then back again, because *mut () is only a single pointer. This is basically the kind of thing that @kornel and @robofinch were trying to explain earlier in this thread, as far as I can tell.

The double-Box is not to keep a “persistent address” or anything; in fact a main point of the later / refined versions of the wrapper in that playground is to get rid of the need for this kind of double-indirection :slight_smile:

The use of helper structs Data and Callback here is in order to explain the whole code structure as simply as possible. You won’t actually necessarily need such types in practice, but it can help to cleanly break up the process into multiple steps, and it can also overall be a good exercise for understanding some of the basic design ideas/steps involved in creating things like safe abstract data types with unsafe internal implementation details in Rust.


For your actual use-case you may of course directly cast to *mut c_void and/or to the destroy_data one.

So if I understand well *mut () is the equivalent of *mut c_void ? They both represent void or something like "it's just a generic memory address without the need to know what it points to" ?

Same idea as a void pointer in C. It’s to “erase” the concrete type information; from the C implementation’s point of view, the data pointer is just some abstract pointer to “whatever” kind of thing, and all it needs to do is pass it on to the callback function pointer, and the destructor. For your actual use-case you may of course directly cast to *mut c_void and/or to the destroy_data one.

So if I understand well it's a way to only have a single memory address (where Fn is "2 pointers at run-time, one to the allocated data, and another one to a static vtable'") ? Or am I understanding it wrong ?

I suppose also that the creation of the function inside the subscribe_rust_ver_xx and from_generic_fn_mut:

unsafe extern "C" fn callback(arg: i32, data: *mut ()) { ... }

I suppose it is one of the secrets to store it like you did:

struct Callback {
    callback: unsafe extern "C" fn(arg: i32, data: *mut ()),
    data: Data,
}

I suppose it's also to be compatible with the C API ?

Thank you for your detailed explanation and patience

Conceptually yes, the moral equivalent if you will. For full binary compatibility with actual C void pointers, you should probably use std::ffi::c_void instead, as its documentation states:

In essence, *const c_void is equivalent to C’s const void* and *mut c_void is equivalent to C’s void* .
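To make the relationship concrete, here is a minimal sketch (the function name `erase_and_restore` is invented for this example) showing that going between a typed pointer, `*mut ()` and `*mut c_void` is just a cast that keeps the address, and that the original type can be restored afterwards.

```rust
use std::ffi::c_void;

/// Erases the type of a boxed `u64` twice, then restores it.
/// Returns whether all three pointers had the same address, plus the value.
fn erase_and_restore(value: u64) -> (bool, u64) {
    let typed: *mut u64 = Box::into_raw(Box::new(value));
    let as_unit: *mut () = typed.cast(); // pure-Rust style "void pointer"
    let as_void: *mut c_void = as_unit.cast(); // what a C API expects
    let same_address = typed as usize == as_unit as usize && as_unit as usize == as_void as usize;

    // SAFETY: `as_void` still holds the address returned by `Box::into_raw`
    // for a `Box<u64>`, and ownership is reclaimed exactly once.
    let restored: Box<u64> = unsafe { Box::from_raw(as_void.cast::<u64>()) };
    (same_address, *restored)
}

fn main() {
    let (same_address, value) = erase_and_restore(42);
    println!("same address: {same_address}, value: {value}");
}
```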


I was just using () as it was the first option that came to mind, and it’s a common choice for representing something like raw void pointers in (unsafe) pure Rust code [for example you can see it used in this rule in this unstable API for splitting up fat pointers into their constituent parts]; sorry if this caused any confusion.


Yes, that sounds right. Note though, regarding the remark that “Fn is 2 pointers at run-time”, that this is not actually due to what “Fn” is; instead it’s the nature of what pointers to dyn Trait<…> types are.

Fn is a trait[1]. So for the type Box<dyn Fn(i32)>, note that Box<_> is a pointer type, and the target of that pointer is dyn Fn(i32), which is a dyn Trait-style type [values of these kinds of types are called “trait objects” in Rust], so that a value of type Box<dyn Fn(i32)> indeed is an example of a “pointer to dyn Trait<…>”.


Yes, that’s completely correct: the extern "C" is there to ensure we are creating function pointers compatible with the ABI of C function pointers. (And I’ve added the unsafe because logically callback and destroy_data are both unsafe to call, though I skipped the work of writing down the exact safety conditions,[2] since these aren’t actually used as any sort of API surface anyway.)

Rust has its own, native kind of function pointers, too, which you’d just write down as fn(…) -> …. I guess you could literally remove every mention of extern "C" from the example playground (resulting in this – note that the comments are still speaking about "C") and it still works (since the demo doesn’t actually interact with C); but with Rust function pointers, the construction can become a bit more convenient:

If it isn’t a "C"-style pointer, Rust supports using non-capturing closures to define function pointers, so you can re-write

    unsafe fn callback(arg: i32, data: *mut ()) {
        let reference_to_box: &mut Box<dyn FnMut(i32)>;
        unsafe {
            let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
            reference_to_box = &mut *boxed_raw;
        };
        // now we’re in Rust land, and can just call it:
        reference_to_box(arg);
    }

    let callback = Callback { callback, data };

more simply into:

    let callback = Callback {
        data,
        callback: |arg, data| {
            let reference_to_box: &mut Box<dyn FnMut(i32)>;
            unsafe {
                let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
                reference_to_box = &mut *boxed_raw;
            };
            reference_to_box(arg);
        }
    };

(compare this playground to see the niceties that come from using this syntax throughout)
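Whether a closure captures anything is determined by the compiler from the closure body, and only non-capturing closures can be coerced to a plain `fn` pointer. A small sketch of the difference, with helper names invented for this example:

```rust
/// Takes a native Rust function pointer: a single pointer, no captured state.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

/// Takes any closure by generic bound, including capturing ones.
fn apply_generic<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

fn double_via_fn_pointer(x: i32) -> i32 {
    // This closure uses nothing from its environment, so it coerces to `fn`.
    let double: fn(i32) -> i32 = |n| n * 2;
    apply(double, x)
}

fn add_via_generic(offset: i32, x: i32) -> i32 {
    // This closure captures `offset`, so it has state to carry around.
    // Writing `let add: fn(i32) -> i32 = move |n| n + offset;` is rejected
    // by the compiler; a generic bound (or a trait object) is needed instead.
    let add = move |n| n + offset;
    apply_generic(add, x)
}

fn main() {
    println!("{}", double_via_fn_pointer(21));
    println!("{}", add_via_generic(40, 2));
}
```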

Unfortunately, for extern "C" function pointers, we must fall back to manually defining actual fn items, with full function signatures, and then create function pointers from those.


  1. a generic trait, with a little bit of extra syntax sugar so that you can write something like F: Fn(Arg) -> ReturnType as a trait bound, and don’t need to use angle-bracketed syntax like F: Fn<Arg, Output = ReturnType>

    [this kind of syntax is not actually supported at all for Fn, but it would be how ordinary / user-defined traits would need to be used] ↩︎

  2. conditions like: you must only call destroy_data on the data pointer that came with it; and only once; and you must only call callback on the data pointer that came with it, and only before destroying it, and not concurrently (as long as we’re still using FnMut); violating any of these can result in unsoundness and ultimately undefined behavior / memory-unsafety ↩︎
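A safe wrapper can enforce these conditions through its own API. The following is only a sketch with a hypothetical `Callback` type (similar in spirit to the one discussed here, but not copied from the playground): `&mut self` rules out overlapping calls, and `Drop` runs the destructor exactly once, after which no call is possible.

```rust
use std::cell::Cell;
use std::ffi::c_void;
use std::rc::Rc;

/// Hypothetical wrapper; field and method names are illustrative only.
struct Callback {
    call: unsafe extern "C" fn(arg: i32, data: *mut c_void),
    destroy: unsafe extern "C" fn(data: *mut c_void),
    data: *mut c_void,
}

impl Callback {
    fn new<F: FnMut(i32) + 'static>(f: F) -> Self {
        unsafe extern "C" fn call<F: FnMut(i32)>(arg: i32, data: *mut c_void) {
            // SAFETY: only reachable through `invoke`, which passes the
            // pointer created below for this same `F`.
            let f = unsafe { &mut *data.cast::<F>() };
            (*f)(arg);
        }
        unsafe extern "C" fn destroy<F>(data: *mut c_void) {
            // SAFETY: only reachable through `Drop`, so it runs once.
            drop(unsafe { Box::from_raw(data.cast::<F>()) });
        }
        Callback {
            call: call::<F>,
            destroy: destroy::<F>,
            data: Box::into_raw(Box::new(f)).cast(),
        }
    }

    /// `&mut self` means two calls can never overlap.
    fn invoke(&mut self, arg: i32) {
        // SAFETY: `data` belongs to `call` and has not been destroyed yet,
        // because destruction only happens in `drop`.
        unsafe { (self.call)(arg, self.data) }
    }
}

impl Drop for Callback {
    fn drop(&mut self) {
        // SAFETY: `data` belongs to `destroy`; `drop` runs at most once.
        unsafe { (self.destroy)(self.data) }
    }
}

/// Returns the sum seen by the closure and the remaining Rc strong count.
fn demo() -> (i32, usize) {
    let total = Rc::new(Cell::new(0));
    let handle = Rc::clone(&total);
    let mut cb = Callback::new(move |x| handle.set(handle.get() + x));
    cb.invoke(2);
    cb.invoke(3);
    drop(cb); // frees the boxed closure, releasing its clone of the Rc
    (total.get(), Rc::strong_count(&total))
}

fn main() {
    println!("{:?}", demo());
}
```

When the pointers are handed to a real C library instead, the library takes over the responsibility of calling the destructor, so the same conditions then have to be checked against that library's documentation.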


Thank you for clarifying this.

So since I seem to have understood correctly, I presume that it would have been impossible to solve the root problem of this topic without creating the callback at runtime inside the subscribe_rust_ver_xx and from_generic_fn_mut functions. And I suppose we are also creating them at runtime in order to be able to store them and call them back multiple times? And can fn also be seen as a single function pointer?

During my earlier research I found this question on Stack Overflow, which seemed to be the closest to what I was searching for (but it still didn't solve or answer my problem). I assume the upvoted answer doesn't talk about creating the unsafe extern "C" function (like you did) because the OP didn't need to call the function multiple times?

Thank you for clarifying this. I suppose the compiler detects whether the closure needs to capture its environment or not?

Unfortunately that's the case I'm in:

pub type GDBusSignalCallback = ::std::option::Option<
    unsafe extern "C" fn(
        connection: *mut GDBusConnection,
        sender_name: *const gchar,
        object_path: *const gchar,
        interface_name: *const gchar,
        signal_name: *const gchar,
        parameters: *mut GVariant,
        user_data: gpointer,
    ),
>;

And since it's the case I'm in, I suppose that is why closures won't work ?

Also, I presume that to call the callback multiple times one has to store the fn (like you did)? Otherwise it won't be persistent?

Thank you for your help and patience.