I wrote down a little demo as a playground:
// example of passing something like
// a Box<dyn FnMut(i32)> to a C function
// the C function (let’s pretend this function would actually be defined in C somehow)
pub unsafe extern "C" fn subscribe(
callback: unsafe extern "C" fn(arg: i32, data: *mut ()),
data: *mut (),
destroy_data: Option<unsafe extern "C" fn(*mut ())>,
) {
// for demo purposes, let’s just make this "subscribe" function call the callback twice, then destroy it:
unsafe {
callback(42, data);
callback(1337, data);
if let Some(destructor) = destroy_data {
destructor(data);
}
}
}
////////////////////////////////////////////////////////////////////////////////
// now let’s create a Rust binding for this!
// ideally, we’ll end up with something straightforward like
// fn subscribe(callback: impl FnMut(i32)) { … }
// let's start with something like this, because these function pointers + data pointer
// really kinda belong together, anyway:
struct Callback {
callback: unsafe extern "C" fn(arg: i32, data: *mut ()),
data: Data,
}
// we’ll separate out the `data`-related fields because we don’t need the `callback` function for destruction
struct Data {
data: *mut (),
destroy_data: Option<unsafe extern "C" fn(*mut ())>,
}
// these types (Callback & Data) own the data behind `data`, so they’ll destroy it on `Drop`:
impl Drop for Data {
fn drop(&mut self) {
unsafe {
// same idea as in the stub "C code" above
if let Some(destructor) = self.destroy_data {
destructor(self.data);
}
}
}
}
// the `Callback` struct already lets us write a safe wrapper, as follows
use std::mem::ManuallyDrop;
fn subscribe_safe(callback: Callback) {
// we’ll delegate responsibility for destruction to the "C code"
// so we must not do the destruction ourselves then
let callback = ManuallyDrop::new(callback);
unsafe {
subscribe(
callback.callback,
callback.data.data,
callback.data.destroy_data,
);
}
}
// now how to transform this into a more Rust-like interface type?
// first, simple approach:
fn subscribe_rust_ver_1(callback_box: Box<dyn FnMut(i32)>) {
// let's skip generics for simplicity for now
// how to transform Box<dyn FnMut(i32)> into a `Callback`?
// first idea: we can have the data pointer `*mut ()` be a cast version of
// the `Box<dyn FnMut(i32)>`?
// problem: `Box<dyn FnMut(i32)>` is a fat pointer.
// straightforward solution: add another `Box`:
let boxed: Box<Box<dyn FnMut(i32)>> = Box::new(callback_box);
let boxed_raw: *mut Box<dyn FnMut(i32)> = Box::into_raw(boxed);
let data: *mut () = boxed_raw.cast(); // cast `*mut Box<dyn FnMut(i32)>` to `*mut ()`
// to destroy the `boxed` value, we’ll need to cast back, rebuild the `Box`, then just drop it
unsafe extern "C" fn destroy_data(data: *mut ()) {
let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
let boxed: Box<Box<dyn FnMut(i32)>> = unsafe { Box::from_raw(boxed_raw) };
drop(boxed);
}
let data = Data {
data,
destroy_data: Some(destroy_data),
};
// now finally, the `Callback` will need some code to be able to _call_ that
// wrapped-up `dyn FnMut(i32)`. For this, we don’t turn `*mut Box<dyn FnMut(i32)>`
// into an owned `Box<Box<dyn FnMut(i32)>>` because we don’t want to destroy it when the callback ends,
// but instead just make a reference to the inner box,
// of type `&mut Box<dyn FnMut(i32)>`, out of it
unsafe extern "C" fn callback(arg: i32, data: *mut ()) {
let reference_to_box: &mut Box<dyn FnMut(i32)>;
unsafe {
let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
reference_to_box = &mut *boxed_raw;
};
// now we’re in Rust land, and can just call it:
reference_to_box(arg);
}
let callback = Callback { callback, data };
subscribe_safe(callback);
}
// let's run this with miri to see that everything works properly, no UB, no leaks
fn main1() {
// some data with ownership to capture:
let s: String = "Hello, number is: ".into();
subscribe_rust_ver_1(Box::new(move |n: i32| {
println!("{s}{n}");
// recent-ish Rust versions also make sure that panics
// cannot cross "C" function/interface boundaries
// (the program aborts instead).
// If a C api *can* handle error cases, you’d have to e.g.
// `catch_unwind` the error before going back to "C", and then probably
// report / pass / whatever… the error somewhere appropriately
// you can test it by uncommenting
// if n == 1337 {
// panic!("woah!");
// }
}));
}
// One adjustment for a `subscribe`-like function: if the callback can be
// called *after* `subscribe` returns we’ll want to add a `+ 'static` bound
// on the `FnMut`. If the callback can be called from a different thread,
// we’ll want to add a `+ Send` bound on the `FnMut`.
// If it can be called concurrently or re-entrantly (is that a word!?)
// we may need to switch to `Fn` and then also add a `+ Sync` bound.
fn subscribe_rust_ver_2(callback_box: Box<dyn FnMut(i32) + Send + 'static>) {
subscribe_rust_ver_1(callback_box);
}
// Problems so far:
// * we don’t use the flexibility of `callback` being an arbitrary `fn` pointer
// * we add 2 layers of indirection
// * we don’t ever make use of the possibility of `None` destructors
// let's start at the end:
// we already know how to construct a `Data` struct for owning
// a `Box<Box<dyn FnMut(i32)>>`.
/*
let boxed: Box<Box<dyn FnMut(i32)>> = Box::new(callback_box);
let boxed_raw: *mut Box<dyn FnMut(i32)> = Box::into_raw(boxed);
let data: *mut () = boxed_raw.cast(); // cast `*mut Box<dyn FnMut(i32)>` to `*mut ()`
// to destroy the `boxed` value, we’ll need to cast back, rebuild the `Box`, then just drop it
unsafe extern "C" fn destroy_data(data: *mut ()) {
let boxed_raw: *mut Box<dyn FnMut(i32)> = data.cast();
let boxed: Box<Box<dyn FnMut(i32)>> = unsafe { Box::from_raw(boxed_raw) };
drop(boxed);
}
let data = Data {
data,
destroy_data: Some(destroy_data),
};
*/
// The same code works more generically
// for any `Box<T>` actually:
impl Data {
fn from_box<T>(boxed: Box<T>) -> Self {
let boxed_raw: *mut T = Box::into_raw(boxed);
let data: *mut () = boxed_raw.cast(); // cast `*mut T` to `*mut ()`
// to destroy the `boxed` value, we’ll need to cast back, rebuild the `Box`, then just drop it
unsafe extern "C" fn destroy_data<T>(data: *mut ()) {
let boxed_raw: *mut T = data.cast();
let boxed: Box<T> = unsafe { Box::from_raw(boxed_raw) };
drop(boxed);
}
// we can add one improvement:
// boxes to zero-sized data don’t own memory,
// and if the target type is also without destructors
// we don’t *need* to destroy anything actually.
let needs_drop = std::mem::needs_drop::<T>() || std::mem::size_of::<T>() > 0;
// Thus, we make the `destroy_data` be `None` if nothing needs to happen:
Data {
data,
destroy_data: needs_drop.then_some(destroy_data::<T>),
}
}
}
// you can run this test with miri to see there are no leaks
#[test]
fn test_data_none() {
struct Foo;
struct Bar(u8);
struct Baz;
impl Drop for Baz {
fn drop(&mut self) {
println!("dropping");
}
}
let boring_type = Foo;
let non_zero_sized = Bar(123);
let has_destructor_zero_size = Baz;
let data1 = Data::from_box(Box::new(boring_type));
assert!(data1.destroy_data.is_none());
let data2 = Data::from_box(Box::new(non_zero_sized));
assert!(data2.destroy_data.is_some());
let data3 = Data::from_box(Box::new(has_destructor_zero_size));
assert!(data3.destroy_data.is_some());
}
// also let's have a `Data` accessor
impl Data {
// # Safety
// must be called with the same `T` as in the `Box<T>`
// that this `Data` was constructed from
unsafe fn inner_mut<T>(&mut self) -> &mut T {
unsafe {
// we never construct `Data` with a null pointer, anyway
// so just dereferencing is fine if the type is correct
&mut *self.data.cast::<T>()
}
}
}
// let's revisit `callback`
// the first approach of constructing `Callback` from above
// can now be written as follows
impl Callback {
fn from_dyn_fn(callback_box: Box<dyn FnMut(i32) + Send + 'static>) -> Self {
let data = Data::from_box(Box::new(callback_box));
unsafe extern "C" fn callback(arg: i32, data: *mut ()) {
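// cast back to the exact type that was boxed by `from_box` above (including `+ Send`)
let reference_to_box: &mut Box<dyn FnMut(i32) + Send>;
unsafe {
let boxed_raw: *mut Box<dyn FnMut(i32) + Send> = data.cast();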
reference_to_box = &mut *boxed_raw;
};
// now we’re in Rust land, and can just call it:
reference_to_box(arg);
}
Callback { callback, data }
}
}
// now, we *could* use this (or the previous equivalent version anyway) to
// create a convenient api that accepts any kind of `impl FnMut` parameter:
fn subscribe_rust_ver_3(callback: impl FnMut(i32) + Send + 'static) {
let callback = Callback::from_dyn_fn(Box::new(callback));
subscribe_safe(callback);
}
// but, as mentioned before, we’re now *doubly boxing* the closure,
// and still not making use of the possibility of passing different
// function pointers into the `unsafe extern "C" fn callback(i32, *mut ())` field
// so let’s fix that.
// The approach here basically just generalizes the exact same code from above
// to any kind of type `F` in place of `Box<dyn FnMut(i32)>` now.
// We have actually already seen this kind of thing in the previous section,
// but it’s worth highlighting this: `unsafe extern "C" fn`s can be made
// generic functions without any issue (and you saw this on `destroy_data<T>` above)
// and this does actually simply amount to compiling down to different concrete
// function pointers to be passed to `C` code in different monomorphized instances
// of a calling generic Rust function.
//
// Hence the same is all that we need to make good use of the flexibility in
// being able to use arbitrary callback function pointers:
// Basically, just replace `Box<dyn FnMut(i32) + Send + 'static>`
// with some generic `F` in the same code as `from_dyn_fn` above:
impl Callback {
fn from_generic_fn_mut<F>(callback_closure: F) -> Self
where
F: FnMut(i32) + Send + 'static,
{
let data = Data::from_box(Box::new(callback_closure));
unsafe extern "C" fn callback<F>(arg: i32, data: *mut ())
where
F: FnMut(i32) + Send + 'static,
{
let reference_to_closure: &mut F;
unsafe {
let boxed_raw: *mut F = data.cast();
reference_to_closure = &mut *boxed_raw;
};
// now we’re in Rust land, and can just call it:
reference_to_closure(arg);
}
Callback {
callback: callback::<F>,
data,
}
}
}
// since we’re always ensuring that `Callback` is constructed from `Send` closures,
// we can officially allow it to be sent between threads
unsafe impl Send for Callback {}
// since `Callback` doesn’t actually have *any* APIs that allow you to do anything with it
// through only an immutable reference, anyway, we can also implement `Sync`
unsafe impl Sync for Callback {}
// (for the above impls and comments to make much sense,
// let's imagine that in reality `Callback` was actually a `pub struct`,
// but still with private fields; otherwise there’s no encapsulation.)
// finally, putting it all together:
fn subscribe_rust_final_version(callback: impl FnMut(i32) + Send + 'static) {
let callback = Callback::from_generic_fn_mut(callback); // yay, no more extra `Box::new`
subscribe_safe(callback);
}
// let's run this with miri to see that everything works properly, no UB, no leaks
fn main2() {
// some data with ownership to capture:
let s: String = "Hello again, one of the numbers still is: ".into();
// the closure is passed directly; wrapping it in `Box::new` here would bring back the double boxing
subscribe_rust_final_version(move |n: i32| {
println!("{s}{n}");
});
}
// uncomment `main2` if you want to run it ;-)
fn main() {
main1();
// main2();
}
(playground)
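On the panic remark inside `main1`: if the C API has a way to report errors, the trampoline could catch the panic before it reaches the boundary. Below is a minimal sketch of that idea, not something taken from the playground. It assumes a hypothetical callback signature that returns an `i32` status (0 for success), and the names `subscribe_fallible`, `trampoline`, `destroy` and `subscribe_catching` are made up for illustration.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

// Hypothetical stand-in for a C API whose callback reports failure through
// its return value (0 = ok, nonzero = error). It calls the callback twice,
// destroys the data, and returns how many calls reported an error.
pub unsafe extern "C" fn subscribe_fallible(
    callback: unsafe extern "C" fn(arg: i32, data: *mut ()) -> i32,
    data: *mut (),
    destroy_data: Option<unsafe extern "C" fn(*mut ())>,
) -> i32 {
    let mut failures = 0;
    unsafe {
        for arg in [42, 1337] {
            if callback(arg, data) != 0 {
                failures += 1;
            }
        }
        if let Some(destructor) = destroy_data {
            destructor(data);
        }
    }
    failures
}

// Trampoline: runs the closure inside `catch_unwind`, so a panic is turned
// into an error code instead of unwinding into the "C" caller.
unsafe extern "C" fn trampoline<F: FnMut(i32)>(arg: i32, data: *mut ()) -> i32 {
    let closure: &mut F = unsafe { &mut *data.cast::<F>() };
    // `AssertUnwindSafe` is an assumption of this sketch: after a panic the
    // closure's captured state may be half-updated, and we accept that here.
    match catch_unwind(AssertUnwindSafe(|| closure(arg))) {
        Ok(()) => 0,
        Err(_) => 1,
    }
}

// Rebuilds the `Box<F>` from the raw pointer and drops it.
unsafe extern "C" fn destroy<F>(data: *mut ()) {
    drop(unsafe { Box::from_raw(data.cast::<F>()) });
}

/// Returns how many callback invocations panicked.
pub fn subscribe_catching<F: FnMut(i32) + Send + 'static>(f: F) -> i32 {
    let data: *mut () = Box::into_raw(Box::new(f)).cast();
    unsafe { subscribe_fallible(trampoline::<F>, data, Some(destroy::<F>)) }
}
```

Calling `subscribe_catching` with a closure that panics for `1337` would report one failed invocation, and the boxed closure is still destroyed afterwards. The panic message is still printed by the default panic hook.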
I would go on and help adapt this to g_dbus_connection_signal_subscribe, but then I’m looking at documentation text such as this section:
If user_data_free_func is non-NULL, it will be called (in the thread-default main context of the thread you are calling this method from) at some point after user_data is no longer needed. (It is not guaranteed to be called synchronously when the signal is unsubscribed from, and may be called after connection has been destroyed.)
As callback is potentially invoked in a different thread from where it’s emitted, it’s possible for this to happen after g_dbus_connection_signal_unsubscribe() has been called in another thread. Due to this, user_data should have a strong reference which is freed with user_data_free_func, rather than pointing to data whose lifecycle is tied to the signal subscription. For example, if a GObject is used to store the subscription ID from g_dbus_connection_signal_subscribe(), a strong reference to that GObject must be passed to user_data, and g_object_unref() passed to user_data_free_func. You are responsible for breaking the resulting reference count cycle by explicitly unsubscribing from the signal when dropping the last external reference to the GObject. Alternatively, a weak reference may be used.
and I honestly don’t want to try to understand all these details right now. (This kind of shit is why Rust users would prefer to have this stuff just enforced by the compiler. In light of the surprisingly vague explanations about “threads”, I’m even starting to doubt whether the existing safe-Rust bindings in this crate are completely sound, as I see no Send or Sync restrictions in there [but neither do I understand the whole context around “thread-default main context of the thread you are calling this method from” and whatnot]. On the other hand, looking at these existing bindings, I’m also noticing they are using Fn instead of FnMut, so perhaps there might be a good reason for choosing Fn, or perhaps not.)
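For reference, if the callback really can be invoked concurrently, the `Fn` flavour of the trampoline differs from the `FnMut` one only in creating a shared reference and in requiring `Sync`. Here is a minimal sketch under that assumption. It reuses a stub `subscribe` with the same shape as in the demo, and the names `subscribe_shared`, `call_shared`, `destroy` and `sum_of_callback_args` are invented for illustration; none of this is taken from the existing bindings.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;

// Stand-in for the C side, same shape as `subscribe` in the demo:
// call the callback twice, then destroy the data.
unsafe extern "C" fn subscribe(
    callback: unsafe extern "C" fn(arg: i32, data: *mut ()),
    data: *mut (),
    destroy_data: Option<unsafe extern "C" fn(*mut ())>,
) {
    unsafe {
        callback(42, data);
        callback(1337, data);
        if let Some(destructor) = destroy_data {
            destructor(data);
        }
    }
}

// Trampoline for `Fn`: it only ever creates a shared `&F`, so overlapping
// or re-entrant calls never produce two aliasing `&mut F` references.
unsafe extern "C" fn call_shared<F>(arg: i32, data: *mut ())
where
    F: Fn(i32) + Send + Sync + 'static,
{
    let closure: &F = unsafe { &*data.cast::<F>() };
    closure(arg);
}

// Rebuilds the `Box<F>` from the raw pointer and drops it.
unsafe extern "C" fn destroy<F>(data: *mut ()) {
    drop(unsafe { Box::from_raw(data.cast::<F>()) });
}

pub fn subscribe_shared<F>(f: F)
where
    F: Fn(i32) + Send + Sync + 'static,
{
    let data: *mut () = Box::into_raw(Box::new(f)).cast();
    unsafe { subscribe(call_shared::<F>, data, Some(destroy::<F>)) }
}

// Usage: state the callback wants to change has to go through interior
// mutability (here an atomic), since the closure only gets `&self` access.
pub fn sum_of_callback_args() -> i32 {
    let total = Arc::new(AtomicI32::new(0));
    let total_in_callback = Arc::clone(&total);
    subscribe_shared(move |n| {
        total_in_callback.fetch_add(n, Ordering::Relaxed);
    });
    total.load(Ordering::Relaxed)
}
```

The trade-off is the usual one: `Fn` pushes synchronization onto the closure's author (atomics, `Mutex`, and so on), whereas `FnMut` is only sound if the C side guarantees the callback is never running twice at the same time.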