FFI w/ callbacks: am I doing something wrong? compiler bug?

I've been trying to write some Rust code that passes a callback into a C library which then invokes it. I think that I've got the basic structure, and it works when I compile in debug mode (1.22.1, x86_64-unknown-linux-gnu) ... but everything breaks when I compile in release mode. Would anyone be able to help me figure out if I'm doing something wrong or if I'm triggering some obscure compiler bug here?

Here's my C code (native.c):

typedef struct RustCallback {
    void (*function)(void *closure_data);
    void *closure_data;
} RustCallback;

int
testcase(RustCallback *callback)
{
    callback->function(callback->closure_data);
    return 0;
}

And here's my Rust code (testcase.rs):

use std::os::raw::{c_void, c_int};

#[repr(C)]
struct RustCallback {
    pub function: Option<unsafe extern "C" fn(closure_data: *mut c_void)>,
    pub closure_data: *mut c_void,
}

extern "C" {
    fn testcase(callback: *mut RustCallback) -> c_int;
}

extern "C" fn c_callback<F>(ctxt: *mut c_void) where F: FnMut() {
    let f: &mut F = unsafe { &mut *(ctxt as *mut F) };
    f()
}

fn new_cb<F>(mut f: F) -> RustCallback where F: FnMut() {
    RustCallback {
        function: Some(c_callback::<F>),
        closure_data: &mut f as *mut _ as *mut c_void,
    }
}

fn main() {
    let x = 7usize;

    println!("OUTSIDE: {:12x} {:12x}", x, &x as *const _ as u64);

    let mut cb = new_cb(|| {
        println!(" INSIDE: {:12x} {:12x}", x, &x as *const _ as u64);
    });

    let rv = unsafe { testcase(&mut cb) };

    if rv != 0 {
        println!("prevent rv from disappearing");
    }
}

If it comes in handy, here's a ninja script (build.ninja) that compiles debug and release versions of the code. Hopefully it is easy to parse even if you are not familiar with ninja:

rule cc
  command = cc -O0 -ffunction-sections -fdata-sections -fPIC -g -m64 -Wall -Wextra -MD -MF $out.d -o $out -c $in
  description = CC $out
  depfile = $out.d
  deps = gcc

rule ar_rel
  command = ar crs $out $in
  description = AR $out

rule rustc
  command = rustc --crate-name $out $in --crate-type bin --emit=link -C debuginfo=2 $extra_args -L native=. -l static=$static_name
  description = RUST $out

build native.o: cc native.c

build libnative.a: ar_rel native.o

build debug: rustc testcase.rs | libnative.a
  static_name = native

build release: rustc testcase.rs | libnative.a
  static_name = native
  extra_args = -C opt-level=3

Here's the type of output I get:

$ ./debug 
OUTSIDE:            7 7ffd47cf28e0
 INSIDE:            7 7ffd47cf28e0
$ ./release 
OUTSIDE:            7 7ffd6299d290
 INSIDE: 55669ae31b40 55669b0474d0

And here are some of the fun things I've found:

  • If I restructure the code to not use the RustCallback struct, I can get something that works reliably in both modes
  • If I ignore the return value of the testcase function, it works in both modes (!)
  • The code shown above fails in both modes if I compile with nightly

These features tend to make me think that I'm doing something that's not quite right, but if I am, I can't figure out what's going on. Anyone have any insight?

Thanks!

Aren't you leaking a dangling pointer to a temporary when setting the closure_data member of the output RustCallback struct in the new_cb function?

If so, I see two possible courses of action:

  • The safest one is to have closure_data be a (casted) Box<F> instead of a (casted) &mut F. You would modify new_cb to box the input callback, and downcast the box into the corresponding *mut F via Box::into_raw. Note that the corresponding dynamic allocation would then be leaked unless you add a proper Drop implementation to RustCallback.
  • A more dangerous, but potentially better performing path is to stick with an &mut F, but modify new_cb to take an &mut F instead of an F and design your RustCallback struct so that it properly accounts for the lifetime of the closure_data. I think you can do this by adding a PhantomData member with the proper lifetime to RustCallback.

That's a good point, but if I add a lifetime to the struct using a PhantomData item, I'm still getting the same problem. Although this is one of those things where I'm definitely not 100% sure that I've done it right. Here's the diff:

@@ -1,9 +1,11 @@
+use std::marker::PhantomData;
 use std::os::raw::{c_void, c_int};
 
 #[repr(C)]
-struct RustCallback {
+struct RustCallback<'a> {
     pub function: Option<unsafe extern "C" fn(closure_data: *mut c_void)>,
     pub closure_data: *mut c_void,
+    _lifetime: PhantomData<&'a ()>,
 }
 
 extern "C" {
@@ -15,10 +17,11 @@ unsafe extern "C" fn c_callback<F>(ctxt: *mut c_void) where F: FnMut() {
     f()
 }
 
-fn new_cb<F>(mut f: F) -> RustCallback where F: FnMut() {
+fn new_cb<'a, F>(mut f: F) -> RustCallback<'a> where F: 'a + FnMut() {
     RustCallback {
         function: Some(c_callback::<F>),
         closure_data: &mut f as *mut _ as *mut c_void,
+        _lifetime: PhantomData,
     }
 }

The Box approach does seem to work, though!

I do see a bit of danger since I'd like to implement Drop for the callback struct, to ensure that I can call Box::from_raw() and properly clean up the closure pointer. However, the struct itself does not know the type behind the closure_data pointer, so would be dropping a Box<c_void>. I don't know if that's problematic.

I am not sure if it would be possible to add an explicitly called method that does the drop in a way that has the information about the type F and then zeros out the closure_data pointer. And then of course users have to remember to call it — I suppose you could have a Drop implementation that panics if the closure_data pointer isn't zero'ed out?

If you're going with the PhantomData approach, you need to pass a f: &'a mut F to new_cb, not f: F (this is what @HadrienG was suggesting above). If you pass just f: F, then new_cb now owns the closure and it'll be dropped as soon as new_cb returns. If you pass &'a mut F, then you must keep the closure alive outside of new_cb; the lifetime parameter in RustCallback will ensure it doesn't outlive the closure.

1 Like

Ahh! That makes it work. Thanks for the pointers!

(If anyone has any knowledge about the things I mentioned regarding the Box<F> approach, I'd be curious to hear, but this definitely gets me going.)

1 Like

To properly drop the closure in the Box<F> approach, you could build some some kind of generic drop_closure_data<F>(closure_data: *mut c_void) function which does the right thing for a known closure type F, then store a reference to the correct version of that function in the RustCallback struct, and finally call that on Drop.

Crafty! That sounds like it would work.

I've had this kind of problem before ... I should really remember that when things break on release mode, it's probably because a value is being dropped earlier than it is in debug mode.

More generally, release mode-specific breakages are a pretty strong indicator that your program is doing something in that list.

Could you post your updated working example using lifetimes? I'm interested in seeing that solution.

Here's what worked for me:

use std::marker::PhantomData;
use std::os::raw::{c_void, c_int};

#[repr(C)]
struct RustCallback<'a> {
    pub function: Option<unsafe extern "C" fn(closure_data: *mut c_void)>,
    pub closure_data: *mut c_void,
    _lifetime: PhantomData<&'a ()>,
}

extern "C" {
    fn testcase(callback: *mut RustCallback) -> c_int;
}

unsafe extern "C" fn c_callback<F>(ctxt: *mut c_void) where F: FnMut() {
    let f: &mut F = &mut *(ctxt as *mut F);
    f()
}

fn new_cb<'a, F>(f: &'a mut F) -> RustCallback<'a> where F: FnMut() {
    RustCallback {
        function: Some(c_callback::<F>),
        closure_data: f as *mut _ as *mut c_void,
        _lifetime: PhantomData,
    }
}

fn main() {
    let x = 7usize;

    println!("OUTSIDE: {:12x} {:12x}", x, &x as *const _ as u64);

    let mut closure = || {
        println!(" INSIDE: {:12x} {:12x}", x, &x as *const _ as u64);
    };
    
    let mut cb = new_cb(&mut closure);

    let rv = unsafe { testcase(&mut cb) };

    if rv != 0 {
        println!("prevent rv from disappearing");
    }
}
2 Likes