FFI w/ callbacks: am I doing something wrong? compiler bug?


#1

I’ve been trying to write some Rust code that passes a callback into a C library which then invokes it. I think that I’ve got the basic structure, and it works when I compile in debug mode (1.22.1, x86_64-unknown-linux-gnu) … but everything breaks when I compile in release mode. Would anyone be able to help me figure out if I’m doing something wrong or if I’m triggering some obscure compiler bug here?

Here’s my C code (native.c):

typedef struct RustCallback {
    void (*function)(void *closure_data);
    void *closure_data;
} RustCallback;

int
testcase(RustCallback *callback)
{
    callback->function(callback->closure_data);
    return 0;
}

And here’s my Rust code (testcase.rs):

use std::os::raw::{c_void, c_int};

#[repr(C)]
struct RustCallback {
    pub function: Option<unsafe extern "C" fn(closure_data: *mut c_void)>,
    pub closure_data: *mut c_void,
}

extern "C" {
    fn testcase(callback: *mut RustCallback) -> c_int;
}

extern "C" fn c_callback<F>(ctxt: *mut c_void) where F: FnMut() {
    let f: &mut F = unsafe { &mut *(ctxt as *mut F) };
    f()
}

fn new_cb<F>(mut f: F) -> RustCallback where F: FnMut() {
    RustCallback {
        function: Some(c_callback::<F>),
        closure_data: &mut f as *mut _ as *mut c_void,
    }
}

fn main() {
    let x = 7usize;

    println!("OUTSIDE: {:12x} {:12x}", x, &x as *const _ as u64);

    let mut cb = new_cb(|| {
        println!(" INSIDE: {:12x} {:12x}", x, &x as *const _ as u64);
    });

    let rv = unsafe { testcase(&mut cb) };

    if rv != 0 {
        println!("prevent rv from disappearing");
    }
}

If it comes in handy, here’s a ninja script (build.ninja) that compiles debug and release versions of the code. Hopefully it is easy to parse even if you are not familiar with ninja:

rule cc
  command = cc -O0 -ffunction-sections -fdata-sections -fPIC -g -m64 -Wall -Wextra -MD -MF $out.d -o $out -c $in
  description = CC $out
  depfile = $out.d
  deps = gcc

rule ar_rel
  command = ar crs $out $in
  description = AR $out

rule rustc
  command = rustc --crate-name $out $in --crate-type bin --emit=link -C debuginfo=2 $extra_args -L native=. -l static=$static_name
  description = RUST $out

build native.o: cc native.c

build libnative.a: ar_rel native.o

build debug: rustc testcase.rs | libnative.a
  static_name = native

build release: rustc testcase.rs | libnative.a
  static_name = native
  extra_args = -C opt-level=3

Here’s the type of output I get:

$ ./debug 
OUTSIDE:            7 7ffd47cf28e0
 INSIDE:            7 7ffd47cf28e0
$ ./release 
OUTSIDE:            7 7ffd6299d290
 INSIDE: 55669ae31b40 55669b0474d0

And here are some of the fun things I’ve found:

  • If I restructure the code to not use the RustCallback struct, I can get something that works reliably in both modes
  • If I ignore the return value of the testcase function, it works in both modes (!)
  • The code shown above fails in both modes if I compile with nightly

These features tend to make me think that I’m doing something that’s not quite right, but if I am, I can’t figure out what’s going on. Anyone have any insight?

Thanks!


#2

Aren’t you leaking a dangling pointer to a temporary when setting the closure_data member of the output RustCallback struct in the new_cb function?

If so, I see two possible courses of action:

  • The safest one is to have closure_data be a (casted) Box<F> instead of a (casted) &mut F. You would modify new_cb to box the input callback, and downcast the box into the corresponding *mut F via Box::into_raw. Note that the corresponding dynamic allocation would then be leaked unless you add a proper Drop implementation to RustCallback.
  • A more dangerous, but potentially better performing path is to stick with an &mut F, but modify new_cb to take an &mut F instead of an F and design your RustCallback struct so that it properly accounts for the lifetime of the closure_data. I think you can do this by adding a PhantomData member with the proper lifetime to RustCallback.

#3

That’s a good point, but if I add a lifetime to the struct using a PhantomData item, I’m still getting the same problem. Although this is one of those things where I’m definitely not 100% sure that I’ve done it right. Here’s the diff:

@@ -1,9 +1,11 @@
+use std::marker::PhantomData;
 use std::os::raw::{c_void, c_int};
 
 #[repr(C)]
-struct RustCallback {
+struct RustCallback<'a> {
     pub function: Option<unsafe extern "C" fn(closure_data: *mut c_void)>,
     pub closure_data: *mut c_void,
+    _lifetime: PhantomData<&'a ()>,
 }
 
 extern "C" {
@@ -15,10 +17,11 @@ unsafe extern "C" fn c_callback<F>(ctxt: *mut c_void) where F: FnMut() {
     f()
 }
 
-fn new_cb<F>(mut f: F) -> RustCallback where F: FnMut() {
+fn new_cb<'a, F>(mut f: F) -> RustCallback<'a> where F: 'a + FnMut() {
     RustCallback {
         function: Some(c_callback::<F>),
         closure_data: &mut f as *mut _ as *mut c_void,
+        _lifetime: PhantomData,
     }
 }

#4

The Box approach does seem to work, though!

I do see a bit of danger since I’d like to implement Drop for the callback struct, to ensure that I can call Box::from_raw() and properly clean up the closure pointer. However, the struct itself does not know the type behind the closure_data pointer, so would be dropping a Box<c_void>. I don’t know if that’s problematic.

I am not sure if it would be possible to add an explicitly called method that does the drop in a way that has the information about the type F and then zeros out the closure_data pointer. And then of course users have to remember to call it — I suppose you could have a Drop implementation that panics if the closure_data pointer isn’t zero’ed out?


#5

If you’re going with the PhantomData approach, you need to pass a f: &'a mut F to new_cb, not f: F (this is what @HadrienG was suggesting above). If you pass just f: F, then new_cb now owns the closure and it’ll be dropped as soon as new_cb returns. If you pass &'a mut F, then you must keep the closure alive outside of new_cb; the lifetime parameter in RustCallback will ensure it doesn’t outlive the closure.


#6

Ahh! That makes it work. Thanks for the pointers!

(If anyone has any knowledge about the things I mentioned regarding the Box<F> approach, I’d be curious to hear, but this definitely gets me going.)


#7

To properly drop the closure in the Box<F> approach, you could build some some kind of generic drop_closure_data<F>(closure_data: *mut c_void) function which does the right thing for a known closure type F, then store a reference to the correct version of that function in the RustCallback struct, and finally call that on Drop.


#8

Crafty! That sounds like it would work.

I’ve had this kind of problem before … I should really remember that when things break on release mode, it’s probably because a value is being dropped earlier than it is in debug mode.


#9

More generally, release mode-specific breakages are a pretty strong indicator that your program is doing something in that list.


#10

Could you post your updated working example using lifetimes? I’m interested in seeing that solution.


#11

Here’s what worked for me:

use std::marker::PhantomData;
use std::os::raw::{c_void, c_int};

#[repr(C)]
struct RustCallback<'a> {
    pub function: Option<unsafe extern "C" fn(closure_data: *mut c_void)>,
    pub closure_data: *mut c_void,
    _lifetime: PhantomData<&'a ()>,
}

extern "C" {
    fn testcase(callback: *mut RustCallback) -> c_int;
}

unsafe extern "C" fn c_callback<F>(ctxt: *mut c_void) where F: FnMut() {
    let f: &mut F = &mut *(ctxt as *mut F);
    f()
}

fn new_cb<'a, F>(f: &'a mut F) -> RustCallback<'a> where F: FnMut() {
    RustCallback {
        function: Some(c_callback::<F>),
        closure_data: f as *mut _ as *mut c_void,
        _lifetime: PhantomData,
    }
}

fn main() {
    let x = 7usize;

    println!("OUTSIDE: {:12x} {:12x}", x, &x as *const _ as u64);

    let mut closure = || {
        println!(" INSIDE: {:12x} {:12x}", x, &x as *const _ as u64);
    };
    
    let mut cb = new_cb(&mut closure);

    let rv = unsafe { testcase(&mut cb) };

    if rv != 0 {
        println!("prevent rv from disappearing");
    }
}