FFI callback and segfault

Hi, I'm currently learning Rust and experimenting with FFI callbacks. I wrote a library that does the following:

  • Initializes a C data structure.
  • Initializes a Rust struct that wraps this data structure. The constructor expects a function pointer (the callback test_cb):
    let test_cb = move |n: i32| {
        n*2
    };
    let mut _test_worker = new(test_cb);
  • Stores a pointer to the Rust struct on the C side and, vice versa, a pointer to the C structure on the Rust side:
pub struct Worker {
    ptr: WorkerPtr,        // Pointer to the C data structure
    cb: fn(i32) -> i32,    // Callback function
}
struct WorkerPtr(NonNull<worker>);
unsafe impl marker::Send for WorkerPtr {}
struct worker_s {
  void* rust_object;  // Pointer to the Rust data structure
};
  • Implements a callback function: C calls Rust, passing the pointer to the Rust struct; Rust dereferences the pointer and calls a method on it (Worker::trigger_callback); the result is sent back to the C side:
#[no_mangle]
pub extern "C" fn rust_callback(w: *mut Worker, raw_n: c_int) -> c_int {
    let n = raw_n as i32;
    println!("rust_callback: {}", n);
    // SAFETY: `w` must still point to a live Worker; C only stores the raw pointer.
    unsafe {
        let out = (*w).trigger_callback(n);
        out as c_int
    }
}
  • Everything works fine: the method on the Rust object is called as expected, and the program output looks like this (with the implementation shown above):
worker_new: worker = 0x7f9834d000c0
worker_set_rust_object: worker = 0x7f9834d000c0 rust_object = 0x10ca21000
trigger_callback: worker = 0x7f9834d000c0 rust_object = 0x10ca21000
rust_callback: 100
trigger_callback: 100
got: 200

In my real use case I'm using the bytes library and my callback is a fn(bytes::Bytes) -> bytes::Bytes. My tests work fine and Worker::trigger_callback is called successfully, but under certain scenarios I get a segmentation fault (EXC_BAD_ACCESS) when calling the function pointer stored in the struct (self.cb, i.e. Worker.cb). On the Rust side I'm using a single thread. I can also confirm that the bytes object is valid: I can even print its length with bytes.len(). The only failing step is the self.cb call.

I've been trying to debug this and find the cause, with no luck, so I'm looking for feedback on this approach. The full code is available here.

Could this issue occur when multiple threads (on the C side) try to access the callback function pointer concurrently? How would you debug this?

Best


Compiling with RUSTFLAGS="-Z sanitizer=address" (and disabling ASan's ODR-violation check) pinpoints the problem:

    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/playground3`
worker_new: worker = 0x602000000050
worker_set_rust_object: worker = 0x602000000050 rust_object = 0x602000000070
trigger_callback: worker = 0x602000000050 rust_object = 0x602000000070
rust_callback: 100
trigger_callback: 100
=================================================================
==12418==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000078 at pc 0x56456063ada9 bp 0x7ffee0e30470 sp 0x7ffee0e30468
READ of size 8 at 0x602000000078 thread T0
    #0 0x56456063ada8 in playground3::Worker::trigger_callback::h6190064bb254e604 /tmp/rust-experiment/src/main.rs:32
    #1 0x56456063bc78 in rust_callback /tmp/rust-experiment/src/main.rs:69
    #2 0x56456063ca14 in trigger_callback /tmp/rust-experiment/src/worker.cc:26:25
    #3 0x56456063b847 in playground3::main::hf9d4a6faea13b766 /tmp/rust-experiment/src/main.rs:59
    #4 0x564560639b45 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h719c12c079d0caf5 /checkout/src/libstd/rt.rs:74
    #5 0x5645606f0042 in std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::ha58db48360f1b6ce /checkout/src/libstd/rt.rs:59
    #6 0x5645606f0042 in _ZN3std9panicking3try7do_call17h42c1fd8e456f9c5eE.llvm.5099546753143347372 /checkout/src/libstd/panicking.rs:310
    #7 0x564560700589 in __rust_maybe_catch_panic /checkout/src/libpanic_unwind/lib.rs:105
    #8 0x5645606e72e2 in std::panicking::try::h5db639d572b81c9b /checkout/src/libstd/panicking.rs:289
    #9 0x5645606e72e2 in std::panic::catch_unwind::hc496adcbbea624e1 /checkout/src/libstd/panic.rs:374
    #10 0x5645606e72e2 in std::rt::lang_start_internal::h4fb241c27837a847 /checkout/src/libstd/rt.rs:58
    #11 0x564560639ab6 in std::rt::lang_start::hd5e8785a6fef03ec /checkout/src/libstd/rt.rs:74
    #12 0x56456063bd2c in main (/tmp/rust-experiment/target/debug/playground3+0xbd2c)
    #13 0x7f29106e906a in __libc_start_main (/usr/lib/libc.so.6+0x2306a)
    #14 0x564560638b59 in _start (/tmp/rust-experiment/target/debug/playground3+0x8b59)

0x602000000078 is located 8 bytes inside of 16-byte region [0x602000000070,0x602000000080)
freed by thread T0 here:
    #0 0x5645606c9cb2 in __interceptor_free /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_malloc_linux.cc:68:3
    #1 0x5645606391fd in _$LT$alloc..alloc..Global$u20$as$u20$core..alloc..GlobalAlloc$GT$::dealloc::h94d8aa7e5106fee0 /checkout/src/liballoc/alloc.rs:61
    #2 0x564560639d07 in alloc::alloc::box_free::hfd1bbc2bf750bdd8 /checkout/src/liballoc/alloc.rs:132
    #3 0x56456063b592 in playground3::new::h3a2e78e9a3425966 /tmp/rust-experiment/src/main.rs:49
    #4 0x56456063b7c3 in playground3::main::hf9d4a6faea13b766 /tmp/rust-experiment/src/main.rs:55
    #5 0x564560639b45 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h719c12c079d0caf5 /checkout/src/libstd/rt.rs:74

previously allocated by thread T0 here:
    #0 0x5645606c9e63 in malloc /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_malloc_linux.cc:88:3
    #1 0x56456063ca36 in alloc_system::platform::_$LT$impl$u20$core..alloc..GlobalAlloc$u20$for$u20$alloc_system..System$GT$::alloc::h58129110b8dbffb8 /checkout/src/liballoc_system/lib.rs:114
    #2 0x56456063ca36 in __rg_alloc /checkout/src/librustc_asan/lib.rs:27
    #3 0x564560638ddc in alloc::alloc::exchange_malloc::hf517281913dbd024 /checkout/src/liballoc/alloc.rs:114
    #4 0x56456063b331 in _$LT$alloc..boxed..Box$LT$T$GT$$GT$::new::hbda25516d394704d /checkout/src/liballoc/boxed.rs:94
    #5 0x56456063b331 in playground3::new::h3a2e78e9a3425966 /tmp/rust-experiment/src/main.rs:46
    #6 0x56456063b7c3 in playground3::main::hf9d4a6faea13b766 /tmp/rust-experiment/src/main.rs:55
    #7 0x564560639b45 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h719c12c079d0caf5 /checkout/src/libstd/rt.rs:74

SUMMARY: AddressSanitizer: heap-use-after-free /tmp/rust-experiment/src/main.rs:32 in playground3::Worker::trigger_callback::h6190064bb254e604
Shadow bytes around the buggy address:
  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa fd fa fa fa 05 fa fa fa 00 fa fa fa fd[fd]
  0x0c047fff8010: fa fa 00 fa fa fa 00 00 fa fa 00 fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==12418==ABORTING

The problem is that in the following lines

    let mut boxed_worker = Box::new(w);
    unsafe { worker_set_rust_object(_ptr, boxed_worker.as_mut())};
    *boxed_worker

you are:

  1. creating the box boxed_worker, which moves w to the heap
  2. passing the pointer to w (on the heap) to the C function
  3. moving the value out of the box and onto the stack
  4. dropping boxed_worker, which frees the heap allocation

As a result, the heap object that C points to stops existing when the new function returns, and the Worker you carry around afterwards is a different object on the stack, completely useless for your callback.
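The same effect can be reproduced without any C code. This is only a minimal sketch: the simplified Worker and the names double, new_buggy and new_boxed are made up for illustration and are not taken from the original project. It records the address that would be handed to C and compares it with the place where the Worker actually lives after the constructor returns. The stale address is only compared, never dereferenced.

```rust
// Hypothetical, simplified Worker: only the callback field from the thread.
pub struct Worker {
    cb: fn(i32) -> i32,
}

// Illustrative callback, not part of the original code.
fn double(n: i32) -> i32 {
    n * 2
}

// Mirrors the buggy constructor: the address is taken while the value is on
// the heap, then the value is moved out and the Box allocation is freed.
fn new_buggy(cb: fn(i32) -> i32) -> (Worker, usize) {
    let mut boxed = Box::new(Worker { cb });
    let registered = boxed.as_mut() as *mut Worker as usize; // what C would store
    (*boxed, registered)
}

// Keeps the Box alive, so the registered address stays valid for as long as
// the caller holds on to the returned Box.
fn new_boxed(cb: fn(i32) -> i32) -> (Box<Worker>, usize) {
    let mut boxed = Box::new(Worker { cb });
    let registered = boxed.as_mut() as *mut Worker as usize;
    (boxed, registered)
}

fn main() {
    let (w, registered) = new_buggy(double);
    let actual = &w as *const Worker as usize;
    println!("buggy: registered == actual? {}", registered == actual);

    let (b, registered) = new_boxed(double);
    let actual = &*b as *const Worker as usize;
    println!("boxed: registered == actual? {}", registered == actual);
    println!("callback through the box: {}", (b.cb)(100));
}
```

In the buggy variant the registered address refers to memory that has already been freed, which is exactly what AddressSanitizer reports as heap-use-after-free.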

To see this more clearly, you can try the following:

fn new(cb: fn(i32) -> i32) -> &'static mut Worker {
/* ... */
    unsafe { worker_set_rust_object(_ptr, boxed_worker.as_mut()) };
    Box::leak(boxed_worker)
}

If you run the code with this change, the Worker is leaked, but the program otherwise works fine:

    Finished dev [unoptimized + debuginfo] target(s) in 0.66s
     Running `target/debug/playground3`
worker_new: worker = 0x602000000050
worker_set_rust_object: worker = 0x602000000050 rust_object = 0x602000000070
trigger_callback: worker = 0x602000000050 rust_object = 0x602000000070
rust_callback: 100
trigger_callback: 100
got: 200

=================================================================
==13856==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x5582d8e7d9a3 in malloc /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_malloc_linux.cc:88:3
    #1 0x5582d8defff6 in alloc_system::platform::_$LT$impl$u20$core..alloc..GlobalAlloc$u20$for$u20$alloc_system..System$GT$::alloc::h58129110b8dbffb8 /checkout/src/liballoc_system/lib.rs:114
    #2 0x5582d8defff6 in __rg_alloc /checkout/src/librustc_asan/lib.rs:27
    #3 0x5582d8debddc in alloc::alloc::exchange_malloc::hf517281913dbd024 /checkout/src/liballoc/alloc.rs:114
    #4 0x5582d8dee85a in _$LT$alloc..boxed..Box$LT$T$GT$$GT$::new::hbda25516d394704d /checkout/src/liballoc/boxed.rs:94
    #5 0x5582d8dee85a in playground3::new::h675314a02ebfcc36 /tmp/rust-experiment/src/main.rs:44
    #6 0x5582d8deec14 in playground3::main::hf9d4a6faea13b766 /tmp/rust-experiment/src/main.rs:51
    #7 0x5582d8decd65 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h719c12c079d0caf5 /checkout/src/libstd/rt.rs:74

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x5582d8e7d9a3 in malloc /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_malloc_linux.cc:88:3
    #1 0x7f11953c4169 in operator new(unsigned long) (/usr/lib/libc++.so.1+0x8e169)
    #2 0x5582d8dee4c8 in playground3::new::h675314a02ebfcc36 /tmp/rust-experiment/src/main.rs:36
    #3 0x5582d8deec14 in playground3::main::hf9d4a6faea13b766 /tmp/rust-experiment/src/main.rs:51
    #4 0x5582d8decd65 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h719c12c079d0caf5 /checkout/src/libstd/rt.rs:74

SUMMARY: AddressSanitizer: 24 byte(s) leaked in 2 allocation(s).
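One possible way to keep the stable heap address without leaking is to hand the allocation out with Box::into_raw and reclaim it later with Box::from_raw. The sketch below is an assumption about how that could look, not the original project's code: FakeCWorker is a hypothetical Rust stand-in for the C struct, double and run_once are made-up names, and #[no_mangle] is left out because nothing is linked against C here.

```rust
use std::os::raw::{c_int, c_void};

// Simplified Worker: only the callback field from the thread.
pub struct Worker {
    cb: fn(i32) -> i32,
}

impl Worker {
    fn trigger_callback(&self, n: i32) -> i32 {
        (self.cb)(n)
    }
}

// Hypothetical stand-in for the C-side struct that stores `rust_object`.
struct FakeCWorker {
    rust_object: *mut c_void,
}

// Same shape as the callback in the thread, plus a null check.
pub extern "C" fn rust_callback(w: *mut Worker, raw_n: c_int) -> c_int {
    if w.is_null() {
        return -1;
    }
    // SAFETY: the caller guarantees `w` came from Box::into_raw and has not
    // been reclaimed yet.
    let worker = unsafe { &*w };
    worker.trigger_callback(raw_n as i32) as c_int
}

// Illustrative callback, not part of the original code.
fn double(n: i32) -> i32 {
    n * 2
}

fn run_once(n: i32) -> i32 {
    // Give up ownership: the Worker now lives at a fixed heap address.
    let raw: *mut Worker = Box::into_raw(Box::new(Worker { cb: double }));
    let c_side = FakeCWorker { rust_object: raw as *mut c_void };

    // What the C side would do when it triggers the callback.
    let out = rust_callback(c_side.rust_object as *mut Worker, n as c_int);

    // Reclaim ownership once C no longer uses the pointer; this frees it.
    // SAFETY: `raw` came from Box::into_raw above and is reclaimed only once.
    unsafe { drop(Box::from_raw(raw)) };
    out as i32
}

fn main() {
    println!("got: {}", run_once(100));
}
```

In a real wrapper the Box::from_raw call would probably live in whatever function tears down the C worker, so that the Rust object is freed exactly once and only after C has stopped using the pointer.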

Last but not least, you also have a warning about WorkerPtr not being FFI-safe.
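The warning text itself isn't shown in this thread. Assuming it is rustc's improper_ctypes lint, triggered because the wrapper struct has no guaranteed layout, one common way to address it is #[repr(transparent)], which gives a single-field wrapper the same ABI as its field. In this sketch the opaque worker type is a hypothetical stand-in for the real C type, and wrapper_is_pointer_sized is a made-up helper:

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical opaque stand-in for the C `worker` type.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct worker {
    _private: [u8; 0],
}

// Assumption: the lint fires because the wrapper has no defined layout.
// repr(transparent) gives it the same ABI as the NonNull it wraps.
#[repr(transparent)]
pub struct WorkerPtr(NonNull<worker>);

// Sanity check: the wrapper is exactly one pointer wide.
fn wrapper_is_pointer_sized() -> bool {
    size_of::<WorkerPtr>() == size_of::<*mut worker>()
}

fn main() {
    // A dangling (non-null, never dereferenced) pointer is enough here.
    let p = WorkerPtr(NonNull::dangling());
    println!("same size as a raw pointer: {}", wrapper_is_pointer_sized());
    let _ = p.0;
}
```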


Hi @dodomorandi, thank you very much, I didn't know about the sanitizer.

Yes, it is the holy grail for unsafe code :grin:. Unfortunately, to date the sanitizers require the nightly compiler, so the behaviour may differ from what you get with the stable version.

We can always use Valgrind in these cases, but I think the sanitizers are enough 99% of the time.
