FFI Callbacks accessing outer variables

Hi everyone!

I'm trying to build a websocket server for a C library. This C library offers sync operations as well as callback operations (once an event happens).

I'm facing Segmentation Faults when the callback tries to access memory from outside the callback, and I've decided to set up the minimal example in order to reproduce it.

I've set up a repository with a minimal example in order to showcase it: https://github.com/cquintana-verbio/ffi-callback

How to trigger: Run cargo test.

The sample C library I've created offers an Accumulator which will accumulate numbers. These numbers will be summed, and once a defined N is achieved, a callback will be triggered with the sum of the previous N numbers.

In the Rust code, the trampoline and the whole callback interfacing with C has been extracted from the Unnofficial Rust FFI guide: https://s3.amazonaws.com/temp.michaelfbryan.com/callbacks/index.html#rust-closures

I've set up the following tests:

  1. Closure that checks the result. WORKS
  2. (Test that expects to fail) Closure that checks the result against an invalid value. This was done in order to prove that the callback is being triggered. WORKS.
  3. A closure that accesses an outer variable (mpsc channel). Segmentation fault.
  4. Same as 3, but with an async (tokio) channel, which is the actual case I want to build. Segmentation fault.

Whenever I try to access a variable that has been defined outside the callback, I have a segmentation fault.

Also, I assume that in the case 4 I will probably need to use a Box::pin (or Pin), as the memory may be moved between threads depending on how the async runtime schedules it. Am I right?

I'd like some help in order to make the tests pass, so I can start building the websocket server with async capabilities.

Thanks in advance!

1 Like

This code looks suspect to me:

    pub fn register_callback<F>(&mut self, cb: &mut F)
    where
        F: FnMut(i32),
    {
        let mut accumulator = move |sum: c_int| {
            cb(sum);
        };
        unsafe {
            let (closure, callback) = unpack_closure(&mut accumulator);
            registerCallback(&mut self.inner, closure, callback);
        }
    }

accumulator is a local variable to this function, and it looks like you're using a pointer to it as your callback. By the time the callback is actually called, however, this stack frame has been destroyed and the pointer is invalid. You'll need to do something to ensure the closure doesn't get deallocated until callbacks can no longer happen.

1 Like

You were right, that was it!
I've removed it and passed the cb variable directly (I don't know why I was even using it, instead of passing the cb) directly...

I've updated the repo with the correct solution. Thanks!

1 Like

That fixed the immediate segfault, but this code is still unsound. You're storing a reference provided by the caller, but nothing guarantees that the caller will keep the callback alive long enough: a user could write a function like this:

fn add_callback(acc: &mut Accumulator, dest: &mut i32) {
    let cb = move |x| { *dest = x };
    acc.register_callback(acc, &mut cb);
}

There's no unsafe, and the compiler will allow it, but you'll have another segfault to debug. One way to fix this is to tie the lifetime of the callbacks to the Accumulator: (untested)

pub struct Accumulator<'cb> {
    inner: AccumulatorSys,
    cb: std::marker::PhantomData<&'cb mut ()>
}

impl<'cb> Accumulator<'cb> {
    /* ... */
    pub fn register_callback<F>(&mut self, cb: &'cb mut F)
    where
        F: FnMut(i32),
    {
        unsafe {
            let (closure, callback) = unpack_closure(cb);
            registerCallback(&mut self.inner, closure, callback);
        }
    }
}

(NB: I'm not the most versed at unsafe code, so it'd be nice to get a second opinion from someone who does this more often)

1 Like

Wow, thanks for the side note!
Could the problem be solved if I stored the callback in the Accumulator instance?
That way it wouldn't be deallocated unless the caller registered another callback, and even then the old one would not be called anymore...

It's a reasonable strategy in general, but I'm not confident enough in my unsafe knowledge to make a concrete suggestion:

You'll need to store the closure on the heap so that the pointer doesn't change if the entire Accumulator struct gets moved. There may also be some additional subtlety around &mut aliasing that you need to be careful of.

Would a Box::pin help with that?

1 Like

Cheeky plug, but I wrote an more recent article on much the same topic. You may find it useful :slightly_smiling_face:


To answer the original question, the way you are registering callbacks is unsound. You've created a closure on the stack and passed a pointer to the registerCallback() function. Then some time after the callback is registered you'll try to invoke the callback and access something that is no longer valid.

You need to move the closure to the heap by first boxing it then passing a pointer to the boxed object into the callback. Something important to keep in mind is that unless the C code asks for some sort of destroy(void *) function it won't free your callback afterwards.

How I usually handle this is by storing the boxed object in some sort of Context object which will outlive the time my callback is registered. You can see this better in the libsignal-protocol-rs project, in particular things like ContextInner which is where we store long-lived state (in this case, just a closure used for logging).

Not really. Something behind a Box is already guaranteed to have a stable location in memory, and because almost everything is Unpin you'll be able to move it anyway. Pin is mainly used when you have self-referential types where overwriting/mutating the object would invalidate references.

I realise this directly contradicts the state in the previously mentioned libsignal_protocol_rs::ContextInner but that's because I didn't really understand Pin when I wrote the code. I just haven't removed it because it's a no-op and haven't really found the time/need.

The main use case is generators which can retain references across yield points (i.e. when you await inside an async function). So as long as your state object doesn't contain a running generator it should be fine, and even then you'll probably Box::pin() the generator individually, so it still won't be a problem.

3 Likes

Hi!
First of all, thanks for your guide, your articles and your time answering the thread!

I've created another branch with the WS Server: https://github.com/cquintana-verbio/ffi-callback/tree/websocket

My "state" is not as complicated as yours. Do you think I'm missing something?

What I'm missing is a way to free the callback, as you mentioned (I'll try to add it during the weekend).

Apart from that free, do you think my usage is unsound? I register the callback and it lives for the whole connection time. Maybe I could try to store the callback in the State instance, but I've not needed it.
I've needed to unsafe impl Send for AccumulatorSys and unsafe impl Sync for AccumulatorSys.

What "scares" me is that due to how async/await works in Rust, its location in memory might vary, which is why I was considering the Box::pin. However, I've run some tests and it has not crashed.

If you want to test it, cargo run and in another window:

$ websocat "ws://127.0.0.1:8000/ws"
{"number": 1}
{"number": 1}

You will receive a response

{"result":2}

Thank you very much!

That's not a guarantee that it won't crash in the future. "Testing can detect only the presence of errors , not their absence."

1 Like

This is true, although more testing can give you confidence that the situations being tested are correct.

I'm not 100% sold that it's correct.

You've just said the callback lives for the whole connection time, however here you
register a borrowed (&mut) closure. If your Accumulator is taking ownership of the closure (which it is, because it's cleaning it up afterwards) then you should pass by value.

As it is, your callback will be destroyed when run() finishes, which may or may not be when the Accumulator is done with it. So you could have a use-after-free depending on when things happen, and when you implement Drop for Accumulator you will have a double-free.


This is roughly how I'd implement the Accumulator wrapper.

If you needed to call some accumulator_free() function to destroy the AccumulatorSys, I'd also wrap the closure in std::mem::ManuallyDrop and implement Drop so I can explicitly specify exactly when things get destroyed. That means you aren't leaving things up to the drop code automatically injected by the compiler (e.g. when something leaves scope or a variable gets overwritten).

Hi!

And thanks again @Michael-F-Bryan for your time and help.

I've been trying some things, and I've now tried to implement some of the things I've learned from both the article and your comments.

However, I'm now facing a new change, which I hoped not to be too hard, but it's driving me a little crazy.

The C library I'm wrapping has the concept of Callback, and it works like the following:

  1. You create a Callback instance, and in the construction you pass the void* userData.
  2. The Callback has internally 2 callbacks, which are function pointers. It exposes methods for registering callbackA and callbackB.

I've implemented the same thing in my code in a new branch of the repo (see the changes in the inc/ directory), and I've managed to create some test-cases that trigger the segmentation fault I'm facing: https://github.com/cquintana-verbio/ffi-callback/tree/test/callback-struct

I have created a CallbackHolder struct that contains the C Callback struct instance and my two closures.

In the trampoline functions, I cast the *c_void to my struct, which also lives in the instance, and here's the weird thing: The first time the callback is invoked, it works, but if the other callback is invoked, or if I invoke the same callback twice, it crases with a segmentation fault.

I've added some tests, and only the ones named correct_ pass. What am I missing now? I think I'm not using anything in the stack right now.

PS: By using CLion's debugger, I've been able to see that the cast at the trampoline functions does not map to a valid representation of the struct, but I'm not able to see why.

This ought to be &mut *boxed as ...:

- let inner = unsafe { bindings::createCallback(&mut boxed as *mut _ as *mut c_void) };
+ let inner = unsafe { bindings::createCallback(&mut *boxed as *mut CallbackHolder as *mut c_void) };

Note that you wouldn't have had that issue if you had explicitely annotated the types instead of relying on _ type inference.

Given that unsafe and what you are doing is already pretty unsafe, avoid footguns and explicitely annotate the types as much as possible :wink:

Hi!

Thanks for stepping in and helping, and for finding that sneaky bug.

With this and performing this change, now all the tests pass!

- let holder: &mut Box<CallbackHolder> = unsafe { &mut *(data as *mut Box<CallbackHolder>) };
+ let holder: &mut CallbackHolder = unsafe { &mut *(data as *mut CallbackHolder) };

I'll try performing this changes in the real project and see if it works :crossed_fingers:

1 Like

I just wanted to report that the solution ended up working perfectly.

I'll try to set up the repo with the final solution for demonstration purposes, and replace the links in the thread with permalinks pinned to the exact commit when the comment was made, for historic purposes.

Thank you all for your help!

2 Likes