Closure appears to be dropped too soon


#1

Hi,
I’m trying to write a binding for a native C/C++ library that employs callbacks.

This is the code that registers a callback:

pub fn foo_register<F>(&self, cb: F) -> TraitObject where F: Fn(i32) {
	let to = unsafe {
		glue::foo_register(self.0, mem::transmute(&cb as &Fn(i32)))
	};
	mem::forget(cb);
	to
}

#[no_mangle]
pub extern fn foo_trampoline(callback: TraitObject) {
	let cb: &Fn(i32) = unsafe { mem::transmute(callback) };
	cb(3);
}

The glue is a C glue code that does the actual registering. I have an equivalent of TraitObject defined in C, the function saves that and a native callback function passes that to a rust trampoline function, which turns it back into the original closure type.

I also have the glue register function call the trampoline just to verify (which works).
And then I also have the rust register function return the trait object as well for testing purposes.

Usage:

struct Droptest(i32);

impl Drop for Droptest {
	fn drop(&mut self) {
		println!("Dropped! ({})", self.0);
	}
}

fn main() {
	let ctx = unsafe { glue::ctx_create() };
	
	let d = Droptest(1);
	let to = ctx.foo_register(move |arg: i32| {
		println!("Callback called! *{:p} = {}", &d, d.0);
	});
	foo_trampoline(to);

	unsafe { glue::ctx_do_something(ctx); }
}

ctx_do_something() triggers the callback.
The output:

Callback called! *0x7ffd98087bf8 = 1
Callback called! *0x7ffd98087bf8 = 1495883296
Callback called! *0x7ffd98087bf8 = 1493443512

The first line is when the closure is called while registering, just to verify that the pass-through through C/C++ code works. The second line is calling foo_trampoline() on the trait object returned from the register function. By this time, the closure already contains invalid data.
The third line is the native library triggering the callback, same story as the second line.

I also tried passing the closure by a trait object right from the main function, ie:

pub fn foo_register<F>(&self, cb: &F) -> TraitObject where F: Fn(i32) {
//...
let to = ctx.foo_register(& move |arg: i32| {
		println!("Callback called! *{:p} = {}", &d, d.0);
	});
//...

With that varitaion, I also get a drop logged from the Droptest thingy:

Callback called! *0x7ffd98087bf8 = 1
Dropped! (1)
Callback called! *0x7ffd98087bf8 = -2102046192
Callback called! *0x7ffd98087bf8 = 281314120

Both of those seem really weird to me.
I would’ve expected the mem::forget() to prevent the closure from being dropped.

How do I prevent the closure and it’s environment from being dropped?

Thanks.


#2

Are you sure it’s the closure being dropped, I almost wonder if d is being dropped after the first time the callback is being called.


#3

I don’t think so. Removing the first call doesn’t change anything.


#4

^^ This.

The problem here is the closure lives on the stack (isn’t boxed). Calling mem::forget on it would prevent its destructor from running (it doesn’t actually have one in this case) but doesn’t magically extend its lifetime (it still lives on the stack). The proper way to do this is to box the closure:

pub fn foo_register<F>(&self, cb: F) -> TraitObject
    where F: Fn(i32) + 'static // 'static is important, don't want this closure to reference anything on the stack.
{
        unsafe {
            let boxed_cb: Box<Fn(i32) + 'static> /* assert static */ = Box::new(cb);
            glue::foo_register(self.0, Box::into_raw(boxed_cb));
        }
}

edit: Sorry, I didn’t notice the move. Do you still get the dropping behavior after doing the above^^.


#5

@stebalien I’ll try. Will take me some time, because a raw pointer from a boxed closure is apparently still 128 bits. I’ll try to figure this out.

What does the 'static on a closure actually mean? I never fully understood that. Of course, I realize the closure shouldn’t reference the stack it is created in, but I thought move would take care of that…

Thanks for the help.

EDIT: It’s fixed now with the closure boxed. Thanks!


#6

That shouldn’t be any different than &Fn(i32). If your &Fn(i32) pointer wasn’t 128 bits (two pointers wide), something is seriously wrong.

In this case, it means that the closure doesn’t reference anything with a non-static lifetime (in the general case, it means that the type is valid for the life of the program). move || ... just means “capture by value”. The values it captures can themselves have non-static lifetimes. For example, the following will fail:

fn main() {
    let a = 0;
    let a_ref = &0;
    let not_static: Box<Fn() + 'static> = Box::new(move || println!("{}", a_ref));
}
/* 
tmp.rs:3:18: 3:19 error: borrowed value does not live long enough
tmp.rs:3     let a_ref = &0;
                          ^
tmp.rs:3:18: 3:19 note: reference must be valid for the static lifetime...
tmp.rs:3:20: 5:2 note: ...but borrowed value is only valid for the block suffix following statement 1 at 3:19
tmp.rs:3     let a_ref = &0;
*/

Also, in general, when exposing a safe interface for some unsafe functionality you can’t just trust that the caller did the right thing.


#7

Well, unless I’m wrong, &Fn(i32) should be a trait object, which should be two pointers wide… If that’s not true I don’t understand anything :slight_smile:

Ok, I get the idea, but the notion of “valid for the life of the program” is a bit confusing, because apparently I can capture Arcs in such a closure (which is good! :slight_smile:), but that doesn’t make the Arc live for the duration of the whole program, at least I hope so.

EDIT: I should probably also mark the closure Send as the callback might be called from another thread…


#8

Well, yes. They’re both fat pointers so they’re both 2 pointers wide. As a matter of fact, they should both have the same representation (data), just different types.

Values don’t have lifetimes, types do. This is one of those cases where this distinction is very important. Whenever you borrow something on the stack, rust constructs a new type that’s only defined in the region below the borrow on the stack.

The type Arc<T + 'static> is valid for the life of the program. That is, a value of this type could stick around for the the life of the program (it doesn’t mean that it will). That’s why Arcs need dynamic memory management (so the backing memory can be freed early).

Yes. Or Sync if it might be called concurrently from multiple threads.


#9

@stebalien Ok. Thanks a lot for the explanation!