I have a C API which has a function that does some work asynchronously and notifies you of completion through a callback. This function takes a reference to data which needs to remain valid until the work is completed, i.e. until the callback is called, not just until the function itself returns.
What would be the idiomatic way to wrap this in Rust? Requiring the function to be executed in a closure passed to a scope (which waits for completion) - as std::thread::scope does - is not an option, because the calling function on the Rust side is itself a callback which needs to return before the async work is done.
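For concreteness, the kind of C API in question might look roughly like this when declared on the Rust side (all names and parameter shapes below are placeholders, not the actual API):

use std::os::raw::c_void;

#[repr(C)]
pub struct Param {
    // stand-in field; the real layout comes from the C header
    value: i32,
}

impl Param {
    pub fn new() -> Self {
        Param { value: 0 }
    }
}

extern "C" {
    // hypothetical: starts the work and returns immediately; `param` must
    // stay valid until `on_done` is invoked with `ctx`
    fn do_work_async(
        param: *const Param,
        on_done: unsafe extern "C" fn(ctx: *mut c_void),
        ctx: *mut c_void,
    );
}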
My idea is to have the function return a join handle with a lifetime annotation, which blocks in drop() until the callback is called:
fn c_api_func_wrapper<'a, C: FnOnce() + Send + 'static>(param: &'a Param, callback: C) -> JoinHandle<'a>
So it could be used like this:
let param = Param::new();
let join_handle = c_api_func_wrapper(&param, || println!("completed"));
join_handle.join();
With join() consuming the handle, and drop() for the handle also blocking for completion.
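For illustration, a minimal sketch of what such a handle could look like (not actual code from the project; the mpsc-based signalling, the field names, and the Param placeholder from above are assumptions):

use std::marker::PhantomData;
use std::sync::mpsc::Receiver;

pub struct JoinHandle<'a> {
    // signalled (or disconnected) by the completion callback
    done: Receiver<()>,
    // ties the handle to the `Param` borrowed by the C API
    _param: PhantomData<&'a Param>,
}

impl<'a> JoinHandle<'a> {
    // consuming join: just drops the handle, which blocks until completion
    pub fn join(self) {}
}

impl<'a> Drop for JoinHandle<'a> {
    fn drop(&mut self) {
        // block until the callback has fired, so the borrowed `Param`
        // cannot be freed too early
        let _ = self.done.recv();
    }
}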
Is that a good idea, or are there problems with that?
Don't pass references to it; pass the owned data instead. Then, in the callback, you drop the data, or reclaim the buffer if you want to reuse it later.
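Not from the thread, but a minimal sketch of that shape, assuming a hypothetical C entry point c_do_work_async that takes an opaque context pointer plus a C function pointer (Param as in the placeholder declaration above):

use std::os::raw::c_void;

extern "C" {
    // hypothetical asynchronous C entry point
    fn c_do_work_async(ctx: *mut c_void, on_done: unsafe extern "C" fn(*mut c_void));
}

unsafe extern "C" fn on_done(ctx: *mut c_void) {
    // reclaim ownership here, then drop (or reuse) the data
    drop(unsafe { Box::from_raw(ctx as *mut Param) });
}

fn start(param: Param) {
    // hand ownership to the C side as an opaque pointer; no borrow involved
    let ctx = Box::into_raw(Box::new(param)) as *mut c_void;
    unsafe { c_do_work_async(ctx, on_done) };
}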
What I mean is that I can't wait in the function which calls the API until the work is done.
So it could be something like this:
fn my_func(work_items: &mut Vec<WorkItem>) {
    let param = Param::new();
    let mut work_item = WorkItem::new(param);
    // some messy stuff needed here to actually transfer ownership of join_handle into work_item
    work_item.join_handle = c_api_func_wrapper(&param, || println!("completed"));
    work_items.push(work_item);
}
Then the function returns, and waiting for completion can be done elsewhere (where mutable access to work_items is available, too).
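The "elsewhere" part could then be as simple as this (assuming the WorkItem from above exposes its join_handle, and that join() consumes the handle as described earlier):

fn finish_all(work_items: &mut Vec<WorkItem>) {
    for work_item in work_items.drain(..) {
        // blocks until the C API has called back for this item
        work_item.join_handle.join();
    }
}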
That sounds like an option, though less ergonomic. But can't be helped, I guess...
the "lifetime" of references in rust is statically checked by the compiler, you cannot express "the usage starts at this function and ends in another function" using borrowed data.
If, for whatever reason, you cannot transfer the ownership of the data to the worker, you'll have to use some runtime-checked construct, such as RwLock (I don't think RefCell is appropriate in this case, for thread safety's sake), and transfer the ownership of the guard objects instead.
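Not from the thread, and simplified to cloning an Arc<RwLock<...>> into the completion callback rather than literally handing over guard objects; every name below (c_api_func_wrapper_owned in particular) is a placeholder:

use std::sync::{Arc, RwLock};

// hypothetical owning variant of the wrapper (the signature is an assumption)
fn c_api_func_wrapper_owned<C: FnOnce() + Send + 'static>(
    _param: Arc<RwLock<Param>>,
    _callback: C,
) {
    // ... would derive a raw pointer from the Arc and start the C API ...
}

fn my_func(pending: &mut Vec<Arc<RwLock<Param>>>) {
    let param = Arc::new(RwLock::new(Param::new()));
    let for_callback = Arc::clone(&param);

    // the callback's clone of the Arc keeps the data alive until completion,
    // so no borrow has to span the asynchronous work
    c_api_func_wrapper_owned(Arc::clone(&param), move || {
        let _param = for_callback.read().unwrap(); // runtime-checked access
        println!("completed");
    });

    // the caller keeps its own handle; later access also goes through the lock
    pending.push(param);
}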
I know. But see the workaround above. You can transfer ownership by returning it or by placing it in a container which outlives the function.
Together with the fact that one element of a struct can reference another one (in the same struct) when using the right lifetime annotations, there are options to achieve something close to that.
That's what I was suggesting, but you need to "take" ownership of the data from the old location first. I would usually use something like Option<Box<...>>, but it's similar for Vec, for example:
fn my_func(work_items: &mut Vec<WorkItem>) {
    // `mem::take()` is equivalent to `mem::replace(work_items, vec![])` for `Vec`
    let work_items = std::mem::take(work_items);
    // `Vec` is not FFI safe, so turn it into its raw parts
    // (`into_raw_parts()` needs nightly)
    let (ptr, len, cap) = work_items.into_raw_parts();
    wrapper((ptr, len, cap), |(ptr, len, cap)| {
        // safety: these raw parts came from the `Vec` we leaked above
        drop(unsafe { Vec::from_raw_parts(ptr, len, cap) })
    });
}
Self-referential data structures need two-phase construction and are infamously hard to work with in Rust. You are also likely to run into the borrowed-forever issue.
You can use borrowed data for the FFI argument, but the only safe way to do it is to make the wrapper function block on the API until the callback fires, which defeats the purpose of an asynchronous API in the first place:
fn wrapper(data: &[WorkItem]) {
    // turn the slice into a raw pointer and length to pass through the FFI boundary
    let data_ptr = data.as_ptr();
    let data_len = data.len();
    // note: this is for illustrative purposes only,
    // using `park()` and `unpark()` in this way WILL have race conditions,
    // proper synchronization should be used in real code
    let t = std::thread::current();
    // safety: `data` stays borrowed (and therefore valid) until `park()` returns
    unsafe {
        ffi_api(data_ptr, data_len, move || t.unpark());
    }
    std::thread::park();
}
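For completeness, here is one way the "proper synchronization" mentioned in the comment could look, using a Mutex/Condvar pair and looping over spurious wakeups (still the same illustrative ffi_api and WorkItem, and the same hand-waved closure-through-FFI, as the block above):

use std::sync::{Arc, Condvar, Mutex};

fn wrapper(data: &[WorkItem]) {
    let done = Arc::new((Mutex::new(false), Condvar::new()));
    let done_for_callback = Arc::clone(&done);

    // safety: `data` stays borrowed (and therefore valid) until we have
    // waited for the callback below
    unsafe {
        ffi_api(data.as_ptr(), data.len(), move || {
            let (flag, cvar) = &*done_for_callback;
            *flag.lock().unwrap() = true;
            cvar.notify_one();
        });
    }

    // block until the callback has set the flag, ignoring spurious wakeups
    let (flag, cvar) = &*done;
    let mut finished = flag.lock().unwrap();
    while !*finished {
        finished = cvar.wait(finished).unwrap();
    }
}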
Hmm, interesting. So there isn't a way to downgrade an exclusive reference to a shared reference...
I remember there was a bigger discussion about that matter at the time. Unfortunately I cannot find it right now. I would be interested in the reasoning that led to the conclusion that it needs to be like that.
If I think about it, what's quite surprising to me is that it seems to imply one can destroy an object even if something still holds a reference to it.
Object A references B. For some reason the compiler decides not to call the destructor of A, even though all references go out of scope. Then object B goes out of scope and is destroyed/freed. Doesn't that mean object A is still 'live' and now holds a dangling reference?
It's (at least partially) because it's impossible to ensure drop code runs without severely limiting the language. For example, one could make an Rc cycle to effectively forget an object.
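A tiny demonstration of that point, in entirely safe code (the destructor simply never runs):

struct Loud;

impl Drop for Loud {
    fn drop(&mut self) {
        println!("dropped");
    }
}

fn main() {
    let x = Loud;
    // safe: the value is leaked and its destructor is never called
    std::mem::forget(x);
    // an `Rc` cycle (e.g. two `Rc<RefCell<...>>` nodes pointing at each
    // other) achieves the same effect without ever naming `forget`
}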
Of course. But why does that mean that an object B, which the non-dropped object A holds a reference to, can be destroyed? You cannot wrap A in an Rc (or call mem::forget on it) in the first place, if it isn't guaranteed that A cannot outlive B.
I'm not quite following your prose in place of code, but...
Trivial destructors (such as a reference going out of scope) are known not to observe their borrows (the lifetimes in their types) and can thus be dropped after the borrowed place drops.
There's an unsafe, unstable feature that allows non-builtin types to opt into that guarantee to some extent as well (which the std collections use).
Non-trivial destructors not using said feature are considered to be able to observe their borrows, and cause borrow checker errors if they would run after the borrowed place drops.
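A small example of the first and the last point (the unstable feature referred to in the middle is presumably #[may_dangle] / dropck_eyepatch):

struct Noisy<'a>(&'a String);

impl<'a> Drop for Noisy<'a> {
    // non-trivial destructor: the borrow checker must assume it could
    // observe `self.0`
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

fn main() {
    // fine: a plain reference has a trivial destructor, so it is allowed
    // to be dropped after the `String` it borrows
    let r;
    let s = String::from("borrowed");
    r = &s;
    println!("{r}");

    let _n = Noisy(&s); // fine: `_n` drops before `s` does

    // would NOT compile if uncommented: `n` would drop after `t`, and
    // `Noisy`'s destructor is assumed to observe the borrow
    // ("`t` does not live long enough"):
    // let n;
    // let t = String::from("borrowed");
    // n = Noisy(&t);
}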
Ah, thanks a lot. Yes, that was what I was looking for. I couldn't fully follow the Scoped Task Trilemma article, but I see now what the gap in my understanding was:
One can actually transfer ownership of an object to a function, even though it is holding references to something whose lifetime does not extend beyond the function's return. Rather logical when thinking about it. Then the whole thing isn't surprising anymore...
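In other words, something like this is perfectly fine (a trivial standalone example, unrelated to the wrapper above):

struct Borrower<'a>(&'a str);

fn consume(b: Borrower<'_>) {
    println!("{}", b.0);
} // `b` is dropped here, while the borrowed string is still alive

fn main() {
    let s = String::from("hi");
    let b = Borrower(&s);
    consume(b); // ownership of the borrowing value moves into `consume`
    // `s` is still usable here and drops at the end of `main`
    println!("{s}");
}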