I'm looking for general design guidance around patterns for handling mem::forget in situations where a buffer is owned by some asynchronous library.
I'm contributing to the Rust MPI library rsmpi. MPI is a standard for scalable programs in scientific computing.
One of the features of MPI is its asynchronous communication routines. E.g.:
    int MPI_Isend(const void *buf, int count, MPI_Datatype datatype, int dest,
                  int tag, MPI_Comm comm, MPI_Request *request);
    int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source,
                  int tag, MPI_Comm comm, MPI_Request *request);
MPI_Isend (the I stands for "immediate") takes a buffer buf of count elements of type datatype, sends it to dest in the communicator comm, and returns a request. Conceptually, request owns a borrow of buf until the user completes the request by calling MPI_Wait(request), at which point MPI has released the buffer back to the user. MPI_Irecv does the same, but with a mutable borrow.
A naive Rust-managed version of MPI_Request would look like this:
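    use std::cell::Cell;
    use std::marker::PhantomData;

    struct Request<'a> {
        request: MPI_Request,
        // Marks the request as borrowing its buffer for 'a (invariant in 'a).
        phantom: PhantomData<Cell<&'a ()>>,
    }

    impl<'a> Drop for Request<'a> {
        fn drop(&mut self) {
            // A completed request has been reset to MPI_REQUEST_NULL.
            assert!(self.request == MPI_REQUEST_NULL,
                "The user did not complete the request before it went out of scope! \
                 This is unsafe because an uncompleted Request may still use its attached buffers.");
        }
    }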
Then ideally, something like this would be safe:
    {
        let mut my_recv_buffer = [0i32; 8];
        let request = comm.process_at_rank(src).immediate_recv(&mut my_recv_buffer);
        // If code does not take this branch, for whatever reason, the code should still
        // be safe because the program will crash rather than continue with a potential
        // use after free.
        if ... {
            request.wait();
        }
    }
Unfortunately, this hypothetical API would not be safe: because of std::mem::forget, the following can be written entirely in safe code.
    {
        let mut my_recv_buffer = [0i32; 8];
        let request = comm.process_at_rank(src).immediate_recv(&mut my_recv_buffer);
        std::mem::forget(request);
    }
    // Uh-oh! The request persists, but the buffer is no longer borrowed!
    // These APIs are not actually safe!
Because forget prevents the Request destructor from running, it defeats our guarantee that the request does not yield ownership of the buffer before it is completed.
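To make the effect of forget concrete, here is a minimal, MPI-free sketch. Guard is a hypothetical stand-in for a request (it is not an rsmpi type), and its destructor only records that it ran. The two helper functions are also made up for illustration.

```rust
use std::cell::Cell;
use std::mem;

// Hypothetical stand-in for a request guard; not rsmpi's actual type.
struct Guard<'a> {
    dropped: &'a Cell<bool>,
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        // Record that the destructor ran.
        self.dropped.set(true);
    }
}

/// Returns whether the destructor ran when the guard left scope normally.
fn drop_runs_on_scope_exit() -> bool {
    let dropped = Cell::new(false);
    {
        let _guard = Guard { dropped: &dropped };
    } // `_guard` goes out of scope here, so `drop` runs.
    dropped.get()
}

/// Returns whether the destructor ran when the guard was passed to `mem::forget`.
fn drop_runs_after_forget() -> bool {
    let dropped = Cell::new(false);
    {
        let guard = Guard { dropped: &dropped };
        // Entirely safe code: the value is consumed and its destructor is skipped.
        mem::forget(guard);
    }
    dropped.get()
}
```

The first function returns true and the second returns false, so any invariant that is enforced only in Drop can be bypassed without unsafe.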
rsmpi solves the problem in the following way: all "immediate" functions take a scope parameter, where the scope is an un-forgettable type. E.g.:
    let mut recv_buffer = [0i32; 8];
    mpi::request::scope(|scope: &LocalScope| {
        let request = world.process_at_rank(src)
            .immediate_recv(scope, &mut recv_buffer);
    });
LocalScope is defined such that each Request registers with the scope. At the end of the closure, which is an FnOnce, the code panics if any registered Requests have not been completed. The lifetime of any buffers must, of course, be greater than the lifetime of scope.
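A rough sketch of how such a scope can work, with the MPI calls left out. This is my own simplification and not rsmpi's implementation: the outstanding counter, start, and the wait body are assumptions made for illustration. The key point is that the caller only ever sees a &LocalScope, so there is no owned value to pass to forget, and the check runs after the closure returns.

```rust
use std::cell::Cell;

// Hypothetical sketch; rsmpi's real `LocalScope` differs in detail.
pub struct LocalScope {
    // Number of requests started in this scope that have not been completed.
    outstanding: Cell<usize>,
}

pub struct Request<'s> {
    scope: &'s LocalScope,
}

impl LocalScope {
    /// Registers a new in-flight request with this scope.
    pub fn start(&self) -> Request<'_> {
        self.outstanding.set(self.outstanding.get() + 1);
        Request { scope: self }
    }
}

impl<'s> Request<'s> {
    /// Stands in for MPI_Wait: completes the request and unregisters it.
    pub fn wait(self) {
        self.scope.outstanding.set(self.scope.outstanding.get() - 1);
    }
}

/// Runs `f` with a fresh scope and panics if any request is still outstanding.
pub fn scope<R>(f: impl FnOnce(&LocalScope) -> R) -> R {
    let local = LocalScope { outstanding: Cell::new(0) };
    let result = f(&local);
    // The caller never owns `local`, so this check cannot be skipped with `forget`.
    assert!(
        local.outstanding.get() == 0,
        "a request was not completed before the scope ended"
    );
    result
}
```

Forgetting or dropping a Request leaves the counter non-zero, so scope panics instead of returning. The sketch ignores the case where the closure itself panics, which a real implementation would also have to handle.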
One of the changes I'm making is to remove the scope field and instead allow the Request to take ownership of its buffers. This makes the API a little less arduous if you just want to pass Vec<T> as the send or receive buffer. Unfortunately, you still need the scope concept if you want to use a borrowed buffer; e.g., there would be some API like let scoped_buffer = scope.attach(send_buffer) with the same semantics we currently have. With this change, a substantial amount of code could avoid using scope altogether.
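For the owned-buffer case, the shape I have in mind is roughly the following. OwnedRequest, start, and wait are placeholder names for this sketch, and the MPI calls are only indicated in comments. If the request is forgotten, the Vec is leaked along with it, so the memory stays valid for as long as MPI might touch it.

```rust
// Hypothetical sketch of a request that owns its buffer; not rsmpi's API.
pub struct OwnedRequest<T> {
    // `Some` until the request is completed and the buffer is handed back.
    buffer: Option<Vec<T>>,
}

impl<T> OwnedRequest<T> {
    /// Stands in for starting an immediate operation on `buffer`.
    pub fn start(buffer: Vec<T>) -> Self {
        OwnedRequest { buffer: Some(buffer) }
    }

    /// Stands in for MPI_Wait: completes the request and returns the buffer.
    pub fn wait(mut self) -> Vec<T> {
        // A real implementation would block in MPI_Wait before this point.
        self.buffer.take().expect("buffer is present until the request completes")
    }
}

impl<T> Drop for OwnedRequest<T> {
    fn drop(&mut self) {
        if self.buffer.is_some() {
            // Dropped without `wait`: a real implementation would have to
            // complete (or cancel) the request here, before the Vec is freed.
        }
    }
}
```

Usage would be let buf = OwnedRequest::start(vec![0i32; 8]).wait();. Calling std::mem::forget on the request skips Drop, but it also never frees the Vec, so it leaks memory rather than causing a use after free.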
Now my question: is there a better code pattern for this besides scope? It's kind of frustrating that the ownership semantics of Rust can perfectly guarantee that the request's buffers outlive the request but, because of mem::forget, cannot guarantee that the Request is safely "cleaned up" before yielding ownership of the buffers.