Working around the `'static` requirement in `tokio::task::spawn_blocking`

Context

Recently I found myself writing an async function that performs an expensive computation. Something along the lines of:

async fn my_async_fn(buf: &mut Vec<u8>) {
    // ...
    // (async operations omitted for brevity)

    // Problem: the function below will block the
    // thread because it takes long to complete!
    expensive_computation(buf)
}

fn expensive_computation(_buf: &mut Vec<u8>) {
    // Body omitted for brevity
}

As far as I know, this implementation is problematic: while expensive_computation runs, it blocks the executor thread, so other futures scheduled on that thread cannot make progress.

Current solution

The solution to the problem above is to offload the expensive computation to a different thread, using a method such as tokio::task::spawn_blocking. However, spawn_blocking has the 'static bound on the passed closure, making it impossible to do the straightforward thing:

async fn my_async_fn(buf: &mut Vec<u8>) {
    // Compile error: `buf` is a borrowed `&mut Vec<u8>`, which does not satisfy the 'static bound
    tokio::task::spawn_blocking(move || expensive_computation(buf))
        .await
        .unwrap()
}

A somewhat straightforward solution to the compile error is to change the buf parameter to be of type Arc<Mutex<Vec<u8>>>. In my view that is undesirable, because the fact that we are using spawn_blocking here is an implementation detail of my_async_fn and should not leak to its parameter types (i.e. the caller shouldn't need to be aware of it). Furthermore, using Arc and Mutex feels like overkill, because I know buf will outlive the spawned task (I am using await right after the spawn_blocking call).
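
For reference, the Arc<Mutex<...>> variant might look roughly like the sketch below. It uses std::thread::spawn as a stand-in for spawn_blocking (both require a 'static closure) so that it runs without tokio; process_shared is a made-up name and the body of expensive_computation is a placeholder.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Placeholder body; the real computation is omitted in the post.
fn expensive_computation(buf: &mut Vec<u8>) {
    buf.push(42);
}

// Hypothetical name. Note how the shared-ownership type shows up in the
// signature, which is exactly the leak the post objects to.
fn process_shared(buf: Arc<Mutex<Vec<u8>>>) {
    let worker_buf = Arc::clone(&buf);
    thread::spawn(move || {
        // The closure owns its own Arc, so it satisfies the 'static bound.
        let mut guard = worker_buf.lock().expect("mutex poisoned");
        expensive_computation(&mut guard);
    })
    .join()
    .expect("worker thread panicked");
}
```

Every caller now has to wrap its buffer in Arc<Mutex<...>> before calling, even if it never shares the buffer otherwise.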

Because I was unsatisfied, I kept fiddling with the code until I settled on the good old mem::swap trick, and got the following to compile, without having to change my_async_fn's signature:

async fn my_async_fn(buf: &mut Vec<u8>) {
    let mut owned_buf = Vec::default();
    std::mem::swap(buf, &mut owned_buf);
    *buf = tokio::task::spawn_blocking(move || {
        expensive_computation(&mut owned_buf);
        owned_buf
    })
    .await
    .unwrap()
}

Basically: I ensure the spawned task operates on an owned Vec<u8>, thereby getting rid of lifetime issues, and pass ownership back to buf once the spawned task is done.

My question

The final approach using mem::swap feels like a perfect solution in this case, but there is one big drawback that might prevent it from working in other situations: it requires a sensible default value you can use with mem::swap. That means, for instance, that the approach would not work if expensive_computation operated on a &mut File, because there is no such thing as File::default().

Is this just the way it is, making it impossible to use spawn_blocking without modifying the function's signature? Or is there some other solution/workaround I'm missing?

By the way, I have created a playground link in case you want to try things out!

Is the buffer (with performed mutations) needed after the call to my_async_fn?

I don't think there's a way around this without mem::take() or mem::replace(). There are several approaches that might work for you:

  • You are already using mem::swap() instead of mem::take() – maybe it's intentional, maybe you are just unaware of mem::take(), but this does mean that the dummy replacement value doesn't have to be the Default. You can, for example, create a dummy file by opening it in a temporary directory. File::create("/dev/null") works in a Unix environment, too.
  • Passing a mutable reference is semantically equivalent with passing a value and returning it immediately. You could change your function's signature to take the buffer by value and return it at the end of the operation.
  • There are crates that unsafely hack around the absence of a dummy replacement value; they simply ptr::read() from the mutable reference and promise™ that they'll put the value back or do something sensible upon a panic. I wouldn't recommend this approach, as it is an anti-pattern and it's unnecessarily unsafe.
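
A minimal sketch of the first bullet, assuming a type without a Default impl. Resource, Resource::placeholder and process are hypothetical names standing in for something like File and a dummy file; std::thread::spawn stands in for spawn_blocking since both impose the same 'static bound.

```rust
use std::mem;
use std::thread;

/// Hypothetical resource with no `Default` impl.
#[derive(Debug, PartialEq)]
struct Resource {
    name: String,
    bytes_processed: usize,
}

impl Resource {
    /// Hypothetical cheap dummy value, analogous to opening "/dev/null".
    fn placeholder() -> Self {
        Resource {
            name: String::from("<placeholder>"),
            bytes_processed: 0,
        }
    }
}

// Placeholder body for the expensive work.
fn expensive_computation(res: &mut Resource) {
    res.bytes_processed += 1024;
}

/// Keeps the `&mut` signature: moves the real value into the worker,
/// leaves the dummy behind, and writes the result back afterwards.
fn process(res: &mut Resource) {
    let mut owned = mem::replace(res, Resource::placeholder());
    let handle = thread::spawn(move || {
        expensive_computation(&mut owned);
        owned
    });
    *res = handle.join().expect("worker thread panicked");
}
```

With a type that does implement Default, mem::take(res) would do the same job as mem::replace(res, Default::default()).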

In this specific case it would be unsound to use such a crate. my_async_fn may be canceled before expensive_computation returns, in which case the buf needs to be immediately available to the caller that caused the cancelation.

I mean, there's probably also a reason why the called closure must be 'static in the first place. I couldn't prove to myself based on some quick back-of-the-envelope reasoning why this must be the case, but I'm pretty sure that static bound has a purpose without which the interface would be unsound, too.

See also Scoped threads in Rust, and why its async counterpart would be unsound | Wisha Wanichwecharungruang

Unfortunately, since the my_async_fn future can be passed to mem::forget at any time during the execution of your spawn_blocking, the caller can take ownership of the target of the mutable reference at any time. When the caller does this, the target must be a valid value of the given type, so you cannot use ptr::read, as this results in the value being incorrectly duplicated. Furthermore, you also cannot access the mutable reference directly from the spawned task, as this would be a data race if the mem::forget happens.
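
To make the cancellation point concrete, here is a minimal std-only sketch (no tokio; NoopWake, swap_then_wait_forever and buf_after_cancellation are names made up for illustration). It mimics the shape of the swap-based my_async_fn, polls it once by hand, and then drops the future the way a cancelling caller would. A never-completing future stands in for the spawn_blocking join handle.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Same shape as the swap-based function: take the buffer, await, put it back.
async fn swap_then_wait_forever(buf: &mut Vec<u8>) {
    let owned_buf = std::mem::take(buf);
    // Stand-in for awaiting the blocking task; never completes in this demo.
    std::future::pending::<()>().await;
    *buf = owned_buf;
}

/// Polls the future once, cancels it by dropping, and returns what the
/// caller is left with.
fn buf_after_cancellation() -> Vec<u8> {
    let mut buf = vec![1, 2, 3];
    {
        let mut fut = Box::pin(swap_then_wait_forever(&mut buf));
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        assert!(fut.as_mut().poll(&mut cx).is_pending());
        // `fut` is dropped at the end of this scope: that is cancellation.
    }
    buf
}
```

After cancellation the caller sees the placeholder (an empty Vec) rather than the original contents. That is surprising but still a valid value, which is what the placeholder buys you; a ptr::read-based approach would have nothing valid to leave behind.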

I'm gonna insist on two points @H2CO3 discussed:

  • very often you can create a dummy value for your type, so that mem::replace() can use it. When you really can't, you could wrap the referent in an Option so as to automagically make None such a value (None is Default Done Right™);

  • fn(&mut T) is indeed quite[1] equivalent to fn(T) -> T.

    From there, I thus add that if you really had to deal with a singleton type or some other hard-to-dummy-ize type, then you could try to amend my_async_fn itself to go for the latter pattern:

    async fn my_async_fn(mut buf: Vec<u8>) -> Vec<u8> {
        ::tokio::task::spawn_blocking(move || {
            expensive_computation(&mut buf);
            buf
        })
        .await
        .unwrap_or_else(|err| ::std::panic::resume_unwind(err.into_panic()))
    }
    

  1. you technically can't go from the former to the latter without mem::replace() or unsafe-ly promising to put the value back whence you took it from, so the latter is strictly more restrictive. But it's still conceptually the same idea, so you can push for the latter all the way to where the value was owned and could thus be given and gotten back 🙂
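
The Option idea from the first bullet can be sketched as follows. Singleton and process_slot are hypothetical names, the computation body is a placeholder, and std::thread::spawn again stands in for spawn_blocking (same 'static bound).

```rust
use std::thread;

/// Hypothetical type with no sensible dummy value.
struct Singleton {
    counter: u32,
}

// Placeholder body for the expensive work.
fn expensive_computation(s: &mut Singleton) {
    s.counter += 1;
}

/// `Option::take` leaves `None` behind, so no dummy `Singleton` is needed.
fn process_slot(slot: &mut Option<Singleton>) {
    if let Some(mut owned) = slot.take() {
        let handle = thread::spawn(move || {
            expensive_computation(&mut owned);
            owned
        });
        *slot = Some(handle.join().expect("worker thread panicked"));
    }
}
```

The price is that the Option shows up in the signature, and callers must be prepared to observe None if the operation is interrupted partway.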
