How to handle a deterministic but non-'static lifetime in tokio

I use tokio to run async jobs, and I've found some tricky cases involving lifetimes. Here is a simple example:

struct People {
    id: u32,
}

#[tokio::main]
async fn main() {
    loop {
        let people = People { id: 1 };
        tokio::spawn(print_id(&people)).await;
    }
}

async fn print_id(people: &People) {
    tokio::time::sleep(std::time::Duration::from_millis(1)).await;
    println!("id: {}", people.id);
}

It won't compile; the error is:

`people` does not live long enough

I found that tokio::spawn requires the future passed to it to have a 'static lifetime:

pub fn spawn<T>(future: T) -> JoinHandle<T::Output>
    where
        T: Future + Send + 'static,
        T::Output: Send + 'static,
{
    // ...
}

I understand the reasoning: once the job is submitted to tokio, there's no guarantee when it will start running or when it will finish. In the worst case, it might never finish. That's why it's marked 'static.

But in my case above, the job's lifetime is actually deterministic: the .await makes sure that either the async job finishes before people is dropped, or it never finishes and people is never dropped either. So what should I do in this case? Assume that async move and clone are not appropriate and a reference is all we have.

Not quite. It is marked 'static, not because the task might run forever, but because the referent might get dropped before the task runs and accesses the referent. That's a use-after-free, one of the main memory errors Rust's ownership model helps you prevent.

You probably want to wrap the value in an Arc and pass a clone of the Arc to the task. Arcs are inexpensive to clone and let you access the underlying value without the risk that it was freed previously, because the value isn't dropped as long as the strong reference count of the Arc doesn't reach zero:

use std::sync::Arc;

struct People {
    id: u32,
}

#[tokio::main]
async fn main() {
    let people = Arc::new(People { id: 1 });
    tokio::spawn(print_id(people)).await.unwrap();
}

async fn print_id(people: Arc<People>) {
    tokio::time::sleep(std::time::Duration::from_millis(1)).await;
    println!("id: {}", people.id);
}

Playground.
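
To make the reference-counting argument concrete, here is a minimal std-only sketch. It swaps tokio::spawn for std::thread::spawn (which has a similar 'static bound) so it runs without any dependencies; the helper name read_id_on_thread is made up for illustration.

```rust
use std::sync::Arc;
use std::thread;

struct People {
    id: u32,
}

// Hypothetical helper: reads the id on another thread and reports the
// strong count seen by the caller once that thread has been joined.
fn read_id_on_thread(people: &Arc<People>) -> (u32, usize) {
    // Cloning the Arc only bumps the reference count; People itself is not copied.
    let for_thread = Arc::clone(people);
    // The closure owns its Arc, so it satisfies the 'static bound on spawn.
    let handle = thread::spawn(move || for_thread.id);
    let id = handle.join().unwrap();
    // The thread's clone was dropped when the closure finished,
    // so only the caller's handle keeps the value alive now.
    (id, Arc::strong_count(people))
}

fn main() {
    let people = Arc::new(People { id: 1 });
    let (id, count) = read_id_on_thread(&people);
    println!("id: {}, strong count: {}", id, count);
}
```

The same shape should carry over to tokio::spawn with an async block that owns the cloned Arc.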

I just found a solution that uses tokio_scoped:

A scoped tokio Runtime that can be used to create Scopes which can spawn futures which can access stack data. That is, the futures spawned by the Scope do not require the 'static lifetime bound. This can be done safely by ensuring that the Scope doesn’t exit until all spawned futures have finished executing. Be aware, that when a Scope exits it will block until every future spawned by the Scope completes. Therefore, one should take caution when created scopes within an asynchronous context, such as from within another spawned future.

Now the code looks like:

struct People {
    id: u32,
}

#[tokio::main]
async fn main() {
    loop {
        let people = People { id: 1 };
        tokio_scoped::scope(|scope| {
            // Use the scope to spawn the future.
            scope.spawn(print_id(&people));
        });
    }
}

async fn print_id(people: &People) {
    tokio::time::sleep(std::time::Duration::from_millis(1)).await;
    println!("id: {}", people.id);
}

which needs neither 'static nor Arc. I think that's what I want.

Thanks a lot anyway.

Note that there is a deprecation notice in the tokio-scoped README, imploring users to use async-scoped instead.

You should probably not use either.

In this specific case I would recommend just cloning the value.

I know cloning is always the go-to solution for lifetime problems. The given case is just simple enough for demonstration. I think there are cases where cloning is not appropriate, for example when People doesn't implement the Clone trait or cloning is expensive.

To reiterate what I stated above, that's usually where reference counted smart pointers like Arc and Rc come into play. They are inexpensive to clone (they only need to increase the reference counter internally) and let you avoid the issues around lifetimes you have to deal with when using shared references in asynchronous contexts.

Keep in mind that Arc construction typically clones the value too!

I don't think I agree with this statement. I'd say typical construction of an Arc happens through Arc::new, which does not involve any cloning of the value but instead takes the data by value and moves it into an allocation on the heap (i.e. creates a new Box to store the data in).

Yes, I think Arc can usually resolve lifetime issues. There may still be some tricky cases that Arc doesn't handle well.
The first is mutability. Consider a print_and_increase_id function that takes &mut People as its argument:

struct People {
    id: u32,
}

#[tokio::main]
async fn main() {
    loop {
        let mut people = People { id: 1 };
        tokio_scoped::scope(|scope| {
            // Use the scope to spawn the future.
            scope.spawn(print_and_increase_id(&mut people));
        });
        println!("id: {}", people.id);
    }
}

async fn print_and_increase_id(people: &mut People) {
    tokio::time::sleep(std::time::Duration::from_millis(1)).await;
    println!("id: {}", people.id);
    people.id += 1;
}

With a scope, the code does exactly what I want: the value is changed in the async job and accessed after the job finishes, with no lifetime issues. With Arc, we cannot get a &mut People directly from an Arc<People>; maybe some extra transformation can help, but it seems more tedious.
You may argue that we should change the function's signature to take an Arc as its argument. That's the second case I want to mention: what if the function lives in an external crate that we can't change?
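
For comparison, the standard library has the same scoping idea for threads. The sketch below uses std::thread::scope rather than tokio_scoped, and a synchronous print_and_increase_id, so it is only an analogy; bump_in_scope is a made-up name. Note that the scope blocks the calling thread until the spawned work is done.

```rust
use std::thread;

struct People {
    id: u32,
}

// Synchronous stand-in for the async function in the post above.
fn print_and_increase_id(people: &mut People) {
    println!("id: {}", people.id);
    people.id += 1;
}

// Hypothetical helper: lends `people` mutably to a scoped thread and
// returns the id observed once the scope has ended.
fn bump_in_scope(people: &mut People) -> u32 {
    thread::scope(|scope| {
        // No 'static bound here: the scope joins this thread before returning,
        // so the borrow cannot outlive `people`.
        scope.spawn(|| print_and_increase_id(people));
    });
    // The mutable borrow ended with the scope, so we can read the value again.
    people.id
}

fn main() {
    let mut people = People { id: 1 };
    let after = bump_in_scope(&mut people);
    println!("id after scope: {}", after);
}
```

The compiler accepts the borrow only because the scope guarantees the join; blocking like this is cheap for a thread but stalls an executor if done inside an async task.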

To conclude:

  1. Just clone directly; if that costs too much, consider the other options.
  2. Use Arc, which works for most cases. If the atomic operations still cost too much or we can't change the function signature, consider a scope.
  3. Use a scope.

For mutability you need to use interior mutability, like an RwLock or a Mutex inside your Arc. Here is an example of how I tend to abstract shared memory into types:

use std::sync::{Arc, RwLock};

#[derive(Clone)]
struct People {
    inner: Arc<RwLock<PeopleInner>>,
}

impl People {
    fn new(id: u32) -> Self {
        Self {
            inner: Arc::new(RwLock::new(PeopleInner { id })),
        }
    }

    fn id(&self) -> u32 {
        self.inner.read().unwrap().id
    }

    fn inc(&self) -> u32 {
        let mut guard = self.inner.write().unwrap();
        let res = guard.id;
        guard.id += 1;
        res
    }
}

struct PeopleInner {
    id: u32,
}

#[tokio::main]
async fn main() {
    let people = People::new(1);

    let people_task = people.clone();

    tokio::task::spawn(async move {
        println!("id: {}", people_task.inc());
    })
    .await
    .unwrap();

    println!("id: {}", people.id());
}

Playground.

It depends on how you're getting the data. If you have something like

use std::sync::Arc;

#[inline(never)]
fn into_arc<T>(data: T) -> Arc<T> {
    // No trailing semicolon: the Arc is the function's return value.
    Arc::new(data)
}

this involves copying the data into the newly created box. Of course, this won't happen if you can construct the Arc directly, but that's not always possible.

tokio_scoped does not do what you think it does. It blocks the thread, which means that it's only suitable for use outside of a runtime.

As jofas said, for this kind of thing you need to rely on interior mutability. For something this simple you can just use an atomic integer such as AtomicU32.
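
A rough sketch of the atomic variant, assuming the id field itself can be changed to an atomic (AtomicU32 here because id was a u32). It uses std::thread::spawn instead of tokio::spawn to stay dependency-free; the function takes a shared reference, so no &mut is needed.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

struct People {
    id: AtomicU32,
}

// Takes a shared reference; the atomic provides the interior mutability.
// Returns the id as it was before the increment.
fn print_and_increase_id(people: &People) -> u32 {
    // fetch_add returns the previous value and increments in one atomic step.
    let old = people.id.fetch_add(1, Ordering::Relaxed);
    println!("id: {}", old);
    old
}

fn main() {
    let people = Arc::new(People { id: AtomicU32::new(1) });
    let for_thread = Arc::clone(&people);
    // The spawned closure owns its Arc clone, which satisfies the 'static bound.
    thread::spawn(move || print_and_increase_id(&for_thread))
        .join()
        .unwrap();
    // The join orders this load after the increment.
    println!("id: {}", people.id.load(Ordering::Relaxed));
}
```

This only covers a single integer; for a struct with several fields that must change together, a Mutex or RwLock is usually the better fit.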

Sometimes you just don't have 'static data at the point where you want to spawn, so you can't put it into an Arc, and you don't want to or can't clone it. In that case, I think you might get away with tokio::task::spawn_local? (Although it still requires the future to be 'static; it only drops the Send requirement.)

In the worst case, though, you could avoid the spawn altogether and just sprinkle in some tokio::task::yield_now().await; calls to break up the amount of time you are potentially blocking.

Moving data is not the same as cloning or copying in Rust. Calling Arc::new neither involves Clone::clone, nor is the value copied (from the language perspective). It is simply moved. You are right that moving a value might involve copying bytes to a new location in memory, in this case to a new heap allocation, but that happens under the hood as part of Rust's move semantics and might be optimized away. By the way, as far as the heap allocation and the move into it are concerned, your example is the same as calling Arc::new directly.

It's also important to note that the byte-wise copy that may happen during a move is shallow, which is usually very cheap. For example, moving a Vec is a quick constant-time operation that doesn't touch the Vec's contents.
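
One way to see this is to check that a Vec's heap buffer stays at the same address when its owner is moved into an Arc. The sketch below is illustrative only: Roster and buffer_survives_move are made-up names, and Roster deliberately does not implement Clone, so no clone can be involved.

```rust
use std::sync::Arc;

// Deliberately not Clone: if Arc::new needed to clone, this would not compile.
struct Roster {
    ids: Vec<u32>,
}

fn into_arc<T>(data: T) -> Arc<T> {
    // `data` is moved into the new heap allocation.
    Arc::new(data)
}

// Returns true if the Vec's element buffer is at the same address
// before and after the Roster is moved into the Arc.
fn buffer_survives_move(ids: Vec<u32>) -> bool {
    let before = ids.as_ptr();
    let shared = into_arc(Roster { ids });
    // Only the Vec's (pointer, capacity, length) triple was moved;
    // the elements themselves were never copied.
    shared.ids.as_ptr() == before
}

fn main() {
    println!("buffer reused: {}", buffer_survives_move(vec![1, 2, 3]));
}
```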
