Lifetimes Across Async Task Boundaries

Hello

I've encountered a recurring pattern in most of the async code I have written, which I believe highlights a limitation in Rust's lifetime model. I'm curious whether others have found elegant solutions, or whether any planned language features might address this.

The Problem

The code I am about to show is high-level pseudocode to highlight the problem, not the actual implementation.

I have a structure that owns a database connection and spawns async tasks that need to access this connection. The DB library correctly uses lifetimes (clients are tied to connection lifetime), but this creates friction with async tasks:

// The DbClient from the library
pub struct DbClient<'db, CrawlerClientConfigType, ProxyType>
where
    CrawlerClientConfigType: CrawlerClientConfig,
    ProxyType: Proxy + IntoDb<db::proxies::ActiveModel> + FromDb<db::proxies::Model>,
{
    pub client: Client<CrawlerClientConfigType, ProxyType>,
    pub db: &'db DatabaseConnection,  // Reference with lifetime
}

impl<'db, CrawlerClientConfigType, ProxyType> DbClient<'db, CrawlerClientConfigType, ProxyType>
where
    CrawlerClientConfigType: CrawlerClientConfig,
    ProxyType: Proxy + IntoDb<db::proxies::ActiveModel> + FromDb<db::proxies::Model>,
{
    // Library function that loads a client with reference to the DB connection
    pub async fn load_random_from_db(db: &'db DatabaseConnection) 
        -> Result<DbClient<'db, CrawlerClientConfigType, ProxyType>, DbErr> {
        // Implementation details...
    }
}

struct Crawler {
    db: DatabaseConnection,
    // other fields
}


impl Crawler {
    // My wrapper to get a client
    pub async fn obtain_acc_from_db<'a>(&'a self) 
        -> Option<DbClient<'a, CrawlerProdConfig, Oxylabs>> {
        // Discard the error and keep only a successfully loaded client
        DbClient::<CrawlerProdConfig, Oxylabs>::load_random_from_db(&self.db)
            .await
            .ok()
    }

    // The problematic task spawning function
    pub async fn start(&self) {
        let mut tasks = JoinSet::new();
        
        // Assume the loop is hooked onto exit codes
        loop {
            let Some(client) = self.obtain_acc_from_db().await else {
                continue;
            };

            // This won't compile in safe Rust: `JoinSet::spawn` requires a
            // `'static` future, and the compiler can't verify that `self`
            // will outlive the spawned task, even though in practice it will
            tasks.spawn(async move {
                // Error: `client` contains a reference to `self.db`,
                // which won't necessarily outlive the spawned task
                Self::task(self, client).await;
            });
        }
        // Some task cleanup stuff here
    }
}

The Workaround (Unsafe)

I can make this work using unsafe code:

pub async fn start(&self) {
    let mut tasks = JoinSet::new();
    // Assume the loop is hooked onto exit codes
    loop {
        let Some(client) = self.obtain_acc_from_db().await else {
            continue;
        };

        // Extend both lifetimes to 'static. The client is moved by value:
        // a pointer to the local `client` would dangle as soon as the loop
        // reuses that slot for the next client.
        // SAFETY: sound only if every task finishes before `self` is dropped
        // and the future returned by `start` is never leaked.
        let (self_instance, db_client) = unsafe {
            let self_instance: &'static Crawler = &*(self as *const Crawler);
            let db_client: DbClient<'static, CrawlerProdConfig, Oxylabs> =
                mem::transmute(client);
            (self_instance, db_client)
        };

        tasks.spawn(async move {
            Self::task(self_instance, db_client).await;
        });
    }
    // Some task cleanup stuff here
}

This works because I know the invariants that the compiler can't verify:

  1. The tasks are guaranteed to finish before self is destroyed
  2. No task will outlive self or access the DB connection after it's gone

Existing Solutions and Their Limitations

  1. Using Arc: I could change the library to use:
pub struct DbClient<CrawlerClientConfigType, ProxyType>
where
    CrawlerClientConfigType: CrawlerClientConfig,
    ProxyType: Proxy + IntoDb<db::proxies::ActiveModel> + FromDb<db::proxies::Model>,
{
    pub client: Client<CrawlerClientConfigType, ProxyType>,
    pub db: Arc<DatabaseConnection>,
}

But this requires changing library code which isn't always possible.

  2. Scoped tasks: Libraries like tokio-scoped tried to address this problem, but there are certain limits to this solution.

The Question

I feel this highlights a gap in Rust's async model. We need a way to express "these async tasks are guaranteed to terminate before the parent object" without resorting to unsafe code.

Has anyone found elegant solutions to this problem? Are there any language features planned that might address this issue?

I'd appreciate hearing others' experiences with similar patterns and how you've addressed them in production code.

I've seen this pattern mentioned when managing a resource (network, filesystem) separate from the async logic. I'm not sure how cleanly this could be integrated for database usage, though. (If there are many, many query functions, then encoding each function call as data for the channels can become cumbersome.)
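To make the "function call as data" point concrete, here is a minimal sketch of a resource owner that receives requests over a channel. It uses std threads and `std::sync::mpsc` so that it runs without an async runtime; with tokio the same shape would presumably use `tokio::sync::mpsc` plus a `oneshot` channel for replies. `Connection`, `Request`, `spawn_owner`, `insert` and `count` are all hypothetical names invented for illustration, not part of any database library.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for a real database connection.
struct Connection {
    rows: Vec<String>,
}

// Each query function becomes one variant that carries its arguments
// plus a channel on which the owner sends the result back.
enum Request {
    Insert { row: String, reply: mpsc::Sender<usize> },
    Count { reply: mpsc::Sender<usize> },
}

// Spawns the thread that owns the connection and returns a cloneable
// handle for sending requests to it.
fn spawn_owner() -> (mpsc::Sender<Request>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Request>();
    let handle = thread::spawn(move || {
        let mut conn = Connection { rows: Vec::new() };
        // The loop ends once every sender has been dropped.
        for request in rx {
            match request {
                Request::Insert { row, reply } => {
                    conn.rows.push(row);
                    let _ = reply.send(conn.rows.len());
                }
                Request::Count { reply } => {
                    let _ = reply.send(conn.rows.len());
                }
            }
        }
    });
    (tx, handle)
}

// Wrappers that hide the request/reply plumbing from callers.
fn insert(owner: &mpsc::Sender<Request>, row: &str) -> usize {
    let (reply, response) = mpsc::channel();
    owner
        .send(Request::Insert { row: row.to_string(), reply })
        .expect("owner thread stopped");
    response.recv().expect("owner dropped the reply channel")
}

fn count(owner: &mpsc::Sender<Request>) -> usize {
    let (reply, response) = mpsc::channel();
    owner
        .send(Request::Count { reply })
        .expect("owner thread stopped");
    response.recv().expect("owner dropped the reply channel")
}
```

No worker ever holds a reference to the connection, so no lifetime crosses a task boundary. The cost is visible in the sketch: every additional query needs its own `Request` variant and wrapper function.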

The fundamental issue is that this is not solely in the control of whoever manages the tasks; it is also the responsibility of whoever manages the parent future. This is because whoever manages the parent future could leak it, and there's nothing the task can do to detect that and stop before it happens.

If you want to go more in depth into the issues regarding async task scopes, I suggest these two reads


This is in part a limitation of Rust but also a code smell. When an error occurs, it's hard to guarantee that your connection isn't destroyed until after all of its child tasks have been polled to completion. (In general, the more tasks you spawn, the harder it is to handle errors, and centralizing error handling is one reason to prefer the actor model.)

To address both this and the previous post by SkiFire13:

These lifetime guarantees are indeed difficult to establish in general, and would be problematic in library code where you can't anticipate all usage patterns.

However, in my specific case, this code exists at the highest layer of the application - essentially the main runtime loop. This is exactly the point where you should have enough context to make reliable guarantees that the compiler simply cannot verify through its type system. The application's structure ensures that:

  1. The main loop (containing self) won't exit until all spawned tasks complete
  2. Task cancellation is handled in a controlled manner
  3. The DB connection's lifetime is tied to the application itself

I completely understand why Rust's borrow checker can't validate these relationships statically. My wish isn't to bypass safety, but rather to have a way to express these application-level guarantees within the language. This seems to be a recurring friction point specifically for long-running async applications.

For now, I'll either use the carefully documented unsafe approach or restructure with Arc, but it would be interesting to see if future Rust features might address this pattern more elegantly.
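For reference, the Arc restructuring might look roughly like the sketch below. It uses std threads rather than tokio tasks so that it runs standalone, but `thread::spawn` imposes the same `'static` requirement that `JoinSet::spawn` does. `Connection`, `Client` and `run_workers` are hypothetical stand-ins, not the real library types.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for `DatabaseConnection`.
struct Connection {
    name: String,
}

// Holds a shared handle instead of a borrow, so the type has no lifetime
// parameter and is `'static`.
struct Client {
    id: usize,
    db: Arc<Connection>,
}

impl Client {
    fn load(db: &Arc<Connection>, id: usize) -> Client {
        // Cloning the Arc only bumps a reference count.
        Client { id, db: Arc::clone(db) }
    }

    fn describe(&self) -> String {
        format!("client {} on {}", self.id, self.db.name)
    }
}

// Spawns one worker per client and collects their results in order.
fn run_workers(db: Arc<Connection>, n: usize) -> Vec<String> {
    let handles: Vec<_> = (0..n)
        .map(|id| {
            let client = Client::load(&db, id);
            // `client` owns its Arc, so it can move into the thread.
            thread::spawn(move || client.describe())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

The connection is dropped when the last `Arc` goes away, so it no longer matters whether a worker outlives the struct that created it.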

Under those conditions, you should be able to use tokio-scoped (or something using the same pattern) to make tasks that borrow from main.
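The scoped pattern can be illustrated with the standard library's `std::thread::scope`, which is the thread-based analogue and not tokio-scoped's own API. In this sketch `Connection`, `Client` and `Crawler` are simplified, hypothetical versions of the types from the original post.

```rust
use std::thread;

// Hypothetical stand-in for `DatabaseConnection`.
struct Connection {
    name: String,
}

// Borrows the connection, like the library's `DbClient<'db, ..>`.
struct Client<'db> {
    id: usize,
    db: &'db Connection,
}

struct Crawler {
    db: Connection,
}

impl Crawler {
    fn obtain(&self, id: usize) -> Client<'_> {
        Client { id, db: &self.db }
    }

    // Runs `n` workers that each hold a client borrowing from `self`.
    fn start(&self, n: usize) -> Vec<String> {
        // `thread::scope` does not return until every thread spawned
        // inside it has finished, so the borrows are accepted.
        thread::scope(|scope| {
            let handles: Vec<_> = (0..n)
                .map(|id| {
                    let client = self.obtain(id);
                    scope.spawn(move || {
                        format!("client {} on {}", client.id, client.db.name)
                    })
                })
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker panicked"))
                .collect()
        })
    }
}
```

This compiles without `unsafe` because the scope blocks the calling thread until all workers are joined. That blocking is exactly what an async scope cannot do without stalling the executor.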


Note that that's not enough. You need to ensure that whoever polls the Future returned by the start function won't leak it (and if that's another async function then you need to do the same for that function and so on).
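A small sketch of why leaking matters: `mem::forget` is a safe function, and a future is an ordinary value, so any cleanup that relies on a destructor (such as "wait for the child tasks on drop") can be skipped by safe code. `JoinOnDrop` below is a hypothetical guard that only sets a flag where a real one would join its tasks.

```rust
use std::cell::Cell;
use std::mem;

// Hypothetical guard whose destructor is meant to wait for child tasks.
// Here it only records that it ran.
struct JoinOnDrop<'a> {
    joined: &'a Cell<bool>,
}

impl Drop for JoinOnDrop<'_> {
    fn drop(&mut self) {
        self.joined.set(true);
    }
}

// Normal path: the destructor runs and the flag is set.
fn drop_normally() -> bool {
    let joined = Cell::new(false);
    let guard = JoinOnDrop { joined: &joined };
    drop(guard);
    joined.get()
}

// Leaking path: entirely safe code, yet the destructor never runs.
fn leak_instead() -> bool {
    let joined = Cell::new(false);
    let guard = JoinOnDrop { joined: &joined };
    mem::forget(guard);
    joined.get()
}
```

Here `drop_normally()` returns `true` and `leak_instead()` returns `false`. If the forgotten value were the future returned by `start`, its tasks would keep running while the borrow checker considers the borrow of `self` to be over.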