Hello
I've encountered a recurring pattern in most async code that I have written which I believe highlights a limitation in Rust's lifetime model. I'm curious if others have found elegant solutions or if there are any planned language features that might address this.
The Problem
The code below is high-level pseudocode meant to illustrate the problem, not the actual implementation.
I have a structure that owns a database connection and spawns async tasks that need to access this connection. The DB library correctly uses lifetimes (clients are tied to connection lifetime), but this creates friction with async tasks:
// The DbClient from the library
pub struct DbClient<'db, CrawlerClientConfigType, ProxyType>
where
    CrawlerClientConfigType: CrawlerClientConfig,
    ProxyType: Proxy + IntoDb<db::proxies::ActiveModel> + FromDb<db::proxies::Model>,
{
    pub client: Client<CrawlerClientConfigType, ProxyType>,
    pub db: &'db DatabaseConnection, // Reference with lifetime
}

impl<'db, CrawlerClientConfigType, ProxyType> DbClient<'db, CrawlerClientConfigType, ProxyType>
where
    CrawlerClientConfigType: CrawlerClientConfig,
    ProxyType: Proxy + IntoDb<db::proxies::ActiveModel> + FromDb<db::proxies::Model>,
{
    // Library function that loads a client with reference to the DB connection
    pub async fn load_random_from_db(db: &'db DatabaseConnection)
        -> Result<DbClient<'db, CrawlerClientConfigType, ProxyType>, DbErr> {
        // Implementation details...
    }
}

struct Crawler {
    db: DatabaseConnection,
    // other fields
}

impl Crawler {
    // My wrapper to get a client
    pub async fn obtain_acc_from_db<'a>(&'a self)
        -> Option<DbClient<'a, CrawlerProdConfig, Oxylabs>> {
        DbClient::<CrawlerProdConfig, Oxylabs>::load_random_from_db(&self.db).await.ok()
    }

    // The problematic task spawning function
    pub async fn start(&self) {
        let mut tasks = JoinSet::new();
        // Assume the loop is hooked onto exit codes
        loop {
            // `client` borrows `self.db` for as long as `self` is borrowed
            let Some(client) = self.obtain_acc_from_db().await else {
                continue;
            };
            // This won't compile with safe Rust: `JoinSet::spawn` requires a
            // `'static` future, but this one captures `self` and `client`,
            // which both borrow from `self`. The compiler can't verify that
            // `self` will outlive the spawned task, even though in practice it will.
            tasks.spawn(async move {
                Self::task(self, client).await;
            });
        }
        // Some task cleanup stuff here
    }
}
The Workaround (Unsafe)
I can make this work using unsafe code:
pub async fn start(&self) {
    let mut tasks = JoinSet::new();
    // Assume the loop is hooked onto exit codes
    loop {
        let Some(client) = self.obtain_acc_from_db().await else {
            continue;
        };
        // Extend both borrows to 'static. The client is moved into the task
        // by value with only its lifetime changed; a raw pointer to the local
        // `client` would dangle once this loop iteration ends.
        // SAFETY: relies on every task finishing before `self` is dropped.
        // Sending `&Crawler` to another task also assumes `Crawler: Sync`.
        let this: &'static Crawler = unsafe { &*(self as *const Crawler) };
        let client = unsafe {
            mem::transmute::<
                DbClient<'_, CrawlerProdConfig, Oxylabs>,
                DbClient<'static, CrawlerProdConfig, Oxylabs>,
            >(client)
        };
        tasks.spawn(async move {
            Self::task(this, client).await;
        });
    }
    // Some task cleanup stuff here
}
This works because I know the invariants that the compiler can't verify:
- The tasks are guaranteed to finish before self is destroyed
- No task will outlive self or access the DB connection after it's gone
Existing Solutions and Their Limitations
- Using Arc: I could change the library to use:
pub struct DbClient<CrawlerClientConfigType, ProxyType>
where
    CrawlerClientConfigType: CrawlerClientConfig,
    ProxyType: Proxy + IntoDb<db::proxies::ActiveModel> + FromDb<db::proxies::Model>,
{
    pub client: Client<CrawlerClientConfigType, ProxyType>,
    pub db: Arc<DatabaseConnection>,
}
But this requires changing library code which isn't always possible.
- Scoped tasks: Libraries like tokio-scoped tried to address this problem, but there are certain limits to this solution.
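To make the Arc option from the list above concrete, here is a minimal sketch of how the ownership would look once the client holds an Arc instead of a reference. It is not the real library: DatabaseConnection, DbClient, obtain_client and the id field are simplified, hypothetical stand-ins, and std::thread::spawn stands in for JoinSet::spawn because it has the same 'static requirement and needs no external crates.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the real `DatabaseConnection`.
struct DatabaseConnection {
    name: String,
}

// Owning variant of the client: no lifetime parameter, so the type is `'static`.
struct DbClient {
    db: Arc<DatabaseConnection>,
    id: usize,
}

struct Crawler {
    db: Arc<DatabaseConnection>,
}

impl Crawler {
    // Hypothetical counterpart of `obtain_acc_from_db`: hands out a client
    // that shares ownership of the connection instead of borrowing it.
    fn obtain_client(&self, id: usize) -> DbClient {
        DbClient { db: Arc::clone(&self.db), id }
    }

    // Spawns `n` workers and returns their results in spawn order.
    fn start(&self, n: usize) -> Vec<String> {
        let handles: Vec<_> = (0..n)
            .map(|id| {
                let client = self.obtain_client(id);
                // `thread::spawn` demands a `'static` closure, like `JoinSet::spawn`
                // demands a `'static` future. The owning client satisfies that.
                thread::spawn(move || format!("task {} used {}", client.id, client.db.name))
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .collect()
    }
}
```

With a connection named main-db, start(2) returns "task 0 used main-db" and "task 1 used main-db". The cost is one reference-count increment per client, and the connection stays alive until the last client is dropped, even if the Crawler itself goes away first.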
The Question
I feel this highlights a gap in Rust's async model. We need a way to express "these async tasks are guaranteed to terminate before the parent object" without resorting to unsafe code.
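For comparison, the synchronous version of this guarantee already exists in the standard library as std::thread::scope: every thread spawned inside the scope is joined before scope returns, so the workers may borrow from self with no unsafe code. The sketch below uses the same kind of simplified, hypothetical stand-ins (DatabaseConnection, DbClient, obtain_client, id) rather than the real library types. As far as I understand, the reason this does not carry over directly to async is that a future can be dropped or leaked at an await point, so an async scope cannot promise that its tasks were joined.

```rust
use std::thread;

// Hypothetical stand-in for the real `DatabaseConnection`.
struct DatabaseConnection {
    name: String,
}

// Borrowing client, shaped like the library's `DbClient<'db, ..>`.
struct DbClient<'db> {
    db: &'db DatabaseConnection,
    id: usize,
}

struct Crawler {
    db: DatabaseConnection,
}

impl Crawler {
    // Hypothetical counterpart of `obtain_acc_from_db`: the client borrows `self.db`.
    fn obtain_client(&self, id: usize) -> DbClient<'_> {
        DbClient { db: &self.db, id }
    }

    // Spawns `n` scoped workers and returns their results in spawn order.
    fn start(&self, n: usize) -> Vec<String> {
        thread::scope(|scope| {
            let handles: Vec<_> = (0..n)
                .map(|id| {
                    let client = self.obtain_client(id);
                    // No `'static` bound here: the scope joins every thread
                    // before it returns, so borrowing from `self` is allowed.
                    scope.spawn(move || format!("task {} used {}", client.id, client.db.name))
                })
                .collect();
            handles
                .into_iter()
                .map(|handle| handle.join().expect("worker panicked"))
                .collect()
        })
    }
}
```

This is the shape I would like to be able to write with async tasks: the borrow checker accepts it because the join is enforced by the scope itself rather than by an invariant only I know about.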
Has anyone found elegant solutions to this problem? Are there any language features planned that might address this issue?
I'd appreciate hearing others' experiences with similar patterns and how you've addressed them in production code.