Ensuring async tasks run to completion

Hello! I work on the mongodb crate, which is the official MongoDB driver for Rust. We've recently converted it to be async (with a beta release coming soon!). One complication we've run into is that we often need to update internal state while performing a user-specified operation, and that work can be canceled at any time if the user stops polling the future for some reason. For example, if a user executes an insert operation and we create a new connection to perform the operation on, the user could stop polling the future returned by the insert method while the connection is being authenticated. That leaves the connection with a partially completed handshake, making it undesirable to check it into the internal connection pool.

While we could handle each of these cases separately (e.g. dropping a connection rather than checking it into the pool unless we mark it as having completed authentication), this relies on us noticing every place in the library where we perform work that spans an await and ensuring that the state is properly reset if that work doesn't finish. A better solution would be to execute all user operations in background tasks and then await the result of those tasks, which means the runtime will keep running each operation to completion even if the user does not poll it to completion.

However, this is tricky because spawning tasks in both tokio and async-std (the two runtimes we support) requires that the futures being executed are 'static. That won't always be the case, since our API accepts references to user-owned options in certain places, meaning we can't move them into background tasks. From what I understand, the reason that task::spawn requires the 'static lifetime on futures is that the join handles won't necessarily be joined (meaning the background task could last indefinitely). Assuming this is the case, the 'static lifetime wouldn't be necessary on a spawn variant where the join handle is guaranteed to be joined. Does a spawn variant like this exist somewhere?
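To make the cancellation problem concrete, here is a minimal sketch using only the standard library. The `Handshake` enum, `YieldOnce`, `authenticate`, and `poll_n_then_drop` are all hypothetical names invented for illustration and are not part of the mongodb crate. `YieldOnce` stands in for a network round trip, and the future is polled by hand so that it can be dropped partway through.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical handshake states, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Handshake {
    NotStarted,
    InProgress,
    Complete,
}

/// Returns `Pending` on the first poll and `Ready` on the second,
/// standing in for a single network round trip.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Updates shared state on both sides of an await point.
async fn authenticate(state: Rc<Cell<Handshake>>) {
    state.set(Handshake::InProgress);
    YieldOnce(false).await; // dropping the future here skips the line below
    state.set(Handshake::Complete);
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `authenticate` at most `polls` times, drops it, and reports
/// the state it left behind.
fn poll_n_then_drop(polls: usize) -> Handshake {
    let state = Rc::new(Cell::new(Handshake::NotStarted));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(authenticate(state.clone()));
    for _ in 0..polls {
        if fut.as_mut().poll(&mut cx).is_ready() {
            break;
        }
    }
    drop(fut); // the caller "stops polling"
    state.get()
}

fn main() {
    for polls in 0..3 {
        println!("{} poll(s): {:?}", polls, poll_n_then_drop(polls));
    }
}
```

Dropping the future after a single poll leaves the state at `InProgress`, which is the half-finished handshake described above.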


Currently there are no real solutions to non-'static spawning. This is because for this to be sound, the task that performed the spawn must stay valid for the duration of the spawned task, but the task might get cancelled at any time, invalidating any references in the spawned task.
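For comparison, blocking threads do have a "guaranteed join" spawn in the standard library: `std::thread::scope`. It can accept borrowed data because the calling thread blocks inside `scope` until every spawned thread has finished. The sketch below (with a hypothetical `total_len` helper) shows that synchronous case only; an async task cannot give the same guarantee, since its future may be dropped at any await point.

```rust
use std::thread;

/// Sums the lengths of borrowed strings on a scoped thread.
/// `options` is only borrowed, so the closure is not 'static.
fn total_len(options: &[String]) -> usize {
    thread::scope(|s| {
        let handle = s.spawn(|| options.iter().map(|o| o.len()).sum::<usize>());
        // `scope` would join automatically on exit; joining here
        // lets us return the thread's result.
        handle.join().unwrap()
    })
}

fn main() {
    let options = vec!["ordered".to_string(), "bypass".to_string()];
    println!("total length: {}", total_len(&options));
}
```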

I thought that might be the case, although I was hoping I might be wrong. Thanks for the quick reply!

I imagine that some operations such as the handshake could be turned into 'static futures without expensive copies. Perhaps an approach like that would be possible?

Yes, we'd probably be able to do the handshake as a 'static future, and many of the other places as well. We were hoping there might be a more holistic solution though, e.g. wrapping the entirety of our execute_operation function (which contains all of the await calls that occur in user operations) in a single background task, rather than trying to do it in several different places in the lower levels of the driver.

I'd imagine that others have run into the same general issue before; do you know how other libraries deal with the fact that users might stop polling the futures they return at any time?

I'm not directly aware of any. I would look into existing async database pooling libraries such as bb8, and perhaps even sync ones such as r2d2. I've forwarded your question to Tokio's discord server.

I don't know if this will work with Mongo's protocol as well, but rust-postgres deals with this by having separate Client and Connection objects. The Connection is the thing that owns the actual socket and performs all of the IO with the server. It's typically just spawned off in the background. A user interacts with the Client, which serializes the request to the Postgres wire format and ships that over to the Connection via a channel. Even if the user gives up on the future halfway through, the Connection stays alive and is able to keep on processing.

More tricky stateful things like transactions are handled via RAII - the transaction object sends a ROLLBACK request to the connection if it drops without having been committed. Since the connection handles everything from there, we don't need to worry about not being able to run async code in a destructor.
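The shape of that design can be sketched with the standard library alone. This is not rust-postgres code: the `Request` enum, `connect`, `Client`, and `Transaction` below are hypothetical stand-ins, a plain thread plays the role of the spawned Connection task, and `std::sync::mpsc` replaces the async channel. The point is only that the handle the user holds owns nothing but a channel sender, and that `Drop` can request a rollback without running any async code.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical requests; a real driver would send serialized wire messages.
#[derive(Debug, PartialEq)]
enum Request {
    Begin,
    Execute(String),
    Commit,
    Rollback,
}

/// User-facing handle: holds only the sending half of the channel.
struct Client {
    tx: Sender<Request>,
}

/// Starts the background "connection" and returns the client plus a
/// handle that yields the log of requests the connection processed.
fn connect() -> (Client, JoinHandle<Vec<Request>>) {
    let (tx, rx) = mpsc::channel();
    let connection = thread::spawn(move || {
        let mut log = Vec::new();
        // Keeps processing until every Sender has been dropped.
        for request in rx {
            log.push(request);
        }
        log
    });
    (Client { tx }, connection)
}

impl Client {
    fn transaction(&self) -> Transaction {
        let _ = self.tx.send(Request::Begin);
        Transaction { tx: self.tx.clone(), done: false }
    }
}

struct Transaction {
    tx: Sender<Request>,
    done: bool,
}

impl Transaction {
    fn execute(&self, statement: &str) {
        let _ = self.tx.send(Request::Execute(statement.to_string()));
    }

    fn commit(mut self) {
        self.done = true;
        let _ = self.tx.send(Request::Commit);
    }
}

impl Drop for Transaction {
    fn drop(&mut self) {
        // RAII cleanup: an uncommitted transaction asks for a rollback.
        if !self.done {
            let _ = self.tx.send(Request::Rollback);
        }
    }
}

/// Abandons one transaction, commits another, and returns what the
/// connection saw.
fn run_demo() -> Vec<Request> {
    let (client, connection) = connect();
    {
        let txn = client.transaction();
        txn.execute("INSERT 1");
        // `txn` is dropped here without a commit.
    }
    let txn = client.transaction();
    txn.execute("INSERT 2");
    txn.commit();
    drop(client); // last Sender gone, so the connection loop ends
    connection.join().unwrap()
}

fn main() {
    for request in run_demo() {
        println!("{:?}", request);
    }
}
```

Because the cleanup is just a non-blocking channel send, it is safe to do from a destructor, and the connection side remains the only place that touches the socket.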

