Cancel safety in async and tokio::select!{}

I want to make sure that I understand the basic idea behind cancel safety & async.

Assuming something along the lines of:

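loop {
  tokio::select!{
    msg = buffered_stream.read_msg() => {
      // handle the complete message here
    }
    // Poll by reference so the future is not moved on the first iteration
    // (assumes wait_for_kill is pinned / Unpin).
    _ = &mut wait_for_kill => {
      break;
    }
  }
}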

Just looking at this select! example, what exact conditions are required for it to be cancel safe (or not)?

My assumption is that read_msg() is cancel safe if and only if its Future does not store any of the data from its underlying transport in itself. (I.e. it's safe if it only extracts data from its underlying transport when a complete message can be returned immediately using Poll::Ready.)

Is cancel non-safety the effect of a partial buffer being extracted and stored in the Future returned by read_msg(), after which wait_for_kill sneaks in and yields Poll::Ready?

Is there more to it, or is that the basic gist of it?

That is the basic gist of it.

The underlying issue with cancellation safety is that all futures in the select! other than the one that yields Poll::Ready will be dropped. If one of those futures owns state that represents progress (for example, bytes already pulled from the transport), then it is not cancellation safe, since that owned state is dropped along with the future when another future returns Poll::Ready.
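
To make the dropped-state problem concrete, here is a minimal sketch using only the standard library. Transport, ReadMsgOwned, NoopWake and bytes_lost_on_cancel are hypothetical names invented for this illustration, not Tokio APIs. The future is polled by hand once and then dropped, which is roughly what select! does to a branch that loses the race.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical transport: hands out one chunk of bytes per poll.
struct Transport {
    chunks: VecDeque<Vec<u8>>,
}

/// NOT cancel safe: bytes taken from the transport are stored in the future.
struct ReadMsgOwned<'a> {
    transport: &'a mut Transport,
    partial: Vec<u8>, // owned progress, lost if the future is dropped
    len: usize,       // length of a complete message
}

impl Future for ReadMsgOwned<'_> {
    type Output = Vec<u8>;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Vec<u8>> {
        let this = self.get_mut(); // fine: every field is Unpin
        if let Some(chunk) = this.transport.chunks.pop_front() {
            this.partial.extend(chunk);
        }
        if this.partial.len() >= this.len {
            Poll::Ready(std::mem::take(&mut this.partial))
        } else {
            // A real future would register the waker with the transport here.
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls once, drops the future, and reports how many bytes vanished.
fn bytes_lost_on_cancel() -> usize {
    let mut transport = Transport {
        chunks: VecDeque::from([b"he".to_vec(), b"llo".to_vec()]),
    };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    {
        let mut fut = ReadMsgOwned { transport: &mut transport, partial: Vec::new(), len: 5 };
        // The first poll moves "he" into the future and returns Pending.
        assert!(Pin::new(&mut fut).poll(&mut cx).is_pending());
        // `fut` is dropped here, as select! would drop a losing branch.
    }
    let remaining: usize = transport.chunks.iter().map(Vec::len).sum();
    5 - remaining
}

fn main() {
    println!("bytes lost: {}", bytes_lost_on_cancel());
}
```

In this sketch the two bytes read by the first poll are gone for good: they are in neither the transport nor any future, so the next read_msg() would start in the middle of a message.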

The easy way to avoid this is for the futures in the select! to not own state, but to borrow it from elsewhere; for example, if buffered_stream owns all the state, and the future returned by read_msg() mutates buffered_stream instead of storing its own state, you're cancellation-safe. This happens with recv on Tokio channels, since all the state is in the channel, and not in the future returned by recv().
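
The borrowing approach can be sketched the same way, again with only the standard library. BufferedStream, ReadMsg, NoopWake and read_after_cancel are hypothetical stand-ins written for this example; this is not how Tokio implements its readers or channels. The partial buffer lives in the stream, and the future only holds a mutable borrow of it.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stream that owns both the transport and the partial buffer.
struct BufferedStream {
    chunks: VecDeque<Vec<u8>>, // stand-in transport, one chunk per poll
    buf: Vec<u8>,              // partial message lives here, not in the future
}

/// Cancel safe: holds only a borrow of the stream plus the wanted length.
struct ReadMsg<'a> {
    stream: &'a mut BufferedStream,
    len: usize,
}

impl BufferedStream {
    fn read_msg(&mut self, len: usize) -> ReadMsg<'_> {
        ReadMsg { stream: self, len }
    }
}

impl Future for ReadMsg<'_> {
    type Output = Vec<u8>;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Vec<u8>> {
        let this = self.get_mut(); // fine: every field is Unpin
        let len = this.len;
        if let Some(chunk) = this.stream.chunks.pop_front() {
            this.stream.buf.extend(chunk); // progress is stored in the stream
        }
        if this.stream.buf.len() >= len {
            Poll::Ready(this.stream.buf.drain(..len).collect())
        } else {
            // A real future would register the waker with the transport here.
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Cancels one read after a partial poll, then reads again from scratch.
fn read_after_cancel() -> Vec<u8> {
    let mut stream = BufferedStream {
        chunks: VecDeque::from([b"he".to_vec(), b"llo".to_vec()]),
        buf: Vec::new(),
    };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    {
        let mut fut = stream.read_msg(5);
        // The first poll buffers "he" in the stream and returns Pending.
        assert!(Pin::new(&mut fut).poll(&mut cx).is_pending());
        // Dropping `fut` here loses nothing: "he" stays in stream.buf.
    }
    let mut fut = stream.read_msg(5);
    match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(msg) => msg,
        Poll::Pending => panic!("expected a complete message"),
    }
}

fn main() {
    let msg = read_after_cancel();
    println!("{}", String::from_utf8_lossy(&msg));
}
```

Here the second read_msg() picks up where the cancelled one left off and returns the complete message, because nothing that counts as progress was stored in the dropped future.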

Cancel safety is mostly not relevant for kill signals, because you don't care if you lose some data from the connection when you're killing it anyway.

I've collected a bunch of materials about async cancellation, but haven't had time to fully digest them:

Good point. The thing I'm actually doing is handling a read, a kill and receiving a translated SIGHUP-to-"reload configuration" message, but I wanted something more compact as an illustration. Going with the "kill" rather than the "reload" in this context was clearly not my brightest move.
