Understanding tokio::time::timeout behaviour

As the docs say:

Note that the timeout is checked before polling the future, so if the future does not yield during execution then it is possible for the future to complete and exceed the timeout without returning an error.
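To make the quoted note concrete, here is a minimal std-only sketch. `ToyTimeout`, `NoopWake`, `blocks_without_yielding` and `run_demo` are hypothetical names made up for this illustration, and the wrapper is a toy model that follows the order the docs describe (check the clock, then poll). It is not tokio's actual implementation.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a timeout wrapper (NOT tokio's implementation):
/// it looks at the clock first, then polls the wrapped future.
struct ToyTimeout<F> {
    inner: Pin<Box<F>>,
    deadline: Instant,
}

impl<F: Future> Future for ToyTimeout<F> {
    type Output = Result<F::Output, &'static str>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if Instant::now() >= self.deadline {
            return Poll::Ready(Err("elapsed"));
        }
        // Once this call starts, nothing interrupts it until it returns.
        self.inner.as_mut().poll(cx).map(Ok)
    }
}

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// A future that never yields: it blocks the thread inside a single poll.
async fn blocks_without_yielding() -> u32 {
    thread::sleep(Duration::from_millis(100)); // blocking call, no .await
    7
}

/// Polls the wrapped future once and reports the result and the time taken.
fn run_demo() -> (Result<u32, &'static str>, Duration) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let start = Instant::now();
    let mut fut = ToyTimeout {
        inner: Box::pin(blocks_without_yielding()),
        deadline: start + Duration::from_millis(20),
    };
    match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(result) => (result, start.elapsed()),
        Poll::Pending => unreachable!("the inner future finishes in a single poll"),
    }
}

fn main() {
    let (result, took) = run_demo();
    println!("{result:?} after {took:?}");
}
```

The single poll starts before the 20 ms deadline and only returns after the 100 ms blocking sleep, so the wrapper hands back `Ok(7)` even though the deadline passed long ago.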

Since the timeout is only checked before polling, is it also possible that the timeout's task isn't polled at all because other long-running tasks are not yielding? And when the timeout task's turn actually comes an error is returned due to timing out, even when the internal future completed within the required duration?

Thanks in advance

And when the timeout task's turn actually comes an error is returned due to timing out, …

This could happen, but:

… even when the internal future completed within the required duration?

this is impossible by definition, because the internal future hasn't yet been polled. Futures only make progress when polled; "the future completes while it is not being polled" is a nonsensical statement.

Now, something like a channel message, or an asynchronous read operation, might happen while the future is not being polled due to the operation of another thread, but that doesn't count. The only effects of that happening are that the future's waker is invoked, telling the executor that the future is ready to be polled; the future itself is not running.

So, yes, a Timeout future might return a timeout even if the inner future would have returned Poll::Ready if it was polled, but that is not possible to know without actually polling the future.

Systems with concurrent activities will always have potential race conditions. You can perhaps shorten the window of “I might lose a value that was ready” by making a timeout future that checks after polling instead of before, but that's not the same thing as eliminating it; you would still have the case where a message is dropped because it arrived between the time when the receiving future returned Poll::Pending and when the timeout was checked.
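The two orderings discussed above can be compared in a small std-only sketch. Everything here (`ToyTimeout`, `Order`, `NoopWake`, `starved_poll`) is a hypothetical name for illustration, and the wrapper is a toy model, not tokio's real `Timeout`, which may order these steps differently. The `thread::sleep` stands in for other tasks hogging the executor before the timeout future gets its first poll.

```rust
use std::future::{self, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::{Duration, Instant};

/// Which step the toy wrapper performs first (hypothetical, for illustration).
#[derive(Clone, Copy)]
enum Order {
    CheckThenPoll,
    PollThenCheck,
}

struct ToyTimeout<F> {
    inner: Pin<Box<F>>,
    deadline: Instant,
    order: Order,
}

impl<F: Future> Future for ToyTimeout<F> {
    type Output = Result<F::Output, &'static str>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let order = self.order;
        let deadline = self.deadline;
        match order {
            Order::CheckThenPoll => {
                if Instant::now() >= deadline {
                    // The inner future is never asked whether it is ready.
                    return Poll::Ready(Err("elapsed"));
                }
                self.inner.as_mut().poll(cx).map(Ok)
            }
            Order::PollThenCheck => {
                if let Poll::Ready(value) = self.inner.as_mut().poll(cx) {
                    return Poll::Ready(Ok(value));
                }
                // A value arriving from here on is still missed if the
                // deadline check below fires: the window is smaller, not gone.
                if Instant::now() >= deadline {
                    Poll::Ready(Err("elapsed"))
                } else {
                    // A real implementation would also register a timer wake-up.
                    Poll::Pending
                }
            }
        }
    }
}

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Delays the first poll past the deadline, then polls exactly once.
fn starved_poll(order: Order) -> Result<u32, &'static str> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = ToyTimeout {
        inner: Box::pin(future::ready(7_u32)), // would be Ready on its first poll
        deadline: Instant::now() + Duration::from_millis(20),
        order,
    };
    thread::sleep(Duration::from_millis(50)); // "other tasks" hog the executor
    match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(result) => result,
        Poll::Pending => unreachable!("both orders resolve on this poll"),
    }
}

fn main() {
    println!("check-then-poll: {:?}", starved_poll(Order::CheckThenPoll));
    println!("poll-then-check: {:?}", starved_poll(Order::PollThenCheck));
}
```

With the check first, the starved wrapper reports `Err("elapsed")` without ever polling the ready inner future. With the poll first, the same situation yields `Ok(7)`, but a value that shows up between the inner `Poll::Pending` and the deadline check is still lost.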

In practice, the way you get good behavior here is by treating “long running tasks which are not yielding” as already a problem: fix them so they yield more, or put them on dedicated thread pools (whether async or simple blocking) which aren't shared with tasks that have shorter polling cycles and short timeouts. This isn't just about timeouts — if you want your async concurrency to have low latency, then you need to make sure that there's room for tasks to get polled quickly.
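As a rough illustration of "fix them so they yield more", the sketch below runs the same chunked work with and without a yield point between chunks. It uses only std; `ToyTimeout`, `YieldOnce`, `NoopWake`, `work` and `run` are hypothetical names, the wrapper is a toy check-then-poll model rather than tokio's implementation, and the busy-polling loop is a stand-in for a real executor. In a tokio program the equivalent tools would be `tokio::task::yield_now` for yielding and `tokio::task::spawn_blocking` for moving blocking work off the async worker threads.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::{Duration, Instant};

/// Toy check-then-poll timeout wrapper (NOT tokio's implementation).
struct ToyTimeout<F> {
    inner: Pin<Box<F>>,
    deadline: Instant,
}

impl<F: Future> Future for ToyTimeout<F> {
    type Output = Result<F::Output, &'static str>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if Instant::now() >= self.deadline {
            return Poll::Ready(Err("elapsed"));
        }
        self.inner.as_mut().poll(cx).map(Ok)
    }
}

/// Hand-rolled yield point: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref(); // ask to be polled again soon
        Poll::Pending
    }
}

/// Waker that does nothing; the driver below simply polls in a loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Five 20 ms slices of "work", optionally yielding after each slice.
async fn work(chunks: u32, cooperative: bool) -> u32 {
    let mut done = 0;
    for _ in 0..chunks {
        thread::sleep(Duration::from_millis(20)); // stand-in for CPU work
        done += 1;
        if cooperative {
            YieldOnce { yielded: false }.await;
        }
    }
    done
}

/// Drives the wrapped work to completion with a 50 ms deadline.
fn run(cooperative: bool) -> Result<u32, &'static str> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = ToyTimeout {
        inner: Box::pin(work(5, cooperative)),
        deadline: Instant::now() + Duration::from_millis(50),
    };
    loop {
        if let Poll::Ready(result) = Pin::new(&mut fut).poll(&mut cx) {
            return result;
        }
    }
}

fn main() {
    println!("non-yielding: {:?}", run(false));
    println!("yielding:     {:?}", run(true));
}
```

Without yield points the whole 100 ms of work happens inside one poll, so the 50 ms deadline is never observed and the result is `Ok(5)`. With a yield after each slice, the wrapper gets a chance to look at the clock between slices and returns `Err("elapsed")` once the deadline has passed.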


This makes sense, thanks for the clarity @kpreid