Async: How to Explicitly "yield"?

If I have async code that is doing a computation, and it might occasionally be handed work that will run a bit long, what is the best way to explicitly yield, to make sure other code isn't being starved of CPU time?

Do a very short sleep()?

It seems there must be a better way, something that is essentially zero-cost if no other code needs to execute, but politely yields the CPU if it does. I know I could spawn a thread and do the heavy lifting work there, but if this is only occasionally heavy lifting, that seems silly.

(No, I don't have this problem, I am just trying to understand this stuff.)

Thanks,

-kb

Tokio has yield_now for that.
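For anyone curious what such a call does under the hood, here is a minimal std-only sketch, not Tokio's actual implementation. With Tokio itself the call is simply tokio::task::yield_now().await. Everything named below (YieldNow, the local yield_now(), NoopWake, run_round_robin, worker, interleaving_demo) is a hypothetical stand-in made up for illustration. The idea is that the future returns Pending exactly once, after asking to be polled again, so the executor gets a chance to run something else in between.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a runtime's `yield_now()`: pending once, then ready.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        // Ask to be polled again, then hand control back to the executor.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

fn yield_now() -> YieldNow {
    YieldNow { yielded: false }
}

/// Waker that does nothing; the toy executor below re-polls every task anyway.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Toy single-threaded executor (an assumption for illustration, not a real
/// runtime): polls tasks in turn, re-queueing any that return `Pending`.
fn run_round_robin(tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut queue: VecDeque<Task> = tasks.into();
    while let Some(mut task) = queue.pop_front() {
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task);
        }
    }
}

/// Logs one entry per step and yields after each step.
async fn worker(name: &'static str, steps: u32, log: Rc<RefCell<Vec<String>>>) {
    for i in 0..steps {
        log.borrow_mut().push(format!("{name}{i}"));
        yield_now().await;
    }
}

fn interleaving_demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let a: Task = Box::pin(worker("A", 3, Rc::clone(&log)));
    let b: Task = Box::pin(worker("B", 2, Rc::clone(&log)));
    run_round_robin(vec![a, b]);
    let entries = log.borrow().clone();
    entries
}

fn main() {
    // Because each worker yields after every step, the two logs interleave.
    println!("{:?}", interleaving_demo());
}
```

With this toy executor the log comes out as A0, B0, A1, B1, A2: each yield lets the other task take a turn. A real runtime schedules differently, so the exact order there is not guaranteed.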

Alternatively, you could move the CPU-heavy work onto a separate task or thread and await its result, for example with Tokio's spawn_blocking().

Which exact solutions are available does depend on which async runtime you use.

Cool. That's exactly what I wanted to know!

Searching for other runtimes, it looks like async-std also has a yield_now(), and I see smol has a version, too. If I get into embedded async there must be a way to do it down there, too, but I am not doing that any time soon.

-kb

P.S. I have to get better at searching the Rust docs. I swear I searched for "yield" in the Tokio docs; I was even typing asterisks, wondering whether I needed a wildcard when I didn't know the exact name, and I still didn't find yield_now(). I do not know why I couldn't find it.

...this might actually be a hint that this computation should not be async (that is, it should actually run on a separate thread, maybe in a threadpool to avoid spawning too many threads). Not always, but in some cases it is. You guessed correctly that this will be problematic in an async context. The rule of thumb is "don't spend too long without an await".

Exactly: "don't spend too long without an await", which is why, if I have code that isn't naturally hitting an await, I need to be sure I do something that makes one happen regularly. (Or put it in a separate thread.)

-kb, the Kent who is still learning about async.

Oh this thread is interesting.

Should I generously sprinkle in some yield_now() calls in order to break up a computation that may be a little bit too long, but that I don't want to put inside a spawn_blocking because of the Send problem?

Sprinkle? Yes. Generously? Maybe not. Every yield_now() does slow down your computation, because you're asking the runtime to try to do something else, and it has to check what else there is to do, which there might not be.

Ideally you'd do it only when an appropriate amount of time has elapsed, but checking the clock has its own cost. So, the least-overhead option is to be very clever about exactly when you call the yield_now()s, studying the behavior with profiling to tune how often you yield. On the other hand, maybe you care much more about overall responsiveness than throughput of this task, in which case, yes, use yield_now() generously.
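One cheap way to avoid both the clock check and a yield on every iteration is to yield once per fixed number of iterations. The sketch below is only an illustration under stated assumptions: YieldNow, the local yield_now(), NoopWake, block_on, sum_with_yields and run_sum are all hypothetical names, and the toy block_on stands in for a real runtime. In real code the yield would be the runtime's own yield_now().await, and the interval would be tuned by profiling rather than picked out of the air.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a runtime's `yield_now()`: pending once, then ready.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

fn yield_now() -> YieldNow {
    YieldNow { yielded: false }
}

/// Waker that does nothing; `block_on` below simply polls again.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (illustration only): polls one future until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Sums 0..items, yielding once per `yield_every` iterations.
/// Returns (sum, number of yields). `yield_every` must be non-zero.
async fn sum_with_yields(items: u64, yield_every: u64) -> (u64, u64) {
    let mut sum = 0u64;
    let mut yields = 0u64;
    for i in 0..items {
        sum += i;
        // A counter check is far cheaper than reading the clock each time.
        if (i + 1) % yield_every == 0 {
            yields += 1;
            yield_now().await;
        }
    }
    (sum, yields)
}

fn run_sum(items: u64, yield_every: u64) -> (u64, u64) {
    block_on(sum_with_yields(items, yield_every))
}

fn main() {
    let (sum, yields) = run_sum(1_000, 100);
    println!("sum = {sum}, yields = {yields}");
}
```

Raising the interval trades responsiveness for throughput; lowering it does the opposite. The result of the computation is the same either way.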

Also remember that every yield_now().await is another await point, which means the future has to store whatever state was held across the await, so the future may become larger.
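The size effect is easy to observe with std alone. This is a small sketch, not a benchmark: held_across_await, dropped_before_await, held_size and scoped_size are hypothetical names, and std::future::ready(()) merely stands in for any await point such as yield_now().await. Exact sizes depend on the compiler, so the code only compares them.

```rust
use std::future::ready;
use std::mem::size_of_val;

/// `buf` is used after the await, so it has to be stored inside the future.
async fn held_across_await() -> usize {
    let buf = [1u8; 1024];
    ready(()).await;
    buf.iter().map(|&b| b as usize).sum()
}

/// `buf` goes out of scope before the await; only `total` needs to be kept.
async fn dropped_before_await() -> usize {
    let total: usize = {
        let buf = [1u8; 1024];
        let sum = buf.iter().map(|&b| b as usize).sum();
        sum
    };
    ready(()).await;
    total
}

/// Size in bytes of the (never polled) future that keeps the buffer alive.
fn held_size() -> usize {
    size_of_val(&held_across_await())
}

/// Size in bytes of the (never polled) future that drops the buffer early.
fn scoped_size() -> usize {
    size_of_val(&dropped_before_await())
}

fn main() {
    println!("buffer held across await: {} bytes", held_size());
    println!("buffer dropped before await: {} bytes", scoped_size());
}
```

So when adding an await point in the middle of a computation, it can be worth checking which large locals are still alive at that point.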

The way I see it an async run-time can juggle large numbers of async tasks on a single thread on a single core. In that situation if one has a compute intensive task to do that will take a long time, it's better to farm that work out to a new sync thread where it cannot block the async run-time.

On a multi-core machine the sync thread doing that compute work can now run on its own core and make use of the performance the machine has to offer.
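A minimal std-only sketch of farming the work out to a plain thread might look like the following. The names expensive_checksum and checksum_on_thread are hypothetical, and the checksum is just a placeholder for real CPU-bound work. In Tokio, spawn_blocking() plays a similar role and returns a handle that can be awaited instead of joined. Note that the closure and its result must be Send to cross the thread boundary, which is exactly the "Send problem" raised earlier in the thread.

```rust
use std::thread;

/// Hypothetical stand-in for an expensive, purely CPU-bound computation.
fn expensive_checksum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(u64::from(b)))
}

/// Runs the computation on a dedicated OS thread and waits for the result.
fn checksum_on_thread(data: Vec<u8>) -> u64 {
    // `data` is moved into the closure, so it must be `Send`.
    let handle = thread::spawn(move || expensive_checksum(&data));
    // An async task would await a handle here instead of blocking in join().
    handle.join().expect("worker thread panicked")
}

fn main() {
    let data = vec![7u8; 1_000_000];
    println!("checksum = {}", checksum_on_thread(data));
}
```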

Things get muddied by the fact that async run-times like Tokio can spread their async tasks over multiple threads and hence multiple cores.

Still, I'm kind of allergic to sprinkling yields around a long-lived, blocking process. It just feels like a kludge.

@arifd,

It sounds like "sprinkle" is almost always a bad idea, but if code sometimes (always?) takes a long time, maybe there is just one place (or two) that will regularly get executed in the long-loop case and would help responsiveness if a yield_now() were put there. Finding that best spot might be tricky. What if there is a spot in an outer loop that will be reached regularly but always one time too many (that is, in a minimal-work case it gets reached once when zero is the right number)? That might still be the right place, as long as that wasted call doesn't happen often.

Interesting stuff to think about.

The questions that come to my mind:

  • How long might this function run? (Always that long? Sometimes?)
  • Who cares if it runs too long?
  • How long is too long?
  • How short is short enough?
  • How much waste (calling yield_now() unnecessarily) is okay in exchange for greater responsiveness?
  • How do these answers shift if later running on slower or faster hardware?
  • What measurements can you make to verify?
  • Is your data really doomed to be not Send?

It certainly seems you should resist writing new code that decides whether to call yield_now(); it is easy to spend more than you save that way. Better if you can be Rust-like and put effort in up front, thinking about where there is already a right place for a yield_now(), a place that will naturally be reached with appropriate regularity.

Definitely interesting stuff; I understand things better than I did. I expect I will never have reason to call yield_now(), at least not outside of experimenting or maybe debugging things.

Thanks,

-kb

P.S. I like that the call will look like yield_now().await, to remind us that yielding will go through the same mechanisms as any other async call, and that in a pointless no-op case it is still not free.
