Learning async: a simple progress bar

I am trying to grok the basic concepts of async (Future, poll, Waker...). To keep it simple, I thought I would avoid tokio and async-std and use only what the std and futures crates provide.

Let's say you have an expensive computation that needs to run for a long time and you'd like to report back the progress of computation, say, at 50, 75, 90 and 100%.

This is me thinking in an async-noob style:

  1. The main thread/executor creates the Future that's responsible for running the heavy computation: decode_covid19_rna()
  2. The Future spawns a new thread to actually run decode_covid19_rna()
  3. The main thread poll()s the Future to get the current progress value: progress_percent
  4. The main thread checks progress_percent and, if it is not 100, waits for a wake call from the Future before going back to step 3.
  5. If progress_percent is 100, terminate everything.

However, reading the (unfinished) async book, I realize that a Future can only tell the main thread to wait (Poll::Pending) or return a value (Poll::Ready(value)), after which it is essentially done (no more polling).

Additionally, the docs say this about polling:

...Instead, the current task is scheduled to be woken up when it's possible to make further progress by polling again.

I am confused about:

  • make further progress! But wasn't heavy computation already making progress toward completion?!
  • I guess the wording (and more likely my English) is the source of confusion: the Future has called wake(), so it's already awake. Why does it need to be scheduled to be woken? Isn't it better to say that the Future is waking up the executor, so that the executor comes back to check on the Future and get some result back?

So, do I have to create a new Future every time I get a non-100 report from poll(), and somehow handle linking the new Future to the already running computation thread to get further progress report?

How would you do it?
Is this type of problem not suitable for async?

I guess a similar question can be framed for a socket that's downloading a file from a client, where we want to see the remaining download size WITHOUT waiting for the Future to fully complete.

"Make further progress" only applies when you (or the underlying code) call .await. Otherwise, a single poll has to run the entire code to completion.

async fn foo() {
  // computation
}

When polling foo, the first poll blocks the worker thread until the computation finishes.

async fn bar() {
  // computation a
  something.await;
  // computation b
}

When polling bar, the first poll blocks the worker thread until computation a finishes. Then something.await is reached; if something is not ready yet, this effectively yields the worker thread, i.e. it tells the runtime to go poll something else.
The second poll of bar resumes from there and blocks until computation b finishes.

Does that make sense? The key is that await is where the worker thread can be yielded. Without an await, your code keeps running until it finishes.
To answer your progress bar question: you need to update some variable and then call await, so the runtime can poll the UI task that reads the variable and draws the progress.
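To make the poll counts for foo and bar concrete, here is a minimal sketch using only std. The names block_on, ThreadWaker and YieldOnce are made up for this example (they are not from tokio or the futures crate), and YieldOnce simply stands in for "something" that is not ready the first time it is polled.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker: waking means unparking the thread that runs `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor. Returns the output and the number of polls.
fn block_on<F: Future>(fut: F) -> (F::Output, usize) {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
        // Sleep until wake() is called (returns at once if it already was).
        thread::park();
    }
}

/// Stand-in for `something`: Pending on the first poll, Ready on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref(); // ask the executor to poll again
        Poll::Pending
    }
}

/// No await inside: the whole body runs during the first poll.
async fn foo() -> u64 {
    (1..=1_000u64).sum()
}

/// One await that is not ready at first: the body is split over two polls.
async fn bar() -> u64 {
    let a: u64 = (1..=500u64).sum(); // computation a
    YieldOnce(false).await;
    let b: u64 = (501..=1_000u64).sum(); // computation b
    a + b
}

fn main() {
    let (x, polls_foo) = block_on(foo());
    let (y, polls_bar) = block_on(bar());
    println!("foo = {x} after {polls_foo} poll(s)");
    println!("bar = {y} after {polls_bar} poll(s)");
}
```

With this toy executor, foo finishes on its first poll, while bar returns Pending once at the await and finishes on the second poll. A real executor would use the gap between those two polls to run other tasks.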


Polling of futures is purely something executors do in order to make progress on async tasks. The poll and Waker mechanisms allow multiple tasks to be scheduled on an executor in a flexible fashion, similar to how an operating system schedules multiple threads.

It is not intended as a mechanism for applications to query progress. Higher-level abstractions exist for that, and most users should never need to interact with poll and wake directly.

E.g. in your example, the task or thread doing the work could send its current progress into a channel, and the task that waits for the result can receive on the other half of the channel to check the progress. That could be an async channel (e.g. futures::channel::mpsc or tokio::sync::mpsc) or a synchronous channel (e.g. from crossbeam).

Since your computation is purely CPU bound it is actually a bad fit for futures, and a synchronous channel would do just fine.
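As a rough sketch of the synchronous-channel approach, the following uses std::sync::mpsc in place of crossbeam so that it needs no external crates. heavy_computation and run_with_progress are hypothetical names, and the loop body is just placeholder work standing in for the real computation.

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for the heavy computation (hypothetical): does `steps` units of
/// work and reports each milestone percentage as soon as it is reached.
fn heavy_computation(steps: u32, progress: mpsc::Sender<u32>) -> u64 {
    let milestones = [50, 75, 90, 100];
    let mut next = 0; // index of the next milestone to report
    let mut acc = 0u64;
    for i in 1..=steps {
        acc = acc.wrapping_add(u64::from(i) * u64::from(i)); // pretend work
        let percent = i * 100 / steps;
        while next < milestones.len() && percent >= milestones[next] {
            // Ignore send errors: the receiver may have gone away.
            let _ = progress.send(milestones[next]);
            next += 1;
        }
    }
    acc
}

/// Runs the computation on a worker thread and collects the reported
/// milestones on the calling thread.
fn run_with_progress(steps: u32) -> (u64, Vec<u32>) {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || heavy_computation(steps, tx));
    let mut seen = Vec::new();
    // Iterating `rx` blocks for each message and ends once the worker
    // has dropped its sender, i.e. when the computation is finished.
    for percent in rx {
        println!("progress: {percent}%");
        seen.push(percent);
    }
    let result = worker.join().expect("worker thread panicked");
    (result, seen)
}

fn main() {
    let (result, seen) = run_with_progress(1_000_000);
    println!("result = {result}, milestones = {seen:?}");
}
```

No Future, poll or Waker is involved here: the main thread just blocks on the receiver, which is all a purely CPU-bound job with a progress bar needs.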


Hi! Async/await is intended for IO-bound code, which is why futures sometimes cannot make progress: they are waiting for some IO operation to complete. If you want to run CPU-bound code, you should probably look more in the direction of the rayon crate, which is intended for that.

As for avoiding Tokio and async-std, well that isn't going to make your code simpler. You need to choose an executor to run your futures — Rust doesn't come with one like you may be used to from other languages. You may want to read this thread that gives some intuition of how futures work and the surrounding ecosystem, and perhaps also this blog post that goes more in-depth on how futures work and what cooperative scheduling is.


Thanks, Alice, for providing useful links and information. (Agreed, the CPU-bound example was not a good one for this question.)

I think I am almost getting there building a mental model of async.
Combining your great description of Future and Wake notification:

  • Is it safe to assume that the external code/thread will only trigger the wake() function of futures and won't touch anything else? I am asking because if there is another thread with access to the futures, why can't the futures use it to do some work while the parent executor is busy with its other children?
  • If the events that are supposed to wake up a future are not part of the mainstream class of events (network/file IO, timers, ...), does one have to implement one's own thread to send the notification for polling? Or do tokio et al. provide utilities for this scenario?
  • Can one think of a task in async Rust as a collection of some futures plus an executor ctx (which has information about the current thread running the task)?

By the way, Philipp Oppermann's blog series is an amazing source of well-written information. Just became a sponsor :)

Sure, external code can totally do the work. Nothing wrong with that. In fact, often when you have some CPU bound tasks in an otherwise IO bound codebase, one nice approach is to spawn the task on rayon and use a channel to send the result back.

And yes, you can spawn your own thread and have it send off the notifications if the usual notifiers don't fit your use case.
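Here is a minimal sketch of that pattern with std only: a helper thread does the work, stores the result in shared state, and calls wake() on the waker the future saved during poll. Computation, Shared, ThreadWaker and block_on are hypothetical names for this example, and the sleep stands in for real work.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

/// State shared between the future and the helper thread.
#[derive(Default)]
struct Shared {
    result: Option<u32>,
    waker: Option<Waker>,
}

/// Future that becomes ready once the helper thread has stored a result.
struct Computation(Arc<Mutex<Shared>>);

impl Computation {
    fn start() -> Self {
        let shared = Arc::new(Mutex::new(Shared::default()));
        let for_thread = Arc::clone(&shared);
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(50)); // pretend work
            let mut s = for_thread.lock().unwrap();
            s.result = Some(42);
            // Tell the executor that polling again is now worthwhile.
            if let Some(w) = s.waker.take() {
                w.wake();
            }
        });
        Computation(shared)
    }
}

impl Future for Computation {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let mut s = self.0.lock().unwrap();
        match s.result.take() {
            Some(v) => Poll::Ready(v),
            None => {
                // Keep the most recent waker. An executor may poll without
                // a wake-up, so this branch must be safe to run repeatedly.
                s.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Hypothetical waker: waking means unparking the executor thread.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor, just enough to drive the example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        thread::park(); // may also return spuriously; we just poll again
    }
}

fn main() {
    println!("result = {}", block_on(Computation::start()));
}
```

Both the result check and the waker update happen under the same mutex, so the helper thread cannot finish in the gap between poll seeing "no result yet" and storing the waker.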

A task is indeed some collection of futures in the sense that you spawn some top level future that internally awaits some other futures that internally await some other futures and so on. That collection of recursively awaited futures is a task.


Thank you again,
So the task doesn't know (or need to know) about the thread it's currently running on (unless it looks at ctx?!)

I mean, it can ask for the thread id while it's being polled? The executor might move the future between threads though.

Yes, something along those lines... for example, the Future could check whether the current thread is the helper thread responsible for calling wake() or the actual executor that's calling poll().

There's no way to know who woke you up, and the executor is allowed to poll a future even if no wakeups have been emitted.
