A more ergonomic async?

In my understanding, asynchronous means two things:

  1. running a job in another thread, and
  2. getting back a stub, by which you can check the status of the job.
    In Rust, the stub corresponds to a Future.
    It seems that async in Rust only does the second part. If async also did the first part, we could write code like this:
fn test() {
  let x = async {
    println!("one thing");
    1
  };
  let y = async || {
    println!("another thing");
  };
  async fn z() -> u8 {
    println!("third thing");
    3
  }
  let y = y();
  let z = z();
  // check for status
  x.poll(...);  // pseudo code
  // or decide to wait until completed
  assert_eq!(x.await, 1);  // should allow await in non-async fn (currently not allowed)
}

All we need is to let async do part 1 at the same time.

async has nothing to do with running a job in another thread (or even with threads at all). You can do that, but it's not limited to that. For example, tokio's current-thread runtime uses only one thread; you have to opt into the multi-threaded runtime to use multiple threads. One of the biggest uses of async is async I/O. This basically means that a thread doesn't "block" on an IO operation. An expensive but async IO operation can be polled multiple times until completion by the runtime, while it polls other async "tasks".
This video by Steve Klabnik gives a good overview of what async is all about.
This video by Jon Gjengset gives a Rusty technical intro to async-await.

As for your actual question, I think the above videos and explanation should give you a good idea of why async in Rust is the way it is. The thing you want to achieve (as far as I understood) is already accomplished by an async runtime, such as tokio. If you want to leverage multiple threads (because you have multiple cores in your machine), you can configure tokio to do so as well.




Brilliant talk. A must for anyone interested in concurrent programming.


In my understanding, the big issue goroutines address is the overhead of a huge number of threads (by multiplexing many goroutines onto a limited number of threads). But we can address this by means of a thread pool. And it's interesting that goroutines do just the first part.

No. The whole point of async is to avoid the overheads of actually using threads.

Imagine code that looks like, in pseudo code:

do forever 
    wait for some data from some source to process
    process the data
    make a response with the result

Further imagine that "wait for some data" takes a significant amount of time, perhaps a very long time. It blocks execution until data arrives. And that there are many instances of this loop running that accept data from different sources.

Doing this without threads is not going to work well. Once a wait is hit, all processing stops until the data from that source arrives, even if there is data available from other sources in the meantime. Your processor will spend much of its time idle, just waiting.

Traditional threads solve this problem by swapping the processor's execution time among the many instances of loops like that. That way, if any source has data available, the thread that is waiting on it can proceed to do work fairly quickly.

But threads have problems. An execution "context" has to be maintained for each thread: its stack, its processor register state, etc. All this eats memory and takes time to constantly swap around as different threads get their share of CPU time.

Enter "async". Async code can have loops in it like the above, those loops can wait on data as long as they like.The compiler generates the required code to do the swapping around of processor execution. It's all done without threads and their overheads. Kind of how you might write a bunch of state machines to implement those loops if you had to get the job done on a system with no threads. The messy code to do all that is generated by the compiler instead.

Note that in the Rust async world they talk of "tasks" not "threads" to make the distinction.

In general async is good for when you have lot of waiting to do. Threads are good for when you have a lot of computing to do, and preferably have multiple cores to run them on.

As for "ergonomics" I don't know. Rather than use the low level async facilities provided by Rust I ave only ever created async code using the Tokyo crate. So far I have never needed to even think about futures and polling. My code just looks like normal threaded code, but with async and await sprinkled around.

That all seems pretty neat to me. I don't have much desire to delve further under the hood.


That's not the point. I don't particularly care about the specific implementation details of goroutines (I'm not a Go user). The reason why I linked this talk is that Rob Pike clearly addresses the question of concurrency vs parallelism.

Concurrency and async programming can be done on purely a single thread. Concurrent design is not about parallelizing via multi-threading. It's about sharing computational time fairly between jobs, and effectively doing something useful while waiting for I/O.


A bit of rambling.

It's funny how "async" is often viewed as something new, while it's effectively user-space repackaging of the good ol' cooperative multitasking, which is quite older than the preemptive multitasking (i.e. the familiar threads and processes). Yes, there is also the matter of OS interfaces for async IO, but fundamentally "async" is about driving a bunch of finite state machines in response to various events. Async programming at its root can be viewed as an ergonomic way for creating FSMs using sequential-like code.

Interestingly, one of the main issues of coop multitasking is left unsolved in all popular async systems I know of, namely the lack of a guarantee that FSM transition functions take bounded execution time. In practice, this issue often shows itself in the form of accidental blocking or running CPU-intensive code inside async loops. We try to paper over it by introducing lints or automatically falling back to preemptive multitasking, but to me it feels like using sanitizers to solve memory issues in C/C++ instead of relying on the borrow checker. Notably, bounded execution time guarantees are not only useful for async code, but also really important for real-time systems.


I'm not sure how much of this is new and how much is old now.

Back in the day I worked on a few systems that used cooperative multi-tasking schedulers. In languages like C and PL/M.

The big and obvious difference is that those languages had no notion of threads, async or otherwise, built into the actual languages. It was all done by a separate cooperative scheduler library that juggled processor registers and stacks around. Basically the thread state is maintained by that scheduler rather than the user having to code a bunch of state machines or the compiler generating them for you.

Meanwhile async in Rust is a language feature. With, as far as I understand, the compiler building those FSMs as required.

Those old cooperative systems were very simple and small. So much so that even I could work on the kernels when required without months of study.

The question of bounded execution time and real-time systems crosses my mind every now and then. Not that I work on real-time systems anymore.

I have only used two languages/programming systems that had any notion of such a thing.

One was the Lucol language used by Lucas Aerospace in creating software for their avionics control systems. Lucol is the only compiler I have ever used that would include a statement of the maximum execution time of every module, and of the complete system, after every compilation. Lucol could do that because the language had no facility for loops. Code started at the top, ran and branched down to the bottom, then stopped, only to be run again 10 or 100 ms later. If one wanted to iterate, one had to do it one step at a time as that timer tick dictated. Never have I worked on a system that was so easy to debug and create real-time code with.

The other is the XC language used with XMOS microcontrollers. A language a lot like C, but with the means to spin up threads on the many cores built into the language, and a means for the compiler to determine the exact running time of suitably annotated code.

In general, I think it's impossible for a compiler for a general-purpose language with the facilities of Rust, or even C, to say anything about run time. Something to do with the Halting Problem.

In my opinion, the only new part of async is the (huge) ergonomic improvement in constructing FSMs, which I've mentioned in the comment. Everything else is, fundamentally, a collection of very old ideas revived by the popularity of large-scale (C10K and beyond) network programming. But now we have both preemptive and cooperative multitasking in our systems, which interact with each other non-trivially. And whether the scaling properties provided by async are indeed needed in practice by most async practitioners is a whole other discussion...

It can be solved by introducing a total subset of a language. With Rust async code you are already bound by several restrictions (e.g. you cannot use recursion) to guarantee a bounded size of FSM state (i.e. the virtual stack). Note that with async code you only have to guarantee that transition functions (i.e. code between await points) have bounded execution time, not that the whole FSM provably terminates. If the compiler cannot prove that a given piece of code has bounded execution time, users may add additional suspension points; e.g. inside a data-processing loop you may suspend on each iteration. Yes, it may degrade performance in some cases, but so do bounds checks.

Granted, it's not an easy problem to solve for designers of a general-purpose language. We need not only to prove that the code terminates, but also to calculate a bound on maximum execution time (otherwise our system would gladly accept code which tries to prove Goldbach's conjecture for 64-bit integers between suspension points). Worse, function execution time usually depends strongly on optimizations and the compilation target, and on modern CPUs a simple memory load can take from several to 100+ cycles depending on cache state (with memory-mapped files the difference is even worse). But nevertheless, I think this is a problem worth solving, even non-ideally. And having an ergonomic solution for it in a general-purpose language would indeed be novel.


I emphasize the "huge" part there. The async thing is now built into the very syntax and semantics of programming languages in recent times: JavaScript, Python, Rust (any others?), which it was not before.

I liken it to the introduction of loops (while, for, etc.) in high-level languages with the advent of "structured programming" ideas. An old assembler language programmer could well have said "Bah, nothing new there, we have had such loops in assembler since the beginning".

That is a bit of a puzzle to me. The problem is described nicely here: What Color is Your Function? – journal.stuffwithstuff.com

So far I have managed to mix red and blue code (sync and async) successfully in Rust by using channels to communicate between them. It's a bit of a cognitive overhead but not so terrible.

Indeed. I get the impression many have been jumping on the async thing without really looking into whether it's appropriate for their task.

I think I found some clues: the main thread id is different from the wake thread id, so the actual work is done in the background.

use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake};
use std::thread;

#[tokio::main]
async fn main() {
  let x = tokio::fs::File::open("tmp.txt");
  let mut x = Box::pin(x);
  // the first poll on a Future schedules it to run, this is a must
  // there is no need for an extra poll, we can just wait until the wake
  // callback is called; at that time the Future must be Ready
  // poll and wake are the way the Future and the scheduler
  // cooperate with each other
  poll(&mut x);
  println!("main thread {:?}", thread::current().id());
}

fn poll<T: Future + Unpin>(fut: &mut T) -> bool {
  let waker = Arc::new(MyWaker).into();
  let mut cx = Context::from_waker(&waker);
  let mut fut = Pin::new(fut);
  match fut.as_mut().poll(&mut cx) {
    Poll::Ready(_) => true,
    Poll::Pending => false,
  }
}

struct MyWaker;
impl Wake for MyWaker {
  fn wake(self: Arc<Self>) {
    println!("wake thread {:?}", thread::current().id());
  }
}

I haven't read this entire discussion, but Tokio does use multiple threads. However, it will not use a thread per task. Instead, Tokio spawns a fixed set of threads (by default one per CPU core, so usually around 8), and distributes all the tasks you spawn across those; even if you spawn a thousand tasks you only have 8 threads.


If you don't need to mix CPU-bound work with I/O-bound work, then consider using rayon instead. It has a scope that can wait for multiple jobs. You can also mix it with channels to have a very flexible way of waiting for results.


It's a good link, but I think this is a little blunt. In the users forum, we'd like people to feel comfortable posting questions regardless of their level of experience.


The key win of async code in Rust is memory savings. A thread that does nothing at all requires around 20KiB just to exist, whereas a simple Rust async task might take only 0.4KiB. (Experimental setup) So if your application depends on supporting as many concurrent tasks as possible, that's two orders of magnitude less memory required for your basic execution structure.

This is why async code is mostly useful when you have some resource like a net connection that you need to feed as much work to as possible. If you can increase the number of transactions you can handle simultaneously on a single machine, then you can actually use the capacity you're paying for.

Your original question assumes that spawning an async task just spawns a thread - but if Rust did that, there would be no savings. Crates like Tokio and async-std all provide good ways for many async tasks to share a few threads.

Of course, if you have big per-task expenses elsewhere, the memory savings that async gives you aren't going to matter, and you might as well just stick to threads. In Rust, at least, threads are simpler, and your operating system's tools (debuggers, profilers, etc.) probably support them better.

The other win async has over threads is a plausible cancellation model, although this is tricky to use correctly. I'm not sure Rust got that part right, so I'm not listing that up front.

You might enjoy reading Ron Pressler's blog post, On the Performance of User-Mode Threads and Coroutines, which gets into the details of why increasing the degree of concurrency is helpful - and why context switch overhead is probably not as important as you think.


Thank you guys for your tolerance of my stupid question. The fundamental thing in Rust async is the coroutine or generator (yield). On top of that is a cooperative scheduler which schedules multiple tasks on one thread of execution. It's been a nice experience learning how things are geared up!


This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.