Why does tokio expose its runtime but async-std doesn't?

lambdakappatheta · October 8, 2021, 8:52pm

In tokio, block_on is a method that is called on an instance of a Runtime. In async-std, block_on is a plain function, meaning that the runtime of async-std is somehow magically set up at some point and available from anywhere, even without using #[async_std::main]. What is the reason behind this difference? Are there situations where having multiple runtimes is useful? Does the approach of async-std create some overhead?

Gilnaa · October 8, 2021, 9:05pm

Multiple runtimes means, possibly, multiple threads waiting on epoll (or equivalent), which may improve some workloads.

There was actually an article not long ago exploring this solution, but I have no idea as to how to find it.

alice · October 8, 2021, 9:10pm

Well, async-std has decided that they want only a single global runtime, so they can just put everything in globals.

drewkett · October 8, 2021, 9:12pm

It seems you can have separate executors (i.e. runtimes) for the async-* ecosystem using the standalone crate async-executor. I'm guessing that, for simplicity of use and maybe performance as @Gilnaa mentions, they chose to only use a lazily started global executor with async-std. But that's just speculation from me.
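To make "lazily started global executor" a bit more concrete, here is a toy sketch using only the standard library. It is not how async-std is implemented: the names GLOBAL and Job are made up for illustration, and the "tasks" are plain closures rather than futures. It only shows the shape of the idea, a global that is initialized on first use so that spawn can be a free function callable from anywhere.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Mutex, OnceLock};
use std::thread;

// A job is any boxed closure; a real executor would store futures instead.
type Job = Box<dyn FnOnce() + Send + 'static>;

// Hypothetical global "runtime": just the sending half of a job queue.
static GLOBAL: OnceLock<Mutex<Sender<Job>>> = OnceLock::new();

/// Plain function, callable from anywhere; starts the worker on first use.
fn spawn(job: impl FnOnce() + Send + 'static) {
    let queue = GLOBAL.get_or_init(|| {
        let (tx, rx) = channel::<Job>();
        // The worker thread is detached and lives until the process exits,
        // because the sender stored in the static is never dropped.
        thread::spawn(move || {
            for job in rx {
                job();
            }
        });
        Mutex::new(tx)
    });
    queue.lock().unwrap().send(Box::new(job)).unwrap();
}

fn main() {
    let (done_tx, done_rx) = channel();
    for i in 0..3 {
        let done_tx = done_tx.clone();
        spawn(move || done_tx.send(i * 10).unwrap());
    }
    // A single worker runs the jobs in submission order.
    let results: Vec<i32> = done_rx.iter().take(3).collect();
    assert_eq!(results, vec![0, 10, 20]);
}
```

Nobody ever holds a handle to the worker here, which is the trade-off under discussion: it is convenient, but there is also no value whose drop would shut it down.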

Ideator · October 8, 2021, 9:29pm

There's nothing magical about async_std's block_on:

pub fn block_on<F, T>(future: F) -> T
where
    F: Future<Output = T>,
{
    Builder::new().blocking(future)
}

It does what you can do yourself by just calling the Builder directly. Does there have to be an explicit reason to incorporate a tiny convenience function, besides making things more straightforward?

The two projects do differ in terms of their focus: tokio was built from the beginning as a solid, universal, capable engine for asynchronous execution (focused quite a bit on the networking side of the equation) - while async_std came later to focus on the async alternatives to the std library, first and foremost. The headers of the two projects reflect just that:

Tokio: A runtime for writing reliable network applications without compromising speed.

Async-Std: Async version of the Rust standard library.

If, for some reason, you have a part of your project that runs on libraries, built for tokio, while another part relies purely on async_std - in theory, you might want to create several threads, running their runtimes separately to combine the functionality of both eco-systems. I couldn't possibly imagine which kind of reasoning would lead you down that rabbit hole, but hey - it's possible.

Lastly, if by "the approach" of async-std you mean the fact that it has a separate function to do exactly what tokio requires you to do to start executing the first future you pass into it, albeit in a bit more explicit manner, then the strict answer would be "yes", as calling an additional function to do an additional job for you by definition provides some overhead - but it's definitely not something to worry about if the only thing you're doing in either case is just passing your async alternative of the main() into the runtime for execution.

lambdakappatheta · October 8, 2021, 11:01pm

Well, here's an example:

Let's say I have this shiny gen() function that someone implemented and I want to use from a non-async context:

use tokio::{
    runtime::Runtime,
    sync::mpsc::{channel, Receiver},
    task::spawn,
};

async fn gen() -> Receiver<usize> {
    let (s, r) = channel(1);

    spawn(async move {
        for i in 0.. {
            s.send(i).await.unwrap();
        }
    });

    return r;
}

Let's turn gen() into an Iterator:

struct Gen {
    rt: Runtime,
    r: Receiver<usize>,
}

impl Gen {
    fn new() -> Gen {
        let rt = Runtime::new().unwrap();
        let r = rt.block_on(gen());
        Gen { rt, r }
    }
}

impl Iterator for Gen {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        self.r.blocking_recv()
    }
}

fn main() {
    let gen = Gen::new();

    let x = 5;
    let results: Vec<usize> = gen.take(x).collect();
    assert_eq!(results.len(), x);
}

Here, I have to store the Runtime in the struct. If I don't store it, the Future that ~~gen() returns~~ is created inside the async block in gen() is gone at the end of new():

struct Gen {
    r: Receiver<usize>,
}

impl Gen {
    fn new() -> Gen {
        let rt = Runtime::new().unwrap();
        let r = rt.block_on(gen());
        Gen { r }
    }
}

impl Iterator for Gen {
    // unchanged
}

fn main() {
    let gen = Gen::new();

    let x = 5;
    let results: Vec<usize> = gen.take(x).collect();
    assert!(results.len() < 2);
}

By contrast, here is the same thing using async-std:

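use async_std::{
    channel::Receiver,
    task::block_on,
};

// Assumption: gen() is unchanged apart from using
// async_std::channel::bounded(1) and async_std::task::spawn.

struct Gen {
    r: Receiver<usize>,
}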

impl Gen {
    fn new() -> Gen {
        let r = block_on(gen());
        Gen { r }
    }
}

impl Iterator for Gen {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        block_on(async { self.r.recv().await }).ok()
    }
}

fn main() {
    let gen = Gen::new();

    let x = 5;
    let results: Vec<usize> = gen.take(x).collect();
    assert_eq!(results.len(), x);
}

The runtime does not have to be stored; in fact, I've never had a reference to it. By the way, when async-std is used, s.send(i).await.unwrap(); almost always panics, but I haven't seen the tokio implementation panic, which suggests that async-std's runtime is shut down after r is dropped.

Now, let's say I want to have multiple such generators:

    let mut gens = vec![];
    for _ in 0..10 {
        gens.push(Gen::new());
    }

With tokio the generators run on different executors and with async-std they run on the same one. Or at least it seems as if that were the case.

So I guess I simply noticed this difference and started to wonder what the pros and cons are of keeping the runtime in a global vs exposing it directly to the users.

Ideator · October 9, 2021, 12:28am

All right, that's certainly unexpected.

There are people here that know much more about this than I do, but I'm fairly confident about one thing: intertwining your async code with your synchronous code just because you want to use it is, in general, a bad idea. Call it an anti-pattern, if you will.

The whole purpose of having an async runtime, executing tasks, represented by Future-s, in general, is to allow your OS to deal with pauses in between I/O events more efficiently. For example, it doesn't make sense to listen synchronously for TCP connections in a loop all the time, as often there just aren't going to be any. If you block your thread each and every time you wait for some external event (which is very often the case in networking applications, for instance), you'll need a ton of threads with a ton of computing power to process them - and in the end, it simply is neither practical nor efficient.

Now, for the code:

lambdakappatheta:

async fn gen() -> Receiver<usize> {
    let (s, r) = channel(1);

    spawn(async move {
        for i in 0.. {
            s.send(i).await.unwrap();
        }
    });

    return r;
}

Leaving the question of why would you want to do something that you could easily do synchronously this way aside (a fairly important question, if you ask me) - I'm not sure (by the questions raised by you later) you understand what this function does, and perhaps more importantly, what it expects to happen.

First, your code gets converted into a regular fn that returns an impl Future - a task that can be executed in an asynchronous context. The way async tasks work is, in essence, by performing small chunks of work that can be done for sure before they encounter a chunk of work that, for some reason, can't be completed right away - in which case they "yield" control back to the execution context (the async runtime) for it to try to make progress on some other pending tasks.
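A small std-only sketch of that "chunks of work between yields" idea, for illustration. YieldOnce, NoopWaker and block_on_counting are made-up names, and the executor is a deliberately naive busy loop rather than anything tokio or async-std actually do. It counts how many times the future has to be polled before it finishes.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for a loop that polls again right away.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future that returns Pending once before completing.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true; // the next poll will complete
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending // hand control back to the executor
        }
    }
}

/// Toy block_on: polls until ready and reports how many polls it took.
fn block_on_counting<F: Future>(fut: F) -> (F::Output, usize) {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}

async fn work() -> u32 {
    let a = 1; // chunk 1 runs until the first await
    YieldOnce(false).await; // suspension point
    let b = 2; // chunk 2
    YieldOnce(false).await; // suspension point
    a + b // chunk 3
}

fn main() {
    let (value, polls) = block_on_counting(work());
    assert_eq!(value, 3);
    assert_eq!(polls, 3); // one poll per chunk
}
```

A real runtime would use the Pending return to go and poll some other task instead of spinning on the same one.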

But you're not just waiting for some sequence of async tasks to complete inside of it - you spawn a totally separate async thread (which is going to assume that it has a runtime available - along with enough time to complete all of its inner calculations), before returning. Fair enough.

Yes, this is what you have to do - otherwise the async thread that you've spawned inside your gen() (and which requires an async runtime to execute) won't get too far, as you'll have dropped it as soon as you've received and returned your Receiver r back from the task that you've called block_on on.

No, this is not what happens - the (impl) Future survives, and it completes just fine by returning your Receiver. But inside of it you've told it to spawn another async thread, which will run for much longer - and which needs its executor, which you've dropped before returning. This thread doesn't know anything about the future you've returned early - it runs for as long as it has to, and as long as it has an executor to actually push it forward to do the work that needs to be done. See the comment above.
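The effect of dropping the executor can be imitated with plain threads. The sketch below is only an analogy under stated assumptions: ToyRuntime and spawn_counter are hypothetical names, the "task" is an OS thread, and cancellation is simulated with a shutdown flag (a real tokio runtime drops its tasks at their next yield point when it is shut down). It reproduces the two observations from the tokio examples: with the owner kept alive the receiver gets all its values, and with the owner dropped early at most the one buffered value arrives.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, TrySendError};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a runtime: owns its workers and stops them on drop.
struct ToyRuntime {
    shutdown: Arc<AtomicBool>,
    workers: Vec<JoinHandle<()>>,
}

impl ToyRuntime {
    fn new() -> Self {
        ToyRuntime {
            shutdown: Arc::new(AtomicBool::new(false)),
            workers: Vec::new(),
        }
    }

    /// Starts a counter that feeds a channel of capacity 1 until shutdown.
    fn spawn_counter(&mut self) -> Receiver<usize> {
        let (tx, rx) = sync_channel(1);
        let shutdown = Arc::clone(&self.shutdown);
        self.workers.push(thread::spawn(move || {
            let mut i = 0;
            while !shutdown.load(Ordering::SeqCst) {
                match tx.try_send(i) {
                    Ok(()) => i += 1,
                    Err(TrySendError::Full(_)) => thread::yield_now(),
                    Err(TrySendError::Disconnected(_)) => break,
                }
            }
        }));
        rx
    }
}

impl Drop for ToyRuntime {
    fn drop(&mut self) {
        // Signal every worker to stop, then wait for them to finish.
        self.shutdown.store(true, Ordering::SeqCst);
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}

fn main() {
    // Runtime kept alive: the receiver keeps getting values.
    let mut rt = ToyRuntime::new();
    let rx = rt.spawn_counter();
    let kept: Vec<usize> = rx.iter().take(5).collect();
    assert_eq!(kept, vec![0, 1, 2, 3, 4]);

    // Runtime dropped early: only what was already buffered can arrive.
    let rx = {
        let mut rt = ToyRuntime::new();
        rt.spawn_counter()
    }; // rt is dropped here and its worker is stopped
    let leftover: Vec<usize> = rx.iter().take(5).collect();
    assert!(leftover.len() < 2);
}
```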

Given what I've told you above, you should be able to understand what's wrong here.

Wrong. It does have to be stored if you want your spawned async task to keep doing its work. References have nothing to do with it. As mentioned previously, there's no "magic" in the block_on function that you can use on its own without an explicitly defined runtime in the async_std case - take a look at the source code again and see what it does. It creates a runtime for you and blocks until it's done (until the impl Future that you expect to get from your task gets returned).

Calling block_on on your gen() task and returning just the receiver once again drops your executor, created inside the block_on, and thus the panic show ensues - you've dropped the thing that was supposed to keep executing your inner async thread. It doesn't happen for tokio because you've stored the executor - without understanding what it was actually responsible for, apparently.

Please, just use one regular async runtime - I sincerely hope you're not planning to use any such generator madness anywhere where clarity, predictability and performance matter.

No. Wrong - again, you create a bunch of executors in either case. With async-std you simply drop them as soon as your task returns, discarding the async thread inside. With tokio, you manage to save them, but now you've got 10 different execution contexts doing the same kind of work for nothing.

There's nothing preventing you from creating just one runtime at the beginning of your program - then turning your fn new() -> Gen into an async fn new() -> Gen and using the same kind of Iterator implementation to get the feedback from the other thread. With 10 different generators of this kind it's going to be fairly awkward to collect results, as each next() you call is going to block, as you've told it to - but the meaning of the program will stay the same without a bunch of async runtime executors flying all around the place for no reason whatsoever. If you're worried about overhead, that's the first thing you should think about.

One global runtime can use all the resources available to it - especially if it leverages multiple threads in a clever way, which is the case for tokio in its default state. Not sure about async_std, but I'm fairly certain it does a similar kind of work. "Exposing" async runtimes "just because" doesn't make any sense. If you want a simple synchronous execution, use synchronous libraries and keep it simple. If your program is likely to wait for external input / feedback / events a lot and there's a lot that can be done during that time, create an async runtime, run it once and do all the work inside of it.
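For comparison, this is roughly what the "keep it simple and synchronous" version of the generator could look like with nothing but std. It is a sketch, not a drop-in replacement for the async code above: each Gen gets its own OS thread, and the bounded sync_channel(1) plays the role of the async channel.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Synchronous counterpart of the Gen iterator: a producer thread and a
/// bounded channel, no async runtime involved.
struct Gen {
    r: Receiver<usize>,
}

impl Gen {
    fn new() -> Gen {
        let (s, r) = sync_channel(1);
        thread::spawn(move || {
            for i in 0.. {
                // send blocks while the buffer is full and fails once the
                // receiver is dropped, which ends the producer thread.
                if s.send(i).is_err() {
                    break;
                }
            }
        });
        Gen { r }
    }
}

impl Iterator for Gen {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        self.r.recv().ok()
    }
}

fn main() {
    let results: Vec<usize> = Gen::new().take(5).collect();
    assert_eq!(results, vec![0, 1, 2, 3, 4]);

    // Several generators side by side, one producer thread each.
    let gens: Vec<Gen> = (0..10).map(|_| Gen::new()).collect();
    let firsts: Vec<usize> = gens.into_iter().map(|mut g| g.next().unwrap()).collect();
    assert_eq!(firsts, vec![0; 10]);
}
```

Here nothing has to be stored besides the receiver, and dropping the receiver is what stops the producer.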

On a side note, async iterators are currently being actively developed, and they are already experimental in the standard library as Stream-s. If you want something more straightforward, though, take a look at the async-stream library. From the few experiments I've done with it, it seems to only be usable provided you can "pin" it in one place - which does put certain restrictions on its usage, but still, it's quite useful for any kind of async stream of events you might be waiting for in your application.

lambdakappatheta · October 9, 2021, 7:32am

Are you sure that this is what happens?

With async-std the runtime doesn't seem to get dropped:

use async_std::task::{block_on, spawn};
use std::time::Duration;

fn func() {
    block_on(async {
        spawn(async {
            for i in 0.. {
                println!("{}", i);
                async_std::task::sleep(Duration::from_secs(1)).await;
            }
        });
    });
}

fn main() {
    func();
    std::thread::sleep(Duration::from_secs(10));
    println!("Bye")
}

Output:

Actually, spawn does not even have to be called from an asynchronous context:

use async_std::task::spawn;
use std::time::Duration;

fn func() {
    spawn(async {
        for i in 0.. {
            println!("{}", i);
            async_std::task::sleep(Duration::from_secs(1)).await;
        }
    });
}

But with tokio the runtime seems to get dropped at the end of func():

use std::time::Duration;
use tokio::{runtime::Runtime, task::spawn};

fn func() {
    let rt = Runtime::new().unwrap();
    rt.block_on(async {
        spawn(async {
            for i in 0.. {
                println!("{}", i);
                tokio::time::sleep(Duration::from_secs(1)).await;
            }
        });
    });
}

fn main() {
    func();
    std::thread::sleep(Duration::from_secs(10));
    println!("Bye")
}

Output:

0
Bye

By the way, I think the magic happens on line 41 in build:

#[cfg(not(target_os = "unknown"))]
once_cell::sync::Lazy::force(&crate::rt::RUNTIME);

Ideator · October 10, 2021, 11:20am

My bad, I haven't gone through the code extensively enough - async_std does, in fact, hold its runtime in a static variable, initialized separately. Nothing magical, really - just one global variable.

Neither spawn function is an async fn itself, so you don't have to call them from an async context - but in the case of async_std, you have a global static runtime, initialized on demand, which will get picked up for the spawn-ed task, while tokio expects you to provide a runtime for it to work in (its spawn panics when called outside a runtime context), so without the block_on, you won't get anywhere.

Since async-std's runtime is global, effectively - yes, all the tasks you spawn with it will get executed by just one runtime. For tokio, you'll need to initialize it separately - or put it in a static variable as well (explicitly, no magic), and refer to it in your code on demand.

This is why you've got your tokio's spawn-ed tasks dropping early - when you tell it to block_on the async block, you only tell it to wait until the spawn function produces the JoinHandle, which you ignore by terminating the line with ; and exit. Since the handle gets produced right away (and you don't await it), block_on returns immediately, func() ends, and the whole runtime gets dropped on exit.

Guess then if you want to do some kind of a weird mix between your async and sync code, your best bet would be to use async-std - because (not so) magically they will all get passed to the single runtime for execution. Looks like we both learnt something by exploring this subject.
