In tokio, block_on is a method that is called on an instance of a Runtime. In async-std, block_on is a plain function, meaning that the runtime of async-std is somehow magically set up at some point and available from anywhere, even without using #[async_std::main]. What is the reason behind this difference? Are there situations where having multiple runtimes is useful? Does the approach of async-std create some overhead?
Multiple runtimes means, possibly, multiple threads waiting on epoll (or an equivalent), which may improve some workloads.
There was actually an article not long ago exploring this solution, but I have no idea as to how to find it.
Well, async-std has decided that they want only a single global runtime, so they can just put everything in globals.
It seems you can have separate executors (i.e. runtimes) for the async-* ecosystem using the standalone crate async-executor. I'm guessing that, for simplicity of use and maybe performance, as @Gilnaa mentions, they chose to only use a lazily started global executor with async-std. But that's just speculation from me.
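The "lazily started global executor" idea can be sketched with the standard library alone. This is a toy and not async-std's actual code: ToyRuntime, GLOBAL and the free spawn function are hypothetical names, and the "runtime" here is a single worker thread running boxed closures rather than futures. What it tries to show is why a free function can work from anywhere: the first call initializes a static, and every later call reuses it.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Mutex, OnceLock};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical stand-in for a runtime: one worker thread draining a job queue.
struct ToyRuntime {
    jobs: Mutex<Sender<Job>>,
}

impl ToyRuntime {
    fn start() -> ToyRuntime {
        let (tx, rx) = channel::<Job>();
        // The worker runs until every Sender is dropped; for the global
        // instance below that means until the process exits.
        thread::spawn(move || {
            for job in rx {
                job();
            }
        });
        ToyRuntime { jobs: Mutex::new(tx) }
    }

    /// Queue `f` on the worker and return a channel that yields its result.
    fn spawn<T, F>(&self, f: F) -> Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (result_tx, result_rx) = channel();
        let job: Job = Box::new(move || {
            // Ignore the error if the caller already dropped the receiver.
            let _ = result_tx.send(f());
        });
        self.jobs.lock().unwrap().send(job).unwrap();
        result_rx
    }
}

/// Global instance, created on first use.
static GLOBAL: OnceLock<ToyRuntime> = OnceLock::new();

/// Free function in the async-std style: the caller never sees a runtime handle.
fn spawn<T, F>(f: F) -> Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    GLOBAL.get_or_init(ToyRuntime::start).spawn(f)
}
```

Calling spawn(|| 1 + 1).recv() from any thread returns the result without the caller ever constructing or storing a runtime. The explicit style would instead hand the caller a ToyRuntime value whose lifetime they have to manage themselves.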
There's nothing magical about async_std's block_on:
pub fn block_on<F, T>(future: F) -> T
where
    F: Future<Output = T>,
{
    Builder::new().blocking(future)
}
It does what you can do yourself by just calling the Builder directly. Does there have to be an explicit reason to incorporate a tiny convenience function, besides making things more straightforward?
The two projects do differ in terms of their focus: tokio was built from the beginning as a solid, universal, capable engine for asynchronous execution (focused quite a bit on the networking side of the equation), while async_std came later to focus, first and foremost, on async alternatives to the std library. The headers of the two projects reflect just that:
Tokio: A runtime for writing reliable network applications without compromising speed.
Async-Std: Async version of the Rust standard library.
If, for some reason, you have a part of your project that runs on libraries built for tokio, while another part relies purely on async_std, then in theory you might want to create several threads running their runtimes separately to combine the functionality of both ecosystems. I couldn't possibly imagine which kind of reasoning would lead you down that rabbit hole, but hey - it's possible.
Lastly, if by "the approach" of async-std you mean the fact that it has a separate function to do exactly what tokio requires you to do to start executing the first future you pass into it, albeit in a somewhat more explicit manner, then the strict answer would be "yes", as calling an additional function to do an additional job for you by definition adds some overhead. It's definitely not something to worry about, though, if the only thing you're doing in either case is passing your async alternative of main() into the runtime for execution.
Well, here's an example:
Let's say I have this shiny gen() function that someone implemented and that I want to use from a non-async context:
use tokio::{
    runtime::Runtime,
    sync::mpsc::{channel, Receiver},
    task::spawn,
};

async fn gen() -> Receiver<usize> {
    let (s, r) = channel(1);
    spawn(async move {
        for i in 0.. {
            s.send(i).await.unwrap();
        }
    });
    return r;
}
Let's turn gen() into an Iterator:
struct Gen {
    rt: Runtime,
    r: Receiver<usize>,
}

impl Gen {
    fn new() -> Gen {
        let rt = Runtime::new().unwrap();
        let r = rt.block_on(gen());
        Gen { rt, r }
    }
}

impl Iterator for Gen {
    type Item = usize;
    fn next(&mut self) -> Option<Self::Item> {
        self.r.blocking_recv()
    }
}

fn main() {
    let gen = Gen::new();
    let x = 5;
    let results: Vec<usize> = gen.take(x).collect();
    assert_eq!(results.len(), x);
}
Here, I have to store the Runtime in the struct. If I don't store it, the Future that is created inside the async block in gen() is gone at the end of new():
struct Gen {
    r: Receiver<usize>,
}

impl Gen {
    fn new() -> Gen {
        let rt = Runtime::new().unwrap();
        let r = rt.block_on(gen());
        Gen { r }
    }
}

impl Iterator for Gen {
    // unchanged
}

fn main() {
    let gen = Gen::new();
    let x = 5;
    let results: Vec<usize> = gen.take(x).collect();
    assert!(results.len() < 2);
}
By contrast, here is the same thing using async-std:
struct Gen {
    r: Receiver<usize>,
}

impl Gen {
    fn new() -> Gen {
        let r = block_on(gen());
        Gen { r }
    }
}

impl Iterator for Gen {
    type Item = usize;
    fn next(&mut self) -> Option<Self::Item> {
        block_on(async { self.r.recv().await }).ok()
    }
}

fn main() {
    let gen = Gen::new();
    let x = 5;
    let results: Vec<usize> = gen.take(x).collect();
    assert_eq!(results.len(), x);
}
The runtime does not have to be stored; in fact, I've never had a reference to it. By the way, when async-std is used, s.send(i).await.unwrap(); almost always panics, but I haven't seen the tokio implementation panic, which suggests that async-std's runtime is shut down after r is dropped.
Now, let's say I want to have multiple such generators:
let mut gens = vec![];
for _ in 0..10 {
    gens.push(Gen::new());
}
With tokio the generators run on different executors, and with async-std they run on the same one. Or at least it seems as if that is the case.
So I guess I simply noticed this difference and started to wonder what are the pros and cons of having the runtime in a global vs exposing it directly to the users.
All right, that's certainly unexpected.
There are people here that know much more about this than I do, but I'm fairly confident about one thing: intertwining your async code with your synchronous code just because you want to use it is, in general, a bad idea. Call it an anti-pattern, if you will.
The whole purpose of having an async runtime that executes tasks, represented by Futures, is to allow your OS to deal with pauses in between I/O events more efficiently. For example, it doesn't make sense to listen synchronously for TCP connections in a loop all the time, as often there just aren't going to be any. If you block your thread each and every time you wait for some external event (which is very often the case in networking applications, for instance), you'll need a ton of threads with a ton of computing power to process them, and in the end that is neither practical nor efficient.
Now, for the code:
Leaving aside the question of why you would want to do something this way that you could easily do synchronously (a fairly important question, if you ask me), I'm not sure, judging by the questions you raise later, that you understand what this function does and, perhaps more importantly, what it expects to happen.
First, your code gets converted into a regular fn that returns an impl Future - a task that can be executed in an asynchronous context. The way async tasks work is, in essence, by performing small chunks of work that can be done for sure before they encounter a chunk of work that, for some reason, can't be completed right away - in which case they "yield" control back to the execution context (the async runtime) for it to try to make progress on some other pending tasks.
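The poll-and-yield cycle described above can be seen in a minimal executor built only on the standard library. This is a sketch for illustration and not how tokio or async-std implement block_on; ThreadWaker and YieldOnce are hypothetical names introduced here.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Poll `fut` on the current thread until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // The future could not finish right now: sleep until it is woken.
            Poll::Pending => thread::park(),
        }
    }
}

/// Future that yields control once before completing.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Ask to be polled again, then hand control back to the executor.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

With these definitions, block_on(async { YieldOnce { yielded: false }.await; 21 * 2 }) polls the future twice: the first poll returns Pending at the await point, the second runs it to completion. A real runtime does the same thing, except that while one task is Pending it polls other tasks instead of parking the thread.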
But you're not just waiting for some sequence of async tasks to complete inside of it - you spawn a totally separate async thread (which is going to assume that it has a runtime available - along with enough time to complete all of its inner calculations), before returning. Fair enough.
Yes, this is what you have to do - otherwise the async thread that you've spawned inside your gen() (and which requires an async runtime to execute) won't get too far, as you'll have dropped it as soon as you've received and returned your Receiver r back from the task that you've blocked on.
No, this is not what happens - the (impl) Future survives, and it completes just fine by returning your Receiver. But inside of it you've told it to spawn another async thread, which will run for much longer - and which needs its executor, which you've dropped before returning. This thread doesn't know anything about the future you've returned early - it runs for as long as it has to, and as long as it has an executor to actually push it forward to do the work that needs to be done. See the comment above.
Given what I've told you above, you should be able to understand what's wrong here.
Wrong. It does have to be stored if you want your spawned async task to keep doing its work. References have nothing to do with it. As mentioned previously, there's no "magic" in the block_on function that you can use on its own without an explicitly defined runtime in the async_std case - take a look at the source code again and see what it does. It creates a runtime for you and blocks until it's done (until the impl Future that you expect to get from your task gets returned).
Blocking on your gen() task and returning just the receiver once again drops your executor, created inside the block_on, and thus the panic show ensues - you've dropped the thing that was supposed to keep executing your inner async thread. It doesn't happen for tokio because you've stored the executor - without understanding what it was actually responsible for, apparently.
Please, just use one regular async runtime - I sincerely hope you're not planning to use any such generator madness anywhere where clarity, predictability and performance matter.
No. Wrong - again, you create a bunch of executors in either case. With async-std you simply drop them as soon as your task returns, discarding the async thread inside. With tokio, you manage to save them, but now you've got 10 different execution contexts doing the same kind of work for nothing.
There's nothing preventing you from creating just one runtime at the beginning of your program - then turning your fn new() -> Gen into an async fn new() -> Gen and using the same kind of Iterator implementation to get the feedback from the other thread. With 10 different generators of this kind it's going to be fairly awkward to collect results, as each next() you call is going to block, as you've told it to - but the meaning of the program will stay the same without a bunch of async runtime executors flying all around the place for no reason whatsoever. If you're worried about overhead, that's the first thing you should think about.
One global runtime can use all the resources available to it - especially if it leverages multiple threads in a clever way, which is the case for tokio in its default state. Not sure about async_std, but I'm fairly certain it does a similar kind of work. "Exposing" async runtimes "just because" doesn't make any sense. If you want a simple synchronous execution, use synchronous libraries and keep it simple. If your program is likely to wait for external input / feedback / events a lot and there's a lot that can be done during that time, create an async runtime, run it once and do all the work inside of it.
On a side note, AsyncIters are currently being actively developed, and they are already available experimentally in the standard library as Streams. If you want something more straightforward, though, take a look at the async-stream library. From the few experiments I've done with it, it seems to only be usable provided you can "pin" it in one place - which does put certain restrictions on its usage, but still, it's quite useful for any kind of async stream of events you might be waiting for in your application.
Are you sure that this is what happens?
With async-std the runtime doesn't seem to get dropped:
use async_std::task::{block_on, spawn};
use std::time::Duration;

fn func() {
    block_on(async {
        spawn(async {
            for i in 0.. {
                println!("{}", i);
                async_std::task::sleep(Duration::from_secs(1)).await;
            }
        });
    });
}

fn main() {
    func();
    std::thread::sleep(Duration::from_secs(10));
    println!("Bye")
}
Output:
0
1
2
3
4
5
6
7
8
9
Bye
Actually, spawn does not even have to be called from an asynchronous context:
use async_std::task::spawn;
use std::time::Duration;

fn func() {
    spawn(async {
        for i in 0.. {
            println!("{}", i);
            async_std::task::sleep(Duration::from_secs(1)).await;
        }
    });
}
But with tokio the runtime seems to get dropped at the end of func():
use std::time::Duration;
use tokio::{runtime::Runtime, task::spawn};

fn func() {
    let rt = Runtime::new().unwrap();
    rt.block_on(async {
        spawn(async {
            for i in 0.. {
                println!("{}", i);
                tokio::time::sleep(Duration::from_secs(1)).await;
            }
        });
    });
}

fn main() {
    func();
    std::thread::sleep(Duration::from_secs(10));
    println!("Bye")
}
Output:
0
Bye
By the way, I think the magic happens on line 41 in build:
#[cfg(not(target_os = "unknown"))]
once_cell::sync::Lazy::force(&crate::rt::RUNTIME);
My bad, I haven't gone through the code extensively enough - async_std does, in fact, hold its runtime in a static variable, initialized separately. Nothing magical, really - just one global variable.
Neither spawn function is itself async, so you don't have to call them in an async context - but in the case of async_std, you have a global static runtime, initialized on demand, which will get picked up for the spawned task, while in the case of tokio, it expects you to provide a runtime for it to work in, so without the block_on, you won't get anywhere.
Since async-std's runtime is effectively global - yes, all the tasks you spawn with it will get executed by just one runtime. For tokio, you'll need to initialize it separately - or put it in a static variable as well (explicitly, no magic), and refer to it in your code on demand.
This is why you've got your tokio spawned tasks dropping early - when you tell it to block_on the spawned task, you tell it to wait until the spawn function produces the JoinHandle, which you ignore by terminating the line with ; and exit. Since the handle gets produced right away (and you don't await it to wait for the task to terminate), the whole runtime gets dropped on exit.
I guess, then, that if you want to do some kind of weird mix between your async and sync code, your best bet would be to use async-std - because (not so) magically the tasks will all get passed to the single runtime for execution. Looks like we both learnt something by exploring this subject.