Threading and Coroutine Model for Tokio

naftulikay · April 9, 2018, 5:56pm

I have been wondering how Tokio works under the hood and how to set it up properly. I have a web scraper in Rust, so concurrency is extremely important.

My understanding of threading for Tokio is that there is a thread pool with one thread per core. On top of this, to keep threads busy, I imagine Tokio uses epoll to create an event loop. Does Tokio create an isolated event loop for each thread?

Since I have a lot of IO bound tasks, how do I submit these tasks to Tokio so that they are distributed among the thread pool?

Also, if a future is created on one thread, is there any chance that it will finish executing on a different thread?

I have an entry point to my application. How do I construct the most performant Tokio setup on start of my application?

gbutler69 · April 9, 2018, 5:58pm

Tokio uses a thread pool with work stealing to keep all threads busy and maximize throughput, on top of "epoll-like" event loops.

naftulikay · April 9, 2018, 6:11pm

So does Tokio use epoll or its own epoll substitute?

Is there one event loop per thread or one event loop globally, shared by all threads?

vitalyd · April 9, 2018, 7:08pm

It uses actual epoll on Linux.

As for threading, this depends on the Tokio version. The 0.1 model had a single-threaded event loop that did both IO readiness monitoring and execution; to get multiple epoll instances, you’d start an event loop on each of your own threads and then associate sockets with them. In 0.2 there is, by default, a dedicated thread that does IO servicing and a threadpool that does execution.
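
To make the second layout concrete, here is a minimal std-only sketch of "one dedicated IO thread plus a pool that does execution". It is not Tokio's API or implementation: `run_reactor_plus_pool` and `ReadyEvent` are hypothetical names, the "IO" thread just emits integer ids where a real reactor would block in epoll, and the "work" is doubling the id.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for "a socket became ready": just an id.
type ReadyEvent = u32;

/// Runs one "IO" thread that announces `events` ready events and `workers`
/// execution threads that process them. Returns the results sorted, because
/// the order in which workers finish is not deterministic.
fn run_reactor_plus_pool(events: u32, workers: usize) -> Vec<u32> {
    let (ready_tx, ready_rx) = mpsc::channel::<ReadyEvent>();
    let (done_tx, done_rx) = mpsc::channel::<u32>();
    // std's Receiver is single-consumer, so the pool shares it behind a lock.
    let ready_rx = Arc::new(Mutex::new(ready_rx));

    // Dedicated IO thread (assumption: a real one would wait on epoll here).
    let reactor = thread::spawn(move || {
        for id in 0..events {
            ready_tx.send(id).expect("receiver dropped");
        }
        // ready_tx is dropped here, which closes the channel.
    });

    let pool: Vec<_> = (0..workers)
        .map(|_| {
            let ready_rx = Arc::clone(&ready_rx);
            let done_tx = done_tx.clone();
            thread::spawn(move || loop {
                // The lock guard is released at the end of this statement,
                // so processing below happens without holding the lock.
                let next = ready_rx.lock().unwrap().recv();
                match next {
                    Ok(id) => done_tx.send(id * 2).unwrap(),
                    Err(_) => break, // channel closed: no more events
                }
            })
        })
        .collect();
    drop(done_tx); // only the workers' clones remain

    reactor.join().unwrap();
    for worker in pool {
        worker.join().unwrap();
    }

    let mut results: Vec<u32> = done_rx.iter().collect();
    results.sort();
    results
}
```

Calling `run_reactor_plus_pool(5, 3)` processes five events on three workers. Which worker handles which event is up to the scheduler, which is the point of this layout.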

vitalyd · April 9, 2018, 7:12pm

Is the IO asynchronous? I imagine it’s a bunch of http(s) requests since you’re web scraping. In that case, you could probably be fine on a single thread that does processing and (async) IO. Unless your processing is cpu intensive.

naftulikay · April 10, 2018, 1:54am

They will be futures and the calls accept a Handle, so I believe they are asynchronous IO.

I'm kind of going for the overwhelming force route here, so using an actual threadpool is what I'm after. Is there no way to have threads poll and execute?

I'm trying to approximate what java.nio provides; regular threads interacting with regular APIs but internally using asynchronous IO. All threads are thus kept busy as much as possible.

vitalyd · April 10, 2018, 2:21am

“Overwhelming force” might be just a single thread doing IO and processing. Adding extra threads won’t necessarily speed things up (might actually degrade performance) - you have to know where you are (will be) bottlenecked.

If you want both IO and execution on multiple threads, then the best (IMO) approach is to run multiple single-threaded event loops, one per thread, each also using an executor local to its thread. Then divide up the work amongst them.
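
As a rough illustration of that layout, here is a std-only sketch, not Tokio code. Each thread owns a private queue standing in for its own event loop plus local executor, and jobs are dealt out round-robin. `run_per_thread_loops` is a hypothetical name, and round-robin is just one assumed way to divide the work.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns `loops` threads, each draining its own private queue, and deals
/// `jobs` to them round-robin. Returns, per loop, the job ids it handled.
fn run_per_thread_loops(loops: usize, jobs: Vec<u32>) -> Vec<Vec<u32>> {
    assert!(loops > 0, "need at least one loop");
    let mut senders = Vec::with_capacity(loops);
    let mut handles = Vec::with_capacity(loops);

    for _ in 0..loops {
        let (tx, rx) = mpsc::channel::<u32>();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            // A job stays on this thread from start to finish.
            let mut handled = Vec::new();
            for job in rx {
                handled.push(job);
            }
            handled
        }));
    }

    // Divide the work up front: job i goes to loop i % loops.
    for (i, job) in jobs.into_iter().enumerate() {
        senders[i % loops].send(job).expect("loop thread exited early");
    }
    drop(senders); // closing the queues lets every loop finish

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The trade-off versus a shared pool is that nothing migrates between threads, so there is no cross-thread synchronization per job, but an unlucky split can leave one loop overloaded while another sits idle.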

Not sure what you mean by “regular threads interacting with regular APIs”. An async API will almost certainly expose a future-like or callback interface, which isn’t really a “regular” API - unless I misunderstood your meaning.

You can approximate a Java framework like netty with tokio.

naftulikay · April 10, 2018, 2:56am

My bad, on further reading of the OpenJDK source code, Java does not seem to use event loops.

My understanding was that the java.nio package provided non-blocking implementations of things like File, etc. where the threads would not block on I/O but rather schedule it and proceed to pick up whatever the thread pool should be doing, with some other thread picking things up when the I/O request finished.

I feel like I read something way back when that described it like this, but it doesn't seem to be doing this implicitly. There are Channel APIs in java.nio for doing something similar to what Tokio does.

All things aside, which model would give me the highest theoretical throughput on I/O bound tasks?

My understanding of the single core, many thread model:

[diagram not preserved]

My understanding of the multi core, many thread model:

Is this an accurate understanding of the layout? If so, how do I construct each of these in Tokio?

It might make more sense if I try to describe what I'd like:

I would like a thread-pool which executes a number of HTTP client requests. In order to maximize my throughput, I would like it to be possible that a poll operation could be created on thread A but continue execution on thread B.
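
To show what I mean by "created on thread A, continued on thread B", here is a std-only sketch, not Tokio. The half-finished state of a job is sent over a channel and completed elsewhere. `Partial` and `start_here_finish_there` are made-up names, and the "work" is just measuring a URL's length.

```rust
use std::sync::mpsc;
use std::thread::{self, ThreadId};

/// Hypothetical half-finished task: partial state plus where it started.
struct Partial {
    url_len: usize,
    started_on: ThreadId,
}

/// Starts a job on thread A, hands its partial state to thread B, and returns
/// (thread that started it, thread that finished it, result).
fn start_here_finish_there(url: &str) -> (ThreadId, ThreadId, usize) {
    let (tx, rx) = mpsc::channel::<Partial>();
    let url = url.to_owned();

    // Thread A: begins the work, then gives it away.
    let a = thread::spawn(move || {
        let partial = Partial {
            url_len: url.len(),
            started_on: thread::current().id(),
        };
        tx.send(partial).unwrap();
    });

    // Thread B: picks up whatever A started and completes it.
    let b = thread::spawn(move || {
        let partial = rx.recv().unwrap();
        (partial.started_on, thread::current().id(), partial.url_len)
    });

    a.join().unwrap();
    b.join().unwrap()
}
```

This only compiles because `Partial` is `Send`. I assume any executor that moves work between threads needs the same guarantee from the tasks it runs.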

inejge · April 10, 2018, 9:29am

I agree with @vitalyd, above: multiple event loops, each running together with an executor on its own OS thread. Tokio 0.1 gives you this model and no other, 0.2 defaults to single I/O reactor + threadpool executor but can be made to work like 0.1. The 0.2 default currently seems to perform suboptimally for I/O heavy workloads, but will no doubt be further tuned.
