We are writing a web service endpoint. It's a rewrite of a legacy endpoint that receives a lot of connections, so we decided to use Tokio/Hyper to optimize it.
Currently we have three types of "first-level" async tasks that are spawned:
- acceptor (single task): accepts incoming connections and puts them on an async_channel
- worker (n tasks): retrieves incoming connections from the channel, processes them, and creates the responses
- utility (single task): for now only handles signals (shutdown, reload configuration)
The first iteration of the server used #[tokio::main] to spawn a single multi-threaded runtime. Someone then pointed us to the article "Building Robust Servers with Async Rust and Tokio", which suggests creating three separate runtimes: acceptor (1 thread), worker (n threads), and utility (1 thread). In particular, the article describes scenarios in which a single thread pool is problematic because the acceptor task can be starved.
I didn't find a lot of information on the topic. It seems creating and managing separate tokio runtimes manually increases code complexity, so we'd like to avoid that if possible. We're unsure if the points made in the article are really valid for our use case.
The article claims that all threads get used up easily (too many connections, or the database getting slow); see its section "Some Scenarios and Failures". I think this is not quite true when the workers never do CPU-bound work (i.e. never block the runtime) and only do IO-bound work. A single runtime should be able to scale to many concurrent connections while still being able to accept new ones. This seems to be supported by the Tokio docs, which state: "If the total number of tasks does not grow without bound, and no task is blocking the thread, then it is guaranteed that tasks are scheduled fairly.".
We are currently considering three options:
- Use a single runtime via the #[tokio::main] macro.
- Create one runtime (with n threads) for the workers and one runtime (maybe Current-Thread Scheduler, or Multi-Thread Scheduler with only 1 thread?) for the acceptor.
- Three Multi-Thread Scheduler runtimes (as the article suggests): workers (n threads), acceptor (1 thread), utility (1 thread).
To complicate things even more, another article, "Scalable server design in Rust with Tokio", has been brought into the discussion. It suggests setting SO_REUSEPORT on the listening socket so that no single acceptor task is needed, and claims that this scales much better. (If I understand correctly, this essentially results in load balancing at kernel level, since SO_REUSEPORT makes the TCP/IP stack distribute incoming connections among the worker threads.)
What are best practices for our scenario? Most sources recommend creating a single runtime and don't even touch the topic of creating multiple runtimes manually.