One or multiple runtimes for Tokio Webserver

We are writing a web service endpoint. It's a rewrite of a legacy endpoint that receives a lot of connections, so we decided to use Tokio/Hyper to optimize it.

Currently we have three types of "first-level" async tasks that are spawned:

  • acceptor (single task): accepts incoming connections and puts them on an async_channel
  • worker (n tasks): retrieves incoming connections from the channel and processes them/creates the response
  • utility (single task): for now only handles signals (shutdown, reload configuration)

The first iteration of the server used #[tokio::main] to spawn a single multi-threaded runtime. Someone then pointed us to the article Building Robust Servers with Async Rust and Tokio, which suggests creating three separate runtimes: acceptor (1 thread), worker (n threads), and utility (1 thread). In particular, the article describes scenarios in which a single thread pool is problematic, for example because the acceptor task can be starved.

I didn't find a lot of information on the topic. It seems creating and managing separate tokio runtimes manually increases code complexity, so we'd like to avoid that if possible. We're unsure if the points made in the article are really valid for our use case.

The article claims that all threads get used up easily, for example when there are too many connections or the database is getting slow (see its section "Some Scenarios and Failures"). I think this is not quite true when the workers never do CPU-bound work (i.e. never block the runtime) and only do IO-bound work. A single runtime should be able to scale to many concurrent connections while still being able to accept new ones. This seems to be supported by the Tokio docs, which state: "If the total number of tasks does not grow without bound, and no task is blocking the thread, then it is guaranteed that tasks are scheduled fairly."

We are currently considering three options:

  1. Use a single runtime/use the #[tokio::main] macro.
  2. Create one runtime (with n threads) for the workers and one runtime (maybe a Current-Thread Scheduler, or a Multi-Thread Scheduler with only 1 thread?) for the acceptor.
  3. Three Multi-Thread Scheduler runtimes (as the article suggests): workers (n threads), acceptor (1 thread), utility (1 thread).

To complicate things even more, another article, "Scalable server design in Rust with Tokio", has been brought into the discussion. It suggests setting SO_REUSEPORT on the listening socket so that no single acceptor task is needed, and claims that this scales much better. (If I understand correctly, this essentially results in load balancing at the kernel level: with SO_REUSEPORT, the TCP/IP stack distributes incoming connections among the listening sockets of the worker threads.)

What are best practices for our scenario? Most sources recommend creating a single runtime and don't even touch the topic of creating multiple runtimes manually.

If your tasks are well-behaved (not running sync blocking code for too long) then a single runtime should be sufficient.

I have some projects where my code is messy and can be running big CPU-bound tasks or non-async sqlite queries that sometimes take multiple seconds. For such code I use a separate sacrificial runtime that will get blocked and have high latency.
