Unexpected behavior of tokio multithreaded runtime with a single worker thread

So, I thought I understood the difference between a multi-threaded runtime with a single worker thread and a single-threaded one, namely that the st runtime never spawns more threads and "everything has to happen on the main thread" (oversimplifying).
Then I tried the following snippet:

async fn hogger() {
    let my_id = std::thread::current().id();
    loop {
        println!("{my_id:?} Start hogging");
        std::thread::sleep(std::time::Duration::from_secs(1));
        println!("{my_id:?} End hogging");
    }
}

// The only difference when switching to the single-threaded runtime
// is that I don't get two different thread ids...
//#[tokio::main(flavor = "current_thread")]
#[tokio::main(flavor = "multi_thread", worker_threads = 1)]
async fn main() {
    let my_id = std::thread::current().id();
    tokio::task::spawn(hogger());
    loop {
        println!("------- {my_id:?} async sleep");
        tokio::time::sleep(std::time::Duration::from_secs(1)).await;
        println!("------- {my_id:?} resuming");
    }
}

This is the output I get:

------- ThreadId(1) async sleep
ThreadId(2) Start hogging
ThreadId(2) End hogging
ThreadId(2) Start hogging
ThreadId(2) End hogging
ThreadId(2) Start hogging
ThreadId(2) End hogging
ThreadId(2) Start hogging
.
.

So it seems like the loop in main gets stuck and never gets past the await.
But I see that there are actually 2 threads in use, Thread(1) which is the main thread and Thread(2) which is the (single) worker thread spawned by the mt runtime.

Can someone help me understand why this happens?
In my mind the hogger task would really only block the worker Thread(2) while the (main) Thread(1) would still be able to let its task be awoken and happily loop.

I verified that I get the expected behavior by increasing the number of worker threads to 2, as the corresponding output confirms:

****** WITH 2 WORKER THREADS ******
------- ThreadId(1) async sleep
ThreadId(2) Start hogging
------- ThreadId(1) resuming
------- ThreadId(1) async sleep
ThreadId(2) End hogging
ThreadId(2) Start hogging
------- ThreadId(1) resuming
------- ThreadId(1) async sleep
ThreadId(2) End hogging
ThreadId(2) Start hogging
.
.

This really confuses me.

Why is main thread + a single worker thread not enough to make this work?

TIA

For the very same reason it's not enough to have a CEO and one single turner working in a factory with two lathes. The CEO is not supposed to do real work; their job is to coordinate the work of others.

Sure, if you have very few workers (a family artel, e.g.) then you may not need a dedicated manager, but if you want to run a big factory then you need one.

Similarly in computers: developing a separate mode where the “control thread” (aka “main thread”) is not used and worker threads do a complicated and fragile dance to ensure that no one sleeps is possible… but what would it accomplish? You wouldn't even save resources, because all that special dance for a special low-threads mode would need more memory to execute more code!

That's why “main thread + single worker thread” works precisely like almost everyone expects: the main thread is the manager, worker threads do the work.

The big question here: why, in your mind, should the special main thread be complicated by doing two different kinds of jobs at once?

In a single-threaded executor that complexity is justified (we don't want to spawn many threads, and if you only have one thread that does everything then the design is, ultimately, simpler than when you have more than one thread), but why would a multi-threaded executor do that? Why would we try to have dual-purpose threads and add lots of fragile code to enable a two-threads-only (or, more like, only-few-threads) executor? We have already committed ourselves to having more than one thread! What would that complication accomplish?

It's because main is waiting for the worker thread to tell it that the timer has expired (Tokio keeps track of timers and IO on worker threads only), but the worker thread is stuck in hogger.
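To make that dependency concrete, here is a toy model using only the standard library. It is not Tokio's implementation: the function name `main_was_woken`, the channel, and the `patience` parameter are all made up for illustration. The spawned thread plays the single worker, which is the only party able to deliver the timer wake-up, and the caller plays main, which can only wait for it. In the real snippet `hogger` never yields at all; the model hogs for just one second so that the program terminates.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Toy model, not Tokio's implementation: the "worker" thread is the only one
/// that can deliver the timer wake-up to "main". Returns true if main was
/// woken within `patience`.
fn main_was_woken(worker_is_hogged: bool, patience: Duration) -> bool {
    let (wake_tx, wake_rx) = mpsc::channel::<()>();

    thread::spawn(move || {
        if worker_is_hogged {
            // Stands in for the blocking `std::thread::sleep` in `hogger`:
            // the worker cannot do anything else while this runs.
            thread::sleep(Duration::from_secs(1));
        }
        // Stands in for the timer driver firing main's timer.
        // The send fails harmlessly if main has already given up.
        let _ = wake_tx.send(());
    });

    // Main can only wait; it does not drive the timer itself.
    wake_rx.recv_timeout(patience).is_ok()
}

fn main() {
    let patience = Duration::from_millis(200);
    println!("free worker:   woken = {}", main_was_woken(false, patience));
    println!("hogged worker: woken = {}", main_was_woken(true, patience));
}
```

With a free worker the wake-up arrives almost immediately; with a hogged worker main gives up waiting, which mirrors the loop in main never getting past the await.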

Ok, so IIUC, in a st runtime the timer/IO driver runs on the only available main thread, while in a mt runtime with one worker thread it runs on that worker thread, and that causes the problem since the driver is now unable to notify timers about completion.

But then this raises a curiosity: when there are N > 1 worker threads, does each of them run a timer/IO driver that handles only resources created on that thread or is there always a single driver that runs on a single worker?

For the current multithreaded Tokio runtime, it is the latter. The IO and timer drivers are specific to a runtime instance and are shared between its worker threads, who try to dispatch timer and IO events before they are parked. This section of the docs describes the scheduling behaviour in more detail.

The IO/timer driver is only on one thread, but it will move between threads. Whenever a worker goes to sleep, it will check if any thread is currently sleeping on the driver, and if not, then that thread will go to sleep on the driver.

Specifically, "go to sleep on the driver" means "call epoll with a timeout equal to the smallest timer".
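The hand-off described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not Tokio's code: `Driver`, `Park`, and `park` are hypothetical names, a `Mutex` with `try_lock` stands in for "is anyone currently sleeping on the driver?", and each timer is represented only by the time remaining until it fires. A real driver would pass the computed timeout to epoll instead of returning it.

```rust
use std::sync::Mutex;
use std::time::Duration;

/// Hypothetical stand-in for the shared timer/IO driver: just the remaining
/// time until each registered timer fires.
struct Driver {
    timers: Vec<Duration>,
}

/// What a worker decides to do when it runs out of tasks.
#[derive(Debug, PartialEq)]
enum Park {
    /// This worker sleeps on the driver; `None` means no timers are pending.
    OnDriver(Option<Duration>),
    /// Another worker already holds the driver, so just sleep until notified.
    Plain,
}

fn park(driver: &Mutex<Driver>) -> Park {
    match driver.try_lock() {
        // Nobody is sleeping on the driver: take it over. The timeout is the
        // smallest pending timer.
        Ok(guard) => Park::OnDriver(guard.timers.iter().min().copied()),
        // Someone else holds the driver: park without it.
        Err(_) => Park::Plain,
    }
}

fn main() {
    let driver = Mutex::new(Driver {
        timers: vec![Duration::from_secs(5), Duration::from_secs(1)],
    });

    // No one holds the driver yet, so this worker would sleep on it.
    println!("{:?}", park(&driver));

    // Simulate another worker currently sleeping on the driver.
    let _held = driver.lock().unwrap();
    println!("{:?}", park(&driver));
}
```

In this model a worker that never goes to sleep never calls `park`, so with a single hogged worker nobody ever sleeps on the driver and no timer is ever fired.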
