Tonic gRPC server: handling one blocking gRPC request may block all the worker threads

Here is my server code; it differs only slightly from the tonic example:

use std::thread::{self, ThreadId};
use hello_world::hello_server::{Hello, HelloServer};
use hello_world::{HelloReply, HelloRequest};
use tonic::{transport::Server, Request, Response, Status};

pub mod hello_world {

#[derive(Debug, Default, Clone)]
pub struct MyGreeter {}

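#[tonic::async_trait]
impl Hello for MyGreeter {
    async fn say_hello(
        &self,
        request: Request<HelloRequest>,
    ) -> Result<Response<HelloReply>, Status> {
        println!("grpc request at thread: {:?}", get_thread_id());

        // Blocking (non-async) sleep: holds this worker thread for 20 s.
        thread::sleep(std::time::Duration::from_secs(20));

        let reply = hello_world::HelloReply {
            message: format!("Hello {}!", request.into_inner().name).into(),
        };

        Ok(Response::new(reply))
    }
}
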
// #[tokio::main]
#[tokio::main(flavor = "multi_thread", worker_threads = 16)]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    //task A
    tokio::spawn(async {
        loop {
            println!("hello from: {:?}", get_thread_id())

    let addr = "".parse()?;
    let svc = HelloServer::new(MyGreeter {});

    Server::builder().add_service(svc).serve(addr).await?;

    Ok(())
}

fn get_thread_id() -> ThreadId {
    thread::current().id()
}

Task A, spawned from main, prints "hello" in a loop.
Calling "say_hello" blocks its thread for 20s. When "say_hello" blocks one worker thread, task A often (not always) stops being scheduled too, and if I make another call to "say_hello", the new call is not scheduled either. It looks as if all the worker threads are blocked!

I do use std::thread::sleep (not an async sleep) in "say_hello", but that call should block only one worker thread. I have 15 worker threads left, and those 15 should handle task A or a new call as normal.

Note that blocking in an async fn/Future is a bad idea in general; try to avoid it if possible, regardless of how many worker threads you have. If you want to sleep in an async fn, use tokio::time::sleep, which does so without blocking the executor.

Thanks for replying.
std::thread::sleep stands in for some unavoidable blocking operations; I wonder why they block all the worker threads.
If such unavoidable blocking operations inevitably block the runtime, even for a brief period, isn't that a problem?

That's why you avoid it.

Here's an article about blocking that covers some alternatives ("What if I want to block?").

Thanks for replying. I know how to handle blocking tasks; my confusion is that a 16-thread tokio runtime is blocked by a single blocking task.

In this article:

By using tokio::join!, all three tasks are guaranteed to run on the same thread, but if you replace it with tokio::spawn and use a multi-threaded runtime, you will be able to run multiple blocking tasks until you run out of threads. The default Tokio runtime spawns one thread per CPU core, and you will typically have around 8 CPU cores. This is enough that you can miss the issue when testing locally, but sufficiently few that you will very quickly run out of threads when running the code for real.

I totally agree with this. Blocking code will very quickly run out of threads, and once no thread is left the runtime is blocked. But I make only one gRPC request (tokio::spawn) to say_hello, so it should block only one thread, leaving 15 threads working as normal to handle task A or new requests. That is not what happens: as I said, a 16-thread tokio runtime is blocked by a single blocking task. I'm confused by this.

The tokio runtime still expects individual tasks to yield within a reasonable amount of time, and the logic that splits work between threads probably relies on this. See "One bad task can halt all executor progress forever", Issue #4730 in tokio-rs/tokio on GitHub.

Use spawn_blocking if you want to do work that may block the thread.


Thanks a lot.
