Short sync tasks blocking the tokio scheduler

I have an axum async handler as follows (simplified):

async fn put_handler(Path(method_id): Path<u16>, Query(params): Query<Params>, header_map: HeaderMap, body: Bytes) -> impl IntoResponse {
    if CACHES.put(method_id, params.key.unwrap(), params.value.unwrap()).is_ok() {
        StatusCode::OK
    } else {
        StatusCode::INTERNAL_SERVER_ERROR
    }
}

'CACHES.put' is ultrafast, in the nanoseconds range. That's why it is sync: it only performs a 'put' into a cache and copies the params.value content into an in-memory buffer.

I also have CACHES.get(key), which is async because it involves sending a message to a worker (running in spawn_blocking) and awaiting the response (on a oneshot channel) that comes from disk. This is in the microseconds range (usually < 500 µs) and, occasionally, a few milliseconds.
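Roughly, the get path follows this pattern (a minimal sketch of what I described above; the DiskRequest type, the std::sync::mpsc request channel and read_from_disk are illustrative placeholders, not my real code):

use tokio::sync::oneshot;

struct DiskRequest {
    key: String,
    reply: oneshot::Sender<Option<Vec<u8>>>,
}

fn spawn_disk_worker() -> std::sync::mpsc::Sender<DiskRequest> {
    let (tx, rx) = std::sync::mpsc::channel::<DiskRequest>();
    // The worker runs on the blocking pool, so the disk reads never run on the async workers.
    tokio::task::spawn_blocking(move || {
        while let Ok(req) = rx.recv() {
            let result = read_from_disk(&req.key); // blocking disk read
            let _ = req.reply.send(result);        // the receiver may already be gone
        }
    });
    tx
}

async fn get(requests: &std::sync::mpsc::Sender<DiskRequest>, key: String) -> Option<Vec<u8>> {
    let (reply_tx, reply_rx) = oneshot::channel();
    requests.send(DiskRequest { key, reply: reply_tx }).ok()?;
    // This await is where I measure the latency that keeps growing.
    reply_rx.await.ok().flatten()
}

fn read_from_disk(_key: &str) -> Option<Vec<u8>> {
    None // placeholder for the real disk read
}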

async fn get_handler(Path(method_id): Path<u16>, Query(params): Query<Params>) -> Response<Body> {
    match CACHES.get(method_id, params.k.unwrap()).await {
        Ok((delay, maybe_document)) => {
            if let Some(document) = maybe_document {
                return (StatusCode::OK, document).into_response();
            }

            // 1. Not found and no 'delay'.
            if delay == 0 {
                return StatusCode::NOT_FOUND.into_response()
            }

            // 2. Not found yet, but it will be there soon --> return 'delay'.
            let header = [(constants::DELAY_SLEEP_HEADER_NAME, delay.to_string())];
            (StatusCode::NOT_FOUND, header).into_response()
        },
        Err(CacheError::CacheNotFound(_)) => {
            StatusCode::NOT_FOUND.into_response()
        },
        Err(_) => {
            StatusCode::INTERNAL_SERVER_ERROR.into_response()
        }
    }
}

When I run my tests, if my requests are mostly 'get', everything works fine. But as I increase the number of 'put' requests, the latency of notifying the oneshot (from the spawn_blocking worker to the async 'get' that is awaiting the response) grows and grows.
The more 'puts' I issue, the higher the latency.

I suspect this increasing number of sync 'puts' is starving the tokio scheduler, so it gets harder and harder for Tokio to wake up the tasks that are awaiting a notification.

So, is there any way to keep these 'puts' from interfering with the tokio scheduler?

I read something about block_in_place, but I'm not sure whether it is the best solution.

Thanks.

How large is the latency increase you're seeing?

If put takes nanoseconds, your latency should increase by about as much (times the number of put requests in the queue waiting to be executed on the same thread).

block_in_place() moves tasks queued for the current thread to another thread. That is an overhead in itself, so it may not help for very fast calls.
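For reference, using it would look roughly like this (a sketch based on your handler above; it assumes the multi-threaded runtime, since block_in_place panics on the current-thread runtime):

use axum::{extract::{Path, Query}, http::StatusCode, response::IntoResponse};
use tokio::task::block_in_place;

// Sketch: the same put handler, with the synchronous put wrapped in block_in_place.
// The wrapper itself costs something (the tasks queued on the current worker are
// handed off to another thread), so for a nanosecond-scale call it may not pay off.
async fn put_handler(
    Path(method_id): Path<u16>,
    Query(params): Query<Params>,
    // HeaderMap and Bytes extractors omitted for brevity.
) -> impl IntoResponse {
    let ok = block_in_place(|| {
        CACHES.put(method_id, params.key.unwrap(), params.value.unwrap()).is_ok()
    });
    if ok { StatusCode::OK } else { StatusCode::INTERNAL_SERVER_ERROR }
}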

Maybe try profiling the issue? Maybe some assumptions about put are wrong? Sometimes, on a page fault, the OS can start doing disk I/O, for example.

What it sounds like to me is that CACHES.put is slower than you think it is.

You can try using TaskMonitor (from the tokio-metrics crate) to measure these numbers. Another option would be to move CACHES.put into spawn_blocking and see how that affects the latencies.
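A rough sketch of the first option, assuming the tokio-metrics crate (the interesting number for your case is mean_scheduled_duration, the time a task spends between being woken and actually being polled):

use std::time::Duration;
use tokio_metrics::TaskMonitor;

async fn run_instrumented() {
    let monitor = TaskMonitor::new();

    // Print the per-interval metrics once per second.
    {
        let monitor = monitor.clone();
        tokio::spawn(async move {
            for metrics in monitor.intervals() {
                println!(
                    "mean_scheduled = {:?}, mean_poll = {:?}",
                    metrics.mean_scheduled_duration(),
                    metrics.mean_poll_duration()
                );
                tokio::time::sleep(Duration::from_secs(1)).await;
            }
        });
    }

    // Wrap the future you want to measure (e.g. the body of the get handler).
    monitor
        .instrument(async {
            // ... awaiting the oneshot response goes here ...
        })
        .await;
}

If mean_scheduled_duration grows with the put load, the scheduler really is the bottleneck; if it stays flat, the delay is somewhere else.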

Sorry for the late response, I was working on it ... until I discovered this latency had nothing to do with the wake-up of any task; it was due to a misunderstanding of how the io_uring submit method works. That gave me the impression that the delay was in the oneshot await, but after adding traces everywhere the cause turned out to be elsewhere. Thanks anyway.