Use of async - where to draw the line

geebee22 · November 17, 2021, 5:14pm

I am just trying to get my head around the issue of where to use async. Would it be correct to say that there is a certain amount of extra overhead when calling an async function?

I only just started thinking about the issue. I don't really think having too many threads is likely to be an issue for the applications I have in mind. Or should I make all the functions that may perform IO (typically file reads) async? It seems more sensible to me at this stage to use tokio::task::spawn_blocking rather than doing that.

alice · November 17, 2021, 5:20pm

Well, what sort of stuff are you doing? You mention file IO, and putting file IO in spawn_blocking or entirely outside async code is very reasonable.

geebee22 · November 17, 2021, 5:24pm

The web server I posted on recently. My thinking is to use async for the network IO (well, Axum does all that for me anyway), but do the database query evaluation in sync code. My get handler currently looks like this:

/// Handler for http GET requests.
async fn h_get(
    state: Extension<Arc<SharedState>>,
    path: Path<String>,
    params: Query<HashMap<String, String>>,
    cookies: Cookies,
) -> ServerQuery {
    // Build the ServerQuery.
    let mut sq = ServerQuery::new();
    sq.x.path = path.0;
    sq.x.params = params.0;
    sq.x.cookies = map_cookies(cookies);

    // Run the synchronous query on Tokio's blocking thread pool.
    let blocking_task = tokio::task::spawn_blocking(move || {
        // GET requests should be read-only.
        let stg = Box::new(state.stg.open_read());
        let db = Database::new(stg, "");
        db.run_timed("EXEC web.Main()", &mut *sq.x);
        sq
    });
    // unwrap() panics here if the blocking task itself panicked.
    blocking_task.await.unwrap()
}

alice · November 17, 2021, 5:25pm

It seems fine to do it that way.

jbe · November 17, 2021, 6:15pm

I might resort to tokio::task::spawn_blocking in a different scenario, where I want to execute (user-provided) Lua scripts (which don't perform any blocking I/O) in an async program. They may still run for a while and thus keep other code from being executed.

In my case, I had the idea to install a hook that runs every x-thousand VM instructions, which could then yield from Lua regularly.

Apart from having to invest extra effort in making my Lua execution yield regularly, I'm not sure if it's really worth the effort in my use case, as it would also impose overhead, and the overhead of using spawn_blocking might be way smaller.

But there is one other thing I wonder about: there is a maximum limit on blocking threads. Thus, depending on what the blocking threads do, I wonder if it's possible that this might cause deadlocks (e.g. if 512 threads wait on the result of a 513th task which will never get executed until one of the 512 threads finishes its work). But maybe that's not an issue in 99% of all application cases.
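A scaled-down sketch of that scenario, using only the standard library. This is not Tokio's blocking pool: start_pool and any_waiter_starved are hypothetical names, the pool sizes are made up, and the timeout exists only so the demo terminates instead of hanging.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical fixed-size pool: `workers` threads pull jobs from one shared queue.
fn start_pool(workers: usize) -> (Sender<Job>, Vec<JoinHandle<()>>) {
    let (tx, rx) = channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));
    let handles = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // The lock is held only while taking a job, not while running it.
                let next = rx.lock().unwrap().recv();
                match next {
                    Ok(job) => job(),
                    Err(_) => break, // queue closed, shut the worker down
                }
            })
        })
        .collect();
    (tx, handles)
}

/// Queue `waiters` jobs that each wait for a signal, then one extra job that
/// sends the signals. Returns true if any waiter gave up after `timeout`.
fn any_waiter_starved(workers: usize, waiters: usize, timeout: Duration) -> bool {
    let (pool, handles) = start_pool(workers);
    let (report_tx, report_rx) = channel::<bool>();
    let mut signal_txs = Vec::new();

    for _ in 0..waiters {
        let (signal_tx, signal_rx) = channel::<()>();
        signal_txs.push(signal_tx);
        let report_tx = report_tx.clone();
        pool.send(Box::new(move || {
            // Without the timeout this wait could block forever.
            let starved = signal_rx.recv_timeout(timeout).is_err();
            let _ = report_tx.send(starved);
        }))
        .unwrap();
    }
    // The extra job that would release the waiters.
    pool.send(Box::new(move || {
        for tx in signal_txs {
            let _ = tx.send(());
        }
    }))
    .unwrap();

    drop(report_tx); // so the report iterator ends once every waiter is done
    let starved = report_rx.iter().any(|timed_out| timed_out);
    drop(pool); // close the queue so the workers exit
    for handle in handles {
        handle.join().unwrap();
    }
    starved
}

fn main() {
    let timeout = Duration::from_millis(200);
    // 2 workers, 2 waiters: the releasing job is stuck behind the jobs waiting for it.
    println!("pool of 2: starved = {}", any_waiter_starved(2, 2, timeout));
    // 3 workers: one worker stays free to run the releasing job.
    println!("pool of 3: starved = {}", any_waiter_starved(3, 2, timeout));
}
```

With as many waiters as workers, the releasing job cannot start until a waiter gives up, so without the timeout the pool would hang. With one spare worker the waiters should be released almost immediately.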

geebee22 · November 17, 2021, 6:27pm

I wonder what the rationale for the default limit of 512 is. I believe 64-bit Windows can have in excess of 50,000 threads, although I doubt whether that is reasonable or healthy.

alice · November 17, 2021, 6:50pm

We do need to have some limit, because if we don't then any program that spawns at a higher rate than the tasks can finish will run out of resources rather quickly and crash. It's a form of backpressure.

As for why exactly we went with 512: well, it's a decently large number and seems reasonable.
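To illustrate the general idea of a limit acting as backpressure, here is a minimal std-only sketch. It is not how Tokio's blocking pool works internally (as far as I know, Tokio queues extra spawn_blocking tasks until a thread is free rather than rejecting them). The function name submit_all and the capacity of 2 are made up for the example.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Try to queue `jobs` items on a bounded queue of size `capacity` while no
/// consumer is draining it. Returns (accepted, rejected).
fn submit_all(capacity: usize, jobs: usize) -> (usize, usize) {
    // `_rx` keeps the receiving end alive for the duration of the function.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    let mut rejected = 0;
    for job in 0..jobs {
        match tx.try_send(job) {
            Ok(()) => accepted += 1,
            // The queue is full: the producer is told to slow down
            // instead of the queue growing without bound.
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => unreachable!("receiver is still alive"),
        }
    }
    (accepted, rejected)
}

fn main() {
    let (accepted, rejected) = submit_all(2, 5);
    println!("accepted {accepted}, rejected {rejected}"); // accepted 2, rejected 3
}
```

A producer that outpaces the consumer finds out at the third submission, rather than exhausting memory or threads later on.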

newpavlov · November 17, 2021, 10:09pm

If you don't plan to support tens of thousands of simultaneous connections, then purely synchronous code should be simpler to write and reason about, especially considering that your DB API is synchronous as well.

geebee22 · November 17, 2021, 10:32pm

Presumably async code can actually block for a short period of time due to virtual memory page misses. Processing a database query is similar really - it may block due to the data not being in memory. I don't think it makes sense to process more than a moderate number of queries at the same time, it would probably slow the throughput rather than increase it - there are inherent limits to how much a computer can do in parallel.

So I am now convinced that making the query processing code synchronous is the correct approach. Equally, I think it makes good sense to make the network IO async. What happened is I had a sudden panic: do I have to rewrite all my code as async? I think the answer is a clear no - it would actually be counter-productive.

geebee22 · November 17, 2021, 10:47pm

That wouldn't happen in my case, as the threads are independent (albeit they are reading from a common pool of data). What could happen in principle is that a large number of long-running read-only transactions could prevent a short-running read-only transaction from starting. So some kind of "fairness" issue. However, I don't think it would be a problem in reality. I doubt there would reasonably be more than a handful of long-running read-only transactions running at the same time, let alone 512.

system · February 15, 2022, 10:47pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
