Async with multithreading?

As I understand it, an async function returns a type that implements the Future trait, which is an abstraction over a deferred computation (in practice, a compiler-generated state machine). Now, assume we have a multithreaded server, say the one in The Book, for demonstration purposes. What would be the approach to make it asynchronous?

What I initially thought was this:

  1. A request is received
  2. The TcpStream is sent to a worker thread which asynchronously reads the request
  3. The worker thread spawns a blocking task to process the request. Note: This step does disk I/O
  4. The worker thread writes to the stream, asynchronously

But I still can't outline an implementation. Can anyone point me in the right direction?

tokio or async-std takes care of the multiple threads. All your code has to do is spawn a task for each newly accepted stream.

If a single stream needs more processing performance than one thread can give it, look into independent synchronous thread pools that channel their results back to the async code.

You usually have a loop that receives new connections and spawns a task for each. The task can then do whatever it wants with the connection.

But, if I understand correctly, a task is for lightweight functions and not blocking operations. As I had mentioned earlier, the processing of the request is a blocking task - so should I simply do something like task::spawn_blocking (tokio) once I've received the request? But wouldn't that entirely defeat the purpose of having async code?

When you say the processing of the request is a blocking task, what exactly do you mean? Do you mean that code you’ve already written is synchronous?

Also generally, how long does the processing take and how frequent are the incoming requests?

My general inclination would be to make some sort of threadpool for synchronous processing of requests and then use channels to send/receive between the threadpool and the async tasks. But maybe that might be overkill for what you’re doing.
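To make that suggestion concrete, here is a minimal sketch of a synchronous pool fed through channels, using only the standard library. The names `Job`, `process` and `start_pool` are made up for illustration, and `process` is just a stand-in for the real disk-bound work. In an actual async server the reply channel would presumably be an async-aware one (tokio's `oneshot`, for example) so the task can await the result instead of blocking on `recv`.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// A job carries the request bytes plus a channel on which to send the reply.
// (Hypothetical type, not part of any library.)
struct Job {
    request: Vec<u8>,
    reply: Sender<Vec<u8>>,
}

// Hypothetical stand-in for the blocking, disk-bound processing step.
fn process(request: &[u8]) -> Vec<u8> {
    request.to_ascii_uppercase()
}

// Start `workers` threads that all pull jobs from one shared queue.
fn start_pool(workers: usize) -> Sender<Job> {
    let (tx, rx) = mpsc::channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        thread::spawn(move || loop {
            // The lock is released at the end of this statement, before
            // the job is processed, so other workers can pick up jobs.
            let job = match rx.lock().unwrap().recv() {
                Ok(job) => job,
                Err(_) => break, // every sender was dropped: shut down
            };
            // Ignore the error if the requester has gone away.
            let _ = job.reply.send(process(&job.request));
        });
    }
    tx
}

fn main() {
    let pool = start_pool(4);
    let (reply_tx, reply_rx) = mpsc::channel();
    pool.send(Job { request: b"hello".to_vec(), reply: reply_tx })
        .unwrap();
    let response = reply_rx.recv().unwrap();
    println!("{}", String::from_utf8_lossy(&response)); // prints "HELLO"
}
```

The pool size caps how many requests are processed at once, which is the main thing this buys you over spawning a fresh thread per request.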


If you need to run blocking code for some reason, then yes it must be wrapped in spawn_blocking or similar. You can still gain an advantage from async, though. Consider the following scenario:

  1. Connection is received.
  2. Blocking task is spawned, and computes something.
  3. Blocking task is done, and sends a Vec<u8> of data to async code.
  4. The vector is written into the connection asynchronously.

In this case, you need a separate thread for every concurrently running blocking operation in step two, but step four is performed asynchronously, and can run on a single thread even for a large number of connections. Since writing the data can potentially take hundreds of milliseconds to several seconds due to network latency, removing the need for a thread per connection in that step would be an advantage.
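Steps two and three can be illustrated with a toy model built only on the standard library. It is not how tokio implements `spawn_blocking`. `run_blocking`, `BlockingHandle` and `block_on` are hypothetical names invented for this sketch, and a real runtime would reuse pooled threads and drive many tasks at once. The point is only that the blocking work occupies its own thread while the async side awaits an ordinary future.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// Slot shared between the blocking thread and the future awaiting it.
struct Slot {
    result: Option<Vec<u8>>,
    waker: Option<Waker>,
}

// Hypothetical handle, loosely modelled on what a spawn_blocking-style
// call hands back to async code.
struct BlockingHandle(Arc<Mutex<Slot>>);

impl Future for BlockingHandle {
    type Output = Vec<u8>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Vec<u8>> {
        let mut slot = self.0.lock().unwrap();
        match slot.result.take() {
            Some(bytes) => Poll::Ready(bytes),
            None => {
                // Not finished yet: remember who to wake when it is.
                slot.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

// Step two: run `work` on its own thread and return a future for the result.
fn run_blocking<F>(work: F) -> BlockingHandle
where
    F: FnOnce() -> Vec<u8> + Send + 'static,
{
    let slot = Arc::new(Mutex::new(Slot { result: None, waker: None }));
    let shared = Arc::clone(&slot);
    thread::spawn(move || {
        let bytes = work();
        // Step three: store the data, then wake the waiting task.
        let waker = {
            let mut guard = shared.lock().unwrap();
            guard.result = Some(bytes);
            guard.waker.take()
        };
        if let Some(waker) = waker {
            waker.wake();
        }
    });
    BlockingHandle(slot)
}

// Minimal single-future executor: park the thread until woken.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

// The async side: await the blocking result, then (in a real server)
// write it to the connection asynchronously.
async fn handle_request(payload: Vec<u8>) -> Vec<u8> {
    run_blocking(move || payload.iter().rev().copied().collect()).await
}

fn main() {
    let response = block_on(handle_request(b"abc".to_vec()));
    println!("{}", String::from_utf8_lossy(&response)); // prints "cba"
}
```

While `handle_request` is suspended at the `.await`, a real executor would be free to poll other connections on the same thread, which is where the saving in step four comes from.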


What I mean by a blocking task is something that will do disk I/O.

No, I don't mean synchronous code that I've already written.

The processing would take, say about 0.1s (as per my benchmarks) and the incoming requests will be very frequent - say 10k req/sec or more.

Yes, a thread pool with channels is the approach I'm planning to use.

I think it sounds fine to just use spawn_blocking for that. Since you have a lot of requests, try to batch the disk I/O into as few calls to spawn_blocking as possible, i.e. don't use tokio::fs::File, but instead use tokio::fs::read or spawn_blocking with the std File.
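As a rough illustration of "as few calls as possible": keep everything that touches the disk in one plain synchronous function, so the whole thing can be passed to a single spawn_blocking call rather than hopping to the blocking pool once per read. The sketch below uses only std. `read_whole_file` and the temp-file name are made up for the example, and the spawn_blocking call itself is left out so the snippet runs without any runtime.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

// All disk access for one request happens inside this one function:
// open, size lookup and read. It is the kind of closure body you would
// hand to a single blocking call.
fn read_whole_file(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // Pre-size the buffer from the file length to avoid regrowing it.
    let len = file.metadata()?.len() as usize;
    let mut buf = Vec::with_capacity(len);
    file.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file, created only for this demonstration.
    let path = std::env::temp_dir().join("spawn_blocking_sketch.txt");
    fs::write(&path, b"payload")?;

    let bytes = read_whole_file(&path)?;
    println!("{} bytes", bytes.len()); // prints "7 bytes"

    fs::remove_file(&path)?;
    Ok(())
}
```

With tokio, the call site would then look something like `spawn_blocking(move || read_whole_file(&path)).await`, so each request makes one trip to the blocking pool.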
