Async threadpool

Hi!

I have a system that until recently was using the threadpool crate to run some tasks on a maximum of 4 threads. These tasks are expensive (they read data from a database and process it, and the whole thing can take up to 10 seconds), which is why I was running them on a thread pool.

So far so good, the problem is that the function I need to call is now async, and I don't know how to do this in Tokio. It looks like there was a tokio thread_pool long ago, but it's now gone.

I cannot just tokio::spawn() these tasks, because they would all start hitting the database at the same time and they would start failing.

What would you suggest? Any directions/ideas? Thanks!

Can you elaborate on this?

If the database should only be accessed by a single task at a time, can't you use (async) locks for that?


I just noticed that you have computationally expensive tasks, so maybe tokio::spawn really isn't the best choice. How about using futures::executor::block_on (or tokio::runtime::Handle::block_on) in a blocking thread when you need to wait for a future to complete?

If I use a lock (a tokio Mutex?), I will only be able to run a single process at a time. But I'd like to have 4 concurrent processes, so the whole job finishes faster.

When the system starts, it might have hundreds of those tasks to run. But instead of running them one by one, I'd like to run them in blocks of 4.
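To make "at most 4 at a time" concrete, here is a minimal sketch using only the standard library: a fixed set of worker threads drains a shared queue, so no more than `workers` jobs are ever in flight. The name `run_in_pool`, the integer job ids, and the short sleep standing in for the database work are all hypothetical; this is not the threadpool crate's API and it does not cover the async part of the question.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Runs `job_count` placeholder jobs on `workers` threads.
/// Returns (jobs completed, highest number of jobs seen running at once).
fn run_in_pool(job_count: usize, workers: usize) -> (usize, usize) {
    // Shared queue of pending job ids (hypothetical stand-ins for real tasks).
    let queue = Arc::new(Mutex::new((0..job_count).collect::<VecDeque<usize>>()));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let done = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let running = Arc::clone(&running);
            let peak = Arc::clone(&peak);
            let done = Arc::clone(&done);
            thread::spawn(move || loop {
                // The lock guard is a temporary, so it is released at the
                // end of this statement, before the job itself runs.
                let next = queue.lock().unwrap().pop_front();
                let Some(_job_id) = next else { break };

                // Track how many jobs are running right now.
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);

                // Stand-in for "read from the database and process it".
                thread::sleep(Duration::from_millis(5));

                running.fetch_sub(1, Ordering::SeqCst);
                done.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    (done.load(Ordering::SeqCst), peak.load(Ordering::SeqCst))
}
```

Calling `run_in_pool(200, 4)` would process all 200 jobs while the second returned value never exceeds 4. Note that this keeps 4 jobs running continuously rather than waiting for each batch of 4 to finish, which is usually what you want when job durations vary.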

What each process does is basically to fetch some data from the database, transform it and submit that data as json to a remote server.

There are tokio::runtime::Builder::worker_threads and tokio::runtime::Builder::max_blocking_threads, but I'm not sure whether either is the right tool here. The tokio docs say:

CPU-bound tasks and blocking code

If your code is CPU-bound and you wish to limit the number of threads used to run it, you should use a separate thread pool dedicated to CPU bound tasks. For example, you could consider using the rayon library for CPU-bound tasks. It is also possible to create an extra Tokio runtime dedicated to CPU-bound tasks, but if you do this, you should be careful that the extra runtime runs only CPU-bound tasks, as IO-bound tasks on that runtime will behave poorly.

If you want to wait for a future to complete in a non-async function/thread, the block_on methods linked in my previous post could help, I think.


Yes, I see the problem now. A (counting) semaphore might be an alternative if you only want to limit the concurrent database access. (But not sure.)
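To illustrate the counting-semaphore idea, here is a rough sketch built only from the standard library's Mutex and Condvar. The `Semaphore` type and the `run_limited` helper are hypothetical and written just for this example; they block threads rather than suspend async tasks. In async code, tokio::sync::Semaphore plays the same role: a task awaits a permit before touching the database and gives it back when it is done.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// A tiny blocking counting semaphore (illustrative, not a library type).
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Blocks until a permit is free, then takes it.
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Returns a permit and wakes one waiting thread.
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Starts `task_count` threads at once, but lets only `limit` of them
/// into the guarded section at a time. Returns the observed peak.
fn run_limited(task_count: usize, limit: usize) -> usize {
    let sem = Arc::new(Semaphore::new(limit));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..task_count)
        .map(|_| {
            let sem = Arc::clone(&sem);
            let running = Arc::clone(&running);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                sem.acquire();
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                // Stand-in for the database access being limited.
                thread::sleep(Duration::from_millis(5));
                running.fetch_sub(1, Ordering::SeqCst);
                sem.release();
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

With `run_limited(100, 4)`, all 100 threads are spawned immediately, yet the returned peak stays at or below 4. Compared with a mutex, the only change is that the guarded section admits several holders instead of one.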
