I've been using Tokio and async for a few months, but I only recently became interested in I/O interactions. As a result, I still get confused when bridging async and non-async code.
I saw that `tokio::fs` uses `std::fs` inside a `tokio::task::spawn_blocking` call. I also read this:
> Tokio’s file uses `spawn_blocking` behind the scenes, and this has serious performance consequences. To get good performance with file IO on Tokio, it is recommended to batch your operations into as few `spawn_blocking` calls as possible.
So I have a few questions about the inner workings of `spawn_blocking`, for which I couldn't find answers:
- Can multiple blocking tasks spawned with `spawn_blocking` execute concurrently, or do they run sequentially?
- If concurrent blocking operations are possible, can a file be opened by multiple async tasks that each use `spawn_blocking` for their I/O operations?
- If so, what are ways to prevent concurrent writes to the file?
Another question has more to do with my understanding of what awaiting a `spawn_blocking` handle does to the task running on the blocking thread.
If I await the handle, I yield back to the scheduler, which then chooses another future to poll. Does that mean the task on the blocking thread stops? From my understanding it shouldn't. I read in a post by Alice Ryhl (Async: What is blocking?) that the blocking thread pool can hold up to about 500 threads.
Given that modern CPUs don't have 500 cores (maybe one day), if I spawn more blocking tasks than there are cores, some of them will have to stop at some point. Or does each one run until completion once it has started?
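To make this question concrete, here is a minimal sketch that uses plain `std::thread` as a stand-in for Tokio's blocking pool (that substitution is an assumption made so the example needs no external crate; the closures passed to `spawn_blocking` also run on ordinary OS threads). It starts more threads than there are cores and checks that every one of them still finishes: the operating system time-slices them, so a thread may be paused for a moment but it is not abandoned. The function name `run_oversubscribed` is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `n` OS threads that each do a little CPU work, joins them all,
/// and returns how many of them ran to completion.
fn run_oversubscribed(n: usize) -> usize {
    let finished = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..n)
        .map(|i| {
            let finished = Arc::clone(&finished);
            thread::spawn(move || {
                // Stand-in for blocking work such as a file read or write.
                let mut acc = i as u64;
                for k in 0..100_000u64 {
                    acc = acc.wrapping_mul(31).wrapping_add(k);
                }
                std::hint::black_box(acc);
                finished.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    // Joining only waits for the result; the threads were already running
    // before this loop started.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    finished.load(Ordering::SeqCst)
}

fn main() {
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let n = cores * 4; // deliberately more threads than cores
    let done = run_oversubscribed(n);
    println!("{done} of {n} threads finished on {cores} cores");
}
```

If this model carries over to Tokio, awaiting the `JoinHandle` would play the role of `join()` here: it waits for the result without affecting whether the blocking closure keeps running.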
My actual application has a file that is written to often and that I sometimes need to read. I already prevent concurrent writes with an `mpsc` channel: write requests are sent to a single task, which is the only one that writes to the file. But because of how the application is built, I can't put the read in the same task, so I can't guarantee that I won't open the file for reading while another task is writing to it.
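For reference, the single-writer setup looks roughly like the sketch below. It is a simplified, std-only version (using `std::sync::mpsc` and `std::thread` instead of Tokio's channel and tasks is an assumption made to keep it self-contained), and `spawn_writer`, the message type, and the file name are hypothetical. Several producers send lines through the channel, and only the writer thread ever touches the file.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::PathBuf;
use std::sync::mpsc;
use std::thread;

/// Starts the only thread that is allowed to write to `path`.
/// Every line received on the channel is appended to the file.
fn spawn_writer(path: PathBuf) -> (mpsc::Sender<String>, thread::JoinHandle<io::Result<()>>) {
    let (tx, rx) = mpsc::channel::<String>();

    let handle = thread::spawn(move || -> io::Result<()> {
        let mut file = OpenOptions::new().create(true).append(true).open(&path)?;
        // The loop ends once every `Sender` has been dropped.
        for line in rx {
            writeln!(file, "{line}")?;
        }
        file.flush()
    });

    (tx, handle)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join(format!("single_writer_demo_{}.log", std::process::id()));
    let _ = std::fs::remove_file(&path); // start from a clean file

    let (tx, writer) = spawn_writer(path.clone());

    // Several producers send write requests; none of them opens the file.
    let producers: Vec<_> = (0..4)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                for n in 0..3 {
                    tx.send(format!("producer {id} message {n}"))
                        .expect("writer thread is gone");
                }
            })
        })
        .collect();

    drop(tx); // keep only the producers' clones alive
    for producer in producers {
        producer.join().expect("producer panicked");
    }
    writer.join().expect("writer panicked")?;

    let contents = std::fs::read_to_string(&path)?;
    println!("{} lines written", contents.lines().count());
    std::fs::remove_file(&path)?;
    Ok(())
}
```

This serializes the writes, but nothing in it stops a reader elsewhere in the program from opening the file in the middle of a write, which is exactly the gap I am asking about.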
I considered using an `Arc<AtomicBool>` to signal whether the file is safe to read (or another solution based on a `Mutex`), but I was wondering whether there is a more idiomatic way to do it with Tokio, since it seems to me that the problem of concurrent file access should come up pretty quickly once one uses an async runtime and bridges to the filesystem.
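As a point of comparison, the lock-based alternative I have in mind would look something like this minimal sketch. It uses `std::sync::RwLock` and plain threads so that it stays self-contained (an assumption; in an async context the same shape could presumably be built on `tokio::sync::RwLock`, or the std lock could be taken inside the `spawn_blocking` closure). The `GuardedFile` type and its methods are hypothetical names, and the lock only coordinates code inside this one process that goes through the wrapper.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::PathBuf;
use std::sync::{Arc, RwLock};
use std::thread;

/// A file path paired with a lock: many readers or one writer at a time.
struct GuardedFile {
    path: PathBuf,
    lock: RwLock<()>,
}

impl GuardedFile {
    fn new(path: PathBuf) -> Self {
        Self { path, lock: RwLock::new(()) }
    }

    /// Appends one line while holding the exclusive (write) lock.
    fn append_line(&self, line: &str) -> io::Result<()> {
        let _guard = self.lock.write().expect("lock poisoned");
        let mut file = OpenOptions::new().create(true).append(true).open(&self.path)?;
        writeln!(file, "{line}")
    }

    /// Reads the whole file while holding the shared (read) lock.
    fn read_all(&self) -> io::Result<String> {
        let _guard = self.lock.read().expect("lock poisoned");
        fs::read_to_string(&self.path)
    }
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join(format!("guarded_file_demo_{}.log", std::process::id()));
    let _ = fs::remove_file(&path); // start from a clean file

    let file = Arc::new(GuardedFile::new(path.clone()));
    file.append_line("header")?; // make sure the file exists before reading

    let writer = {
        let file = Arc::clone(&file);
        thread::spawn(move || -> io::Result<()> {
            for n in 0..50 {
                file.append_line(&format!("entry {n}"))?;
            }
            Ok(())
        })
    };

    let reader = {
        let file = Arc::clone(&file);
        thread::spawn(move || -> io::Result<()> {
            for _ in 0..50 {
                let snapshot = file.read_all()?;
                // A read never overlaps a write, so every line is complete.
                assert!(snapshot.ends_with('\n'));
            }
            Ok(())
        })
    };

    writer.join().expect("writer panicked")?;
    reader.join().expect("reader panicked")?;

    println!("{} lines in the file", file.read_all()?.lines().count());
    fs::remove_file(&path)?;
    Ok(())
}
```

Compared with an `AtomicBool` flag, the lock makes the reader wait instead of having to poll and retry, but I don't know whether this is what people would consider idiomatic with Tokio.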
Thank you for taking the time to read!