Fastest way possible to read bytes?

A simple question that ended up being not so simple after all. I seem to get very conflicting information regarding IO.

I've been trying to read large files (~100 MB each) as fast as possible. For example, I might have 1 TB worth of these files (and I can read files in parallel). The problem is that maxing out my drive turns out to be not so easy. The processing I'm doing is very light, so the program is heavily IO-bound.

My setup:
Linux
Samsung 980 Pro (7GB/s read)
Ryzen 9 5900x CPU

Things I've tried: mmap, different types of async, different numbers of threads, etc. I often get quite good performance (~5 GB/s), but it still leaves IO on the table (at least this is what I suspect, since SSD benchmarks get me close to the max of 7 GB/s). Or are the benchmarks "cheating" somehow, making it unrealistic to achieve this kind of performance in a real program?

The setup is quite flexible, so any ideas are welcome. Also related: should I expect to max out the read speed going one file at a time, or does it normally require reading these types of files in parallel?

Since you mention that you tried async, I'd like to be clear that async traditionally does not help here. It's generally faster to use std::fs over tokio::fs, because tokio::fs runs the same blocking file operations on a background thread pool.
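
As a rough illustration of the plain std::fs approach, here is a minimal sketch that streams a file through one reusable buffer. The function name `read_file_chunked`, the 1 MiB buffer size, and the temp file in `main` are assumptions for the example, not tuned or measured values.

```rust
use std::fs::File;
use std::io::{self, ErrorKind, Read};
use std::path::Path;

/// Streams the file through `buf` and returns the number of bytes read.
/// (Hypothetical helper; the light processing would go inside the loop.)
fn read_file_chunked(path: &Path, buf: &mut [u8]) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut total = 0u64;
    loop {
        match file.read(buf) {
            Ok(0) => break, // end of file
            Ok(n) => {
                // Process &buf[..n] here; this sketch only counts bytes.
                total += n as u64;
            }
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Demo file in the temp directory (assumption: any readable file works).
    let path = std::env::temp_dir().join("read_chunked_demo.bin");
    std::fs::write(&path, vec![7u8; 3 * 1024 * 1024 + 5])?;

    let mut buf = vec![0u8; 1024 * 1024]; // 1 MiB, reused across reads
    let total = read_file_chunked(&path, &mut buf)?;
    println!("read {total} bytes");

    std::fs::remove_file(&path)?;
    Ok(())
}
```

Reusing the buffer avoids one allocation per file. Reads like this still go through the page cache, so on their own they probably won't saturate a fast NVMe drive.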


With SSDs you need to launch several read requests in parallel to get most of the device's possible throughput. The read requests should also use large enough chunks that are aligned (the required alignment is usually 4 KiB). It's also likely worth opening your file with O_DIRECT. I would recommend trying io-uring with manually implemented event loops. You could use NVMe commands with io-uring to get maximum control over communication with your SSD, but I am not knowledgeable enough about it to say whether it's worth the trouble in your case.
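
To make the "several requests in flight" idea concrete without pulling in io-uring, here is a std-only sketch that uses positional reads from multiple threads. It is not io-uring and does not use O_DIRECT; it only shows the access pattern. The name `parallel_read`, the 1 MiB chunk size, and the worker count of 4 are assumptions to tune against your own drive.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt; // read_exact_at: positional reads (Unix only)
use std::path::Path;
use std::thread;

const CHUNK: usize = 1 << 20; // 1 MiB, a multiple of 4 KiB (assumed value)

/// Reads the whole file with `workers` threads, each taking every
/// `workers`-th chunk. Returns the total number of bytes read.
fn parallel_read(path: &Path, workers: usize) -> io::Result<u64> {
    let workers = workers.max(1);
    let file = File::open(path)?;
    let len = file.metadata()?.len();
    let file = &file; // shared by reference; read_exact_at takes &self

    thread::scope(|s| -> io::Result<u64> {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                s.spawn(move || -> io::Result<u64> {
                    let mut buf = vec![0u8; CHUNK];
                    let mut total = 0u64;
                    let mut offset = (w * CHUNK) as u64;
                    while offset < len {
                        // The last chunk of the file may be shorter.
                        let want = CHUNK.min((len - offset) as usize);
                        file.read_exact_at(&mut buf[..want], offset)?;
                        // Process &buf[..want] here.
                        total += want as u64;
                        offset += (workers * CHUNK) as u64;
                    }
                    Ok(total)
                })
            })
            .collect();

        let mut sum = 0u64;
        for h in handles {
            sum += h.join().expect("worker thread panicked")?;
        }
        Ok(sum)
    })
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("parallel_read_demo.bin");
    std::fs::write(&path, vec![3u8; 5 * CHUNK + 123])?;
    let total = parallel_read(&path, 4)?;
    println!("read {total} bytes with 4 workers");
    std::fs::remove_file(&path)?;
    Ok(())
}
```

With io-uring, the same pattern would be expressed as many submitted reads on one thread instead of one blocking read per thread.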

You could benchmark the SSD with fio, to figure out what access patterns are fastest, then try to mimic that in your code.

I found some examples here

io-uring with O_DIRECT seems to get me quite close, around 6.3 GB/s. It seems the SSD benchmarks were not quite at 7 GB/s either, but around this number.
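
For anyone finding this later: O_DIRECT generally needs the buffer address, file offset, and read length to be aligned. Below is a std-only sketch of an aligned buffer plus opening a file with the flag. This is not my actual io-uring code. `AlignedBuf`, `round_up`, and `open_direct` are made-up names, 4096 is an assumed alignment, and the `O_DIRECT` constant is hardcoded to the x86_64 Linux value (real code should take it from the libc crate).

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::OpenOptionsExt; // custom_flags
use std::path::Path;
use std::slice;

const ALIGN: usize = 4096; // assumed alignment requirement
const O_DIRECT: i32 = 0o40000; // x86_64 Linux value (assumption; differs on other arches)

/// Rounds `n` up to the next multiple of `align`.
fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Zeroed heap buffer whose address and length are multiples of ALIGN.
struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    fn new(min_len: usize) -> Self {
        let len = round_up(min_len.max(1), ALIGN);
        let layout = Layout::from_size_align(len, ALIGN).expect("invalid layout");
        // SAFETY: the layout has a non-zero size.
        let ptr = unsafe { alloc_zeroed(layout) };
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        Self { ptr, layout }
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        // SAFETY: ptr is valid and zero-initialized for layout.size() bytes,
        // and &mut self guarantees exclusive access.
        unsafe { slice::from_raw_parts_mut(self.ptr, self.layout.size()) }
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: ptr was allocated with exactly this layout.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}

/// Opens a file for reading with O_DIRECT, bypassing the page cache.
fn open_direct(path: &Path) -> io::Result<File> {
    OpenOptions::new().read(true).custom_flags(O_DIRECT).open(path)
}

fn main() -> io::Result<()> {
    let mut buf = AlignedBuf::new(10_000);
    let slice = buf.as_mut_slice();
    println!("len = {}, addr % ALIGN = {}", slice.len(), slice.as_ptr() as usize % ALIGN);

    let path = std::env::temp_dir().join("o_direct_demo.bin");
    std::fs::write(&path, vec![0u8; 8192])?;
    // Some filesystems (tmpfs, for example) reject O_DIRECT, so report instead of failing.
    match open_direct(&path) {
        Ok(_) => println!("opened with O_DIRECT"),
        Err(e) => println!("O_DIRECT not supported here: {e}"),
    }
    std::fs::remove_file(&path)?;
    Ok(())
}
```

The buffer length gets rounded up to a multiple of the alignment, so the final read of a file whose size is not a multiple of 4 KiB can still request a full aligned block and just receive fewer bytes back.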
