Hello everybody!
Excuse me for this long post - it's the culmination of quite some time of fiddling. I'll try to break it down into a minimal example and a longer explanation.
When reading a file, tokio::fs::File seems much less efficient (in CPU time) than its synchronous counterpart. Do you think this will still improve, or is there an inherent performance disadvantage to async file I/O when there is little or no concurrency?
In this example, I count the bytes of a large file (an old CentOS image, 636 MB). It takes around 80 to 100 ms on my system (Rust 1.39, release build, after a few warm-up runs so that the file is entirely in Linux's page cache).
    use std::fs::File;
    use std::io::Read;

    static FILEPATH: &str = "/home/vh/tmp/CentOS-7-x86_64-Minimal-1503-01.iso";

    fn main() {
        let mut counter: usize = 0;
        let mut file = File::open(FILEPATH).unwrap();
        let mut buf = [0u8; 64 * 1024];
        loop {
            match file.read(&mut buf) {
                Ok(n) => {
                    if n == 0 {
                        // eof
                        break;
                    } else {
                        counter += n;
                    }
                },
                Err(_) => panic!("Error while reading file"),
            }
        }
        println!("Read {} bytes.", counter);
    }
However, its async counterpart takes between 535 and 620 ms - roughly 6.5 times as long.
    extern crate futures;
    extern crate tokio; // 0.2.0-alpha.6

    use futures::executor::block_on;
    use tokio::fs::File;
    use tokio::io::AsyncReadExt;

    static FILEPATH: &str = "/home/vh/tmp/CentOS-7-x86_64-Minimal-1503-01.iso";

    async fn async_main() {
        let mut file = File::open(FILEPATH).await.unwrap();
        let mut counter: usize = 0;
        let mut buf = [0u8; 64 * 1024];
        loop {
            match file.read(&mut buf).await {
                Ok(n) => {
                    if n == 0 {
                        // eof
                        break;
                    } else {
                        counter += n;
                    }
                },
                Err(_) => panic!("Error while reading file"),
            }
        }
        println!("Read {} bytes.", counter);
    }

    fn main() {
        block_on(async_main());
    }
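As far as I understand, tokio::fs hands each read off to a blocking thread pool owned by the runtime, so the choice of executor might matter. A variant that drives the same future on the Tokio runtime instead of futures::executor::block_on would look roughly like this (a sketch assuming the #[tokio::main] attribute from the 0.2 line):

    // Same measurement, but driven by the Tokio runtime instead of
    // futures::executor::block_on (sketch; the attribute macro may be
    // spelled differently in the alphas).
    #[tokio::main]
    async fn main() {
        async_main().await;
    }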
I know that file operations are rather synchronous at the kernel level... Maybe the synchronous syscalls are so well optimised that this result was to be expected?
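For comparison, one could keep the async interface but run the whole synchronous read loop inside a single blocking task, so the hand-off to the thread pool is paid once per file instead of once per read. A rough sketch, assuming a Tokio version that exposes tokio::task::spawn_blocking (the alpha may name it differently):

    use std::fs::File;
    use std::io::Read;

    // Run the entire synchronous read loop on Tokio's blocking thread pool.
    // The async caller awaits a single JoinHandle instead of awaiting every read.
    async fn count_bytes_blocking(path: &'static str) -> std::io::Result<usize> {
        tokio::task::spawn_blocking(move || -> std::io::Result<usize> {
            let mut file = File::open(path)?;
            let mut buf = [0u8; 64 * 1024];
            let mut counter = 0usize;
            loop {
                let n = file.read(&mut buf)?;
                if n == 0 {
                    break; // eof
                }
                counter += n;
            }
            Ok(counter)
        })
        .await
        .expect("blocking task panicked")
    }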
Background
Of course, my goal is to handle more concurrency than in the example above.
For a free audiobook web site, I want to create on-the-fly ZIP archives (store-only, without compression) of MP3 files. This is currently done by a Node app that leaks so much memory that it gets restarted twice a day, so for a long time I've been looking at Rust and its growing async ecosystem to replace it! Several clients download large files at slow speeds: many connections, no computation, purely I/O-bound. Sounds like a nice use case for async!
Before going into the details of zipping, I played around with three examples of serving a file with Hyper, reading it chunk by chunk, more or less based on the Hyper examples. I measured the time spent by 10 concurrent clients, each reading the file ten times with a five-second delay between requests: siege -c 10 -d 5 -r 10 $URL.
The first version spawns one thread per client, which runs a loop of synchronous reads and sends the bytes through a channel (sketched below). It has excellent performance and memory usage.
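Stripped of the Hyper glue, the core of that first version is just a reader thread feeding chunks into a bounded channel. A minimal standalone sketch (counting bytes on the receiving end instead of writing them into a response body):

    use std::fs::File;
    use std::io::Read;
    use std::sync::mpsc;
    use std::thread;

    static FILEPATH: &str = "/home/vh/tmp/CentOS-7-x86_64-Minimal-1503-01.iso";

    fn main() {
        // Bounded channel: the reader thread blocks once 8 chunks are in flight,
        // which keeps memory flat when the consumer (a slow client) lags behind.
        let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(8);

        let reader = thread::spawn(move || {
            let mut file = File::open(FILEPATH).expect("Error while opening file");
            let mut buf = [0u8; 64 * 1024];
            loop {
                let n = file.read(&mut buf).expect("Error while reading file");
                if n == 0 {
                    break; // eof: dropping tx closes the channel
                }
                tx.send(buf[..n].to_vec()).expect("receiver hung up");
            }
        });

        // In the real server this end feeds the Hyper response body; here we just count.
        let mut counter = 0usize;
        for chunk in rx {
            counter += chunk.len();
        }
        reader.join().unwrap();
        println!("Read {} bytes.", counter);
    }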
The second version is fully async, reading the file through Tokio's codec::FramedRead::new(file, codec::BytesCodec::new()). The third version (branch "fileserve_hyper-async-stream" in the repo from the other links; sorry, as a new user I'm only allowed to post two links) does the same, but with a custom implementation of Stream. With pre-1.39 nightly versions, the custom Stream was more performant, but not with 1.39.
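For reference, the FramedRead variant essentially boils down to the following (a sketch assuming Hyper 0.13's Body::wrap_stream, which needs the "stream" feature, and the codec types where the Tokio alpha has them - they later moved to tokio_util::codec):

    use hyper::{Body, Response};
    use tokio::codec::{BytesCodec, FramedRead}; // tokio_util::codec in later releases

    // Turn an already-opened tokio file into a streaming response body.
    // Each poll of the FramedRead performs one read on the file and yields
    // the resulting bytes as one body chunk.
    fn file_response(file: tokio::fs::File) -> Response<Body> {
        let framed = FramedRead::new(file, BytesCodec::new());
        Response::new(Body::wrap_stream(framed))
    }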
Version | Peak RSS usage | RSS usage afterwards | Siege walltime | CPU time
---|---|---|---|---
Sync in Channel | 8.6 MB | 5.4 MB | 1:16 min | 1:05 min
Async, codec::FramedRead | 26.2 MB | 7.1 MB | 3:40 min | 3:28 min
Async, own Stream | 24.6 MB | 7.5 MB | 4:13 min | 4:01 min
Of course, I don't have much Rust practice yet, and there may be differences and possible optimisations between these versions that I'm missing. That's why I made the minimal example above.
I'd be inclined to continue with the threaded version - it seems to perform very well in this situation of limited concurrency. Do you think the async versions should eventually match or even outperform the threaded one?
Thank you for reading!
Viktor.