I did a file copy speed test between these 2 and seems tokio::io::copy is much slower than sync std io::copy
I noticed this while implementing my own tokio AsyncWrite.
Here are the results on my machine for a file of 894MB
The thing that's slow is Tokio's file IO, not the copy function. From the Tokio tutorial:
When not to use Tokio
Reading a lot of files. Although it seems like Tokio would be useful for projects that simply need to read a lot of files, Tokio provides no advantage here compared to an ordinary threadpool. This is because operating systems generally do not provide asynchronous file APIs.
Note that the stdlib also internally uses specialization to make copy generally faster for BufReader/BufWriter. On linux and android it can even use syscalls like copy_file_range and splice to avoid loading the file data in userspace at all.
There's nothing stopping Tokio from using io_uring for tokio::fs. We just need someone to actually take the time to implement it.
The traits force you to copy the data one more time than you have to with epoll. But we already need that copy for files due to spawn_blocking, so there's no issue there.
There's nothing Tokio can do. We need specialization to do that kind of thing, but it is an unstable language feature. The standard library can do it because it is special and can use unstable things.
The tokio::io::copy function takes as arguments an AsyncRead and an AsyncWrite, so its implementation must be something that works for anything that implements those traits. There are many things that implement them, and most of them would not work with copy_file_range. There's a language feature called specialization that lets you override what a function does in specific cases, so we could use that to override the behavior when you pass it two files. However, specialization is an unstable feature so Tokio cannot use it (but the standard library can).
I have spent some time looking into workarounds for not having access to specialization (see this thread), but it's very difficult to do so.
Is tokio::io::copy known to be used for things other than files though? If there is a way to dig down to the 2 fds you could blindly pass them down to copy_file_range. Then you either have the copy made or the kernel tells you it can't do it. Not optimal for non-files as it sneaks in a syscall, but probably a worthwhile tradeoff until things can get fixed properly.
Again I know about squat about Rust, maybe these things are not even necessarily fds and it is not possible to determine that they are either (with legal means anyway).
As absolute minimum the 8K buf size should be bumped to 32K or higher.
Yes, tokio::io::copy is used with things that are not files - and even things that are not just file descriptors. For example, I've seen people use it for proxying over streams that use things like tls and compression, which means that the data being sent in userspace is not what goes over the wire.
We can bump buffer sizes, but in the particular case of files there are actually multiple buffers involved. Each file has an internal buffer for holding data as it gets transferred between the Tokio thread and the background threadpool that does the actual blocking IO, and then tokio::io::copy has another buffer for holding data as it gets copied from one to another.
Yeah, the lack of support for files in epoll and kqueue makes things suck pretty badly. The lack of copy_file_range is only a minor issue compared to the other reasons file IO sucks in async code.
The total difference in performance was bugging me, so I wrote a toy C program to just do the 8KB calls. It finishes in next to no time (0.02s user 0.90s system 99% cpu 0.923 total). On the other hand the program from OP with std::io::copy commented out takes 1.41s user 2.49s system 137% cpu 2.828 total, which is quite concerning.
From poking around with strace and perf I see tokio is using 2 threads to handle the op, resulting in a crazy number of futex calls and context switches. I presume using tokio-uring as mentioned in one of above comments manages to dodge this aspect.