Cost of an async function call

Something I only just thought about, and it is probably stating the obvious: I think an async function call has to be significantly slower than a normal function call.

Firstly, is this correct? Without trying to describe it in detail, I presume quite a lot has to happen: the executor (tokio or whatever) has to "do quite a lot of stuff" to run the async function.

The implication is that for maximum efficiency, you want to minimise the number of async function calls. For example, if processing a large number of input bytes, you probably do not want to be doing an async function call to fetch each byte from the input.

The reason I was thinking about this was in connection with reading an HTTP request. One way is to repeatedly call read_until to fetch a header line.

Another way is to read a buffer of bytes (with a single async function call), then "push" the bytes (a byte at a time) into a stateful struct with repeated sync function calls, avoiding an async function call for each line.

I think the second way would be more efficient, but is this true?
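For concreteness, the second approach could look roughly like the sketch below. It uses only the standard library; `LineSplitter` and `feed_chunk` are hypothetical names made up for illustration, and the byte slices stand in for whatever a single async read would have returned.

```rust
/// Hypothetical push-style line splitter: bytes go in one at a time through a
/// plain (sync) method, and a complete line comes out whenever b'\n' is seen.
#[derive(Default)]
struct LineSplitter {
    current: Vec<u8>,
}

impl LineSplitter {
    /// Feed one byte. Returns the finished line (without the trailing
    /// "\r\n" or "\n") when this byte completes one.
    fn push(&mut self, byte: u8) -> Option<Vec<u8>> {
        if byte != b'\n' {
            self.current.push(byte);
            return None;
        }
        if self.current.last() == Some(&b'\r') {
            self.current.pop();
        }
        Some(std::mem::take(&mut self.current))
    }
}

/// Feed a whole chunk (standing in for the result of one async read) and
/// collect every line that the chunk completes.
fn feed_chunk(splitter: &mut LineSplitter, chunk: &[u8]) -> Vec<Vec<u8>> {
    chunk.iter().filter_map(|&b| splitter.push(b)).collect()
}

fn main() {
    let mut splitter = LineSplitter::default();
    // Two chunks standing in for two reads; the Host header is split across them.
    let mut lines = feed_chunk(&mut splitter, b"GET / HTTP/1.1\r\nHost: exa");
    lines.extend(feed_chunk(&mut splitter, b"mple.com\r\n\r\n"));
    for line in &lines {
        println!("{:?}", String::from_utf8_lossy(line));
    }
}
```

The state lives in the struct, so a header line that straddles two reads is handled without any extra await points.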

Well it kind of depends on what you mean by "calling".

Most async functions don't do much of anything when they're called other than return their opaque future. Only the future at the very top of the chain (or one spawned onto the runtime manually) is interacted with directly by the runtime. My impression is that most of the "stuff" executors have to do is preparing a future to be polled the first time, and that subsequent polls are more or less just a method call on a trait object (once the Waker is used to re-enqueue the future for polling, and the future reaches the front of a queue).

There's obviously a bit of overhead polling down into that top level future to get to the current state if you have a big complicated future, but it's usually small enough to not matter. That's kind of closer to the equivalent of "calling" a synchronous function since it's where most of the work typically happens.
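A small sketch of the point that calling an async function only builds a future, and the body runs when that future is polled. It uses only std; `NoopWaker`, `bump` and `call_then_poll` are hypothetical names for illustration, and polling by hand like this stands in for what a runtime would do.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing; enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// An async fn whose body has a visible side effect.
async fn bump(counter: &AtomicUsize) -> usize {
    counter.fetch_add(1, Ordering::SeqCst) + 1
}

/// Returns (counter after the call, counter after the first poll, poll result).
fn call_then_poll() -> (usize, usize, Poll<usize>) {
    let counter = AtomicUsize::new(0);

    // The "call": this only constructs the future, the body has not run yet.
    let fut = bump(&counter);
    let after_call = counter.load(Ordering::SeqCst);

    // Polling is what actually runs the body.
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let result = fut.as_mut().poll(&mut cx);
    let after_poll = counter.load(Ordering::SeqCst);

    (after_call, after_poll, result)
}

fn main() {
    let (after_call, after_poll, result) = call_then_poll();
    println!("after call: {after_call}, after poll: {after_poll}, result: {result:?}");
}
```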

I think in your example it would depend on a bunch of details about the implementations, but generally I wouldn't expect to see a huge performance difference. The I/O is almost certainly buffered so a bunch of your awaits probably aren't going to suspend, which minimizes the overhead involved.


Hmm, the documentation for read_until is a little curious, it says:

"Equivalent to:"

async fn read_until(&mut self, byte: u8, buf: &mut Vec<u8>) -> io::Result<usize>;

So maybe it is NOT exactly an async call, even though you use "await" to call it?

I am now wondering if I have this all wrong, and provided the byte is available in the tokio::io::BufReader buffer, there is no significant overhead at all.

The runtime does not get involved at all unless the future yields, and an async call that is immediately ready is literally just a normal function call.
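One way to see this without any runtime: poll a future by hand and count how often its waker is used. In the sketch below (std only; `CountingWaker`, `next_byte`, `sum_all` and `poll_once_counting_wakes` are hypothetical names for illustration), every inner await is immediately ready, so the whole thing finishes in a single poll and the waker is never touched.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that only counts how often it is woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Stand-in for a buffered read: the data is already in memory, so this
/// async fn never suspends.
async fn next_byte(buf: &[u8], pos: &mut usize) -> Option<u8> {
    let b = buf.get(*pos).copied();
    if b.is_some() {
        *pos += 1;
    }
    b
}

/// One await per byte, none of which ever returns Pending.
async fn sum_all(buf: &[u8]) -> u32 {
    let mut pos = 0;
    let mut total = 0u32;
    while let Some(b) = next_byte(buf, &mut pos).await {
        total += u32::from(b);
    }
    total
}

/// Polls `sum_all` exactly once; returns the poll result and the wake count.
fn poll_once_counting_wakes(buf: &[u8]) -> (Poll<u32>, usize) {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(sum_all(buf));
    let result = fut.as_mut().poll(&mut cx);
    (result, counter.0.load(Ordering::SeqCst))
}

fn main() {
    let (result, wakes) = poll_once_counting_wakes(&[1, 2, 3]);
    println!("result: {result:?}, wakes: {wakes}");
}
```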


"Equivalent" there just means that it's the async counterpart of the standard library's method (BufRead::read_until), not that it necessarily has identical performance.


Thank you Alice, eventually I realised that is the case. As usual, I was confused.

Even if the future does yield, this is not a particularly expensive operation. It involves modifying an atomic integer, and moving the task around in a list.

This is, after all, something that it is quite important that we optimize well.
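To make the yield case concrete, here is a toy sketch, assuming a single-threaded loop with one atomic flag in place of a real run queue. It is not how tokio is implemented; `YieldOnce`, `WokenFlag`, `two_yields` and `run_counting_polls` are hypothetical names. Each yield costs one wake (an atomic store here) plus one extra poll.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that returns Pending exactly once, after asking to be
/// polled again.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref(); // re-enqueue ourselves
        Poll::Pending
    }
}

/// Toy "run queue": a single flag saying the task wants to be polled.
struct WokenFlag(AtomicBool);

impl Wake for WokenFlag {
    fn wake(self: Arc<Self>) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Drives one future to completion and reports how many polls it took.
/// Assumes the future always wakes itself before returning Pending;
/// otherwise this loop would spin forever.
fn run_counting_polls<F: Future>(fut: F) -> (F::Output, usize) {
    let flag = Arc::new(WokenFlag(AtomicBool::new(true)));
    let waker = Waker::from(flag.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let mut polls = 0;
    loop {
        // Clear the flag and poll only if the task asked to be woken.
        if flag.0.swap(false, Ordering::SeqCst) {
            polls += 1;
            if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
                return (out, polls);
            }
        } else {
            std::thread::yield_now();
        }
    }
}

async fn two_yields() -> u8 {
    YieldOnce { yielded: false }.await;
    YieldOnce { yielded: false }.await;
    7
}

fn main() {
    let (out, polls) = run_counting_polls(two_yields());
    println!("output: {out}, polls: {polls}");
}
```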

Out of curiosity, I threw together some super simple benchmarks with the Read and AsyncRead traits.

AsyncRead looks like it's slower when reading one byte at a time (roughly 34x in the run below), but if you read in chunks bigger than a single byte the difference becomes negligible.

The sample is just iterating over 20,000 bytes in a Vec
Playground

cargo bench output:

test byte_async     ... bench:     161,320 ns/iter (+/- 11,644)
test byte_sync      ... bench:       4,738 ns/iter (+/- 320)
test kilobyte_async ... bench:       4,897 ns/iter (+/- 750)
test kilobyte_sync  ... bench:       4,533 ns/iter (+/- 676)

Entirely possible I'm doing something silly here though (I mean you definitely shouldn't be reading 20,000 bytes one at a time)
