Will AsyncWrite give the same buf on returning Poll::Pending?

Hi,

Suppose I am implementing AsyncWrite and I return Poll::Pending from poll_write. Is there any contract that says that, as long as I return Poll::Pending (and of course arrange for a wakeup), poll_write will be called with the SAME buffer again? In other words, can I end up with a sequence like this:

  1. my poll_write(&buf1) is called
  2. I return Poll::Pending and arrange for wakeup
  3. After wakeup I am called with poll_write(&buf2) where buf1 and buf2 are different

Rgds,
Gopa.

No, there is no such guarantee. You must be able to handle this case correctly by writing buf2 and not writing buf1.
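To make that concrete, here is a minimal std-only sketch. FlakySink, NoopWake and demo_different_buffer are made-up names for illustration, and the method only mirrors the shape of AsyncWrite::poll_write (no Pin, no tokio dependency), so treat it as an assumption-laden toy rather than a real implementation. The point is that nothing from the Pending call is remembered:

```rust
use std::io;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical sink that is "not ready" on every other call.
/// It never stores the caller's slice across calls.
struct FlakySink {
    ready: bool,
    written: Vec<u8>,
}

impl FlakySink {
    // Same shape as AsyncWrite::poll_write, minus the Pin (assumption, to stay std-only).
    fn poll_write(&mut self, cx: &mut Context<'_>, buf: &[u8]) -> Poll<io::Result<usize>> {
        if !self.ready {
            // Take nothing, arrange a wakeup, and forget about `buf`.
            self.ready = true;
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        self.ready = false;
        // Write whatever was passed in *this* call.
        self.written.extend_from_slice(buf);
        Poll::Ready(Ok(buf.len()))
    }
}

/// Waker that does nothing; good enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn demo_different_buffer() -> Vec<u8> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut sink = FlakySink { ready: false, written: Vec::new() };

    // The first write is polled once, returns Pending, and is then abandoned.
    assert!(sink.poll_write(&mut cx, b"buf1").is_pending());
    // A different write is polled next; only its bytes should reach the sink.
    assert!(sink.poll_write(&mut cx, b"buf2").is_ready());
    sink.written
}
```

Calling demo_different_buffer() should leave only the bytes of buf2 in the sink, since buf1 was never accepted.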

This is why tokio::fs::File needs to be flushed even though the std file doesn't. When given bytes, it immediately returns Ok(len) and copies them into a buffer, then starts a background thread that actually writes the data. Flushing waits for the background thread.

This is a subtle way in which fn poll_write isn't great for completion based IO: you can lose buffered data (i.e. believe it got written when it hasn't been yet) if you don't finish flushing. That can happen fairly easily via cancellation and/or panics, and could even amount to a successful program exit before the implicit IO buffers have finished flushing.

Completion IO would really prefer async fn write, since the returned future can hold the transient buffer and only return "yes it's been written" once it's actually hit the thing you're writing to. (E.g. so that you have a mostly consistent picture of how much has/hasn't been written to the buffer, though a write completing without receiving the notify is duplicated. Though that's probably better than lost, I think.)

fn poll_write is the correct interface for readiness based writes. async fn write is correct for completion based writes, and trivially shims to use the completion based fn poll_write.
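For illustration, a rough sketch of what layering an async fn write over a poll_write can look like, using only std. The PollWrite trait, SlowSink, block_on and demo_write_shim are hypothetical names invented here (the trait drops the Pin that the real AsyncWrite has), so this shows the idea under those assumptions and is not any library's actual code:

```rust
use std::future::{poll_fn, Future};
use std::io;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a poll-based writer (Pin omitted for brevity).
trait PollWrite {
    fn poll_write(&mut self, cx: &mut Context<'_>, buf: &[u8]) -> Poll<io::Result<usize>>;
}

/// The shim: the returned future borrows `buf` until the write resolves.
async fn write<W: PollWrite>(w: &mut W, buf: &[u8]) -> io::Result<usize> {
    poll_fn(|cx| w.poll_write(cx, buf)).await
}

/// Toy sink that reports Pending a fixed number of times before accepting data.
struct SlowSink {
    pending_left: u32,
    written: Vec<u8>,
}

impl PollWrite for SlowSink {
    fn poll_write(&mut self, cx: &mut Context<'_>, buf: &[u8]) -> Poll<io::Result<usize>> {
        if self.pending_left > 0 {
            self.pending_left -= 1;
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        self.written.extend_from_slice(buf);
        Poll::Ready(Ok(buf.len()))
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for illustration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo_write_shim() -> (usize, Vec<u8>) {
    let mut sink = SlowSink { pending_left: 2, written: Vec::new() };
    let n = block_on(write(&mut sink, b"hello")).unwrap();
    (n, sink.written)
}
```

As long as the future returned by write is polled to completion, every poll sees the same buf. Dropping that future early and starting another write is exactly how the poll_write underneath ends up being called with a different buffer.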

Maybe it'd make some sense for completion based IO to offer a "soft" flush (i.e. wait until submitted IO completes, but e.g. don't flush BufWriter to its underlying sink) and a "hard" flush? That way async fn write can submit the poll_write and wait for the "soft" flush, with a user initiated flush being a "hard" flush.

...but any of this probably isn't beneficial until there's some way to have a best effort async drop that can do flushes, since cancellation ruins the attempts at better resiliency anyway. And you get most of the benefits by the runtime waiting on any latent IO to finish before shutting down, anyway. (Which Tokio can do (I don't know if it does) when the runtime drops, but async-std can't (since its runtime is implicit background threads).)

@alice thx a lot for the clarification. Just for my understanding, what is an example scenario where I am called with buf1, I return Pending, and then I am called with buf2? Won't the async state machine keep the same state and keep calling poll_write with the same state until I return Poll::Ready(Ok(n))? I can imagine that a cancelled future might be one example, but is there anything else?

There are two versions of this:

  • The "benign" version. Here, buf2[..buf1.len()] == buf1, but buf2 might be longer than buf1. This is pretty common. For example, it might happen with tokio::io::copy.
  • The full version. This only happens when cancellation comes into play. For example, if you abort a write with tokio::select! and then start another write, then it can happen.

@alice thx. So for case 1, in theory, as long as I remember the original buffer length when I returned Pending and return Poll::Ready(Ok(original_buf_len)), it should work, which I guess is why you call it "benign". For case 2, yeah, it would be an error. Let me take a look at tokio::fs::File to see what it does, to give me some idea of how to go about a similar case that I need to deal with.

In short: either don't take any data and return Pending, or take (some of) the data, ensure it will get written before flush returns Ready, and return Ready.

Think of it basically functioning like a small BufWriter buffer. You return Ready from poll_write once you've taken data from the caller and fed it into that buffer. The actual IO might be done, or it might be happening in the background, or it might only happen when flush is called.
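A small sketch of that "tiny BufWriter" mental model, again std-only. Buffered, CAPACITY and demo_buffered are hypothetical names, the sink is just a Vec standing in for the real IO target, and the methods only approximate the AsyncWrite signatures (no Pin), so read it as one possible shape under those assumptions:

```rust
use std::io;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical capacity, kept tiny so the Pending path is easy to hit.
const CAPACITY: usize = 8;

struct Buffered {
    buf: Vec<u8>,  // data accepted from callers but not yet written out
    sink: Vec<u8>, // stands in for the real IO target
}

impl Buffered {
    fn poll_write(&mut self, cx: &mut Context<'_>, data: &[u8]) -> Poll<io::Result<usize>> {
        let room = CAPACITY - self.buf.len();
        if room == 0 {
            // Take nothing at all. A real implementation would hand cx.waker()
            // to whatever drains the buffer; here we just wake immediately.
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        // Take (some of) the data and report exactly how much was taken.
        let n = room.min(data.len());
        self.buf.extend_from_slice(&data[..n]);
        Poll::Ready(Ok(n))
    }

    fn poll_flush(&mut self, _cx: &mut Context<'_>) -> Poll<io::Result<()>> {
        // Everything accepted so far must be in the sink before Ready is returned.
        self.sink.extend_from_slice(&self.buf);
        self.buf.clear();
        Poll::Ready(Ok(()))
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn demo_buffered() -> Vec<u8> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut w = Buffered { buf: Vec::new(), sink: Vec::new() };
    let data: &[u8] = b"0123456789";

    // The first call takes as much as fits and returns Ready right away.
    let Poll::Ready(Ok(n)) = w.poll_write(&mut cx, data) else { panic!("expected Ready") };
    assert_eq!(n, 8);
    // The buffer is full: nothing is taken and the caller sees Pending.
    assert!(w.poll_write(&mut cx, &data[n..]).is_pending());
    assert!(w.sink.is_empty()); // nothing has reached the sink yet
    // Flushing moves the accepted bytes to the sink and frees the buffer.
    assert!(w.poll_flush(&mut cx).is_ready());
    let Poll::Ready(Ok(m)) = w.poll_write(&mut cx, &data[n..]) else { panic!("expected Ready") };
    assert_eq!(m, 2);
    assert!(w.poll_flush(&mut cx).is_ready());
    w.sink
}
```

Note that the Pending branch never touches data, so it does not matter whether the next call brings the same slice or a different one.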
