Desugaring Async Fn

// The `async fn` I'm trying to desugar
async fn read_file(file: &mut File) -> String {
    let mut v = Vec::new();
    file.read_to_end(&mut v).await.unwrap();
    String::from_utf8(v).unwrap()
}

So the returned future starts out capturing a &'file mut File "upvar". That reference points to something external to the future (it is not self-referential), which is why its lifetime can be named with a proper lifetime parameter.
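To make that initial capture concrete, here is a minimal sketch of what the signature desugars to. It assumes a hypothetical `record` function working on a `Vec<u8>` instead of a `File` (so it only needs std), and a hand-rolled `NoopWake` waker; none of these names come from the code above. The elided lifetime becomes a named parameter on the returned future, and building the future only stores the reference: the body runs on the first poll.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-desugared signature of `async fn record(out: &mut Vec<u8>) -> usize`:
/// the elided lifetime becomes a named bound on the returned future.
fn record<'out>(out: &'out mut Vec<u8>) -> impl Future<Output = usize> + 'out {
    async move {
        out.push(1);
        out.len()
    }
}

/// Hypothetical waker that does nothing; enough to poll by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns (length after an unpolled future was dropped, output of a polled one).
fn demo() -> (usize, usize) {
    let mut out = Vec::new();

    // Creating the future only captures `&mut out`; nothing is pushed yet.
    let unpolled = record(&mut out);
    drop(unpolled);
    let len_before = out.len();

    // Polling is what actually runs the body.
    let mut fut = Box::pin(record(&mut out));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let result = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(n) => n,
        Poll::Pending => unreachable!("no `.await` inside, so one poll suffices"),
    };
    (len_before, result)
}
```

Calling `demo()` should give `(0, 1)`: the dropped future never touched the vector, the polled one pushed exactly once.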

But in the state at the first .await point, the whole future must now contain:

  • still the initial &'file mut File,

    • Nit: there is also a reborrow of &mut *file in file.read_to_end…
  • v: Vec<u8>,

  • &mut v: &'?? mut Vec<u8>

  • ReadToEndFuture<'file, '??>, the future returned by file.read_to_end(&mut v).

As you can see, there is an issue with '??: this is actually a self-reference to the v field, which is why:

  • this future will not be Unpin;

  • the automagically compiler-generated code will "use unsafe" to make the borrow checker look the other way w.r.t. '??. This is fine thanks to the safety contract of Pin<&'_ mut Self>, the receiver of the Future::poll method: poll is the only way to make the state machine evolve, and thus the only way to reach the code that uses unsafe to go through the self-reference, so we know that in between .poll()s the location of the whole future shall not have changed.
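The "location shall not have changed" part can be observed from the outside. Below is a small sketch, assuming a hypothetical `fill` async fn that keeps a `&mut v` alive across an `.await`, a hypothetical `YieldOnce` leaf future that suspends once, and a do-nothing `NoopWake` waker. It logs the address of `v` before and after the suspension and compares it with the memory range of the pinned future. That the local lives inside the future is how the current compiler lays things out, not something the language promises in those words.

```rust
use std::future::Future;
use std::mem;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf future: `Pending` on the first poll, `Ready` on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// `r` borrows `v` across the `.await`, so both must be stored in the future.
async fn fill(log: &mut Vec<usize>) {
    let mut v: Vec<u8> = Vec::new();
    let r = &mut v;
    log.push(&*r as *const Vec<u8> as usize); // address of `v` before suspending
    YieldOnce(false).await;
    r.push(1);
    log.push(&*r as *const Vec<u8> as usize); // address of `v` after resuming
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns (start of the future, its size, logged addresses of `v`, poll count).
fn observe() -> (usize, usize, Vec<usize>, usize) {
    let mut log = Vec::new();
    let mut fut = Box::pin(fill(&mut log)); // pinned on the heap: it cannot move anymore
    let base = &*fut as *const _ as usize;
    let size = mem::size_of_val(&*fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 1;
    while fut.as_mut().poll(&mut cx).is_pending() {
        polls += 1;
    }
    drop(fut); // releases the `&mut log` borrow
    (base, size, log, polls)
}
```

Both logged addresses should be equal and fall within `base..base + size`, which is the self-reference in action: `r` points into the very future that stores it.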

Now, you can try to write this state machine yourself, but the presence of this self-reference means you'll necessarily need to write unsafe yourself.


async fn read_file (file: &mut File)
  -> String
{ // <- State0
    let mut v = Vec::new();
    file.read_to_end(&mut v)
        .await // <- State1
    .unwrap();
    String::from_utf8(v).unwrap()
} // <- State2
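Before tackling the self-referential case, it may help to see the same state labelling on a function where nothing is borrowed across the `.await`, since that one can be written entirely in safe code. This is only a warm-up sketch: `shout`, `Shout`, `YieldOnce`, `NoopWake` and `drive` are hypothetical names made up for the illustration.

```rust
use std::future::Future;
use std::mem;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf future: `Pending` on the first poll, `Ready` on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

// async fn shout(text: String) -> String { // <- State0
//     YieldOnce(false).await;              // <- State1
//     text.to_uppercase()
// }
enum Shout {
    State0 { text: String },
    State1 { text: String, inner: YieldOnce },
    Poisoned,
}

impl Future for Shout {
    type Output = String;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
        // No field borrows from another one, so `Shout: Unpin` and `get_mut` is safe.
        let this = self.get_mut();
        loop {
            // Take the current state out, leaving `Poisoned` behind in case of panic.
            match mem::replace(this, Shout::Poisoned) {
                Shout::State0 { text } => {
                    *this = Shout::State1 { text, inner: YieldOnce(false) };
                }
                Shout::State1 { text, mut inner } => match Pin::new(&mut inner).poll(cx) {
                    Poll::Pending => {
                        *this = Shout::State1 { text, inner };
                        return Poll::Pending;
                    }
                    Poll::Ready(()) => return Poll::Ready(text.to_uppercase()),
                },
                Shout::Poisoned => panic!("future polled again after completion or panic"),
            }
        }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` to completion, returning its output and the number of polls.
fn drive<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

`drive(Shout::State0 { text: "abc".into() })` should yield `("ABC", 2)`. Moving the state in and out with `mem::replace` is exactly what stops working for read_file: once State1 holds a reference into its own `v` field, the state can no longer be moved around, hence the unsafe.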

The state machine would be something like:

use ::core::{
    future::Future,
    marker,
    mem::MaybeUninit as MU,
    pin::Pin,
    ptr,
    task::{Context, Poll},
};

type ReadToEndFuture<'file, 'v> = impl Future<Output = ::std::io::Result<()>>;
fn read_to_end<'file, 'v> (
    file: &'file mut File,
    v: &'v mut Vec<u8>,
) -> ReadToEndFuture<'file, 'v>
{
    file.read_to_end(v.as_mut())
}

enum ReadFileFuture<'file> {
    State0 {
        file: &'file mut File,
    },
    State1 {
        file: &'file mut File,
        file_reborrowed: MU<&'unsafe_self_ref_file mut File>,
        v: Vec<u8>,
        at_v_mut: MU<&'unsafe_self_ref_v mut Vec<u8>>,
        file_read_to_end: MU<ReadToEndFuture<'unsafe_self_ref_file, 'unsafe_self_ref_v>>,
        _self_referential: marker::PhantomPinned,
    },
    State2 {
        return_value: String,
    },
    Poisoned,
}

impl<'file> Future for ReadFileFuture<'file> {
    type Output = String;

    fn poll (self: Pin<&'_ mut ReadFileFuture<'file>>, cx: &'_ mut Context<'_>)
      -> Poll<Self::Output>
    { unsafe {
        let this = Pin::get_unchecked_mut(self);
        loop { break match *this {
            | Self::State0 { ref file } => {
                let file = ptr::read(file); // This is `ManuallyDrop::take()`.
                ptr::write(this, Self::Poisoned);
                ptr::write(this, Self::State1 {
                    file,
                    file_reborrowed: MU::uninit(),
                    v: Vec::new(), // <- Function's body
                    at_v_mut: MU::uninit(),
                    file_read_to_end: MU::uninit(),
                    _self_referential: <_>::default(),
                });
                if let Self::State1 { file, file_reborrowed, v, at_v_mut, file_read_to_end, .. } = this {
                    let file_reborrowed = file_reborrowed.write(&mut **file); // <- Function's body
                    let at_v_mut = at_v_mut.write(v); // <- Function's body
                    let _ = file_read_to_end.write(
                        read_to_end(file_reborrowed, at_v_mut) // <- Function's body
                    );
                    continue;
                } else {
                    ::std::hint::unreachable_unchecked()
                }
            },
            | Self::State1 { ref mut file_read_to_end, ref v, .. } => {
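                // The inner future is structurally pinned: it lives inside the pinned `this`.
                let result = ::futures::ready!(Pin::new_unchecked(&mut *file_read_to_end.as_mut_ptr()).poll(cx));
                // `assume_init()` so that the `drop` below actually runs the future's destructor.
                let file_read_to_end = ptr::read(file_read_to_end).assume_init();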
                let v = ptr::read(v);
                ptr::write(this, Self::Poisoned);
                result.unwrap(); // <- Function's body
                drop(file_read_to_end); // <- Function's body
                let return_value = String::from_utf8(v).unwrap(); // <- Function's body
                ptr::write(this, Self::State2 { return_value });
                continue;
            },  // I now realize the `State2` could be elided altogether…
            | Self::State2 { ref return_value } => {
                let return_value = ptr::read(return_value);
                ptr::write(this, Self::Poisoned);
                Poll::Ready(return_value)
            },
            | Self::Poisoned => panic!("Violation of `Future::poll`'s contract: future polled _again_ after completion or panic"),
        }}
    }}
}

I've tried to write code that isn't too awful safety-wise, even though I've sacrificed some "pattern consistency" by taking shortcuts that abuse knowledge of the specifics (e.g., that the only drop glue in State1 belongs to v and the inner future). That said, the above code would upset miri very much (note: current futures already do upset miri). In order to do this fully correctly, raw pointers ought to be used instead of Rust references every time an 'unsafe_self_ref… lifetime appears in the type definition.
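As a rough illustration of that last point, here is a sketch of a State1-like struct where the self-reference is a raw pointer rather than a reference with a made-up lifetime. `State1Like` and its methods are hypothetical names, the struct is not a future, and I am not claiming that this is what the compiler emits or that it settles every aliasing-model question; it only shows the shape.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr;

struct State1Like {
    v: Vec<u8>,
    // Raw pointer in place of `&'unsafe_self_ref_v mut Vec<u8>`.
    at_v_mut: *mut Vec<u8>,
    // Opts out of `Unpin`, since `at_v_mut` points into `self`.
    _self_referential: PhantomPinned,
}

impl State1Like {
    /// Pins the value on the heap first, then sets up the self-reference.
    fn new() -> Pin<Box<Self>> {
        let mut boxed = Box::pin(State1Like {
            v: Vec::new(),
            at_v_mut: ptr::null_mut(),
            _self_referential: PhantomPinned,
        });
        unsafe {
            // SAFETY: we only assign a field; the value is never moved out of the box.
            let this = Pin::get_unchecked_mut(boxed.as_mut());
            this.at_v_mut = ptr::addr_of_mut!(this.v);
        }
        boxed
    }

    /// Writes to `v` through the stored self-pointer, not through the field.
    fn push_through_self_ref(self: Pin<&mut Self>, byte: u8) {
        unsafe {
            // SAFETY: `at_v_mut` was set in `new`, and pinning guarantees that
            // `v` has not moved since then.
            let this = Pin::get_unchecked_mut(self);
            let v: &mut Vec<u8> = &mut *this.at_v_mut;
            v.push(byte);
        }
    }

    fn bytes(&self) -> &[u8] {
        &self.v
    }
}
```

After `let mut s = State1Like::new();` and two calls such as `s.as_mut().push_through_self_ref(7)`, `s.bytes()` shows the pushed bytes, and `s.at_v_mut` still equals the address of `s.v`.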
