Dealing with Futures and Polling in Async Rust

just necro'ing this thread [Edit by moderators: we've moved this into a separate topic, after all] here with a rant, async rust feels like an overengineered nightmare, you implement Future and then it polls once, returns pending, then does nothing, unclear why this is acceptable developer experience, how the **** do we have to rely on a hodge podge of 3rd party interfaces and everyone just gets locked into tokio, do we use std futures, core futures, futures futures, do we use futures stream, or async iterator

all I want to do is implement the postgresql wire protocol. I implemented AsyncIterator, Stream, Future; nothing gets polled more than one time with executor::block_on and "await" on the future, go figure,

Async functions - too high level, no control over pending (i.e. doesn't actually work)
Poll functions - seems to only run once, return pending, then hang forever (i.e. doesn't actually work)
AsyncIterator/Stream - many competing traits doing similar things, unclear how they work because there are very few examples, requires nightly stuff, and even with the nightly stuff on, doesn't work

Must be a skill issue but dipping my toes into async rust has ruined my preceding week because there's like 50 billion different blogs about how async could be done differently / better in the future but no real great guidance on how to do it correctly right now.

Call me crazy, but "impl Future" and "block_on" should ...just work, no runtime needed, but sadly they just don't actually work,

How do we have no standard Runtime trait? You wonder why there's a bunch of conflicting runtimes and every async crate just gets sucked into the tokio ecosystem, it's because there's no Runtime trait to standardize the functionality of a runtime.

IMHO avoiding async rust is desirable because it's an utter mess and bloats your compile time a ton, but if you need to do I/O there is little choice, if you need to use postgresql there is no choice, even the sync library for postgres depends on async. Surely there's a sync way to get at async, surely "await" means "poll until NOT pending" right?

anyway, no expectation for a reply, just feeling compelled to voice my outrage that the current state of async rust is considered acceptable by anyone, this is midwit territory for sure, we have overcomplicated it, how do we make async rust more fun and accessible and "just works" ?

Implementing the Future trait manually is not something you're really supposed to do as a user. Use async await instead.

And to be absolutely clear, you do not need to use Tokio-specific interfaces to manually implement Future. You need to use the standard library waker type which is provided to you in the context argument to poll. The documentation for Pending explains:

Represents that a value is not ready yet.

When a function returns Pending, the function must also ensure that the current task is scheduled to be awoken when progress can be made.

Or the documentation for Future says:

When a future is not ready yet, poll returns Poll::Pending and stores a clone of the Waker copied from the current Context. This Waker is then woken once the future can make progress. For example, a future waiting for a socket to become readable would call .clone() on the Waker and store it. When a signal arrives elsewhere indicating that the socket is readable, Waker::wake is called and the socket future’s task is awoken. Once a task has been woken up, it should attempt to poll the future again, which may or may not produce a final value.

[...]

The poll function is not called repeatedly in a tight loop – instead, it should only be called when the future indicates that it is ready to make progress (by calling wake()). If you’re familiar with the poll(2) or select(2) syscalls on Unix it’s worth noting that futures typically do not suffer the same problems of “all wakeups must poll all events”; they are more like epoll(4).

Future in std::future - Rust

The future trait works in the way it does for very good reasons. Futures must notify the runtime when they want to be polled again. Otherwise the runtime would have to continually poll all futures all the time which is a massive waste of resources.
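A minimal sketch of what the quoted docs describe, using only the standard library. The names ThreadWaker, block_on, Shared, Reply and demo are hypothetical, invented for this illustration, and the executor is deliberately naive: it polls, then parks the thread until someone calls wake().

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical minimal executor: poll, then sleep until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Not a hot loop: the thread sleeps until wake() unparks it.
            Poll::Pending => thread::park(),
        }
    }
}

/// State shared between the future and whoever produces the value.
#[derive(Default)]
struct Shared {
    value: Option<u32>,
    waker: Option<Waker>,
}

/// Hypothetical future that completes once another thread supplies a value.
struct Reply(Arc<Mutex<Shared>>);

impl Future for Reply {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let mut shared = self.0.lock().unwrap();
        match shared.value.take() {
            Some(v) => Poll::Ready(v),
            None => {
                // The easy-to-miss step: keep the waker so the producer
                // can request another poll later.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Spawns a producer thread, then blocks until the reply arrives.
fn demo() -> u32 {
    let shared = Arc::new(Mutex::new(Shared::default()));
    let producer = Arc::clone(&shared);
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        let mut s = producer.lock().unwrap();
        s.value = Some(42);
        if let Some(w) = s.waker.take() {
            w.wake(); // without this call, block_on would park forever
        }
    });
    block_on(Reply(shared))
}
```

Calling demo() from main should return the produced value. Deleting the w.wake() line reproduces the "polls once, returns pending, then hangs" behaviour from the first post.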

thank you, my bad for posting while frustrated, I'm embarrassed to have been a jerk and noob simultaneously, that's life sometimes. It's pretty funny I made the most obvious noob-didn't-call-wake error in hindsight.

It was indeed my error, because I didn't intuit we must wake futures from inside themselves while they are asleep; something about that seems so counterintuitive it failed to sink in for me! I am not sure what I was thinking but I figured pending without sleeping would desugar to a hot loop.

Not ready? Check again.

For this reason I wrongly expected the Poll::Pending to be a signal to re-poll; and the timing would involve sleeping before you return that. I'm not sure I'd describe setting my alarm clock as running a background thread in my brain to invoke the wake() method when I'm ready for the alarm to go off.

Would it be possible for the Waker to have timer / alarm clock style methods like "wake_after(multiplier, time_unit)" and "wake_at(iso_8601_timestamp)" ? Maybe this would be more understandable way to communicate waking?
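For what it's worth, an alarm-clock style helper can be layered on top of the existing Waker without changing it. A rough sketch, assuming a hypothetical free function wake_after (not a std API) that spends one thread per timer; real runtimes use a shared timer queue instead. Delay, block_on, ThreadWaker and demo are likewise made up for the example.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::{Duration, Instant};

/// Hypothetical helper, not part of std: wake `waker` once `delay` has passed.
fn wake_after(waker: Waker, delay: Duration) {
    thread::spawn(move || {
        thread::sleep(delay);
        waker.wake();
    });
}

/// Hypothetical timer future built on `wake_after`.
struct Delay {
    deadline: Instant,
}

impl Delay {
    fn new(delay: Duration) -> Self {
        Delay { deadline: Instant::now() + delay }
    }
}

impl Future for Delay {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let now = Instant::now();
        if now >= self.deadline {
            Poll::Ready(())
        } else {
            // Set the alarm, then go back to sleep.
            wake_after(cx.waker().clone(), self.deadline - now);
            Poll::Pending
        }
    }
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Naive executor: poll, then park until the waker fires.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Returns how long the 50 ms delay actually took.
fn demo() -> Duration {
    let start = Instant::now();
    block_on(Delay::new(Duration::from_millis(50)));
    start.elapsed()
}
```

The point of the sketch is that a timer is just one more event source that holds a Waker; the same mechanism also covers sockets, channels and anything else that cannot say in advance when it will be ready.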

Suppose there was an API for automation that looked like this

Automation<PollingSchedule, Trigger, Payload>

on the schedule, you poll the trigger, if it's ready, then you poll the payload. This is how I assumed it worked; the scheduling separate from the triggering of being ready. It seems Poll = PollingSchedule+Trigger+Payload all wrapped in one. Are we sure it's optimal to tie the schedule of the polling of the trigger directly into the trigger itself? If Pending means sleep, could we put sleep schedule into Pending instead of mutating Waker? Perhaps the pending state could say when to re-poll (if ever)

Pin<&mut Self> is a weird type: a rustacean learns to praise the borrow checker and work with pointers a certain way. This is central to the Rust experience. Pin<&mut Self> inserts a middleman between rustacean and borrow checker, and forces us to learn a different API to borrow.

Do you think Async Rust's Poll Pending/Ready is like the Halting Problem? Could we face the same problem Alan Turing did: we only give ourselves 2 options, pending or ready, when we need a third (or more)? That forces us to use Ready(Result).

It seems like instead of the Future morphing itself into the Ready state, a functional approach could be simpler ?

could become easier if we use states like this?

pub enum TernaryDecision<A, B, C> {
    Loop(A), // equivalent to (Pending, Pin<&mut Self>)
    Halt(B), // equivalent to Ready<T>, done from inside
    Stop(C), // cancellation from outside
}

or even

pub enum Decision<A, B, C = (), D = anyhow::Error> {
    Initial,
    Loop(A), // equivalent to (Pending, Pin<&mut Self>)
    Halt(B), // equivalent to Ready<T>, done from inside
    Stop(C), // cancellation from outside
    Fail(D), // cancellation from inside
}

seems more complicated with more variants but could be less complicated in the long run because it has explicit variants to signal initial, stop and fail conditions instead of nesting result, and also by letting us avoid needing to deal with Pin, because it's an owned receiver for Futures and Streams and Sinks ?

Can you or anyone please explain like I'm a five year old with a phd why we want mutable self references to receivers for futures, streams, and sinks? What's the benefit of the mutable self reference over taking ownership of the thing and just returning a new thing? Is it necessary to use mutable self references so other things can carry pointers to mutant futures before and after it mutates itself from the pending to the ready state?

My bad again for being a dope. Thank you for checking the docs and you're right I should have seen that and realized what it meant.

I'm a novice in this area, but just noting that there is explanation about Pinning in the async book, if you haven't seen it.

Think of Poll::Pending as just meaning "I've got more work to do" and Poll::Ready as "I'm done", by itself Poll doesn't represent anything about when to poll again, that's what the waker is for.

Ideally, the future will only wake when it knows it will be able to make progress, for example if it's waiting for bytes to arrive from the network it will add the connection to the set of things the OS will notify it about, and when a notification arrives it will find the waker related to that handle and wake it. (Depending on OS this might be using epoll, kqueue, IO Completion Ports, or a few others...)

Clearly this is both more efficient and impossible to generalize away from the specific async runtime that implements the read call etc. The general part of futures is the ability to compose futures together, and that's what the API is designed to support.
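To make the split between Poll and the waker concrete, here is a small sketch. Everything in it (CountingWaker, Countdown, run) is hypothetical and exists only for the illustration. The loop in run polls again only if a wake was recorded during the previous poll, so a future that returns Pending without waking is reported as stalled instead of hanging.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that only counts how often it was woken.
#[derive(Default)]
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical future that needs `remaining` extra polls before finishing.
struct Countdown {
    remaining: u32,
    wake: bool, // set to false to reproduce the "forgot to wake" bug
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            return Poll::Ready("done");
        }
        self.remaining -= 1;
        if self.wake {
            // "Not ready? Check again": request another poll right away.
            cx.waker().wake_by_ref();
        }
        Poll::Pending
    }
}

/// Returns the output and the number of polls, or None if the future
/// returned Pending without arranging any wake-up.
fn run<F: Future>(fut: F) -> Option<(F::Output, usize)> {
    let counter = Arc::new(CountingWaker::default());
    let waker = Waker::from(Arc::clone(&counter));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let mut polls = 0;
    loop {
        let wakes_before = counter.0.load(Ordering::SeqCst);
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return Some((out, polls));
        }
        if counter.0.load(Ordering::SeqCst) == wakes_before {
            return None; // Pending with no wake: nothing will ever re-poll
        }
    }
}
```

With wake set to true the countdown finishes after remaining + 1 polls; with wake set to false, run reports the stall after the first poll. Self-waking like this is effectively a busy loop, which is why real futures hand the waker to the OS notification machinery described above instead.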


Futures need to pin, and take a mutable reference for the same reason: the actual contents of the Future can contain internal references, for example:

async fn foo() {
    let mut buf = [0u8; 256];
    // `net` stands in for some async reader; the read future borrows `buf`.
    net.read(&mut buf).await;
    ...
}

The future this returns contains buf and a reference to it that was passed to read, meaning if you move the future after it started the reference would still be pointing to the old location.

To fix this, you can only poll a future after you promise to the compiler that you won't move it any more, and that's what Pin is, and you need to use a mut ref to avoid moving it while still updating it.
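A small sketch of that promise in practice, polling a compiler-generated future by hand. NoopWaker, YieldOnce, fill_and_sum and demo are hypothetical names made up for the example; the only real APIs are the std ones (std::pin::pin!, Future::poll, Wake).

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; good enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Returns Pending on the first poll and Ready on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// `view` borrows `buf`, and both are kept alive across the await.
async fn fill_and_sum() -> u32 {
    let mut buf = [0u8; 4];
    let view = &mut buf; // a reference into the future's own state
    YieldOnce(false).await; // suspension point
    view[0] = 7;
    buf.iter().map(|&b| u32::from(b)).sum()
}

fn demo() -> u32 {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // `pin!` pins the future to this stack frame; from here on it can
    // only be reached through Pin<&mut _>, so it can no longer be moved.
    let mut fut = pin!(fill_and_sum());
    // First poll runs up to the await and suspends.
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    // Second poll resumes after the await, with `view` still valid.
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(sum) => sum,
        Poll::Pending => unreachable!("YieldOnce is ready on its second poll"),
    }
}
```

Calling poll directly on fill_and_sum() without pinning does not compile, because poll takes Pin<&mut Self>. An owned-receiver design would have to move the future on every poll, and that move is what would invalidate a reference like view.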

The way that Tokio implements timers is like this:

  1. There's a Sleep future that you can create.
  2. Inside the Sleep future, there are prev/next pointers that allow it to be part of a linked list of all timers on the runtime.
  3. When you call poll on a Sleep, it adds itself to the linked list of all pending timers on the runtime.
  4. To trigger timers, Tokio traverses the linked list of timers.

If we used ownership instead of pinned mutable references when polling it, then this implementation would be impossible because passing ownership of the Sleep into a poll method involves moving the Sleep, but such a move would not update the pointers to the Sleep that other items in the linked list have, so calling poll would break the linked list.

However, with the pinned mutable references, Tokio can be confident that the Sleep future will not move once poll has been called. That means that the linked list implementation can work. The linked list implementation is nice because it avoids allocating memory when you add a new timer.

(Actually Tokio doesn't just have one linked list of timers, but many linked lists grouping together timers with similar expiration timestamps.)

The futures generated by the compiler by async/await syntax have similar reasons for preferring pinned references.
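For readers who want to see the register/traverse/wake cycle in code, here is a deliberately simplified sketch. It is not how Tokio does it: instead of an intrusive linked list threaded through pinned Sleep futures, it uses a plain Vec that allocates, which is exactly the cost the pinned design avoids. TimerQueue, CountingWaker and demo are hypothetical names for the illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};
use std::time::{Duration, Instant};

/// Hypothetical, allocation-based stand-in for a runtime's timer list.
#[derive(Default)]
struct TimerQueue {
    pending: Vec<(Instant, Waker)>,
}

impl TimerQueue {
    /// What a sleep future would do when it is first polled.
    fn register(&mut self, deadline: Instant, waker: Waker) {
        self.pending.push((deadline, waker));
    }

    /// What the runtime does periodically: wake every expired timer.
    /// Returns how many timers fired.
    fn fire_expired(&mut self, now: Instant) -> usize {
        let (expired, waiting): (Vec<_>, Vec<_>) = self
            .pending
            .drain(..)
            .partition(|(deadline, _)| *deadline <= now);
        self.pending = waiting;
        let fired = expired.len();
        for (_, waker) in expired {
            waker.wake();
        }
        fired
    }
}

/// Waker that only counts how often it was woken.
#[derive(Default)]
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns (timers fired, timers still pending, wake calls observed).
fn demo() -> (usize, usize, usize) {
    let counter = Arc::new(CountingWaker::default());
    let start = Instant::now();
    let mut timers = TimerQueue::default();
    timers.register(start + Duration::from_millis(10), Waker::from(Arc::clone(&counter)));
    timers.register(start + Duration::from_millis(500), Waker::from(Arc::clone(&counter)));
    // Pretend 20 ms have passed; no real sleeping is needed for the sketch.
    let fired = timers.fire_expired(start + Duration::from_millis(20));
    (fired, timers.pending.len(), counter.0.load(Ordering::SeqCst))
}
```

In this version each entry owns a cloned Waker and nothing points back into the future, so moving the future would be harmless. The intrusive list described above removes the allocation by storing the list node inside the Sleep itself, and that only works if the Sleep is guaranteed to stay put.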
