Is an Iterator a coroutine/generator?

Hi all,

I've done a bit of searching and tbh I don't really understand the lengthy discussions on the difference between the two. So I'm looking for simple answers aimed at people who are not comp-sci wizards.

I know that coroutines/generators are coming to Rust, and i was just curious why, if we already have Iterators?

Thank you.

Iterators are roughly generators, but coroutines are a more general and potentially more powerful concept.

Iterators (and generators) are a subset of coroutines that return execution back to the place they were called from. Coroutines can control what code is executed after they yield (return).

Iterator is an interface. It describes in an abstract way how to read data.

Generators are a language feature that allows writing stateful-but-interruptible code using more straightforward syntax. They are about generating the data.

Rust has Iterators, but writing a custom iterator (not just chaining existing filter/map) requires writing a next method and saving/restoring state manually.

A generator syntax would allow just writing a loop with yield instead.

It's like Future interface in Rust. The Future interface has poll_next(), but writing it is cumbersome. async fn syntax writes the poll_next() for you and makes it easy.

I think about iterators and generators as a specialized coroutines, i.e. Iterator<T> = Generator<Yield = T, Return = ()> and Generator<Yield = T, Return = F> = Coroutine<Yield = T, Arg = (), Return = F>. And this view maps really nicely to for loops.

So a generator would desugar into an Iterator?

Why is that useful? And would our iterators upgrade to coroutines, once that feature hits, or will they live alongisde?

I think @m51's explanation was not great here. In pure world you can not influence elements which iterator/generator will return after its creation (though there are "impure" ways such as implicit shared states), while with coroutine you can explicitly pass data for each "iteration", thus influencing its execution. The most prominent example is async context used in futures.

Ideally iterator and generator would be just specialized aliases for coroutine, but unfortunately due to the backwards compatibility guarantees, it's highly unlikely to happen until Rust 2 (i.e. effectively never).

Coroutines are useful for things like encoding state machines in a single function.

https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Yes, in the same way that |...| ... closures are a mechanism to create instances of an ad-hoc type that implements the Fn... interface,

Example
let one: i32 = 1;
let increment = move |y: i32| y + one;

is sugar for:

let one: i32 = 1;
let increment = {
    struct Closure {
        // captures
        one: i32,
    }
    impl Fn... for Closure {
        ...
        fn (self: ..., y: i32) -> i32
        {
            y + self.one
        }
    }

    Closure { one }
};

|...| { ... yield ... } generators will be a mechanism to define an ad-hoc type that implements the Generator interface, which is very similar to the Iterator one (it just has bit more versatility). But we can actually discard that versatily (generator's .next() function, called .resume(), can receive some state, and when the generator ends, it can also give some state back, where an iterator cannot. But if we discard both these states, there is a trivial conversion from a (pinned) generator to an iterator:

#![feature(generators, generator_trait)]
mod helper {
    use ::std::{
        pin::Pin,
        ops::{Generator, GeneratorState},
    };
    
    pub
    fn generator_to_iterator<'gen, Item> (
        mut gen: impl Generator<Yield = Item, Return = ()> + Unpin + 'gen,
    ) -> impl Iterator<Item = Item> + 'gen
    {
        ::std::iter::from_fn(move || match Pin::new(&mut gen).resume(()) {
            | GeneratorState::Yielded(item) => Some(item),
            | GeneratorState::Complete(()) => None,
        })
    }
}

This way, one can now write things like:

fn prime_divisors (mut n: u64) -> impl Iterator<Item = u64>
{
    let mut p = 2; // extra initial state
    helper::generator_to_iterator(move /* n, p */ || {
        while n > 1 {
            while n % p != 0 {
                p += 1; // quite naive, for the sake of the example
            }
            yield p;
            n /= p;
        }
    })
}

And what the Rust compiler has done is convert the above into something with the semantics of:

fn prime_divisors (n: u64) -> impl Iterator<Item = u64>
{
    let p = 2; // extra initial state
    return MyIterator { n, p };
    // where
    struct MyIterator { n: u64, p: u64 }
    impl Iterator for MyIterator {
        type Item = u64;
        
        fn next (self: &'_ mut Self)
          -> Option<u64>
        {
            if self.n > 1 {
                while self.n % self.p != 0 {
                    self.p += 1;
                }
                self.n /= self.p; // prepare state for the next `.next()` call.
                return Some(self.p);
            }
            None // done
        }
    }
}

And, to be honest, I have cheated a bit with this example, technically the true translation is a full-fledged state machine,

  • (and one that can even feature self-references, like async-generated Futures do, which then requires pin_mut! or Box::pin in order to be pollable).

See, for instance,

let x: i32 = 42;
generator_to_iterator(move /* x */ || {
    // <- initial state, captures x,
    yield 2;
    // <- state after the first yield, still captures x
    let y: i32 = get_y();
    yield 3;
    // <- state after the second yield, captures `x` and `y`
    yield x;
    // <- state after the third yield, capturs `y`, but no longer needs to capture `x`
    yield y;
    // <- final state, does not need to capture anything (since this `return`s just a `()`)

would be converted into;

enum InternalState {
    Initial { x: i32 },
    AfterFirstYield { x: i32 },
    AfterSecondYield { x: i32, y: i32 },
    AfterThirdYield { y: i32 },
    Final { /* nothing  */ },
}

impl Generator for InternalState {
    type Yield = i32;
    type Return = ();

    fn resume (self: Pin<&'_ mut Self>, _: ())
      -> GeneratorState<i32, ()>
    {
        use GeneratorState::{Yielded, Completed};
        use InternalState::*;
        match *self {
            | Initial { x } => {
                *self = AfterFirstYield { x }; // prepare for next
                Yielded(2)
            },
            | AfterFirstYield { x } => {
                let y = get_y();
                *self = AfterSecondYield { x, y }; // prepare...
                Yielded(3)
            },
            | AfterSecondYield { x, y } => {
                *self = AfterThirdYield { y };
                Yielded(x)
            },
            | AfterThirdYield { y } => {
                *self = Final { /* nothing */ };
                Yielded(y)
            },
            | Final { .. } => {
                Completed(())
            },
        }
    }
}
4 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.