Generator<Yield=&[u8]> -> impl Read?

I was having fun with encodings and noticed that a generator that yields byte slices can be wrapped fairly trivially in an implementation of io::Read. So I implemented this (playground link), which compiles and runs successfully. But I hit a couple of issues when trying to extend it:

  • The generator can't yield a reference that borrows from data it owns. I tried to work around this by adding lifetime annotations, but I think I'm missing something. I suspect that resolving this issue could mean the whole design has to be scrapped.
  • There are several boxes and I feel like they're not all necessary, but I can't seem to remove them.

Thoughts?

#![feature(generators, generator_trait)]

use std::{
    boxed::Box,
    io::{self, Read},
    ops::{Generator, GeneratorState},
    pin::Pin,
};

struct GeneratorReader<'a, G>
where
    G: Generator<Yield = &'a [u8], Return = ()>,
{
    generator: Pin<Box<G>>,
    bytes: &'a [u8],
}

fn new_generator_reader<'a, G>(generator: G) -> GeneratorReader<'a, G>
where
    G: Generator<Yield = &'a [u8], Return = ()>,
{
    GeneratorReader {
        generator: Box::pin(generator),
        bytes: &[],
    }
}

impl<'a, G> Read for GeneratorReader<'a, G>
where
    G: Generator<Yield = &'a [u8], Return = ()>,
{
    fn read(&mut self, mut buf: &mut [u8]) -> io::Result<usize> {
        let mut count = 0;
        while !buf.is_empty() {
            while self.bytes.is_empty() {
                self.bytes = match Pin::new(&mut self.generator).resume(()) {
                    GeneratorState::Yielded(b) => b,
                    GeneratorState::Complete(()) => return Ok(count),
                };
            }
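            // `min` was never imported; use the inherent `Ord::min` method instead.
            let n = self.bytes.len().min(buf.len());
            // `u8` is `Copy`, so `copy_from_slice` is the idiomatic choice here.
            buf[..n].copy_from_slice(&self.bytes[..n]);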
            buf = &mut buf[n..];
            self.bytes = &self.bytes[n..];
            count += n;
        }
        Ok(count)
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    
    let mut genreader = new_generator_reader(Box::new(|| {
        let a: &[u8] = b"abc";
        yield a;
        let b: &[u8] = b"def";
        yield b;
        // Not quite as useful as expected
        //let c: &[u8] = &(128u64).to_le_bytes();
        //yield c;
        // ...
    }));

    io::copy(&mut genreader, &mut io::stdout())?;
    
    Ok(())
}

I think this requires generic associated types because it's logically equivalent to a "lending iterator".

For example, imagine writing the constructor for a generator that reads bytes from a file into a local buffer.

fn reader() -> impl Generator<Yield = &'_ [u8], Return = ()> {
    move || {
        let mut f = File::open("./input.txt").unwrap();
        let mut buffer = [0; 1024];

        loop {
            match f.read(&mut buffer) {
                Ok(bytes_read) => {
                    yield &buffer[..bytes_read];

                    if bytes_read == 0 {
                        return;
                    }
                }
                Err(e) => panic!("read failed: {}", e),
            }
        }
    }
}

There are two problems here:

  1. We want something like a 'this lifetime on the yielded byte slice to indicate that the slice is borrowed from the generator object itself, but that's not possible.
  2. If we yield a &[u8], we need some way of making sure the caller can't resume our generator until they've dropped the slice; otherwise the generator could overwrite the buffer while the caller is still using the slice.

Alas, I'm afraid so: right now Generators are only able to emulate the API of Iterator, which is a non-lending one:

/// Iterator::next
fn next<'next> (self: &'next mut Self)
  -> Option<Self::Item>

Notice how the return type does not depend on 'next, and compare that to a LendingIterator's next:

/// LendingIterator::next
fn next<'next> (self: &'next mut Self)
  -> Option<Self::Item<'next>>
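
To make the difference concrete, here is a minimal sketch of what such a trait could look like with generic associated types. Nothing here is in std: `LendingIterator`, `Chunks`, and `collect_all` are hypothetical names made up for illustration, and the sketch assumes a toolchain where GATs are available.

```rust
// Hypothetical trait, not part of std: a sketch of a "lending iterator".
trait LendingIterator {
    // The item type is allowed to borrow from the iterator itself.
    type Item<'next>
    where
        Self: 'next;

    fn next<'next>(&'next mut self) -> Option<Self::Item<'next>>;
}

// Hypothetical example type: lends slices of a buffer that it owns and reuses.
struct Chunks {
    source: Vec<u8>,
    pos: usize,
    buffer: [u8; 4],
}

impl LendingIterator for Chunks {
    type Item<'next> = &'next [u8] where Self: 'next;

    fn next<'next>(&'next mut self) -> Option<&'next [u8]> {
        if self.pos >= self.source.len() {
            return None;
        }
        let n = (self.source.len() - self.pos).min(self.buffer.len());
        // Overwrite the internal buffer. This is only sound because the
        // previously lent slice cannot still be alive at this point.
        self.buffer[..n].copy_from_slice(&self.source[self.pos..self.pos + n]);
        self.pos += n;
        Some(&self.buffer[..n])
    }
}

// Each lent slice has to be used up before the next call to `next`.
fn collect_all(mut chunks: Chunks) -> Vec<u8> {
    let mut out = Vec::new();
    while let Some(slice) = chunks.next() {
        out.extend_from_slice(slice);
    }
    out
}
```

Because the returned slice borrows `self` for `'next`, holding on to one slice while calling `next` again should be rejected by the borrow checker, which is the guarantee a plain `Iterator` (and today's `Generator`) cannot express.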

Yeah, the Box::new() in main ought to be removable; the one inside GeneratorReader is harder: you would have to take a Pin<&'_ mut G> rather than a G, and at that point, you could simply take a G with an extra Unpin bound.

  • Playground (notice how I've added "fusing" to the generator with an extra exhausted flag, since otherwise we would be susceptible to resuming the generator after completion).
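
For readers who can't open the playground, here is a rough sketch of the shape described above. It is not the playground code: since Generator is nightly-only, it uses a hypothetical `Resume` trait and `State` enum as stand-ins for `Generator` and `GeneratorState`, and `FusedReader` and `TwoChunks` are likewise made-up names. The points it tries to show are the `Unpin` bound (no Box needed) and the `exhausted` flag.

```rust
use std::io::{self, Read};
use std::pin::Pin;

// Hypothetical stand-in for GeneratorState<&'static [u8], ()>.
enum State {
    Yielded(&'static [u8]),
    Complete,
}

// Hypothetical stand-in for Generator: same pinned receiver as `resume`.
trait Resume {
    fn resume(self: Pin<&mut Self>) -> State;
}

struct FusedReader<G> {
    generator: G, // stored by value: no Box, thanks to the Unpin bound
    bytes: &'static [u8],
    exhausted: bool,
}

impl<G: Resume + Unpin> FusedReader<G> {
    fn new(generator: G) -> Self {
        FusedReader { generator, bytes: &[], exhausted: false }
    }
}

impl<G: Resume + Unpin> Read for FusedReader<G> {
    fn read(&mut self, mut buf: &mut [u8]) -> io::Result<usize> {
        let mut count = 0;
        while !buf.is_empty() {
            while self.bytes.is_empty() {
                // "Fusing": never resume again once completion was observed.
                if self.exhausted {
                    return Ok(count);
                }
                // `G: Unpin` is what makes the safe `Pin::new` available.
                match Pin::new(&mut self.generator).resume() {
                    State::Yielded(b) => self.bytes = b,
                    State::Complete => self.exhausted = true,
                }
            }
            let n = self.bytes.len().min(buf.len());
            buf[..n].copy_from_slice(&self.bytes[..n]);
            buf = &mut buf[n..];
            self.bytes = &self.bytes[n..];
            count += n;
        }
        Ok(count)
    }
}

// Hypothetical "generator" that panics if resumed after completing,
// mimicking what a real generator does.
struct TwoChunks {
    step: u8,
}

impl Resume for TwoChunks {
    fn resume(mut self: Pin<&mut Self>) -> State {
        self.step += 1;
        match self.step {
            1 => State::Yielded(b"abc"),
            2 => State::Yielded(b"def"),
            3 => State::Complete,
            _ => panic!("resumed after completion"),
        }
    }
}
```

Without the `exhausted` check, a second `read` after end-of-stream would resume `TwoChunks` a fourth time and hit the panic; with it, later reads just return `Ok(0)`.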

Yes I was imagining something like these issues but the details were fuzzy to me. Thanks for giving a name to the concept: "Lending Iterator".

It seems to me that the hypothetical LendingIterator::next would solve the problems mentioned by @Michael-F-Bryan. #2 in particular is solved because next takes self as &mut (i.e. exclusive), and the caller can't obtain a new exclusive borrow until it drops the item, because the item borrows self for the same lifetime. Is that a reasonable description of how that would work?

Good catch adding exhausted to guard against additional reads after the generator has completed. Thanks for cleaning up the Pins; I wasn't exactly sure what was needed. I suppose Unpin is needed in the function so that the function can place the (moved) generator into the struct, because a pinned object can't be moved, yes?

Why do you need a new Pin every time you call resume? I understand that mechanically resume moves self, but why is it designed that way? Or maybe where can I read more about the design of Pin & friends?
