Creating a struct with a Peekable iterator as a field -- Confusion as to when types match

I'm exploring different Rust idioms for writing parsers for bioinformatics file formats.

For one particular format (FASTA), I find it useful to be able to look ahead one line, so I've written a function (next_record) that takes a Peekable Iterator as illustrated in the following code:

use std::io::{BufRead, BufReader};
use std::iter::Peekable;

#[derive(Debug, Default)]
pub struct Record {
    pub id: String,
    pub description: String,
    pub sequence: String,
}

impl Record {
    fn new() -> Record {
        Record {
            ..Default::default()
        }
    }
}

pub fn next_record<I>(itr: &mut Peekable<I>) -> Option<Record>
where
    I: Iterator<Item = String>,
{
    let mut active_record = false;
    let mut rec = Record::new();

    while let Some(nextline) = itr.peek() {
        let nextline = nextline.trim();
        match nextline.chars().next() {
            None | Some(';') => (),
            Some('>') if active_record => {
                return Some(rec);
            }
            Some('>') => {
                active_record = true;
                let mut parts = nextline[1..].splitn(2, char::is_whitespace);
                rec.id = parts.next().unwrap_or("").to_owned();
                rec.description = parts.next().unwrap_or("").to_owned();
            }
            Some(_) if active_record => rec.sequence.push_str(nextline),
            _ => (),
        }
        // consume the line we just peeked at
        itr.next();
    }
    if active_record {
        Some(rec)
    } else {
        None
    }
}

fn main() {
    let s = "
    ; a comment line
    >id1 description1
    ATAGACGAGCAG
    >id2 description2
    ATAGATAGATA
    ";

    // using next_record with peekable derived from BufRead
    let rdr = BufReader::new(s.as_bytes());
    let mut itr = rdr.lines().filter_map(Result::ok).peekable();
    while let Some(rec) = next_record(&mut itr) {
        println!("{:?}", rec);
    }

    // using next_record with peekable derived from &str
    let mut itr = s.lines().map(str::to_owned).peekable();
    while let Some(rec) = next_record(&mut itr) {
        println!("{:?}", rec);
    }
}


Now I'm trying to implement an iterator based around next_record() like so:

pub struct FastaIterator<I>
where
    I: Iterator<Item = String>,
{
    itr: Peekable<I>,
}

impl<I> FastaIterator<I>
where
    I: Iterator<Item = String>,
{
    pub fn from_rdr(rdr: impl BufRead) -> FastaIterator<I> {
        FastaIterator {
            itr: rdr.lines().filter_map(Result::ok).peekable(),
        }
    }
    pub fn from_str(s: &str) -> FastaIterator<I> {
        FastaIterator {
            itr: s.lines().map(str::to_owned).peekable(),
        }
    }
}

/// Implements Iterator trait
impl<I> Iterator for FastaIterator<I>
where
    I: Iterator<Item = String>,
{
    type Item = Record;

    // next() is the only required method
    fn next(&mut self) -> Option<Self::Item> {
        next_record(&mut self.itr)
    }
}

However, my from_rdr() and from_str() functions are rejected with the error message that they do not match the type std::iter::Peekable<I>.

My question -- Why do these constructs (rdr.lines().. and s.lines()..) match the Peekable constraints when I call them in main (as in the first code example), but not when I try to associate them with a struct?

Your problem is not with rdr.lines().filter_map(Result::ok).peekable() itself; it lies somewhere else entirely. A type written in return position has to be a complete type that can be determined from the function's inputs, and here nothing in the arguments determines I. The concrete type is very difficult to name (it is quoted in your error message), so the easy solution is to change the return type to impl Iterator<Item = Record>. impl Trait in return position hides the concrete type from the caller and lets the compiler fill it in from the function body.

Once you do this, from_str will run into an additional lifetime problem, because the iterator it returns borrows from s. The solution is to add + '_ to the return type.
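To illustrate the lifetime point, here is a minimal sketch with two free functions. The names owned_lines and reader_lines are made up for this example and are not part of the original code. The first returns an iterator that borrows from its argument, so the opaque return type has to be allowed to mention that borrow; the second takes ownership of the reader, so there is nothing borrowed to mention. As far as I know the + '_ is only required on editions before 2024, but it is harmless on newer ones.

```rust
use std::io::BufRead;

// Hypothetical helper: the iterator keeps a reference into `s`, so the
// hidden return type must be allowed to capture that lifetime (`+ '_`).
pub fn owned_lines(s: &str) -> impl Iterator<Item = String> + '_ {
    s.lines().map(str::to_owned)
}

// Hypothetical helper: the reader is moved into the iterator, so no
// borrowed lifetime is involved and no `+ '_` is needed.
pub fn reader_lines(rdr: impl BufRead) -> impl Iterator<Item = String> {
    // Lines that fail to read are silently dropped, as in the question.
    rdr.lines().filter_map(Result::ok)
}
```

Both can be collected the usual way, e.g. owned_lines(text).collect::<Vec<_>>().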

Thank you! As suggested, I've changed the FastaIterator constructors like so:

impl<I> FastaIterator<I>
where
    I: Iterator<Item = String>,
{
    pub fn from_bufread(rdr: impl BufRead) -> impl Iterator<Item = Record> {
        FastaIterator {
            itr: rdr.lines().filter_map(Result::ok).peekable(),
        }
    }
    pub fn from_str(s: &str) -> impl Iterator<Item = Record> + '_ {
        FastaIterator {
            itr: s.lines().map(str::to_owned).peekable(),
        }
    }
}

However, I'm now stuck on how to declare a variable to hold this iterator. I've tried to do the following:

fn main() {
    let s = "
    ; a comment line
    >id1 description1
    ATAGACGAGCAG
    TAGCAGATAGATA
    ATTATA
    >id2 description2
    ATAGATAGATA
    ";

    let mut itr = FastaIterator::from_str(s);
    while let Some(rec) = itr.next() {
        println!("{:?}", rec);
    }
}

But I get the compiler error:

error[E0284]: type annotations needed: cannot resolve `<_ as std::iter::Iterator>::Item == std::string::String`

and I can't figure out what annotation to put on itr to make the compiler happy.

impl<I> takes I as a parameter from the caller. The caller decides what type it is, not you.

Someone could implement Iterator<Item = String> for a struct Tralalala and ask for a FastaIterator<Tralalala>, and you'd have a type error, because your s.lines().map(str::to_owned).peekable() is not the Peekable<Tralalala> that the caller demanded.
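A small sketch of what "the caller decides" means in practice. Wrapper is a hypothetical stand-in for FastaIterator, and upper, peek_first, inferred_first and explicit_upper are names invented for this example. When I appears in the arguments, the call site fixes it; when it appears nowhere in the signature, the caller has to name some I by hand, and any type that meets the bound is accepted even though it is never used.

```rust
use std::iter::{Empty, Peekable};

// Hypothetical stand-in for FastaIterator<I>; only the generics matter here.
pub struct Wrapper<I>
where
    I: Iterator<Item = String>,
{
    itr: Peekable<I>,
}

impl<I> Wrapper<I>
where
    I: Iterator<Item = String>,
{
    // `I` appears in the argument list, so the call site determines it.
    pub fn new(lines: I) -> Wrapper<I> {
        Wrapper { itr: lines.peekable() }
    }

    // `I` appears nowhere in this signature, so nothing at the call site
    // pins it down and the caller must spell it out.
    pub fn upper(s: &str) -> impl Iterator<Item = String> + '_ {
        s.lines().map(str::to_uppercase)
    }

    pub fn peek_first(&mut self) -> Option<&String> {
        self.itr.peek()
    }
}

// `I` is inferred as std::vec::IntoIter<String> from the argument.
pub fn inferred_first() -> Option<String> {
    let mut w = Wrapper::new(vec!["ACGT".to_string()].into_iter());
    w.peek_first().cloned()
}

// `Empty<String>` is an arbitrary choice that satisfies the bound;
// it only exists to give the compiler *some* `I`.
pub fn explicit_upper() -> Vec<String> {
    Wrapper::<Empty<String>>::upper("ac\ngt").collect()
}
```

Writing Wrapper::upper("ac\ngt") without the turbofish leaves I undetermined, which is the same situation as the E0284 error quoted in the question.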

  1. Put the two functions in two separate impl blocks, because they use different and incompatible iterator types. Rust is not Java: impl Iterator or I: Iterator still stands for one specific type, not a base class.

  2. If you want one type to handle both cases, you'll need dynamic dispatch, with something like struct FastaIterator { itr: Peekable<Box<dyn Iterator<Item = Result<String, std::io::Error>>>> }
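The two options above could look roughly like the sketch below. Treat it as one possible reading under a few assumptions: the aliases StrLines and ReaderLines and the type DynFastaIterator are names invented here, the adapter functions are coerced to plain fn pointers so that the concrete iterator types can be written down, and the boxed variant uses Item = String rather than Result<String, std::io::Error> so that it can reuse next_record unchanged. next_record is a condensed equivalent of the one in the question.

```rust
use std::io::{self, BufRead};
use std::iter::{FilterMap, Map, Peekable};

#[derive(Debug, PartialEq)]
pub struct Record {
    pub id: String,
    pub description: String,
    pub sequence: String,
}

// Condensed equivalent of next_record() from the question.
pub fn next_record<I: Iterator<Item = String>>(itr: &mut Peekable<I>) -> Option<Record> {
    let mut rec: Option<Record> = None;
    while let Some(line) = itr.peek() {
        let line = line.trim();
        if line.starts_with('>') {
            if rec.is_some() {
                break; // header of the following record: leave it unconsumed
            }
            let mut parts = line[1..].splitn(2, char::is_whitespace);
            rec = Some(Record {
                id: parts.next().unwrap_or("").to_owned(),
                description: parts.next().unwrap_or("").to_owned(),
                sequence: String::new(),
            });
        } else if !line.starts_with(';') {
            if let Some(r) = rec.as_mut() {
                r.sequence.push_str(line);
            }
        }
        itr.next();
    }
    rec
}

pub struct FastaIterator<I: Iterator<Item = String>> {
    itr: Peekable<I>,
}

impl<I: Iterator<Item = String>> Iterator for FastaIterator<I> {
    type Item = Record;
    fn next(&mut self) -> Option<Record> {
        next_record(&mut self.itr)
    }
}

// Option 1: one impl block per concrete line iterator, types spelled out.
type StrLines<'a> = Map<std::str::Lines<'a>, fn(&'a str) -> String>;
type ReaderLines<R> = FilterMap<io::Lines<R>, fn(io::Result<String>) -> Option<String>>;

impl<'a> FastaIterator<StrLines<'a>> {
    pub fn from_str(s: &'a str) -> Self {
        let to_owned: fn(&'a str) -> String = str::to_owned;
        FastaIterator { itr: s.lines().map(to_owned).peekable() }
    }
}

impl<R: BufRead> FastaIterator<ReaderLines<R>> {
    pub fn from_bufread(rdr: R) -> Self {
        let ok: fn(io::Result<String>) -> Option<String> = Result::ok;
        FastaIterator { itr: rdr.lines().filter_map(ok).peekable() }
    }
}

// Option 2: a single non-generic type; the line iterator is a boxed trait object.
pub struct DynFastaIterator<'a> {
    itr: Peekable<Box<dyn Iterator<Item = String> + 'a>>,
}

impl<'a> DynFastaIterator<'a> {
    pub fn from_str(s: &'a str) -> Self {
        let lines: Box<dyn Iterator<Item = String> + 'a> =
            Box::new(s.lines().map(str::to_owned));
        DynFastaIterator { itr: lines.peekable() }
    }

    pub fn from_bufread(rdr: impl BufRead + 'a) -> Self {
        let lines: Box<dyn Iterator<Item = String> + 'a> =
            Box::new(rdr.lines().filter_map(Result::ok));
        DynFastaIterator { itr: lines.peekable() }
    }
}

impl Iterator for DynFastaIterator<'_> {
    type Item = Record;
    fn next(&mut self) -> Option<Record> {
        next_record(&mut self.itr)
    }
}
```

With either variant the call site no longer needs an annotation: let mut itr = FastaIterator::from_str(s); picks the only impl block that has a from_str. The boxed version trades a heap allocation and a virtual call per line for a single nameable type.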
