How to return an iterator


#1

Hi,

I was wondering if anyone can help me figure out how to return an iterator. Example:

for line in BufReader::new(file).lines() {
        let r = line.unwrap();
        println!("{}", r);
}

What I would like to do is move BufReader::new(file).lines() into a function and call that function here as filt_iter(). Is this possible? The idea is to filter out some lines inside filt_iter() (i.e. any line that contains X should be skipped)… help ?!?

Expected usage:

for line in filt_iter(file) {

        println!("{}", line);   // only lines without X are printed -> no need for a regex implementation, you already helped with that one
}

thnx


#2
use std::io::{BufRead, BufReader, Read};

fn filt_iter<R: Read>(file: R) -> impl Iterator<Item = String> {
    BufReader::new(file)
        .lines()
        .map(Result::unwrap)
        .filter(|s| !s.contains("blah"))
}
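
If panicking on a read error is not acceptable, a variant along these lines might work. It is only a sketch: `try_filt_iter` and `print_filtered` are names made up for this example, and it passes errors through to the caller instead of unwrapping them.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Hypothetical variant of `filt_iter`: yields `io::Result<String>` so the
// caller decides what to do with read errors instead of panicking.
fn try_filt_iter<R: Read>(file: R) -> impl Iterator<Item = io::Result<String>> {
    BufReader::new(file)
        .lines()
        // Keep every error; drop only successfully read lines containing "blah".
        .filter(|line| match line {
            Ok(s) => !s.contains("blah"),
            Err(_) => true,
        })
}

// Hypothetical caller: `?` propagates the first read error, if any.
fn print_filtered<R: Read>(file: R) -> io::Result<()> {
    for line in try_filt_iter(file) {
        println!("{}", line?);
    }
    Ok(())
}
```

The loop body is the same as before, except that each item is a Result that has to be unwrapped or propagated at the call site.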

#3

Thnx! This is an elegant solution and a great example of how to properly return an iterator. :slight_smile: I have two more questions, if I may.

  1. If I have a large file (5GB txt file) this would load it into memory, if I am not mistaken. What would be a low-memory alternative?
  2. Follow-up on the previous question: given that I would like to implement my own iterator and put it into an object, how would I do that? I tried to write something minimalistic, but the amount of errors is incomprehensible. Any help would be more than appreciated:
use std::io::*;
use std::fs::File;


struct Lines<R> where R: Read {
    read: BufReader<R>,
    buf: String
}



impl <R: Read> Lines<R> {
    fn new(r: R) -> Lines<R> {
        Lines{read: BufReader::new(r), buf: String::new()}
    }
    fn next(&mut self) -> Option<Result<&str>>{
        self.buf.clear();
        match self.read.read_line(&mut self.buf) {
            Ok(nbytes) => if nbytes == 0 {
                None 
            } else {
                let line = self.buf.trim_right();
                Some(Ok(line))
            },
            Err(e) => Some(Err(e))
        }
    }
}


pub trait Parse {
	fn init<R: Read>(file: &str) -> R;
	fn filt_iter (&self)  ->  Option<Result<&str>>;
}


struct ObjA<R> where R: Read{
	fh: R,
}

impl <R: Read> ObjA<R> {

   pub fn new(file: &str)-> Self{
            ObjA{
                fh: ObjA::init(file),
            }
        }
}

impl Parse for ObjA<R> where  R: Read {
	fn filt_iter (&self)  ->  Option<Result<&str>>{
		let mut line = Lines::new(&self.fh);
                 line.next() // Hm ....  ??
	}
	fn init<R: Read>(file: &str) -> R {
		File::open(file.to_string())
	}

}


//////// in my main.rs


fn main(){

  let o = Obj::new("myfile.txt");
  
  for line in o.filt_iter() {
     println!("{}", line);
  }

}

I realize this is a somewhat convoluted solution, but I am trying to stay consistent with the existing design of the code I am implementing my part into. I know this can be streamlined, but the setup I have is also important.

Thank you once more !!


#4

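No, the code by @vitalyd will not read the whole file into memory at once. This is the beauty of iterators: lines() is lazy, so BufReader pulls the file in through a fixed-size buffer and only the current line is held as a String. (Unless the file is one very, very long line, of course, but then it’s your fault for asking for lines.)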
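
To see the laziness for yourself, here is a rough sketch that counts how many bytes are actually pulled from the underlying reader before the first line comes out. `CountingReader` and `bytes_pulled_for_first_line` are names invented for this example, and the tiny capacity passed to `BufReader::with_capacity` is only there to make the effect obvious (`BufReader::new` uses a larger default).

```rust
use std::cell::Cell;
use std::io::{self, BufRead, BufReader, Read};
use std::rc::Rc;

// Hypothetical wrapper that records how many bytes were read through it.
struct CountingReader<R> {
    inner: R,
    bytes_read: Rc<Cell<usize>>,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.bytes_read.set(self.bytes_read.get() + n);
        Ok(n)
    }
}

// Returns the first line (if any) and the number of bytes the BufReader
// pulled from the source in order to produce it.
fn bytes_pulled_for_first_line(data: &[u8], capacity: usize) -> (Option<String>, usize) {
    let counter = Rc::new(Cell::new(0));
    let reader = CountingReader {
        inner: data,
        bytes_read: Rc::clone(&counter),
    };
    let mut lines = BufReader::with_capacity(capacity, reader).lines();
    // Only one item is requested, so only enough input for one line is read.
    let first = lines.next().map(Result::unwrap);
    (first, counter.get())
}
```

With a short first line, the count stays at or below the buffer capacity no matter how large the input is.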

I am not quite sure what exactly you’re trying to do with the code sample. Can you clarify what you want the code to do?


#5

Oh… so sorry, still learning. OK, so then putting the above implementation into an object would look something like:


pub trait It {
     type Iter: Iterator<Item = String>;
     fn records(&self) ->  Self::Iter;
}


impl It for ObjA{
   	type Iter = Iterator<Item = String>;
        fn records(&self) -> Self::Iter {
		BufReader::new(&self.fh)
			.lines()
			.map(Result::unwrap)
			.filter(|s| !s.contains("bla"))
	}
}

but then I get the following:

error[E0277]: the size for values of type `(dyn std::iter::Iterator<Item=std::string::String> + 'static)` cannot be known at compilation time
   --> src/util/parse.rs:280:6
    |
280 | impl It for ObjA{
    |      ^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `std::marker::Sized` is not implemented for `(dyn std::iter::Iterator<Item=std::string::String> + 'static)`
    = note: to learn more, visit <https://doc.rust-lang.org/book/second-edition/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>

:frowning:


#6

It seems your code sample is missing the definition of ObjA, but the reason for your error is that you wrote type Iter = Iterator<Item = String>. Iterator is a trait, and a trait used as a type is a trait object (dyn Iterator), which is unsized.

If you want to use an iterator in a type definition in a trait, you must use the actual underlying iterator type. In this case the type would look something like Filter<Map<Lines<BufReader<File>>, ...>, ...>, built from the standard library types Filter, Map, Lines and BufReader. Unfortunately you cannot name the type of a closure, so I put ... where the closure types should have been. Because of the closures, you would have to define your own struct and implement Iterator for it (or use a box).
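
If you do want to keep the associated type in the trait, boxing is probably the shortest route. A sketch, with some assumptions of mine: ObjA is made generic over any reader R (its real definition isn't shown in the thread), and records takes self by value so the boxed iterator owns the reader and needs no lifetime.

```rust
use std::io::{BufRead, BufReader, Read};

pub trait It {
    type Iter: Iterator<Item = String>;
    // Assumption: takes `self` by value so the iterator can own the reader.
    fn records(self) -> Self::Iter;
}

// Assumed definition; the original `ObjA` is not shown in the thread.
pub struct ObjA<R: Read> {
    fh: R,
}

impl<R: Read> ObjA<R> {
    pub fn new(fh: R) -> Self {
        ObjA { fh }
    }
}

impl<R: Read + 'static> It for ObjA<R> {
    // A boxed trait object has a known size, unlike a bare `dyn Iterator`.
    type Iter = Box<dyn Iterator<Item = String>>;

    fn records(self) -> Self::Iter {
        Box::new(
            BufReader::new(self.fh)
                .lines()
                .map(Result::unwrap)
                .filter(|s| !s.contains("bla")),
        )
    }
}
```

The cost is one heap allocation per iterator and dynamic dispatch on each next() call, which is usually negligible next to the file I/O.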

However, I suspect this is not what you’re trying to do. Take a look at the code sample below:

use std::fs::File;
use std::io::{BufReader, BufRead};

pub struct Records {
    fh: File,
}
impl Records {
    pub fn new(fh: File) -> Records {
        Records {
            fh,
        }
    }
    pub fn records(self) -> impl Iterator<Item = String> {
        BufReader::new(self.fh)
            .lines()
            .map(Result::unwrap)
            .filter(|s| !s.contains("blah"))
    }
}
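
Note that records(self) consumes the Records value. If you would rather call it through &self, one option is to borrow the file handle, relying on the fact that the standard library implements Read for &File. This is only a sketch, and the method name `records_borrowed` is made up here:

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};

pub struct Records {
    fh: File,
}

impl Records {
    pub fn new(fh: File) -> Records {
        Records { fh }
    }

    // Hypothetical method name; borrows the handle instead of consuming `self`.
    // The `+ '_` ties the returned iterator to the borrow of `self`.
    pub fn records_borrowed(&self) -> impl Iterator<Item = String> + '_ {
        // `&File` implements `Read`, so the reader can wrap a shared borrow.
        BufReader::new(&self.fh)
            .lines()
            .map(Result::unwrap)
            .filter(|s| !s.contains("blah"))
    }
}
```

Because the iterator shares the file's read position, a second call does not start from the top of the file again; after a full pass it yields nothing unless you seek back to the start first.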

#7

For completeness’ sake, I’ll add this:

use std::fs::File;
use std::io::{BufRead, BufReader, Lines};

pub struct RecordsIter<'a> {
    lines: Lines<BufReader<File>>,
    skip: &'a str,
}
impl<'a> RecordsIter<'a> {
    pub fn new(fh: File, skip: &'a str) -> RecordsIter<'a> {
        RecordsIter {
            lines: BufReader::new(fh).lines(),
            skip,
        }
    }
}
impl<'a> Iterator for RecordsIter<'a> {
    type Item = String;
    fn next(&mut self) -> Option<String> {
        // Let's loop until we find something that should not be filtered.
        loop {
            match self.lines.next() {
                Some(line) => {
                    let line = line.unwrap();
                    if !line.contains(self.skip) {
                        return Some(line);
                    }
                },
                None => return None,
            }
        }
    }
}

#8

Thank you !!! :slight_smile:

