Read overlapping chunks out of polymorphic IO (stdin/Memmap)


#1

My command-line program needs to read its data either from a pipe (stdin) or from a file via Memmap.
The data should be processed read-only (in parallel) with iterators returning overlapping short slices.
I am working with Memmap because the processing needs to be as fast as possible.

I have 2 questions:

  1. I have a working example (see code below) for reading from stdin/BufReader. How can I migrate this to stdin/Memmap?

  2. How can I implement an iterator that returns overlapping short slices?
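
To make question 2 concrete, here is a minimal sketch of such an iterator, assuming the whole input is already available as one `&[u8]` (for example a mapped file or a buffer filled from stdin). The type `OverlappingChunks` and its parameters `size` and `overlap` are names invented for this illustration, not part of any library.

```rust
/// Iterator over overlapping chunks of a byte slice (hypothetical helper).
/// Every chunk is at most `size` bytes long and starts `size - overlap`
/// bytes after the previous one, so neighbours share `overlap` bytes.
struct OverlappingChunks<'a> {
    data: &'a [u8],
    size: usize,
    step: usize,
    pos: usize,
    done: bool,
}

impl<'a> OverlappingChunks<'a> {
    fn new(data: &'a [u8], size: usize, overlap: usize) -> Self {
        // A step of zero would never advance, so reject it up front.
        assert!(overlap < size, "overlap must be smaller than the chunk size");
        OverlappingChunks {
            data,
            size,
            step: size - overlap,
            pos: 0,
            done: data.is_empty(),
        }
    }
}

impl<'a> Iterator for OverlappingChunks<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        if self.done {
            return None;
        }
        // Clamp the end so the last chunk may be shorter than `size`.
        let end = (self.pos + self.size).min(self.data.len());
        let chunk = &self.data[self.pos..end];
        if end == self.data.len() {
            // This chunk reached the end of the input; stop afterwards.
            self.done = true;
        } else {
            self.pos += self.step;
        }
        Some(chunk)
    }
}
```

For example, `OverlappingChunks::new(b"abcdefghij", 4, 2)` yields `abcd`, `cdef`, `efgh`, `ghij`. If every chunk should advance by exactly one byte, the standard library's `slice::windows` already covers that case. Each item is a plain shared slice, so the chunks can be handed to worker threads as long as the underlying buffer outlives them.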

use std::io::{self, Read, BufRead};
use std::fs::File;
use std::env;

struct Input<'a> {
    // `dyn` is required for trait objects in current Rust editions.
    source: Box<dyn BufRead + 'a>,
}

impl<'a> Input<'a> {
    fn pipe(stdin: &'a io::Stdin) -> Input<'a> {
        Input { source: Box::new(stdin.lock()) }
    }

    fn file(path: &str) -> io::Result<Input<'a>> {
        File::open(path).map(|file| Input {
            source: Box::new(io::BufReader::new(file)),
        })
    }
}

impl<'a> Read for Input<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.source.read(buf)
    }
}

impl<'a> BufRead for Input<'a> {
    fn fill_buf(&mut self) -> io::Result<&[u8]> {
        self.source.fill_buf()
    }
    fn consume(&mut self, amt: usize) {
        self.source.consume(amt);
    }
}



fn main() {
    // First command-line argument: a filename, or "-" for stdin.
    let arg1 = env::args().nth(1);
    let stdin = io::stdin();

    let input = match arg1 {
        Some(ref s) if s == "-" => Input::pipe(&stdin),
        Some(s) => Input::file(&s).unwrap(),
        _ => panic!("First parameter has to be filename or '-'."),
    };
    
    for line in input.lines() {
        println!("from input : {:?}", line);
    }
}

The code above is a slightly modified version of the code from "How to do polymorphic IO (file or stdin) in Rust".
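
For question 1, one possible direction is to stop abstracting over `BufRead` and instead put every input behind a single type that dereferences to `&[u8]`. The sketch below uses only the standard library: `InputBytes`, `from_reader` and `from_file` are hypothetical names, and `from_file` simply reads the file into memory as a stand-in. With a memory-mapping crate, a second enum variant holding the mapping would take its place, assuming the mapping type can be viewed as a byte slice. A pipe generally cannot be memory-mapped, so stdin is copied into a buffer here.

```rust
use std::fs;
use std::io::{self, Read};
use std::ops::Deref;

/// All input bytes behind one type (hypothetical name for this sketch).
enum InputBytes {
    /// Data copied into memory, e.g. everything read from a pipe.
    Buffered(Vec<u8>),
    // Assumption: a `Mapped(...)` variant holding a memory map from an
    // external crate would be added here for regular files.
}

impl InputBytes {
    /// Read a stream (such as a locked stdin) to its end.
    fn from_reader<R: Read>(mut reader: R) -> io::Result<InputBytes> {
        let mut buf = Vec::new();
        reader.read_to_end(&mut buf)?;
        Ok(InputBytes::Buffered(buf))
    }

    /// Stand-in for the file case: this reads the whole file into memory
    /// and is where a real implementation would map the file instead.
    fn from_file(path: &str) -> io::Result<InputBytes> {
        fs::read(path).map(InputBytes::Buffered)
    }
}

impl Deref for InputBytes {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        match self {
            InputBytes::Buffered(v) => v.as_slice(),
        }
    }
}
```

In `main`, the pipe case would then become something like `InputBytes::from_reader(stdin.lock())` and the file case `InputBytes::from_file(&s)`. Because of the `Deref` impl, slice methods such as `len()` or `windows(n)` can be called on the value directly, whichever variant is behind it.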


#2

Thank you all for your help.

Here you can find the product I needed the overlapping chunks for: GitHub: getreu/stringsext
A Unicode enhancement of the GNU strings tool with additional features.