Mixing `step_by` with `peekable` on iterators

I want to iterate over a string two characters at a time; in other words, on each iteration I want to consume two characters from the string. I tried to use step_by and peekable on top of the chars() method, but I did not succeed. What is the best way to consume two chars on each iteration over a string?

The itertools crate has several useful methods for this, such as Itertools::tuples. Example:

use itertools::Itertools;

fn main() {
    for (a, b) in "Hello world!".chars().tuples() {
        println!("{} {}", a, b);
    }
}
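
For comparison, here is a minimal std-only sketch of the same pairing, built on std::iter::from_fn. The name char_pairs is a hypothetical helper, not part of itertools or std. A trailing unpaired char is silently dropped, which, as far as I know, matches what tuples does with leftovers.

```rust
// Hypothetical helper: yields consecutive, non-overlapping pairs of chars.
// A trailing char without a partner is dropped.
fn char_pairs(s: &str) -> impl Iterator<Item = (char, char)> + '_ {
    let mut chars = s.chars();
    std::iter::from_fn(move || {
        // `?` ends the iteration as soon as either char is missing.
        let a = chars.next()?;
        let b = chars.next()?;
        Some((a, b))
    })
}

fn main() {
    for (a, b) in char_pairs("Hello world!") {
        println!("{} {}", a, b);
    }
}
```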

You can also implement this fairly easily without a dependency, and without limiting the iterator item to tuples of some maximum length, by writing an iterator that is generic over arrays instead:

fn by_chars<'a, T>(s: &'a str) -> impl Iterator<Item=T> + 'a
    where
        T: Default + 'a,
        for<'b> &'b mut T: IntoIterator<Item=&'b mut char>
{
    struct NChars<'c, U> {
        inner: std::str::Chars<'c>,
        _phantom: std::marker::PhantomData<U>
    }
    
    impl<'c, U> Iterator for NChars<'c, U>
        where
            U: Default + 'c,
            for<'d> &'d mut U: IntoIterator<Item=&'d mut char>
    {
        type Item = U;
        
        fn next(&mut self) -> Option<Self::Item> {
            let mut item = U::default();
            
            for ptr in &mut item {
                *ptr = self.inner.next()?;
            }
            
            Some(item)
        }
    }
    
    NChars {
        inner: s.chars(),
        _phantom: std::marker::PhantomData,
    }
}

fn main() {
    for chunk in by_chars::<[char; 2]>("Hello World!") {
        println!("{} {}", chunk[0], chunk[1]);
    }
}

You can also adjust this so that it returns a subslice of the string instead of an array of char, like this:

use std::num::NonZeroUsize;

/// `n: NonZeroUsize` because one can't iterate by 0 chars
fn by_chars(s: &str, n: NonZeroUsize) -> impl Iterator<Item=&str> {
    struct NChars<'c> {
        inner: std::str::Chars<'c>,
        n: usize,
    }
    
    impl<'c> Iterator for NChars<'c> {
        type Item = &'c str;
        
        fn next(&mut self) -> Option<Self::Item> {
            let string = self.inner.as_str();
            // Total byte length of the next (up to) `n` chars; `sum` needs the type spelled out.
            let len: usize = self.inner.by_ref().take(self.n).map(char::len_utf8).sum();

            if len > 0 {
                Some(&string[..len])
            } else {
                None
            }
        }
    }
    
    NChars {
        inner: s.chars(),
        n: n.into(),
    }
}
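
To see how the slice-returning approach behaves on multi-byte input and on a string whose length is not a multiple of n, here is a compact, self-contained sketch of the same idea. The name str_chunks is hypothetical, and it uses std::iter::from_fn instead of a named struct; treat it as an illustration rather than a drop-in replacement.

```rust
use std::num::NonZeroUsize;

// Hypothetical compact variant of the slice-returning `by_chars` above.
fn str_chunks(s: &str, n: NonZeroUsize) -> impl Iterator<Item = &str> {
    let n = n.get();
    let mut chars = s.chars();
    std::iter::from_fn(move || {
        // Remember where the remaining input starts before consuming chars.
        let rest = chars.as_str();
        // Byte length of the next (up to) `n` chars.
        let len: usize = chars.by_ref().take(n).map(char::len_utf8).sum();
        if len > 0 {
            Some(&rest[..len])
        } else {
            None
        }
    })
}

fn main() {
    let two = NonZeroUsize::new(2).expect("2 is non-zero");
    // The last chunk is shorter when the char count is not a multiple of `n`.
    for chunk in str_chunks("héllo", two) {
        println!("{}", chunk);
    }
}
```

Unlike the array version, the final chunk here can hold fewer than n chars instead of being dropped.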

Incidentally, what are you using this for? Remember that a char does not correspond to a human-readable character; it is a Unicode scalar value (roughly, a code point). If you want to iterate over human-readable "characters", you would have to use something like UnicodeSegmentation::graphemes() from the unicode-segmentation crate, with a very slight variation on the above piece of code.

Perhaps a combination of windows and step_by:

let iter = slice.windows(2).step_by(2);

...even better from @cuviper: slice.chunks_exact(2).

If the window and step size are the same, you can use chunks_exact(2).
(Assuming you have a slice, but I think the OP does not.)
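
Since chars() is an iterator rather than a slice, one way to apply these slice methods is to collect into a Vec<char> first, at the cost of an allocation. The sketch below does that and shows both variants side by side; the function names are hypothetical.

```rust
// Hypothetical helper: pairs via `chunks_exact(2)` on a collected Vec<char>.
fn pairs_chunks_exact(s: &str) -> Vec<(char, char)> {
    let chars: Vec<char> = s.chars().collect();
    chars
        .chunks_exact(2)
        .map(|pair| (pair[0], pair[1]))
        .collect()
}

// Hypothetical helper: the same pairs via `windows(2).step_by(2)`.
fn pairs_windows(s: &str) -> Vec<(char, char)> {
    let chars: Vec<char> = s.chars().collect();
    chars
        .windows(2)
        .step_by(2)
        .map(|pair| (pair[0], pair[1]))
        .collect()
}

fn main() {
    for (a, b) in pairs_chunks_exact("Hello world!") {
        println!("{} {}", a, b);
    }
    // Both approaches skip a trailing unpaired char.
    assert_eq!(pairs_chunks_exact("abcde"), pairs_windows("abcde"));
}
```

With chunks_exact, a leftover element stays available through the iterator's remainder() method if it is needed.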

Even better.

It's possible that the conversion to a slice is an issue, but it seemed like something to consider anyway.
