Window sliding for prev, current and next in &str

I'm trying to iterate a &str by windows of size 3. What that means is that given "abc", I should have:

previous: None, current: 'a', next: 'b'
previous: 'a', current: 'b', next: 'c'
previous: 'b', current: 'c', next: None

I found a lot of ways to do window iteration in a forward manner (e.g. here or here), but none that does what I want. How can I do that?

It would be even better if the solution were generic, i.e. worked for any window size.

What do you want three of? Bytes, Unicode scalar values (doubtful), grapheme clusters?

Here's an example for grapheme clusters.

Grapheme clusters seem fine. Your solution is nice, but I need the previous, current and next. So your example would print:

None | 日 | 本
日 | 本 | 語
本 | 語 | None

If you compare what you get from Itertools::tuple_windows to what you want, it becomes apparent that you can trivially modify @quinedot's solution to fit your requirement by using Iterator::chain (and its macro version).

Playground
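
A minimal sketch of the padding idea, assuming only the standard library: it swaps grapheme clusters for `char`s and `Itertools::tuple_windows` for `slice::windows`, so it is not the code behind the link. The name `padded_triples` is made up for illustration.

```rust
/// Pads the characters of `s` with one `None` on each side and returns
/// every run of three consecutive items as (previous, current, next).
fn padded_triples(s: &str) -> Vec<(Option<char>, Option<char>, Option<char>)> {
    // [None, Some(c1), Some(c2), ..., Some(cn), None]
    let padded: Vec<Option<char>> = std::iter::once(None)
        .chain(s.chars().map(Some))
        .chain(std::iter::once(None))
        .collect();

    // `windows(3)` yields overlapping slices of length 3; an input with
    // fewer than three padded items (the empty string) yields nothing.
    padded.windows(3).map(|w| (w[0], w[1], w[2])).collect()
}

fn main() {
    for (prev, curr, next) in padded_triples("日本語") {
        println!("{:?} | {:?} | {:?}", prev, curr, next);
    }
}
```

Because the padding is added before the windows are taken, shorter inputs such as 日本 simply produce fewer windows, and every character still appears once in the middle position.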

That's exactly what I'm looking for, but I need it to be more generic. It should adapt to the window size but also to the input size. For example, 日本 should give:

None | 日 | 本
日 | 本 | None

@quinedot @doublequartz I came up with this:

#[derive(Clone)]
struct Triplet {
    prev: Option<char>,
    curr: Option<char>,
    next: Option<char>,
}

fn char_windows<'a>(src: &'a str) -> impl Iterator<Item = Triplet> + 'a {
    let mut t = Triplet {
        prev: None,
        curr: None,
        next: None,
    };

    src.chars()
        .enumerate()
        .map(move |(i, c)| {
            t.curr = Some(c);
            t.next = src.chars().nth(i + 1);
            let res = t.clone();

            t.prev = t.curr;
            t.curr = t.next;
            res
        })
}

fn main() {
    let s = "你好吗";
    let window = char_windows(s);

    for w in window {
        println!("{:?} | {:?} | {:?}", w.prev, w.curr, w.next);
    }
}

Is there any way to improve perf or readability?

My solution already does just that; you did not have to invent your own. If you have any doubts, you can confirm that it works by changing the window size and the input in the playground.
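
On the perf question itself: in the hand-written version, `src.chars().nth(i + 1)` rescans the string from the start on every step, so the whole loop is quadratic in the input length. A rough sketch of a linear alternative, assuming `std::iter::Peekable` is acceptable and using the hypothetical name `char_windows_peekable`:

```rust
#[derive(Clone, Debug, PartialEq)]
struct Triplet {
    prev: Option<char>,
    curr: Option<char>,
    next: Option<char>,
}

/// Yields one `Triplet` per character, looking ahead with `peek`
/// instead of restarting `chars()` for every element.
fn char_windows_peekable(src: &str) -> impl Iterator<Item = Triplet> + '_ {
    let mut chars = src.chars().peekable();
    let mut prev = None;

    std::iter::from_fn(move || {
        // Stop as soon as the input is exhausted.
        let curr = chars.next()?;
        // Look at the following character without consuming it.
        let next = chars.peek().copied();
        let triplet = Triplet { prev, curr: Some(curr), next };
        prev = Some(curr);
        Some(triplet)
    })
}

fn main() {
    for w in char_windows_peekable("你好吗") {
        println!("{:?} | {:?} | {:?}", w.prev, w.curr, w.next);
    }
}
```

This keeps the same three-field output as the original function and only changes how `next` is obtained.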

@doublequartz would you be able to guide me on how to store this TupleWindows type in a struct? This is what I have right now:

struct Lexer<'a> {
    w: TupleWindows<
        std::iter::Chain<
            std::iter::Chain<
                std::array::IntoIter<Option<&'a str>, 1_usize>,
                Map<Graphemes<'a>, fn(&'a str) -> Option<&'a str>>,
            >,
            std::array::IntoIter<Option<&'a str>, 1_usize>,
        >,
        (Option<&'a str>, Option<&'a str>, Option<&'a str>),
    >,
}

and then in the Lexer impl:

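// The input must live as long as the struct's lifetime 'a.
fn new(s: &'a str) -> Self {
    Lexer {
        w: chain!(
            [None; 1].into_iter(),
            // Cast `Some` to a fn pointer so it matches the field's type.
            s.graphemes(true).map(Some as fn(&'a str) -> Option<&'a str>),
            [None; 1].into_iter()
        )
        .tuple_windows::<(_, _, _)>(),
    }
}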

This definitely seems like the wrong approach, so I'm open to suggestions.

The idiomatic answer is that you don't, and shouldn't, store iterators in structs. In practice, this means you will have to use the iterator directly from where you're using Lexer. If doing this makes your function too big, make a new function and pass it the tuples you get from the iterator, instead of the Lexer.
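
To illustrate that last suggestion, here is a small sketch, assuming a std-only setup with `char`s rather than grapheme clusters. The helper `starts_word` and the driver `count_words` are hypothetical names; the point is only that the window iterator is built and consumed in one place, while the helper receives plain tuples.

```rust
type Window = (Option<char>, Option<char>, Option<char>);

/// Hypothetical per-window step: reports whether the current item begins
/// a word, i.e. it is alphanumeric and the previous item is not.
fn starts_word((prev, curr, _next): Window) -> bool {
    let is_word = |c: Option<char>| c.map_or(false, char::is_alphanumeric);
    is_word(curr) && !is_word(prev)
}

/// Builds the window iterator locally and hands each tuple to the helper,
/// so no iterator type ever has to be named or stored in a struct.
fn count_words(s: &str) -> usize {
    let padded: Vec<Option<char>> = std::iter::once(None)
        .chain(s.chars().map(Some))
        .chain(std::iter::once(None))
        .collect();

    padded
        .windows(3)
        .map(|w| (w[0], w[1], w[2]))
        .filter(|&w| starts_word(w))
        .count()
}

fn main() {
    println!("{}", count_words("ab cd"));
}
```

If the iterator really must be kept across calls, other options exist, but they are outside what this thread covers.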
