Intersperse Iterators

Motivation

Sentence is a struct containing a Vec<Word>, and Word is a struct containing a Vec<char>. We want to implement a function chars() returning an impl Iterator<Item = char> that iterates over the characters of each word and inserts a ' ' between consecutive words:

pub struct Word(pub Vec<char>);

pub struct Sentence {
    pub words: Vec<Word>,
}

impl Sentence {
    pub fn chars(&self) -> impl Iterator<Item = char> {...}
}

The problem is similar to this topic:

Solution

We can get close by chaining a space once-iterator after each word:

pub fn chars_space_after(&self) -> impl Iterator<Item = char> + '_ {
    self.words
        .iter()
        .flat_map(|x| x.0.iter().cloned().chain(std::iter::once(' ')))
}

Unfortunately, this iterator inserts an extra space after the last word. We can swap the order of the chain to get the extra space at the beginning instead:

pub fn chars_space_before(&self) -> impl Iterator<Item = char> + '_ {
    self.words
        .iter()
        .flat_map(|x| std::iter::once(' ').chain(x.0.iter().cloned()))
}

From there, we can just drop the first element, and get a solution:

pub fn chars_1(&self) -> impl Iterator<Item = char> + '_ {
    let mut iter = self.chars_space_before();
    iter.next();
    iter
}

Improvements

The solution we found is straightforward, but not very satisfying: the first once-iterator is created and immediately discarded. We can use @2e71828's method to get around this by processing the first word separately and chaining the rest:

pub fn chars_2e71828(&self) -> impl Iterator<Item = char> + '_ {
    let mut iter = self.words.iter();
    let head = iter.next().into_iter().flat_map(|x| x.0.iter().cloned());
    let tail = iter.flat_map(|x| std::iter::once(' ').chain(x.0.iter().cloned()));
    head.chain(tail)
}

And while this works, it still feels clunky. We need to declare a variable for iter and the logic of processing Words is duplicated. We can remove the logic repetition and the head and tail variables to get:

pub fn chars_3(&self) -> impl Iterator<Item = char> + '_ {
    fn word_chars(word: &Word) -> impl Iterator<Item = char> + '_ {
        word.0.iter().cloned()
    }
    let mut iter = self.words.iter();
    iter.next()
        .into_iter()
        .flat_map(word_chars)
        .chain(iter.flat_map(|x| std::iter::once(' ').chain(word_chars(x))))
}

This hardly feels like an improvement. We are writing a lot of code to get what is essentially a join for iterators.

Desired solution

The problem feels like a natural case for intersperse. However, there is a problem:

pub fn chars(&self) -> impl Iterator<Item = char> + '_ {
    self.words
        .iter()
        .map(|x| x.0.iter())
        .intersperse(std::iter::once(' '))      // [E0308]
        .flatten()
        .cloned()
}
  1. mismatched types
    expected struct std::slice::Iter<'_, char>
    found struct std::iter::Once<char> [E0308]

Unfortunately, we cannot mix and match iterator types with intersperse like we can with chain.

At this point I ran out of ideas. I am wondering if there is a better solution. I am also wondering whether it would be possible to modify intersperse to work with different types of iterators, like the last solution requires.

Why don't you do it like this?

.intersperse([' '].iter())

Great call! I knew I was missing something.

Now I am just wondering if there is a situation where interspersing different iterator types is actually necessary.

Well, the thing is that the separator passed to .intersperse needs to match the Item associated type of the iterator. That means that you cannot combine different iterator types with the current API for .intersperse.

If you ever find a situation where you want to intersperse different iterator types, it would be just a matter of calling .map before calling .intersperse, just as you did in your playground.
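One way to read "calling .map first" is to map every element into a single wrapper type, so that the separator has the same Item type as everything else. Below is a rough sketch of that idea on stable Rust. The enum Piece, the method chars_enum, and the free function intersperse are all made up for illustration; the free function is only a stand-in so the example compiles without nightly or an external crate, not the library's implementation.

```rust
use std::iter::Once;
use std::slice::Iter;

pub struct Word(pub Vec<char>);

pub struct Sentence {
    pub words: Vec<Word>,
}

// Hypothetical wrapper: gives both iterator kinds one concrete type.
#[derive(Clone)]
enum Piece<'a> {
    Letters(Iter<'a, char>),
    Separator(Once<char>),
}

impl<'a> Iterator for Piece<'a> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        match self {
            Piece::Letters(it) => it.next().copied(),
            Piece::Separator(it) => it.next(),
        }
    }
}

// Stand-in for an `intersperse` adapter (an assumption for this sketch):
// yields a clone of `sep` before every item except the first.
fn intersperse<I>(iter: I, sep: I::Item) -> impl Iterator<Item = I::Item>
where
    I: Iterator,
    I::Item: Clone,
{
    let mut first = true;
    iter.flat_map(move |item| {
        let lead = if first { None } else { Some(sep.clone()) };
        first = false;
        lead.into_iter().chain(std::iter::once(item))
    })
}

impl Sentence {
    pub fn chars_enum(&self) -> impl Iterator<Item = char> + '_ {
        // Map each word into the wrapper first, then intersperse and flatten.
        let words = self.words.iter().map(|w| Piece::Letters(w.0.iter()));
        intersperse(words, Piece::Separator(std::iter::once(' '))).flatten()
    }
}
```

The wrapper has to be Clone because the separator is cloned for every gap, which is why an enum over two Clone iterators works here.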

Note that your original solution can be made to work just fine without needing a temporary variable and mutability. Just use .skip(1):

pub fn chars(&self) -> impl Iterator<Item = char> + '_ {
    self.words
        .iter()
        .flat_map(|x| std::iter::once(' ').chain(x.0.iter().cloned()))
        .skip(1)
}
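For anyone who wants to try it, here is a self-contained sketch of the .skip(1) version. The struct definitions are copied from the question; the method name chars_skip and the helper sentence_from are made up for this example.

```rust
pub struct Word(pub Vec<char>);

pub struct Sentence {
    pub words: Vec<Word>,
}

impl Sentence {
    // Prefix every word with a space, then drop the very first space.
    pub fn chars_skip(&self) -> impl Iterator<Item = char> + '_ {
        self.words
            .iter()
            .flat_map(|w| std::iter::once(' ').chain(w.0.iter().cloned()))
            .skip(1)
    }
}

// Hypothetical helper: build a Sentence from whitespace-separated text.
pub fn sentence_from(text: &str) -> Sentence {
    Sentence {
        words: text
            .split_whitespace()
            .map(|w| Word(w.chars().collect()))
            .collect(),
    }
}
```

Collecting sentence_from("hello big world").chars_skip() into a String should give back the words joined by single spaces, and an empty sentence yields nothing because .skip(1) on an empty iterator is a no-op.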

Note that std's Iterator::intersperse is nightly-only, though.

I think it goes even further than this. Per the Iterator trait:

pub trait Iterator {
    type Item;

    ...
}

there is only one Item type. So if you want to combine different item types, e.g. SliceIter and OnceIter (probably not their actual names), then Item would have to be a trait object, allowing for different concrete types. Is that even possible?


Why wouldn't it be possible?
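As a small sketch of the trait-object route: an iterator whose Item is Box<dyn Iterator<Item = char>> can hold a slice iterator and a once-iterator side by side. The alias BoxedChars and the method chars_boxed are made up for illustration, and the parts are collected into a Vec eagerly just to keep the example short.

```rust
pub struct Word(pub Vec<char>);

pub struct Sentence {
    pub words: Vec<Word>,
}

// Hypothetical alias: any iterator over chars, behind a trait object.
type BoxedChars<'a> = Box<dyn Iterator<Item = char> + 'a>;

impl Sentence {
    pub fn chars_boxed(&self) -> impl Iterator<Item = char> + '_ {
        // Each element is a trait object, so different concrete
        // iterator types can sit in the same Vec.
        let mut parts: Vec<BoxedChars<'_>> = Vec::new();
        for (i, word) in self.words.iter().enumerate() {
            if i > 0 {
                // Separator: a once-iterator.
                parts.push(Box::new(std::iter::once(' ')));
            }
            // Word: a (cloned) slice iterator.
            parts.push(Box::new(word.0.iter().cloned()));
        }
        // Box<dyn Iterator> is itself an Iterator, so flatten works.
        parts.into_iter().flatten()
    }
}
```

This costs one allocation per part plus dynamic dispatch, so it mostly shows that the types line up rather than being a recommended way to solve the original problem.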
