Iterator usage questions

If I create a function that takes and returns an iterator:

pub fn add_string_spaces(vals: impl Iterator<Item = char>) -> impl Iterator<Item = char> {
    vals.enumerate().flat_map(|(i, ciph_char)| match i % 5 {
        0 if i > 0 => vec![' ', ciph_char],
        _ => vec![ciph_char],
    })
}

I can use it like this: add_string_spaces(map_cipher(&plain)).collect::<String>() where map_cipher's signature looks like this:

pub fn map_cipher<'a>(plain: &'a str) -> impl Iterator<Item = char> + 'a { ...

But I can't use it like this: map_cipher(&plain).add_string_spaces().collect::<String>(). I get an error message:

no method named `add_string_spaces` found for type `impl std::iter::Iterator` in the current scope

What am I not understanding here? Thanks!

add_string_spaces isn't a method, it's a function. To call it with method syntax, it has to be implemented on whatever you're calling it on. The usual way of doing this is to create a trait (call it IteratorExt or something), put the method on it, and implement the trait for every T: Iterator<Item = char> (this is what itertools does, for example).
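
To make the pattern concrete, here is a minimal sketch of an extension trait whose method returns a plain usize, so the return-type question doesn't get in the way yet. The names CharIterExt, count_spaces and count_spaces_demo are made up for illustration; they are not from itertools or std.

```rust
// Hypothetical extension trait; the name and the method are illustrative only.
trait CharIterExt {
    fn count_spaces(self) -> usize;
}

// Blanket impl: every iterator over `char` gets the method.
impl<I: Iterator<Item = char>> CharIterExt for I {
    fn count_spaces(self) -> usize {
        self.filter(|c| *c == ' ').count()
    }
}

// Method syntax works as long as the trait is in scope.
fn count_spaces_demo() -> usize {
    "a b c".chars().count_spaces()
}
```

The trait has to be in scope (imported with use if it lives in another module) wherever the method is called.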

Thanks @asymmetrikon! Can you help me with some syntax? What I have below won't compile:

trait IterExt {
    fn add_string_spaces(&self) -> Iterator<Item = char>;
}

impl IterExt for Iterator<Item = char> {
    fn add_string_spaces(&self) -> Iterator<Item = char> {
        self.enumerate().flat_map(|(i, ciph_char)| match i % 5 {
            0 if i > 0 => vec![' ', ciph_char],
            _ => vec![ciph_char],
        })
    }
}

Besides the earlier error message I get two more:

the `enumerate` method cannot be invoked on a trait object
error[E0277]: the size for values of type `(dyn std::iter::Iterator<Item = char> + 'static)` cannot be known at compilation time

Since Iterator is a trait, not a type, you need a blanket impl over every type that implements it:

impl<T: Iterator<Item = char>> IterExt for T {

Thanks @alice, I think I'm getting in over my head. But one more iteration (no pun intended!). I now have:

trait IterExt {
    fn add_string_spaces(&self) -> Iterator<Item = char>;
}

impl<T: Iterator<Item = char>> IterExt for T {
    fn add_string_spaces(&self) -> Iterator<Item = char> {
        self.enumerate().flat_map(|(i, ciph_char)| match i % 5 {
            0 if i > 0 => vec![' ', ciph_char],
            _ => vec![ciph_char],
        })
    }
}

and I get:

expected trait std::iter::Iterator, found struct `std::iter::FlatMap`

The other problem is that your method would need to return impl Iterator<Item = char>; however, you can't do that in a trait. You'd have to write your own Iterator<Item = char> that did the same thing as enumerate().flat_map(...); that'd be pretty complex, given you'd have to replicate the machinery of FlatMap. You can get away with it by returning a Box<dyn Iterator<Item = char>>:

trait IterExt {
    fn add_string_spaces(self) -> Box<dyn Iterator<Item = char>>;
}

impl<I: Iterator<Item = char> + 'static> IterExt for I {
    fn add_string_spaces(self) -> Box<dyn Iterator<Item = char>> {
        Box::new(self.enumerate().flat_map(|(i, ciph_char)| match i % 5 {
            0 if i > 0 => vec![' ', ciph_char],
            _ => vec![ciph_char],
        }))
    }
}
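
One thing to watch with the version above: the 'static bound rejects iterators that borrow from a local, which would include map_cipher(&plain) unless plain itself is 'static. A possible variation, sketched here under the assumption that you're fine with a lifetime parameter on the trait, ties the box's lifetime to the iterator's instead. The helper boxed_demo is made up for the example.

```rust
// Variation on the boxed version: the trait carries a lifetime so that
// borrowing iterators (e.g. `some_string.chars()`) are accepted too.
trait IterExt<'a> {
    fn add_string_spaces(self) -> Box<dyn Iterator<Item = char> + 'a>;
}

impl<'a, I: Iterator<Item = char> + 'a> IterExt<'a> for I {
    fn add_string_spaces(self) -> Box<dyn Iterator<Item = char> + 'a> {
        // Same logic as in the thread: a space before every 5th char,
        // except at the very start.
        Box::new(self.enumerate().flat_map(|(i, ciph_char)| match i % 5 {
            0 if i > 0 => vec![' ', ciph_char],
            _ => vec![ciph_char],
        }))
    }
}

// Hypothetical helper: the iterator borrows from a local String.
fn boxed_demo() -> String {
    let plain = String::from("abcdefghijk");
    let spaced: String = plain.chars().add_string_spaces().collect();
    spaced // "abcde fghij k"
}
```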

The problem here is that you're not returning an Iterator<Item = char>, but a FlatMap<stuff>. Of course, the FlatMap struct implements the Iterator trait, but in Rust there is a distinction between the concrete struct and the trait.

The impl Trait feature was stabilized in Rust 1.26 (May 2018). It lets you declare that a function returns impl Trait, which means “I'm returning some concrete struct/enum that implements this trait, but I won't tell you which!”. This means you don't have to write out the full type of the struct, which can get quite unwieldy; with closures you can't write it out at all, because closure types have no name.

However, even if you use impl Trait, there must still be some specific type that you're returning. This means you can't return two different types of iterator in an if {} else {}, and it also means you cannot use it when creating a trait. All types that implement your trait would have to return the same struct/enum, and that's not enforceable when it is as global a thing as a trait; it may even be implemented in a different crate. (Since Rust 1.75, return-position impl Trait is accepted in trait methods, with each impl choosing its own hidden type.)
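
As a small illustration of the "one specific type" rule, here is a sketch with two made-up functions, evens and numbers. The first has a single concrete type on every path, so impl Trait is fine; the second would have two different types in its branches, so it boxes them behind one trait-object type instead.

```rust
// One concrete (unnamed) iterator type: `impl Iterator` works.
fn evens() -> impl Iterator<Item = u32> {
    (0u32..10).filter(|n| n % 2 == 0)
}

// The two branches have different concrete types (`Filter<..>` and `Range<u32>`),
// so `impl Iterator` would be rejected here. Boxing gives both branches
// the same type, `Box<dyn Iterator<Item = u32>>`.
fn numbers(only_evens: bool) -> Box<dyn Iterator<Item = u32>> {
    if only_evens {
        Box::new((0u32..10).filter(|n| n % 2 == 0))
    } else {
        Box::new(0u32..10)
    }
}
```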

I know in this specific case there is only one impl, but that's the deal. You must either:

  1. Write out the full type (since closure types can't be named, the closure has to become an actual function or a fn pointer so that you can mention the type).
  2. Create your own struct and implement Iterator for it.
  3. Use Box (which is a bit less efficient).
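
Option 2 might look roughly like the sketch below. The struct SpaceEveryFive and its fields are invented for this example; it avoids both the Box and the per-item Vec allocation, at the cost of writing the state machine by hand.

```rust
// Hypothetical adapter: yields the inner chars, with a space inserted
// before every 5th char (but never at the very start).
pub struct SpaceEveryFive<I> {
    inner: I,
    emitted: usize,        // chars from `inner` yielded so far
    pending: Option<char>, // char held back while the space is yielded
}

impl<I: Iterator<Item = char>> Iterator for SpaceEveryFive<I> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // A char was held back on the previous call: yield it now.
        if let Some(c) = self.pending.take() {
            self.emitted += 1;
            return Some(c);
        }
        let c = self.inner.next()?;
        if self.emitted > 0 && self.emitted % 5 == 0 {
            // Yield the space first and keep the char for the next call.
            self.pending = Some(c);
            Some(' ')
        } else {
            self.emitted += 1;
            Some(c)
        }
    }
}

// The extension trait can now name its return type directly.
pub trait IterExt: Iterator<Item = char> + Sized {
    fn add_string_spaces(self) -> SpaceEveryFive<Self> {
        SpaceEveryFive { inner: self, emitted: 0, pending: None }
    }
}

impl<I: Iterator<Item = char>> IterExt for I {}
```

With this in scope, "abcdefghijk".chars().add_string_spaces().collect::<String>() gives "abcde fghij k".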

You may want to read this thread.

(a lot of reading and tinkering later)... thanks @alice and @asymmetrikon. All three options have their strengths. It seems to me that the main decisions one needs to think about are:

  • Do you care if there is heap allocation?
  • Do you care whether a specific Iterator type is returned, or is any Iterator OK?

trait IterExt {
    type Ret : Iterator<Item = char>;

    fn add_string_spaces (self) -> Self::Ret;
}

impl<T : Iterator<Item = char>> IterExt for T {
    type Ret = ::std::iter::FlatMap<
        ::std::iter::Enumerate<T>,
        Vec<char>,
        fn((usize, char)) -> Vec<char>,
    >;

    fn add_string_spaces (self) -> Self::Ret
    {
        self.enumerate().flat_map(|(i, ciph_char)| match i % 5 {
            0 if i > 0 => vec![' ', ciph_char],
            _ => vec![ciph_char],
        })
    }
}

The difference in the return types between the Trait and the implementation threw me for a loop. Many thanks for the explicit example, @Yandros!