Another Iterator Adapter

I wrote another iterator adapter similar to the one in
https://users.rust-lang.org/t/feedback-for-iterator-adapter/98090/9

This one works like intersperse but lets SomeIterator<IntoIteratorA<U>> be interspersed with IntoIteratorB<U> to get an Iterator<Box<dyn Iterator<U>>. You can then call .flatten() to get an Iterator<U>.

I have a question, but feedback is also welcome.

Prompted by my last topic, I tried which bounds I could leave away. For some of them I was quite astonished that I could just leave them away.

For example I commented out a line S: Clone on IterExt::intersperse_with_iter(). Why is that possible? The Iterator implementation for IntersperseWithIter<I, S> clearly has the trait bound S: Clone. Or would that just mean that IntersperseWithIter<I, S> just isn't an iterator when chained on an Iterator<IntoIterator> where the IntoIterator isn't Clone?. How should I then proceed? Leave the bounds on or not?

use std::iter::Peekable;

struct IntersperseWithIter<I, S>
where
    I: Iterator,
    I::Item: IntoIterator<Item = S::Item>,
    S: IntoIterator,
{
    separator: S,
    iter: Peekable<I>,
    needs_sep: bool,
}

impl<I, S> Iterator for IntersperseWithIter<I, S>
where
    I: Iterator,
    I::Item: IntoIterator<Item = S::Item> + 'static,
    S: IntoIterator + Clone + 'static,
{
    type Item = Box<(dyn Iterator<Item = S::Item>)>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.needs_sep && self.iter.peek().is_some() {
            self.needs_sep = false;
            Some(Box::new(self.separator.clone().into_iter()))
        } else {
            self.needs_sep = true;
            self.iter
                .next()
                .map(|x| Box::new(x.into_iter()) as Self::Item)
        }
    }
}

trait IterExt: Iterator {
    fn intersperse_with_iter<S>(self, separator: S) -> IntersperseWithIter<Self, S>
    where
        Self: Sized,
        Self::Item: IntoIterator<Item = S::Item>,
        S: IntoIterator,
        // S: Clone,
    {
        IntersperseWithIter {
            separator,
            iter: self.peekable(),
            needs_sep: false,
        }
    }
}

impl<I> IterExt for I where
    I: Iterator
    // I::Item: IntoIterator,
    // <I::Item as IntoIterator>::Item: Clone,
{
}

fn main() {
    let a: [&[u8]; 3] = [b"hello", b"my", b"world"];
    let b: Vec<_> = a
        .into_iter()
        .intersperse_with_iter(b" SEP ")
        .flatten()
        .cloned()
        .collect();
    println!("{:?}", String::from_utf8(b).unwrap());
}

(Playground)

Output:

"hello SEP my SEP world"

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.94s
     Running `target/debug/playground`

Also it doesn't work if I change .into_iter() to iter() because of the 'static lifetimes. Can anyone explain why those are needed? I just added them because the compiler told me so.

Exactly. See this playground.

I've also removed superfluous bounds from IntersperseWithIter.

struct IntersperseWithIter<I, S>
where
    I: Iterator,
{
    separator: S,
    iter: Peekable<I>,
    needs_sep: bool,
}

Here, only I: Iterator is necessary, as that is required by Peekable<I>. Generally it's preferable to only have the minimal number of trait bounds in the places where they are actually needed. Any bounds on the struct need to be spelled out unless something like implied bounds is implemented. Leaving out the bounds on the struct can also simplify other trait implementations, as the bounds don't need to be included if not needed for this specific trait (e.g. Debug).

Additionally, sometimes you just wan to construct a type, in this case the IntersperseWithIter, without using it as its original intended purpose.

2 Likes

That seems to be wrong. When I take away the bound

I::Item: IntoIterator,

I get

 1  error[E0277]: `<I as Iterator>::Item` is not an iterator
  --> src/main.rs:8:16
   |
 8 |     delimiter: <I::Item as IntoIterator>::Item,
   |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `<I as Iterator>::Item` is not an iterator
   |
   = help: the trait `Iterator` is not implemented for `<I as Iterator>::Item`
   = note: required for `<I as Iterator>::Item` to implement `IntoIterator`
 help: consider extending the `where` clause, but there might be an alternative better way to express this requirement
   |
 5 |     I: Iterator, <I as Iterator>::Item: Iterator
   |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 For more information about this error, try `rustc --explain E0277`.
 error: could not compile `intersperse_iter` (bin "intersperse_iter") due to previous error

Also your playground doesn't compile.

EDIT

Ah yeah of course your playground doesn't compile. Sorry.

EDIT 2

My bad, I tested the wrong code. You are right, they can be left away.

Yes that’s what it would mean.

If you want to follow the standard library, put the bounds on the function that creates the iterator to avoid confusion, but don’t put it on the type itself if not necessary.

For example Map in std::iter - Rust has no bounds, neither on the iterator not the closure. But the impl Iterator has them, and the map method has them, too (Self: Iterator is clear in this context, the closure bounds exist, too). Maybe not the best example, since closure bounds also help a lot with type inference for the caller… let’s see at another case

Here: Rev in std::iter - Rust – has no bounds, but the rev method requires self: DoubleEndedIterator in order to avoid confusion when rev would work but the result wouldn’t be an iterator unless extra bounds are fulfilled. It doesn’t only help with confusion and documentation, it also improves the error messages.

4 Likes

That was intended. It doesn't compile due to the last two lines in main

    let not_clone_iter = NotClone;
    
    vec![NotClone].into_iter().intersperse_with_iter(not_clone_iter).collect();

which show that calling intersperse_with_iter with an iterator that is not Clone results in the return value (a IntersperseWithIter instance) not implementing Iterator due to the missing `Clone implementation.

Your linked playground has no delimeter field. Are you using a different version?

Yes, see my edits. Sorry for the trouble.

1 Like

Ok, and how about the commented out bounds on the implementation of IterExt for Iterators?

If you want a review on the adapter: I would avoid unnecessary boxing, especially in a library, especially if allowing the user to call flatten is a common use case; so an alternative for Box<dyn Iterator<…>> to allow two distinct iterator types to be merged via Either in either - Rust.

In this case then, the adapter that intersperses an iterator of A and one of B into an iterator of Either<A, B> would be useful even if A and B are not Iterator types, too – however, that case would not support that for A and B merely being IntoIterator the resulting item type is an Iterator. For that you would need the extra bounds, and the return items of type Either<A::IntoIter, B::IntoIter>. Alternatively, one could say that the user shall just call into_iter themself. But then one could argue that the user might as well also map Either::Left and call Either::Right to make the types match, and then use ordinary Itertools::intersperse. One could also just make multiple versions of the adapter.

1 Like

I would not include them, due to the same reasoning as above. They are not needed for the trait as a whole, and the individual intersperse_with_iter function constrains Self as needed. You can have a look at Itertools, which does the same thing.

1 Like

Because I'm procrastinating, here is how you could do that. This also has the nice benefit, that it gets rid of the 'static bounds, as these were necessary due to the trait object.

1 Like

I just saw that in this specific case the standard library actually does include the Clone bound on the type^^. But here it is actually necessary because they derive Clone :slight_smile:

#[derive(Debug, Clone)]
pub struct Intersperse<I: Iterator>
where
    I::Item: Clone,
{
    separator: I::Item,
    iter: Peekable<I>,
    needs_sep: bool,
}

Haha, same😅

1 Like

Thanks a lot for this answer and thanks @RobinH for swiftly implementing it. I would have never considered an enum as return type when I have a fixed number of return types. This is a great tool to have in my pocket.

No… but I only learned this just now o.O

Apparently the built-in derive does add bounds for associated types. When they appear in any of the field’s types. If they are literally written as I::Type. It does work for I::Item but not even for <I as Iterator>::Item. This shit is crazy… now I’m actually looking for documentation of built-in derive attributes. Did nobody properly document the built-in derive attributes or what!? The docs page is entirely useless… the reference acknowledges their existence…

1 Like

So it's something akin to

#[derive(Debug)]
struct Test<T> {
    field: T
}

// generates
impl<T: Debug> Debug for Test<T> {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
        // Implementation omitted
    }
}

?