Return different combinators that impl Iterator

I find the fact that this doesn't compile a bit surprising:

struct MyVec(Vec<i32>);

impl MyVec {
    fn iter(&self) -> impl Iterator<Item = i32> {
        if self.0.len() % 2 == 0 {
            // Map<Map<std::vec::IntoIter<_>, _>, _>
            self.0.into_iter().map(|x| x+1).map(|x| 2*x)
        } else {
            // Map<std::vec::IntoIter<_>, _>
            self.0.into_iter().map(|x| x+1)
        }
    }
}

It's because the two branches return different concrete types. I'm aware of prior discussion on the subject, but my questions don't seem to be fully covered there:

Is this considered an accidental limitation or a deliberate choice?
Is there a plan to make this work in the future?
Have things changed since the design discussion of 2018?
What is the latest authoritative resource on the matter?

Thanks!

Deliberate. Rust is strictly typed and -> impl Trait is an opaque but otherwise shallow alias.

Probably not without new, distinct[1] functionality such as ad-hoc enums. I'm not aware of any concrete plans for those at the moment. If you only need two types, you can use Either instead of your own enum or type erasure.

Sometimes you can finagle the combinators to line up to a single type (but not with closures without some sort of type erasure somewhere).
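To illustrate lining the combinators up to a single type, here is a minimal sketch. It assumes the closures can be rewritten as named functions (add_one and add_one_then_double are hypothetical names, not from the question). Both coerce to the same fn(i32) -> i32 pointer type, which is itself a small form of type erasure, so each call goes through a pointer:

```rust
struct MyVec(Vec<i32>);

// Hypothetical named functions standing in for the closures in the question.
fn add_one(x: i32) -> i32 {
    x + 1
}

fn add_one_then_double(x: i32) -> i32 {
    2 * (x + 1)
}

impl MyVec {
    fn iter(&self) -> impl Iterator<Item = i32> + '_ {
        // Both function items coerce to the same pointer type, so the
        // choice between them is a runtime value rather than a type.
        let f: fn(i32) -> i32 = if self.0.len() % 2 == 0 {
            add_one_then_double
        } else {
            add_one
        };
        // Single concrete type in both cases:
        // Map<Copied<std::slice::Iter<'_, i32>>, fn(i32) -> i32>
        self.0.iter().copied().map(f)
    }
}
```

This only works when every branch ends in the same chain of combinators; as soon as one branch needs an extra filter() or take(), the types diverge again.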

Rust is a team-run open source project, so there isn't really an authoritative resource on future possibilities. You can ask on the various forums, check RFCs to see if there's any activity about things you care about, etc.


  1. Especially once we get impl Trait in type aliases, people will be able to rely on opaque types being shallow aliases of pre-existing concrete types.


I don't think this feature would provide enough syntactic advantage to make it a worthwhile change. My primary concern is that it would be inconsistent for the compiler to accept that code as written but reject it when the condition block needs "internal" type erasure. For instance, binding the conditional block to a variable or chaining another map() call:

impl MyVec {
    fn iter(&self) -> impl Iterator<Item = i32> {
        if self.0.len() % 2 == 0 {
            // Map<Map<std::vec::IntoIter<_>, _>, _>
            self.0.into_iter().map(|x| x+1).map(|x| 2*x)
        } else {
            // Map<std::vec::IntoIter<_>, _>
            self.0.into_iter().map(|x| x+1)
        }
        // Error: `self` cannot be both `Map<Map<IntoIter<i32>>>` and `Map<IntoIter<i32>>`
        .map(|x| x + 2)
    }
}

These are good points. Would it then make sense to add an iter() or finish() method to all iterator combinator structs (Scan, Take, Map, etc.) that returns some sort of generic std::iter::Iter<T> struct so that the types align? I suspect this can't be made to work, but I'm not sure why.

If you don't mind adding a heap allocation, you can return Box<dyn '_ + Iterator> instead:

struct MyVec(Vec<i32>);

impl MyVec {
    fn iter(&self) -> Box<dyn '_+Iterator<Item = i32>> {
        if self.0.len() % 2 == 0 {
            // Map<Map<std::slice::Iter<'_, i32>, _>, _>
            Box::new(self.0.iter().map(|x| x+1).map(|x| 2*x))
        } else {
            // Map<std::slice::Iter<'_, i32>, _>
            Box::new(self.0.iter().map(|x| x+1))
        }
    }
}

You can also make an enum that wraps your iterator types. It works, but it's kind of ugly.

It also fails to pass through overridden methods in Iterator, unless those are implemented one-by-one, or by using a delegation crate.

struct MyVec(Vec<i32>);

impl MyVec {
    fn iter(&self) -> impl Iterator<Item = i32> + use<'_> {
        if self.0.len() % 2 == 0 {
            // Map<Map<std::slice::Iter<'_, i32>, _>, _>
            Iters::A(self.0.iter().map(|x| x+1).map(|x| 2*x))
        } else {
            // Map<std::slice::Iter<'_, i32>, _>
            Iters::B(self.0.iter().map(|x| x+1))
        }
    }
}
enum Iters<A, B> {
    A(A),
    B(B),
}

impl<A, B> Iterator for Iters<A, B> 
where
    A: Iterator, 
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;
    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Iters::A(a) => a.next(),
            Iters::B(b) => b.next(),
        }
    }
}
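
On the pass-through point: a rough sketch of what forwarding "one by one" could look like for two commonly overridden methods, size_hint and fold. The pick function and its double flag are made up for the example; the rest mirrors the Iters enum from this post. Other methods (nth, count, last, and so on) would need the same treatment, which is where a delegation crate starts to pay off.

```rust
enum Iters<A, B> {
    A(A),
    B(B),
}

impl<A, B> Iterator for Iters<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Iters::A(a) => a.next(),
            Iters::B(b) => b.next(),
        }
    }

    // Without this, callers would see the default hint of (0, None).
    fn size_hint(&self) -> (usize, Option<usize>) {
        match self {
            Iters::A(a) => a.size_hint(),
            Iters::B(b) => b.size_hint(),
        }
    }

    // Forwarding fold means the match happens once, and the inner
    // iterator's own fold implementation does the looping.
    fn fold<Acc, F>(self, init: Acc, f: F) -> Acc
    where
        F: FnMut(Acc, Self::Item) -> Acc,
    {
        match self {
            Iters::A(a) => a.fold(init, f),
            Iters::B(b) => b.fold(init, f),
        }
    }
}

// Hypothetical helper: `double` selects which branch to build.
fn pick(v: &[i32], double: bool) -> impl Iterator<Item = i32> + '_ {
    if double {
        Iters::A(v.iter().map(|x| x + 1).map(|x| 2 * x))
    } else {
        Iters::B(v.iter().map(|x| x + 1))
    }
}
```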

Reusable macro for the enum strategy:

macro_rules! iter_enum {
    ($enum_name:ident, $($x:ident),*) => {
        enum $enum_name<$($x,)*> {
            $($x($x),)*
        }

        impl<T, $($x,)*> Iterator for $enum_name<$($x,)*>
        where
            $($x: Iterator<Item=T>,)*
        {
            type Item = T;
            fn next(&mut self) -> Option<Self::Item> {
                match self {
                    $($enum_name::$x(i) => i.next(),)*
                }
            }
        }
    }
}

Used like so:

struct MyVec(Vec<i32>);
impl MyVec {
    fn iter(&self) -> impl Iterator<Item = i32> + '_ {
        iter_enum!(Ret, A, B);

        if self.0.len() % 2 == 0 {
            Ret::A(self.0.iter().map(|x| x+1).map(|x| 2*x))
        } else {
            Ret::B(self.0.iter().map(|x| x+1))
        }
    }
}

There's the placement-by-return RFC, which would allow you to write this:

fn iter(&self) -> dyn Iterator<Item = i32> + '_ { ... }

The RFC direction is interesting. It looks like this would still have vtable overhead though.

A distilled version of my question: Why do we need all these Map, Take, etc. types instead of a single, generic Iter type? That would seemingly solve the issue.

How big would that type Iter be? Its size would have to be different every time you create an Iter, and !Sized structs are really annoying to deal with and, more importantly, can't be returned from functions at all.

So you would need to box the Iter and at that point you have returned to Box<dyn Iter>.


Wait, why don't the Map, Take, Scan types have the exact same size issue?

No, because they are generic over the Iterator they were created with, as well as over any closure (or other state) they take in.

This means that one Map is a different type than another Map, and every individual Map type has a known size. You can't really do the same with your Iter type, because then again you can't return two differently constructed Iters from the same function.

You would also need an unlimited amount of generic parameters for calls like this:

iter.map().map().map().map() ...

In the exact example you gave the returned Maps happen to have the same size, but the caller needs a fixed concrete type so it knows how to actually perform the iteration (specifically so it knows which of your closures to call on each element). Dynamic dispatch or matching on an enum work around this by deferring the decision of which functions to call until runtime.
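
To make the size remark concrete, here is a small sketch using std::mem::size_of_val. The helper function names are invented for the example, and the layout behaviour shown is what current compilers do rather than a documented guarantee. Closures that capture nothing are zero-sized, so wrapping an iterator in another Map with such a closure adds no bytes; a capturing closure does.

```rust
/// Sizes of the two iterators from the original question
/// (both closures capture nothing).
fn sizes_without_captures() -> (usize, usize) {
    let a = vec![1i32, 2, 3].into_iter().map(|x| x + 1).map(|x| 2 * x);
    let b = vec![1i32, 2, 3].into_iter().map(|x| x + 1);
    (std::mem::size_of_val(&a), std::mem::size_of_val(&b))
}

/// Same shape, but the second closure captures `k` by value,
/// so the closure itself occupies space inside the Map.
fn sizes_with_capture(k: i32) -> (usize, usize) {
    let plain = vec![1i32, 2, 3].into_iter().map(|x| x + 1);
    let capturing = vec![1i32, 2, 3].into_iter().map(move |x| x + k);
    (
        std::mem::size_of_val(&plain),
        std::mem::size_of_val(&capturing),
    )
}
```

Even when the sizes match, the two types are still distinct: the type is what tells the compiler which closure body to call from next().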


Thanks for all your insightful answers. Just to add to the list of solutions, another approach is to define a MyVecIter struct, implement Iterator<Item = i32> for it, and have MyVec::iter() return MyVecIter. It certainly isn't as generic as returning impl Iterator<Item = i32>, but it has lower runtime cost than dynamic dispatch IIUC.
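
A minimal sketch of that approach, assuming the struct simply stores a slice iterator plus a flag (the field names inner and double are my own, not from the thread):

```rust
struct MyVec(Vec<i32>);

/// Hand-written iterator; the even/odd decision is stored as a flag.
struct MyVecIter<'a> {
    inner: std::slice::Iter<'a, i32>,
    double: bool,
}

impl Iterator for MyVecIter<'_> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        // `?` returns None once the underlying slice is exhausted.
        let x = *self.inner.next()? + 1;
        Some(if self.double { 2 * x } else { x })
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()
    }
}

impl MyVec {
    fn iter(&self) -> MyVecIter<'_> {
        MyVecIter {
            inner: self.0.iter(),
            // Decided once, at construction time.
            double: self.0.len() % 2 == 0,
        }
    }
}
```

The flag is still checked on every next() call, so this is a runtime branch much like the enum version, just without a vtable or a heap allocation.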