Trying to make sense of functions returning iterator


#1

I am trying to learn the basics of Rust, and as part of doing some simple implementations, i realized just how much i do not understand yet :slight_smile: With the help of the IRC channel i managed to figure out how to return an iterator from a function, but now i am playing around with accepting an iterator, and returning a new iterator over that.

The small example below does not work at all (as i said, n00b so several misunderstandings going on i am sure :), but i was hoping someone would be kind enough to help me out. And please note that i am trying to avoid returning a boxed iterator.

#![feature(conservative_impl_trait)]

fn plus_one<'a>(items : &'a Iterator<Item=i32> ) -> impl Iterator<Item=i32> + 'a {
    items.map(|p| p + 1)
}

fn main() {
    let mut numbers: Iterator<Item=i32> = vec![1, 2].into_iter();

    for i in plus_one(&mut numbers) {
        println!("{}", i);
    }
}

Assuming the above can be made to work somehow, i am also wondering if something like the following would be allowed:

    for i in plus_one(plus_one(&mut numbers)) {
        println!("{}", i);
    }

#2

Something like this:

#![feature(conservative_impl_trait)]

fn plus_one<I: IntoIterator<Item=i32>>(items: I) -> impl IntoIterator<Item=i32> {
    items.into_iter().map(|p| p + 1)
}

fn main() {
    let numbers = vec![1, 2];
    for i in plus_one(plus_one(numbers)) {
        println!("{}", i);
    }
}

Let me know if you need any clarification, otherwise I’ll let you google/look up the docs as needed :slight_smile:.


#3

The main issue with your example is that you’re trying to use a trait as a type, but when you use a trait directly as a type you get an unsized type. You can’t directly assign an unsized type to a variable, or return one from a function.

To get over this, you can either Box that type (as in, return a Box<Trait>), or write your code in a generic way so that you get concrete, sized types bounded by your traits. What you’re trying to do is probably the latter, but you’re simply confused about how to express iterators in a generic way.
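To make the two options concrete, here is a minimal sketch of both. It assumes a recent stable compiler, where trait objects are written with `dyn` and `impl Trait` needs no feature attribute; the names `plus_one_boxed` and `plus_one_generic` are made up for illustration.

```rust
// Option 1: box the trait object. `Box<dyn Iterator<...>>` is a sized
// pointer, so it can be passed around and returned, at the cost of a heap
// allocation and dynamic dispatch.
fn plus_one_boxed<'a>(
    items: Box<dyn Iterator<Item = i32> + 'a>,
) -> Box<dyn Iterator<Item = i32> + 'a> {
    Box::new(items.map(|p| p + 1))
}

// Option 2: stay generic. `I` is a concrete, sized type chosen by the
// caller, and the return type is a concrete type hidden behind `impl Trait`.
fn plus_one_generic<I>(items: I) -> impl Iterator<Item = i32>
where
    I: Iterator<Item = i32>,
{
    items.map(|p| p + 1)
}

fn main() {
    let boxed: Vec<i32> = plus_one_boxed(Box::new(vec![1, 2].into_iter())).collect();
    let generic: Vec<i32> = plus_one_generic(vec![1, 2].into_iter()).collect();
    println!("{:?} {:?}", boxed, generic);
}
```

Both calls yield the same values; the difference is only in whether the concrete iterator type is erased behind a pointer or kept visible to the compiler.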

So, let’s fix the example. First, the variable declaration. You wrote

let mut numbers: Iterator<Item=i32> = vec![1, 2].into_iter();

However, as I said before, Iterator<Item=i32> is an unsized type, so you can’t have variables on the stack with that type. What you want is a concrete type that implements the trait Iterator<Item=i32>. If you really want to annotate the type, then the concrete type you want is std::vec::IntoIter<i32> (that’s what the into_iter method of a Vec returns), so this line would look like:

let mut numbers: std::vec::IntoIter<i32> = vec![1, 2].into_iter();

Of course, you don’t actually need to annotate the type, so I’d recommend just letting type inference do its thing:

let mut numbers = vec![1, 2].into_iter();

Now, the function definition. You’re receiving a shared (immutable) reference to an Iterator, and you want to return a new iterator that iterates over the previous one, yielding i32’s? That doesn’t make much sense, especially considering the body of the function:

items.map(|p| p + 1)

The map method, like most iterator adaptors, consumes the previous iterator (it takes ownership of it). Therefore, you should pass the original iterator to the function, not just a reference to it (Note: this is not strictly necessary, but given what you’re trying to do it is what makes the most sense). So in our function signature we want to express that we want to receive a type (by value) that implements Iterator<Item=i32>, and we want to return a type that also implements Iterator<Item=i32>. The first part is done with the traditional generic syntax, and the second part with conservative_impl_trait. Taking all of this into account, the function looks like this:

fn plus_one<T>(items: T) -> impl Iterator<Item = i32>
    where T: Iterator<Item = i32>
{
    items.map(|p| p + 1)
}

So, we are receiving a type T that implements Iterator<Item = i32> and we are returning another type that also implements Iterator<Item = i32>.

After these changes, the example looks like this:

#![feature(conservative_impl_trait)]

fn plus_one<T>(items: T) -> impl Iterator<Item = i32>
    where T: Iterator<Item = i32>
{
    items.map(|p| p + 1)
}

fn main() {
    let numbers = vec![1, 2].into_iter();

    for i in plus_one(numbers) {
        println!("{}", i);
    }

    println!("-");
    for i in plus_one(plus_one(vec![1, 2].into_iter())) { // Yes, you can do plus_one(plus_one(iterator))
        println!("{}", i);
    }
}

#4

Thank you very much for that step-by-step explanation!

The reason i annotated numbers was because i wanted to make sure i understood what type it was and catch the error early. It takes a bit of practice thinking in stack allocated types.

I had definitely not understood that in order to accept an Iterator<Item=i32> by value, i would need to use the generic syntax, but now that i think about it, that is of course the case.

You wrote that passing by value is not strictly necessary? I would like to understand the alternatives.


#5

We didn’t actually have to pass the iterator by value; we could have opted instead to pass it as a mutable reference. Your initial example wasn’t very far off if you stuck with this approach. The function, when written this way, looks like this:

fn plus_one<'a>(items: &'a mut Iterator<Item = i32>) -> impl Iterator<Item = i32> + 'a {
    items.map(|p| p + 1)
}

The only change is making the shared reference a mutable one. This works where the original didn’t because this impl exists in the standard library:

impl<'a, I> Iterator for &'a mut I where I: Iterator + ?Sized

This means mutable references to iterators are themselves iterators, so you can use iterator adaptors directly on them. However, iterating over the new iterator would mutate the underlying one, so it can’t be done with a shared (immutable) reference.
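A small sketch of that impl at work, assuming a recent stable compiler (hence the `dyn` keyword); `sum_plus_one` is a made-up helper for illustration. The adaptor is called on the mutable reference itself, and the underlying iterator is advanced as a side effect.

```rust
// `items` is a `&mut dyn Iterator`, which is itself an iterator thanks to
// `impl Iterator for &mut I where I: Iterator + ?Sized`, so `map` and
// `sum` can be called on it directly.
fn sum_plus_one(items: &mut dyn Iterator<Item = i32>) -> i32 {
    items.map(|p| p + 1).sum()
}

fn main() {
    let mut numbers = vec![1, 2, 3].into_iter();

    // Take the first element by hand, then lend out the rest.
    let first = numbers.next();
    let total = sum_plus_one(&mut numbers);

    println!("{:?} {}", first, total);

    // The borrow has ended, but the iterator was drained through it.
    println!("{:?}", numbers.next());
}
```

This should print `Some(1) 7` followed by `None`: the two remaining elements were pulled out through the reference.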

Now, originally you were calling it as plus_one(&mut numbers), that is, you were passing a mutable reference. However, the mutable reference was being coerced to an immutable one because the function expected an immutable reference as its argument. Once it was coerced to an immutable reference, it no longer worked as an iterator.

This approach (passing &mut numbers) works, but it’s not a very good approach. Once you consume the new iterator, you will still have the old iterator available. However, it will be empty.

#![feature(conservative_impl_trait)]

fn plus_one<'a>(items: &'a mut Iterator<Item = i32>) -> impl Iterator<Item = i32> + 'a {
    items.map(|p| p + 1)
}

fn main() {
    let mut numbers = vec![1, 2].into_iter();

    for i in plus_one(&mut numbers) {
        println!("{}", i);
    }

    println!("-");
    for i in numbers {
        println!("{}", i);
    }
}

This prints

2
3
-

We didn’t directly consume the numbers iterator, yet it was still consumed when we iterated over plus_one(&mut numbers). This should be properly expressed by passing ownership of the iterator to the plus_one function so we no longer have the old (and now useless) iterator lying around. Passing the iterator by value (and transferring ownership of it) removes a layer of indirection and makes the code more straightforward.
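One consequence worth sketching: since `&mut I` is itself an iterator, the by-value generic signature does not stop a caller from lending the iterator when they really want it back afterwards. This is a minimal illustration, assuming a recent stable compiler where `impl Trait` needs no feature attribute.

```rust
fn plus_one<T>(items: T) -> impl Iterator<Item = i32>
where
    T: Iterator<Item = i32>,
{
    items.map(|p| p + 1)
}

fn main() {
    // Usual case: hand over ownership; `numbers` is gone afterwards.
    let numbers = vec![1, 2].into_iter();
    let all: Vec<i32> = plus_one(numbers).collect();
    println!("{:?}", all);

    // Opt-in case: here T is `&mut std::vec::IntoIter<i32>`, so the
    // caller keeps the iterator and can continue with what is left.
    let mut numbers = vec![1, 2, 3].into_iter();
    let first: Vec<i32> = plus_one(&mut numbers).take(1).collect();
    let rest: Vec<i32> = numbers.collect();
    println!("{:?} {:?}", first, rest);
}
```

The signature stays simple, and the choice between moving and lending is made explicitly at the call site.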


#6

Thanks again.

Interesting that you bring up this example, as the code i was originally trying to write had an iterator being used twice, conceptually similar to the example below:

#![feature(conservative_impl_trait)]

fn take_one_plus<'a>(items: &'a mut Iterator<Item = i32>) -> impl Iterator<Item = i32> + 'a {
    items.take(1).map(|p| p + 1)
}

fn main() {
    let mut numbers = vec![1, 2].into_iter();

    for i in take_one_plus(&mut numbers) {
        println!("{}", i);
    }

    println!("-");

    for i in take_one_plus(&mut numbers) {
        println!("{}", i);
    }
}

I found it interesting to note that doing the following:

fn read_parts<'a>(mut lines: &'a mut Iterator<Item=&'a str>) -> impl Iterator<Item=Vec<&'a str>> + 'a {
    let foo = lines.take(1);
    let bar = lines.take(1);
    .... 
   (actual processing here)

Gives me a double mutable borrow error. In, say, C#, it would simply re-iterate (well, re-enumerate) lines again (which is something that code-analysis tools like R# will warn about :). Also, if i then change to lines.take(1).nth(0), the error goes away. So obvious in a way, but nice to see the compiler is smart enough for that.
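For reference, a minimal sketch of taking from the same iterator twice without overlapping borrows. It is not the real read_parts: `read_two_parts` is a made-up helper, and it assumes each part can be collected before the next one is started. Each `take(1)` adaptor is used up within its own statement, which is presumably also why the nth(0) variant compiles: the borrow held by the adaptor ends before the next take begins.

```rust
// Hypothetical helper: pulls two one-line "parts" off the same iterator.
fn read_two_parts<'a, I>(lines: &mut I) -> (Vec<&'a str>, Vec<&'a str>)
where
    I: Iterator<Item = &'a str>,
{
    // `by_ref()` borrows the iterator mutably; `collect()` consumes the
    // `take(1)` adaptor, so the borrow is over at the end of the statement.
    let foo: Vec<&'a str> = lines.by_ref().take(1).collect();
    let bar: Vec<&'a str> = lines.by_ref().take(1).collect();
    (foo, bar)
}

fn main() {
    let mut lines = "a\nb\nc".lines();
    let (foo, bar) = read_two_parts(&mut lines);
    println!("{:?} {:?}", foo, bar);

    // Whatever was not taken is still available to the caller.
    println!("{:?}", lines.next());
}
```

Unlike re-enumeration in C#, the second take continues where the first one stopped, so `foo` and `bar` hold different lines.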

I must admit Rust is really growing on me, and i am looking forward to continuing to learn it.