Why (ie. how) is map() lazy when iterating?

I understand lazyness of iterators. Using a map() might discard everything, better use for_each() instead.

My question is about how it's done. I am looking the source code of Map and for_each() and can't spot what makes Map lazy. It must be related to consuming something in for_each(), but where precisely ?

Map<I, F> is an iterator adapter which implements Iterator (when I: Iterator and F: FnMut(I::Item) -> Ret), and as you noted, iterators are lazy. The laziness is inherent in its implementation of Iterator.

fn example<I: Iterator>(iter: I) {
    let mut mapped_iter = iter.map(|_| 42);
    // No iteration happens until `mapped_iter.next()` is called...
    let _ = mapped_iter.next();
    // ...at which point <Map<I, _> as Iterator>::next is called
    // ...which calls <I as Iterator>::next()...
    // ...and if the result is `Some(x)`, `Some((closure)(x))` is returned
    //    (otherwise `None` is returned)
}

for_each isn't lazy and isn't tied to Map or vice-versa. It consumes iterators; it's not an adapter.

3 Likes

Short answer: Map only maps a value (polls the inner iterator and passes it through the function) when it is asked for a value.

You can see it in the source code

    fn next(&mut self) -> Option<B> {
        self.iter.next().map(&mut self.f)
    }

Do you mean that the only difference is that for_each does not return anything, so it is considered as a sink cancelling lazyness, while map, being an adapter, is just a link in the chain that delegates lazyness cancellation to the subsequent operations ?

[edit]
No, it can't be that, otherwise let _ = v1.iter(); would not store the iterator, it would be a sink as well.

New guess :
The only thing that cancels lazyness is if next() is called at anytime.
And using map() only inserts a stage in the chain, but does not call next().
Right ?

It's all about doing as little as possible, as late as possible. This is generally the case for all iterators. No actual iteration work is done until something starts calling next.

When you call map, it takes the closure and input iterator and package them into a new iterator, but that's all it will do for now. Nothing has called next yet.

When you call for_each, it starts calling next in a loop and passing the values to the input closure. That's when work is actually done, no matter how many adaptors you have stacked before. They will recursively call each other's next on demand. The same goes for collect, sum, and so on.

I suppose a rule of thumb is to see if it returns another iterator or an end result value.

2 Likes

Nothing "cancels laziness" per se, and there is no magic here; methods either consume the iterator (e.g. by calling next until the iterator is exhausted), or they don't.

(It's possible to make iterator adapters that consume the wrapped iterator (into a Vec, say) and then return their own items (lazily, as per the Iterator trait), though it's generally considered poor design if avoidable.)

1 Like

Iterator::map caches your input closure and implements an Iterator::next method that does the mapping. That is, Map is also an iterator itself, and it doesn't do any preprocessing, so it's clearly inert because it doesn't call the next or other consumer methods of the internal iterator ahead of time at all. Instead, for_each calls the iterator's next until the iterator ends, and therefore consumes the iterator.

For how to tell if a method is inert or not, I think there are these (the ones I usually use) guidelines:

  1. documentation. The documentation will generally describe clearly what this function does.
  2. method signature. In general, a method that produces some value will consume an iterator, while a method that produces another iterator will generally produce an inert iterator. *
  3. method implementation. If an iterator does not or only calls the internal iterator's consume method in its own consume method, then generally that iterator is inert.

(*) Methods like itertools::sorted, although they also return an iterator, are not strictly inert iterators because they consumes the iterator (for sorted, sort the original iterator) before the (first) iteration.

2 Likes