Why is `skip` lazy? (Rationale for std::iter::Skip)

Hi,

Just wondering: what's the rationale for the skip method of iterators returning a new instance of std::iter::Skip instead of simply regurgitating the original iterator after calling nth on it? After all the following does work:

trait SimpleSkip {
    fn simple_skip(self, n: usize) -> Self;
}

impl<I: Iterator> SimpleSkip for I {
    fn simple_skip(mut self, n: usize) -> Self {
        if n != 0 {
            self.nth(n - 1);
        }
        self
    }
}

fn main() {
    let v = vec![0, 1, 2, 3, 4, 5];
    for e in v.iter().simple_skip(3) {
        println!{"{e}"};
    }
}

Skip is lazy, but that laziness will only be an advantage when multiple calls to skip are chained for the same iterator (multiple calls to nth will be replaced by a single one), or when the iterator is created, skipped, but then actually never consumed. Both seem like not relevant.

On the other hand, Skip introduces another field that takes up memory. (The compiler should optimize it away in many cases though.)

All Iterator adapters are lazy, not just skip.

3 Likes

This ties in to the discussion on this thread from Internals: `Vec::append` but by value - language design - Rust Internals

nth and simple_skip are just “mutating” versus “chaining” versions of the same function. If there were syntax sugar for something like tap, as discussed in the linked thread, then they wouldn’t need to be two separate functions.

2 Likes

If you want an eager version of skip there's Itertools::dropping

1 Like

Ok, let's phrase the question differently: why is skip an iterator adapter without this being technically necessary? Most iterator adapters (like map, filter, or take) can only be realized as an adapter (i.e. consume and return a different type), but skip is an exception to this rule.

I guess that the answer is that in practice the laziness of skip is almost never a problem: if it's not necessary (because the iterator is consumed right away), the additional storage needed for the Skip structure gets optimized away. And in some cases the laziness may be useful or even necessary, and then the small price is justified.

Interesting thread, thanks for the pointer! Indeed, the function simple_skip above would be more general if instead of consuming and regurgitating it would not require ownership and simply take a &mut self (well, that would be nth). But as the thread discusses there is no convenient syntax to chain such methods.

I mean we're talking about one usize. Applications that can't handle up to 64 bits of additional memory that is required for the convenience, ergonomics and API consistency of having skip as an adapter that we can chain with the other iterator methods are probably written without using such high-level APIs in the first place.

2 Likes

Related.

1 Like

skip needs to be lazy in order to work properly with non-fused iterators. Consider:

let iter = || {
    let mut b = true;
    iter::from_fn(move || {
        if mem::take(&mut b) {
            None
        } else {
            unreachable!()
        }
    })
};

dbg!(iter().skip(3).next()); // None
dbg!(iter().simple_skip(3).next()); // panic
3 Likes

Excellent point, thanks. That's an invariant of iterator adapters that I was not aware of. Without it even a simple for loop may panic, like

for i in iter().simple_skip(3) {
    dbg!(i);
}

Also note that laziness is required for upholding complexity expectations. Because iterator adapters are lazy, people expect that creating an iterator adapter is O(1). If skip were implemented naĂŻvely, it would break this expectation on long iterators with a long skipped prefix, potentially leading to accidentally quadratic algorithms.

3 Likes

This is only true if the algorithm is dominated by creating iterators without ever using them.

Under the condition that next is called at least once, the eager simple_skip above does the same work as the lazy skip: they both call nth.

Yes, but that's not guaranteed or automatic. For example, it may be more convenient to unconditionally create an iterator (either for scoping/borrowing reasons, or just because of the general structuring of the code) and then only advancing it conditionally. It would be quite a big footgun if merely moving the construction of an iterator outside a condition changed the performance characteristics of the code.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.