Iterate Over Iterator or Rev

keyneom · November 23, 2020, 6:43pm

I'm attempting to write a function that is basically as follows:

fn parse_identifier(test_string: impl Iterator<Item = char> + DoubleEndedIterator, reverse: bool) -> Option<String> {
let test_string = if reverse { test_string.rev() } else { test_string };
for c in test_string {
    // Do some parsing logic also slightly impacted by whether reverse is set
}
String::new() // New String built in the for loop
}

I receive an error from the compiler which is obvious in retrospect considering how Rust iterators work:

`if` and `else` have incompatible types
       expected type `Rev<impl Iterator<Item = char> + DoubleEndedIterator>`
found type parameter `impl Iterator<Item = char> + DoubleEndedIterator`rustc(E0308)

Is there a way to handle this cleanly? The bulk of the logic between the two processes (reverse and !reverse) are the same so it feels painful to handle them completely separately.

2e71828 · November 23, 2020, 6:50pm

Since they're both iterators yielding the same item type, you can ask the compiler to use dynamic dispatch:

fn parse_identifier(
    mut test_string: impl Iterator<Item = char> + DoubleEndedIterator,
    reverse: bool,
) -> Option<String> {
    let mut rev_iter;
    let test_string: &mut dyn Iterator<Item = char> = if reverse {
        rev_iter = test_string.rev();
        &mut rev_iter
    } else {
        &mut test_string
    };
    for c in test_string {
        // Do some parsing logic also slightly impacted by whether reverse is set
    }
    Some(String::new()) // New String built in the for loop
}

(Playground)

Edit: Removed unnecessary Option

keyneom · November 23, 2020, 7:04pm

I was about to ask about the extra Option, this is great thank you!

2e71828 · November 23, 2020, 7:07pm

Declaring a variable without initializing it immediately isn't something I do very often, so I sometimes forget it's allowed .

quinedot · November 23, 2020, 7:12pm

You can implement the for loop logic yourself by calling next_back() or next() and checking the result. Or you can make your own iterator modifier struct so next() always does what you want.

Playground for both.

scottmcm · November 23, 2020, 7:26pm

The simplest way to do this is with either, as that already implements the "check in next which to use" logic:

let test_string =
    if reverse { Either::Left(test_string.rev()) }
    else { Either::Right(test_string) };

steffahn · November 23, 2020, 8:01pm

This! Using either can have some big advantages over the dyn version. For example: if you then, also, use .for_each or .try_for_each instead of for loops whenever possible, there is then there’s only going to be a single check on whether the iterator is the forward or reversed version at the start of the loop. Compared to the dyn version where in the loop for every item, every single next call will be a dynamic function call.

SkiFire13 · November 23, 2020, 9:41pm

That will also happen with dynamic dispatch, however with either the compiler may be able to optimize normal for loops too (it's not guarantee though).

If op uses for_each/try_fold/fold/ecc ecc he could also declare the closure before hand and reuse it while duplicating the iterator code:

fn parse_identifier(test_string: impl Iterator<Item = char> + DoubleEndedIterator, reverse: bool) -> Option<String> {
    let body = |c| {
        // Do some parsing logic also slightly impacted by whether reverse is set
    };

    if reverse {
        test_string.rev().for_each(body);
    } else {
        test_string.for_each(body);
    }

    Some(String::new()) // New String built in the for loop
}

No need to generate vtables, create new types or import external crates

steffahn · November 23, 2020, 9:43pm

What do you mean? Only a single dynamic function call when using for_each? I don’t think so: for_each is generic and can thus not appear in a vtable. Using for_each with a trait object will be using the default implementation of for_each and thus repeated calls to next.

Edit: A second reason why it can’t appear in a vtable: for_each takes self by value.

SkiFire13 · November 23, 2020, 9:52pm

With dynamic dispatch you'll need a vtable lookup for the for_each method but after that you'll call the concrete method which won't need vtable lookups anymore.

2e71828 · November 23, 2020, 10:00pm

Due to the non-aliasing rules, test_string can't be made to point at a different object inside the body of the loop. I assumed (without checking) that the optimizer would be able to determine that every call the for loop makes to next() looks up the same item from the same (immutable) vtable and simply caches the result in a register somewhere.

steffahn · November 23, 2020, 10:07pm

Just to be clear, I was only talking about pre-optimization here. I have no idea about what Rust or LLVM is capable off, just that without optimization, the dyn version makes lots of dynamic function calls. And I have a a feeling that e.g. even inlining the for_each call (including the for_each call on the original iterator - or something like an rfold call for the reversed one) might be possible in the version with Either, and that seems less likely in a version with dyn.

But actually answering which version performs how well, one ought to just look at the assembly and/or benchmark of course, anyways.

Topic		Replies	Views
Swap out Vec<>.iter() with Vec<>.iter().rev() in-place?	12	2868	January 12, 2023
Forward or backward loop depend on condition? help	4	170	May 26, 2025
Understand iterators & reverse iterator	5	2224	November 2, 2021
[beginner] using .rev() with a range help	6	5685	September 15, 2019
Returning an iterator and reverse iterator with match statement help	7	1178	October 9, 2019

Iterate Over Iterator or Rev

Related topics