Iterate Over Iterator or Rev

I'm attempting to write a function that is basically as follows:

fn parse_identifier(test_string: impl Iterator<Item = char> + DoubleEndedIterator, reverse: bool) -> Option<String> {
let test_string = if reverse { test_string.rev() } else { test_string };
for c in test_string {
    // Do some parsing logic also slightly impacted by whether reverse is set
}
String::new() // New String built in the for loop
}

I receive an error from the compiler which is obvious in retrospect considering how Rust iterators work:

`if` and `else` have incompatible types
       expected type `Rev<impl Iterator<Item = char> + DoubleEndedIterator>`
found type parameter `impl Iterator<Item = char> + DoubleEndedIterator`rustc(E0308)

Is there a way to handle this cleanly? The bulk of the logic between the two processes (reverse and !reverse) are the same so it feels painful to handle them completely separately.

Since they're both iterators yielding the same item type, you can ask the compiler to use dynamic dispatch:

fn parse_identifier(
    mut test_string: impl Iterator<Item = char> + DoubleEndedIterator,
    reverse: bool,
) -> Option<String> {
    let mut rev_iter;
    let test_string: &mut dyn Iterator<Item = char> = if reverse {
        rev_iter = test_string.rev();
        &mut rev_iter
    } else {
        &mut test_string
    };
    for c in test_string {
        // Do some parsing logic also slightly impacted by whether reverse is set
    }
    Some(String::new()) // New String built in the for loop
}

(Playground)

Edit: Removed unnecessary Option

1 Like

I was about to ask about the extra Option, this is great thank you!

Declaring a variable without initializing it immediately isn't something I do very often, so I sometimes forget it's allowed :slight_smile: .

You can implement the for loop logic yourself by calling next_back() or next() and checking the result. Or you can make your own iterator modifier struct so next() always does what you want.

Playground for both.

1 Like

The simplest way to do this is with either, as that already implements the "check in next which to use" logic:

let test_string =
    if reverse { Either::Left(test_string.rev()) }
    else { Either::Right(test_string) };
4 Likes

This! Using either can have some big advantages over the dyn version. For example: if you then, also, use .for_each or .try_for_each instead of for loops whenever possible, there is then there’s only going to be a single check on whether the iterator is the forward or reversed version at the start of the loop. Compared to the dyn version where in the loop for every item, every single next call will be a dynamic function call.

1 Like

That will also happen with dynamic dispatch, however with either the compiler may be able to optimize normal for loops too (it's not guarantee though).

If op uses for_each/try_fold/fold/ecc ecc he could also declare the closure before hand and reuse it while duplicating the iterator code:

fn parse_identifier(test_string: impl Iterator<Item = char> + DoubleEndedIterator, reverse: bool) -> Option<String> {
    let body = |c| {
        // Do some parsing logic also slightly impacted by whether reverse is set
    };

    if reverse {
        test_string.rev().for_each(body);
    } else {
        test_string.for_each(body);
    }

    Some(String::new()) // New String built in the for loop
}

No need to generate vtables, create new types or import external crates

What do you mean? Only a single dynamic function call when using for_each? I don’t think so: for_each is generic and can thus not appear in a vtable. Using for_each with a trait object will be using the default implementation of for_each and thus repeated calls to next.

Edit: A second reason why it can’t appear in a vtable: for_each takes self by value.

1 Like

With dynamic dispatch you'll need a vtable lookup for the for_each method but after that you'll call the concrete method which won't need vtable lookups anymore.

Due to the non-aliasing rules, test_string can't be made to point at a different object inside the body of the loop. I assumed (without checking) that the optimizer would be able to determine that every call the for loop makes to next() looks up the same item from the same (immutable) vtable and simply caches the result in a register somewhere.

Just to be clear, I was only talking about pre-optimization here. I have no idea about what Rust or LLVM is capable off, just that without optimization, the dyn version makes lots of dynamic function calls. And I have a a feeling that e.g. even inlining the for_each call (including the for_each call on the original iterator - or something like an rfold call for the reversed one) might be possible in the version with Either, and that seems less likely in a version with dyn.

But actually answering which version performs how well, one ought to just look at the assembly and/or benchmark of course, anyways.

1 Like