Find or return last element of iterator

mankinskin · April 10, 2021, 12:58pm

Is there a way to find an element of an iterator or return the last element if none is found with existing adapters? Or is there an existing crate implementing an operation like this?

The behavior I need is

return None if iterator is empty
return Some(element) when element matches the predicate
return Some(last_element) when no element matches the predicate

alice · April 10, 2021, 1:10pm

Here you go:

fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut last = None;
    
    for elem in iter {
        if f(&elem) {
            return Some(elem);
        }
        last = Some(elem);
    }
    
    last
}

If you must use iterators, there aren't any great ones besides rewriting the above loop with try_fold or something. There are some neat things like this one:

fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut counter = 0;
    iter.into_iter()
        .max_by_key(|elem| if f(elem) { u64::MAX } else { counter += 1; counter })
}

but it doesn't short-circuit.

Yandros · April 10, 2021, 1:20pm

If you really want to "one-shot" this with iterator adaptors, do this: it short-circuits as expected:

let mut acc = None;
match
    iterator
    .scan(&mut acc, |acc, elem| if predicate(&elem) { **acc = Some(elem); None } else { Some(elem) })
    .last()
{
    Some(last) => Some(acc.unwrap_or(last)),
    None => acc,
}

(Again, using .scan() as the poor man's replacement of .map_while() in stable Rust).

But I find the initial spelled out definition of matching_or_last from @alice to more clearly express the semantics of the desired operation, making it more readable.

mankinskin · April 10, 2021, 1:23pm

Thank you so much! Is there any advantage to implement this using only iterators? I now implemented your function in a trait for all Iterator + Sized, so I can use it on iterators, but I was thinking about implementing this behavior like a mix of .find() and .last() in something like itertools, if there is any advantage.

mankinskin · April 10, 2021, 1:30pm

Are you sure this short-circuits? I would think scan keeps iterating to the last element of the iterator even if a match has been found, no?

Edit: Oh, I see last stops iterating once it encounters None, I guess.

2e71828 · April 10, 2021, 1:40pm

My initial thought, which is probably overkill, is to write an iterator adapter that stores the last N elements that were produced:

fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
    I: IntoIterator, I::Item: Copy,
    F: FnMut(&I::Item) -> bool,
{
    let mut hist: History<_, 1> = History::new(iter.into_iter());
    hist.by_ref().find(f).or_else(|| hist[-1])
}

#[test]
fn test_matching_or_last() {
    assert_eq!(matching_or_last(vec![], |&x:&usize| x>3), None);
    assert_eq!(matching_or_last(vec![1,2,3], |&x| x>3), Some(3));
    assert_eq!(matching_or_last(vec![1,5,3], |&x| x>3), Some(5));
}

// -----------------------

struct History<I:Iterator, const N:usize> {
    iter: I,
    hist: [Option<I::Item>; N],
    next_hist: usize
}

impl<I, const N:usize> History<I,N> where I:Iterator, I::Item: Copy
{
    pub fn new(iter: I)->Self {
        History { 
            iter,
            hist: [None; N],
            next_hist: 0,
        }
    }
}

impl<I,const N:usize> std::ops::Index<isize> for History<I,N>
where I:Iterator {
    type Output = Option<I::Item>;
    fn index(&self, i:isize)->&Self::Output {
        assert!(i<0);
        assert!((-i) as usize <= N);
        
        &self.hist[(self.next_hist + N - (-i) as usize) % N]
    }
}

impl<I, const N:usize> Iterator for History<I,N> where I:Iterator, I::Item:Copy {
    type Item = I::Item;
    fn next(&mut self)->Option<I::Item> {
        let item = self.iter.next();
        if let Some(x) = item {
            self.hist[self.next_hist] = Some(x);
            self.next_hist = (self.next_hist + 1) % N;
        }
        item
    }
}

(Playground)

Note that this implementation only supports Copyable items.

Yandros · April 10, 2021, 1:44pm

FWIW, .try_fold() may be more readable, at least in the future when we will be able to short-circuit with custom enums such as Break(…) and ContinueWithAcc(…). Right now each respectively is signaled with Err and Ok, which is not necessarily the most readable version.

Nevertheless, this approach gets rid of the necessary outer match of my previous snippet:

iterator
    .try_fold(None, |_, elem| if predicate(&elem) {
        Err(elem)
    } else {
        Ok(Some(elem)) // the accumulator just keeps the last one seen.
    })
    .map_or_else(|found| Some(found), |mb_last| mb_last)

Truly expressed with built-in iterator adaptors, and clearly short-circuiting

EDIT: in the future, once the last design of the Try trait (and ? unsugaring) stabilizes, or by hackily polyfilling it with:

use Err as BreakWithVal;
use Ok as ContinueWithAcc;

this could become:

iterator
    .try_fold(None, |_, elem| if predicate(&elem) {
        BreakWithVal(elem)
    } else {
        ContinueWithAcc(Some(elem))
    })
    .map_or_else(
        // on short_circuit:
        |found_value| Some(found_value),
        // on loop exhaustion:
        |mb_last| mb_last,
    )

Which is the only point where I am finally satisfied with the readability of the code.

mankinskin · April 10, 2021, 1:56pm

This is also the impl I came up with now when combining find and last from the standard library
(Edited)

fn find_or_last<P>(mut self, predicate: P) -> Option<Self::Item>
	where Self: Sized,
		  P: FnMut(&Self::Item) -> bool,
{
    #[inline]
    fn check<T>(mut predicate: impl FnMut(&T) -> bool) -> impl FnMut(Option<T>, T) -> Result<Option<T>, T> {
        move |_, x| {
            if predicate(&x) { Result::Err(x) } else { Result::Ok(Some(x)) }
        }
    }

    self.try_fold(None, check(predicate)).unwrap_or_else(Some)
}

Playground
I guess I will open a pull request for this at itertools, seems like a useful thing to have.

Thanks for all the help everyone!

steffahn · April 10, 2021, 2:07pm

In case you want to be more concise: unwrap_or_else(Some)

steffahn · April 10, 2021, 2:28pm

I’m wondering what the use-case is for this operation. In case I haven’t overlooked something, I didn’t find anything about this in this thread so far.

Also, I’m noticing that with itertools, using with_position could be useful for writing this quite nicely:

use itertools::Itertools;
use itertools::Position::{self, *};

fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    i.into_iter()
        .with_position()
        .find(|e| match e {
            Last(e) | Only(e) => true,
            First(e) | Middle(e) => predicate(e),
        })
        .map(Position::into_inner)
}

The behavior is slightly different though, as the next method on the underlying iterator will always be called once more than needed to find the element matching the predicate, and the predicate will never be called on the last element.

mankinskin · April 10, 2021, 2:37pm

Oh, I didn't know about with_position. That would work too, although I don't know how efficient it is.

In my usecase I have an iterator of candidates, and I want to find one that satisfies a predicate, or just any one, and taking the last one would be the most efficient. At least that is what I think. Now I don't know if all this moving of the accumulator is even that efficient anymore. In some cases it might be better to just clone the iterator and call last or even first if find doesn't return anything.

2e71828 · April 10, 2021, 2:45pm

You could use Peekable::peek() for this: Save a copy of the first element before you search to return if there are no matching items.

mankinskin · April 10, 2021, 2:46pm

Yes, but in that case I would always clone the first item, even if a match exists, which I would like to avoid.

alice · April 10, 2021, 2:47pm

I suspect that the solution here that performs the best is the for loop, though for certain iterators such as chain, you may be able to beat it by replacing the for loop with try_for_each.

mankinskin · April 10, 2021, 2:50pm

Why do you think the for loop or try_for_each would be more efficient than try_fold?

alice · April 10, 2021, 2:51pm

Because for iterators such as chain, the try_for_each method is overloaded to produce assembly that looks like this:

for a in first_half {
    ...
}
for b in second_half {
    ...
}

whereas with the for loop, it would look like this:

for a_or_b in chained_iter {
    if in_first_half {
        ...
    } else {
        ...
    }
}

And the compiler would have to transform the latter into the former to get the best performance with the for loop. So it's a question of how much work it takes the optimizer to get to the best version, and the longer the path is, the more likely it is that you wont get the best version.

Of course, with more traditional iterators, this does not apply and they would most likely behave the same.

alice · April 10, 2021, 3:04pm

As for try_fold, I suspect that it would either behave the same or worse than the for loop and try_for_each solutions. Again, it's a question of how far the compiler has to go to fully optimize it. If you compare to the for loop, the try_fold solution will repeatedly pass last as an argument to a function, then return it again. To get performance like the for loop, the compiler has to remove back and forth. Of course, it's very possible that it will be able to do that in practice.

steffahn · April 10, 2021, 3:08pm

Note that try_fold has the same kind of advantage that try_for_each has for iterators like chain. In fact, Chain only implemts try_fold—not try_for_each—directly, the latter only uses the default implementation that delegates to try_fold.

alice · April 10, 2021, 3:11pm

Yes, but with try_for_each, the folding accumulator is the zero-sized (), so you still avoid the back-and-forth with try_for_each. So the solutions are not identical in what they are asking of the optimizer.

steffahn · April 10, 2021, 3:26pm

Yes. OTOH, with try_for_each you need to introduce some new moves

fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut last = None;
    i.into_iter()
        .try_for_each(|elem| {
            if predicate(&elem) {
                Err(elem)
            } else {
                last = Some(elem);
                Ok(())
            }
        })
        .err()
        .or(last)
}

(I hope the code behaves correctly)

because now there’s the Some(elem) values being moved into last. It’s all a trade-off and no solution is strictly better than any other in every way, I guess.

Edit: Just noticing I’m basically re-implementing find_map with that Ok(()), Err(elem) and .err() dance.

fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut last = None;
    i.into_iter()
        .find_map(|elem| {
            if predicate(&elem) {
                Some(elem)
            } else {
                last = Some(elem);
                None
            }
        })
        .or(last)
}

Furthermore, find_map is a method that iterators from crates other than std can implement efficiently.

Topic		Replies	Views
Is it possible to get the last element returned by an iterator? help	4	12276	October 6, 2020
Get first and last from not cloneable iterator help	2	1279	January 1, 2023
Find element to the left/right of given predicate	2	605	May 21, 2020
Iterator: need to identify the last element help	12	13532	December 27, 2020
Iterator `skip_last()`? help	5	5929	October 8, 2020

Find or return last element of iterator

Related topics