Find or return last element of iterator

Is there a way to find an element of an iterator or return the last element if none is found with existing adapters? Or is there an existing crate implementing an operation like this?

The behavior I need is

  • return None if iterator is empty
  • return Some(element) when element matches the predicate
  • return Some(last_element) when no element matches the predicate
1 Like

Here you go:

fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut last = None;
    
    for elem in iter {
        if f(&elem) {
            return Some(elem);
        }
        last = Some(elem);
    }
    
    last
}

If you must use iterators, there aren't any great ones besides rewriting the above loop with try_fold or something. There are some neat things like this one:

fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut counter = 0;
    iter.into_iter()
        .max_by_key(|elem| if f(elem) { u64::MAX } else { counter += 1; counter })
}

but it doesn't short-circuit.

7 Likes

If you really want to "one-shot" this with iterator adaptors, do this: it short-circuits as expected:

  • let mut acc = None;
    match
        iterator
        .scan(&mut acc, |acc, elem| if predicate(&elem) { **acc = Some(elem); None } else { Some(elem) })
        .last()
    {
        Some(last) => Some(acc.unwrap_or(last)),
        None => acc,
    }
    
    • (Again, using .scan() as the poor man's replacement of .map_while() in stable Rust).

But I find the initial spelled out definition of matching_or_last from @alice to more clearly express the semantics of the desired operation, making it more readable.

4 Likes

Thank you so much! Is there any advantage to implement this using only iterators? I now implemented your function in a trait for all Iterator + Sized, so I can use it on iterators, but I was thinking about implementing this behavior like a mix of .find() and .last() in something like itertools, if there is any advantage.

Are you sure this short-circuits? I would think scan keeps iterating to the last element of the iterator even if a match has been found, no? :thinking:

Edit: Oh, I see last stops iterating once it encounters None, I guess.

1 Like

My initial thought, which is probably overkill, is to write an iterator adapter that stores the last N elements that were produced:

fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
    I: IntoIterator, I::Item: Copy,
    F: FnMut(&I::Item) -> bool,
{
    let mut hist: History<_, 1> = History::new(iter.into_iter());
    hist.by_ref().find(f).or_else(|| hist[-1])
}

#[test]
fn test_matching_or_last() {
    assert_eq!(matching_or_last(vec![], |&x:&usize| x>3), None);
    assert_eq!(matching_or_last(vec![1,2,3], |&x| x>3), Some(3));
    assert_eq!(matching_or_last(vec![1,5,3], |&x| x>3), Some(5));
}

// -----------------------

struct History<I:Iterator, const N:usize> {
    iter: I,
    hist: [Option<I::Item>; N],
    next_hist: usize
}

impl<I, const N:usize> History<I,N> where I:Iterator, I::Item: Copy
{
    pub fn new(iter: I)->Self {
        History { 
            iter,
            hist: [None; N],
            next_hist: 0,
        }
    }
}

impl<I,const N:usize> std::ops::Index<isize> for History<I,N>
where I:Iterator {
    type Output = Option<I::Item>;
    fn index(&self, i:isize)->&Self::Output {
        assert!(i<0);
        assert!((-i) as usize <= N);
        
        &self.hist[(self.next_hist + N - (-i) as usize) % N]
    }
}

impl<I, const N:usize> Iterator for History<I,N> where I:Iterator, I::Item:Copy {
    type Item = I::Item;
    fn next(&mut self)->Option<I::Item> {
        let item = self.iter.next();
        if let Some(x) = item {
            self.hist[self.next_hist] = Some(x);
            self.next_hist = (self.next_hist + 1) % N;
        }
        item
    }
}

(Playground)

Note that this implementation only supports Copyable items.

FWIW, .try_fold() may be more readable, at least in the future when we will be able to short-circuit with custom enums such as Break(…) and ContinueWithAcc(…). Right now each respectively is signaled with Err and Ok, which is not necessarily the most readable version.

Nevertheless, this approach gets rid of the necessary outer match of my previous snippet:

iterator
    .try_fold(None, |_, elem| if predicate(&elem) {
        Err(elem)
    } else {
        Ok(Some(elem)) // the accumulator just keeps the last one seen.
    })
    .map_or_else(|found| Some(found), |mb_last| mb_last)

Truly expressed with built-in iterator adaptors, and clearly short-circuiting :slightly_smiling_face:


EDIT: in the future, once the last design of the Try trait (and ? unsugaring) stabilizes, or by hackily polyfilling it with:

use Err as BreakWithVal;
use Ok as ContinueWithAcc;

this could become:

iterator
    .try_fold(None, |_, elem| if predicate(&elem) {
        BreakWithVal(elem)
    } else {
        ContinueWithAcc(Some(elem))
    })
    .map_or_else(
        // on short_circuit:
        |found_value| Some(found_value),
        // on loop exhaustion:
        |mb_last| mb_last,
    )

Which is the only point where I am finally satisfied with the readability of the code.

2 Likes

This is also the impl I came up with now when combining find and last from the standard library :slight_smile:
(Edited)

fn find_or_last<P>(mut self, predicate: P) -> Option<Self::Item>
	where Self: Sized,
		  P: FnMut(&Self::Item) -> bool,
{
    #[inline]
    fn check<T>(mut predicate: impl FnMut(&T) -> bool) -> impl FnMut(Option<T>, T) -> Result<Option<T>, T> {
        move |_, x| {
            if predicate(&x) { Result::Err(x) } else { Result::Ok(Some(x)) }
        }
    }

    self.try_fold(None, check(predicate)).unwrap_or_else(Some)
}

Playground
I guess I will open a pull request for this at itertools, seems like a useful thing to have.

Thanks for all the help everyone!

In case you want to be more concise: unwrap_or_else(Some)

2 Likes

I’m wondering what the use-case is for this operation. In case I haven’t overlooked something, I didn’t find anything about this in this thread so far.

Also, I’m noticing that with itertools, using with_position could be useful for writing this quite nicely:

use itertools::Itertools;
use itertools::Position::{self, *};

fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    i.into_iter()
        .with_position()
        .find(|e| match e {
            Last(e) | Only(e) => true,
            First(e) | Middle(e) => predicate(e),
        })
        .map(Position::into_inner)
}

The behavior is slightly different though, as the next method on the underlying iterator will always be called once more than needed to find the element matching the predicate, and the predicate will never be called on the last element.

2 Likes

Oh, I didn't know about with_position. That would work too, although I don't know how efficient it is.

In my usecase I have an iterator of candidates, and I want to find one that satisfies a predicate, or just any one, and taking the last one would be the most efficient. At least that is what I think. Now I don't know if all this moving of the accumulator is even that efficient anymore. In some cases it might be better to just clone the iterator and call last or even first if find doesn't return anything.

You could use Peekable::peek() for this: Save a copy of the first element before you search to return if there are no matching items.

Yes, but in that case I would always clone the first item, even if a match exists, which I would like to avoid.

I suspect that the solution here that performs the best is the for loop, though for certain iterators such as chain, you may be able to beat it by replacing the for loop with try_for_each.

2 Likes

Why do you think the for loop or try_for_each would be more efficient than try_fold?

Because for iterators such as chain, the try_for_each method is overloaded to produce assembly that looks like this:

for a in first_half {
    ...
}
for b in second_half {
    ...
}

whereas with the for loop, it would look like this:

for a_or_b in chained_iter {
    if in_first_half {
        ...
    } else {
        ...
    }
}

And the compiler would have to transform the latter into the former to get the best performance with the for loop. So it's a question of how much work it takes the optimizer to get to the best version, and the longer the path is, the more likely it is that you wont get the best version.

Of course, with more traditional iterators, this does not apply and they would most likely behave the same.

1 Like

As for try_fold, I suspect that it would either behave the same or worse than the for loop and try_for_each solutions. Again, it's a question of how far the compiler has to go to fully optimize it. If you compare to the for loop, the try_fold solution will repeatedly pass last as an argument to a function, then return it again. To get performance like the for loop, the compiler has to remove back and forth. Of course, it's very possible that it will be able to do that in practice.

1 Like

Note that try_fold has the same kind of advantage that try_for_each has for iterators like chain. In fact, Chain only implemts try_fold—not try_for_each—directly, the latter only uses the default implementation that delegates to try_fold.

1 Like

Yes, but with try_for_each, the folding accumulator is the zero-sized (), so you still avoid the back-and-forth with try_for_each. So the solutions are not identical in what they are asking of the optimizer.

2 Likes

Yes. OTOH, with try_for_each you need to introduce some new moves

fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut last = None;
    i.into_iter()
        .try_for_each(|elem| {
            if predicate(&elem) {
                Err(elem)
            } else {
                last = Some(elem);
                Ok(())
            }
        })
        .err()
        .or(last)
}

(I hope the code behaves correctly)

because now there’s the Some(elem) values being moved into last. It’s all a trade-off and no solution is strictly better than any other in every way, I guess.

Edit: Just noticing I’m basically re-implementing find_map with that Ok(()), Err(elem) and .err() dance.

fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
    I: IntoIterator,
    F: FnMut(&I::Item) -> bool,
{
    let mut last = None;
    i.into_iter()
        .find_map(|elem| {
            if predicate(&elem) {
                Some(elem)
            } else {
                last = Some(elem);
                None
            }
        })
        .or(last)
}

Furthermore, find_map is a method that iterators from crates other than std can implement efficiently.

1 Like