Is there a way to find an element of an iterator or return the last element if none is found with existing adapters? Or is there an existing crate implementing an operation like this?
The behavior I need is
return None if iterator is empty
return Some(element) when element matches the predicate
return Some(last_element) when no element matches the predicate
fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
I: IntoIterator,
F: FnMut(&I::Item) -> bool,
{
let mut last = None;
for elem in iter {
if f(&elem) {
return Some(elem);
}
last = Some(elem);
}
last
}
If you must use iterators, there aren't any great ones besides rewriting the above loop with try_fold or something. There are some neat things like this one:
fn matching_or_last<I, F>(iter: I, mut f: F) -> Option<I::Item>
where
I: IntoIterator,
F: FnMut(&I::Item) -> bool,
{
let mut counter = 0;
iter.into_iter()
.max_by_key(|elem| if f(elem) { u64::MAX } else { counter += 1; counter })
}
(Again, using .scan() as the poor man's replacement of .map_while() in stable Rust).
But I find the initial spelled out definition of matching_or_last from @alice to more clearly express the semantics of the desired operation, making it more readable.
Thank you so much! Is there any advantage to implement this using only iterators? I now implemented your function in a trait for all Iterator + Sized, so I can use it on iterators, but I was thinking about implementing this behavior like a mix of .find() and .last() in something like itertools, if there is any advantage.
FWIW, .try_fold() may be more readable, at least in the future when we will be able to short-circuit with custom enums such as Break(…) and ContinueWithAcc(…). Right now each respectively is signaled with Err and Ok, which is not necessarily the most readable version.
Nevertheless, this approach gets rid of the necessary outer match of my previous snippet:
iterator
.try_fold(None, |_, elem| if predicate(&elem) {
Err(elem)
} else {
Ok(Some(elem)) // the accumulator just keeps the last one seen.
})
.map_or_else(|found| Some(found), |mb_last| mb_last)
Truly expressed with built-in iterator adaptors, and clearly short-circuiting
EDIT: in the future, once the last design of the Try trait (and ? unsugaring) stabilizes, or by hackily polyfilling it with:
use Err as BreakWithVal;
use Ok as ContinueWithAcc;
this could become:
iterator
.try_fold(None, |_, elem| if predicate(&elem) {
BreakWithVal(elem)
} else {
ContinueWithAcc(Some(elem))
})
.map_or_else(
// on short_circuit:
|found_value| Some(found_value),
// on loop exhaustion:
|mb_last| mb_last,
)
Which is the only point where I am finally satisfied with the readability of the code.
I’m wondering what the use-case is for this operation. In case I haven’t overlooked something, I didn’t find anything about this in this thread so far.
Also, I’m noticing that with itertools, using with_position could be useful for writing this quite nicely:
use itertools::Itertools;
use itertools::Position::{self, *};
fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
I: IntoIterator,
F: FnMut(&I::Item) -> bool,
{
i.into_iter()
.with_position()
.find(|e| match e {
Last(e) | Only(e) => true,
First(e) | Middle(e) => predicate(e),
})
.map(Position::into_inner)
}
The behavior is slightly different though, as the next method on the underlying iterator will always be called once more than needed to find the element matching the predicate, and the predicate will never be called on the last element.
Oh, I didn't know about with_position. That would work too, although I don't know how efficient it is.
In my usecase I have an iterator of candidates, and I want to find one that satisfies a predicate, or just any one, and taking the last one would be the most efficient. At least that is what I think. Now I don't know if all this moving of the accumulator is even that efficient anymore. In some cases it might be better to just clone the iterator and call last or even first if find doesn't return anything.
I suspect that the solution here that performs the best is the for loop, though for certain iterators such as chain, you may be able to beat it by replacing the for loop with try_for_each.
Because for iterators such as chain, the try_for_each method is overloaded to produce assembly that looks like this:
for a in first_half {
...
}
for b in second_half {
...
}
whereas with the for loop, it would look like this:
for a_or_b in chained_iter {
if in_first_half {
...
} else {
...
}
}
And the compiler would have to transform the latter into the former to get the best performance with the for loop. So it's a question of how much work it takes the optimizer to get to the best version, and the longer the path is, the more likely it is that you wont get the best version.
Of course, with more traditional iterators, this does not apply and they would most likely behave the same.
As for try_fold, I suspect that it would either behave the same or worse than the for loop and try_for_each solutions. Again, it's a question of how far the compiler has to go to fully optimize it. If you compare to the for loop, the try_fold solution will repeatedly pass last as an argument to a function, then return it again. To get performance like the for loop, the compiler has to remove back and forth. Of course, it's very possible that it will be able to do that in practice.
Note that try_fold has the same kind of advantage that try_for_each has for iterators like chain. In fact, Chainonly implemts try_fold—not try_for_each—directly, the latter only uses the default implementation that delegates to try_fold.
Yes, but with try_for_each, the folding accumulator is the zero-sized (), so you still avoid the back-and-forth with try_for_each. So the solutions are not identical in what they are asking of the optimizer.
Yes. OTOH, with try_for_each you need to introduce some new moves
fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
I: IntoIterator,
F: FnMut(&I::Item) -> bool,
{
let mut last = None;
i.into_iter()
.try_for_each(|elem| {
if predicate(&elem) {
Err(elem)
} else {
last = Some(elem);
Ok(())
}
})
.err()
.or(last)
}
(I hope the code behaves correctly)
because now there’s the Some(elem) values being moved into last. It’s all a trade-off and no solution is strictly better than any other in every way, I guess.
Edit: Just noticing I’m basically re-implementing find_map with that Ok(()), Err(elem) and .err() dance.
fn matching_or_last<I, F>(i: I, mut predicate: F) -> Option<I::Item>
where
I: IntoIterator,
F: FnMut(&I::Item) -> bool,
{
let mut last = None;
i.into_iter()
.find_map(|elem| {
if predicate(&elem) {
Some(elem)
} else {
last = Some(elem);
None
}
})
.or(last)
}
Furthermore, find_map is a method that iterators from crates other than std can implement efficiently.