Simplifying iterator method implementations

arendjr · May 27, 2025, 7:32am

On the chance of starting a bike shedding topic, I wanted to ask a simple, but serious question: Why are most methods of Iterator so complex?

I mean, they're not hugely complex, but let's look at find_map() for example:

fn find_map<B, F>(&mut self, f: F) -> Option<B>
where
    Self: Sized,
    F: FnMut(Self::Item) -> Option<B>,
{
    #[inline]
    fn check<T, B>(mut f: impl FnMut(T) -> Option<B>) -> impl FnMut((), T) -> ControlFlow<B> {
        move |(), x| match f(x) {
            Some(x) => ControlFlow::Break(x),
            None => ControlFlow::Continue(()),
        }
    }

    self.try_fold((), check(f)).break_value()
}

I can understand what's going on here. And I understand if you like functional programming, then yes, most utility functions defined on the Iterator trait are expressible as try_fold().

But wouldn't the following be much simpler still?

fn find_map<B, F>(&mut self, f: F) -> Option<B>
where
    Self: Sized,
    F: FnMut(Self::Item) -> Option<B>,
{
    while let Some(x) = self.next() {
        if let Some(x) = match f(x) {
            return Some(x);
        }
    }
    None
}

No messing around with ControlFlow, no internal closure, no call to another method. Just very plain control flow that you can see almost instantly is correct.

I can understand that maybe the code was written that way one day, and no one saw a reason the rewrite it. After all, with zero cost abstractions, it might not really matter.

But does it really not matter? In my debug builds, I can see that try_fold() is a separate function call in my stack traces. Those builds could probably be sped up with a simpler implementation. And maybe it could improve compile times too. Maybe it would be negligible on the whole, but there's a whole bunch of methods that are like this: all(), any(), find(), find_map() itself, position(), rposition(), try_find(), comparison methods, maybe more? Seeing how pervasively iterators are used in Rust, maybe simplifying this code is worthwhile, even if the gains are minuscule?

But it begs the question: Am I missing some reason for why using try_fold() is actually better?

chrefr · May 27, 2025, 7:35am

The standard library strives to be the most efficient possible. Some iterators (e.g. chain(), flatten()) can implement internal iteration (try_fold(), for_each()) more efficiently than external iteration (next()).

arendjr · May 27, 2025, 7:44am

Thanks, that makes sense! So if I understand correctly, those iterators can speed things up by merely reimplementing try_fold() and for_each(), and the other methods will then make use of those.

But it’s still a tradeoff of course. Would be nice if iterators that cannot benefit from this had some mechanism to automatically use simpler implementations instead. Although I cannot think of a way to express that in Rust syntax right now…

chrefr · May 27, 2025, 7:46am

Not correct; every iterator must implement next(), because it is used e.g. by for loops. But they can also implement try_fold() etc. more efficiently.

There is such mechanism: they could override the methods. But it'll mean the std developers have to maintain significantly more code for dubious benefit.

scottmcm · May 27, 2025, 8:42pm

Because the standard library is used by everything, so it's often worth doing things in it that wouldn't be worth bothering about elsewhere.

For example, the check function is a function instead of just putting a normal closure because that way it monomorphizes just for the function type. If it wasn't separated out, it'd also monomorphize by the iterator type. Separating out the check like this thus reduces the overall amount of monomorphization if you ever pass the same predicate to multiple different iterators, which slightly speeds up compilation.

Is that something I'd regularly bother doing outside of core? No, probably not. It's not that big a difference.

As for why everything goes through try_fold, see Rust’s iterators are inefficient, and here’s what we can do about it. | by Veedrac | Medium for the usual description. In short, what it most often means is that if you call find_map on something that happens to be a Chain in std::iter - Rust, rather than every time through the one loop checking "have I exhausted the first one?" (as would happen in the simpler while let Some(x) = self.next() loop), it ends up looking for the item in the first iterator, then only if it's not found does it look in the second one. And when it's two separate loops, it's easier for the optimizer to optimize.

If you're running opt-level = 0, this absolutely makes things slower. But it makes certain things faster in --release.

cuviper · May 28, 2025, 12:13am

When the closure captures the iterator type, this can also lead to exponential growth in the raw type name, which was my original motivation for #62429. If that find_map example had self.try_fold((), {closure}), then you'd have a method call on Self using a closure that also captures Self (to get Self::Item). Add another iterator adaptor and you may double the mentions of Self again, etc.

There's an example here (playground) that still demonstrates the issue in LLVM IR, especially in these two debug names:

!16 = !DICompositeType(tag: DW_TAG_structure_type, name: "Map<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>, playground::multiply2::mul::{closure_env#0}<i32>>", scope: !18, file: !17, size: 128, align: 64, flags: DIFlagPublic, elements: !20, templateParams: !49, identifier: "24fd75c018ccc309239953abe20bdaf3")
!81 = !DICompositeType(tag: DW_TAG_structure_type, name: "Map<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>, playground::multiply::{closure_env#0}<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>>>", scope: !18, file: !17, size: 128, align: 64, flags: DIFlagPublic, elements: !82, templateParams: !89, identifier: "9b50d28994e5a9f684d703f83476601a")

!16 is the manually-constrained version from multiply2, while !81 is from multiply with the same Map<I, F> type name duplicated by the closure capture. Here's an indented view of those types:

Map<
  core::iter::adapters::map::Map<
    core::ops::range::Range<i32>,
    playground::iter::{closure_env#0}
  >,
  playground::multiply2::mul::{closure_env#0}<i32>
>

Map<
  core::iter::adapters::map::Map<
    core::ops::range::Range<i32>,
    playground::iter::{closure_env#0}
  >,
  playground::multiply::{closure_env#0}<
    core::iter::adapters::map::Map<
      core::ops::range::Range<i32>,
      playground::iter::{closure_env#0}
    >
  >
>

Add another layer of each (playground) and you get:

!16 = !DICompositeType(tag: DW_TAG_structure_type, name: "Map<core::iter::adapters::map::Map<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>, playground::multiply2::mul::{closure_env#0}<i32>>, playground::multiply2::mul::{closure_env#0}<f64>>", scope: !18, file: !17, size: 192, align: 64, flags: DIFlagPublic, elements: !20, templateParams: !59, identifier: "d17100d0c8952248cff78fd22bc9bf9d")
!106 = !DICompositeType(tag: DW_TAG_structure_type, name: "Map<core::iter::adapters::map::Map<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>, playground::multiply::{closure_env#0}<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>>>, playground::multiply::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>, playground::multiply::{closure_env#0}<core::iter::adapters::map::Map<core::ops::range::Range<i32>, playground::iter::{closure_env#0}>>>>>", scope: !18, file: !17, size: 192, align: 64, flags: DIFlagPublic, elements: !107, templateParams: !123, identifier: "4dfd2209da1a6584748bd206bae322ee")

Map<
  core::iter::adapters::map::Map<
    core::iter::adapters::map::Map<
      core::ops::range::Range<i32>,
      playground::iter::{closure_env#0}
    >,
    playground::multiply2::mul::{closure_env#0}<i32>
  >,
  playground::multiply2::mul::{closure_env#0}<f64>
>

Map<
  core::iter::adapters::map::Map<
    core::iter::adapters::map::Map<
      core::ops::range::Range<i32>,
      playground::iter::{closure_env#0}
    >,
    playground::multiply::{closure_env#0}<
      core::iter::adapters::map::Map<
        core::ops::range::Range<i32>,
        playground::iter::{closure_env#0}
      >
    >
  >,
  playground::multiply::{closure_env#0}<
    core::iter::adapters::map::Map<
      core::iter::adapters::map::Map<
        core::ops::range::Range<i32>,
        playground::iter::{closure_env#0}
      >,
      playground::multiply::{closure_env#0}<
        core::iter::adapters::map::Map<
          core::ops::range::Range<i32>,
          playground::iter::{closure_env#0}
        >
      >
    >
  >
>

arendjr · May 28, 2025, 4:59am

Thanks, those explanations were really helpful in understanding what’s going on!

Topic		Replies	Views
Does anybody need a try_flatten() method? code review	2	822	June 11, 2022
.map(), .filter(), but no .each()	12	6481	April 16, 2015
Abstraction causes overhead help	16	863	August 12, 2021
What `.iter().map()` does than simple loop don't? code review	10	3034	April 13, 2020
Iterating over Results help	5	829	January 2, 2019

Simplifying iterator method implementations

Related topics