#[bench]
fn iter_flatmap_call(bench: &mut Bencher) {
    let hs = HashSet::new();
    let ohs: Option<&HashSet<i64>> = Some(&hs);
    bench.iter(|| {
        get_iter_flat_map(&ohs);
    })
}

#[bench]
fn iter_flatmap_call_and_use(bench: &mut Bencher) {
    let big_hs = big_hs();
    let ohs = Some(&big_hs);
    bench.iter(|| {
        for i in get_iter_flat_map(&ohs) {
            let _ = i + 1;
        }
    })
}
And the results:
test iter_dyn_call ... bench: 18 ns/iter (+/- 2)
test iter_dyn_call_and_use ... bench: 3,747 ns/iter (+/- 516)
test iter_either_call ... bench: 8 ns/iter (+/- 1)
test iter_either_call_and_use ... bench: 1,990 ns/iter (+/- 210)
test iter_enum_call ... bench: 8 ns/iter (+/- 2)
test iter_enum_call_and_use ... bench: 2,026 ns/iter (+/- 287)
test iter_flatmap_call ... bench: 1 ns/iter (+/- 0)
test iter_flatmap_call_and_use ... bench: 1,684 ns/iter (+/- 218)
And the full code.
So there is a difference, and boxing is the slowest, both for calling the method and for subsequently consuming the iterator, though it is only about a 2-fold difference. The flat_map solution is the fastest, especially for creating the iterator.
It's also worth noting that Option<&T> is Copy, so the caller can still use the original Option. It is the same size as &Option<&T> but requires one less level of indirection, so passing it by value is actually cheaper. Win-win-win.
(Edit: Actually, the performance difference appears to be optimized away. Point stands.)
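For illustration, here is a minimal sketch of what a by-value signature could look like. The function name get_iter_flat_map_by_value and its body are my assumptions for this sketch, not necessarily the code in the full listing above; it just shows Option<&HashSet<i64>> being taken by value and flat_mapped:

use std::collections::HashSet;

// Sketch (hypothetical name): take Option<&HashSet<i64>> by value instead of
// &Option<&HashSet<i64>>. Option<&T> is Copy, so the caller keeps its own copy,
// and there is one less level of indirection on every use.
fn get_iter_flat_map_by_value<'a>(ohs: Option<&'a HashSet<i64>>) -> impl Iterator<Item = &'a i64> {
    ohs.into_iter().flat_map(|hs| hs.iter())
}

The caller passes ohs rather than &ohs, and can keep using ohs afterwards because Option<&T> is Copy.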
The Either crate solution works, but it inserts a match for every Deref call, which as far as I can see means before every call to next. The flat_map solution does the same thing as part of the internal calls it makes to the contained Iterator.
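Concretely, the dispatch looks roughly like this. This is a hand-rolled sketch of the enum/Either idea (the type SetOrEmpty is made up for illustration; it is not the actual Either source):

use std::collections::hash_set;
use std::iter;

// Hand-rolled equivalent of the enum/Either approach. A value would be built
// from Option<&HashSet<i64>>: Some(hs) => Set(hs.iter()), None => Empty(iter::empty()).
enum SetOrEmpty<'a> {
    Set(hash_set::Iter<'a, i64>),
    Empty(iter::Empty<&'a i64>),
}

impl<'a> Iterator for SetOrEmpty<'a> {
    type Item = &'a i64;

    fn next(&mut self) -> Option<&'a i64> {
        // This match runs once per element produced.
        match self {
            SetOrEmpty::Set(it) => it.next(),
            SetOrEmpty::Empty(it) => it.next(),
        }
    }
}

Every element goes through that match; hoisting it out of the loop is exactly what the specialized fold-family methods below achieve.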
All of the following methods can be optimized in such a manner:
any
all
count
find
{,try_}fold
{,try_}for_each
last
{min,max}{,_by,_by_key}
nth
partition
{,r}position
product
sum
unzip
Overriding just fold (or I guess you should override try_fold as of 1.27) will automatically optimize the vast majority of these, which use it in their default implementation.
Of course, @scottmcm is right. nth can’t use try_fold.
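As a sketch of that forwarding (the generic OneOf/Left/Right names are made up; this is not the either crate's actual implementation): overriding fold performs the match once up front and then hands the whole loop to the inner iterator's own, possibly specialized, fold.

// Generic two-variant iterator, in the spirit of Either or the SetOrEmpty sketch above.
enum OneOf<A, B> {
    Left(A),
    Right(B),
}

impl<A, B, T> Iterator for OneOf<A, B>
where
    A: Iterator<Item = T>,
    B: Iterator<Item = T>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Per-element dispatch, as before.
        match self {
            OneOf::Left(a) => a.next(),
            OneOf::Right(b) => b.next(),
        }
    }

    // One match up front, then the inner iterator's own fold runs the whole loop.
    fn fold<Acc, F>(self, init: Acc, f: F) -> Acc
    where
        F: FnMut(Acc, T) -> Acc,
    {
        match self {
            OneOf::Left(a) => a.fold(init, f),
            OneOf::Right(b) => b.fold(init, f),
        }
    }
}

With that in place, the methods listed above whose default implementations go through fold (count, sum, max and friends) get the optimized path for free; the short-circuiting ones (any, find, position, ...) additionally want try_fold forwarded in the same way, which is where the hiccups below come in.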
A small hiccup: try_fold takes &mut self with Self: Sized. So at best we need &mut &mut Self.
Another hiccup: by_ref takes Self: Sized, so you can't use it. (However, you can just mutably borrow self.)
After clearing those hiccups, it compiles, but it does not forward to the optimized try_fold of the underlying iterator. This is because impl Iterator for &'a mut T does not forward the try_fold method; it can't, since that impl is written for T: ?Sized, and forwarding would mean calling try_fold on the inner T, which requires T: Sized.
An obvious corollary to this is that by_ref (or taking &mut iter) destroys virtually all of the fold optimizations!
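A small self-contained demonstration of that corollary. The Counting iterator and its 1000-element range are made up for illustration: it has a specialized fold that never calls next, and a counter shows that going through &mut falls back to the default, next-driven loop:

use std::cell::Cell;

// Toy iterator over 0..n that counts how many times next() is called,
// and has a specialized fold that bypasses next() entirely.
struct Counting<'a> {
    i: u64,
    n: u64,
    next_calls: &'a Cell<u64>,
}

impl<'a> Iterator for Counting<'a> {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        self.next_calls.set(self.next_calls.get() + 1);
        if self.i < self.n {
            self.i += 1;
            Some(self.i - 1)
        } else {
            None
        }
    }

    // Specialized fold: loops internally without touching next().
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, u64) -> B,
    {
        let mut acc = init;
        for x in self.i..self.n {
            acc = f(acc, x);
        }
        acc
    }
}

fn main() {
    // Calling fold directly uses the specialized implementation: next() never runs.
    let direct = Cell::new(0);
    let sum: u64 = Counting { i: 0, n: 1000, next_calls: &direct }.fold(0, |a, b| a + b);
    assert_eq!(sum, 499_500);
    assert_eq!(direct.get(), 0);

    // Calling fold through &mut (which is what by_ref hands out) dispatches via
    // impl Iterator for &mut I. That impl cannot call Counting's own fold, because
    // fold consumes the iterator by value, so the elements are pulled one by one
    // through next() instead.
    let via_ref = Cell::new(0);
    let mut it = Counting { i: 0, n: 1000, next_calls: &via_ref };
    let sum: u64 = (&mut it).fold(0, |a, b| a + b);
    assert_eq!(sum, 499_500);
    assert!(via_ref.get() > 0); // the default, next()-driven loop ran
}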