let fh = File::open(input)?;
let reader = BufReader::new(fh);
for spam in reader.lines() {
if let Some(line) = spam {
}
}
and Clippy very helpfully mentioned that it can be simplified to:
for line in reader.lines().flatten() {
}
This is excellent, but my question is, where in the world is this behaviour of flatten documented? It doesn't seem to be mentioned at Iterator in std::iter - Rust nor in the book or Rust By Example... Should it be documented better?
The beta docs have a new example for this, saying "Flattening works on any IntoIterator type, including Option and Result." So what's happening here is that Option<T> turns into an iterator that is either empty for None, or yields exactly one item for Some(T).
flatten() operates on an Iterator whose items implement IntoIterator. It converts the items into iterators (if they aren't already) and flattens out the iterator of iterators. There's nothing particularly special about None that needs to be documented. All that is happening is that Option implements IntoIterator, with Some yielding an iterator of one element and None yielding an iterator of zero elements.
Thanks! Even though, in theory, one can derive this behaviour from a complete understanding of IntoIterator, human brains are not as good at type inference as rustc, so it's nice to have this example.
The problem is, you can't expect every possible combination of traits and functions to be separately documented. It's simply not feasible. Inferring what happens here is easy enough – you only have to follow the signatures and traits one-by-one, after all.
Huh, can you elaborate? To me filter_map(identity) looks needlessly complicated and it’s a shame if flatten() is suboptimal as it’s the most natural way to do it for me.
Yeah, it's a shame, but flatten has to deal with the possibility of an iterator returning multiple items, and thus it's just fundamentally more complicated than filter_map that deals in one item at a time.
You can see it right in the type. FilterMap is just
where F is a ZST for flattening options, whereas Flatten is
because it needs to deal with the possibility of unconsumed front and back iterators.
Is flatten() on Options another case where using for_each instead of a for loop would solve the performance problem?
Also, given the overhead exists, it seems almost problematic that Clippy suggests this particular use of flatten with a for loop containing if let Some...?
I guess it's a question of what trade-offs you are willing to make. Likely Clippy optimizes for clarity/simplicity over performance here, and/or perhaps the author of the lint did not know that there is overhead associated with flatten. (I would have guessed there isn't.)
For for_each it should, yes. That means it doesn't need to worry about resumability any more -- since fold takes self -- and can thus just nest the loop handling.
Thanks everyone and sorry for the slightly exasperated tone of my initial message! To summarize:
Option mplements IntoIterator, and the resulting iterator will return no values if the Option contains None
The rest is commentary... but probably merits an example for a few commonly-encountered use cases like iterating over lines()
flat_map(identity) is potentially more efficient than flatten() but Clippy is (if I can trust the discussion on GitHub) correct to recommend the latter and warn about the former because flatten() is more idiomatic and also handles some edge cases better.
Ah, I guess the optimizer is smarter than I remembered for the cases where the iterator is local and thus it knows it doesn't need to write-back to it ever.
If you looks a just the next -- like if it's used in dyn Iterator -- then there's a big difference, due to not having the same pre-knowledge: https://rust.godbolt.org/z/aG444qGdW.