Where is flatten() skipping None documented?

Hi! I had some code like this:

let fh = File::open(input)?;
let reader = BufReader::new(fh);
for spam in reader.lines() {
    if let Some(line) = spam {
    }
}

and Clippy very helpfully mentioned that it can be simplified to:

for line in reader.lines().flatten() {
}

This is excellent, but my question is, where in the world is this behaviour of flatten documented? It doesn't seem to be mentioned at Iterator in std::iter - Rust nor in the book or Rust By Example... Should it be documented better?

4 Likes

The beta docs have a new example for this, saying "Flattening works on any IntoIterator type, including Option and Result." So what's happening here is that Option<T> turns into an iterator that is either empty for None, or yields exactly one item for Some(T).

11 Likes

flatten() operates on an Iterator whose items implement IntoIterator. It converts the items into iterators (if they aren't already) and flattens out the iterator of iterators. There's nothing particularly special about None that needs to be documented. All that is happening is that Option implements IntoIterator, with Some yielding an iterator of one element and None yielding an iterator of zero elements.

4 Likes

To add to the other answer, this is not a special case for Option, it just implements IntoIterator

2 Likes

Thanks! Even though, in theory, one can derive this behaviour from a complete understanding of IntoIterator, human brains are not as good at type inference as rustc, so it's nice to have this example.

4 Likes

The problem is, you can't expect every possible combination of traits and functions to be separately documented. It's simply not feasible. Inferring what happens here is easy enough – you only have to follow the signatures and traits one-by-one, after all.

1 Like

Sure, but in this case it's important enough to have a lint in Clippy.

As I like to say to anyone in earshot, "DRY is for code, not documentation"

7 Likes

Though it needs to be not too wet, or else it won't ever be fully updated, and you'll be stuck with inconsistent documentation that's worse.

Also, using flatten for options is arguably an anti-pattern. It's actually faster to use filter_map(identity) to flatten options.

EDIT: steffahn demonstrates below that it's often no longer a problem to use flatten for this.

10 Likes

Oh! Very interesting indeed... makes sense... once you understand what IntoIterator actually does :slight_smile:

Thanks!

1 Like

Huh, can you elaborate? To me filter_map(identity) looks needlessly complicated and it’s a shame if flatten() is suboptimal as it’s the most natural way to do it for me.

It's easier to write .flatten(), but the actual iterator is more complicated.

1 Like

Yeah, it's a shame, but flatten has to deal with the possibility of an iterator returning multiple items, and thus it's just fundamentally more complicated than filter_map that deals in one item at a time.

You can see it right in the type. FilterMap is just

where F is a ZST for flattening options, whereas Flatten is

because it needs to deal with the possibility of unconsumed front and back iterators.

2 Likes

Is flatten() on Options another case where using for_each instead of a for loop would solve the performance problem?

Also, given the overhead exists, it seems almost problematic that Clippy suggests this particular use of flatten with a for loop containing if let Some...?

I guess it's a question of what trade-offs you are willing to make. Likely Clippy optimizes for clarity/simplicity over performance here, and/or perhaps the author of the lint did not know that there is overhead associated with flatten. (I would have guessed there isn't.)

For for_each it should, yes. That means it doesn't need to worry about resumability any more -- since fold takes self -- and can thus just nest the loop handling.

And yup, looks like it does just that:

1 Like

I think it's documented under Option::iter, which "Returns an iterator over the possibly contained value." and provides this example:

let x = Some(4);
assert_eq!(x.iter().next(), Some(&4));

let x: Option<u32> = None;
assert_eq!(x.iter().next(), None);

Technically iter can deviate from IntoIterator but in practice this will never happen.

1 Like

Thanks everyone and sorry for the slightly exasperated tone of my initial message! To summarize:

  • Option mplements IntoIterator, and the resulting iterator will return no values if the Option contains None
  • The rest is commentary... but probably merits an example for a few commonly-encountered use cases like iterating over lines()
  • flat_map(identity) is potentially more efficient than flatten() but Clippy is (if I can trust the discussion on GitHub) correct to recommend the latter and warn about the former because flatten() is more idiomatic and also handles some edge cases better.

Thanks again, you people are excellent!

Codegen doesn't look all that different for any of the approaches. In the end it's nested scalar loops with no chance for unrolling or anything.

Ah, I guess the optimizer is smarter than I remembered for the cases where the iterator is local and thus it knows it doesn't need to write-back to it ever.

If you looks a just the next -- like if it's used in dyn Iterator -- then there's a big difference, due to not having the same pre-knowledge: https://rust.godbolt.org/z/aG444qGdW.

1 Like

Hrm, that would be fixable with specialization but I'm not sure if it's worth it considering iterators are usually used in loops.