How can I make FlatMap::size_hint use my iterator's size_hint

use core::iter::*;

pub enum SingleOrIter<T, I: Iterator<Item = T>> {
    Single(Option<T>),
    Iter(I),
}

impl<T, I: Iterator<Item = T>> SingleOrIter<T, I> {
    pub fn single(item: T) -> Self {
        Self::Single(Some(item))
    }
    pub fn none() -> Self {
        Self::Single(None)
    }
    pub fn iter(iter: I) -> Self {
        Self::Iter(iter)
    }
}

impl<T, I: FusedIterator<Item = T>> Iterator for SingleOrIter<T, I> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Self::Single(item_option) => item_option.take(),
            Self::Iter(iter) => iter.next(),
        }
    }
    
    fn size_hint(&self) -> (usize, Option<usize>) {
        match self {
            SingleOrIter::Single(Some(_)) => (1, Some(1)),
            SingleOrIter::Single(None) => (0, Some(0)),
            SingleOrIter::Iter(i) => {
                assert!(
                    i.size_hint() != (0, None),
                    "SingleOrIter::Iter has no size hint"
                );
                i.size_hint()
            }
        }
    }
}
fn main() {
    let iter = [once(1), once(2), once(3)].into_iter();
    let iter = [
        SingleOrIter::Single(Some(1)),
        SingleOrIter::Iter(once(1)),
        SingleOrIter::Single(None),
    ]
    .into_iter()
    .flat_map(|x| x);
    assert_ne!(iter.size_hint(), (0, None), "FlatMap has no size hint");
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
warning: unused variable: `iter`
  --> src/main.rs:45:9
   |
45 |     let iter = [once(1), once(2), once(3)].into_iter();
   |         ^^^^ help: if this is intentional, prefix it with an underscore: `_iter`
   |
   = note: `#[warn(unused_variables)]` on by default

warning: `playground` (bin "playground") generated 1 warning
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.28s
     Running `target/debug/playground`
thread 'main' panicked at src/main.rs:53:5:
assertion `left != right` failed: FlatMap has no size hint
  left: (0, None)
 right: (0, None)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

What do you propose the size hint should be? Since flat_mapping can discard all items or produce arbitrarily many new items per original item, there's no lower or upper bound that can be calculated without actually exhausting the iterator.

1 Like

@paramagnetic if I understand you right, you are saying generally cannot give a size hint.

It does implement TrustedLen though, and its size_hint method is implemented. (It is advanced unstable Rust however, which I am not smart enough to understand.)

I looked at that. It only returns an upper bound if the inner iterator is empty (based on its size_hint), perhaps taking into account a cached next and next_back element (iterator to flatten) from the inner iterator.

(I have no idea what causes these to get cached.)

In other words, only after it has exhausted the iterator, modulo some very limited caching.

3 Likes

That's the code I looked at, too ... and did not understand.

If I understand you right, you are saying, that FlatMap::size_hint will only return a usable value, when it is empty?

That can't be right.

let iter = [once(1), once(2), once(3)].into_iter().flat_map(|x|x);
println!("{:?}", iter.size_hint());

prints

(0, Some(3))

Without the `.flat_map(|x|x)´ it prints

(3, Some(3))

Probably an artifact of specialization or of whatever ConstSizeIntoIterator is about.

2 Likes

That's conceivable. So there really is no way to make FlatMap return a useful size. Unlucky.

There may technically be a way, but you'd have to figure out all the related unstable traits and specialization hacks. And I'm certain they're considered implementation details and not a guaranteed (hint that could be used for) optimization.

So unless you're confident this could help with some significant performance problem, it's probably not worth your time.

1 Like

The TrustedLen impl is conditional on U being arrays or references to arrays and the inner iterator also being TrustedLen.
IIf we wrap a TrustedLen iterator that gets flat-mapped to arrays which have a known size at compile time then we can multiply those to provide a proper size hint.

2 Likes

If you do https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e80f16fa8999b8cdc6e35b243278f51a

let mut iter = [
    SingleOrIter::Iter([1, 2, 3, 4].iter().copied()),
    SingleOrIter::Single(Some(1)),
    SingleOrIter::Single(None),
]
.into_iter()
.flat_map(|x| x);
iter.next(); // <-- NB
assert_ne!(iter.size_hint(), (0, None), "FlatMap has no size hint");

then the assertion will pass.

So it does use your size_hint, but only once it's actually pulled on the iterator once. If it's never called next on the inner iterator to get a SingleOrIter in the first place, then it has nothing on which to call size_hint.

(It definitely can't consume the whole iterator to get all the size_hints of all the elements, because then those elements are gone. If you want to walk the whole thing and add up the sizes of the intermediate things, you probably want slices instead of iterator. Like https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.concat.)

This is a great idea, but unfortunately in my case it did not work.

Maybe it is because the source of my iterator-variants is not an array, but a list based on a splay-tree. That's why I cant operate on slices, either. But it's fine. I do know the size and can ensure the capacity of the accepting collection beforehand. Having size_hint work would have just been the more elegant and robust way.

Thanks everybody :slight_smile:

That sounds like each iterator may have a different length, in which case there's not much you can do.

3 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.