Flattening and collecting a nested iterator of results

If I have an iterator of iterators like this, I can easily flatten it and collect into a vector:

pub fn bar<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = J>,
    J: IntoIterator<Item = Result<i32>>,
{
    iter.into_iter().flatten().collect()
}

However, I ran into a situation where I have this instead:

pub fn foo<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    todo!()
}

I would still like to collect into a flat vector while propagating the first error that happens. The errors come from two different sources now – either from the outer iterator or from one of the inner iterators. The only solution I managed to come up with that avoids collecting the inner iterators into temporary vectors is this:

pub fn foo<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    let mut v = vec![];
    for inner in iter {
        for x in inner? {
            v.push(x?);
        }
    }
    Ok(v)
}
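For anyone who wants to try the loop version, here is a self-contained sketch. The `Result<T>` alias with a boxed error is an assumption (the question doesn't show which alias is in scope), and the three sample-input functions are hypothetical helpers added only to exercise both error sources.

```rust
use std::error::Error;

// Assumed alias: the question does not show its definition of `Result<T>`.
type Result<T> = std::result::Result<T, Box<dyn Error>>;

// The loop-based version from the question, unchanged.
pub fn foo<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    let mut v = vec![];
    for inner in iter {
        // `inner?` propagates an error from the outer iterator.
        for x in inner? {
            // `x?` propagates an error from an inner iterator.
            v.push(x?);
        }
    }
    Ok(v)
}

// Hypothetical sample input: every layer is `Ok`.
pub fn all_ok() -> Vec<Result<Vec<Result<i32>>>> {
    vec![Ok(vec![Ok(1), Ok(2)]), Ok(vec![]), Ok(vec![Ok(3)])]
}

// Hypothetical sample input: the outer iterator yields an error.
pub fn outer_fails() -> Vec<Result<Vec<Result<i32>>>> {
    vec![Ok(vec![Ok(1)]), Err("outer failed".into()), Ok(vec![Ok(2)])]
}

// Hypothetical sample input: an inner iterator fails before the outer one does.
pub fn inner_fails() -> Vec<Result<Vec<Result<i32>>>> {
    vec![
        Ok(vec![Ok(1), Err("inner failed".into())]),
        Err("outer failed".into()),
    ]
}
```

With these inputs, `foo(all_ok())` gives the flat vector, while the other two return whichever error is reached first in iteration order.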

Is there any better way to do this? Specifically, can I somehow use collect() to take advantage of the optimizations in the FromIterator implementation of Vec?


No one else has answered yet, so I'll throw in what I can.

I played with this a bit, but I couldn't come up with anything better than what you already have.

There's this option:

pub fn foo<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    iter.into_iter().flatten().flatten().collect()
}

Because Result implements IntoIterator, you can flatten it. But doing so ignores any errors from Result<J>, which isn't what you want.
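To make the problem concrete, here is a small sketch of that double-flatten version. The `Result<T>` alias with a boxed error is an assumption, and `foo_lossy` is a hypothetical name chosen to flag the behavior.

```rust
use std::error::Error;

// Assumed alias: the thread does not show the definition in use at this point.
type Result<T> = std::result::Result<T, Box<dyn Error>>;

// Hypothetical name: the double-flatten version, which loses outer errors.
pub fn foo_lossy<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    iter.into_iter()
        // Iterating a `Result<J>` yields `J` once for `Ok` and nothing for `Err`,
        // so an outer `Err` is silently skipped here.
        .flatten()
        // This flattens each `J` into its `Result<i32>` items.
        .flatten()
        // Inner errors are still caught by collecting into `Result<Vec<_>>`.
        .collect()
}
```

Passing an outer sequence of `Ok(vec![Ok(1)])`, `Err(..)`, `Ok(vec![Ok(2)])` returns `Ok(vec![1, 2])`: the outer error simply disappears, while an `Err` inside one of the inner vectors is still reported.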

If you can change your input, this would work:

pub fn baz<I>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<Result<i32>>>,
{
    let result: Result<Result<Vec<i32>>> = iter.into_iter().collect();
    result?
}

But again, it's not what you asked for.
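For completeness, a runnable sketch of that nested-collect variant, assuming a boxed-error `Result<T>` alias (the alias is my assumption, not something stated above):

```rust
use std::error::Error;

// Assumed alias with a boxed error type.
type Result<T> = std::result::Result<T, Box<dyn Error>>;

pub fn baz<I>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<Result<i32>>>,
{
    // The outer `Result` collects the outer layer; its "container" is itself a
    // `Result<Vec<i32>>`, which collects the inner layer.
    let result: Result<Result<Vec<i32>>> = iter.into_iter().collect();
    // `?` returns early on an outer error and otherwise yields the inner
    // `Result<Vec<i32>>`, which becomes the return value.
    result?
}
```

This works because `Result<V, E>` implements `FromIterator<Result<A, E>>` whenever `V: FromIterator<A>`, and here `V` is itself a `Result<Vec<i32>>`.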

To do this generically, I think you would need an implementation of the form

impl<I, A, V, E> FromIterator<Result<I, E>> for Result<V, E>
where
    I: IntoIterator<Item = A>,
    V: FromIterator<A>,
{
    // ...
}

But that would collide with the existing implementation of FromIterator for Result.

From what I can tell, what you have right now is the best solution. Maybe someone else can come up with a better option?

A general approach that almost always works when you need to work with an Iterator<Item = Result<T>> as if it were an Iterator<Item = T> is to use itertools::process_results. A function like that is also used internally in the standard library in order to implement the ability to collect an iterator of Result<T> into a Result<Vec<T>>.

E.g.

// some Result<T>-style type synonym
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

pub fn foo<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    itertools::process_results(iter.into_iter(), |i| i.flatten().collect()).unwrap_or_else(Err)
}

The crate itertools also offers a method/adapter called flatten_ok which you could use here.

// some Result<T>-style type synonym
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

use itertools::Itertools;

pub fn foo<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    iter.into_iter().flatten_ok().map(|item| item.unwrap_or_else(Err)).collect()
}

In either case, the .unwrap_or_else(Err) method call flattens a Result<Result<T>> into Result<T>.
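In isolation, that trick looks like the sketch below. It uses only the standard library, and `flatten_result` is a hypothetical helper name used for illustration.

```rust
// Hypothetical helper: collapse a nested result whose two layers share an error type.
pub fn flatten_result<T, E>(nested: Result<Result<T, E>, E>) -> Result<T, E> {
    // Outer `Ok(inner)`: return `inner` as-is (it may itself be `Ok` or `Err`).
    // Outer `Err(e)`: the `Err` constructor is used as a function, turning `e`
    // into `Err(e)` of the flattened type.
    nested.unwrap_or_else(Err)
}
```

So `Ok(Ok(1))` becomes `Ok(1)`, `Ok(Err("inner"))` becomes `Err("inner")`, and `Err("outer")` stays `Err("outer")`. Writing `nested.and_then(|r| r)` should be equivalent.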


Thanks for pointing that out. I didn't think to check itertools.

That being said, in my opinion, the original solution is far more readable than any of these options. I guess it doesn't pre-allocate the vector, though, which is a performance penalty.

I believe the other solutions probably don't pre-allocate the whole vector either. The iterator that process_results passes to the closure (in both the itertools version and the std-internal version used by the FromIterator<Result<...>> implementation) correctly reports a size_hint lower bound of zero, and IIRC it's the lower bound that determines how much capacity the FromIterator implementation for Vec pre-allocates.


Doesn't it use specialization to pre-allocate for I: TrustedLen? That uses the upper limit.

ResultShunt can't be TrustedLen, though, because knowing how long it is would require knowing where the Err was, if any, which is impossible in general.
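To illustrate the point, here is a hand-rolled, std-only sketch of a shunt-style adapter. It is not the standard library's actual implementation (which is internal, and I believe has since been renamed), and the `Shunt` type, its fields, and the boxed-error `Result<T>` alias are all assumptions made for the example.

```rust
use std::error::Error;

// Assumed alias with a boxed error type.
type Result<T> = std::result::Result<T, Box<dyn Error>>;

/// Hypothetical adapter: yields the `Ok` values of `iter` and parks the first
/// `Err` in `slot`, after which it stops.
struct Shunt<'a, I> {
    iter: I,
    slot: &'a mut Option<Box<dyn Error>>,
}

impl<'a, I, T> Iterator for Shunt<'a, I>
where
    I: Iterator<Item = Result<T>>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        if self.slot.is_some() {
            return None; // an error was already recorded
        }
        match self.iter.next()? {
            Ok(x) => Some(x),
            Err(e) => {
                *self.slot = Some(e);
                None
            }
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        // An error may end iteration at any point, so the only safe lower
        // bound is 0; the upper bound of the wrapped iterator still holds.
        (0, self.iter.size_hint().1)
    }
}

pub fn foo<I, J>(iter: I) -> Result<Vec<i32>>
where
    I: IntoIterator<Item = Result<J>>,
    J: IntoIterator<Item = Result<i32>>,
{
    let mut outer_err = None;
    let mut inner_err = None;
    // Inner shunt strips the outer `Result`, `flatten` joins the `J`s, and the
    // outer shunt strips the per-item `Result`, so `collect` sees plain `i32`s.
    let v: Vec<i32> = Shunt {
        iter: Shunt { iter: iter.into_iter(), slot: &mut outer_err }.flatten(),
        slot: &mut inner_err,
    }
    .collect();
    // Iteration stops at the first error, so at most one slot is filled.
    match outer_err.or(inner_err) {
        Some(e) => Err(e),
        None => Ok(v),
    }
}
```

Even when wrapping a `Vec` iterator of known length 3, this adapter can only report `(0, Some(3))`, which is why an exact-length guarantee like `TrustedLen` is out of reach for it.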

Thanks everyone! Itertools::flatten_ok() looks like the best solution in functional style at the moment, but I guess I'll stick with the imperative version. Rust isn't Haskell, after all.
