Idiomatic way to __collect__ errors

  1. I know about Option, Result, anyhow::Result.

  2. Suppose we have a list of elements Vec<K> and we want to transform it into another list Vec<V>. If we have f: (&K) -> V, this is straightforward.

Suppose now that the function f can fail, so we have f: (&K) -> Option<V> or f: (&K) -> anyhow::Result<V>. Then, if we use collect, we get either (1) a Vec<V> or (2) the first error (or None, in the Option case).
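For reference, here is a minimal sketch of that short-circuiting behavior. The `parse` function and the `&str` keys are hypothetical stand-ins for `f` and `K`, and a plain `String` is used as the error type instead of anyhow to keep it std-only:

```rust
// Hypothetical fallible transform standing in for `f`: parse a key into a number.
fn parse(key: &str) -> Result<i32, String> {
    key.parse::<i32>()
        .map_err(|e| format!("{:?}: {}", key, e))
}

// Collecting into Result<Vec<_>, _> stops at the first Err and returns it.
fn parse_all(keys: &[&str]) -> Result<Vec<i32>, String> {
    keys.iter().map(|k| parse(k)).collect()
}

// Same idea with Option: the first None makes the whole result None.
fn parse_all_opt(keys: &[&str]) -> Option<Vec<i32>> {
    keys.iter().map(|k| k.parse::<i32>().ok()).collect()
}
```

With keys `["1", "x", "y"]`, `parse_all` reports only the error for `"x"`; the failure for `"y"` is never seen.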

Suppose we wanted a different behavior: (Vec<V>, Vec<Err>), two vectors -- a collection of all the good outputs, and a collection of all the errors. This is not a difficult function to write. Is there an idiomatic way to do this in Rust?

[Suppose that the index does not matter, i.e. within both V and Err, we have enough info to know which K it came from]

Iterating over Results - Rust By Example (rust-lang.org).

let (results, errors): (Vec<_>, Vec<_>) = keys
    .into_iter()
    .map(f)
    .partition(Result::is_ok);
let results: Vec<_> = results.into_iter().map(Result::unwrap).collect();
let errors: Vec<_> = errors.into_iter().map(Result::unwrap_err).collect();

And the extra step can be skipped by using https://docs.rs/itertools/0.10.0/itertools/trait.Itertools.html#method.partition_map so that the types come out right the first time.


Not really. That's only if you purposefully collect into a Result<Vec<_>, _>. Instead, you could flip the composition of types inside out, and collect into a Vec<Result<_, _>>, or even directly unzip the iterator if you are fine with having Options in your collections.
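A rough std-only sketch of both alternatives, in case it helps. The `parse` function and the concrete `i32`/`String` types are made up for illustration:

```rust
// Hypothetical fallible transform used only for this illustration.
fn parse(key: &str) -> Result<i32, String> {
    key.parse::<i32>()
        .map_err(|e| format!("{:?}: {}", key, e))
}

// Collecting into Vec<Result<_, _>> keeps every outcome, with no short-circuit.
fn keep_all(keys: &[&str]) -> Vec<Result<i32, String>> {
    keys.iter().map(|k| parse(k)).collect()
}

// Unzipping directly: each input yields one slot in both vectors,
// with None marking the side that did not apply.
fn split_with_options(keys: &[&str]) -> (Vec<Option<i32>>, Vec<Option<String>>) {
    keys.iter()
        .map(|k| match parse(k) {
            Ok(v) => (Some(v), None),
            Err(e) => (None, Some(e)),
        })
        .unzip()
}
```

The unzip variant keeps the two vectors index-aligned with the input, at the cost of carrying the `None` placeholders around.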

Also note that for less specialized iterator methods, you can always use the very general fold() method. Here's a function that spares you the two unwrap()s by using fold(), and which is generic over both collection types: Playground.

fn collect_results<T, E, I, C, D>(iter: I) -> (C, D)
    where
        I: IntoIterator<Item=Result<T, E>>,
        C: Default + Extend<T>,
        D: Default + Extend<E>,
{
    iter.into_iter().fold(Default::default(), |(mut v, mut e), r| {
        match r {
            Ok(val) => v.extend(Some(val)),
            Err(err) => e.extend(Some(err)),
        }
        (v, e)
    })
}
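A possible usage sketch, with the function repeated so the snippet stands alone. The `demo` function and its inputs are invented for illustration; the point is that the two output collections need not be the same type (here a `Vec` for the values and a `BTreeSet` that deduplicates the errors):

```rust
use std::collections::BTreeSet;

fn collect_results<T, E, I, C, D>(iter: I) -> (C, D)
where
    I: IntoIterator<Item = Result<T, E>>,
    C: Default + Extend<T>,
    D: Default + Extend<E>,
{
    iter.into_iter().fold(Default::default(), |(mut v, mut e), r| {
        match r {
            Ok(val) => v.extend(Some(val)),
            Err(err) => e.extend(Some(err)),
        }
        (v, e)
    })
}

// Hypothetical caller: successes go into a Vec, the offending inputs into a set.
fn demo() -> (Vec<i32>, BTreeSet<String>) {
    let inputs = ["1", "x", "2", "x"];
    collect_results(
        inputs
            .iter()
            .map(|s| s.parse::<i32>().map_err(|_| s.to_string())),
    )
}
```

The target collection types are picked up from the return type of `demo`, so no turbofish is needed at the call site.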

This is just reimplementing partition():
iterator.rs - source (rust-lang.org)

    fn partition<B, F>(self, f: F) -> (B, B)
    where
        Self: Sized,
        B: Default + Extend<Self::Item>,
        F: FnMut(&Self::Item) -> bool,
    {
        #[inline]
        fn extend<'a, T, B: Extend<T>>(
            mut f: impl FnMut(&T) -> bool + 'a,
            left: &'a mut B,
            right: &'a mut B,
        ) -> impl FnMut((), T) + 'a {
            move |(), x| {
                if f(&x) {
                    left.extend_one(x);
                } else {
                    right.extend_one(x);
                }
            }
        }

        let mut left: B = Default::default();
        let mut right: B = Default::default();

        self.fold((), extend(f, &mut left, &mut right));

        (left, right)
    }

No, it isn't. Obviously there won't be a large discovery in this kind of trivial algorithm. But partition bakes in the type of the output items (they must be of the same type as the input items), so if you use it, you must then redundantly unwrap and unwrap_err each individual item of both collections.

My solution solves this problem by pattern matching the Results in-place, hence it is both safer and faster, because it avoids unwrapping and obviates the need for a second pass, saving two allocations.


That's what Itertools::partition_map that I mentioned above does.


For errors specifically, I typically pass in a sink argument which collects non-critical errors:

// I, O and E are placeholders for the input, output and error types.
// The sink is FnMut so that the caller's closure can push into a Vec.
pub fn process_items(items: Vec<I>, error_sink: &mut dyn FnMut(E)) -> Vec<O> {
  items.into_iter().filter_map(|x| match process_single(x) {
    Ok(it) => Some(it),
    Err(it) => { error_sink(it); None }
  }).collect()
}

fn process_single(item: I) -> Result<O, E> { ... }

This gives composability & ergonomics for the implementor and flexibility for the user.
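To make the caller's side concrete, here is a small sketch under stated assumptions: `&str`, `i32` and `String` stand in for the placeholder types, `process_single` is a made-up parser, the sink is declared as `FnMut` so the closure can mutate captured state, and `demo` is a hypothetical caller that simply gathers the errors into a Vec:

```rust
// Made-up per-item step: parse a string, reporting the bad input on failure.
fn process_single(item: &str) -> Result<i32, String> {
    item.parse::<i32>()
        .map_err(|e| format!("{:?}: {}", item, e))
}

// Good values are returned; every error is handed to the sink and skipped.
fn process_items(items: Vec<&str>, error_sink: &mut dyn FnMut(String)) -> Vec<i32> {
    items
        .into_iter()
        .filter_map(|x| match process_single(x) {
            Ok(it) => Some(it),
            Err(it) => {
                error_sink(it);
                None
            }
        })
        .collect()
}

// Hypothetical caller: the sink pushes into a local Vec, but it could
// just as well log the error or count it.
fn demo() -> (Vec<i32>, Vec<String>) {
    let mut errors = Vec::new();
    let values = process_items(vec!["1", "x", "3"], &mut |e: String| errors.push(e));
    (values, errors)
}
```

The caller decides what "collecting" means, so the same `process_items` works whether the errors are stored, logged, or dropped.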


Yes, I see, I was merely trying to provide a std-only solution.