Short-circuiting flat_map applied to a Result iterable

Hi,

I'm processing a key-value-like data structure Item = (Key, Vec<Value>) to build the "reverse" mapping of (Value, Key), repeating Key for each Value.
Both keys and values (inside the Vec) need to go through a Fn(I) -> Result<O> transform, which we can assume is expensive to compute and therefore should not be called more than once per key and value.

The code below works, but it introduces an intermediate collect() into a Vec, which lets me conveniently use the ? operator to skip transforming the values as soon as any key fails. I'd like to understand whether I can get rid of that intermediate Vec. I was hoping std::iter would contain a try_map equivalent of try_fold (so I don't need to manually accumulate from an empty Vec), but it doesn't.
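For reference, the manual accumulation with try_fold that I'd like to avoid might look roughly like the sketch below. It assumes the same can_err as in the playground code further down; reverse_try_fold is a hypothetical name used only for illustration. It transforms each key once and stops at the first error, but the accumulator has to be threaded by hand.

```rust
use std::io::{ErrorKind, Result};

/// Same contract as `can_err` in the playground code: Err on even length.
fn can_err(a: &str) -> Result<usize> {
    if a.len() % 2 == 0 {
        return Err(ErrorKind::InvalidInput.into());
    }
    Ok(a.len())
}

/// Hypothetical helper (not from the thread): builds the (key, value) pairs
/// with `try_fold`, starting from an empty Vec.
fn reverse_try_fold(input: &[(&str, Vec<&str>)]) -> Result<Vec<(usize, usize)>> {
    input.iter().try_fold(
        Vec::new(),
        |mut acc: Vec<(usize, usize)>, (k, values)| -> Result<Vec<(usize, usize)>> {
            // The key is transformed once per item; `?` aborts the whole fold.
            let k = can_err(k)?;
            for v in values {
                acc.push((k, can_err(v)?));
            }
            Ok(acc)
        },
    )
}
```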

Note that I could do

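input.iter()
  .flat_map(|(k, values)| values.iter().map(move |v| (k, v)))
  // `try` blocks are nightly-only; on stable, return Ok((..)) from the closure instead
  .map(|(k, v)| try { (can_err(v)?, can_err(k)?) })
  .collect::<Result<Vec<_>>>()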

but this is wasteful because it calls can_err on the same key once per value. Besides, flat_map does not compose well with an iterator of Result<>.

I know crates like fallible_iterator (with its FlatMap adapter) could probably do this, and I've skimmed through the itertools helpers but could not find what I need. Depending on yet another crate looks like overkill.

Thanks for your help!

(Playground)

use std::io::{ErrorKind, Result};

/// Ok(len()) if odd-sized, otherwise Err().
fn can_err(a: &str) -> Result<usize> {
    if a.len() % 2 == 0 {
        return Err(ErrorKind::InvalidInput.into());
    }
    Ok(a.len())
}

fn main() -> Result<()> {
    let input = [
        ("a", vec!["bbb"]),
        ("ccccc", vec!["ddd", "e"]),
        ("fff", vec!["BOOM"]), // try with BOM instead of BOOM
    ];
    let key_layer: Result<Vec<_>> = input
        .iter()
        .map(|(k, values)| can_err(k).map(|k| (k, values)))
        .collect();
    println!("{:?}", key_layer);

    let value_layer: Result<Vec<_>> = key_layer?
        .into_iter()
        .flat_map(|(k, values)| values.iter().map(move |v| can_err(v).map(|v| (k, v))))
        .collect();
    println!("{:?}", value_layer);

    Ok(())
}

I love iterators too, but I think this is a case where some plain loops are much more clear.

    let mut results = Vec::new();
    for (k, values) in input.iter() {
        let k = can_err(k)?;
        for v in values {
            let v = can_err(v)?;
            results.push((k, v));
        }
    }

(Or are you looking for a way with iterators specifically for some reason?)

Some options:

Using itertools::process_results

fn main() -> Result<()> {
    let input = [
        ("a", vec!["bbb"]),
        ("ccccc", vec!["ddd", "e"]),
        ("fff", vec!["BOM"]), // try with BOOM instead of BOM
    ];

    let value_layer = itertools::process_results(
        input
            .iter()
            .map(|(k, values)| can_err(k).map(|k| (k, values))),
        |i| {
            i.flat_map(|(k, values)| values.iter().map(move |v| can_err(v).map(|v| (k, v))))
                .collect::<Result<Vec<_>>>()
        },
    )??;
    println!("{:?}", value_layer);

    Ok(())
}

Using map+Itertools::flatten_ok and an additional call to map flattening the resulting Result<Result<T, E>, E>s:

use itertools::Itertools; // brings flatten_ok into scope

fn main() -> Result<()> {
    let input = [
        ("a", vec!["bbb"]),
        ("ccccc", vec!["ddd", "e"]),
        ("fff", vec!["BOM"]), // try with BOOM instead of BOM
    ];

    let value_layer = input
        .iter()
        .map(|(k, values)| {
            can_err(k).map(|k| values.iter().map(move |v| can_err(v).map(|v| (k, v))))
        })
        .flatten_ok()
        .map(|rr| rr.and_then(|r| r))
        .collect::<Result<Vec<_>>>();
    println!("{:?}", value_layer);

    Ok(())
}
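If an extra crate is not wanted, a std-only variant along the same lines might also work. This is only a sketch, not something from itertools: reverse_lazy is a hypothetical name, and can_err is repeated from the original post so the block stands alone. The idea is to split each key's result into an "ok" half and an "error" half, chain the two as iterators, and let collect() into a Result stop at the first Err.

```rust
use std::io::{ErrorKind, Result};

/// Same contract as `can_err` in the original post: Err on even length.
fn can_err(a: &str) -> Result<usize> {
    if a.len() % 2 == 0 {
        return Err(ErrorKind::InvalidInput.into());
    }
    Ok(a.len())
}

/// Hypothetical helper: lazy, std-only, and calls `can_err` once per key.
fn reverse_lazy(input: &[(&str, Vec<&str>)]) -> Result<Vec<(usize, usize)>> {
    input
        .iter()
        .flat_map(|(k, values)| {
            // Exactly one of the two Options is Some.
            let (key, key_err): (Option<usize>, Option<Result<(usize, usize)>>) =
                match can_err(k) {
                    Ok(k) => (Some(k), None),
                    Err(e) => (None, Some(Err(e))),
                };
            // A failed key yields a single Err; a good key yields its values.
            key_err.into_iter().chain(
                key.into_iter()
                    .flat_map(move |k| values.iter().map(move |v| can_err(v).map(|v| (k, v)))),
            )
        })
        // Collecting into Result stops pulling items at the first Err.
        .collect()
}
```

Whether this reads any better than the plain loops is debatable; it mainly shows that no intermediate Vec of keys is required.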

I agree that readability is terrible here, and I ended up writing this imperatively like you did. In a language with first-class generators such as Python, I'd have written the same loops but with a yield, so the result can feed into any container. What I dislike most here is the lack of syntactic sugar for pushing into the Vec; the rest looks fine. I'd love to write imperative loops while still being able to use ? and yield, something like

let nice: Vec<_> = for ... in ... {
  for ... in ... { let item = foo(...)?; yield item; }
}.collect();
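On stable Rust, the closest approximation I can think of is to keep the loops and make the target container generic via Default + Extend. This is only a sketch under that assumption: reverse_into is a hypothetical name, and can_err is copied from the first post so the block is self-contained.

```rust
use std::io::{ErrorKind, Result};

/// Same contract as `can_err` in the first post: Err on even length.
fn can_err(a: &str) -> Result<usize> {
    if a.len() % 2 == 0 {
        return Err(ErrorKind::InvalidInput.into());
    }
    Ok(a.len())
}

/// Hypothetical helper: plain loops with `?`, "yielding" into any container
/// that implements Default and Extend (Vec, BTreeSet, HashMap, ...).
fn reverse_into<C>(input: &[(&str, Vec<&str>)]) -> Result<C>
where
    C: Default + Extend<(usize, usize)>,
{
    let mut out = C::default();
    for (k, values) in input {
        let k = can_err(k)?;
        for v in values {
            let item = (k, can_err(v)?);
            // Stands in for `yield item`.
            out.extend(std::iter::once(item));
        }
    }
    Ok(out)
}
```

The caller picks the container through a type annotation, for example `let pairs: Vec<(usize, usize)> = reverse_into(&input)?;`.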

But anyway, thanks!

Ah, thanks. I suppose I did not try hard enough with the itertools helpers. That makes sense; I understand the process_results API better now.
