Avoid unwrap() (or expect()) in closure

So I have the following function:

fn expand_range(input: &str) -> Vec<usize> {
    let vector: Vec<usize> = input
        .split(&['-', ':'][..])
        .map(|x| x.parse().expect("Parsing of String into usize failed!"))
        .collect();
    (vector[0]..=vector[1]).collect()
}

Its purpose is to take a string like 1-5 or 3:5 and return the full range of numbers. The most straightforward way I could come up with was this iterator, however I couldn't figure out how to propagate the error in the map closure properly. Ideally I would want the function to return a result so I can use the ? operator. I know there are a couple of questions like this floating around but I wasn't able to make it work.
One reason probably is that I'm still struggling to understand error handling in Rust.
I have a couple of such cases and I would like to improve my code by doing this in a more idiomatic way.

What should the function return on input "1-5-foo" and "5-1"?

The way to turn .map(something.unwrap()).collect() into a construct that returns Result and doesn’t panic is by using the fact that .collect() supports Result:

So the straightforward translation of your code is:

fn expand_range(input: &str) -> Result<Vec<usize>, std::num::ParseIntError> {
    let vector: Vec<usize> = input
        .split(&['-', ':'][..])
        .map(|x| x.parse())
        .collect::<Result<_, _>>()?;
    Ok((vector[0]..=vector[1]).collect())
}

You might also want to make sure, somehow, that the vector doesn’t contain more than two numbers or the numbers aren’t decreasing, that’s probably why @Hyeonu asked some questions for clarification. In any case, there are ways to avoid allocation for the intermediate Vec since you’re only using two of its elements anyways; also you could consider returning the RangeInclusive iterator directly, avoiding the second vector, but returning “only” an iterator — whether this is useful depends on your use cases (i.e. whether you always need a vector anyways or an iterator would be okay, too).

2 Likes

Honestly, the function just assumes that it's never going to be used that way. But you're right, I need to include some logic for such cases.
Other than that: thanks, this solves my issue with this function.
Also: which ways for avoiding allocations do you mean? The intermediate vec only contains two elements (this function is only called in such a way to make sure this is the case). Returning the iterator would of course be possible but my program works with passing around vecs of usize. If I change this behaviour, I will come back to this.

I wanted to try myself at doing this “properly” (which is somewhat subjective, I guess, and might certainly be overkill anyways).

Here’s the result:

use thiserror::Error;

#[derive(Debug, Error)]
pub enum ExpandRangeError {
    #[error("error while parsing range bound: {0}")]
    ParseIntError(#[from] std::num::ParseIntError),
    #[error("too few or too many bounds (expected exactly 2)")]
    UnexpectedLength,
    #[error("lower bound is larger than upper bound")]
    DecreasingBounds,
}

use itertools::Itertools;
use std::ops::RangeInclusive;

pub fn expand_range(input: &str) -> Result<RangeInclusive<usize>, ExpandRangeError> {
    use ExpandRangeError::*;
    let (left_result, right_result) = input
        .split(&['-', ':'][..])
        .map(|x| x.parse())
        .collect_tuple()
        .ok_or(UnexpectedLength)?;
    let (left, right) = (left_result?, right_result?);
    if left > right {
        Err(DecreasingBounds)?
    }
    Ok(left..=right)
}
2 Likes

A follow-up question then:
If collecting the result is not an option like in this case:

  let input_vec: Vec<String> = input.split(',').map(|x| x.to_lowercase()).collect();
                output_vec = pdb
                    .residues()
                    .filter(|x| input_vec.contains(&x.name().unwrap().to_lowercase()))
                    .map(|x| x.serial_number())
                    .collect();

What to do then? The name method returns an Option and the value inside it needs to be converted. I tried making it so the closure returns a result but I only ever got deeper into compiler errors that I didn't quite understand. :frowning:

Would you mind giving me some more complete example code? It’s easier to do this with the help of the compiler and without having to re-invent all kinds of stub implementation/initialization :wink:

Without testing the code, something like

let input_vec: Vec<String> = input.split(',').map(|x| x.to_lowercase()).collect();
output_vec = pdb
    .residues()
    .filter_map(|x| {
        input_vec
            .contains(&x.name()?.to_lowercase())
            .then(|| x.serial_number())
    })
    .collect::<Result<_, _>>()?;

should work.

Edit: Wait.. that’s not quite right...

Maybe

output_vec = pdb
    .residues()
    .map(|x| {
        Ok(input_vec
            .contains(&x.name()?.to_lowercase())
            .then(|| x.serial_number()))
    })
    .filter_map(Result::transpose)
    .collect::<Result<_, _>>()?;

Ok sorry, didn't think of that :-/
Not quite sure how best to do this since the pdb variable is a PDB type from the pdbtbx crate that can only be generated from a file.
If it helps, the code is hosted here on ll. 436.

I tried your solution but says: cannot infer type for type parameter 'E' declared on the enum Result

More complete error messages would also be more helpful, i.e. the relevant section of cargo check output.

error[E0282]: type annotations needed
   --> src/functions.rs:427:25
    |
427 |                         Ok(input_vec
    |                         ^^ cannot infer type for type parameter `E` declared on the enum `Result`
    |
help: consider specifying the type arguments in the method call
    |
426 |                     .map::<B, F>(|x| {
    |                         ^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0282`.
error: could not compile `pdbman`

To learn more, run the command again with --verbose.


error[E0107]: wrong number of type arguments: expected 2, found 1
   --> src/functions.rs:426:22
    |
426 |                     .map::<Result<_, _>>(|x| {
    |                      ^^^ expected 2 type arguments

error: aborting due to previous error

For more information about this error, try `rustc --explain E0107`.
error: could not compile `pdbman`

To learn more, run the command again with --verbose.


error[E0282]: type annotations needed
   --> src/functions.rs:421:25
    |
421 |                         Ok(input_vec
    |                         ^^ cannot infer type for type parameter `E` declared on the enum `Result`
    |
help: consider specifying the type arguments in the method call
    |
420 |                     .map::<B, F>(|x| {
    |                         ^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0282`.
error: could not compile `pdbman`

What kind of type does name() return?

Edit: Presumably an option, in this case, you’d probably need to use name().ok_or("No Residue Name")? here too in case you didn’t already.'

And there’s still ambiguity I guess becaues both ?’s – the one in the map and the one after the collect can convert things… so maybe add a type annotation and just use your Result type synonym.

output_vec = pdb
    .residues()
    .map(|x| -> Result<_> {
        Ok(input_vec
            .contains(&x.name().ok_or("No Residue Name")?.to_lowercase())
            .then(|| x.serial_number()))
    })
    .filter_map(Result::transpose)
    .collect::<Result<_, _>>()?;

Edit2: Maybe the collect doesn’t even need the extra type argument anymore now.

Yes, it's an option and yes it needs to be converted to a result like you described. I did this in other places but not in this function because I still try to wrap my head around making it work.
Regardless of the presence of both ?s the compiler requests type annotations. However, I couldn't figure out what exactly to put there.

Still needs type annotation :slight_smile:

Ah, of course it does because of the different FromIterator implementations... well, if you write .collect::<Result<_>> using your type synonym, maybe the map closure doesn’t need an explicit return type anymore. Anyways, it’s always easier to add as much as possible until it compiles successfully and then possibly remove a few type annotations again. Or just leave in the redundancy.

I don't have a type synonym for this, so far I have just been putting all my error in boxes and figured I'd worry about that later. But indeed using this compiles:

output_vec = pdb
    .residues()
    .map(|x| {
        Ok(input_vec
            .contains(&x.name().ok_or("No Residue Name")?.to_lowercase())
            .then(|| x.serial_number()))
    })
    .filter_map(Result::transpose)
    .collect::<Result<Vec<isize>, Box<dyn Error>>>()?;
1 Like

Thanks for your help! Now I only need to understand what exactly this does :smiley:

I was referring to the type Result<T> = std::result::Result<T, Box<dyn Error>>;

Oh this. I took this out again because for some reason it broke one of my functions.

  • residues gives Iterator<Item = &Residue> (I think…)
  • map applies a (&Residue) -> Result<Option<isize>, Box<dyn Error>>> closure and returns an Iterator<Item = Result<Option<isize>>>
    • inside of the closure, the ? operator can be used on Result values whose error types are convertible to Box<dyn Error>
      • so, x.name().ok_or("No Residue Name") is a Result<isize, &'static str> and x.name().ok_or("No Residue Name")? is the contained isize where in the error case, the &'static str is converted to Box<dyn Error> and returned early
    • the expression input_vec.contains(&x.name().ok_or("No Residue Name")?.to_lowercase()) is a bool. The code uses the fairly new bool::then method which works with a closure such that true.then(f) evaluates to Some(f()) and false.then(f) evaluates to None. So the …….then(|| x.serial_number()) call evaluates to an Option<isize>. The whole thing is wrapped with Ok to give the Result<Option<isize>, Box<dyn Error>>.
  • filter_map expects an (OldItemType) -> Option<NewItemType> closure/function. The item type from the map was Result<Option<isize>, Box<dyn Error>>. Result::transpose converts Result<Option<T>, E> into Option<Result<T, E>>, so the result of filter_map is an Iterator<Item = Result<isize, Box<dyn Error>>, and the None cases are filtered out.
  • finally the collect implementation for Result is used as before, to collect the Iterator<Item = Result<isize, Box<dyn Error>> into a Result<Vec<isize>, Box<dyn Error>>, where in the error case, the collect operation is aborted and returns the first error it encounters.

By the way, the collect implementation uses something very similar to Itertools::process_results, which can be another useful tool for getting rid of unwrap in complicated settings with iterator combinators in certain situations.