Fluent approach and unwrap


#1

Coming from other languages, trying to create a data structure containing the names of all the plain files in a directory, I find myself with:

let files: Vec<String> = std::fs::read_dir(path)?
        .map(|it| it.unwrap())
        .filter(|it| it.file_type().unwrap().is_file())
        .map(|it| it.file_name().into_string().unwrap())
        .collect();

Looking at this, I get the strong feeling this really isn’t the Rust way of doing this.

Does anyone have a moment to give me some indicators as to what a more idiomatic way of writing this would be?


#2

There are two common strategies to reduce use of unwrapping in Rust code:

  1. Keep things in Options and Results as long as possible.
  2. Propagate the remaining errors to the outer scope, unless they reflect a bug (in which case unwrapping is fine, but you might prefer the more self-documenting expect())

In trying to do so, you will find that you need to refine your interface contract and figure out your error handling strategy. For example, in your first filter(), which selects files, what do you want to do when directory enumeration failed for a specific file?

  • Is this an unexpected scenario and should the entire program bomb (current behaviour, can be improved by adding an error message with expect())?
  • Do you only want to report readable files, and should this specific file be thus silently ignored?
  • Should the program proceed, but report that an error occurred for some files? In this case, should it report it on a per-file basis, as a counter of files which failed, or as a mere binary failed/succeeded statement?
  • Do you want to be specific about the kind of error(s) that occured, or is it enough to say that something went wrong?

Answer this question, and you will know how to replace your first map(|it| it.unwrap()).

The question appears again and again later on. Each time you carry out IO, it can fail for various reasons (missing permissions being the most common one). You need to figure out what your strategy for handling such failure is.


#3

There’s a problem with Rust Iterator APIs in that it’s impossible to go from Iterator<Item=Result<T, E>> to Result<Iterator<Item=T>, E>. This is understandable, because you need to know upfront, if you’ll get Ok or Err, but to know this, you must traverse the whole iterator. This is in contrast to Haskell, where swapping [] and Either is easy (but forces the evaluation of the whole list).

In this example, you hit this problem twice: with it.file_type().unwrap and with it.unwrap(). For this situation, I would probably just write a boring for loop then :slight_smile:

fn file_names(path: &Path) -> Result<Vec<String>, io::Error> {
    let mut result = Vec::new();
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        if !entry.file_type()?.is_file() {
            continue
        }
        // Ignore non-unicode names
        if let Ok(name) = entry.file_name().into_string() {
            result.push(name)
        }
    }

    Ok(result)
}

It is however possible to get similar result purely with iterators, but it is somewhat more complex:

fn file_names(path: &Path) -> Result<Vec<String>, io::Error> {
    fs::read_dir(path)?
        .filter_map(|entry| match entry {
            Err(e) => Some(Err(e)),
            Ok(entry) => match entry.file_type() {
                Err(e) => Some(Err(e)),
                Ok(t) if t.is_file() => entry.file_name().into_string().ok().map(Ok),
                Ok(_) => None
            }
        })
        .collect()
}

I wonder if we should have a FailiableIterator in stdlib, or at least some extensions methods for Iterator<Item=Result>

EDIT: yeah, the last version works because it is possible to go from Iterator<Item=Result<T, E>> to Result<Vec<T>, E>.


#4

I see where you are heading with this, and appreciate the feedback. It seems then that Go and Rust end up being in the same camp, but Go with return codes and Rust with Option and Result (and indeed Java checked exceptions): you have to handle the error at the point of calling, whereas in Groovy, D, etc. you just let exception propagation handle things.


#5

Thanks for the example codes, much appreciated. I have to admit I am not a great fan of creating an empty container and then appending values – indoctrination by Python comprehensions perhaps. I think the second code is very much in line with the thinking from @HadrienG Deal with the Option or Result completely at the call site even at the expense of more code.

Thanks to both for prompt replies, I shall amend my code!


#6

Note that recently, the Rust community went through a lot of effort in order to make error handling based on return values feel more like exception propagation, which is a sort of convergence between the two approaches:

  • The ? syntax which you use, for example, was added to shorten the common “early exit on error” usage pattern.
  • The error-chain crate was built to ease the composition of Result (more specifically error) types from different crates, allowing things like ? to be used even when returning errors of different types. It also has some nice other features such as printing error backtraces, and is in general the recommended way to handle errors in large Rust programs.

#7

Wont’t

let files: Vec<String> = std::fs::read_dir(path)?
        .map(|it| it?.file_type()?)
        .filter(|it| it.is_file())
        .map(|it| it.file_name().into_string()?))
        .collect();

work? That looks somewhat nice to me.

(e) Ah yeah, as HadrienG mentions, you’d need to set up error handling properly.


#8

collect does this, in a similar way: it evaluates the list.

Specifically, collect on Iterator<Item=Result<T, E>> gives a Result<Collection<T>, E>.


#9

As others mentioned it all depends on what error handling granularity you want. Here is a quick and dirty version that reports problem with read_dir and discards all problematic entries (admittedly its ugly and nonidiomatic as hell as I’m on mobile)

link to playground

use std::fs;
use std::io;

fn run() -> io::Result<Vec<String>> {

    let files: Vec<_> = fs::read_dir("/etc")?
        .map(|it| -> io::Result<(fs::DirEntry, fs::Metadata)> {
            let it = it?;
            let meta = it.metadata()?;
            Ok((it, meta))
        })
        .filter_map(|tup| tup.ok())
        .filter(|tup| tup.1.is_file())
        .filter_map(|(it, _)| it.file_name().into_string().ok())
        .collect();

    Ok(files)
}

fn main() {
    println!("{:?}", run().unwrap());
}

#10

Looks like it is indeed possible to cook an extension trait to work with Iterator<Item=Result<_, _>> pleasantly. The end result looks like this:

use result_iterator::ResultIterator;

use std::fs::{self, DirEntry};
use std::io;
use std::path::Path;

fn file_names(path: &Path) -> Result<Vec<String>, io::Error> {
    let result = fs::read_dir(path)?
        .map_then(|entry| {
            let is_file = entry.file_type()?.is_file();
            Ok((entry, is_file))
        })
        .filter_ok(|&(_, is_file)| is_file)
        .map_ok(|(entry, _)| entry.file_name().into_string().unwrap())
        .collect::<Result<Vec<_>, _>>()?;
    Ok(result)
}

The “library” code for ResultIterator is here: https://gist.github.com/matklad/9ccac802025f80d62bf0225f484ef1f0.

Does anyone know some crate which already has ResultIterator-like extensions? :slight_smile:


#11

In a similar vein, Itertools has map_results, fold_results, fold_options, and while_some.


#12

Wow, thanks! I didn’t expect to find these methods on the Itertools trait, because the trait itself can’t mention result, but of course you can add bounds on Self in methods!