Get all files and folders matching a filter

Hi all,

I'm new to Rust and am trying to convert an existing C# application to Rust.
One of the things I need to do is fetch all the files and directories in a folder and return the ones whose names do NOT end with something specific.
I decided to create a function to which I can pass a closure that defines the filter, so that it's reusable.
I'm looking for feedback on the implementation.
Thanks in advance.

use std::fs;
use std::io;
use std::path;

type PathFilter = fn(&path::PathBuf) -> bool;
type FilteredDirResults = Result<Vec<path::PathBuf>, io::Error>;

pub fn read_dir_filtered(dir: &str, filter: PathFilter) -> FilteredDirResults {
    match std::fs::read_dir(dir) {
        Ok(reader) => match get_all_that_matches_filter(reader, filter) {
            Ok(res) => Ok(res),
            Err(err) => Err(err),
        },
        Err(err) => Err(err),
    }
}

fn get_all_that_matches_filter(mut reader: fs::ReadDir, filter: PathFilter) -> FilteredDirResults {
    let mut res: Vec<std::path::PathBuf> = Vec::new();

    while let Some(entry) = reader.next() {
        let path = entry.unwrap().path();

        if matches(&path, filter) {
            res.push(path);
        }
    }

    Ok(res)
}

fn matches(input: &path::PathBuf, filter: PathFilter) -> bool {
    filter(input)
}

Here are my comments in the order I noticed them.

  • I would not hide a Result behind a type alias in that manner. I would write out the return type of those methods as io::Result<Vec<PathBuf>>, with a full import of PathBuf.
  • Using generics or impl Fn rather than a function pointer will make it easier for the compiler to optimize it.
  • We would typically use &Path over &PathBuf as argument.
  • You .unwrap() your errors, yet you return a Result. The get_all_that_matches_filter function can in fact never return an error: any failure while reading an entry becomes a panic instead.
  • We generally import our types fully rather than using a fully qualified path, except maybe if you use it only once.
  • You don't need a type annotation on the vector.
  • You can use the question mark operator to vastly simplify your match statements.
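
Taken together, the points above might lead to something like the following. This is only a sketch of one possible rewrite, not the only reasonable shape: it keeps the original `dir: &str` parameter, folds the two helper functions into one (my own choice, not something the list asks for), and takes the filter as `impl Fn(&Path) -> bool`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the paths of all entries in `dir` for which `filter` returns true.
pub fn read_dir_filtered(dir: &str, filter: impl Fn(&Path) -> bool) -> io::Result<Vec<PathBuf>> {
    // No type annotation needed: the element type is inferred from `push`.
    let mut res = Vec::new();

    for entry in fs::read_dir(dir)? {
        // `?` hands a failed entry back to the caller instead of panicking.
        let path = entry?.path();

        if filter(&path) {
            res.push(path);
        }
    }

    Ok(res)
}
```

A caller could then pass any closure, for example `read_dir_filtered(".", |p| !p.to_string_lossy().ends_with(".bak"))`, and handle the returned `io::Result` as usual.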

Obligatory mention of

You can use ? to help simplify things, such as

pub fn read_dir_filtered(dir: &str, filter: PathFilter) -> FilteredDirResults {
    let reader = std::fs::read_dir(dir)?;
    let res = get_all_that_matches_filter(reader, filter)?;
    Ok(res)
}

Thanks for the remarks.
Could you explain a bit more on the Generic stuff?

Using generics you can write your methods in any one of the following ways:

fn matches(input: &path::PathBuf, filter: impl Fn(&path::PathBuf) -> bool) -> bool {
    filter(input)
}
fn matches<F: Fn(&path::PathBuf) -> bool>(input: &path::PathBuf, filter: F) -> bool {
    filter(input)
}
fn matches<F>(input: &path::PathBuf, filter: F) -> bool
where
    F: Fn(&path::PathBuf) -> bool,
{
    filter(input)
}

All three examples above are different ways of writing the same thing. They are not quite equivalent to your use of PathFilter, because generics are monomorphized. If you call matches with several different filters, the compiler generates a separate copy of matches for each distinct filter type, with that filter hard-coded in the resulting assembly. With PathFilter there is only one copy of matches, and the filter arrives as a function pointer at run time, so it generally cannot be hard-coded in the same way.
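
One practical consequence, beyond optimization, is that a plain function pointer only accepts functions and non-capturing closures, while the generic form also accepts closures that capture variables. A small sketch of the difference follows; the names `matches_ptr`, `matches_generic`, `is_backup` and `example` are made up for illustration, and `&Path` is used instead of `&PathBuf` as suggested earlier in the thread.

```rust
use std::path::Path;

// Function-pointer version: a single compiled copy, called through a pointer.
fn matches_ptr(input: &Path, filter: fn(&Path) -> bool) -> bool {
    filter(input)
}

// Generic version: the compiler emits one copy per distinct filter type.
fn matches_generic(input: &Path, filter: impl Fn(&Path) -> bool) -> bool {
    filter(input)
}

// A plain function works with both versions.
fn is_backup(path: &Path) -> bool {
    path.to_string_lossy().ends_with(".bak")
}

fn example() -> (bool, bool) {
    let path = Path::new("notes.bak");

    // This closure captures `suffix`, so it has no `fn` pointer form.
    let suffix = String::from(".bak");
    let ends_with_suffix = |p: &Path| p.to_string_lossy().ends_with(suffix.as_str());

    let via_ptr = matches_ptr(path, is_backup);
    let via_generic = matches_generic(path, ends_with_suffix);
    // matches_ptr(path, ends_with_suffix) would be rejected by the compiler,
    // because a capturing closure cannot be converted to a function pointer.

    (via_ptr, via_generic)
}
```

Here `example()` evaluates both calls on the same path, so the two results agree; only the way the filter is passed differs.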

Thanks for mentioning the crate.
But since I'm new to Rust, I believe that writing certain utility functions myself is better for learning :)

You’re calling reader.next() directly. If only to practice using some of the more functional sides of Rust, you could try replacing the while loop by converting reader into a struct that implements Iterator, and then using the filter method from the std::iter::Iterator trait.

Seems like an interesting idea, but it looks like the ReadDir struct already implements the Iterator trait. So what's the point of wrapping it in a struct and implementing it again?
A short example of how to make this work would be highly appreciated :)

When you call Iterator's filter() method it'll take one iterator and give you back a new one which only yields items satisfying your predicate. Kinda like how the Where() method in LINQ returns a new IEnumerable<T> instead of modifying the original.
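
Since ReadDir already implements Iterator, no wrapper struct should be needed to call filter() on it. Below is a minimal sketch of what that could look like. The parameter name `predicate` is my own, and keeping I/O errors in the stream so that collect() can report the first one is just one possible way to handle them.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the paths of all entries in `dir` for which `predicate` returns true.
pub fn read_dir_filtered(dir: &str, predicate: impl Fn(&Path) -> bool) -> io::Result<Vec<PathBuf>> {
    fs::read_dir(dir)?
        // Turn each io::Result<DirEntry> into an io::Result<PathBuf>.
        .map(|entry| entry.map(|e| e.path()))
        // filter() yields a new iterator; errors are kept so the caller sees them.
        .filter(|res| match res {
            Ok(path) => predicate(path),
            Err(_) => true,
        })
        // Collecting into io::Result<Vec<_>> stops at the first error.
        .collect()
}
```

As with Where() in LINQ, nothing is read from the directory until the chain is consumed, which here happens in collect().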