Filtering file names with ends_with


#1

Let’s say I want a list of all *.conf files in the “/etc” directory. The intuitive way to do this is to use read_dir and filter with .path().ends_with(".conf"), but that does not seem to work. Am I using the API wrong, or is this a bug?

rustc 1.27.0-nightly (7360d6dd6 2018-04-15)


use std::fs;

fn main() {
    println!("Hello, world!");
    let files = fs::read_dir("/etc").unwrap();

    files
        .filter(move |n| {
            if let Ok(file) = n {
                println!("{:?}", file.path());
                println!("{:?}", file.path().ends_with("conf"));
                let i = format!("{:?}", file.path());
                println!("{}", i);
                println!("{:?}", i.ends_with("conf\""));
                //file.path().ends_with(".conf")
                //true
                false
            } else {
                false
            }
        })
        .for_each(|f| ());//println!("{:?}", f));
}

#2

I’d recommend a formulation using filter_map to only look at Ok(DirEntry) values, and ignore Err:

files
    .filter_map(Result::ok)
    .filter(|f| f.path().ends_with("conf"))
    .for_each(|f| ());

The issue with your code is that n is a &Result<DirEntry, std::io::Error>: filter() only gives you a reference to each item of the iterator. So if you’re going to pattern match on it, you need to write if let &Ok(ref file) = n { ... }. Alternatively, you can dereference n: if let Ok(ref file) = *n { ... }
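
To make the reference point concrete, here is a minimal sketch that swaps read_dir for a plain Vec of Result values so it runs anywhere. The function name keep_even_ok and the sample data are made up for illustration; only the pattern form comes from the explanation above.

```rust
// `filter` hands the closure a reference to each item, so `n` below has type
// `&Result<i32, String>`, not `Result<i32, String>`.
fn keep_even_ok(items: Vec<Result<i32, String>>) -> Vec<i32> {
    items
        .into_iter()
        .filter(|n| {
            // Explicit form: match through the reference and bind with `ref`.
            if let &Ok(ref value) = n {
                value % 2 == 0
            } else {
                false
            }
        })
        // Drop any remaining Err values and unwrap the Ok ones.
        .filter_map(Result::ok)
        .collect()
}

fn main() {
    let items = vec![Ok(1), Ok(2), Err("boom".to_string()), Ok(4)];
    println!("{:?}", keep_even_ok(items));
}
```

The same closure body also works with if let Ok(ref value) = *n, which dereferences n first instead of putting the & in the pattern.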


#3

Thank you for the .filter_map(Result::ok), I didn’t know you could do such a thing! Regarding the pattern matching, it works with the nightly compiler, and I think it is defined in this RFC: https://github.com/rust-lang/rfcs/blob/master/text/2005-match-ergonomics.md

But even with your solution the resulting list does not contain files with the “.conf” suffix. Do you know why that is?


#4

That’s because Path::ends_with() only matches whole path components, so the argument has to equal the entire file name (i.e., the terminal component of the path). You need str::ends_with().

files
    .filter_map(Result::ok)
    .filter_map(|d| d.path().to_str().and_then(|f| if f.ends_with(".conf") { Some(d) } else { None }))
    .for_each(|_| ());

Caveat: this will skip paths with a non-UTF8 name.

Edit: changed the clumsy map + filter_map to a single filter_map.
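
For anyone who wants to see the difference side by side, here is a small sketch comparing the two methods on a fixed path. The helper names path_ends_with and str_ends_with are hypothetical, chosen only for this example.

```rust
use std::path::Path;

// Component-wise check: true only if the trailing component(s) of `path`
// are equal to `suffix` as a whole.
fn path_ends_with(path: &str, suffix: &str) -> bool {
    Path::new(path).ends_with(suffix)
}

// Character-wise check on the string form of the path.
// Returns false if the path is not valid UTF-8.
fn str_ends_with(path: &str, suffix: &str) -> bool {
    Path::new(path)
        .to_str()
        .map_or(false, |s| s.ends_with(suffix))
}

fn main() {
    let p = "/etc/resolv.conf";
    println!("{}", path_ends_with(p, ".conf")); // false: not a whole component
    println!("{}", path_ends_with(p, "resolv.conf")); // true
    println!("{}", path_ends_with(p, "etc/resolv.conf")); // true
    println!("{}", str_ends_with(p, ".conf")); // true
}
```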


#5

Thank you! I totally missed this, because I didn’t understand the second line in the documentation: Only considers whole path components to match. (https://doc.rust-lang.org/std/path/struct.Path.html#method.ends_with).

Do you think it is a good idea to submit a PR with extended documentation for this method? It could save others from the same confusion by stressing that you cannot match a suffix the way you can with strings.

EDIT: an alternative using the extension() method:

use std::fs;

fn main() {
    println!("Version 1:");
    let files = fs::read_dir("/etc").unwrap();
    files
        .filter_map(Result::ok)
        .filter_map(|d| d.path().to_str().and_then(|f| if f.ends_with(".conf") { Some(d) } else { None }))
        .for_each(|f| println!("{:?}", f));
    
    println!("Version 2:");
    let files = fs::read_dir("/etc").unwrap();
    files
        .filter_map(Result::ok)
        .filter(|d| if let Some(e) = d.path().extension() { e == "conf" } else { false })
        .for_each(|f| println!("{:?}", f));
}
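
One thing worth noting is that the two versions are not strictly equivalent for dot-files. The sketch below isolates the two checks on fixed strings; the function names has_conf_suffix and has_conf_extension are made up for this example, and the sample paths are illustrative only.

```rust
use std::path::Path;

// Version 1 style: suffix test on the string form of the path.
fn has_conf_suffix(path: &str) -> bool {
    path.ends_with(".conf")
}

// Version 2 style: compare whatever Path::extension() reports.
fn has_conf_extension(path: &str) -> bool {
    Path::new(path)
        .extension()
        .map_or(false, |e| e == "conf")
}

fn main() {
    for p in &["/etc/resolv.conf", "/etc/.conf", "/etc/foo.conf.bak", "/etc/hosts"] {
        println!(
            "{}: suffix={} extension={}",
            p,
            has_conf_suffix(p),
            has_conf_extension(p)
        );
    }
}
```

A file named just ".conf" passes the suffix test, but extension() returns None for it, because a file name that starts with a dot and contains no other dot is treated as having no extension.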

#6

FWIW, I found the docs clear enough. (And yes, Path::extension() is good for this.)


#7

Argh - sorry! I was too focused on explaining all the other stuff to even notice.

Yup - this is going to improve. It’s still valuable to know how the current scheme works as you may encounter code that looks like this.


#9

Hi Martin,

Note that you can write your second version a little shorter and more directly by comparing the extension for equality:

use std::fs;
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

fn main() {
    let files = fs::read_dir("/etc").unwrap();
    files.filter_map(Result::ok)
        .filter(|d| d.path().extension() == Some(OsStr::from_bytes(b"conf")))
        .for_each(|f| println!("{:?}", f));
}

#10

Thanks! It is simple and looks much better.


#11

Only works on Unix though.
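
A possible portable variant, as a sketch: OsStr::new accepts a &str on every platform, so the Unix-only OsStrExt import is not needed. The helper names is_conf and conf_files are hypothetical, and returning an io::Result instead of calling unwrap() is a choice made for this example only.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Portable extension check: no platform-specific traits involved.
fn is_conf<P: AsRef<Path>>(path: P) -> bool {
    path.as_ref().extension() == Some(OsStr::new("conf"))
}

// Collect the paths of all `.conf` entries in `dir`,
// silently skipping entries that could not be read.
fn conf_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    Ok(fs::read_dir(dir)?
        .filter_map(Result::ok)
        .map(|entry| entry.path())
        .filter(|path| is_conf(path))
        .collect())
}

fn main() {
    match conf_files(Path::new("/etc")) {
        Ok(files) => files.iter().for_each(|f| println!("{:?}", f)),
        Err(e) => eprintln!("could not read directory: {}", e),
    }
}
```

Since the comparison stays in OsStr, it also avoids the non-UTF-8 caveat of the to_str() approach.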