Idiomatic way to avoid match

How to write it in "idiomatic" rust?

pub fn file_info_size(file: &str) -> Result<u64, std::io::Error> {
    let metadata = std::fs::metadata(file)?;
    Ok(metadata.len())
}
pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    match file_info_size(path_a) {
        Ok(size_a) => match file_info_size(path_b) {
            Ok(size_b) => match size_a.partial_cmp(&size_b) {
                Some(val) => val,
                None => Ordering::Equal,
            },
            Err(e) => Ordering::Equal,
        },
        Err(e) => Ordering::Equal,
    }
}
3 Likes

I’m not sure which one is the best approach, but here’s a few…

Using unstable try_blocks

#![feature(try_blocks)]

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    Option::unwrap_or(
        try {
            file_info_size(path_a)
                .ok()?
                .partial_cmp(&file_info_size(path_b).ok()?)?
        },
        Ordering::Equal,
    )
}

and similarly the stable workaround for lack of try blocks, by defining and immediately calling a closure to make a scope for ? syntax:

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    (|| {
        file_info_size(path_a)
            .ok()?
            .partial_cmp(&file_info_size(path_b).ok()?)
    })()
    .unwrap_or(Ordering::Equal)
}

Using and_then a few times:

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    file_info_size(path_a)
        .ok()
        .and_then(|size_a| {
            file_info_size(path_b)
                .ok()
                .and_then(|size_b| size_a.partial_cmp(&size_b))
        })
        .unwrap_or(Ordering::Equal)
}

Using unstable let_chains

#![feature(let_chains)]

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    if let Ok(size_a) = file_info_size(path_a)
    && let Ok(size_b) = file_info_size(path_b)
    && let Some(val) = size_a.partial_cmp(&size_b) {
        val
    } else {
        Ordering::Equal
    }
}

or the stable, but slightly more repetitive (w.r.t. the else case) let … else syntax:

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    let Ok(size_a) = file_info_size(path_a) else { return Ordering::Equal };
    let Ok(size_b) = file_info_size(path_b) else { return Ordering::Equal };
    let Some(val) = size_a.partial_cmp(&size_b) else { return Ordering::Equal };
    val
}

After writing this, I kind-of find the last two the most readable; slightly preferring let_chains, but they’re unstable, so then: let … else it is?


Edit: Here’s another fun approach I came up with

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    'block: {
        let Ok(size_a) = file_info_size(path_a) else { break 'block None };
        let Ok(size_b) = file_info_size(path_b) else { break 'block None };
        size_a.partial_cmp(&size_b)
    }
    .unwrap_or(Ordering::Equal)
}

Also, the closure approach becomes more nicely readable if the intermediate values are named:

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    (|| {
        let size_a = file_info_size(path_a).ok()?;
        let size_b = file_info_size(path_b).ok()?;
        size_a.partial_cmp(&size_b)
    })()
    .unwrap_or(Ordering::Equal)
}

as does the try blocks approach…

#![feature(try_blocks)]

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    Option::unwrap_or(
        try {
            let size_a = file_info_size(path_a).ok()?;
            let size_b = file_info_size(path_b).ok()?;
            size_a.partial_cmp(&size_b)?
        },
        Ordering::Equal,
    )
}

The original versions of those two also suffer from rustfmt formatting and can probably be made slightly more readable by re-formatting alone; e.g. in each case, the first .ok()? should IMO definitely not go in its own line. And the and_then, too, is severely hindered by rustfmt… using a more Haskell-manual-monad-style-esque formatting seems nicer to me

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    file_info_size(path_a).ok().and_then( |size_a|
    file_info_size(path_b).ok().and_then( |size_b|
        size_a.partial_cmp(&size_b)
    )).unwrap_or(Ordering::Equal)
}

though I do find rustfmt useful in general, so syntax that plays more nicely with it has some merits.

29 Likes

That is really great answer. Thanks!
I personally like the unstable try_blocks but because them being unstable, my second choice goes to the closure.

1 Like

By the way, in case this code isn’t just a stand-in for something else: u64’s partial_cmp will never return None. One can use cmp instead.

3 Likes

Another idea of unstable features that would help deduplicate even more of the code: array’s try_map method:

#![feature(let_chains)]
#![feature(array_try_map)]

pub fn compare_paths_for_size1(path_a: &str, path_b: &str) -> Ordering {
    if let Ok([size_a, size_b]) = [path_a, path_b].try_map(file_info_size)
    && let Some(val) = size_a.partial_cmp(&size_b) {
        val
    } else {
        Ordering::Equal
    }
}

If you don’t use partial_cmp but cmp, it even becomes only dependent on the array_try_map.

#![feature(array_try_map)]

pub fn compare_paths_for_size1(path_a: &str, path_b: &str) -> Ordering {
    if let Ok([size_a, size_b]) = [path_a, path_b].try_map(file_info_size) {
        size_a.cmp(&size_b)
    } else {
        Ordering::Equal
    }
}
5 Likes

I found that using if let(...) sometimes is less convenient, because I still may need the None case, and is hardly applicable for Result unpacking.

An ugly but working solution, that I saw in a crate: nest matches. The example is made up.

let result = match match match input_var {
    Some(x) => x,
    None => panic!(),
} {
    Ok(VariantOne(y)) => y,
    Ok(VariantTwo(y)) => panic!(), // or return, or continue
    Err(_) => MakeSomethingUp(123),
} {
    VariantX(i) if i > 10 => do_a(),
    VarinatX(i) if i < 10 => do_b()
};

I found it useful for complicated unpacking where I needed to contitue or return in some earlier unpacking branches, and splitting it in multiple match clauses reduced repetition.

2 Likes

TBH, I'd just do this:

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Result<Ordering, std::io::Error> {
    let size_a = file_info_size(path_a)?;
    let size_b = file_info_size(path_b)?;
    Ok(u64::cmp(&size_a, &size_b))
}

People can always .unwrap_or(Ordering::Equal) if they want, but one might as well tell the caller if you couldn't compare things for some reason.

23 Likes

Without avoiding match but matching on a tuple directly:

fn foo(path_a: &str, path_b: &str) -> Ordering {
    match (file_info_size(path_a), file_info_size(path_b)) {
        (Ok(size_a), Ok(size_b)) => len_a.cmp(&len_b), // u64 is Ord not only PartialOrd
        _ => Ordering::Equal,
    }
}

Though I prefer @scottmcm solution.

7 Likes

Phew, thought I was losing my mind. The OP has a simple question about a simple task and there comes a dozen solutions that make it unfathomable, employing all manner of obscure techniques.

Actually I thought the OP's code was suffering from unreadability as well. To many nesting levels for the sake of only having a single exit point. That would look messy even in C. Better to have early returns. See below.

How about a step further. Get rid of the redundant file_size_info() function:

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Result<Ordering, std::io::Error> {
    let size_a = std::fs::metadata(path_a)?.len();
    let size_b = std::fs::metadata(path_b)?.len();
    Ok(size_a.cmp(&size_b))
}

I would argue that returning the error in this case is a perfectly idiomatic way of avoiding the match.

I wonder about the motivation of the OP's code. It returns Ordering::Equal even if one file is missing!

9 Likes

Anyhow, now that we have cleared away all the clutter, made the function readable and reduced it to some kind of minimum why not add some nice error checking, so we know which file fails to be read for example:

use anyhow::{Context, Result};
use std::cmp::Ordering;

pub fn compare_file_sizes(path_a: &str, path_b: &str) -> Result<Ordering> {
    let size_a = std::fs::metadata(path_a)
        .with_context(|| format!("Failed to read metadata for {}", path_a))?
        .len();
    let size_b = std::fs::metadata(path_b)
        .with_context(|| format!("Failed to read metadata for {}", path_b))?
        .len();
    Ok(size_a.cmp(&size_b))
}

fn main() -> Result<()> {
    let ordering = compare_file_sizes("path_a", "path_b")?;

    // Do something with file size ordering...

    Ok(())
}

See what I did there?

I'm a little bit late to the party, but would like to note the following.

There are actually two tasks to solve here:

  1. A sequence of instructions should be interrupted the first time a problem occurs.
  2. In case of a problem/interruption a default value should be returned.

Therefore solve the task in two steps:

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    let compare = move || {
        let size_a = file_info_size(path_a).ok()?;
        let size_b = file_info_size(path_b).ok()?;
        size_a.partial_cmp(&size_b)
    };

    compare().unwrap_or(Ordering::Equal)
}

Each Result<T> can be converted to an Option<T> by .ok(). Exiting a sequence of computation is done easily with the ?-operator. Finally .unwrap_or(...) is used to set the default value of the computation in case it exited with a None value early.

1 Like

Thanks, that is really clean. I love it!

Yes, I like the way of employing anyhow for clear error report. TBH, In present state, your solution needs further refactoring, but yes, the direction is one I would choose as well. Thanks.

fn get_file_size(path: &str) -> Result<u64> {
    let size = std::fs::metadata(path)
        .with_context(|| format!("Failed to read metadata for {}", path))?
        .len();

    Ok(size)
}
pub fn compare_file_sizes(path_a: &str, path_b: &str) -> Result<Ordering> {
    let size_a = get_file_size(path_a)?;
    let size_b = get_file_size(path_b)?;

    Ok(size_a.cmp(&size_b))
}
4 Likes

I actually find it unhelpful, for the reason that if you get rid of the file_size_info, you immediately loose readability - you are adding additional burden (however small) on the next programmer to read and deduct what this line of code is doing instead of the immediately obvious, file_size. Also code reuse(in case later on you need the same functionality) is suffering etc.

1 Like

Now that we have the error handling in place I like the refactoring again.

1 Like

I'd like to add a way that I think is quite readable while preserving OP's code's semantic by not avoiding match but to match on both values together. This is quite common in other FP languages that support pattern-matching on complex constructs.

Hope it helps...

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    match (file_info_size(path_a), file_info_size(path_b)) {
        (Ok(sz_a), Ok(sz_b)) => sz_a.partial_cmp(&sz_b).unwrap_or(Ordering::Equal),
        _ => Ordering::Equal,
    }
}
1 Like

Unlike Haskell, Rust isn't lazy, so both calls to file_info_size would be evaluated before the match this way. That's a different behavior from the original program that avoided the second call in case the first one errors.

3 Likes

Good point on the difference between lazy/non-lazy eval. I don't know about OP's exact requirement but this is definitely worth noting.

pub fn file_info_size(file: &str) -> Result<u64, std::io::Error> {
    let metadata = std::fs::metadata(file)?;
    Ok(metadata.len())
}

pub fn compare_paths_for_size(path_a: &str, path_b: &str) -> Ordering {
    file_info_size(path_a)
    .ok()
    .zip(file_info_size(path_b).ok())
    .and_then(|(size_a, size_b)| size_a.partial_cmp(&size_b))
    .unwrap_or(Ordering::Equal)
}
1 Like

I would have suggested similar implementation myself. I'm thinking one should always return result and use the question mark shorthand to mark places that may cause early return.

That way you basically get all the plus sides of C++ exceptions and none of the cons.

If function takes filenamed or files as input, it definitely shouldn't report "files are equal" if one of the files has IO error or is missing. Only caller can know the correct way to handle such errors.