Using getopts: a request for code review


#1

Hey all. Curious as to which of these two styles you would use. I’m working on a little clone of the wc utility, and it has a number of options. There are two basic strategies: I can write the program like:

  1. Option handling
  2. Do the work
  3. print the results

Or, I can do

  1. do the work
  2. print the results based on option handling.

Style 1 looks like this:

    let mut print_lines = false;
    let mut print_words = false;
    let mut print_bytes = false;
    let mut print_chars = false;
    let mut print_max_line_length = false;

    let mut opts = Options::new();
    opts.optflag("l", "lines", "print the newline counts");
    opts.optflag("w", "words", "print the word counts");
    opts.optflag("c", "bytes", "print the byte counts");
    opts.optflag("m", "chars", "print the character counts");
    opts.optflag("L", "max-line-length", "print the length of the longest line");
    opts.optflag("h", "help", "display this help and exit");
    opts.optflag("v", "version", "output version information and exit");
    let matches = match opts.parse(&args[1..]) {
        Ok(m) => { m }
        Err(f) => { panic!(f.to_string()) }
    };
    if matches.opt_present("l") {
        print_lines = true;
    }
    if matches.opt_present("w") {
        print_words = true;
    }
    if matches.opt_present("c") {
        print_bytes = true;
    }
    if matches.opt_present("m") {
        print_chars = true;
    }
    if matches.opt_present("L") {
        print_max_line_length = true;
    }
    if matches.opt_present("h") {
        print_usage(&program, opts);
        return;

// do work

    if print_lines {
        print!("{}", lines);
    }

    if print_words {
        print!("{}", words)
    }

    if print_bytes {
        print!("{}", bytes)
    }

    if print_chars {
        print!("{}", chars)
    }

    if print_max_line_length {
        print!("{}", max_line_length)
    }

    println!(" {}", filename);

But I am always skeptical of ifs that just set booleans. I’m considering moving this to

   let mut opts = Options::new();

    opts.optflag("l", "lines", "print the newline counts");
    opts.optflag("w", "words", "print the word counts");
    opts.optflag("c", "bytes", "print the byte counts");
    opts.optflag("m", "chars", "print the character counts");
    opts.optflag("L", "max-line-length", "print the length of the longest line");
    opts.optflag("h", "help", "display this help and exit");
    opts.optflag("v", "version", "output version information and exit");

    let matches = match opts.parse(&args[1..]) {
        Ok(m) => { m }
        Err(f) => { panic!(f.to_string()) }
    };

// do work

    if matches.opt_present("l") {
        print!("{}", lines);
    }

    if matches.opt_present("w") {
        print!("{}", words)
    }

    if matches.opt_present("c") {
        print!("{}", bytes)
    }

    if matches.opt_present("m") {
        print!("{}", chars)
    }

    if matches.opt_present("L") {
        print!("{}", max_line_length)
    }

    println!(" {}", filename);

which seems like the code is cleaner, but the steps are a bit more intertwined. What do you think? Here’s the full thing, original:

and new:


#2

First off, I would write your if statements as

  let print_words =  matches.opt_present("w");

rather than

if matches.opt_present("w") {
    print_words = true;
}

since I agree with you that an if statement that just sets a boolean is silly. Also note that the flag is no longer mut, and cannot accidentally be accessed before it has been set.

I would parse options first, since you may want to introduce an expensive thing to compute that you don’t want to bother with unless it is requested. e.g. count prime numbers. Yes, that’s a silly example, but perhaps counting non-whitespace non-content lines in a specified language would be less silly.


#3

First style is only half-way to a proper separation of concerns, when you have something like

struct Options {
  print_lines: bool,
  ...
}

fn parse_arguments(cmd: &[str]) -> Result<Options> {
}

But for this task this is an overkill, so in my opinion shorter solution wins.

As a minor notice, maybe using longer options for querying matches in the second case would help readability:

    if matches.opt_present("max-line-length") {
        print!("{}", max_line_length)
    }

#4

Do you have to use getopts? I like argparse better: https://github.com/tailhook/rust-argparse.


#5

Ah this is nice! The compiler will already yell about early access, but still, you’re right that it’s better, for sure.

As it turns out, it’s fastest to always calculate all of the things, basically, or rather, it might be absolutely fastest to only calculate the specific permutations, but you can’t get super fast, since with bytes you need to iterate per byte anyway. I guess. It’s not trying to be as absolutely fast as possible, so it’s not a huge deal. Thanks :smile:

Yeah, this is a good point, I think. It might be nice because it enables try! usage, but I’m not sure that’s not over-engineering in this simple example. Nice point about the longer options

This will eventually be in the book, and we only use official distribution crates, so yes.


#6

But…

[master +2 ~0 -0 !] > cargo script -e "extern crate getopts; ()"
   Compiling expr v0.1.0 (file:///C:/Users/drk/AppData/Local/Cargo/script-cache/expr-d97dbaf6695207a8)
C:\Users\drk\AppData\Local\Cargo\script-cache\expr-d97dbaf6695207a8\expr.rs:5:2: 5:23 error: use of unstable library feature 'rustc_private': use the crates.io `getopts` library instead (see issue #27812)
...

Edit: Wait, you’re extending “official distribution” to include crates.io?

That feels a bit weak when you have to go through Cargo. That rather implies that the compiler ships with a very incomplete set of “standard” libraries. I know that the assumption is that people use Cargo, but implying that rustc alone isn’t even a functional “official distribution” seems deeply weird.

Besides, once you’re going through Cargo… well, why exclude other crates? Given that once a crate is published, it’s published for good (unless yanked, but that could happen to official crates, too), what’s wrong with having the book reflect the best available library/libraries?

To put it another way, is this policy going to preclude the book from having a section on shared stack parallel processing? I’d hope not, because that is seriously useful, not to mention something that got a lot of air time prior to (and even after) 1.0.


#7

Which one(s)? I think there are no fewer than 4 argument parsing libraries in active use. Exactly one of those is an officially supported crate. That makes it an easy choice for official documentation. :slight_smile:


#8

If it’s a case of “there is no clear winner”, that’s fine. It’s just the possibility of better crates being ignored because they’re not “official” that feels a little worrying.

Also, is getopts really “officially supported”? I mean, I got the impression that time is kind of half-deprecated now, despite being “officially supported” by fiat of being owned by the core devs.

I suppose it comes down to: is there any guaranteed, practical difference (beyond the owner) that distinguishes these crates from the rest of crates.io?

To be clear, I’m not even really sure there’s any sort of complaint here; more a vague feeling of unease. :slight_smile: