Imitating CLI args with global static & race condition

I use a CLI wrapper that simply reads args into a vec and lets a macro extract it into vars.

To be able to test this, I have a static global vec (in Mutex, of course, with all necessary ceremonies), and write into it if I want to simulate a call. Turns out, when tests are run in parallel, it experiences race conditions, and tests may randomly fail or not.

I checked if clap has testing capabilities, and found trycmd, but it seems far from what I'm trying to do.

Is there a relatively simple fix, like thread-local context object, or something like that?

Here's a very simplified model of what my code does:

static PARAMS: Mutex<Option<Vec<String>>> = Mutex::new(None);

fn set_params(data: &[&str]) {
    let mut params = PARAMS.lock().unwrap()?;
    *params = Some(data.iter().map(|v| v.to_string()).collect());
}

fn get_params() -> Option<Vec<String>> {
    PARAMS.lock().unwrap().clone()
}

(there's one function that pops an item from the vec, so other tests may fail because 1 parameter was missing)

I don't quite understand what you're trying to achieve.

But you can test clap with try_parse_from.

3 Likes

Trying to describe it as simple as possible, I try to imitate a CLI call. main calls a macro that checks PARAMS, or if empty, reads from std::env::args_os(). In tests (and only in tests), I fill PARAMS with Some(vec![...]).

The issue here is that when 2 tests are running, a race condition happens with PARAMS -- a test writes them, then the other one overrides them with another set of values, then the first test's main gets wrong params and fails.

static PARAMS: Mutex<Option<Vec<String>>> = ...

fn set_params(data: &[&str]) {
    let mut params = PARAMS.lock().unwrap()?;
    *params = Some(data.iter().map(|v| v.to_string()).collect());
}

fn get_params() -> Option<Vec<String>> {
    PARAMS.lock().unwrap().clone()
}

macro_rules! my_parser ($($varname:ident),+) => {
    // read PARAMS into $varname's
    /// if PARAMS is None, reads from std::env::args_os()
}

fn main() {
     my_parser!(var1, var2, var3);
     // do useful work
}

#[cfg(test)]
mod my_binary_tests {
    #[test]
    fn sample_call1() {
         set_params("param1", "param2", "param3");
         main();
    }

    #[test]
    fn sample_call2() {
         set_params("param4", "param5", "param6");
         main();
    }
}

The easiest way to get rid of that race condition is to avoid global state.

Write a run function that takes your arguments and has your main's current body and make main call that.

5 Likes

Yes, it's viable. I actually tried to avoid that main + "real main" function tandem, but maybe that's the only sound way, and I'll try to minimize bureacracy with a macro.

In Python, it's a very ugly common pattern:

if __name__ == '__main__':  # = this file is running as an executable
     main()

and there's a package that was called argh that simplified this into a decorated function:

@argh.dispatch_command
def main(param1, param2, etc...):
    # do useful work

I was trying to achieve this kind of brevity.

This is the usual recommended & accepted way to check if your python file is currently interpreted as a script. Nothing wrong or ugly with that.

I think the solution without global state is less verbose and more legible. You get to see what goes in and what goes out of your program.

There are clap macros crates out there that build a fake main like your argh example. I don't remember their names.

2 Likes

The best solution is not to use globals. They are generally an anti-pattern, and Rust makes them even worse (on purpose, out of prejudice against global mutable state).

You could use thread-local storage. You can also make tests single-threaded:

static MAKE_TESTS_SLOW: Mutex<()> = Mutex::new(());

#[test]
fn test() {
    let _lock = MAKE_TESTS_SLOW.lock().unwrap();
}
4 Likes

I've followed the advice:

  1. separated mains into functions that take arguments
  2. split my CLI parsing macros into 2 layers: 1 that is raw and parses std::env::args_os, 2 that processes the vector of arguments.
  3. removed the global static var

Now question is how do I write tests for the layer that parses args_os? (I mean to mae a check that it wasn't casually changed?

I'm unaware of any packages that swap env::args_os in some local context or thread. Otherwise, it's again a global var.

fn parse_args_os() -> Vec<String> {
    std::env::args_os().skip(1).map(|s| s.to_str().unwrap().to_owned()).collect::<Vec<String>>()
}

macro_rules! autocli {
	($($arg_name:ident),+, $([$($opt_name:ident),+],)? $help_text:literal) => {
		let mut args: Vec<String> = parse_args_os();
        $crate::autocli_inner!(args, $($arg_name),+, $([$($opt_name),+],)? $help_text)
	};
}

macro_rules! autocli_inner {
    ($args:ident, $($arg_name:ident),+, $([$($opt_name:ident),+],)? $help_text:literal) => {
        let mut opts: std::collections::HashSet<String> = std::collections::HashSet::new();
        // process args (Vec<String>)
    }
}

fn main() {
    autocli!(param1, param2, "help text");
    inner(param1, param2);
}

fn inner(param1: String, param2: String) {
     // do useful work
}

Weren't you using clap before? Why are writing code like that now? Your whole code reminds me something like this, which is not very generous of me, but it feels like you're shooting yourself in the foot by not doing things simply.

It should be closer to this:

use clap::Parser;

#[derive(Parser)]
struct Opt {
    ...
}

fn main() {
     run(Opt::parse());
}

fn run(opt: Opt) {}

#[test]
fn foo() {
    run(Opt { ... });
}

Or have your own argument parser that you can test in isolation. It should take a slice of strings and output your desired struct maybe in a result, and you should just test that interface.

1 Like

Clap looks like shooting sparrows with a cannon (in English it's equal to "drinking from firehose" as I understand). Clap adds 10 crates to my dependencies, to a project which already becomes slow to compile. And all I need is to pass filenames in arguments. Even if I consider some more amenities, like deserialization, implementing non-trivial Deserialize is pain in the neck. So I decided I don't need 10 extra crates, just a 100 SLOC.

You can write your own parser. Just design it with a testable interface. You absolutely don't need weird things going on with global env state.

There are smaller argument parsers out there, I'm thinking of argh but there are others.

Edit: Comparison of arg parsers

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.