Retest: simple automated black box regression testing

#1

I’ve often found that I want to compare the past output of a program, given a particular configuration, with its present output (i.e., I often want to do regression tests).

Previously I’ve always written small custom Python programs to do this, but now I’ve created a pure rust cross platform tool — retest — to make it easy to do automated black box regression testing.

Retest is a generic tool: you create a retest plan file (plain text), then run retest -g to generate the application to test’s initial (expected) output files; subsequently (e.g., after bug fixes or new features), you run retest -v to compare what the application now outputs against those expected files.
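
Roughly, with an illustrative plan filename (see the docs for the exact invocation; this is only a sketch of the workflow):

retest -g app.rt   (generate: run the application to test and save its output files as the expected results)
retest -v app.rt   (verify: after changing the application, run it again and compare its new outputs with the expected results)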

I couldn’t add it to crates.io because there’s already a crate called retest. [edit] So I renamed the package qtrac-retest, and used a [[bin]] section in Cargo.toml to keep the executable’s name as retest.

It is available from my website: retest. It is GPLv3-licensed.

#2

Looks like an interesting tool. Any chance you could create a version that is a library and could easily be integrated into cargo integration test scripts?

#3

I don’t see why not.

Off the top of my head, two APIs come to mind.

let rt = retest::new("path/rt.rt")?;
rt.run_tests()?; // Assuming they were generated

or

let mut rt = retest::default()
    .app("myapp.exe").arg("-a").arg("-b"); // set other [env] things
let test = retest::new_test().name("alpha"); // set other test things
rt.add(test);
rt.run()?;

Is this the kind of thing you had in mind?

#4

I think the latter would be better, and something like that should work nicely. It does leave open the question of how to collect the data in the first place…

#5

The advantages of using a .rt file are these:

  • You can write it independently of your code
  • You can generate the initial test expecteds and any subsequent ones using the retest executable
  • You can test using the retest executable (i.e., easier to debug)

The advantages of using a builder are:

  • You can do everything in code.

At the moment I can’t think of how to provide individual errors in the builder case. [edit] I’ll try to make the library compatible with the log crate, which will hopefully solve this issue.

Anyway, I’ll start with the .rt file API since that’s easiest and will help me split into a library and executable; then I’ll try to figure out the builder API.

#6

The other advantage of the builder API is that it integrates more easily with existing integration tests and CI infrastructure that use cargo test.

#7

I’ve now got an API that uses a .rt filename and split into a library and an executable. This wasn’t too difficult because it was mostly changing and moving code rather than new stuff (and apart from changes to use statements etc., cargo + good error messages made it straightforward).

Next I’ll try to switch from println! to using the log crate (and outputting to stdout in the executable so its behavior is unchanged). Then I’ll try providing a builder API so that everything can be done in pure rust. This may take a bit of time, but I accept your point about integration so will try.
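
To make that concrete, the kind of split I have in mind looks something like this (a minimal sketch; env_logger here is just one possible choice, not necessarily what I’ll end up using):

// Library side: emit messages through the `log` facade instead of println!,
// so the caller decides where (and whether) they appear.
fn report_pass(test_name: &str) {
    log::info!("PASS {}", test_name);
}

// Executable side: install a logger that writes to stdout so the command-line
// behaviour stays the same.
fn main() {
    env_logger::Builder::new()
        .filter_level(log::LevelFilter::Info)
        .target(env_logger::Target::Stdout)
        .init();
    report_pass("alpha");
}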

#8

I’ve now published 1.2.0 which provides retest as a library and as an executable.

The library has two APIs: one reads plain text retest plan files (.rt), and the other provides two builders, PlanBuilder and TestBuilder (a plan has one or more tests).
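
For anyone wanting to drive this from cargo test, the plan-file API can be used along these lines (a sketch rather than copy-paste code: check the docs for the exact crate identifier, constructor name, and Counts fields, which are assumed here):

// tests/regression.rs -- hypothetical integration test using the plan-file API.
#[test]
fn regression() {
    // Constructor name assumed; it takes the path of a .rt plan file.
    let plan = qtrac_retest::Plan::new("tests/app.rt").expect("valid plan file");
    let counts = plan.retest().expect("retest ran");
    // A failure count is assumed here.
    assert_eq!(counts.failed, 0);
}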

#9

After a quick look at the API I have a few suggestions.

Firstly, it would be nice to have a function that would combine the features of retest and generate, using an environment variable to decide when to generate. Any user of the API would have to do something similar in any case. This could just be a change to the retest method to do generate instead if a particular environment variable is defined.

I would rename TestBuilder to Test and remove its build method. Similarly, I’d drop the distinction between PlanBuilder and Plan. Think of Command as a model (which it clearly is in your API): it does not have a separate builder. Fewer types make the API easier to learn.

For the pure rust API, I’d prefer the generated executables to be on the path when running tests, so I could easily test both release and debug builds. I’d also prefer the stdout method to take a path within the $STDOUT directory rather than expecting a string beginning with a dollar sign, as the docs imply.

It would also be lovely to have a GitHub repository linked to on crates.io where we could file issues.

#10

I appreciate you taking the time to do this.

Do you have something like this in mind?

impl Plan {
    pub fn apply(&self) -> Result<Counts> {
        // Generate if RETEST_GENERATE is set to "1"; otherwise retest.
        if std::env::var("RETEST_GENERATE").as_deref() == Ok("1") {
            self.generate()
        } else {
            self.retest()
        }
    }
}

Would you also want to be able to specify the expected and actual paths using environment variables? (Note to me: If so, these would need to be read in generate and retest so they’d work in those and for apply.)

I did use Command as a model, as per the official builders doc’s preferred approach. In this model Command is a builder, and spawn() is its build function, which returns a Child (the spawned process). So I’ve followed the model exactly, except for using more explicit names.

[edit] I don’t want to merge PlanBuilder into Plan because their APIs and usage are too different. However, I have now merged TestBuilder into Test and got rid of TestBuilder (so I’ve bumped the version to 2). I can’t get rid of the build() method because I want people to be able either to use the builder pattern or to use the builder methods as setters without having to keep reassigning (see the builder discussion).

I’m not sure what you’re suggesting here. Is it that you’d want to specify the expected and actual paths in environment variables?

For generating, only the expected path is needed. But both paths must be specified for retesting: if you’re using an external comparison tool, it must be passed the files to compare with their paths, e.g., retest() needs to pass, say, ["$EXPECTED_PATH/1.dat", "$ACTUAL_PATH/1.dat"].
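
To make the point concrete, here is roughly what handing one expected/actual pair to an external comparison tool looks like once the placeholders have been expanded (illustrative only; diff stands in for whatever comparison tool a given test actually uses, and the paths are examples):

use std::process::Command;

// Compare one expected/actual pair with an external tool; a non-zero exit
// status means the files differ.
fn files_differ(expected: &str, actual: &str) -> std::io::Result<bool> {
    let status = Command::new("diff")
        .arg("-q")
        .arg(expected)   // e.g. "rt_expected/1.dat" (was "$EXPECTED_PATH/1.dat")
        .arg(actual)     // e.g. "rt_actual/1.dat" (was "$ACTUAL_PATH/1.dat")
        .status()?;
    Ok(!status.success())
}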

Or are you referring to rust’s release and debug directories? If so, are you suggesting a RETEST_APP_PATH environment variable?

Maybe you’re looking at old docs? There’s no $STDOUT directory. For the Test::stdout() method you must supply a filename to be written to and its path must be $OUT_PATH, e.g., "$OUT_PATH/1.dat" since this path will be the expected path when generating and the actual path when retesting.

Even if these paths are specified as environment variables and picked up at runtime, $OUT_PATH is still needed because the stdout() method doesn’t know whether the test will be generated or retested.

Creating a GitHub repository is tricky for me: I’ve never created a project there before, I use Mercurial, and I have many test files that are specific to my setup and don’t make sense to publish. My email address is on the retest page on crates.io if anyone has issues.

Thank you.

#11

Yes, that’s what I had in mind.

I’m referring to those directories, and wouldn’t want you to use another environment variable, but rather to automatically listen to the one defined by cargo, so cargo test and cargo test --release would both work with no additional effort by the user.

What I meant was that I’d rather have the user specify “1.dat” and retest add the proper directory automatically. Requiring the string “$OUT_PATH” feels ugly, particularly when the string must always begin with the same nine characters. It would also be ideal to have a Path (or a Borrow<Path>) as input so users can more easily write portable code.

In this analogy, Command is the equivalent of Plan, and the Child you get by spawning a Command is the analog of your Counts.

You wrote: “I don’t want to merge PlanBuilder into Plan because their APIs and usage are too different.”

PlanBuilder has all of three productive methods: two setting paths, and one adding a test. Why would it be complicated to add those to Plan? They seem like they would fit very naturally in a Plan, and I suspect that you’re leaking your implementation into the API, which seems non-ideal. Presumably the issue is that a Plan generated from an .rt file cannot have its paths modified?

#12

To clarify what you’re after and what I plan to do:

  1. I will internally auto-prepend the "$OUT_PATH/" for non-absolute filenames so it is not needed by the Test::stdout() method.

  2. I will add a Plan::apply() method along the lines discussed using an env var called RETEST_GENERATE.

  3. If Test.app has no path and option_env!("CARGO_TARGET_DIR") produces Some(path), I will prepend Test.app with that path. (Or is using env!(...) more appropriate?)

  4. I will look again at merging PlanBuilder into Plan and dropping PlanBuilder.

I’ll have to bump the major version again since these are breaking changes.

I have 6 methods and 1 enum which take &strs representing a path or a path plus filename. In theory all of them could take a Path. I tend to avoid Path because I find Path and PathBuf hard to work with. Is it possible to have a signature that accepts either a &str or a Path, and if so, in which form should I store the value: String or PathBuf?

#13

I’d use AsRef<Path>, which does what you want: you can pass a &str or a &Path or a &String or a &PathBuf. See Command::current_dir as an example. You would then store an owned path-like thing as a PathBuf, or as an OsString if you wanted to manipulate it.
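
For example, a minimal sketch with a stand-in Test type rather than retest’s actual one:

use std::path::{Path, PathBuf};

struct Test {
    stdout: Option<PathBuf>,
}

impl Test {
    // Accepts &str, String, &Path, PathBuf, and anything else path-like,
    // and stores an owned PathBuf internally.
    pub fn stdout<P: AsRef<Path>>(mut self, filename: P) -> Self {
        self.stdout = Some(filename.as_ref().to_path_buf());
        self
    }
}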

#14

OK, I’ll try that.

BTW I notice that env! reads the environment at compile time, so shouldn’t I use std::env::var instead since that works at runtime? (I’m not clear on your use case so not sure which I need.)
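
To make sure I understand the distinction (general Rust, nothing retest-specific):

fn main() {
    // Compile time: the value is baked in when *this* crate is built.
    // env! fails the build if the variable is unset; option_env! gives an Option.
    const BAKED_IN: Option<&str> = option_env!("CARGO_TARGET_DIR");

    // Run time: read when the tests (or the application) actually execute.
    let at_runtime = std::env::var("CARGO_TARGET_DIR").ok();

    println!("compile time: {:?}, run time: {:?}", BAKED_IN, at_runtime);
}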

#15

I’m having a problem with CARGO_TARGET_DIR. The application to test may or may not be a rust executable, so I can’t just prepend the Test.app with env::var("CARGO_TARGET_DIR") because that breaks tests of other things (e.g., I use retest to test Python apps).

One possible solution would be to have a Test::app_path() method to which users could pass, say, env!("CARGO_TARGET_DIR"), but which is empty by default. What do you think?

I’ve now got rid of $OUT_PATH from Test::stdout(). I’ve added the Plan::apply() method. And I’ve merged PlanBuilder into Plan (and thus eliminated PlanBuilder). None of this is released yet — will do so once the next bit is done.

Next I’ll try switching all path-accepting methods to using AsRef<Path> as you suggested.

#16

I’ve now released retest 3.0.0 which I believe incorporates all of the feedback you gave:

  • There’s now a Plan::apply() method that will generate or retest depending on the RETEST_GENERATE environment variable;
  • both TestBuilder and PlanBuilder have gone, with their builder methods merged into Plan and Test;
  • if Test::stdout() is used to specify a filename for capturing the application to test’s stdout, no path is needed: it will be prepended with the expected path when generating or with the actual path when retesting;
  • you can now make retest prepend every Test's application to test with a path from an environment variable (such as CARGO_TARGET_DIR) using Plan::app_path_env_var();
  • every method that took a &str filename or path now takes an AsRef<Path> as you suggested.

I also took the opportunity to make many other improvements, some user-visible (a few new methods, some renamings, better docs), but most under the hood.
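
For reference, a cargo integration test against 3.0.0 might look something like the sketch below (a sketch rather than copy-paste code: apart from the method names listed above, the crate identifier, constructor names, push method, and Counts field are placeholders; check the docs for the real ones):

// tests/regression.rs -- hypothetical sketch against the builder API.
#[test]
fn regression() {
    let mut plan = qtrac_retest::Plan::new();          // constructor name assumed
    plan.app_path_env_var("CARGO_TARGET_DIR");         // prepend each app with this env var's path
    let mut test = qtrac_retest::Test::new("alpha");   // constructor name assumed
    test.app("myapp").arg("-a").stdout("1.dat");       // "1.dat" gets the right directory prepended
    plan.push(test);                                   // method name assumed
    // Generates if RETEST_GENERATE is set, otherwise retests.
    let counts = plan.apply().expect("plan ran");
    assert_eq!(counts.failed, 0);                      // field name assumed
}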

#17

I guess I’m a little late in responding, but this misses what I meant to suggest, which was to add env!("CARGO_TARGET_DIR") to the PATH, not to prepend it to the path of the executable. That way it wouldn’t interfere with calling any Python scripts unless you have a rust executable with the same name. Your function to specify an environment variable misses the point, since it results in the same amount of boilerplate as not having the feature. Still, it’s not that much boilerplate…

Retest seems to be doing two things: it is both a test harness and a tool for testing the reproducibility of output. I only expect to use it for the latter, since I’m not interested in using two different test harnesses, which means I expect never to have more than one Test per Plan. Thus having to call something once per Plan is the same as calling it once per Test.

When I get a chance I’ll try this out on one of my projects!

#18

Reproducibility of output was the main idea, and is what I use it for.

It is a pity I misunderstood about the path. I’ve made a note of it as an idea, but I’d now like to wait a bit and see how people get on with retest before making any more user-visible changes.

#19

Thank you for your work. I’ve just tried it on my project - it works!

A few things that I’d expected from it out of the box but it doesn’t have:

  1. It looks like it doesn’t take PATH into account. The following plan doesn’t work:
[ENV]
APP: cargo
     run

because it tries to run ./cargo and I have to write

[ENV]
APP: /Users/konishchev/.cargo/bin/cargo
     run

(which is not portable) or create wrapper scripts in the current directory.

  2. I’ve read the docs but found no option to get a diff of the failed tests’ output. I can of course run something like diff -ruP rt_expected rt_actual manually, but this seems like a core feature for such a tool, one it would be logical to have right out of the box.
#20

Regarding your point 1, use:

[ENV]
APP: $HOME/.cargo/bin/cargo
        run

Note that although $HOME looks Unix-y, retest makes it work on Windows too.

Regarding your point 2.

There could be lots of tests (hundreds or thousands), so my concern was mostly whether a test broke or not. Furthermore, the right diff tool may vary from test to test: for example, I have tests that output .csv files which can be compared with diff, but others that output .png files for which I need an image comparison tool, etc. And that tool may not be the same comparison tool retest itself uses, because retest’s comparisons are about deciding same/different as fast as possible, not about reporting the specifics of any differences.

However, you are right that it would be convenient to automate the output of differences where they occur.

One solution would be to add support for an ON_DIFF option which, if set (the default being unset), would tell retest what to do when a difference is detected.
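
Purely as an illustration of the idea (nothing is implemented yet, and this syntax is made up), a plan might one day say something like:

[ENV]
APP: myapp
ON_DIFF: diff -ruP

so that whenever retest detects a difference it runs the given command on the expected and actual files and shows its output.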

I’ll give this some thought and see if I can come up with a satisfactory solution.

Thanks for the feedback.
