Confused by criterion options

I have been working through benchmarking with criterion. It's very good at showing me performance with a single code tree, but I am now experimenting with comparing different versions.

I am confused by the documentation which says:

  • --save-baseline <name> will compare against the named baseline, then overwrite it.
  • --baseline <name> will compare against the named baseline without overwriting it.
  • --load-baseline <name> will load the named baseline as the new data set rather than the previous baseline.

I can't understand what --baseline does compared to --load-baseline. Both seem to me to compare the new run against <name>.

By default, criterion compares the current run to the previous run.

--save-baseline <name> compares to the old version of <name>, then saves the current version as <name>, to let you keep track of a baseline "acceptable" performance.

--baseline <name> changes the comparison so that you use <name> instead of the previous run.

--load-baseline <name> changes the comparison so that you use the run called <name> instead of the current run.

The idea is that when you're at a good baseline, you use --save-baseline to remember it. You then use --baseline to compare not against the previous run (for incremental improvements), but against the baseline. And you use --load-baseline to have a named baseline stand in for the current state, so that no new measurement is taken.

A workflow with this looks like:

  • --save-baseline before-optimization to save the current state
  • --baseline before-optimization to compare to the baseline, not the previous state.
  • --save-baseline after-optimization-type-one to save a state that reflects one path to faster running.
  • --save-baseline after-optimization-type-two to save a second optimized state with a conflicting path to faster running
  • --load-baseline after-optimization-type-one --baseline after-optimization-type-two to skip running the code, and just compare the two optimized forms to decide which one to use as your new baseline.
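
As a rough sketch, the workflow above might look like this on the command line. It assumes the benchmarks are run through `cargo bench`, with criterion's flags passed after `--`. The `bench` wrapper and its `RUN` variable are hypothetical, added only so that the sequence prints by default instead of starting real benchmark runs. If the crate also has bench targets that use the default harness, the flags may be rejected unless you select the criterion target with `cargo bench --bench <target> -- ...`.

```shell
#!/bin/sh
# Hypothetical helper: prints the command unless RUN=1 is set,
# in which case it actually invokes cargo bench.
bench() {
    if [ "${RUN:-0}" = "1" ]; then
        cargo bench -- "$@"
    else
        echo "cargo bench -- $*"
    fi
}

# 1. Before changing anything, record the starting point.
bench --save-baseline before-optimization

# 2. While working, compare each run to that fixed point
#    rather than to the previous run.
bench --baseline before-optimization

# 3. With the first optimization applied, save its results.
bench --save-baseline after-optimization-type-one

# 4. With the second (conflicting) optimization applied instead, save those.
bench --save-baseline after-optimization-type-two

# 5. Compare the two saved results without running any benchmarks.
bench --load-baseline after-optimization-type-one --baseline after-optimization-type-two
```

Run as is, this only prints the five commands. In practice you would run each step by hand, changing the code between steps, rather than all five in one go.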

Oh, I see. So --load-baseline actually means "don't bench at all, just compare".

Exactly. In particular, it means "use this named state as the result of benchmarking".

Many thanks!
