Is My Criterion Benchmarking Code Actually Answering My Question?

scottmcm · December 10, 2021, 7:53pm

The problem is that it's extremely difficult to extrapolate the results of nano-benchmarks to real code -- especially on modern super-scalar desktop chips. On top of that, there's what the optimizer will do with the code: if you measure general division, for example, the results are irrelevant for x / 10, because the compiler doesn't emit a division instruction for that. It typically uses a multiply and a shift instead.
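To make the x / 10 point concrete, here is a minimal sketch of the kind of rewrite an optimizer typically performs for `u32`. The function names are made up for illustration, and the multiply-and-shift version is written by hand here; the exact instructions a given compiler emits depend on the target and the optimization level.

```rust
/// Division by a runtime value: the compiler generally has to emit a real
/// divide instruction, because it knows nothing about `d`.
pub fn div_general(x: u32, d: u32) -> u32 {
    x / d
}

/// Division by a compile-time constant: optimizing compilers usually turn
/// this into a multiply and a shift rather than a divide.
pub fn div10_const(x: u32) -> u32 {
    x / 10
}

/// Hand-written version of that transformation, for illustration only.
/// 0xCCCC_CCCD is ceil(2^35 / 10), so for every u32 `x` the widened product
/// shifted right by 35 equals x / 10.
pub fn div10_mul_shift(x: u32) -> u32 {
    ((x as u64 * 0xCCCC_CCCD) >> 35) as u32
}
```

All three agree on the result when `d` is 10, but a benchmark of `div_general` measures the divider, while real code containing `x / 10` never touches it.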

CAD97 put together some great benchmarks in post #13 of "Converting a BGRA &[u8] to RGB [u8;N] (for images)?" that show just how hard it is to understand how something will perform in aggregate. A bunch of operations show up as essentially free because of ILP, speculation, and the like -- in fact, one of the versions with the most instructions ends up being one of the fastest, and the one with the fewest instructions is one of the slowest.

So it's critical to find a bigger chunk to measure. Ideally that's something with a meaningful loop that can run both smaller and larger instances of the problem -- how the unrolling & vectorization ends up can often matter more than how a single run of the loop body performs in isolation.
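As a rough sketch of what "a bigger chunk at several sizes" could look like, here is a std-only version that times one loop kernel over inputs of different lengths. The kernel (`total_digits`), the helper names, and the input generator are all hypothetical stand-ins, and the hand-rolled timing has none of the statistics Criterion does. In Criterion the same shape would be a benchmark group with one `bench_with_input` call per size.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical kernel: total number of decimal digits across a slice.
/// It is a whole loop, so the optimizer's choices show up in the timing.
pub fn total_digits(values: &[u32]) -> u64 {
    let mut total = 0u64;
    for &v in values {
        let mut x = v;
        loop {
            total += 1;
            x /= 10;
            if x == 0 {
                break;
            }
        }
    }
    total
}

/// Average nanoseconds per element for `reps` passes over `n` elements.
/// `n` and `reps` are assumed to be non-zero.
pub fn ns_per_element(n: usize, reps: u32) -> f64 {
    // Cheap multiplicative scramble so the values are not all tiny.
    let data: Vec<u32> = (0..n as u32)
        .map(|i| i.wrapping_mul(2_654_435_761))
        .collect();
    let start = Instant::now();
    for _ in 0..reps {
        // black_box keeps the call from being optimized away or hoisted.
        black_box(total_digits(black_box(&data)));
    }
    start.elapsed().as_nanos() as f64 / (n as f64 * reps as f64)
}

/// Run the same kernel at several problem sizes.
pub fn sweep(sizes: &[usize], reps: u32) -> Vec<(usize, f64)> {
    sizes.iter().map(|&n| (n, ns_per_element(n, reps))).collect()
}
```

Calling something like `sweep(&[16, 1024, 65_536], 100)` and comparing the per-element numbers across sizes says more about how the loop behaves in aggregate than timing one `x / 10` in isolation.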
