How to make good use of black_box from the test crate


#1

I used the benchmark feature of the test crate, and decided to benchmark two conversion functions I wrote.

#[bench]
fn bench_from_rgb(b: &mut Bencher) {
    b.iter(|| black_box(Lab::from_rgb(253, 120, 138)));
}

#[bench]
fn bench_to_rgb(b: &mut Bencher) {
    b.iter(|| black_box(PINK).to_rgb());
}

Result: the first took an average of 0ns per iteration and the second an average of 200ns. Given the stark difference, I suspected the first benchmark was being optimized away, despite the black_box wrapper. I even tried returning a field of the created struct so that something happened “outside the black box”, as in the second example, but it made no difference.

I gave up on trying to get a higher benchmark time out of the first example, because I was occasionally getting times of 1 or 2 ns per iteration, so I figured maybe it was just really quick.

Later, I changed the signature of the function to accept a slice, to better suit its use case:

b.iter(|| black_box(Lab::from_rgb(&[253, 120, 138])));

and suddenly the benchmark time shot up to an average of 180ns per iteration. So there’s obviously something I don’t understand about benchmarking and the black_box wrapper. Any idea what caused the test to suddenly do the work it was supposed to be doing all along?


#2

It could be that the compiler was able to constant fold the first case, so no work was actually being done at runtime: the arguments were compile-time constants, so the whole conversion could be evaluated at compile time and the benchmark loop measured nothing. The slice version may have defeated this folding, perhaps due to bounds checks that couldn’t be eliminated.
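One way to guard against this is to pass the *inputs* through black_box too, not just the result, so the compiler can’t treat them as constants and fold the call away. A minimal sketch below, using std::hint::black_box (the stable equivalent of test::black_box) and a toy stand-in for the Lab type from the question; the actual RGB-to-Lab math is not reproduced here:

```rust
use std::hint::black_box;

// Hypothetical stand-in for the Lab type in the question.
#[derive(Debug, PartialEq)]
struct Lab {
    l: f32,
    a: f32,
    b: f32,
}

impl Lab {
    // Toy conversion, NOT the real RGB -> Lab formula.
    fn from_rgb(rgb: &[u8; 3]) -> Lab {
        let r = rgb[0] as f32 / 255.0;
        let g = rgb[1] as f32 / 255.0;
        let b = rgb[2] as f32 / 255.0;
        Lab { l: (r + g + b) / 3.0, a: r - g, b: g - b }
    }
}

fn main() {
    // In a benchmark body you would write the call like this:
    // black_box on the input stops the compiler from seeing a
    // compile-time constant, and black_box on the output stops it
    // from discarding the result as unused. Either alone may not
    // be enough to prevent the whole call from being folded away.
    let lab = black_box(Lab::from_rgb(black_box(&[253, 120, 138])));
    println!("{:.3}", lab.l);
}
```

Inside a `b.iter(...)` closure the same pattern applies: `b.iter(|| black_box(Lab::from_rgb(black_box(&[253, 120, 138]))))`.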