Is My Criterion Benchmarking Code Actually Answering My Question?

I'm new to benchmarking. Will the following code actually reveal the knowledge I seek – which std method is the faster way of determining whether two floating-point values have the same sign?

use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
// use num_traits::{float::Float};
use rand::{thread_rng, Rng};

type Pair = (f64, f64);

#[inline]
/// I think this will be slower because it includes a copysign call 
/// ^ I was wrong
fn compare_same_signum(test_pairs: &[Pair]) {
	test_pairs
		.iter()
		.for_each(|(a, b)| {
			let _ = a.signum() == b.signum();
		});
}

#[inline]
/// I think this will be faster because there's no copysign call 
/// ^ I was wrong
fn compare_same_is_sign(test_pairs: &[Pair]) {
	test_pairs
		.iter()
		.for_each(|(a, b)| {
			let _ = a.is_sign_negative() == b.is_sign_negative();
		});
}

#[inline]
/// I think this will maybe be on par with the is_sign methods which use to_bits 
/// ^ I was kind of right
fn compare_same_to_bits(test_pairs: &[Pair]) {
	// Mask selecting the IEEE 754 sign bit of an f64
	let mask = 0x8000_0000_0000_0000u64;
	test_pairs
		.iter()
		.for_each(|(a, b)| {
			let _ = (a.to_bits() & b.to_bits()) & mask != 0;
		});
}

fn bench_float_comparisons(c: &mut Criterion) {
	// Some random test data 
	let mut rng = thread_rng();
	let test_size = 100_000_000usize;
	let test_pairs = (0..test_size)
		.map(|_| (rng.gen(), rng.gen()))
		.collect::<Vec<Pair>>();
	// A benchmarking group
	let mut group = c.benchmark_group("float sign comparisons");
	// Run each comparison on decade-sized slices (10, 100, ..., 10_000_000 pairs) of the same random data
	for i in (1..8).rev().map(|d| test_size/10usize.pow(d)) {
		group.bench_with_input(
			BenchmarkId::new("signum", i), 
			&i,
			|b, i| b.iter(|| {
				compare_same_signum(&test_pairs[0..*i])
			})
		);
		group.bench_with_input(
			BenchmarkId::new("is_sign_negative", i),
			&i,
			|b, i| b.iter(|| {
				compare_same_is_sign(&test_pairs[0..*i])
			})
		);	
		group.bench_with_input(
			BenchmarkId::new("to_bits", i),
			&i,
			|b, i| b.iter(|| {
				compare_same_to_bits(&test_pairs[0..*i])
			})
		);	

	}
	group.finish();
}

criterion_group!(benches, bench_float_comparisons);
criterion_main!(benches);

Is there a more accurate way to do this? Do you see any major issues with the way I wrote the benchmark?

Unnecessary background information:
I am partly doing this to learn to benchmark properly and accurately before I try benchmarking larger code sections that rely on similar operations to classify points, vectors, etc. that are known to be finite – otherwise I'd just stick to signum.

I ran the code a few times, but after viewing the results, my brain has neither grown any larger nor gained heightened powers of perception.

I think you need to return something from the test functions: as it stands, the optimizer might notice that your functions don't actually do anything and therefore remove the code you intend to test. For example, you could return the count of pairs that have the same sign:

#[inline]
fn compare_same_signum(test_pairs: &[Pair]) -> usize {
	test_pairs
		.iter()
		.filter(|(a, b)| a.signum() == b.signum())
		.count()
}
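
For extra safety you can also route the slice through black_box so the optimizer can't specialize on the input; Criterion already passes the closure's return value through black_box, so with the counting version above the result won't be optimized away either. A minimal sketch of one adjusted bench call (it plugs into the loop from your benchmark, reusing your group, test_pairs, and i):

use criterion::black_box;

group.bench_with_input(
	BenchmarkId::new("signum", i),
	&i,
	// black_box hides the slice from the optimizer; the returned count
	// is black_boxed by Criterion itself.
	|b, i| b.iter(|| compare_same_signum(black_box(&test_pairs[0..*i]))),
);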

Thanks! This seems to produce more consistent results that make sense.


By the way, if you are trying to learn proper benchmarking practice, do it with something much more obvious. In this case, I'd expect no difference: determining the sign of a floating-point number is trivial (it can be expressed with a couple of bitwise ANDs and an equality check). With any reasonably smart compiler/optimizer, there will be no actual calls to copysign or any other function; the equivalent code will be inlined. So try something non-trivial instead, with an obvious performance difference, like sorting a small and a big array.
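
For instance, a minimal sketch of that kind of benchmark (the sizes are arbitrary; iter_batched keeps the clone of the input out of the measured region):

use criterion::{criterion_group, criterion_main, BatchSize, BenchmarkId, Criterion};

fn bench_sort(c: &mut Criterion) {
	let mut group = c.benchmark_group("sort_unstable");
	for size in [100usize, 100_000] {
		// Reverse-sorted input so the sort has real work to do
		let data: Vec<u64> = (0..size as u64).rev().collect();
		group.bench_with_input(BenchmarkId::from_parameter(size), &data, |b, data| {
			// The setup closure (the clone) runs outside the timed region;
			// only the sort itself is measured.
			b.iter_batched(
				|| data.clone(),
				|mut v| {
					v.sort_unstable();
					v
				},
				BatchSize::SmallInput,
			)
		});
	}
	group.finish();
}

criterion_group!(sort_benches, bench_sort);
criterion_main!(sort_benches);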

One more thing: your compare_same_to_bits function is not correct. It checks whether both numbers are negative; you probably want a bitwise XOR instead of the first bitwise AND between the two numbers.
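
Combining that fix with the count-returning version above, a corrected sketch would be (the XOR of two floats' bits has the sign bit clear exactly when their signs match):

#[inline]
fn compare_same_to_bits(test_pairs: &[Pair]) -> usize {
	// IEEE 754 sign bit of an f64
	let mask = 0x8000_0000_0000_0000u64;
	test_pairs
		.iter()
		// XOR sets the sign bit only when the signs differ, so a
		// masked result of zero means "same sign".
		.filter(|(a, b)| (a.to_bits() ^ b.to_bits()) & mask == 0)
		.count()
}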


Thanks. I didn't notice that error in my compare_same_to_bits.

The motivation for testing this particular example was doing a bunch of comparisons of the signs of groups of floating point numbers in real code – I thought it'd be an easy start, but I'll definitely take your point into consideration.

The problem is that it's extremely difficult to extrapolate the results of nano-benchmarks to real code -- especially on modern super-scalar desktop chips. Not to mention what the optimizer will do with code -- if you measure general division, for example, the results are irrelevant for x / 10 because the compiler doesn't use division to do that.
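
You can check that last point for yourself by pasting a sketch like this into a tool like Compiler Explorer and comparing the assembly:

// Divisor known only at runtime: compiles to an actual division
// instruction (plus a divide-by-zero check).
pub fn div_general(x: u32, d: u32) -> u32 {
	x / d
}

// Constant divisor: typically lowered to a multiply-by-magic-number
// and shift, with no division instruction at all.
pub fn div_by_ten(x: u32) -> u32 {
	x / 10
}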

CAD97 put together some great benchmarks in Converting a BGRA &[u8] to RGB [u8;N] (for images)? - #13 by CAD97 that show just how hard it is to understand how something will perform in aggregate. A bunch of operations show up as essentially free because of ILP and speculation and such -- in fact, one of the ones with the most instructions ends up being one of the fastest, and the one that's the fewest instructions is one of the slowest.

So it's critical to find a bigger chunk to measure. Ideally something with a meaningful loop that can run both smaller and larger instances of the problem -- how the unrolling & vectorization ends up can often be more important than how a single run of the loop body performs in isolation.
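
One thing that helps when running the same benchmark at several sizes is to tell Criterion the throughput, so the report normalizes to elements per second and the sizes become directly comparable. A sketch, reusing the loop from the original post:

use criterion::Throughput;

for i in (1..8).rev().map(|d| test_size / 10usize.pow(d)) {
	// One iteration processes i pairs; the report can then show
	// elements/second instead of raw time per iteration.
	group.throughput(Throughput::Elements(i as u64));
	group.bench_with_input(BenchmarkId::new("signum", i), &i, |b, i| {
		b.iter(|| compare_same_signum(&test_pairs[0..*i]))
	});
}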

