The input is a u8 that packs four 2-bit unsigned integers together.
I want to check whether the four bit fields are exactly 0, 1, 2, 3 in some order, i.e. a permutation. Inputs that meet this requirement are considered valid.
(Additionally, if the highest-order bit field is 0 and the other three are exactly 0, 1, 2 in some order, the input is also considered valid.)
Here's a naive implementation which gets the correct result (Rust Playground):
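fn is_valid(packed: u8) -> bool {
    // Unpack the four 2-bit fields; u0 is the lowest, u3 the highest.
    let u0 = packed & 0b11;
    let u1 = (packed >> 2) & 0b11;
    let u2 = (packed >> 4) & 0b11;
    let u3 = (packed >> 6) & 0b11;
    // The three low fields must be pairwise distinct in both valid cases.
    u0 != u1 && u0 != u2 && u1 != u2
        // Either u3 is also distinct (a permutation of 0..=3),
        // or u3 is 0 and the low fields are a permutation of 0..=2.
        && ((u0 != u3 && u1 != u3 && u2 != u3) || (u3 == 0 && u0 != 3 && u1 != 3 && u2 != 3))
}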
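To make the rule itself easier to check, here is a rough sketch of a second, independent formulation that can be compared against the naive one over all 256 inputs. The name `is_valid_reference` and the "seen mask" approach are my own illustration, not part of the original code, and it is meant as a cross-check rather than a faster version.

```rust
/// Naive check, copied from above.
fn is_valid(packed: u8) -> bool {
    let u0 = packed & 0b11;
    let u1 = (packed >> 2) & 0b11;
    let u2 = (packed >> 4) & 0b11;
    let u3 = (packed >> 6) & 0b11;
    u0 != u1 && u0 != u2 && u1 != u2
        && ((u0 != u3 && u1 != u3 && u2 != u3) || (u3 == 0 && u0 != 3 && u1 != 3 && u2 != 3))
}

/// Hypothetical reference check (an assumption for illustration):
/// set bit `v` in a mask for every field value `v` that occurs.
fn is_valid_reference(packed: u8) -> bool {
    // Extract field `i` (0 = lowest, 3 = highest).
    let field = |i: u8| (packed >> (2 * i)) & 0b11;
    // Values seen among the three low fields.
    let low_mask = (1u8 << field(0)) | (1u8 << field(1)) | (1u8 << field(2));
    // Values seen among all four fields.
    let full_mask = low_mask | (1u8 << field(3));
    // Case 1: all of 0, 1, 2, 3 occur, so the fields are a permutation.
    // Case 2: the high field is 0 and the low fields cover exactly 0, 1, 2.
    full_mask == 0b1111 || (field(3) == 0 && low_mask == 0b0111)
}
```

Looping over `0..=255u8` and asserting `is_valid(p) == is_valid_reference(p)` confirms the two agree. Exactly 30 inputs are valid: 24 permutations of 0, 1, 2, 3 plus 6 from the additional rule.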
The inputs will mostly (>99%) be valid, so I want to optimize for the happy path. Short-circuiting on the sad path has practically no value here.
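To illustrate what giving up short-circuiting could look like, here is a minimal sketch that evaluates every comparison unconditionally by using the non-short-circuiting `&` and `|` operators on `bool`. The function name `is_valid_no_short_circuit` is hypothetical, and I have not measured it; the compiler may well produce similar code for the naive version already.

```rust
/// Same predicate as the naive version, but every comparison is always
/// evaluated: `&` and `|` on `bool` do not short-circuit.
/// (Hypothetical name; an untested idea as far as performance goes.)
fn is_valid_no_short_circuit(packed: u8) -> bool {
    let u0 = packed & 0b11;
    let u1 = (packed >> 2) & 0b11;
    let u2 = (packed >> 4) & 0b11;
    let u3 = (packed >> 6) & 0b11;
    // The three low fields are pairwise distinct.
    let low_distinct = (u0 != u1) & (u0 != u2) & (u1 != u2);
    // The high field differs from all low fields.
    let high_distinct = (u0 != u3) & (u1 != u3) & (u2 != u3);
    // The high field is 0 and no low field is 3.
    let high_zero_case = (u3 == 0) & (u0 != 3) & (u1 != 3) & (u2 != 3);
    low_distinct & (high_distinct | high_zero_case)
}
```

It can be added to the benchmark as one more `bench_function` entry to compare against `is_valid` directly.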
And here's how I benchmarked it using Criterion:
use criterion::{black_box, criterion_group, criterion_main, Criterion};
// Add imports here

fn criterion_benchmark(c: &mut Criterion) {
    // Collect every valid input so the samples are almost all valid.
    let valid_values: Vec<u8> = (0..=255).filter(|&u| is_valid(u)).collect();
    // The first element is a mutable slot that changes on every iteration.
    let mut samples = vec![0];
    for _ in 0..3 {
        samples.extend_from_slice(&valid_values);
    }
    c.bench_function("is_valid", |b| {
        b.iter(|| {
            // wrapping_add avoids a panic if overflow checks are enabled.
            samples[0] = samples[0].wrapping_add(1);
            for &u in &samples {
                black_box(is_valid(u));
            }
        })
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
By the way, I've also tried using enums with num_derive or num_enum; here are the brief results (the first one is the naive implementation):
is_valid                time:   [176.17 ns 178.18 ns 180.44 ns]
                        change: [-4.0889% -2.6412% -1.2874%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

is_valid_enum_num_enum  time:   [196.51 ns 198.01 ns 199.60 ns]
                        change: [-2.2483% -1.0571% +0.1369%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

is_valid_enum_num_derive
                        time:   [346.85 ns 351.05 ns 356.00 ns]
                        change: [-1.0824% +0.4368% +1.9960%] (p = 0.58 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild