test gbbq::gbbq_str ... bench: 62,806,212 ns/iter (+/- 4,198,622)
test gbbq::gbbq_str_defined_in_crate ... bench: 26,883,857 ns/iter (+/- 1,706,824)
test gbbq::gbbq_string ... bench: 65,836,727 ns/iter (+/- 3,355,627)
test gbbq::gbbq_string_defined_in_crate ... bench: 65,671,881 ns/iter (+/- 4,196,998)
test gbbq_async::gbbq_str ... bench: 63,689,579 ns/iter (+/- 3,565,807)
test gbbq_async::gbbq_str_defined_in_crate ... bench: 27,713,700 ns/iter (+/- 2,540,403)
test gbbq_async::gbbq_string ... bench: 67,637,128 ns/iter (+/- 3,493,143)
test gbbq_async::gbbq_string_defined_in_crate ... bench: 28,840,113 ns/iter (+/- 2,735,867)
Let me explain some context.
- There are two mods in the bench file, gbbq and gbbq_async. The read fn in the gbbq mod uses std::fs::read and the one in the gbbq_async mod uses tokio::fs::read.
- I wrote exactly the same code in my crate and in the benches/xx.rs file, which you can tell from the suffix. The fns with the defined_in_crate suffix use the two iters defined in my crate, and the fns without the defined_in_crate suffix use the copied version in the bench file. The iters yield identical items except for one field in the yielded struct, because I want to bench a &str field against a String field. The yielded structs look like this:
pub struct GbbqStr<'a> {
    s: &'a str,
    ..
}
pub struct GbbqString {
    s: String,
    ..
}
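To make the difference between the two structs concrete, here is a minimal sketch of a borrowing parser and an owning parser over fixed-size records. Everything in it is an assumption for illustration: the names RecStr, RecString, last_borrowed and last_owned are hypothetical, and so is the layout (a 6-byte ASCII code at the start of each record), since the real Gbbq layout is not shown here.

```rust
// Hypothetical layout: each record starts with a 6-byte ASCII code.
// This is NOT the real Gbbq format, only an illustration.
const CODE_LEN: usize = 6;

/// Borrowing variant: `s` points into the input buffer, so building it
/// does not allocate.
#[derive(Debug, PartialEq)]
pub struct RecStr<'a> {
    pub s: &'a str,
}

/// Owning variant: `s` is copied to the heap for every record.
#[derive(Debug, PartialEq)]
pub struct RecString {
    pub s: String,
}

impl<'a> RecStr<'a> {
    /// Returns None if the chunk is too short or the code is not UTF-8.
    pub fn from_bytes(chunk: &'a [u8]) -> Option<Self> {
        let s = std::str::from_utf8(chunk.get(..CODE_LEN)?).ok()?;
        Some(RecStr { s })
    }
}

impl RecString {
    /// Same parsing as RecStr, plus one allocation per record.
    pub fn from_bytes(chunk: &[u8]) -> Option<Self> {
        let s = std::str::from_utf8(chunk.get(..CODE_LEN)?).ok()?;
        Some(RecString { s: s.to_owned() })
    }
}

/// Walk all records of `record_len` bytes and keep the last borrowed one.
pub fn last_borrowed(data: &[u8], record_len: usize) -> Option<RecStr<'_>> {
    data.chunks_exact(record_len)
        .filter_map(RecStr::from_bytes)
        .last()
}

/// Walk all records of `record_len` bytes and keep the last owned one.
pub fn last_owned(data: &[u8], record_len: usize) -> Option<RecString> {
    data.chunks_exact(record_len)
        .filter_map(RecString::from_bytes)
        .last()
}
```

The only difference between the two paths is the to_owned() call, which is the cost a &str vs String bench is meant to isolate.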
Here is the bench code if you want to have a look.
mod gbbq {
    use super::*;

    #[cfg(feature = "bench-test")]
    #[bench]
    fn gbbq_string_defined_in_crate(b: &mut Bencher) {
        b.iter(|| {
            data()[4..].chunks_exact(29)
                .map(parse)
                .map(|b| GbbqStringTDX::from_bytes(&b))
                .last()
        })
    }

    #[bench]
    fn gbbq_string(b: &mut Bencher) {
        b.iter(|| {
            data()[4..].chunks_exact(29)
                .map(parse)
                .map(|b| GbbqString::from_bytes(&b))
                .last()
        })
    }

    #[bench]
    fn gbbq_str(b: &mut Bencher) {
        b.iter(|| {
            GbbqStr::iter(&mut data()[4..]).last();
        })
    }

    #[bench]
    fn gbbq_str_defined_in_crate(b: &mut Bencher) {
        b.iter(|| {
            Gbbq::iter(&mut data()[4..]).last();
        })
    }
}
#[cfg(feature = "tokio")]
#[cfg(test)]
mod gbbq_async {
    use super::*;

    #[cfg(feature = "bench-test")]
    #[bench]
    fn gbbq_string_defined_in_crate(b: &mut Bencher) {
        b.iter(|| {
            rt().block_on(async { GbbqsStringTDX::from_file("assets/gbbq").await.unwrap().last() })
        })
    }

    #[bench]
    fn gbbq_string(b: &mut Bencher) {
        b.iter(|| {
            rt().block_on(async { GbbqsString::from_file("assets/gbbq").await.unwrap().last() })
        })
    }

    #[bench]
    fn gbbq_str(b: &mut Bencher) {
        b.iter(|| {
            rt().block_on(async {
                let mut vec = GbbqStr::read_from_file("assets/gbbq").await.unwrap();
                GbbqStr::iter(&mut vec[4..]).last();
            })
        })
    }

    #[bench]
    fn gbbq_str_defined_in_crate(b: &mut Bencher) {
        b.iter(|| {
            rt().block_on(async {
                let mut vec = Gbbq::read_from_file("assets/gbbq").await.unwrap();
                Gbbq::iter(&mut vec[4..]).last();
            })
        })
    }
}
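The point is that exactly the same code (same iter, same yielded item) performs very differently depending on where it is located. Below is a concise table derived from the result, where you can see three pairs of counterparts.

bench | copy in benches/xx.rs (ns/iter) | defined in crate (ns/iter)
gbbq::gbbq_str | 62,806,212 | 26,883,857
gbbq_async::gbbq_str | 63,689,579 | 27,713,700
gbbq_async::gbbq_string | 67,637,128 | 28,840,113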
I didn't change the way the benchmark works. It's weird that Rust treats a bench file differently from the lib file, since both should be optimised by default. But it seems that Rust fully optimised the code in the lib and left the code in the benches dir less optimised.
As I stated earlier, I meant to bench the performance of a &str field vs a String field. So maybe it's not a good idea to put my old data structs in a bench file. Instead, I could put #[cfg(feature = "bench-test")] or #[cfg(feature = "bench-old")] before a dedicated mod in my crate, and have the bench code call into that mod with the feature flag enabled.
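A minimal sketch of that layout, assuming a feature named bench-old is declared under [features] in Cargo.toml. The module names current and old, the function code, and the helper has_old_impl are hypothetical placeholders, not the real crate contents.

```rust
// In the library crate, e.g. src/lib.rs.
// Assumes Cargo.toml declares `bench-old = []` under [features].

/// Current implementation, always compiled.
pub mod current {
    /// Borrow the (hypothetical) 6-byte code at the start of a record.
    pub fn code(chunk: &[u8]) -> Option<&str> {
        std::str::from_utf8(chunk.get(..6)?).ok()
    }
}

/// Old implementation kept only for comparison. It is compiled as part of
/// the library, but only when the `bench-old` feature is enabled.
#[cfg(feature = "bench-old")]
pub mod old {
    /// Same lookup as `current::code`, but returns an owned String.
    pub fn code(chunk: &[u8]) -> Option<String> {
        std::str::from_utf8(chunk.get(..6)?).ok().map(str::to_owned)
    }
}

/// True when the old module was compiled in.
pub fn has_old_impl() -> bool {
    cfg!(feature = "bench-old")
}
```

The bench fn that calls old::code would carry the same #[cfg(feature = "bench-old")] attribute, and the run would be cargo bench --features bench-old. That way both versions are compiled inside the library crate, so the comparison no longer depends on where the code lives.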