After all the effort of getting “Local” allocation to work, I wanted to know whether it was actually faster, so I ran a simple benchmark using divan:
// Baseline: 200 small heap allocations through the standard allocator.
// Note: `size` only sets the item counter that divan reports; every run
// performs the same 200 allocations regardless of the argument.
#[divan::bench(args = [1, 5, 10, 100, 1000, 10000])]
fn allocate_stdbox(bencher: divan::Bencher, size: usize) {
    bencher.counter(size).bench_local(|| {
        let mut v = Vec::new();
        for _i in 0..200 {
            v.push(Box::new(99));
        }
    })
}

// Same workload, but using the thread-local allocator in bump mode.
#[divan::bench(args = [1, 5, 10, 100, 1000, 10000])]
fn allocate_lbox(bencher: divan::Bencher, size: usize) {
    use rustdb::alloc::{lbox, lvec, Local};
    Local::enable_bump();
    bencher.counter(size).bench_local(|| {
        let mut v = lvec();
        for _i in 0..200 {
            v.push(lbox(99));
        }
    })
}

fn main() {
    // Run registered benchmarks.
    divan::main();
}
Results:
Timer precision: 50 ns
example              fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ allocate_lbox                   │               │               │               │         │
│  ├─ 1              3.089 µs      │ 9.034 µs      │ 3.119 µs      │ 3.31 µs       │ 100     │ 100
│  │                 323.7 Kitem/s │ 110.6 Kitem/s │ 320.5 Kitem/s │ 302 Kitem/s   │         │
│  ├─ 5              3.086 µs      │ 6.233 µs      │ 3.396 µs      │ 3.383 µs      │ 100     │ 200
│  │                 1.62 Mitem/s  │ 802.1 Kitem/s │ 1.471 Mitem/s │ 1.477 Mitem/s │         │
│  ├─ 10             3.31 µs       │ 9.734 µs      │ 3.426 µs      │ 3.533 µs      │ 100     │ 200
│  │                 3.02 Mitem/s  │ 1.027 Mitem/s │ 2.918 Mitem/s │ 2.829 Mitem/s │         │
│  ├─ 100            3.388 µs      │ 7.259 µs      │ 3.509 µs      │ 3.573 µs      │ 100     │ 200
│  │                 29.5 Mitem/s  │ 13.77 Mitem/s │ 28.49 Mitem/s │ 27.98 Mitem/s │         │
│  ├─ 1000           3.38 µs       │ 9.669 µs      │ 3.512 µs      │ 3.576 µs      │ 100     │ 200
│  │                 295.7 Mitem/s │ 103.4 Mitem/s │ 284.6 Mitem/s │ 279.6 Mitem/s │         │
│  ╰─ 10000          3.393 µs      │ 5.566 µs      │ 3.628 µs      │ 3.608 µs      │ 100     │ 200
│                    2.946 Gitem/s │ 1.796 Gitem/s │ 2.756 Gitem/s │ 2.771 Gitem/s │         │
╰─ allocate_stdbox                 │               │               │               │         │
   ├─ 1              3.961 µs      │ 32.57 µs      │ 3.984 µs      │ 5.503 µs      │ 100     │ 100
   │                 252.4 Kitem/s │ 30.7 Kitem/s  │ 250.9 Kitem/s │ 181.7 Kitem/s │         │
   ├─ 5              4.088 µs      │ 11.04 µs      │ 4.301 µs      │ 5.936 µs      │ 100     │ 200
   │                 1.223 Mitem/s │ 452.4 Kitem/s │ 1.162 Mitem/s │ 842.3 Kitem/s │         │
   ├─ 10             4.092 µs      │ 15.92 µs      │ 7.909 µs      │ 6.443 µs      │ 100     │ 100
   │                 2.443 Mitem/s │ 628.1 Kitem/s │ 1.264 Mitem/s │ 1.551 Mitem/s │         │
   ├─ 100            4.216 µs      │ 23.12 µs      │ 7.323 µs      │ 6.46 µs       │ 100     │ 100
   │                 23.71 Mitem/s │ 4.323 Mitem/s │ 13.65 Mitem/s │ 15.47 Mitem/s │         │
   ├─ 1000           4.214 µs      │ 9.987 µs      │ 7.262 µs      │ 6.241 µs      │ 100     │ 100
   │                 237.2 Mitem/s │ 100.1 Mitem/s │ 137.6 Mitem/s │ 160.2 Mitem/s │         │
   ╰─ 10000          4.212 µs      │ 10.73 µs      │ 8.05 µs       │ 6.431 µs      │ 100     │ 100
                     2.374 Gitem/s │ 931.5 Mitem/s │ 1.242 Gitem/s │ 1.554 Gitem/s │         │
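So on this allocation-intensive test, lbox was nearly twice as fast on average: mean times of about 3.3 to 3.6 µs against 5.5 to 6.5 µs for the standard Box, a ratio of roughly 1.7 to 1.8. The fastest runs are closer, at about 3.1 to 3.4 µs against 4.0 to 4.2 µs. I have to say I am quite doubtful whether this is really worthwhile, but it has still been an interesting exercise.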
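To put a number on "nearly twice as fast", here is a small sketch that divides the mean times from the results above. The constant and function names are my own for illustration and are not part of divan or rustdb; the figures are simply copied from the mean column.

```rust
/// Mean time per run in microseconds, copied from the results above
/// for args 1, 5, 10, 100, 1000 and 10000.
const LBOX_MEAN_US: [f64; 6] = [3.31, 3.383, 3.533, 3.573, 3.576, 3.608];
const STDBOX_MEAN_US: [f64; 6] = [5.503, 5.936, 6.443, 6.46, 6.241, 6.431];

/// Ratio of std `Box` time to `lbox` time for each row
/// (values above 1.0 mean `lbox` was faster).
fn speedups(std_us: &[f64], local_us: &[f64]) -> Vec<f64> {
    std_us.iter().zip(local_us).map(|(s, l)| s / l).collect()
}

fn main() {
    let ratios = speedups(&STDBOX_MEAN_US, &LBOX_MEAN_US);
    for (arg, r) in [1, 5, 10, 100, 1000, 10000].iter().zip(&ratios) {
        println!("arg {arg:>5}: {r:.2}x");
    }
    // Simple arithmetic mean of the per-row ratios.
    let avg = ratios.iter().sum::<f64>() / ratios.len() as f64;
    println!("average speedup: {avg:.2}x");
}
```

Each row comes out between roughly 1.66x and 1.82x, so the gain is consistent across rows but somewhat short of a full doubling.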