I'm implementing the lock-free vector in the paper Lock-free Dynamically Resizable Arrays by Dechev et al., 2006, and my implementation is super slow. I followed the algorithm from the paper pretty closely.
Most of the implementation is in src/leaky.rs (about 500 LOC excluding tests/benches), except for a few convenience functions defined in src/lib.rs and src/alloc_error.rs
From the miri tests I've done, the vector doesn't do any UB. However, it does leak memory (that's ok for now, I haven't implemented a reclamation strategy). Note: at the moment, the vector can only be used for 64-bit types.
My main question is: why is my implementation is so slow: ? There are some mini-benchmarks in the bottom of leaky.rs that show this. Here's some example output:
test leaky::bench::mutex1 ... bench: 49,746 ns/iter (+/- 1,849)
test leaky::bench::mutex2 ... bench: 117,679 ns/iter (+/- 2,044)
test leaky::bench::mutex3 ... bench: 201,409 ns/iter (+/- 5,257)
test leaky::bench::unlocked1 ... bench: 200,582 ns/iter (+/- 62,360)
test leaky::bench::unlocked2 ... bench: 425,359 ns/iter (+/- 52,239)
test leaky::bench::unlocked3 ... bench: 684,456 ns/iter (+/- 120,988)
The benchmark just consists of many threads pushing to a vector 1000 times. These are the results for 1, 2, and 3 threads.
Thank you for your help! Any feedback not related to performance is of course welcome too
I haven't specifically compiled in release mode. It says here cargo bench does that automatically though.
Sadly, I haven't found a reference implementation so far, in any language (the paper was geared towards C++). The paper is a bit old though (2006), so maybe the algorithm is a bit outdated.
Thanks for the reminder about MIRI, can never be too safe
Note that generally speaking lock free algorithms have less average case throughputs but better worst case latency compared to the ones with fine-grained locks. It's possible that the benchmark is correct.
Ok, made some simple benchmarks and the performance doesn't seem as bad compared to Mutex<Vec<&>>. They're both linear but the lock-free vector has a greater slope. Thanks for the suggestion!