Concurrent Persistent Hashmap

I have a very specific situation which requires, I think, a concurrent persistent hashmap. My (simplified) use case is to count bigrams in a large, changing set of data. I use rayon to parallelize the search, giving me a set of results per data item, which are then aggregated into a shared hashmap.

My current solution, wrapping a standard HashMap in an RwLock, scales very poorly. To avoid taking the write lock whenever counts are added to an existing key, I use AtomicUsize instead of usize as the value type. The interior mutability lets me perform a relaxed fetch_add on a key's count without acquiring the map's write lock. While this helps a lot, there is still plenty of contention on the map, especially in the beginning, when all keys are new and almost every access requires the write lock.
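
For reference, a minimal sketch of that fast/slow path (the Bigram alias and the bump helper are placeholders, not names from the real code):

use std::collections::HashMap;
use std::sync::RwLock;
use std::sync::atomic::{AtomicUsize, Ordering};

type Bigram = (char, char); // placeholder key type

fn bump(counts: &RwLock<HashMap<Bigram, AtomicUsize>>, key: Bigram) {
    // Fast path: the key is already present, so a read lock plus a relaxed
    // fetch_add is enough thanks to the interior mutability of AtomicUsize.
    if let Some(count) = counts.read().unwrap().get(&key) {
        count.fetch_add(1, Ordering::Relaxed);
        return;
    }
    // Slow path: take the write lock to insert the key; another thread may
    // have inserted it in the meantime, so go through the entry API.
    counts
        .write()
        .unwrap()
        .entry(key)
        .or_insert_with(|| AtomicUsize::new(0))
        .fetch_add(1, Ordering::Relaxed);
}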

I have tried chashmap, but it seems to (a) give completely different results from my current approach and (b) crash non-deterministically. I have also looked into wf_hash (Non-blocking Hash Table in Rust), but it requires cloneable values, which, I think, means that I would need some kind of thread-safe arena to actually hold the AtomicUsize values, handing the hashmap only pointers into it. That arena would then again be a bottleneck, as it would be shared amongst threads.

I would be very grateful for pointers to other implementations, or to other data structures that could fit this use case. The general problem could be described as persistent, concurrent, sparse maps. (They are sparse in the sense that, out of the many possible keys, only a small number are actually present at any given time.)

Could you avoid the contention issue entirely by having each thread fill a private HashMap, and merging these HashMaps at the end? This will cause more memory traffic at the end of the computation, but the scalability benefits are usually worth it.
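
Roughly like this, assuming the data has already been split into per-thread chunks (the chunk layout and the count_all name are made up for the sketch):

use std::collections::HashMap;
use rayon::prelude::*;

// Each parallel task fills its own private map; the partial maps are merged
// sequentially at the end, so the hot loop takes no locks at all.
fn count_all(chunks: &[Vec<(char, char)>]) -> HashMap<(char, char), usize> {
    let partials: Vec<HashMap<(char, char), usize>> = chunks
        .par_iter()
        .map(|chunk| {
            let mut local: HashMap<(char, char), usize> = HashMap::new();
            for &bigram in chunk {
                *local.entry(bigram).or_insert(0) += 1;
            }
            local
        })
        .collect();

    // Merge step: the partial maps are owned here, so no synchronization.
    let mut total = HashMap::new();
    for partial in partials {
        for (bigram, n) in partial {
            *total.entry(bigram).or_insert(0) += n;
        }
    }
    total
}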

If you count bigrams (pairs of letters), perhaps you can avoid a HashMap and use a Vec<AtomicU64> instead? That is, you can map bigrams to disjoint indices directly, without a generic hash function. If you are counting something more complex than bigrams (say, k-mers in DNA), you can generalize this approach with the help of a perfect hash function.
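
For byte bigrams the index space is only 256 * 256, so the whole table can be a flat vector; a rough sketch (the function names are invented here):

use std::sync::atomic::{AtomicU64, Ordering};

// One counter per possible byte bigram: 256 * 256 slots, no hashing at all.
fn new_table() -> Vec<AtomicU64> {
    (0..256 * 256).map(|_| AtomicU64::new(0)).collect()
}

fn count_byte_bigrams(data: &[u8], counts: &[AtomicU64]) {
    for pair in data.windows(2) {
        let idx = pair[0] as usize * 256 + pair[1] as usize;
        // Threads only ever contend on the same counter, never on the table.
        counts[idx].fetch_add(1, Ordering::Relaxed);
    }
}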

Thanks for the suggestions!

I haven't had a chance to benchmark the code with that setup yet. It is definitely worth a try, I think, though I wouldn't be surprised if it slows the program down. Even with the lock contention, the CPU utilization isn't too bad, and I am worried that the sequential merge at the end will introduce too much idle time. Maybe I can make the merge step slightly more parallel by recursively merging pairs of the remaining hashmaps.

The space of possible bigrams is very, very big, and if I used a hash function I would basically be re-implementing a hashmap. But maybe that is actually the right thing to do, seeing how none of the existing maps quite match my use case.

But the set of "actually present" bigrams is bounded by the size of the dataset. One possible hack would be to statically precompute a set of the most common bigrams and use something like

struct Bigrams<'a> {
    // statically known common bigrams; this map is never resized, so its
    // counts can be bumped through the shared reference alone
    common: &'a HashMap<BiGram, AtomicUsize>,
    // everything else still goes through the lock so new keys can be inserted
    rare: RwLock<HashMap<BiGram, AtomicUsize>>,
}

This would basically look like a rayon fold to fill a local HashMap, and then reduce to merge them together. It still may feel serialized as the final reductions deal with larger maps though.
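
Something along these lines; the bigram extraction inside the fold is only a stand-in for whatever the real per-item search does:

use std::collections::HashMap;
use rayon::prelude::*;

// fold fills one private map per rayon job, reduce merges the maps pairwise.
fn count(items: &[String]) -> HashMap<(char, char), usize> {
    items
        .par_iter()
        .fold(HashMap::new, |mut local: HashMap<(char, char), usize>, item| {
            let chars: Vec<char> = item.chars().collect();
            for pair in chars.windows(2) {
                *local.entry((pair[0], pair[1])).or_insert(0) += 1;
            }
            local
        })
        .reduce(HashMap::new, |mut a, b| {
            // merge two partial maps
            for (k, v) in b {
                *a.entry(k).or_insert(0) += v;
            }
            a
        })
}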

The data set here is changing throughout the execution, and the frequency of bigrams will vary wildly between iterations.

[quote="cuviper, post:6, topic:10463"]
This would basically look like a rayon fold to fill a local HashMap, and then reduce to merge them together. It still may feel serialized as the final reductions deal with larger maps though.
[/quote]

Oh, I didn't know that existed. That would certainly be easier than coding it myself. I will report back once I have done more benchmarks. I guess there might be a way to calculate a good capacity estimate to at least avoid too many unnecessary allocations while merging.
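
For what it's worth, one way to keep reallocation down during the merge would be to always fold the smaller map into the larger one and reserve capacity up front, something like this (just a sketch):

use std::collections::HashMap;

// Merge the smaller map into the larger one, reserving for the worst case
// (all keys distinct) so each merge reallocates at most once.
fn merge(
    mut a: HashMap<(char, char), usize>,
    mut b: HashMap<(char, char), usize>,
) -> HashMap<(char, char), usize> {
    if a.len() < b.len() {
        std::mem::swap(&mut a, &mut b);
    }
    a.reserve(b.len());
    for (k, v) in b {
        *a.entry(k).or_insert(0) += v;
    }
    a
}

A function like that could then be used directly as the reduce step.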
