Hi all!
I'm constructing a HashMap<u64, u32>
for use as an index; this index gets rather large, taking up around 300 GiB of memory. At the moment, I'm (de-)serializing it with serde bincode, which works fine, but deserialization in particular takes a lot of time. I think (please correct me if I'm wrong!) that this is because deserialization works by inserting one entry at a time into a new HashMap (with a new Hasher), i.e. hashes need to be re-calculated and the HashMap probably needs to be resized a lot.
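For reference, the current setup looks roughly like this (a minimal sketch assuming bincode 1.x and buffered file I/O; the function names and path handling are just illustrative):

```rust
use std::collections::HashMap;
use std::error::Error;
use std::fs::File;
use std::io::{BufReader, BufWriter};

// Write the index once after construction.
fn write_index(path: &str, index: &HashMap<u64, u32>) -> Result<(), Box<dyn Error>> {
    let writer = BufWriter::new(File::create(path)?);
    bincode::serialize_into(writer, index)?;
    Ok(())
}

// Read it back; this is the slow part, since the HashMap is rebuilt
// entry by entry (re-hashing every key and growing the table as it goes).
fn read_index(path: &str) -> Result<HashMap<u64, u32>, Box<dyn Error>> {
    let reader = BufReader::new(File::open(path)?);
    let index = bincode::deserialize_from(reader)?;
    Ok(index)
}
```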
I've been thinking that, since I'm only doing lookups after deserialization, I don't really need to store the keys, only their hashes along with the hash function(s), as long as there are no hash collisions.
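A rough sketch of one lookup-only layout in that spirit: keep the u64 keys themselves in a Vec sorted by key and binary-search at lookup time, so that loading is just deserializing a flat array, with no re-hashing and no table rebuild (assumes serde's derive feature; the type and method names are illustrative):

```rust
use std::collections::HashMap;

/// A read-only index stored as a flat Vec of (key, value) pairs, sorted by key.
/// (De-)serializing a Vec with bincode is a straight pass over the data:
/// no hashing and no resizing on load.
#[derive(serde::Serialize, serde::Deserialize)]
struct FlatIndex {
    entries: Vec<(u64, u32)>,
}

impl FlatIndex {
    /// Build once from the HashMap, sorting by key.
    fn from_map(map: &HashMap<u64, u32>) -> Self {
        let mut entries: Vec<(u64, u32)> = map.iter().map(|(&k, &v)| (k, v)).collect();
        entries.sort_unstable_by_key(|&(k, _)| k);
        FlatIndex { entries }
    }

    /// O(log n) lookup via binary search on the sorted keys.
    fn get(&self, key: u64) -> Option<u32> {
        self.entries
            .binary_search_by_key(&key, |&(k, _)| k)
            .ok()
            .map(|i| self.entries[i].1)
    }
}
```

Whether binary search is fast enough compared to a hash lookup at this scale is of course part of the question.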
So my question is: what's the best way to store such an index (which usually only needs to be constructed and written once) so that it can be read "quickly" (which happens far more frequently)?
(An alternative approach might be to use abomonation, which seems like it could be useful here (as long as the index stays on the same machine). However, it does not support HashMap yet.)