HashMap performance

The documentation on Hasher and BuildHasher could go into that though.

Is there a reason for this, or does this need fixing?

I believe it is to save memory. If you had a single array of structs where each struct contained the hash, key, and data, then you could waste a lot of memory on padding in that struct. Having parallel arrays ensures that no memory is wasted on padding.
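
To make the padding point concrete with a hypothetical example (the `Entry` type below is invented, not std's layout): a slot holding a u64 hash next to a small key and value gets padded out to the u64's 8-byte alignment, while parallel arrays pack each field densely.

```rust
use std::mem::size_of;

// Hypothetical array-of-structs slot: a u64 hash next to a 1-byte key and
// a 1-byte value. The u64 forces 8-byte alignment, so 10 bytes of payload
// are padded up to 16 bytes per slot.
#[allow(dead_code)]
struct Entry {
    hash: u64,
    key: u8,
    value: u8,
}

fn main() {
    println!("array-of-structs slot: {} bytes", size_of::<Entry>()); // 16
    // Parallel arrays store the same fields densely: 8 + 1 + 1 = 10 bytes
    // per slot, with no per-slot padding.
    println!(
        "parallel arrays, per slot: {} bytes",
        size_of::<u64>() + size_of::<u8>() + size_of::<u8>() // 10
    );
}
```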

There might also be cache benefits if you need to walk one of the arrays but not the others, but I'm not sure how often that applies in this situation (it could only happen when there's a collision).

That seems like a serious tradeoff of cache locality to save memory. I can easily imagine wanting to make the opposite tradeoff more often, or tuning the size of the keys and values to eliminate the padding.

The rationale is actually in a source comment; it's for space saving and to allow cache-efficient scanning of hash values at very high load factors.

Since Rust's hash tables use open addressing, "collisions" isn't really the applicable term, but "load factor" (the fraction of the slots that are occupied) is. Rust's hash table stores the hash value as well as the key, since for most key types a u64 comparison will be faster than a key comparison, and the full u64 hash value is very unlikely to collide.
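
As a rough sketch of why the stored hash helps (this is not the std implementation; the names and layout below are invented for illustration), a probe loop can compare the cheap stored u64 first and fall back to the real key comparison only on a full-hash match:

```rust
/// Illustrative lookup in an open-addressed table that keeps the full
/// 64-bit hash alongside each slot.
fn find_slot<K: Eq>(hashes: &[u64], keys: &[Option<K>], hash: u64, key: &K) -> Option<usize> {
    let mask = hashes.len() - 1; // capacity is a power of two
    let mut idx = (hash as usize) & mask;
    loop {
        match &keys[idx] {
            // An empty slot ends the probe sequence: the key isn't present.
            // (The load-factor cap guarantees empty slots exist.)
            None => return None,
            Some(k) => {
                // Cheap u64 comparison first; the possibly expensive key
                // comparison only runs on a full-hash match, which almost
                // always means the keys really are equal.
                if hashes[idx] == hash && k == key {
                    return Some(idx);
                }
            }
        }
        idx = (idx + 1) & mask; // linear probing, kept simple for the sketch
    }
}
```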

This is the really telling comment, and it contains something that might be a logic error? Rust's hash tables are designed to operate at a maximum load factor of 90.9%, under which circumstance scanning just the hash values makes a lot of sense; but the load factor in practice is much lower than that, because the table capacity can only be a power of 2. Assuming your data satisfies Benford's law (i.e. sizes are approximately log-uniform over small ranges) and my math is right, that gives an expected load factor of \( 90.9\% \times \int_0^1 2^{-x}\,dx \approx 66\% \).
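
Making that arithmetic explicit: if the number of elements is log-uniform, the table sits a uniformly random fraction \( x \in [0, 1) \) of a doubling below its resize threshold, so the load factor is \( 0.909 \cdot 2^{-x} \) and

\[
\mathbb{E}[\text{load factor}] = 0.909 \int_0^1 2^{-x}\,dx
= 0.909 \cdot \frac{1 - \tfrac{1}{2}}{\ln 2}
= \frac{0.909}{2\ln 2}
\approx 0.909 \times 0.721
\approx 0.66 .
\]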

I'm working on a PR to make the layout hhhhkvkvkvkv. It's rare to have to probe the kv array itself, since we're using 63-bit hashes (most implementations use 32 or even 31 bits). Both lookups and inserts are faster in pretty much every test, and I'm still trying to construct a test that shows a regression in keys/values iteration.

PR: Cache conscious hashmap table by arthurprs · Pull Request #36692 · rust-lang/rust · GitHub
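
For anyone not following the shorthand, here is a rough sketch (illustrative only, not the PR's actual code, which manages raw, possibly-uninitialized memory) of what the hhhhkvkvkvkv layout means:

```rust
struct Table<K, V> {
    // "h h h h ...": one dense array of 64-bit hashes, so probing scans a
    // compact, cache-friendly region even at high load factors.
    hashes: Vec<u64>,
    // "kv kv kv kv ...": each key stored next to its value, so a hit
    // touches a single (K, V) slot instead of two separate arrays.
    pairs: Vec<(K, V)>,
}
```

Compared with the old hhhhkkkkvvvv parallel arrays, probing still scans only the hash array; the change is that a successful lookup then reads one adjacent (key, value) slot rather than two separate ones.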
