Crate idea: fxhasher


#1

FxHasher is what the compiler uses in a lot of places. Using it mostly requires understanding its properties and when it’s best (it’s faster than fnv hash in many cases).

/// A speedy hash algorithm for use within rustc. The hashmap in libcollections
/// by default uses SipHash which isn't quite as speedy as we want. In the
/// compiler we're not really worried about DOS attempts, so we use a fast
/// non-cryptographic hash.
///
/// This is the same as the algorithm used by Firefox -- which is a homespun
/// one not based on any widely-known algorithm -- though modified to produce
/// 64-bit hash values instead of 32-bit hash values. It consistently
/// out-performs an FNV-based hash within rustc itself -- the collision rate is
/// similar or slightly worse than FNV, but the speed of the hash function
/// itself is much higher because it works on up to 8 bytes at a time.

Code: https://github.com/rust-lang/rust/blob/f88b24b34c6d17ebe4014bec5a0f7c2a57c529c7/src/librustc_data_structures/fx.rs


#2

Hello,

Mind if I ask, whats the benefit of using FxHasher compared to SeaHash

I’m trying to figure out what non-cryptographic hasher to use


#3

SeaHash’s docs talk mostly about throughput, which is important if you are hashing a large chunk of data. It doesn’t immediately tell me if SeaHash can compete with FxHasher if the key to hash is one u32, one u64 or a struct composed of a few fields of those.

FxHasher, just like FnvHasher, they do not compete in throughput for long chunks of data.


#4

I wrote a small static round robin hashmap implementation using FxHash as a backend and it was quite impressive for my use case. Lookups were about 5x faster than phf if I recall (based on about 4k entries with u32 indices).

The author responsible for implementing FxHash in rustc also had a few remarks (here):

Make sure you observe the rustc license (of course) and you should probably make it clear in the docs that it’s not a “well-designed” hash and so may not be suitable in all situations.


#5

Here is one benchmark. Benchmarks only answer which algorithm does better for that exact benchmark, but it’s better than guessing. With lto, seahash is on par with fnv for u32 keys, so it looks like seahasher is a good candidate. Hey with an fxhasher crate it could easily be in the comparison.


#6

I have been using fxhash for a personal project, so I’ve decided to release it as a crate: fxhash.