Quickly hashing tiny objects

I have a program where over 70% of my time is spent hashing tiny objects, such as u64, (u32, u64), and (u64, Wrapping<u64>).

I do not need cryptographically strong hashes, just "reasonable" ones (in particular, I wouldn't want the hash function for u32 or u64 to just be the identity function).

It seems (from looking at valgrind's cachegrind output) that the amount of code produced for hashing a u64 is significant. I tried a couple of alternative crates, such as seahash, but even then, if I do something like the following to hash a single u64, the number of instructions generated is huge. Is there a better way to hash small objects (other than writing some small custom hash functions), or could I be misinterpreting the profiles I'm seeing?

use std::hash::{Hash, Hasher};
use std::num::Wrapping;

pub fn do_hash<T>(obj: T) -> Wrapping<usize>
where
    T: Hash,
{
    let mut hasher = seahash::SeaHasher::default();
    obj.hash(&mut hasher);
    Wrapping(hasher.finish() as usize)
}

If your keys are integers, you could try using a BTreeMap and skip the hashing entirely.

fxhash is good at hashing small integers.
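For a sense of why fxhash is so cheap on small integers, here is a minimal, stdlib-only sketch of its core rotate-xor-multiply step (the constant is the 64-bit multiplier used by the Fx family of hashes; the function names are mine, not the crate's API):

```rust
use std::num::Wrapping;

// Multiplier used by the Fx family of hashes (hedged: taken from
// rustc's FxHasher; this sketch is not the fxhash crate itself).
const K: Wrapping<u64> = Wrapping(0x51_7c_c1_b7_27_22_0a_95);

// Mix one 64-bit word into the running state: rotate, xor, multiply.
fn fx_mix(state: Wrapping<u64>, word: u64) -> Wrapping<u64> {
    (Wrapping(state.0.rotate_left(5)) ^ Wrapping(word)) * K
}

// Hashing a single u64 is just one mix step: a handful of instructions.
fn fx_hash_u64(x: u64) -> u64 {
    fx_mix(Wrapping(0), x).0
}

fn main() {
    // The result is not the identity function, and distinct inputs
    // get distinct hashes here (a single multiply by an odd constant
    // is a bijection on u64).
    assert_ne!(fx_hash_u64(1), 1);
    assert_ne!(fx_hash_u64(1), fx_hash_u64(2));
    println!("{:016x}", fx_hash_u64(1));
}
```

The whole thing is a rotate, a xor, and a multiply, which is why it inlines down to a few instructions where a general-purpose hasher like SipHash or SeaHash cannot.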


I agree that BTreeMap is a good data structure, and some people do reach for HashMaps too quickly. In my case, however, I'm not actually using the hashes for hash maps but for something else.

I don't want to get into details (if I can avoid it). I'm implementing this academic paper: [1911.04783] Permutation group algorithms based on directed graphs (extended version), which basically comes down to hashing lots of things, adding up lists of the resulting hashes (I realise adding hashes isn't a usual thing to do either!), and seeing whether they are equal.


So it seems fxhash was the right choice. In particular, it has a hash64 function which, if I also enable link-time optimisation, gets all the hashing inlined, leading to a half-dozen CPU instructions. Now hashing is only taking 15% of my runtime, so I'll go look for other things to optimise :slight_smile:

