Can `Hash` be implemented in a way to maximize chances for locality?

Hello, I want to try to avoid writing a custom data structure by abusing HashMap and the Hash trait. My problem is that I need to store data hierarchically: DeviceId -> PropertyId -> Value. I would prefer it all to be in a single allocation, so HashMap<(DeviceId, PropertyId), Value> seems nice. But the hash table could be a couple of megabytes then, and the hashes of different properties of the same device would get spread across totally different pages.

My idea is to make a key type whose hash preserves random behavior with respect to the device id, but is linear with respect to the property id, so that, in most cases, those entries get placed together in memory.

Is it compatible with the swiss table and how hashbrown implements it? And could you advise on the Hash implementation, if it is possible? Thank you!

EDIT: device IDs are extremely random to begin with

I don't see why not. After all, nohash exists and is recommended for some uncommon use cases.

SwissTable uses the hash modulo the table size (a power of two, so effectively hash & (buckets - 1)) as the slot for an element, so if you calculate the hash as hash(deviceId) + propertyId then they'll be contiguous, yup. (Well, or split in two if they happen to wrap around the end of the table.)
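A quick sketch of that placement math, assuming a power-of-two table size and low-bits indexing (the constants here are made up for illustration):

```rust
fn main() {
    // Assumed: the table keeps a power-of-two number of buckets and
    // derives the slot index as `hash & (buckets - 1)`.
    let buckets: u64 = 128;
    let device_hash: u64 = 0xDEAD_BEEF_1234_5678; // pretend this is hash(deviceId)

    // Add the property id linearly on top of the device hash.
    let idx: Vec<u64> = (0..4)
        .map(|property| device_hash.wrapping_add(property) & (buckets - 1))
        .collect();

    // Consecutive property ids land in consecutive slots
    // (unless they wrap around the end of the table).
    println!("{idx:?}");
    assert!(idx.windows(2).all(|w| w[1] == (w[0] + 1) % buckets));
}
```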

You'd probably want to do that by making your own hasher for it. Bevy's EntityHasher does (or at least used to do) something similar to keep entity indexes near each other, so you could look at that for inspiration.
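Something like this, maybe. A minimal sketch of such a hasher, assuming the key is two u64s hashed device-first, and that device ids are already random (as per the edit above) so a cheap multiply is enough mixing. The names (Key, LocalityHasher, LocalityBuild) are made up for illustration:

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};

// Hypothetical key type: random device id, small/sequential property id.
#[derive(PartialEq, Eq, Clone, Copy)]
struct Key {
    device: u64,
    property: u64,
}

impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Device first, property second; the hasher below treats the
        // first write as "mix" and the second as "add linearly".
        state.write_u64(self.device);
        state.write_u64(self.property);
    }
}

// Sketch of a hasher that mixes the first u64 (device id) but only
// adds subsequent u64s (property id), so properties of one device
// land in adjacent slots.
#[derive(Default)]
struct LocalityHasher {
    hash: u64,
    seen_first: bool,
}

impl Hasher for LocalityHasher {
    fn write(&mut self, _bytes: &[u8]) {
        unimplemented!("this sketch only supports write_u64 keys");
    }
    fn write_u64(&mut self, n: u64) {
        if !self.seen_first {
            // Device ids are already random, so one odd-constant multiply
            // is enough to spread them over the table.
            self.hash = n.wrapping_mul(0x9E37_79B9_7F4A_7C15);
            self.seen_first = true;
        } else {
            // Property id shifts the hash linearly -> neighboring slots.
            self.hash = self.hash.wrapping_add(n);
        }
    }
    fn finish(&self) -> u64 {
        self.hash
    }
}

#[derive(Default)]
struct LocalityBuild;

impl BuildHasher for LocalityBuild {
    type Hasher = LocalityHasher;
    fn build_hasher(&self) -> LocalityHasher {
        LocalityHasher::default()
    }
}

fn main() {
    let mut map: HashMap<Key, &str, LocalityBuild> = HashMap::default();
    map.insert(Key { device: 42, property: 0 }, "a");
    map.insert(Key { device: 42, property: 1 }, "b");

    // Hashes of neighboring properties differ by exactly 1.
    let h = |k: &Key| {
        let mut s = LocalityBuild.build_hasher();
        k.hash(&mut s);
        s.finish()
    };
    let h0 = h(&Key { device: 42, property: 0 });
    let h1 = h(&Key { device: 42, property: 1 });
    assert_eq!(h1, h0.wrapping_add(1));
}
```

Whether adjacent hashes actually end up in adjacent slots depends on the table's internal indexing staying as it is today, so treat this as a best-effort locality hint rather than a guarantee.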
