Trait proposal: hash::ToHash


Generating hash values is sort of awkward:

use std::hash::{Hash, Hasher, SipHasher};

let foo = 1;
let mut hasher = SipHasher::default();
foo.hash(&mut hasher);
let hashed_foo = hasher.finish();

It would be much nicer if we could just call 1.to_hash(). Another limitation of the hash::Hash trait is that it makes hash memoization difficult. Hashing large objects can be an expensive operation; it would be nice if we could build a wrapper object that precomputes the hash on creation and then reuses the precomputed value each time it's needed.

Here’s what I’m proposing:

trait ToHash<H>: PhantomFn<H> {
    fn to_hash(&self) -> u64;
}

The idea is to provide a method that returns an object's hashed value. The trait could be implemented for any type that implements Hash, using any Hasher that implements Default:

impl<T> ToHash<SipHasher> for T where T: Hash {
    fn to_hash(&self) -> u64 {
        let mut hasher = SipHasher::default();
        self.hash(&mut hasher);
        hasher.finish()
    }
}

Aside from simplifying hash generation, the trait would enable hash memoization; for instance, the following wrapper could be used:

struct Hashed<T, H> {
    hash: u64,
    value: T,
    marker: PhantomData<H>,
}

impl<T, H> Hashed<T, H> where T: Hash, H: Hasher + Default {
    fn new(value: T) -> Hashed<T, H> {
        let mut hasher = <H as Default>::default();
        value.hash(&mut hasher);
        Hashed { hash: hasher.finish(), value: value, marker: PhantomData }
    }
}

impl<T> ToHash<SipHasher> for Hashed<T, SipHasher> {
    fn to_hash(&self) -> u64 {
        self.hash
    }
}

The impact on the rest of std would be fairly minimal; the trait could replace Hash in a few places where memoization would be useful. For instance, HashSet could be changed to look something like:

pub struct HashSet<T, H> where T: ToHash<H>, H: Hasher { ... }

This would allow Hashed<T,H> instances to be added to multiple sets without recomputing the hash each time.
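For reference, here is a minimal, compilable sketch of the proposal against today's std. It substitutes DefaultHasher for the deprecated SipHasher and drops PhantomFn (which no longer exists); the ToHash and Hashed names are taken from the proposal above, everything else is illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::marker::PhantomData;

// Proposed trait: return an object's hashed value directly.
trait ToHash<H> {
    fn to_hash(&self) -> u64;
}

// Blanket impl for any Hash type, using a Default-constructed hasher.
impl<T: Hash> ToHash<DefaultHasher> for T {
    fn to_hash(&self) -> u64 {
        let mut hasher = DefaultHasher::default();
        self.hash(&mut hasher);
        hasher.finish()
    }
}

// Wrapper that memoizes the hash at construction time.
struct Hashed<T, H> {
    hash: u64,
    value: T,
    marker: PhantomData<H>,
}

impl<T: Hash, H: Hasher + Default> Hashed<T, H> {
    fn new(value: T) -> Hashed<T, H> {
        let mut hasher = H::default();
        value.hash(&mut hasher);
        Hashed { hash: hasher.finish(), value, marker: PhantomData }
    }
}

fn main() {
    let n: u32 = 42;
    let wrapped: Hashed<u32, DefaultHasher> = Hashed::new(n);
    // The memoized hash matches a fresh computation for the same value,
    // because DefaultHasher::default() is deterministic within a process.
    assert_eq!(wrapped.hash, n.to_hash());
    println!("value = {}, hash = {}", wrapped.value, wrapped.hash);
}
```

Note that this sketch works only because DefaultHasher::default() always produces the same keys within a process; a randomly keyed hasher would not give this guarantee.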

Sound useful to anyone?


No it wouldn’t. Not in general, at least. Hashers have state; in the case of SipHash that’s something like a 128-bit random key. If you tried to use the hashes from one HashSet in another, it probably wouldn’t work: the memoized hash you insert the value under wouldn’t match a lookup with an equivalent key that isn’t memoized. Your HashSets would become black holes where values enter but can never be found.
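The keyed-state point can be demonstrated with std's RandomState, which seeds each hasher it builds with per-instance random keys (each HashMap/HashSet gets its own). This is just an illustration of the objection, not part of any proposed API; the hash_with helper is made up for the example:

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};

// Hash a value with a hasher built from the given state
// (this is what a HashSet does internally with its own state).
fn hash_with<T: Hash>(state: &RandomState, value: &T) -> u64 {
    let mut hasher = state.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let set_a_state = RandomState::new();
    let set_b_state = RandomState::new();

    // Within one state, hashing is repeatable -- memoizing against a
    // single set's state would be sound.
    assert_eq!(hash_with(&set_a_state, &"foo"),
               hash_with(&set_a_state, &"foo"));

    // Across states the random keys differ, so the hashes (almost
    // certainly) differ: a hash memoized against one set's state is
    // useless for lookups in a differently keyed set.
    assert_ne!(hash_with(&set_a_state, &"foo"),
               hash_with(&set_b_state, &"foo"));
}
```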