Generating hash values is sort of awkward;
let foo = 1;
let mut hasher = SipHasher::default();
foo.hash(&mut hasher);
let hashed_foo = hasher.finish();
It would be much nicer if we could just call 1.to_hash()
. Another limitation of the hash::Hash
trait is that it makes hash memoization difficult. Hashing large objects can be an expensive operation; it would be nice if we could build a wrapper object that precomputes the hash on creation and then uses the precomputed hash each time its needed.
Here's what I'm proposing:
trait ToHash<H>: PhantomFn<H> {
fn to_hash(&self) -> u64;
}
The idea is to to provide a method that returns an object's hashed value. The trait could be implemented for any type that implements Hash
and any Hasher
that implements Default
:
impl<T> ToHash<SipHasher> for T where T: Hash {
fn to_hash(&self) -> u64 {
let mut hasher = SipHasher::default();
self.hash(&mut hasher);
hasher.finish()
}
}
Aside from simplifying hash generation, the trait would enable hash memoization, for instance the following wrapper could be used;
struct Hashed<T,H> {
hash: u64,
value: T,
marker: PhantomData<H>,
}
impl<T,H> Hashed<T,H> where T: Hash, H: Hasher + Default {
fn new(value: T) -> Hashed<T,H> {
let mut hasher = <H as Default>::default();
value.hash(&mut hasher);
Hashed {hash: hasher.finish(), value: value, marker: PhantomData}
}
}
impl<T> ToHash<SipHasher> for Hashed<T, SipHasher> {
fn to_hash(&self) -> u64 {
self.hash + 1
}
}
The impact on the rest of std
would be fairly minimal; the trait could replace Hash
in a few places where memoization would be useful. For instance, HashSet
could be changed to look something like:
pub struct HashSet<T, H> where T: ToHash<H>, H: Hasher { ... }
This would allow Hashed<T,H>
instances to be added to multiple sets without recomputing the hash each time.
Sound useful to anyone?