Implement custom HashSet

I would like to create a set of "products". As you may know, the product is commutative. That is, the pair (2, 3) is equivalent to (3, 2). I tried the following:

use std::collections::HashSet;

fn main() {
    let mut items: HashSet<Product> = HashSet::new();
    items.insert(Product(2, 3));
    items.insert(Product(3, 2));
    println!("{:?}", items);

    // expected: {Product(2, 3)}
    // returns:  {Product(2, 3), Product(3, 2)}
}

#[derive(Debug, Eq, Hash)]
struct Product(u64, u64);

impl PartialEq for Product {
    fn eq(&self, other: &Self) -> bool {
        self.0 * self.1 == other.0 * other.1
    }
}

But the previous approach doesn't work. Should I have to implement the trait Hash as well?

Yes, Hash needs to be implemented so that it is consistent with the notion of equality for the type.

Also something to think of after you get that to work - how to handle overflow, if the product is too big to represent in the numeric type?

1 Like

Any time you want to work with a pair (set of 2 elements) vs. a tuple (sequence of 2 elements), the trick is to use a tuple (since that's how data structures reason), with an imposed ordering:

#[derive(
    Debug,
    Clone, Copy,
    PartialEq, Eq, // it just works
    Hash, // it just works
    PartialOrd, Ord, // not really meaningful, but at least it is consistent
)]
pub
struct Pair<T : Ord> (T, T);

impl<T : Ord> Pair<T> {
    #[inline]
    pub
    fn new (x: T, y: T)
        -> Self
    {
        if x < y {
            Self(x, y)
        } else {
            Self(y, x)
        }
    }

    #[inline]
    pub
    fn as_ordered_tuple (self: &'_ Self)
        -> (&'_ T, &'_ T)
    {
        (&self.0, &self.1)
    }
}

assert_eq!(Pair::new(3, 4), Pair::new(4, 3));

In your case, if you want a set where the pairs compare equal based on equality of the product of the two elements, I'd use a newtype using HashMap<u64, Pair<u64>> as the underlying type; or having something like:

#[derive(
    Clone, Copy
)]
struct Product {
    factors: Pair<u64>,
    result: u64,
}

impl Eq for Product {}
impl PartialEq for Product {
    fn eq (self: &'_ Self, other: &'_ Self)
        -> bool
    {
        self.result == other.result
    }
}

impl Hash for Product {
    fn hash<H : Hasher> (self: &'_ Self, state: &'_ mut H)
    {
        self.result.hash(state)
    }
}
3 Likes

Ok, that wasn't a good idea. In any case, here's an implementation of Hash, which solves the previous problem:

use std::collections::HashSet;
use std::hash::{Hash, Hasher};

fn main() {
    let mut items: HashSet<Product> = HashSet::new();
    items.insert(Product(2, 3));
    items.insert(Product(3, 2));
    println!("{:?}", items);

    // expected: {Product(2, 3)}
    // returns:  {Product(2, 3), Product(3, 2)}
}

#[derive(Debug, Eq)]
struct Product(u64, u64);

impl PartialEq for Product {
    fn eq(&self, other: &Self) -> bool {
        self.0 * self.1 == other.0 * other.1
    }
}

impl Hash for Product {
    fn hash<H: Hasher>(&self, _: &mut H) {
        // hu hi, hu ha ha, pim pam
        // toma lacasitos
    }
}

Note: Yandros' solution is better than mine.

Is your intent only to map away the commutativity? Or do you want to treat any pairs with the same product as equal? For example, (2, 15), (3, 10), and (5, 6) all have the same product 30, thanks to the primitive factors (2, 3, 5). The product-based solution will consider these equal.

If you don't want those to be considered equal, then I would do your equality and hashing based on a normalized order -- first compare/hash the minimum of the pair, then the maximum. (Like @Yandros's imposed ordering, whether you do that at construction or during the comparison.)

It would also be ok if the hash used just the product while equality is more precise. Equal values must have the same hash, but unequal values can have the same or different hashes.

1 Like