Insert returns false without element in hashset

Dear all,

New to Rust. In my first program, I'm using a HashSet on a custom datatype with the Hash trait.
When inserting into the hashset insert function returns false, even though the hashset does not contain the value.

For a top-level loop like this:
      for p in cs1.clone() {
        for q in cs2.clone() {
          n.insert(ElementPair::new(&p.clone(),&q.clone()));
        }
      }

This would be an example output.

insert(): warning: duplicate element (6,8)
(6,3)(6,7)(6,6)(6,5)(6,12)
insert(): warning: duplicate element (6,9)
(6,11)(6,3)(6,7)(6,6)(6,5)(6,12)(6,4)

This happens non-deterministically over multiple runs.
The only thing I can imagine is that there's a hashmap collision, but that seems unlikely.
Any idea what I'm doing wrong in my code?
(Other than cloning all over the place, but ignore that inefficiency for now. Rust looks more than fast enough even with that.)

More relevant info:

pub trait Element : Clone + Eq + PartialEq + PartialOrd + std::fmt::Display + std::hash::Hash {}
impl Element for usize {}
#[derive(Clone, Hash)]
pub struct ElementPair<P,Q> where P: Element, Q: Element {
  p: Box<P>,
  q: Box<Q>
}
//... impl of all Element traits ...
impl<P: Element,Q: Element> Element for ElementPair<P,Q> { }
#[derive(Clone)]
pub struct Poset<E> where E : Element {
  set: HashSet<E>
}
pub fn insert (&mut self, e: E) -> bool { self.set.insert(e.clone()); }

Thanks,
Kees

-------- and the program (stripped to its essence) --------

    use std::cmp::Ordering;
    use std::collections::HashSet;

    /// elements of a poset need to implement the Element trait
    /// which is just a set of standard traits
    pub trait Element : Clone + Eq + PartialEq + PartialOrd + std::fmt::Display + std::hash::Hash {}

    ////////// ////////// ////////// //////////

    impl Element for usize {}
    // various other (more interesting) Element implementations omitted

    ////////// ////////// ////////// //////////

    /// a pair of Elements is again an element
    #[derive(Clone, Hash)]
    pub struct ElementPair<P,Q> where P: Element, Q: Element {
      p: Box<P>,
      q: Box<Q>
    }

    impl<P: Element, Q: Element> PartialEq for ElementPair<P,Q> {
      fn eq(&self, other: &Self) -> bool {
        self.p.partial_cmp(&other.p) == Some(Ordering::Equal)
      }
    }

    impl<P: Element, Q: Element> Eq for ElementPair<P,Q> {}

    impl<P: Element, Q: Element> PartialOrd for ElementPair<P,Q> {
      /// lexicographic ordering: p1 <= p2 and then q1 <= q2
      fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        let pe = self.p.partial_cmp(&other.p);
        let qe = self.q.partial_cmp(&other.q);
        match (pe,qe) {
          (None, _) | (_,None) => None,
          (Some(Ordering::Less), _) => Some(Ordering::Less),
          (Some(Ordering::Greater), _) => Some(Ordering::Greater),
          (Some(Ordering::Equal), Some(Ordering::Less)) => Some(Ordering::Less),
          (Some(Ordering::Equal), Some(Ordering::Greater)) => Some(Ordering::Greater),
          (Some(Ordering::Equal), Some(Ordering::Equal)) => Some(Ordering::Equal)
        }
      }
    }

    impl<P: Element, Q: Element> std::fmt::Display for ElementPair<P,Q> {
      fn fmt (&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "({},{})",&self.p,&self.q)
      }
    }

    impl<P: Element,Q: Element> ElementPair<P,Q> {
      pub fn new (p: &P, q: &Q) -> ElementPair<P,Q> {
        ElementPair { p: Box::new(p.clone()), q: Box::new(q.clone()) }
      }
    }

    impl<P: Element,Q: Element> Element for ElementPair<P,Q> { }

    ////////// ////////// ////////// //////////

    /// partially ordered set of elements
    #[derive(Clone)]
    pub struct Poset<E> where E : Element {
      set: HashSet<E>
    }

    // do we want a mutable HashSet? or keep copying the HashSet?
    impl<E : Element> Poset<E> {
      pub fn new() -> Poset<E> { Poset { set: HashSet::<E>::new() } }
      pub fn len(&self) -> usize { self.set.len() }
      pub fn insert (&mut self, e: E) -> bool {
        let c = &e.clone();
        if !self.set.insert(e) {
          println!("insert(): warning: duplicate element {}",c);
          for x in &self.set { print!("{}",x); }; println!("");
          false
        }
        else { true }
      }
    }

    /// iterate over all elements in the poset, also using the standard for loop syntax
    // how to create an iterator that loops over &E?
    impl<E: Element> IntoIterator for Poset<E> {
      type Item = E;
      type IntoIter = std::collections::hash_set::IntoIter<Self::Item>;
      fn into_iter(self) -> Self::IntoIter {
        self.set.into_iter()
      }
    }

    impl<E : Element> std::fmt::Display for Poset<E> {
      fn fmt (&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let mut first = true;
        let _ = write!(f,"[");
        for e in &self.set {
          if first { let _ = write!(f,"{}",e); } else { let _ = write!(f,",{}",e); }
          first = false;
        };
        write!(f,"]")
      }
    }

    fn main()
    {
      let mut cs1 = Poset::<usize>::new();
      let mut cs2 = Poset::<usize>::new();
      for p in 0..10 { cs1.insert(p); cs2.insert(3+p); }
      println!("subsets done sizes {} {}",cs1.len(),cs2.len());
      let mut n = Poset::new();
      for p in cs1.clone() {
        for q in cs2.clone() {
          //println!("{} {}",p,q);
          n.insert(ElementPair::new(&p.clone(),&q.clone()));
        }
      }
      println!("created product len={}",n.len());
      println!("n={}",n);
    }

Indeed. Hash collisions with a broken Eq impl:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5e7190f9c33a3fda7b907ec5a789fb8a

1 Like

Shouldn’t this also be comparing Q?

or

impl<P: Element, Q: Element> PartialEq for ElementPair<P, Q> {
    fn eq(&self, other: &Self) -> bool {
        self.partial_cmp(&other) == Some(Ordering::Equal)
    }
}

Man. That was quick, and correct. I should have realised this could have been a cause. Thanks!

1 Like

Also, I think #[derive(Eq,PartialEq,PartialOrd,Ord)] will produce implementations equivalent to yours. From the PartialOrd docs:

This trait can be used with #[derive] . When derive d on structs, it will produce a lexicographic ordering based on the top-to-bottom declaration order of the struct's members. When derive d on enums, variants are ordered by their top-to-bottom declaration order.

Thanks. I use derives where I can.
The idea behind the Element trait is that I can instantiate it with many things, e.g. bool, usize, sets of anything, natural numbers incl. infinity, and so on. Tuples of Element are Element again, allowing complex Elements with custom PartialOrd per Element to be built up.
The partial_cmp for the ElementPair is lexigraphic ordering, which is different from the default for tuples.
With this I create partially ordered sets of Elements, and can minimise those to Pareto sets.
Amazingly, it all works, and is very efficient even for large Posets.
As an example, this is a (pareto-minimised) poset of one element with type (natinf x (set of nat)) x (nat x (set of nat))
ss min=[(Inf,{0,1,2,4}),(100,{0,1,2,10,11,12})]

// as an example, create sets that are Elements, i.e. have all the right traits (incl. hashable)
#[derive(Clone, Hash)]
pub struct SetElement
where T: Clone + Eq + PartialEq + std::hash::Hash + std::fmt::Display {
set: Vec // cannot use HashSet since isn't hashable
}
... e.g. partial_cmp for sets is subset
impl Element for SetElement
where T: Clone + Eq + PartialEq + std::hash::Hash + std::fmt::Display {}

How so? According to the tuple docs:

The sequential nature of the tuple applies to its implementations of various traits. For example, in PartialOrd and Ord , the elements are compared sequentially until the first non-equal set is found.

Isn’t this exactly what the code you wrote does?

Default ordering of tuples is lexicographic.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.