Insert Rc<RefCell<MyType>> into HashSet

I'm trying to keep track of some struct into a HashSet. This is the struct:

struct DawgState {
    /// Whether or not this is an accepting state
    is_final: bool,

    /// Transitions to next states
    transitions: BTreeMap<char, SharedDawgState>
}

and here is my implementation for the Hash trait:

impl Hash for DawgState {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.is_final.hash(state);

        for key in self.transitions.keys() {
            key.hash(state);
        }
    }
}

As you can see, I only care about the is_final and the keys in the transitions. And finally, I'v implemented PartialEq and Eq which is using the hash to say whether or not 2 structs are the same.

Now, the problem is that, this struct is wrapped by an Rc and a RefCell as such:

type SharedDawgState = Rc<RefCell<DawgState>>;

and I'm trying to have an HashSet<SharedDawgState>.

However, when I try to insert or get in the HashSet, I get the following error:

error[E0599]: the method `get` exists for struct `HashSet<Rc<RefCell<DawgState>>>`, but its trait bounds were not satisfied
 [...]
    |
309 | pub struct Rc<T: ?Sized> {
    | ------------------------ doesn't satisfy `Rc<RefCell<DawgState>>: Hash`
    |
    = note: the following trait bounds were not satisfied:
            `Rc<RefCell<DawgState>>: Hash`

So is there a way to hash the SharedDawgState type into a HashSet? or is there a better way to store these (I need DawgState to be hashed and I don't care about ordering, just fast access) ?

The RefCell indicates you're modifying these even when you have only a reference. This means you could change the keys of the HashSet (and thus their hashes) even when it's owned by the HashSet. This would be problematic, and is probably why RefCell doesn't implement Hash.

You could newtype SharedDawgState instead of type aliasing it, and then you could write your own implementations, but there's a chance you'd run into logic problems / weird behavior for the above reason.

DawgState does implement Hash -- you've implemented it -- but it doesn't contain a HashSet<SharedDawgState> either, unless you've left something out of your example. What are you using the HashSet for?

1 Like

Right, the HashSet is another struct. This one:

/// A DAWG data structure
pub struct Dawg {
    /// The start state
    start: SharedDawgState,

    /// A register for keeping track of equivalent states
    register: HashSet<SharedDawgState>,
}

and the goal of the register is to help me find the equivalent states (same is_final and same transitions keys) and join them.

other than that, what does "newtype SharedDawgState" mean ?

"newtype" means make a new type of your own that wraps some other type (generally so you can change its behavior in some way, e.g. implement traits for it).

Here's an example of an UpDawg newtype, along with an example of how it can confuse HashSet. The example is pretty mild; the HashSet could panic or loop forever instead, for example.

So I don't really recommend it. If you don't need the mutability, you don't need the RefCell, but presumably you do.

2 Likes

I do need mutability for the DAWG struct, is there a way to store only Rc in the register ? or something that just links to a pointer of a DawgSate?

Well, if the goal is to find equivalent states... and these states can dynamically change... you're going to have to recalculate equivalence after the changes. I haven't thought of an alternative that seems good.

The thing is that after being inserted in the register they do not need to be mutable anymore. The mutability matters only before that.

Why not just insert MyType then? Inserting Rc<RefCell<T>> is never a good idea.

2 Likes

Then the newtype is probably fine. Document the requirements and maybe consider some api surface to ensure no modification after insertion.