Trait objects as HashMap keys

Hi all.

I'd like to couple references to Box<dyn Something> to integer values using an HashMap. This is a pretty naive example, obviously not working:

Basically what I'm trying to do is what Python does when using objects (of non-basic types) as dictionary keys. Also, I'm not going to move the HashMap outside the function that creates it and adds references into its keys.

Any Idea?

It's tricky in Rust for a number of reasons. First of all, you're going to have to decide what equality means in this situation. Are b and c equal here?

    let b: Box<dyn Something> = Box::new(Two(20));
    let c: Box<dyn Something> = Box::new(Two(20));

If so, you're probably going to have to use downcasting and thread all the required hash-related bounds through your trait.

If not, hashing the data pointer portion of the Box<dyn Something> may be attractive, but that will also fall down in the case of zero-sized types. So if you want to do that, you probably still want the ability to compare base types and then the check for equality could be something like

same_type(a, b) && (size_of_type(a) == 0 || data_ptrs_equal(a, b))

And the hash could be the hash of the ZST or the hash of the data pointer for non-ZSTs.

But let's take a step back. You say you're not going to move the HashMap. Are you taking any dyn Something inputs? If not -- if you know all the concrete types and they implement the appropriate traits -- perhaps an enum (potentially also implementing Something) would solve your use case.


The data pointer portion of Box<dyn Something> would be enough, or even the pointer to the Box itself (as long as this doesn't hide problems I can't foresee due to my very limited knowledge of what's behind the scene). Even casting the data pointer to usize (is it possible?) and use this integer as key in the HashMap would be good; these pointers are not going to change for the lifespan of my function execution. Very unelegant, but effective if the reaching the goal is going to be really complex.

The zero-sized types case it totally new to me. I guess there's no memory to allocate so... no pointer. Right? This isn't my specific case but... interesting problem.

I could work around this problem assigning a unique identifier to my trait objects via getter/setter methods in the trait definition.

I'll think about the enum idea. Thanks!

Well, what's the goal again? Comparing pointers will give different results for things that "look" the same, like a couple of Two(20) at different locations. So if the goal is for things like that to compare the same (the usual case in Rust), pointer equality isn't really effective. It has different semantics.

I threw something together for comparing and hashing more typically. Note how I insert three times but the map only has two elements, because of the equality. Pointer comparison would give you three.

But you can do &*the_box as *const dyn Something as *const () (and then as usize if you want) to get a pointer or usize, yes.

There's a pointer, but it dangles,[1] and could be the same or different for the same type of ZSTs. (Pointer equality isn't a great fit for Rust IMO.) You could have a ZST implement your trait (there's nothing preventing it). But perhaps you know all the base types you're working with have a non-zero size.

  1. this is fine even for references as all the memory it points to that is valid to read -- 0 bytes -- represents a valid ZST ↩ī¸Ž

I'll take my time to study what you wrote in your example. There's a lot to learn there. Thanks a lot!

To be clear, what I need is that multiple Two(20) objects appear as different keys in the HashMap. So, pointer comparison would be correct in my case.

No, it's not right. You can totally have a pointer to a zero-sized type:

struct Foo;
let x = Foo;
let ptr = &x; // or `*const Foo`, etc.

The problem with zero-sized types is that since their size is zero (gasp!), two pointers to two "distinct" instances of a ZST can point to the same address.

This will give you false positives: inserting one instance of the ZST and then looking up another instance can potentially succeed whereas you wanted it to fail, or vice versa, a second insertion can erroneously kick out an existing instance because their pointers compare equal.

OK, but that just means that idiomatic equality is ineffective instead; they still semantically differ. Anyway, here's a first approximation that treats all type-erased values of the same base ZST type equal; there may be other pointer equality corner cases I've overlooked.

I finally decided to convert the object's address to usize and use this value as HashMap key. Here
This covers perfectly my use case even if it isn't the correct way to make my objects hashable.

I'm actually not very comfortable to use code I can't explain; I'll come back to your solution in the future.

Just one more question: could you explain how the pointer to usize cast works?

self as *const dyn Something as *const () as usize

I presume it's to avoid footguns when using things like as _, but you have to generally take pointer casts one step at a time. Also, if you didn't know, references and pointers to dyn Trait objects are wide references, consisting of a data pointer and a vtable pointer. So we have

self                     // &dyn Something: a wide reference
as *const dyn Something  // *const Something: a wide pointer
as *const ()             // *const (): a thin pointer
                         //     (the data pointer of the *const Something)
as usize                 // usize: integer interpretation of the thin ptr