Hey Guys, first post so be easy on me.
I have the following problem. i want to go over a hashmap and mutate the values based on a key, but also based on a "partial" key. My key looks something like
Just to clarify, do you have a collection of elements and do you want to query the elements matching id: Uuid, or the element matching id: Uuid, r#type: MyEnum?
Could you use HashMap<Uuid, HashMap<MyEnum, T>> where T is your value type?
Isn't that what you'd use some sort of prefix tree for? Maybe storing your data in a HashMap<PartialKey, HashMap<Uuid, V>> or HashMap<PartialKey, Vec<(Uuid, V)>> is efficient enough in practice? Efficiency matters kind of depends on the regular workload you are expecting. I.e. will there be a lot of insertions and deletions or far more MyKey lookups compared to PartialKey lookups? Optimizing data structures for certain operations often means that you have to make trade-offs.
Yeah you are spot on, the problem is that under normal workloads we have a lot of mutable lookups based on both type of keys (more of MyKey than PartialKey) and less insertions/deletions. which is why i thought HashMap<PartialKey, Vec<(Uuid, V)>> does not fit the trade-offs. also usually i think the vec would be of length 1, but i need to make sure of that. I feel like HashMap<PartialKey, HashMap<Uuid, V>> will be good enough, but I wanted to make sure there isn't another option which is used in these situations. The other issue with the nested hashmap is that it requires a lot more branching when working with MyKey because both your first lookup can "fail" and then also the second lookup can also "fail". Where in theory if i can find a value for MyKey i will find atleast 1 value for PartialKey.
An alternative solution, if you can ensure everything implements Ord (Uuid does, assuming it's the one from the uuid crate), would be to use a BTreeMap instead. Then your partial key lookups would correspond to looking up a range using range() or range_mut(). Which solution is better may depend on exactly how you are using the data structure.
It looks like the upper and lower bounds for Uuid would be Uuid::nil() and Uuid::max(), since the ordering is just the default ordering for an array of u8s.
It would be doing a lookup over a tree, ordered by the ordering, so it just needs to find the right one in the order (which is O(log(N)) complexity). For the partial lookups it would be using a lexicographic ordering, where it would look up the things at the start of the key type first (which correspond to the partial keys), and then it can go through the Uuids at the end when it has found the right one (for a range lookup, this just corresponds to returning them one by one in an iterator).