HashMap with Vec<String> key

I would like to store values in hash map by keys of type Vec<String>

This is totally possible, but I do not want to create Vec<String> every time I want to access the hash map. I actually have Vec<&str> when I need to access it.

So, I was thinking I will create new type struct MyKey(Vec<String>) and implement Borrow for it. But it actually looks impossible.

1 Like

Yes, Borrow::borrow must return something that is fully borrowed from the implementing type, you can't construct a new type (the Vec<&str> in your case) inside the method and return that, because the method requires returning a reference.

How about using Cow instead of your newtype approach?

use std::borrow::Cow;
use std::collections::HashMap;

fn main() {
    let map: HashMap<Vec<Cow<'_, str>>, ()> =
        HashMap::from_iter([(vec![Cow::from("foo".to_owned())], ())]);

    assert!(map.get([Cow::from("foo")].as_slice()).is_some());
}

Playground.

2 Likes

That's an interesting idea, but I have to store this hash map, so it has to have static lifetime, and therefore it will be Cow<'static, str>. I am not sure, it will accept non-static str.

#[derive(Debug, Default)]
struct MyStruct {
    map: HashMap<Vec<Cow<'static, str>>, u32>,
}

impl MyStruct {
    // this does not compile: lifetime may not live long enough
    fn get(&self, key: &[&str]) -> Option<&u32> {
        self.map
            .get(&key.iter().map(|s| Cow::Borrowed(*s)).collect::<Vec<_>>())
    }
}

But, actually, I use CowStr crate in my project, which is a refcounted CoW string, and it has SubStr functionality, allowing to me to have a key consisting of such SubStr's. I have to try it.

Yes, it works with CowStr. That is not a universal solution, but I already use CowStr and my keys already come as substrings of CowStr, so it works for me.

use cowstr::{CowStr, SubStr};
use std::collections::HashMap;

#[derive(Debug, Default)]
struct MyStruct {
    map: HashMap<Vec<SubStr>, u32>,
}

fn main() {
    let mut t = MyStruct::default();
    t.map.insert(vec!["a".into(), "b".into()], 1);

    let src: CowStr = "a b".into();
    let key = src
        .split_ascii_whitespace()
        .map(|s| SubStr::from_contained_str(&src, s))
        .collect::<Vec<_>>();
    dbg!(t.map.get(&key));
}

I was having such an API in mind, if that helps you:

use std::borrow::Cow;
use std::collections::HashMap;

#[derive(Debug, Default)]
struct MyStruct<'key> {
    map: HashMap<Vec<Cow<'key, str>>, u32>,
}

impl<'key> MyStruct<'key> {
    fn get<'slf, 'k: 'slf>(&'slf self, key: &[Cow<'k, str>]) -> Option<&'slf u32> {
        self.map.get(key)
    }
}

Playground.

2 Likes

Instead of using a complex value like a Vec<String> or Vec<&str> as a HashMap key, how about calculating a u64 hash value yourself and use that hash value as key for the HashMap?

What if two hashes collide?

2 Likes

Use a good hashing algorithm so that the probability of hash collisions is negligibly small. Or, if you are really concerned, use two different hash algorithms to create (u64, u64).

On second thought, yeah, u64 may be too small, but 128 bits should be fine.

If this is in your user code and not in a library you can also use hashbrown directly. It offers the RawEntry API which allows you to access entries directly with a given hash, and make sure that you're accessing the correct key with a provided comparison function.

See this example: (Playground)

use core::hash::{BuildHasher, Hash};
use hashbrown::hash_map::HashMap;

fn compute_hash<K: Hash + ?Sized, S: BuildHasher>(hash_builder: &S, key: &K) -> u64 {
    use core::hash::Hasher;
    let mut state = hash_builder.build_hasher();
    key.hash(&mut state);
    state.finish()
}

fn main() {
    let mut map = HashMap::new();
    map.extend([
        (vec!["yellow".to_string()], 10),
        (vec!["submarine".to_string()], 20),
    ]);

    let key = &["yellow"];
    let hash = compute_hash(map.hasher(), &key);

    let Some((retrieved_key, value)) = map
        .raw_entry()
        .from_hash(hash, |q| key.iter().zip(q).all(|(x, y)| x == y))
    else {
        panic!("Key not found!")
    };
    assert_eq!(retrieved_key, &vec!["yellow".to_string()]);
    assert_eq!(value, &10);
    println!("Success!");
}

Note that the raw_entry API in std is still in nightly.

If you're using hashbrown, you should just implement Equivalent.

4 Likes

This would require a newtype though, due to orphan rules.

As I recall it's getting removed entirely. If you want raw APIs, use hashbrown directly.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.