HashMap entry API and ownership?

Just curious regarding ownership and the HashMap API. I think I misunderstand both.

use std::collections::HashMap;

fn foo() {
    let mut hm: HashMap<u8, Vec<String>> = HashMap::new();
    let a = "abc".to_owned();
    hm.entry(0)
        .and_modify(|e| e.push(a))
        .or_insert(vec![a]);
}

We get this:

error[E0382]: use of moved value: `a`
  --> src/main.rs:99:25
   |
96 |     let a = "abc".to_owned();
   |         - move occurs because `a` has type `std::string::String`, which does not implement the `Copy` trait
97 |     hm.entry(0)
98 |         .and_modify(|e| e.push(a))
   |                     ---        - variable moved due to use in closure
   |                     |
   |                     value moved into closure here
99 |         .or_insert(vec![a]);
   |                         ^ value used here after move

For more information about this error, try `rustc --explain E0382`.

How would you solve this idiomatically and why? I assume there is a better solution than:

fn foo() {
    let mut hm: HashMap<u8, Vec<String>> = HashMap::new();
    let a = "abc".to_owned();
    hm.entry(0)
        .and_modify(|e| e.push(a.clone()))
        .or_insert(vec![a]);
}

Thank you!

The problem here is with how closures capture values. When a closure captures a value, that capture happens when the closure is constructed, and lasts as long as the closure lives. So "a" is moved out of the closure on line 98 before the function call to and_modify even occurs.

Of course we know those two usages of a are theoretically mutually exclusive, since the entry can't both have a value to modify AND need to insert a default. But the type system doesn't have any way to express that relationship, so we have to use another strategy.

Option::take is a handy method for situations like this

Playground

use std::collections::HashMap;

fn test_entry(hm: &mut HashMap<u8, Vec<String>>) {
    let mut a = Some("abc".to_owned());
    hm.entry(0)
        .and_modify(|e| e.push(a.take().unwrap()))
        .or_insert_with(|| vec![a.take().unwrap(), "or".into()]);
}

fn main() {
    let mut hm: HashMap<u8, Vec<String>> = HashMap::new();

    test_entry(&mut hm);
    test_entry(&mut hm);

    println!("{:?}", hm);
}

Now the closure only captures an &mut a instead of moving out of a, and we use take to get the value out and replace it with None. This does mean we have to call unwrap, but we know these methods should be exclusive so that's not a problem.

I changed or_insert to or_insert_with because that method takes a closure. In your code you were always constructing that Vec even though in the case there was already an entry it would never be used. (It's also important for correctness in the take version, if you switch it back to or_insert you'll get a panic)

There's also an arguably slightly more idiomatic way to do it for this simple case, which is to just exploit the fact that Entry is an enum
Playground

use std::collections::{hash_map::Entry, HashMap};

fn test_entry(hm: &mut HashMap<u8, Vec<String>>) {
    let a = "abc".to_owned();
    match hm.entry(0) {
        Entry::Occupied(mut occupied) => occupied.get_mut().push(a),
        Entry::Vacant(vacant) => {
            vacant.insert(vec![a, "or".into()]);
        }
    }
}

fn main() {
    let mut hm: HashMap<u8, Vec<String>> = HashMap::new();

    test_entry(&mut hm);
    test_entry(&mut hm);

    println!("{:?}", hm);
}

Since the compiler does know the match arms are exclusive, you can just move out of a normally!

5 Likes

I think that this is not a problem of not understanding ownership; this is a problem of code elegance.

IMO the way you are trying to express this isn't the right way. You are conditionally creating a non-empty vector (without pushing), or pushing to it if it exists.

It would be nicer to observe that the empty vector is the identity element of the operation "collect elements in this vector", which naturally lends itself to the shorter, simpler, explicit condition-free, and compiling solution of:

fn foo() {
    let mut hm: HashMap<u8, Vec<String>> = HashMap::new();
    let a = "abc".to_owned();

    hm.entry(0)
        .or_insert_with(Vec::new)
        .push(a);
}

A similar soultion would apply to counting elements: instead of inserting 1 or incrementing conditionally, you would insert 0 (the identity of addition) and always increment.

10 Likes

Of course! Good observation, thank you. I love this. I just changed it to pre-allocate for 1 element. In my real use case it has to be done over millions of elements so it performs a little better that way from the few benchmarks I could do.

Thank you! I do prefer the second way of doing it but I certainly learned something with the first one.

P.S. I wish I could select more than one answer as the solution.

1 Like

Then you might want to try using || Vec::with_capacity(expected_capacity) instead of Vec::new.

3 Likes

In this case, you might want to look into using a crate like tinyvec that can store a few items inline without a separate heap allocation.

2 Likes