How to share a HashMap between threads

Hello,
I'm trying to make a function that count the frequency of letters in multithread.
But I am stuck because I don't know how to share a HashMap between threads.
I tried that :


But I have the following error :
 move occurs because `freq_map` has type `std::sync::Arc<std::sync::RwLock<std::collections::HashMap<char, i32>>>`, which does not implement the `Copy` trait

Can you explain me how to share a hashmap between threads please ?

You must clone the Arc it before spawning the thread.

let freq_map_clone = Arc::clone(&freq_map);
thread::spawn(move || {
    for i in start..end {
        let mut this_freq_map = freq_map_clone
            .write()
            .expect("RwLock poisoned");
        
        ...
    }
});

You will then encounter the same error for all_input, which you will also need to clone before spawning the thread.

1 Like

Hello, Thank you so much it works like that !
I just need to figure out how to return the hashmap and not the RwLock now

Following @alice’s suggestions will make your current code work, but you might not see much performance improvement over a single-threaded solution: each thread needs to take the write lock for every character it wants to tally, so there isn’t a lot of chance for parallelism to happen. Consider having each thread build a temporary HashMap of counts, and then only add those totals to the main result after the local count is finished.

Another improvement would be to only construct the chars() iterator once per thread:

let local_input = all_input.clone();
thread::spawn( move ||
    for c in local_input.chars().skip(start).take(end-start) {
        /* ... */
    }
)

You’ll need to join() all of the worker threads to wait for them to finish, and then use Arc::try_unwrap() and RwLock::into_inner() to get the HashMap out to return.

3 Likes

+1 because I was just going to say that instead of locking, one should accumulate in independent threads and then summarize the partial results to get the final result.

Thank you for your help,
Is it better like that:

    let freq_map = Arc::new(RwLock::new(HashMap::new()));
    let all_input = input.join("");
    let input_len = input.len();
    println!("{}", input_len);
    let mut handles = vec![];
    for n in 0..worker_count {
        let start = n * worker_count;
        let end = start + input_len / worker_count;
        let all_input_clone = all_input.clone();
        let freq_map_clone = Arc::clone(&freq_map.clone());
        handles.push(thread::spawn(move || {
            let mut local_hashmap = HashMap::new();
            for c in all_input_clone.chars().skip(start).take(end - start) {
                *local_hashmap.entry(c).or_insert(0) += 1;
            }
            let mut this_freq_map = freq_map_clone
                .write()
                .expect("RwLock poisoned");
            for (c,count) in local_hashmap {
               *this_freq_map.entry(c).or_insert(0) += count;
            }
        }));
    }
    for handle in handles {
        let _ = handle.join();
    }

or is that even better :

    let mut freq_map =HashMap::new();
    let all_input = input.join("");
    let input_len = input.len();
    println!("{}", input_len);
    let mut handles = vec![];
    for n in 0..worker_count {
        let start = n * worker_count;
        let end = start + input_len / worker_count;
        let all_input_clone = all_input.clone();

        handles.push(thread::spawn(move || {
            let mut local_hashmap = HashMap::new();
            for c in all_input_clone.chars().skip(start).take(end - start) {
                *local_hashmap.entry(c).or_insert(0) += 1;
            }
            return local_hashmap;
        }));
    }

    for handle in handles {
        let local_hashmap = handle.join().unwrap();
        for (c, count) in local_hashmap {
            *freq_map.entry(c).or_insert(0) += count;
        }
    }

They both work; the second is probably closer to what I’d write myself.

Thank you !
Now I don't need it , but for the first , how could have return the value of the hashmap inside :

    let freq_map = Arc::new(RwLock::new(HashMap::new()));

please ?

Once there's no other copy of the Arc around anymore, you can use try_unwrap.

1 Like

It’d be something like this (untested):

let result = Arc::try_unwrap(freq_map).expect("Worker still running!")
                 .into_inner().expect("Lock poisoned!");
1 Like

There are a couple more things wrong with that code:

  • It's incorrect. It doesn't account for the input length not being evenly divisible by the number of workers. This results in some parts of the string being completely ignored.
  • It unnecessarily clones the entire string for every worker.

Here's an updated playground that accounts for these problems.

1 Like