Hi all, I'm learning Rust and currently writing a function to count character frequencies from an array of string slices in parallel.
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

pub fn frequency(input: &[&str], worker_count: usize) -> HashMap<char, usize> {
    let batch_size = input.len() / worker_count;

    fn count_chars(input: Vec<&str>) -> HashMap<char, usize> {
        let mut counter = HashMap::<char, usize>::new();
        for s in input {
            for c in s.chars() {
                counter
                    .entry(c)
                    .and_modify(|count| *count += 1)
                    .or_insert(1);
            }
        }
        counter
    }

    // 1. split the input into worker_count batch-sizes of m
    let mut batched_inputs: Vec<Vec<&str>> = Vec::new();
    for i in 0..worker_count {
        batched_inputs.push(input[i * batch_size..i * batch_size + batch_size].to_vec());
    }

    // 2. spawn threads and pass in the copies of each slice to the workers
    let mut workers: Vec<JoinHandle<_>> = Vec::new();
    for batch in batched_inputs {
        workers.push(thread::spawn(move || count_chars(batch)));
    }

    // 3. each worker returns a hashmap
    let mut results = Vec::new();
    for handle in workers {
        results.push(handle.join().unwrap());
    }

    // 4. the main thread consolidates all the hashmaps into a single hashmap
    results
        .iter()
        .fold(HashMap::new(), |mut acc_counter, curr_counter| {
            for (k, v) in curr_counter {
                acc_counter
                    .entry(*k)
                    .and_modify(|counter| *counter += *v)
                    .or_insert(*v);
            }
            acc_counter
        })
}
However, I'm getting a compiler error about lifetimes and I don't quite understand why. I'd appreciate an explanation and a pointer on how to fix it.
I'm also aware the code isn't optimal (e.g. not needing to store batched_inputs), but I just want to be explicit for my own learning. Thanks again.
Compiler error:
workers.push(thread::spawn(move || count_chars(batch)));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| `input` escapes the function body here
| argument requires that `'1` must outlive `'static`
I've tried removing move, but that doesn't work either.
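
For reference, here is a minimal sketch of one possible way around the error, not necessarily the only or best one. It rests on the fact that thread::spawn requires its closure to be 'static, because the spawned thread may outlive the caller, while each batch still holds &str values borrowed from input. The sketch assumes Rust 1.63 or later, where std::thread::scope is available, and it swaps the manual index-based batching for slice chunks with a rounded-up batch size (my own choice, so no trailing strings are skipped).

```rust
use std::collections::HashMap;
use std::thread;

// Count the characters of every string in one batch.
fn count_chars(batch: &[&str]) -> HashMap<char, usize> {
    let mut counter = HashMap::new();
    for s in batch {
        for c in s.chars() {
            *counter.entry(c).or_insert(0) += 1;
        }
    }
    counter
}

pub fn frequency(input: &[&str], worker_count: usize) -> HashMap<char, usize> {
    // Assumption: treat a worker_count of 0 as 1 instead of dividing by zero.
    let workers = worker_count.max(1);
    // Round up so the last strings are not dropped; `chunks` panics on a
    // chunk size of 0, hence the final max(1).
    let batch_size = ((input.len() + workers - 1) / workers).max(1);

    // Threads spawned through `scope` are all joined before `thread::scope`
    // returns, so they may borrow from `input` without a 'static bound.
    let results: Vec<HashMap<char, usize>> = thread::scope(|scope| {
        // Spawn every worker first, then join them.
        let handles: Vec<_> = input
            .chunks(batch_size)
            .map(|batch| scope.spawn(move || count_chars(batch)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Merge the per-thread maps on the calling thread.
    let mut total = HashMap::new();
    for partial in results {
        for (c, n) in partial {
            *total.entry(c).or_insert(0) += n;
        }
    }
    total
}
```

If scoped threads are not an option, another route would be to give each thread owned data, for example by converting every batch to a Vec<String> before calling thread::spawn, at the cost of copying the input.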