I have code whose pertinent structure is morally equivalent to something like:
fn doit(events: Vec<Event>) -> Vec<Data> {
    let mut accumulator = vec![0.0; A_FEW_THOUSAND];
    for event in events {
        // increment small fraction of `accumulator`'s elements
    }
    accumulator
}
The calculation performed in the loop is embarrassingly parallel, but the results are stored by mutating external state, so parallelizing it isn't trivial. I'm thinking of distributing the events between different processors, and then combining the resulting accumulators at the end. I was hoping to use rayon, but it's not clear to me how to express this idea in it.
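For concreteness, the split-and-combine idea might look roughly like the sketch below. It uses `std::thread::scope` rather than rayon, since that is the part I know how to write. The fields of `Event`, the helper `apply`, the function `doit_parallel` and the value of `A_FEW_THOUSAND` are all hypothetical stand-ins for my real code, and `f64` stands in for `Data`.

```rust
use std::thread;

// Hypothetical stand-in for the real accumulator size.
const A_FEW_THOUSAND: usize = 4096;

// Hypothetical stand-in for the real event type.
struct Event {
    index: usize,
    weight: f64,
}

// Placeholder for the real per-event work: increment a few elements.
fn apply(acc: &mut [f64], event: &Event) {
    acc[event.index % A_FEW_THOUSAND] += event.weight;
}

fn doit_parallel(events: &[Event], n_threads: usize) -> Vec<f64> {
    // Ceiling division; never ask `chunks` for a chunk length of zero.
    let n = n_threads.max(1);
    let chunk_len = ((events.len() + n - 1) / n).max(1);

    thread::scope(|scope| {
        // Each worker fills its own private accumulator, so no locking is needed.
        let handles: Vec<_> = events
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    let mut local = vec![0.0_f64; A_FEW_THOUSAND];
                    for event in chunk {
                        apply(&mut local, event);
                    }
                    local
                })
            })
            .collect();

        // Combine the per-thread accumulators element by element.
        let mut total = vec![0.0_f64; A_FEW_THOUSAND];
        for handle in handles {
            let local = handle.join().expect("worker thread panicked");
            for (t, l) in total.iter_mut().zip(local) {
                *t += l;
            }
        }
        total
    })
}
```

This allocates one dense accumulator per thread instead of one per event, which is the property I would like to keep in whatever rayon formulation exists.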
Another approach would be to replace the loop with a map which converts each event into a sparse vector of modified elements, and then reduce those sparse vectors into the dense result accumulator, but I suspect that allocating a sparse vector for each event will turn out to be expensive.
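A minimal sequential sketch of that map-then-reduce shape, to show where the allocation I'm worried about comes from. Again, `Event`'s fields, `sparse_update`, `doit_sparse` and the value of `A_FEW_THOUSAND` are hypothetical, and I'm representing the sparse vector as a plain `Vec` of `(index, increment)` pairs.

```rust
// Hypothetical stand-in for the real accumulator size.
const A_FEW_THOUSAND: usize = 4096;

// Hypothetical stand-in for the real event type.
struct Event {
    index: usize,
    weight: f64,
}

// Map step: turn one event into a sparse list of (index, increment) pairs.
// This is one heap allocation per event.
fn sparse_update(event: &Event) -> Vec<(usize, f64)> {
    vec![(event.index % A_FEW_THOUSAND, event.weight)]
}

// Reduce step: fold every sparse update into a single dense accumulator.
fn doit_sparse(events: &[Event]) -> Vec<f64> {
    events
        .iter()
        .map(sparse_update)
        .fold(vec![0.0_f64; A_FEW_THOUSAND], |mut acc, updates| {
            for (i, delta) in updates {
                acc[i] += delta;
            }
            acc
        })
}
```

The map step has no shared state and so would parallelize easily, but every call to `sparse_update` builds a fresh `Vec`.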
Can you suggest a good approach to this problem? Any crates that might be well suited to it?