Summary
I'm trying to parallelize code which operates on a type that, at its core, looks like this:
struct MyType {
inner: Box<dyn MyTrait>,
}
impl MyType {
pub fn new(make_inner: &(dyn Fn() -> Box<dyn MyTrait>)) -> Self {
Self { inner: make_inner() }
}
}
The initial compiler error is ... (click to expand)
error[E0277]: `(dyn MyTrait + 'static)` cannot be shared between threads safely
--> src/io/hdf5.rs:102:10
|
102 | .map(my_function)
| ^^^ `(dyn MyTrait + 'static)` cannot be shared between threads safely
|
= help: the trait `Sync` is not implemented for `(dyn MyTrait + 'static)`
= note: required because of the requirements on the impl of `Sync` for `Unique<(dyn MyTrait + 'static)>`
= note: required because it appears within the type `Box<(dyn MyTrait + 'static)>`
note: required because it appears within the type `MyType`
--> src/foo.rs:21:12
|
21 | pub struct MyType {
| ^^^^^^
= note: required because it appears within the type `&MyType`
= note: required because it appears within the type `[closure@src/io/hdf5.rs:95:26: 99:6]`
note: required by a bound in `rayon::iter::ParallelIterator::map`
--> /home/jacg/.cargo/registry/src/github.com-1ecc6299db9ec823/rayon-1.5.3/src/iter/mod.rs:586:34
|
586 | F: Fn(Self::Item) -> R + Sync + Send,
| ^^^^ required by this bound in `rayon::iter::ParallelIterator::map`
For more information about this error, try `rustc --explain E0277`.
error: could not compile `petalo` due to previous error
Wherever I try to add Sync
(and/or Send
) constraints, it quickly leads to an explosion of compiler errors, which give the impression of only taking me further from a working solution.
It's not clear to me whether these constraints should be introduced as supertraits of MyTrait
or only in some select locations where generic types appear, or both.
(I am not at all convinced that the design of MyType
and MyTrait
is a good one: maybe reviewing these would simplify the problem.)
The Question
How should I go about making this code parallelizable?
In more detail
At the risk of missing out some crucial information on the one hand, and giving irrelevant info on the other, here is further information about the context:
-
My code requires the use of multi-dimensional histograms (for which I use
ndhistogram
). -
The number of dimensions in the histogram is not known until runtime.
-
ndhistogram
represents histograms with different dimensions, by different types. -
I have an abstraction (names changed, to remove potentially confusing, irrelevant domain-specific details)
pub trait SingleHisto { fn fill (&mut self, datum: &Datum); fn value(& self, datum: &Datum) -> usize; }
whose purpose is to allow histogramming of values of the type I care about in my program (
Datum
), without caring about the number of dimensions in the underlying histogram. Behind the scenes,fill
andvalue
must extract however many features are needed from thedatum
, in order to associate it with a praticular bin in the histogram.This trait is implemented for histograms with different numbers of dimensions, for example
impl<X, Y> SingleHisto for ndhistogram::Hist2D<X, Y, usize> where X: Axis<Coordinate = Datum>, Y: Axis<Coordinate = Datum>, { fn fill (&mut self, datum: &Datum) { Histogram::fill (self, &(*datum, *datum)) } fn value(& self, datum: &Datum) -> usize { *Histogram::value(self, &(*datum, *datum)).unwrap_or(&0) } }
-
I actually need multiple similarly-shaped histograms to work with, so there is a concrete type for holding these together in a single object
pub struct MultiHisto { aaa: Box<dyn SingleHisto>, bbb: Box<dyn SingleHisto>, }
with an interface for constructing, filling and querying the inner histograms
impl MultiHisto { pub fn new(make_empty_single_histo: &(dyn Fn() -> Box<dyn SingleHisto>)) -> Self { let aaa = make_empty_single_histo(); let bbb = make_empty_single_histo(); Self { aaa, bbb } } pub fn fill(&mut self, kind: Kind, datum: &Datum) { match kind { Kind::AAA => self.aaa.fill(datum), Kind::BBB => self.bbb.fill(datum), } } pub fn value(&self, datum: &Datum) -> ... { // some function of `aaa` and `bbb` in the relevant bin } }
-
Having filled a
MultiHisto
, it needs to be queried repeatedly while processing a bazillionDatum
s.let multi_histo: MultiHisto; // Initialized somehow let pre_data: Vec<PreDatum>; // Initialized somehow // Closure converting one `PreDatum` into one `Datum`. Uses `multi_histo`. let pre_datum_to_datum = |pre_datum: PreDatum| { let mut datum: Datum = pre_datum.into(); datum.feature = multi_histo.value(&datum); datum }; // Generate bazillion data. Queries `multi_histo` via `pre_datum_to_datum` closure let data: Vec<_> = pre_data .into_iter() // want .into_par_iter() .map(pre_datum_to_datum) .collect();
-
I would like to parallelize this process, by replacing
into_iter
withrayon
'sinto_par_iter
. Trying this gives me the errorerror[E0277]: `(dyn SingleHisto + 'static)` cannot be shared between threads safely --> src/io/hdf5.rs:102:10 | 102 | .map(pre_datum_to_datum) | ^^^ `(dyn SingleHisto + 'static)` cannot be shared between threads safely | = help: the trait `Sync` is not implemented for `(dyn SingleHisto + 'static)` = note: required because of the requirements on the impl of `Sync` for `Unique<(dyn SingleHisto + 'static)>` = note: required because it appears within the type `Box<(dyn SingleHisto + 'static)>` note: required because it appears within the type `MultiHisto` --> src/histos.rs:21:12 | 21 | pub struct MultiHisto { | ^^^^^^^^^^ = note: required because it appears within the type `&MultiHisto` = note: required because it appears within the type `[closure@src/io/hdf5.rs:95:26: 99:6]` note: required by a bound in `rayon::iter::ParallelIterator::map` --> /home/jacg/.cargo/registry/src/github.com-1ecc6299db9ec823/rayon-1.5.3/src/iter/mod.rs:586:34 | 586 | F: Fn(Self::Item) -> R + Sync + Send, | ^^^^ required by this bound in `rayon::iter::ParallelIterator::map` For more information about this error, try `rustc --explain E0277`. error: could not compile `foo` due to previous error