I'm trying to run a text analysis in parallel. I have an extraction method that accepts two &str parameters, and I want to execute 4-8 such calls in parallel if possible. But I get this error:
std::string::String + 'static cannot be shared between threads safely
I've tried cloning the strings and passing those, and I've also tried scoped threading with the scoped-pool crate, but I'm still getting this error.
Here's a snippet of my code and how I'm using it currently:
for chunk in records.chunks(8) {
    pool.scoped(|scope| {
        for record in chunk {
            let f = record[0].to_owned();
            let c = record[1].to_owned();
            scope.execute(move || {
                analysis.extract(&f, &c);
            });
        }
    });
}
error[E0277]: for<'r> std::ops::Fn(&'r str) -> std::string::String + 'static cannot be shared between threads safely
--> src\<a>\<b>.rs:89:35
|
89 | scope.execute(move || {
| ^^^^^^^ for<'r> std::ops::Fn(&'r str) -> std::string::String + 'static cannot be shared between threads safely
|
= help: the trait std::marker::Sync is not implemented for for<'r> std::ops::Fn(&'r str) -> std::string::String + 'static
= note: required because of the requirements on the impl of std::marker::Sync for std::ptr::Unique<for<'r> std::ops::Fn(&'r str) -> std::string::String + 'static>
= note: required because it appears within the type std::boxed::Box<for<'r> std::ops::Fn(&'r str) -> std::string::String + 'static>
= note: required because it appears within the type ngrammatic::Corpus
= note: required because of the requirements on the impl of std::marker::Send for &ngrammatic::Corpus
= note: required because it appears within the type <b><'_>
= note: required because it appears within the type [closure@src\<a>\<b>.rs:89:43: 91:30 <b><'_>, record:csv::StringRecord]
OK, that's a better error message (in the future, try to paste the full error message).
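It looks like Corpus holds a Box<Fn(&str) -> String> trait object. A trait object is only Send or Sync if its type says so, and per the error your closure captures a &Corpus, which is only Send if Corpus is Sync. So the boxed closure would need to be declared with those bounds so that the trait object carries them as well, i.e. Box<Fn(&str) -> String + Send + Sync>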
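To illustrate the point about bounds on the trait object, here's a minimal sketch. It does not use ngrammatic at all: Normalizer and normalize_all are made-up names standing in for a type that stores a boxed closure, and I'm assuming a recent toolchain (std::thread::scope, dyn syntax) rather than the scoped-pool crate. With + Send + Sync on the box, sharing the struct by reference across threads compiles; drop the bounds and you should get the same E0277 as above.

```rust
use std::thread;

// Hypothetical stand-in for a type that stores a boxed closure
// (this is NOT ngrammatic::Corpus, just an illustration).
struct Normalizer {
    // The `+ Send + Sync` bounds are what make `&Normalizer` usable from other threads.
    key_trans: Box<dyn Fn(&str) -> String + Send + Sync>,
}

impl Normalizer {
    fn apply(&self, s: &str) -> String {
        (self.key_trans)(s)
    }
}

// Shares one Normalizer by reference across scoped threads, one thread per input.
fn normalize_all(normalizer: &Normalizer, inputs: &[&str]) -> Vec<String> {
    thread::scope(|scope| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|&input| scope.spawn(move || normalizer.apply(input)))
            .collect();
        // Join in spawn order so results line up with the inputs.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let normalizer = Normalizer {
        key_trans: Box::new(|s: &str| s.to_lowercase()),
    };
    let out = normalize_all(&normalizer, &["Foo", "BAR"]);
    println!("{:?}", out);
}
```

Of course this only helps if you control the type that owns the box; for a third-party type the bounds would have to be added upstream.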
It's a hack, but you can wrap (newtype) Corpus in your own type and implement Send manually (plus Sync, since your closure shares it by reference):
struct MySendableCorpus(ngrammatic::Corpus);
unsafe impl Send for MySendableCorpus {}
unsafe impl Sync for MySendableCorpus {}
You then need to ensure the key_trans closure you pass is, indeed, Send and Sync, or else you'll cause UB.
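Here's roughly what the newtype hack looks like end to end, as a sketch under assumptions: LegacyCorpus is a made-up stand-in for a third-party type whose boxed closure has no Send/Sync bounds (it is not the real ngrammatic::Corpus), and SharedCorpus and keys_in_parallel are hypothetical names. The unsafe impls are a promise you make to the compiler, so they're only sound if the closure inside really is thread-safe.

```rust
use std::thread;

// Hypothetical stand-in for a third-party type whose boxed closure
// lacks Send/Sync bounds, so the type itself is neither Send nor Sync.
struct LegacyCorpus {
    key_trans: Box<dyn Fn(&str) -> String>,
}

// Newtype wrapper around the non-thread-safe type.
struct SharedCorpus(LegacyCorpus);

// SAFETY: only sound if the closure stored in `key_trans` is itself
// Send + Sync (e.g. it captures nothing, or only thread-safe data).
unsafe impl Send for SharedCorpus {}
unsafe impl Sync for SharedCorpus {}

impl SharedCorpus {
    // Going through a method keeps threads borrowing the wrapper as a whole.
    fn key(&self, s: &str) -> String {
        (self.0.key_trans)(s)
    }
}

// Shares the wrapper by reference across scoped threads.
fn keys_in_parallel(corpus: &SharedCorpus, inputs: &[&str]) -> Vec<String> {
    thread::scope(|scope| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|&input| scope.spawn(move || corpus.key(input)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    // This closure captures nothing, so the unsafe impls above hold for it.
    let corpus = SharedCorpus(LegacyCorpus {
        key_trans: Box::new(|s: &str| s.to_uppercase()),
    });
    println!("{:?}", keys_in_parallel(&corpus, &["ab", "cd"]));
}
```

If the closure captured something like an Rc or a RefCell, this would still compile but would be undefined behaviour, which is why it's a hack.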
As it stands, Corpus (and CorpusBuilder) cannot move across threads. I don't know if this was an intentional design decision by ngrammatic's author(s) - you can ask them.
Another approach might be to use thread_local to initialize a distinct Corpus on each worker thread, but I don't know how useful or viable that really is.
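For what it's worth, the thread_local idea could be sketched like this. Again these are assumptions, not ngrammatic's API: LocalCorpus, build_corpus, key_on_this_thread and keys_in_parallel are hypothetical names, and plain std::thread::spawn stands in for a pool. Each thread lazily builds its own instance, so nothing non-Send ever crosses a thread boundary and no unsafe is needed.

```rust
use std::thread;

// Hypothetical stand-in for a type that is neither Send nor Sync.
struct LocalCorpus {
    key_trans: Box<dyn Fn(&str) -> String>,
}

// Hypothetical constructor; in real code this is where the corpus would be built.
fn build_corpus() -> LocalCorpus {
    LocalCorpus {
        key_trans: Box::new(|s: &str| s.trim().to_lowercase()),
    }
}

thread_local! {
    // Initialized lazily, once per thread, on first access.
    static CORPUS: LocalCorpus = build_corpus();
}

// Uses whichever instance belongs to the calling thread.
fn key_on_this_thread(s: &str) -> String {
    CORPUS.with(|corpus| (corpus.key_trans)(s))
}

// Only owned Strings move between threads; the corpus never does.
fn keys_in_parallel(inputs: Vec<String>) -> Vec<String> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| thread::spawn(move || key_on_this_thread(&input)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let out = keys_in_parallel(vec![" Foo ".to_string(), "BAR".to_string()]);
    println!("{:?}", out);
}
```

The trade-off is that the corpus gets built once per thread, so this is probably only reasonable if construction is cheap or the worker threads are long-lived.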