Hi everyone,
I'm writing a small Python extension in Rust using PyO3. The extension simulates a collection of independent, finite Markov processes as state machines that are stepped from state to state by drawing random numbers to determine the next state and when it occurs. I hit various bottlenecks implementing this both in single-threaded pure Python and with Python's multiprocessing, so I decided to try stepping each state machine in parallel in Rust next. I need to simulate about 10 million state machines, with each one going through between 0 and ~100 transitions per simulation step.
Here's a simplified version of something close to what I would like to do (scroll down to see the whole code):
#[pyclass]
struct StateMachine {
    state: i32
}
impl StateMachine {
    #[new]
    fn new() -> Self {
        StateMachine { state: 0 }
    }
    fn run(&mut self) -> Transition {
        // Normally random number generation occurs here, but do this
        // instead for the sake of simplicity
        self.state += 1;
        Transition { data: 1.0 }
    }
}
#[pyclass]
struct Transition {
    data: f64
}
#[pyfunction]
fn par_run(machines: Vec<&PyCell<StateMachine>>) -> Vec<Transition> {
    Python::with_gil(|_| {
        machines
            .into_par_iter()
            .map(|machine| machine.borrow_mut().run())
            .collect()
    })
}
StateMachine instances should be created in Python and are owned by the Python interpreter. par_run is a Python function that takes a list of StateMachines as an argument. Then, the run() method for each object is farmed out to a Rayon parallel iterator, where each StateMachine's Rust run method is called.
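To make the intended behaviour concrete, here is a rough sketch of the same stepping logic in plain Rust, with no Python objects involved. The types are hypothetical stand-ins for the #[pyclass] structs above, par_run_plain and its workers parameter are names made up for this illustration, and std::thread::scope is used only as a stand-in for a Rayon parallel iterator so the snippet needs no external crates. It does not address the PyCell problem itself.

```rust
use std::thread;

// Hypothetical plain-Rust stand-ins for the #[pyclass] types above.
pub struct StateMachine {
    pub state: i32,
}

pub struct Transition {
    pub data: f64,
}

impl StateMachine {
    pub fn run(&mut self) -> Transition {
        // Placeholder for the random draw, as in the simplified example.
        self.state += 1;
        Transition { data: 1.0 }
    }
}

// Step every machine once, handing one contiguous chunk to each worker thread.
// `par_run_plain` and `workers` are illustrative names, not part of PyO3 or Rayon.
pub fn par_run_plain(machines: &mut [StateMachine], workers: usize) -> Vec<Transition> {
    if machines.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    // Ceiling division so every machine lands in some chunk.
    let chunk_len = (machines.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // Spawn all workers first; collecting forces the lazy iterator to run.
        let handles: Vec<_> = machines
            .chunks_mut(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter_mut().map(|m| m.run()).collect::<Vec<_>>())
            })
            .collect();

        // Joining in spawn order keeps the transitions in input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling par_run_plain(&mut machines, 4) on a Vec<StateMachine> steps each machine exactly once and returns one Transition per machine. This works because the machines here are ordinary Rust values that can be mutably borrowed across threads, which is exactly what I don't know how to get from objects owned by Python.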
The problem is that &PyCell<StateMachine> is not Sized, which is required by Rayon's IntoParallelIterator trait by way of its ParallelIterator trait. And if I understand correctly, I must use PyCell to get references to the state machines coming from Python because Rust doesn't own them.
The compiler error is:
error[E0599]: the method `into_par_iter` exists for struct `Vec<&pyo3::PyCell<StateMachine>>`, but its trait bounds were not satisfied
  --> src/main.rs:25:14
   |
25 |             .into_par_iter()
   |              ^^^^^^^^^^^^^ method cannot be called on `Vec<&pyo3::PyCell<StateMachine>>` due to unsatisfied trait bounds
   |
   = note: the following trait bounds were not satisfied:
           `[&pyo3::PyCell<StateMachine>]: Sized`
           which is required by `[&pyo3::PyCell<StateMachine>]: rayon::iter::IntoParallelIterator`
           `[&pyo3::PyCell<StateMachine>]: rayon::iter::ParallelIterator`
           which is required by `[&pyo3::PyCell<StateMachine>]: rayon::iter::IntoParallelIterator`
The crux of my question is: how can I run a method on a collection of independent Python objects in parallel?
Thanks!