Sharing RNG between threads - does any RNG implement Send?


#1

Hi there!

I have a struct with a random number generator as a member, which is needed to randomly return a tuple when a given pattern has multiple valid matches.

pub struct TupleSpace {
    data: Vec<Tuple>,
    rng: ThreadRng,
}
pub fn read(&mut self, tup: Tuple) -> Option<Tuple> {
    // Iterate over vector and keep indices of all valid results.
    let mut index = self.data.len();
    let mut index_vec: Vec<usize> = Vec::new();
    for i in 0..self.data.len() {
        // Check if contents of both tuples match
        if tup.content == self.data[i].content {
            index = i;
            index_vec.push(i);
        }
    }
    if index < self.data.len() {
        // Choose a random valid result to return.
        let i = *self.rng.choose(&index_vec).unwrap();
        let return_tup = self.data[i].clone();
        Some(return_tup)
    } else {
        // If the index didn't change, then no match has been found.
        None
    }
}

This works fine on a single thread, but does not compile when multiple threads attempt to use the TupleSpace.

let t_space = Arc::new(Mutex::new(TupleSpace::new()));
let ts1 = t_space.clone();
let handle_a = thread::spawn(move || { read_some_stuff_from_ts(ts1); });

let ts2 = t_space.clone();
let handle_b = thread::spawn(move || { read_some_stuff_from_ts(ts2); });

This makes sense: ThreadRng does not implement std::marker::Send, so TupleSpace is not Send either, and an Arc<Mutex<TupleSpace>> cannot be moved into another thread. This can be remedied by removing the rng member and creating a new local ThreadRng instance every time TupleSpace::read() is called.

However, that solution would result in considerable overhead if the function is called frequently, would it not?
I’m not certain how expensive rand::thread_rng() is, or any rng constructor for that matter.

Is there a more elegant way to handle RNGs between threads?


#2

I’m pretty sure the ThreadRng is a thread-local object, so almost by definition you won’t be able to send it across threads. The other random number generators in rand should be Send though, so you may want to give them a try.
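To illustrate the point about Send, here is a minimal std-only sketch. XorShift64 is a hypothetical stand-in for one of the Send generators in rand (it is not part of the crate), and Tuple and TupleSpace are simplified guesses at the types from the question. Because the generator owns nothing but plain data, the compiler treats TupleSpace as Send, and the Arc<Mutex<...>> pattern from the first post compiles.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a `Send` generator (NOT a type from `rand`).
/// It holds only a `u64`, so `Send` is derived automatically.
pub struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64 { state: if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed } }
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Simplified tuple type, assumed for illustration.
#[derive(Clone, Debug, PartialEq)]
pub struct Tuple {
    pub content: Vec<i32>,
}

pub struct TupleSpace {
    data: Vec<Tuple>,
    rng: XorShift64,
}

impl TupleSpace {
    pub fn new(data: Vec<Tuple>, seed: u64) -> Self {
        TupleSpace { data, rng: XorShift64::new(seed) }
    }

    /// Returns a randomly chosen tuple whose content matches `tup`, if any.
    pub fn read(&mut self, tup: &Tuple) -> Option<Tuple> {
        let matches: Vec<usize> = self
            .data
            .iter()
            .enumerate()
            .filter(|(_, t)| t.content == tup.content)
            .map(|(i, _)| i)
            .collect();
        if matches.is_empty() {
            return None;
        }
        // Modulo is slightly biased; good enough for a sketch.
        let pick = (self.rng.next_u64() % matches.len() as u64) as usize;
        Some(self.data[matches[pick]].clone())
    }
}

/// Two threads share one `TupleSpace` behind a mutex.
pub fn read_from_two_threads(pattern: Tuple) -> Vec<Option<Tuple>> {
    let data = vec![pattern.clone(), Tuple { content: vec![9] }];
    let space = Arc::new(Mutex::new(TupleSpace::new(data, 42)));
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let space = Arc::clone(&space);
            let pattern = pattern.clone();
            thread::spawn(move || {
                let mut guard = space.lock().unwrap();
                guard.read(&pattern)
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With a generator that holds an Rc or another thread-bound handle in that field instead, the thread::spawn calls would be rejected at compile time.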

rand::thread_rng() caches the random number generator in a thread-local, so I imagine the first call in each thread has to create a new one, but subsequent calls should be cheap because they reuse the one that was already created.
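The caching idea can be sketched with the standard library alone. This is not how rand is implemented internally, just an illustration of the pattern: LOCAL_RNG, local_next_u64, CONSTRUCTIONS and the XorShift64 generator are all hypothetical names made up for this example. The counter is only there to make the "constructed once per thread" behaviour observable.

```rust
use std::cell::RefCell;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many generators have been constructed, across all threads.
pub static CONSTRUCTIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical toy generator (NOT a type from `rand`).
pub struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    fn new(seed: u64) -> Self {
        CONSTRUCTIONS.fetch_add(1, Ordering::SeqCst);
        // `| 1` guarantees a non-zero starting state.
        XorShift64 { state: seed | 1 }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

thread_local! {
    // Initialised lazily on first access in each thread, then reused.
    // A fixed seed is used here; a real generator would be seeded from the OS.
    static LOCAL_RNG: RefCell<XorShift64> =
        RefCell::new(XorShift64::new(0x2545_F491_4F6C_DD1D));
}

/// After the first call in a thread, this only costs a thread-local
/// lookup and a `RefCell` borrow.
pub fn local_next_u64() -> u64 {
    LOCAL_RNG.with(|rng| rng.borrow_mut().next_u64())
}
```

Calling local_next_u64() repeatedly from one thread constructs the generator once; each additional thread that calls it constructs its own.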


#3

Does your application have any reproducibility requirements on the random number sequence (e.g. “for a given seed, the application should always produce the same output”)? If not, I would advise calling rand::thread_rng() at the beginning of each thread’s execution and keeping the ThreadRng around after that for minimal overhead.

One possibility would be to adjust the TupleSpace struct as follows:

/// This struct is now locally spawned by each thread
pub struct TupleSpace {
    /// The main thread creates the shared Vec and hands each thread a clone of the Arc.
    ///
    /// Add a Mutex if you need to modify the data infrequently. If you need frequent
    /// modification by multiple threads, a Vec may not be what you want.
    data: Arc<Vec<Tuple>>,

    /// The thread spawns this locally during initialization, then keeps it around.
    rng: ThreadRng,
}
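
A rough, std-only sketch of how that layout could be used follows. LocalRng is a hypothetical stand-in for ThreadRng: it is deliberately made non-Send with a PhantomData<Rc<()>> marker to show that the generator never has to cross a thread boundary. Seeding from the thread index, the Tuple type and the spawn_readers helper are assumptions made purely for illustration.

```rust
use std::marker::PhantomData;
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Simplified tuple type, assumed for illustration.
#[derive(Clone, Debug, PartialEq)]
pub struct Tuple {
    pub content: Vec<i32>,
}

/// Hypothetical stand-in for `ThreadRng`; the marker makes it `!Send`.
pub struct LocalRng {
    state: u64,
    _not_send: PhantomData<Rc<()>>,
}

impl LocalRng {
    pub fn new(seed: u64) -> Self {
        // `| 1` guarantees a non-zero xorshift state.
        LocalRng { state: seed | 1, _not_send: PhantomData }
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Per-thread view: shared read-only data plus a thread-bound generator.
pub struct TupleSpace {
    data: Arc<Vec<Tuple>>,
    rng: LocalRng,
}

impl TupleSpace {
    pub fn new(data: Arc<Vec<Tuple>>, seed: u64) -> Self {
        TupleSpace { data, rng: LocalRng::new(seed) }
    }

    /// Returns a randomly chosen tuple whose content matches `tup`, if any.
    pub fn read(&mut self, tup: &Tuple) -> Option<Tuple> {
        let matches: Vec<&Tuple> = self
            .data
            .iter()
            .filter(|t| t.content == tup.content)
            .collect();
        if matches.is_empty() {
            return None;
        }
        let pick = (self.rng.next_u64() % matches.len() as u64) as usize;
        Some(matches[pick].clone())
    }
}

/// Each worker builds its own `TupleSpace` around the shared data.
pub fn spawn_readers(data: Vec<Tuple>, pattern: Tuple, threads: u64) -> Vec<Option<Tuple>> {
    let shared = Arc::new(data);
    let handles: Vec<_> = (0..threads)
        .map(|id| {
            let shared = Arc::clone(&shared);
            let pattern = pattern.clone();
            thread::spawn(move || {
                // Constructed inside the thread, so only the Arc is sent.
                let mut space = TupleSpace::new(shared, id + 1);
                space.read(&pattern)
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Only the Arc<Vec<Tuple>> and the pattern are moved into each closure, which is why the non-Send generator is not a problem here.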

If you do have reproducibility requirements, on the other hand… well, welcome to my world, and this is going to be trickier.
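
One common way to approach the reproducible case (an assumption here, not something spelled out in this thread) is to give every worker its own generator whose seed depends only on a base seed and the worker's index, so the numbers each worker sees do not depend on thread scheduling. The sketch below uses a SplitMix64-style mixing function written by hand; splitmix64 and run_workers are hypothetical names, and nothing here comes from the rand crate.

```rust
use std::thread;

/// SplitMix64-style step: advances `state` and returns a mixed output.
/// Used here both to derive per-worker seeds and as the generator itself.
pub fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Runs `workers` threads; worker `i` draws `draws` numbers from a stream
/// that depends only on `base_seed` and `i`.
pub fn run_workers(base_seed: u64, workers: u64, draws: usize) -> Vec<Vec<u64>> {
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            thread::spawn(move || {
                // Combine the base seed with the worker index (the odd
                // multiplier is an arbitrary choice), then mix once so that
                // neighbouring indices start from unrelated states.
                let mut seed_state = base_seed ^ i.wrapping_mul(0xD6E8_FEB8_6659_FD93);
                let mut state = splitmix64(&mut seed_state);
                (0..draws).map(|_| splitmix64(&mut state)).collect::<Vec<u64>>()
            })
        })
        .collect();
    // Joining in spawn order keeps the result order deterministic too.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Running run_workers twice with the same arguments yields identical results. What this does not cover is work that is distributed dynamically between threads, which is where reproducibility tends to get harder.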


#4

You’re correct, changing to Isaac64Rng solved the issue. I really should have caught that detail when reading up on the module. Thank you!

Intuitively I tried to keep the data structure implementation separate from the parallelization, though your suggestion seems like the better way to go here.
You’re also right on the vector. It works for now, to try and get a feeling for how a tuple space can work in Rust. In future revisions I’ll replace the underlying data structure with something that is more suitable for frequent access and searches involving pattern matching.

Thanks for your help!