Rand::thread_rng() separate copies

Hello,

I am doing my own implementation of rand::thread_rng() because I need it to be a deterministic seedable RNG.
I found this related post:

And the current rand::thread_rng() impl:

However, on top of that, I need to use several RNGs at the same time without interfering, i.e., I need separate copies per thread. So, I came up with the following struct:

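use std::{cell::UnsafeCell, rc::Rc, sync::Arc};
use rand::RngCore;
use thread_local::ThreadLocal;

pub struct ThreadRng<R: RngCore + Clone + Send> {
    seed: R, // template RNG, cloned for each thread on first use
    thread_local: Arc<ThreadLocal<Rc<UnsafeCell<R>>>>,
}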

But the compiler complains about Rc<UnsafeCell<R>> not implementing Send. Using Arc instead of Rc does not help because UnsafeCell<R> does not implement Sync. I think I am doing the same as other implementations but using the thread_local crate instead of the macro to be able to have separate copies, but it seems I am missing something.

Thanks in advance.

Maybe you could just pass a different deterministic RNG (if necessary, seeded with the same seed) directly to each individual thread. For example, one of the RNGs in rand_chacha if you need a high-quality RNG, or one in rand_pcg if you need a very fast but lower-quality one.
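A minimal sketch of that idea, using only the standard library. SplitMix64 here is a hypothetical stand-in for a real seedable generator such as the ones in rand_chacha or rand_pcg, and run_workers and the seed-derivation scheme (drawing one seed per worker from a master generator) are assumptions made for illustration only:

```rust
use std::thread;

/// Hypothetical stand-in for a seedable RNG from rand_chacha or rand_pcg.
#[derive(Clone, Debug)]
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn seed_from_u64(seed: u64) -> Self {
        Self { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Spawns `workers` threads. Each thread owns its own generator, which is
/// moved into the thread's closure, so no thread-local storage is involved.
pub fn run_workers(base_seed: u64, workers: usize, draws: usize) -> Vec<Vec<u64>> {
    // Assumption: per-worker seeds are drawn from a master generator.
    let mut master = SplitMix64::seed_from_u64(base_seed);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let mut rng = SplitMix64::seed_from_u64(master.next_u64());
            thread::spawn(move || (0..draws).map(|_| rng.next_u64()).collect::<Vec<u64>>())
        })
        .collect();
    // Join in spawn order so the result does not depend on scheduling.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

Calling run_workers twice with the same arguments should give the same nested vector, because every generator is created from a fixed seed before its thread starts.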


Given this, why not just pass the RNGs you use directly as needed, without any thread-locals involved?


That struct was meant to centralize the logic of generating and obtaining the RNGs. The problem I have is the same one rand::thread_rng() solves, but I need more than one pool of RNGs, and rand::thread_rng() only allows a single shared pool.

I see. That seems like a very unusual situation — usually when performing a computation on multiple threads, the identity of the threads is (or should be) irrelevant to the outcome of the computation, but when you use thread locals this way, the sequence of numbers obtained will depend on which thread asks. So, for example, if you split your computation into more or fewer threads, the outputs will change. Are you sure that that’s sufficient for your purposes?

If so, the reason for your compilation failure is that thread_local::ThreadLocal demands that T: Send, because

  • It allows accessing the “thread-local” data from different threads (using iter()).
  • Even if it didn’t, dropping the values is still going to end up happening on the thread owning the ThreadLocal rather than the threads the values are for.

So, it’s not really thread-local, just thread-specific, and can’t do anything that only a real thread-local can do.

You could, I think, work around this by:

  • writing a wrapper struct around the UnsafeCell that unsafely implements Send and Sync when R: Send,
  • while making sure not to call ThreadLocal::iter() because that would allow unsound multi-threaded access to the Rc<UnsafeCell>s (that is, this constraint should be explained in the safety comments of the code), and
  • changing the Rc to Arc so that drops’ effects on reference counts are properly synchronized.
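The steps above could look roughly like the following sketch. To keep it self-contained, it does not use the thread_local crate: a Mutex<HashMap<ThreadId, ...>> stands in for thread_local::ThreadLocal, SplitMix64 is a hypothetical stand-in for the generic R, and the names ThreadCell and RngPool are made up for this example. Treat the safety comments as the contract that a real version would have to uphold:

```rust
use std::cell::UnsafeCell;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, ThreadId};

/// Hypothetical stand-in for the generic RNG type `R`.
#[derive(Clone)]
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Wrapper around `UnsafeCell` that claims to be shareable across threads.
struct ThreadCell<R>(UnsafeCell<R>);

// SAFETY: only sound while the contents of a cell are accessed by a single
// thread. `RngPool` hands a cell only to the thread whose `ThreadId` keys
// it and exposes no way to iterate over the cells.
unsafe impl<R: Send> Send for ThreadCell<R> {}
unsafe impl<R: Send> Sync for ThreadCell<R> {}

/// One independent pool of per-thread generators; several pools can coexist.
pub struct RngPool {
    seed: SplitMix64,
    cells: Mutex<HashMap<ThreadId, Arc<ThreadCell<SplitMix64>>>>,
}

impl RngPool {
    pub fn new(seed: u64) -> Self {
        Self {
            seed: SplitMix64(seed),
            cells: Mutex::new(HashMap::new()),
        }
    }

    /// Draws from the calling thread's generator, cloning the seed
    /// generator the first time a thread asks.
    pub fn next_u64(&self) -> u64 {
        let cell = {
            let mut cells = self.cells.lock().unwrap();
            cells
                .entry(thread::current().id())
                .or_insert_with(|| Arc::new(ThreadCell(UnsafeCell::new(self.seed.clone()))))
                .clone()
        }; // the lock is released here, before the generator is touched
        // SAFETY: this cell is keyed by the current thread's id, so no other
        // thread reaches it, and the mutable access ends within this call.
        unsafe { (*cell.0.get()).next_u64() }
    }
}
```

Every thread starts from a clone of the same seed generator, as in the struct from the question, so each thread sees the same sequence within one pool. If the draw were done while still holding the lock, a plain Mutex<HashMap<ThreadId, SplitMix64>> would be enough and no unsafe would be needed, at the cost of contention between threads.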

To have less unsafe, you would have to find a different data structure than thread_local::ThreadLocal which doesn’t have the iteration operation, and somehow ensures the value dropping happens on each thread or not at all, so no Send bound is needed. (I don't know if such a thing exists.)

Yeah, I know about the caveats of working with multiple threads, that is one of the reasons why it is important for me to come up with a structure that centralizes how rngs are obtained.

I see. Then, I think that what would solve the problem, if it ever gets stabilized, is the feature tracked in "Tracking issue for `thread_local` stabilization" (rust-lang/rust issue #29594).

At least, thanks to your comments, I think I can write a workaround that is not too ugly.

Thank you

I think that by making the RNG used implicitly dependent on the current thread, you are creating a problem for yourself that you do not need to have. It would be better to explicitly pass RNGs as normal function parameters and closure captures, so that a change in your choice of threading cannot change which RNG is used (it will either work identically, or require further changes).
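As a rough illustration of that point, here is a sketch where each unit of work gets its own RNG derived from the seed and the task index, and the RNG is passed as an ordinary parameter. SplitMix64 is again a hypothetical stand-in for a real seedable generator, and task_rng, work and simulate are names invented for this example:

```rust
use std::thread;

const GAMMA: u64 = 0x9E37_79B9_7F4A_7C15;

/// Hypothetical stand-in for a real seedable RNG.
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(GAMMA);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Assumed derivation scheme: the RNG for task `i` is seeded with the
/// (i + 1)-th output of a master generator started at `seed`.
fn task_rng(seed: u64, task: u64) -> SplitMix64 {
    let mut master = SplitMix64(seed.wrapping_add(task.wrapping_mul(GAMMA)));
    SplitMix64(master.next_u64())
}

/// The actual work takes its RNG as a normal parameter.
fn work(rng: &mut SplitMix64) -> u64 {
    rng.next_u64() % 100
}

/// Runs tasks `0..tasks` split into contiguous chunks over `threads` threads.
/// The result for task `i` depends only on `(seed, i)`, not on the thread.
pub fn simulate(seed: u64, tasks: u64, threads: u64) -> Vec<u64> {
    let threads = threads.max(1);
    let chunk = (tasks + threads - 1) / threads; // ceiling division
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let start = (t * chunk).min(tasks);
            let end = ((t + 1) * chunk).min(tasks);
            thread::spawn(move || {
                (start..end)
                    .map(|task| work(&mut task_rng(seed, task)))
                    .collect::<Vec<u64>>()
            })
        })
        .collect();
    // Join in spawn order so results stay in task order.
    handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

With this layout, simulate(7, 10, 1) and simulate(7, 10, 4) should return the same vector, which is exactly what a thread-keyed RNG cannot guarantee.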
