How to seed the thread-local random number generator

I'm using random numbers in several structs of my code; I need my random numbers to come from a single source, so I use ThreadRng. However, for testing purposes I need to seed the generator. How can I do that for ThreadRng?

If this is not possible, can I wrap my random number generator in a singleton and call it from everywhere in my code?

Let's simplify the problem and say there will always be a single thread.


Don't use thread_rng. You need a SeedableRng and one seed per thread (see the docs).

I would absolutely not even attempt to make this work. If at all possible, I would pass the RNG in to where it's being used so you can be absolutely certain how, where, and in what order it's being used. And I can say that because I wrote a simulator that used a lot of random sequences, and that's exactly what I did, because I needed to guarantee reproducibility in the results.

Edit: if you use the singleton RNG in 100 places, and exactly one of them is throwing off your tests, you will tear your hair out before you figure out which one is the problem. I genuinely would not even risk that happening.
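A minimal sketch of what "passing the RNG in" could look like. To keep it std-only and runnable, a tiny SplitMix64 generator stands in for a real seedable generator from the rand crate; `Rng`, `Spawner`, `Mutator`, and `run` are hypothetical names made up for this illustration.

```rust
/// Tiny std-only PRNG (SplitMix64). Assumption: a stand-in for a real
/// seedable generator such as one from the rand crate.
struct Rng {
    state: u64,
}

impl Rng {
    fn seed_from_u64(seed: u64) -> Self {
        Rng { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Two hypothetical, unrelated components that both need randomness.
struct Spawner;
struct Mutator;

impl Spawner {
    // The generator is an explicit argument, so every draw is visible
    // at the call site.
    fn spawn(&self, rng: &mut Rng) -> u64 {
        rng.next_u64() % 100
    }
}

impl Mutator {
    fn mutate(&self, value: u64, rng: &mut Rng) -> u64 {
        value ^ (rng.next_u64() % 16)
    }
}

/// One generator is created from the seed and threaded through every call.
fn run(seed: u64) -> Vec<u64> {
    let mut rng = Rng::seed_from_u64(seed);
    let spawner = Spawner;
    let mutator = Mutator;
    (0..5)
        .map(|_| {
            let v = spawner.spawn(&mut rng);
            mutator.mutate(v, &mut rng)
        })
        .collect()
}

fn main() {
    // Same seed, same sequence of calls, same results.
    assert_eq!(run(42), run(42));
    println!("{:?}", run(42));
}
```

The signatures do get one argument longer, but the order in which the generator is consumed is spelled out in the code rather than hidden behind a global.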


Sure, but how can I ensure there is a single generator instance in my code?

Actually, I thought about what my problem really is, and the answer is: I want my code to produce reproducible results even though I use random numbers in several unrelated structs.

So far (in the C++ version) I have used a random number generator stored in a thread-safe singleton.
Otherwise, if each of my structs has its own generator, all of them must be seeded in a predictable way.

If at all possible, I would pass the RNG in to where it's being used

Yes, I'm considering this as a possible solution. My major concern is that the signatures of my methods get longer and longer. That solution also requires a huge refactoring, as many methods need access to the generator.

So I wonder if there is another option.

In my experience, although having lots of arguments isn't great, global dependencies are significantly worse when it comes to understanding and debugging your code.

So, the original version of the simulator was written in D and used a single, global RNG. That bit about "you will tear your hair out"? Guess how I know that.

I rewrote the entire thing in Rust a few years back, and one of the changes was the complete elimination of implicit global RNG. In the several years since, there have been absolutely no bugs from any RNG source. And that's with extensive use of parallel processing.

Like I said, I wouldn't even risk global RNG being a concern at this point. Yes, it might take a fair bit of refactoring, but it was absolutely worth it. Just to be clear: I am only one data point, so while I am very emphatic about this, you absolutely shouldn't take the word of one internet rando with a bad experience as gospel. :)


How does that work with threads? Assuming you have figured out how to serialize access to your RNG, you could just reproduce your C++ solution with a static Mutex<Option<SmallRng>>. If you don't want to pay the overhead of the mutex, you can create a thread-local version of this with RefCell<Option<SmallRng>>.
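Roughly, both variants could look like the sketch below. It is std-only: a small SplitMix64 struct named `Rng` stands in for SmallRng, and the helper functions (`seed_global`, `global_next`, `seed_local`, `local_next`) are hypothetical names for this illustration.

```rust
use std::cell::RefCell;
use std::sync::Mutex;

/// Tiny std-only PRNG (SplitMix64). Assumption: a stand-in for SmallRng.
struct Rng {
    state: u64,
}

impl Rng {
    fn seed_from_u64(seed: u64) -> Self {
        Rng { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Variant 1: one process-wide generator behind a mutex.
// `None` means "not seeded yet".
static GLOBAL_RNG: Mutex<Option<Rng>> = Mutex::new(None);

fn seed_global(seed: u64) {
    *GLOBAL_RNG.lock().unwrap() = Some(Rng::seed_from_u64(seed));
}

fn global_next() -> u64 {
    GLOBAL_RNG
        .lock()
        .unwrap()
        .as_mut()
        .expect("call seed_global first")
        .next_u64()
}

// Variant 2: one generator per thread, no locking needed.
thread_local! {
    static LOCAL_RNG: RefCell<Option<Rng>> = RefCell::new(None);
}

fn seed_local(seed: u64) {
    LOCAL_RNG.with(|cell| *cell.borrow_mut() = Some(Rng::seed_from_u64(seed)));
}

fn local_next() -> u64 {
    LOCAL_RNG.with(|cell| {
        cell.borrow_mut()
            .as_mut()
            .expect("call seed_local first")
            .next_u64()
    })
}

fn main() {
    seed_global(1);
    seed_local(1);
    // Same algorithm and same seed, so both variants start identically.
    assert_eq!(global_next(), local_next());
}
```

Note that the mutex only serializes access. With several threads drawing from the global generator, which thread gets which number still depends on scheduling, so reproducibility is only guaranteed in the single-threaded case.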

Another way is: you can have multiple random number generators that are all seeded from a single random number generator.
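A sketch of that idea, assuming a std-only SplitMix64 struct named `Rng` in place of a real SeedableRng type; `Terrain`, `Weather`, and `build` are hypothetical names. Each struct owns its generator, and the only thing that has to be fixed is the master seed and the order in which the children are seeded.

```rust
/// Tiny std-only PRNG (SplitMix64). Assumption: a stand-in for a real
/// SeedableRng implementation.
struct Rng {
    state: u64,
}

impl Rng {
    fn seed_from_u64(seed: u64) -> Self {
        Rng { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Hypothetical, unrelated structs that each own a private generator.
struct Terrain {
    rng: Rng,
}

struct Weather {
    rng: Rng,
}

impl Terrain {
    fn height(&mut self) -> u64 {
        self.rng.next_u64() % 1000
    }
}

impl Weather {
    fn rainfall(&mut self) -> u64 {
        self.rng.next_u64() % 50
    }
}

/// The master generator is used only to hand out child seeds.
/// The children must always be seeded in the same order.
fn build(master_seed: u64) -> (Terrain, Weather) {
    let mut master = Rng::seed_from_u64(master_seed);
    let terrain = Terrain { rng: Rng::seed_from_u64(master.next_u64()) };
    let weather = Weather { rng: Rng::seed_from_u64(master.next_u64()) };
    (terrain, weather)
}

fn main() {
    let (mut t1, mut w1) = build(99);
    let (mut t2, mut w2) = build(99);
    // Drawing in a different order does not change what each struct sees.
    let first = (t1.height(), w1.rainfall());
    let rain = w2.rainfall();
    let height = t2.height();
    assert_eq!(first, (height, rain));
}
```

Because the structs no longer share a stream, adding a draw in one of them does not shift the numbers seen by the others.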


@tczajka’s suggestion is what I used recently because I wanted to proptest a complex structure but didn’t want to generate it in terms of a Strategy. It was just too difficult to generate a sequence of items that were randomized but also interrelated.

The solution was putting StdRng into a thread_local and seeding it from an environment variable with a fallback to a u64 from ThreadRng. Whatever seed was chosen, it’s stored in a static OnceLock. This lets multiple threads safely race to initialize the global seed.
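The seed selection described above might be sketched like this, using only std. The environment variable name `TEST_SEED` and the function names are assumptions for illustration, and the fallback draws a value from std's randomly keyed `RandomState` rather than from ThreadRng.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::sync::OnceLock;
use std::thread;

// Whatever seed is chosen first is stored here for the rest of the process.
static GLOBAL_SEED: OnceLock<u64> = OnceLock::new();

/// Fallback when no seed is supplied. Assumption: std's randomly keyed
/// hasher stands in for drawing a u64 from ThreadRng.
fn fallback_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Returns the process-wide seed, initializing it on first use.
/// `TEST_SEED` is a hypothetical variable name.
fn global_seed() -> u64 {
    *GLOBAL_SEED.get_or_init(|| {
        let seed = std::env::var("TEST_SEED")
            .ok()
            .and_then(|s| s.trim().parse::<u64>().ok())
            .unwrap_or_else(fallback_seed);
        // Printed once so a failing run can be reproduced.
        println!("global seed: {seed}");
        seed
    })
}

fn main() {
    // Several threads race to initialize; get_or_init runs the closure
    // to completion only once, so all of them observe the same value.
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(global_seed)).collect();
    let seeds: Vec<u64> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    assert!(seeds.iter().all(|s| *s == seeds[0]));
}
```

To replay a failing run, the printed value would be passed back in through the environment variable.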

Then I use the thread-id crate (caveat: see edit below) to get a unique u64 from the current thread and wrapping_add it to the seed. The result seeds the thread_local PRNG. Finally, I print the global seed to stdout so test failures can be reproduced.

My use case is a little different. I don’t use the PRNG at runtime; it’s only used in tests. It is also kind of a pain to remember to initialize the thread_local PRNG each time a new thread is spawned. If I were using the PRNG at runtime, I would consider passing a reference to it instead of using thread_local, just as @DanielKeep suggested earlier. It’s much easier to control tests with the dependency inversion.


Edit: I just noticed that the thread-id crate does not return deterministic values across separate invocations. I should have known that would happen, so I've reverted to my original strategy of hashing the backtrace. This works for me because the call to initialize the thread_local PRNG happens on a different source line for each thread. But it would not work if all threads were spawned with identical stacks (e.g. in a loop). For that you would have to pass in the loop index as an extra distinguishing input.
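For the spawned-in-a-loop case, a sketch of the loop-index variant could look like this. It is not the backtrace-hashing code: the index is passed in explicitly, a std-only SplitMix64 struct named `Rng` stands in for StdRng, and all function names are hypothetical.

```rust
use std::cell::RefCell;
use std::thread;

/// Tiny std-only PRNG (SplitMix64). Assumption: a stand-in for StdRng.
struct Rng {
    state: u64,
}

impl Rng {
    fn seed_from_u64(seed: u64) -> Self {
        Rng { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

thread_local! {
    // `None` until the owning thread calls init_thread_rng.
    static TEST_RNG: RefCell<Option<Rng>> = RefCell::new(None);
}

/// Seeds this thread's generator from the global seed plus a
/// caller-supplied index, which is stable across runs (unlike an OS
/// thread id).
fn init_thread_rng(global_seed: u64, thread_index: u64) {
    let seed = global_seed.wrapping_add(thread_index);
    TEST_RNG.with(|cell| *cell.borrow_mut() = Some(Rng::seed_from_u64(seed)));
}

fn test_rng_next() -> u64 {
    TEST_RNG.with(|cell| {
        cell.borrow_mut()
            .as_mut()
            .expect("call init_thread_rng on this thread first")
            .next_u64()
    })
}

/// Spawns `n` workers in a loop; each returns its first random value.
fn run_workers(global_seed: u64, n: u64) -> Vec<u64> {
    let handles: Vec<_> = (0..n)
        .map(|i| {
            thread::spawn(move || {
                init_thread_rng(global_seed, i);
                test_rng_next()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    // Same global seed gives the same per-thread values on every run.
    assert_eq!(run_workers(5, 4), run_workers(5, 4));
}
```

Each worker still has to remember to call the init function, which is the same chore mentioned above for every newly spawned thread.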
