Predictable random numbers


#1

I need predictable random numbers.

The rand crate is almost what I need but I cannot control the seed. How can I get repeatable random sequences?

Worik


#2

You can control the seed using this:
https://docs.rs/rand/0.4.2/rand/trait.SeedableRng.html#tymethod.from_seed
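To illustrate what a fixed seed buys you, here is a minimal sketch that uses a hand-rolled SplitMix64 generator as a stand-in, so it compiles with the standard library alone. The names `SplitMix64`, `from_seed`, `next_u64` and `first_n` are assumptions made for this example and are not the rand crate's API. With rand, the corresponding step would be building a generator through the `SeedableRng::from_seed` method linked above.

```rust
// Stand-in generator (SplitMix64). NOT part of the rand crate.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn from_seed(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Draw the first `n` values from a generator seeded with `seed`.
fn first_n(seed: u64, n: usize) -> Vec<u64> {
    let mut rng = SplitMix64::from_seed(seed);
    (0..n).map(|_| rng.next_u64()).collect()
}

fn main() {
    // Same seed, same sequence; a different seed gives a different one.
    assert_eq!(first_n(42, 3), first_n(42, 3));
    assert_ne!(first_n(42, 3), first_n(43, 3));
    println!("{:?}", first_n(42, 3));
}
```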


#3

But I have to pass that generator around, or reseed one for every scope.

Passing it around makes everything it touches mutable.

Reseeding in every scope is (a) not good practice and (b) means the seed has to be passed around, or hard coded.

If I understand correctly, the rand crate makes the same generator available throughout a thread.


#4

Passing it around makes everything it touches mutable

You could put it in a RefCell if you want to hide this mutable state.

Though I’d think that if you want predictability, the infectious nature of a mutable Rng is actually a good thing.
Or I guess, it depends on what kind of predictability you want; if you want predictability across different versions of the code, then it’s a good thing because it makes you more aware of when a change to the code might cause a different amount of random numbers to be drawn. It also prevents you from accidentally using it in multithreaded scenarios (a mistake that would destroy predictability even between multiple runs of the same version of the code).
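A rough sketch of the RefCell idea mentioned above, again with a hand-rolled SplitMix64 stand-in rather than a rand generator. `SharedRng` and `roll_die` are hypothetical names for this example. The wrapper exposes `&self` methods, so code that draws numbers does not need a `&mut` reference; the borrow is checked at run time instead.

```rust
use std::cell::RefCell;

// Stand-in generator (SplitMix64). NOT part of the rand crate.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Hypothetical wrapper: keeps the generator behind a RefCell.
struct SharedRng {
    inner: RefCell<SplitMix64>,
}

impl SharedRng {
    fn new(seed: u64) -> Self {
        SharedRng { inner: RefCell::new(SplitMix64 { state: seed }) }
    }

    // Takes `&self`; the mutable borrow happens inside and is checked at run time.
    fn next_u64(&self) -> u64 {
        self.inner.borrow_mut().next_u64()
    }
}

// Hypothetical consumer: a shared reference is enough.
fn roll_die(rng: &SharedRng) -> u64 {
    rng.next_u64() % 6 + 1
}

fn main() {
    let rng = SharedRng::new(1); // note: not declared `mut`
    let rolls: Vec<u64> = (0..5).map(|_| roll_die(&rng)).collect();
    println!("{:?}", rolls);
}
```

The mutable state is still there, only hidden, so the point about losing track of how many numbers get drawn applies to this version as well.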


#5

I will look into RefCell. I do not know what that is, so there is a lesson for me. Thank you.

The purpose of predictability is to facilitate debugging. Once I am at the point of using it for real I will want as much entropy as I can find. I “grew up” with C, where (without looking up the API) I call setseed at the start and I am away. I use a constant in debugging and then switch over to /dev/random for the seed. (Or in the bad old days time_t; that was never a good idea but most of us did it.)

I was hoping for something similar in Rust.

IMHO a random number generator that cannot be seeded, and thus coerced into repeating itself, is close to useless. Unrepeatable bugs are a menace, and we should not have to make major changes to move from predictable to true (pseudo) randomness.

But I guess like all things rust (?) there is a good reason for it…
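The workflow described above (a constant seed while debugging, fresh entropy otherwise) might be sketched as follows. Everything here is an assumption for illustration: the `RNG_SEED` environment variable and the functions `parse_seed`, `fresh_seed` and `choose_seed` are hypothetical, and `fresh_seed` leans on the standard library's randomly keyed `RandomState` only as a std-only stand-in for a proper OS entropy source.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

// Parse a seed from optional text, e.g. the value of an environment variable.
fn parse_seed(text: Option<&str>) -> Option<u64> {
    text.and_then(|s| s.trim().parse::<u64>().ok())
}

// Stand-in for OS entropy: RandomState is created with random keys,
// so finishing an empty hasher yields an unpredictable value.
fn fresh_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

// Use the fixed seed if one was given and parses, else a fresh one.
fn choose_seed(text: Option<&str>) -> u64 {
    parse_seed(text).unwrap_or_else(fresh_seed)
}

fn main() {
    // `RNG_SEED` is a hypothetical variable name chosen for this sketch.
    let from_env = std::env::var("RNG_SEED").ok();
    let seed = choose_seed(from_env.as_deref());

    // Log the seed so that a failing run can be replayed later.
    eprintln!("rng seed = {}", seed);
}
```

Logging the seed on every run means that even an unseeded run can be reproduced afterwards by feeding the logged value back in.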


#6

You could make the rng an argument and keep it in main or some other top-level function. You could make it a global, constructing it during startup.

And this goes without saying, but: if you’re involving threads, reproducibility is going to be a hell of a lot harder than just seeding an RNG.


#7

Interesting. In that case, having to cart around an Rng just to help during debugging seems like too much of a chore to be worth it. I might consider the following alternative: A static global Rng, locked behind a Mutex.

(Beware: This solution will make adding threads impossible, because attempts to use the Rng from multiple threads will panic. Of course, it’s impossible for an rng to be predictable in that case anyways, so threaded code should just use the thread_rng().)

#[macro_use]
extern crate lazy_static;
extern crate rand;

mod rng {
    use rand::{ChaChaRng, Rand, Rng, SeedableRng};
    use std::sync::{self, Mutex};
    
    lazy_static! {
        static ref RNG: Mutex<ChaChaRng> = {
            let rng = ::rand::random();
            Mutex::new(rng)
        };
    }

    // Convenience method
    pub fn reseed(seed: &[u32])
    { get().reseed(seed) }
    
    // Convenience method
    pub fn random<T: Rand>() -> T
    { get().gen() }
    
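    // Borrow the global Rng. try_lock makes a second, overlapping borrow
    // panic (via expect) instead of blocking or deadlocking.
    pub fn get<'a>() -> sync::MutexGuard<'a, ChaChaRng>
    { RNG.try_lock().expect("Attempted to borrow the Rng multiple times!") }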
}

The general idea is:

  • Code that needs a single random number uses ::rng::random().
  • Code that requires top performance may use ::rng::get() to borrow the Rng and generate many random numbers…
  • …But! You must be careful not to call methods in the ::rng module while it is borrowed! (this can be an easy mistake to make while refactoring, or by calling other functions that happen to use random numbers)

fn main() {
    use ::rand::Rng;
    
    ::rng::reseed(&[]); // seed with zeros
                        // (without this it is randomly seeded on the first use;
                        //  this happens in the lazy_static! macro above)
    
    println!("{}", rng::random::<i32>());
    println!("{}", rng::random::<f64>());
    
    { // block to scope the borrow
    
        // borrow the rng for multiple calls
        let mut rng = ::rng::get();
        
        let bytes: [u8; 4] = [rng.gen(), rng.gen(), rng.gen(), rng.gen()];
        println!("{:?}", bytes)
        
    } // guard gets dropped here and `::rng` methods are safe to call again

    println!("{}", rng::random::<f64>());
}

Here is a playground link implementing the above code with more extensive comments and documentation.


#8

If you’re using multiple threads, I’m curious how you intend to get predictable results. Do they coordinate their use of random numbers? For both performance and reproducibility I’d expect to want to use just one rng per thread. But then, I’m accustomed to Monte Carlo simulations where the RNG is a significant contributor to total run time. It sounds like you’re doing something more cryptographic, perhaps?


#9

https://rr-project.org/


#10

Obviously not using multiple threads. Develop and debug one thread at a time. And yes, one RNG per thread.

Clearly I am new to Rust but I was hoping that something I have seen around “thread_rng” would be as you describe and have a mechanism for seeding. I cannot use an RNG that cannot be seeded.

I am doing genetic programming, so I build a lot of trees randomly and select random nodes. So if I strike a bug in an operator, or whatever, I need to reproduce it. If I cannot seed the RNG I cannot do that.

I am curious what the idea is behind not having a way to seed the process. Unless it has no pseudo generation and just reads straight from /dev/random, which I cannot see as useful. But I am new here so need to be careful with my opinions!

Lastly the main problem I had (I say with hindsight) is that I did not know how to introduce global functions. This has shown me how, and I think in that context the locks are useful. No point when single threaded but some day some clown (me) will introduce a thread and ask for a random number and forget and…


#11

error: use of unstable library feature 'rustc_private': this crate is being loaded from the sysroot, an unstable location; did you mean to load this crate from crates.io via Cargo.toml instead? (see issue #27812)
  --> src/main.rs:26:1
   |
26 | extern crate lazy_static;
   | ^^^^^^^^^^^^^^^^^^^^^^^^^

Failing at first hurdle


#12

If you’re using multiple threads, I’m curious how you intend to get predictable results.

I didn’t get the impression that this is what the user was asking, but if you’re asking on theoretical grounds, it is possible. A parent thread generates seeds and hands them off to child threads in sequential order. Child threads seed their own rngs, and can’t adjust their workload dynamically (so no rayon).
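The scheme described above could look roughly like this. It is a sketch under assumptions: `SplitMix64` is a hand-rolled stand-in generator (not from the rand crate), `child_work` and `run` are hypothetical names, and each child does a fixed amount of work so that scheduling cannot change how many numbers it draws.

```rust
use std::thread;

// Stand-in generator (SplitMix64). NOT part of the rand crate.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Work done by one child thread: it draws only from its own generator.
fn child_work(seed: u64, draws: usize) -> u64 {
    let mut rng = SplitMix64 { state: seed };
    let mut acc = 0u64;
    for _ in 0..draws {
        acc ^= rng.next_u64();
    }
    acc
}

// The parent draws one seed per child in a fixed order, spawns the
// children, then joins them in that same order.
fn run(master_seed: u64, children: usize) -> Vec<u64> {
    let mut master = SplitMix64 { state: master_seed };
    let seeds: Vec<u64> = (0..children).map(|_| master.next_u64()).collect();

    let handles: Vec<_> = seeds
        .into_iter()
        .map(|seed| thread::spawn(move || child_work(seed, 1000)))
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    // Results are identical across runs regardless of thread scheduling.
    assert_eq!(run(42, 4), run(42, 4));
    println!("{:?}", run(42, 4));
}
```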

Wow, that is a terrible, terrible error message. (even with the part that seems to have been recently added about Cargo.toml)

The fix is to edit Cargo.toml and add (under [dependencies])

lazy_static = "1.0.0"

I got the version number by searching for lazy_static on crates.io.

The rest of the error message is a total red herring (it’s a message that gets produced because rustc itself happens to depend on lazy_static, or something; simply put, lazy_static is that widely used).


#13

Thanks.

I looked there too but obviously not as clearly as you did!

Thank you


#14

I think the reasons you can’t seed thread_rng are basically summed up here, albeit indirectly.

(DISCLAIMER: I am not a crypto expert; just a parrot.)

thread_rng uses a cryptographically secure PRNG, and is meant to be suitable for use by everybody. Even the standard library itself used to use it to initialize HashMap state. Basically, you’re sharing it with all of the libraries you use, and having the ability to reseed it for one purpose could compromise the security it promises to all of the other code that uses it.

thread_rng goes so far as to even reseed itself from the OS occasionally, which in theory should not be necessary for cryptographic security, but may help it recover from exploits that expose the state of the CSPRNG…? (again: I am not a crypto expert!) In any case, this makes user reseed-ability pointless.


#15

How do you contact the authors of crates?

I am not convinced.

I am not an expert on cryptography either, but we do not have to be. It is clear that for the purposes of secure cryptography the closer you get to true randomness (whatever that is; there is a not very helpful definition from information theory) the better.

Which is why we have pseudo random numbers that let us simulate randomness but in a deterministic manner. We have to switch to true (?) randomness for actual use in the wide world, but we also need to be able to develop, debug and test that code. For that, determinism is essential.


#16

From what I’ve read, this is really one of the greatest myths about practical uses of random numbers.


#17

No. They are talking about something different. (/dev/urandom vs. /dev/random: I could not care less either way, I use the standard libraries.)

The discussion there about “true randomness” is confusing. Quantum effects are the ultimate source of entropy I believe. But that does not explain what it is!

In information theory (IIRC, it has been 25 years for me) randomness is defined relative to a Turing machine (or equivalent), and a number is random if there is no programme shorter than the number that outputs that number. That is, the number is incompressible.

You can see how that does not really help if I am simulating a coin toss. I do not know or care about compressibility.

We have a rough definition that is sufficient: Randomness means unpredictable. Mathematically that is hard to pin down and I have not seen it done, but really why should we care?

The point about having pseudo randomness is to simulate unpredictability predictably. Hence the authors of thread_rng need a talking to about supplying a debug mode. IMO

But it has been a long time since I was studying randomness in an information-theoretic sense so I would be thrilled to be corrected. I would be learning something.

BTW the forum software is harassing me about continuing this discussion! A bit creepy but I guess it has a point

Thank you for your help. I think I have it working now, with a lot of re-factoring to do. I’ll find the authors of thread_rng and ask them what their opinions are.

Have a nice one!


#18

See my reply here:

The code is for fast seedable RNGs for multiple threads, but you can easily remove the thread_local. It will panic if you forget to set the seed, but is otherwise safe.
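The linked reply is not reproduced in this thread, so the following is only a guess at the general shape of such code, not the code it refers to: a seedable generator stored in a `thread_local!`, which panics if it is used before being seeded. `SplitMix64` is a hand-rolled stand-in, and `seed` and `next_u64` are hypothetical names.

```rust
use std::cell::RefCell;

// Stand-in generator (SplitMix64). NOT part of the rand crate.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

thread_local! {
    // One generator per thread; `None` until `seed` is called on that thread.
    static RNG: RefCell<Option<SplitMix64>> = RefCell::new(None);
}

// (Re)seed the calling thread's generator.
fn seed(value: u64) {
    RNG.with(|cell| *cell.borrow_mut() = Some(SplitMix64 { state: value }));
}

// Draw from the calling thread's generator; panics if it was never seeded.
fn next_u64() -> u64 {
    RNG.with(|cell| {
        cell.borrow_mut()
            .as_mut()
            .expect("rng used before seed() was called on this thread")
            .next_u64()
    })
}

fn main() {
    seed(123);
    let a = [next_u64(), next_u64()];
    seed(123);
    let b = [next_u64(), next_u64()];
    assert_eq!(a, b); // reseeding replays the same sequence
    println!("{:?}", a);
}
```

Because each thread owns its own generator, no locking is needed, and a thread that forgets to seed fails loudly instead of silently producing unrepeatable numbers.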


#19

If you remove the thread_local, couldn’t that lead to memory corruption if two threads access the RNG at the same time?


#20

Yeah, you only want to remove thread_local if you’re using a single thread. But I suppose even then, there’s no harm in leaving it.