Rng, iterators, multi-threading, reproducibility and more fun

Hello everyone,

My goal is to create some generic struct, let's call it Sampler, that will implement the Distribution<Outcome> trait from the rand crate for some type Outcome.

I would also like a method like

impl Sampler {
    fn blocks<R: Rng>(&self, block_size: usize, rng: R) -> Blocks<'_, R> {
        ... 
    }
}

where Blocks is an iterator that yields iterators, each of which samples block_size times. That is, something like

struct Blocks<'a, R> {
    sampler: &'a Sampler,
    block_size: usize,
    rng: R,
}

impl<'a, R: Rng> Iterator for Blocks<'a, R> {
    type Item = Outcome;

    fn next(&mut self) -> Option<Self::Item> {
        Some(
            an iterator that calls the sample method of sampler
            block_size times with the given rng
        )
    }
}

Finally, I want to use Sampler in a Monte Carlo simulation running over many threads, and I want the simulation to be reproducible from a seed given to the rng. I plan to use the xoshiro rng for that.

https://rust-random.github.io/rand/rand_xoshiro/struct.Xoshiro256StarStar.html

Do you have any tips or ideas for this task? I'm a bit lost right now.

Well first, if you want to store a reference to the Sampler in your Blocks iterator, you should take self by reference, not by value in your blocks method.

As for your returned iterator... what exactly do you want it to do? You want an iterator of iterators that does what?

The reference was a typo. I edited it. Thanks.

I want each iterator yielded by Blocks to be an iterator that samples block_size times.

Example

If Outcome = bool and block_size = 10, I want each call to Blocks.next to return some iterator that yields 10 random booleans using Sampler.

The main challenge you will run into is that the functions on the Rng trait require &mut self, which in turn means that you need exclusive access to the rng to call them. Furthermore, the Iterator trait is designed so that the returned items do not borrow from the iterator, and therefore you can create several independent iterators that coexist:

let mut blocks = sampler.blocks(block_size, rng);

let iter1 = blocks.next().unwrap();
let iter2 = blocks.next().unwrap();
// now use both iter1 and iter2

As you can see, neither iter1 nor iter2 would have exclusive access to the Rng in this case. To avoid this you can either:

  1. Use a RefCell.
  2. Create a new Rng per returned iterator, so each of them has exclusive access to its own rng.
  3. Generate all 10 samples in the call to next and return a std::vec::IntoIter.
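A minimal sketch of option 3, not a definitive implementation. To keep it free of external crates, it assumes a hypothetical SimpleRng trait standing in for rand's Rng, a small SplitMix64 generator standing in for xoshiro, and a Sampler whose Outcome is bool. Every sample of a block is drawn inside next, so the mutable borrow of the rng ends before the block is handed out.

```rust
// Hypothetical stand-in for rand's Rng trait (assumption, not the rand API).
trait SimpleRng {
    fn next_u64(&mut self) -> u64;
}

// SplitMix64, used here only as a small deterministic generator.
struct SplitMix64(u64);

impl SimpleRng for SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Hypothetical sampler with Outcome = bool.
struct Sampler;

impl Sampler {
    fn sample<R: SimpleRng>(&self, rng: &mut R) -> bool {
        rng.next_u64() & 1 == 1
    }

    fn blocks<R: SimpleRng>(&self, block_size: usize, rng: R) -> Blocks<'_, R> {
        Blocks { sampler: self, block_size, rng }
    }
}

struct Blocks<'a, R> {
    sampler: &'a Sampler,
    block_size: usize,
    rng: R,
}

impl<'a, R: SimpleRng> Iterator for Blocks<'a, R> {
    type Item = std::vec::IntoIter<bool>;

    fn next(&mut self) -> Option<Self::Item> {
        // Draw the whole block eagerly; the returned iterator owns its data
        // and no longer needs the rng.
        let block: Vec<bool> = (0..self.block_size)
            .map(|_| self.sampler.sample(&mut self.rng))
            .collect();
        Some(block.into_iter())
    }
}
```

The trade-off is one Vec allocation per block, in exchange for a single rng whose sequence does not depend on the order in which the blocks are consumed.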

Note that if your sampler only requires &self to generate a sample, e.g. if you have this signature:

impl Sampler {
    fn sample<R: Rng>(&self, rng: &mut R) -> SomeValue {
        ...
    }
}

then you don't need exclusive access to the Sampler, only the Rng, making option two quite promising.
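A rough sketch of option two, under stated assumptions: SimpleRng is a hypothetical stand-in for rand's Rng, fork is a hypothetical helper that seeds a child generator from the parent's next output, and SplitMix64 replaces xoshiro so that nothing outside std is needed. Whether streams derived this way are statistically independent enough for a real simulation is not checked here.

```rust
// Hypothetical stand-in for rand's Rng trait (assumption, not the rand API).
trait SimpleRng: Sized {
    fn next_u64(&mut self) -> u64;
    // Hypothetical helper: build a child generator for one block.
    fn fork(&mut self) -> Self;
}

struct SplitMix64(u64);

impl SimpleRng for SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    fn fork(&mut self) -> Self {
        // Seed the child from the parent's output instead of cloning it.
        SplitMix64(self.next_u64())
    }
}

// Hypothetical sampler with Outcome = bool; it only needs &self.
struct Sampler;

impl Sampler {
    fn sample<R: SimpleRng>(&self, rng: &mut R) -> bool {
        rng.next_u64() & 1 == 1
    }

    fn blocks<R: SimpleRng>(&self, block_size: usize, rng: R) -> Blocks<'_, R> {
        Blocks { sampler: self, block_size, rng }
    }
}

struct Blocks<'a, R> {
    sampler: &'a Sampler,
    block_size: usize,
    rng: R,
}

// One block: shares the sampler, owns its own generator.
struct Block<'a, R> {
    sampler: &'a Sampler,
    remaining: usize,
    rng: R,
}

impl<'a, R: SimpleRng> Iterator for Block<'a, R> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(self.sampler.sample(&mut self.rng))
    }
}

impl<'a, R: SimpleRng> Iterator for Blocks<'a, R> {
    type Item = Block<'a, R>;

    fn next(&mut self) -> Option<Self::Item> {
        Some(Block {
            sampler: self.sampler,
            remaining: self.block_size,
            rng: self.rng.fork(),
        })
    }
}
```

Because every Block owns its generator, several blocks can be alive and consumed in any interleaving, and the values in each block depend only on the seed and on the block's position in the sequence.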

  2. Create a new Rng per returned iterator, so each of them has exclusive access to its own rng.

I believe this is a good option. Indeed, I only need an immutable reference to Sampler to generate the values.

In that case you are almost there: playground.

@alice Thank you for your help!

There seems to be one problem with that approach: if you clone an rng, you get the exact same sequence twice.

Then you will have to find some other method of splitting the rng. You can't share it between the iterators.
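One possible way to split the rng for the multi-threaded case, sketched with std only. The assumptions: SplitMix64 stands in for xoshiro, count_hits is a hypothetical Monte Carlo task, and each thread gets a generator seeded from a value drawn from a master generator. For the xoshiro generators themselves, the rand_xoshiro documentation describes a jump method intended for producing non-overlapping subsequences for parallel computations, which may be the better fit.

```rust
use std::thread;

// SplitMix64 as a small deterministic stand-in generator (assumption).
#[derive(Clone)]
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Hypothetical Monte Carlo task: count how many of `n` draws are odd.
fn count_hits(mut rng: SplitMix64, n: usize) -> usize {
    (0..n).filter(|_| rng.next_u64() & 1 == 1).count()
}

// Runs one task per thread; the result depends only on the arguments.
fn run_simulation(base_seed: u64, n_threads: usize, n_per_thread: usize) -> Vec<usize> {
    let mut master = SplitMix64(base_seed);
    // Draw all child seeds up front, in a fixed order, on a single thread.
    let seeds: Vec<u64> = (0..n_threads).map(|_| master.next_u64()).collect();

    let handles: Vec<_> = seeds
        .into_iter()
        .map(|seed| thread::spawn(move || count_hits(SplitMix64(seed), n_per_thread)))
        .collect();

    // Joining in spawn order keeps the output independent of scheduling.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Calling run_simulation twice with the same arguments returns the same vector, because no generator is shared between threads and the child seeds are fixed before any thread starts. A cloned SplitMix64, by contrast, replays exactly the sequence of the original.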
