Why Vec.shuffle() may produce the same result?

I suppose this is more of natural of shuffle or pseudo-random generation issue. Anyway, here is my question.

I understand that normally it's because seeding with the same rng value, resulting in the same result without items being shuffled. However, my case is I already provide with ThreadRng by calling rng(). Actually the vec does shuffle. It is just sometime the result produced is the same. For instance, the code below produces ["b", "c", "a"] at the first run, but the second run it produces ["a", "b", "c"], which is the same as my input vec data. I appreciate any advice or suggestions. Thanks.

let mut vec: Vec<&str> = vec!["a", "b", "c"];
let mut r = rng();
vec.shuffle(&mut r);
println!("{:?}", vec) // this may print ["a", "b", "c"]

When you trow 3 dices, two times, both results can be identical. I assume it is the same for shuffle, unless you actively prevent this case.

[EDIT]

If this hint was not clear enough: Shuffling a vector with N elements is typically implemented by swapping each position i with a random position between 0 and N-1. Obviously, this random position can be identical to i, so no visible change occurs. And when you have only 3 items, it might occur that for all 3 positions no visible change happens (or that a subsequent swap undoes a previous one). The shuffle algorithm typically takes no special care to avoid this case.

1 Like

Three element vector has only 6 possible permutations. So it is quite likely (about 16%) that after shuffling it will have the original permutation. Have you tried running this test many times and looking at permutation distribution? Documentation of SliceRandom::shuffle (from rand crate) says:

The resulting permutation is picked uniformly from the set of all possible permutations.

So if shuffle operation does not change the order of elements 1/6th of the time (or rather after changes, relative order of all elements is preserved), then everything works as it should.

6 Likes

Most likely this is an expected result. For reasons others have given above.

If it were the case that a supposed random shuffling could not produce the same as the input then it would not be random now would it.

I think if you want to run through all the different possible arrangements with no repetitions you need to permute rather than shuffle.

For example with itertools:

use itertools::Itertools;
let data = vec![1, 2, 3];
for perm in data.into_iter().permutations(3) {
    println!("{:?}", perm);
}
2 Likes

That's the nature of randomness.

Whenever in doubt about using random numbers, you can make a first test: execute your random experiment more often and check simple statistics:

use rand::{seq::SliceRandom, thread_rng};
use std::{collections::HashMap, env, process};

fn main() {
    let args: Vec<String> = env::args().collect();

    if args.len() != 2 {
        eprintln!("Usage: {} <number_of_shuffles>", args[0]);
        process::exit(1);
    }

    let iterations: u64 = match args[1].parse() {
        Ok(n) if n > 0 => n,
        _ => {
            eprintln!("Error: Please provide a positive integer.");
            process::exit(1);
        }
    };

    let mut rng = thread_rng();
    let mut counts: HashMap<String, u64> = HashMap::new();

    for _ in 0..iterations {
        let mut items = vec!["a", "b", "c"];
        items.shuffle(&mut rng);
        let key = format!("[{}, {}, {}]", items[0], items[1], items[2]);
        *counts.entry(key).or_insert(0) += 1;
    }

    let mut results: Vec<(String, u64)> = counts.into_iter().collect();
    results.sort_by(|a, b| a.0.cmp(&b.0));

    println!("Shuffle statistics ({iterations} iterations):");
    println!("------------------------------------------");

    for (outcome, count) in &results {
        let percentage = *count as f64 / iterations as f64 * 100.0;
        println!("{outcome}  {count:>10}  ({percentage:6.2}%)");
    }

    println!("------------------------------------------");
}

1 Like

In case of unit tests, however, a same-seed => same-output strategy would be preferable.

1 Like

This is about randomness, not about unit-tests.

1 Like

This is a good example of how random shuffling (picking a random permutation from a uniform distribution) may not match the intuitive expectation of what "shuffling" is. This is well known by creators of audio players – what people actually want when they toggle shuffle play is not a random permutation but something with more constraints.

(Also, it's worth noting that the likelihood of drawing any given permutation from a uniform distribution decreases extremely quickly as the number of elements grows. For a three-element list you're 17% likely to end up with the same order, but with ten-elements the likelihood already drops to one in 3.6 million, and with twenty elements, getting the same list back is astonishingly unlikely, even after years of continuous shuffling.)

3 Likes