Populate a vector of hashsets


#1

Hi Rustaceans, good day! I am a newbie here. Got a question about populating a vector of hashsets where the values to populate in a particular hashset in the vector depends on the values of the hashset in a previous element. If I fail to sound articulate enough, consider the following code:

use std::collections::HashSet;

fn main() {

    // create a vector of HashSets
    let mut v: Vec<HashSet<isize>> = Vec::new();
    for _ in 0..10 {
        v.push(HashSet::new());
    }

    // populate the HashSet in element 0.  (0..5 is arbitrary)
    for i in 0..5 {
        v[0].insert(i);
    }

    // traverse the list and populate the hash sets in element 1 to 9
    // the values to insert are 2 times each of the values in the previous element
    // 2 times is arbitrary here, just to show dependencies between elements
    for i in 1..10 {
        let states = &v[i-1];
        for k in states.iter() {
            v[i].insert( k*2 ); // cannot borrow `v` as mutable because it is also borrowed as immutable
        }
    }
}

Think of the vector v as a sequence of nodes and in each of the nodes, there is a set of possible (integer) states that are stored in the HashSet. The initial set of states (i.e. for element 0) is known. Now I need to begin from element 0 and populate the states of all the subsequent elements, according to some algorithm which requires looking at what the states are in the previous element.

The above doesn’t work because an immutable reference is borrowed by the variable “states” once and can’t be borrowed again mutably in v[i].insert(…). Changing “states” to a mutable reference won’t work either obviously because there can only be one mutable reference.

The only way I figured so far, is to clone the HashSet instead of holding on to (or borrowing) a reference to it. like so:

        ...
        let states = v[i-1].clone();
        ...

I know for sure that when I am trying to update the hashset in node i, I will never need to change anything in the nodes before i.

The question is, is there any other ways to do it which avoids cloning/copying? Or in general, what’s the idiomatic way to populate a “nested” data structure that requires looking at some parts of the data structure when populating other parts? I suppose it doesn’t have to be a vector of hashsets, it would be the same problem with a vector of vectors or a hashmap of vectors, etc etc.

The original problem is to implement a particular algorithm called “forward shooting grid” to price path-dependent financial derivatives. The “forward” here means exactly that the algorithm needs to populate a particular node based on the states of the previous nodes. The above code example is just an attempt to extract the “feature” of the problem.

Thanks for reading and thank you in advance for any pointers/advice which would be very much appreciated!

Cheers,
Albert


#2

Are you sure you need isize values? Usually integral values with known size are better.

Is this what you want to do?

fn main() {
    use std::collections::HashSet;

    let mut v: Vec<_> = (0 .. 10)
                        .map(|_| HashSet::<usize>::new())
                        .collect();

    for i in 0 .. v.len() / 2 {
        v[0].insert(i);
    }

    for i in 1 .. v.len() {
        let (v1, mut v2) = v.split_at_mut(i);
        v2[0].extend(&v1[v1.len() - 1]);
    }
}

Perhaps there are better ways to do the same. Unfortunately v.windows(2) can’t be used yet (this needs type-level integers):

for (w1, mut w2) in v.windows_mut<2>() {
    w2.extend(&w1);
}

Edit: I forgot the k*2, so you can’t use extend(), sorry.


#3

Genius!

v.split_at_mut(i) does the magic, letting me have a ref and a mutable ref at the same time to different parts of the vec,

Thank you very much leonardo!

I love this forum :slight_smile:


#4

While Leonardo’s solution works, I feel like it obfuscates what you are trying to do. Something you might consider is:

use std::collections::HashSet;
use std::cell::RefCell;

fn main() {

    // create a vector of HashSets
    let mut v: Vec<RefCell<HashSet<isize>>> = Vec::new();
    for _ in 0..10 {
        v.push(RefCell::new(HashSet::new()));
    }

    // populate the HashSet in element 0.  (0..5 is arbitrary)
    for i in 0..5 {
        v[0].borrow_mut().insert(i);
    }

    // traverse the list and populate the hash sets in element 1 to 9
    // the values to insert are 2 times each of the values in the previous element
    // 2 times is arbitrary here, just to show dependencies between elements
    for i in 1..10 {
        for k in v[i-1].borrow().iter() {
            v[i].borrow_mut().insert( k*2 ); 
        }
    }
}

I think it’s a little easier to read; yeah you have the RefCell in the middle, but the for loop behaves more like you’d expect, we only have immutable access on the array V and v[i-1] and have mutable access to v[i].

I’m not sure how much overhead the RefCell adds, I feel like the compiler should be able to optimize most of it out.


#5

Hi Pixel, thanks a lot for pointing me to Cell/RefCell which I haven’t actually come to read about before reading your solution.

Looks like “internal mutability” is exactly the problem RefCell is trying to solve (if I understand the documentation correctly) Also, it would work for not just a vector of something but also a hashmap of something or a hashset of something, as long as that something being contained is wrapped in a RefCell.

Thanks again for the help! Much appreciated!