Process Struct in Parallel

I have a struct containing a HashSet that I need to access and update from different threads, potentially in parallel.

struct abc {
  field_a: HashSet<u32>,
}

I also have a method on the struct that I want to call in parallel:

impl abc {
    // Inserts into the set, so it needs mutable access to the struct.
    fn update(&mut self, i: u32) {
        self.field_a.insert(i);
    }
}

// Another method on the struct tries to update the collection in parallel.
// Invalid code: this does not compile.
for i in 1..max + 1 {
    let handle = std::thread::spawn(move || {
        self.update(i);
    });
    handles.push(handle);
}

How can I make this logic work?

I also want to be able to call global_instance.update(n) from any thread, with global_instance accessible from all threads.
If you need more details to help me, I will provide them.

I like this pattern:

use std::collections::HashSet;
use std::sync::{Arc, Mutex};

#[derive(Clone)]
struct Abc {
    field_a: Arc<Mutex<HashSet<u32>>>,
}

impl Abc {
    fn insert(&self, value: u32) {
        self.field_a.lock().unwrap().insert(value);
    }
}

Due to the Arc, cloning the Abc gives you another object that shares the same set, and changes to any clone are visible in any other clone.
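As a minimal sketch of how this pattern could replace the loop from the question: each thread gets its own clone of the Abc and inserts through it. The `new`, `len`, and `contains` helpers and the `fill_in_parallel` function are names made up for this illustration; they are not part of the snippet above.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone)]
struct Abc {
    field_a: Arc<Mutex<HashSet<u32>>>,
}

impl Abc {
    // Hypothetical constructor, added for the example.
    fn new() -> Self {
        Abc { field_a: Arc::new(Mutex::new(HashSet::new())) }
    }

    fn insert(&self, value: u32) {
        self.field_a.lock().unwrap().insert(value);
    }

    // Hypothetical read helpers, added so the result can be inspected.
    fn len(&self) -> usize {
        self.field_a.lock().unwrap().len()
    }

    fn contains(&self, value: u32) -> bool {
        self.field_a.lock().unwrap().contains(&value)
    }
}

// Spawns one thread per value in 1..=max, mirroring the loop in the question.
fn fill_in_parallel(abc: &Abc, max: u32) {
    let mut handles = Vec::new();
    for i in 1..=max {
        // Cloning Abc only clones the Arc handle, not the set itself.
        let abc = abc.clone();
        handles.push(thread::spawn(move || abc.insert(i)));
    }
    // Wait for every thread so all inserts are done before returning.
    for handle in handles {
        handle.join().unwrap();
    }
}
```

Calling `fill_in_parallel(&abc, 8)` on a fresh `Abc::new()` should leave the values 1 through 8 in the set, whichever order the threads happen to run in.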

I guess it's time to sit down and understand Arc and all the other things I have always tried to avoid.
Thanks @alice

The Arc type is a reference-counted container where all clones access the same object. It is used for cross-thread shared access. Generally, you make a clone before spawning a new thread, then move the clone into the thread. The shared value is destroyed when the last clone is dropped. An Arc only allows immutable access to the value inside it.
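To make the clone-then-move usage concrete, here is a small sketch using only the standard library. The function names `sum_on_threads` and `handle_count_demo` are invented for the example; the data is read-only, so no Mutex is needed.

```rust
use std::sync::Arc;
use std::thread;

// Reads one shared vector from several threads; each thread returns its own sum.
fn sum_on_threads(data: Vec<u32>, workers: usize) -> Vec<u32> {
    let shared = Arc::new(data);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Clone the handle before spawning, then move the clone into the thread.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.iter().sum::<u32>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

// Shows the reference count going up on clone and back down on drop.
fn handle_count_demo() -> (usize, usize) {
    let first = Arc::new(5);
    let second = Arc::clone(&first);
    let while_cloned = Arc::strong_count(&first);
    drop(second);
    (while_cloned, Arc::strong_count(&first))
}
```

Here `handle_count_demo()` returns `(2, 1)`: two handles while the clone is alive, one after it is dropped. The value itself is freed only once the count reaches zero.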

The Mutex type is a way to get mutable access to something that is shared (i.e. something you only have immutable access to). It works via locking, where only one thread is permitted access to the value inside it at any one time. If several threads call lock in parallel, one of them acquires the lock and the rest block until it is released, so each thread gets its turn with the value.
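The same locking applies to the global instance asked about in the question. A static lives for the whole program, so it does not need an Arc, only the Mutex. The sketch below assumes Rust 1.70 or later for `std::sync::OnceLock`; the names `GLOBAL_SET`, `global_set`, and `update_from_threads` are hypothetical.

```rust
use std::collections::HashSet;
use std::sync::{Mutex, OnceLock};
use std::thread;

// Hypothetical global set. OnceLock is used because HashSet::new is not a const fn.
static GLOBAL_SET: OnceLock<Mutex<HashSet<u32>>> = OnceLock::new();

// Returns the global, creating it the first time any thread asks for it.
fn global_set() -> &'static Mutex<HashSet<u32>> {
    GLOBAL_SET.get_or_init(|| Mutex::new(HashSet::new()))
}

// Callable from any thread.
fn update(value: u32) {
    // lock() blocks until no other thread holds the guard.
    let mut guard = global_set().lock().unwrap();
    guard.insert(value);
    // The guard is dropped at the end of this scope, which releases the lock.
}

// Calls update from `max` threads and returns the resulting set size.
fn update_from_threads(max: u32) -> usize {
    let handles: Vec<_> = (1..=max)
        .map(|i| thread::spawn(move || update(i)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    global_set().lock().unwrap().len()
}
```

Keeping the guard's scope short, as in `update`, limits how long other threads have to wait.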

One important alternative to consider: You can have each thread create its own HashSet, and then when they are done, you can combine them into one HashSet. This may be faster than what I originally suggested, as it avoids locking.
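A rough sketch of that lock-free alternative, assuming the work can be split up front: each thread fills a private HashSet and returns it through its join handle, and the caller merges the results. The function name `collect_in_parallel` and the way values are divided among workers are choices made for this example only.

```rust
use std::collections::HashSet;
use std::thread;

// Builds the set 1..=max using `workers` threads, without any shared state.
fn collect_in_parallel(max: u32, workers: u32) -> HashSet<u32> {
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            thread::spawn(move || {
                // Worker w takes w+1, w+1+workers, w+1+2*workers, ...
                let local: HashSet<u32> =
                    (w + 1..=max).step_by(workers as usize).collect();
                local
            })
        })
        .collect();

    // Merge the per-thread sets on the calling thread.
    let mut combined = HashSet::new();
    for handle in handles {
        combined.extend(handle.join().unwrap());
    }
    combined
}
```

Whether this actually beats the Mutex version depends on how much work each thread does per insert, so it is worth measuring both.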
