I have some components (A, B) as structs. Over time, we may add more components (C, D, ...), so I want the Rust tool to be flexible to these changes.
I have a for_each function which processes all the S structs, as shown in the example [1]. Updating S is a time-consuming operation, hence I want to use Rayon. I thought of adding the components' S structs as mutable references to a collection and modifying them in a function, such that changes made through the S collection are reflected in the components' internal structs. I am not going to access the S collection and the components at the same time, so there should not be any data races. Also, S is not shared between the components or structs.
I created an example in [1]. I looked into `Rc` and `RefCell`, which might support my use case, but they don't work with Rayon (`Rc` is not `Send`). Any suggestion on this with reduced computation time?
hmm. Thanks. Is it possible to do the clone() of Arc<Mutex<T>> similar to T.clone()? In my code I will be doing both operations: Arc::clone() for simultaneously storing the value in 2 different variables, and T.clone() for creating a copy of T to store in a different variable.
I'm not sure what you are asking. `Arc` is an atomically reference-counted smart pointer, hence it's cloneable; and if you dereference it, you can clone the underlying value, provided that value is itself cloneable.
Thanks @mbrubeck - One quick question: I came across `Pin`, and do you think it will be useful in this scenario?
Background: Each of my components' structs has a different structure. I have a function which updates an S struct, and I would like to call this function on all the S structs inside a component. Hence, I am collecting all the S structs in each component into a variable and processing that variable. This way I don't have to know what the structure of the component struct is, and I can blindly call the function on every S struct in the collection.
Thanks. I believe Vec<Arc<Mutex<T>>> is slower than Vec<T> while iterating using Rayon. Is that possible? The items in the Vec are mutually independent.
If switching between 1 and n threads doesn't make a difference, there's a good chance that one of the following occurred:
your use of Mutex is effectively causing everything to happen sequentially, because a Mutex makes sure only one thread can access something at a time;
that operation was never a bottleneck, so adding more parallelism doesn't help performance (see Amdahl's law);
the operation is so fast that you don't notice the speed increase from parallelism; or
Rayon decided your input is small enough that it's faster to do all the work on one thread instead of incurring the overhead of sending different chunks of the input to different threads and then working out how to merge the results at the end.
Looking at the code linked on the playground, I'm guessing it's the 4th option. If you've only got a couple of inputs and the operation is fast anyway, adding more parallelism can actually slow things down because you've got to coordinate the different threads.
Something to keep in mind is that multithreading isn't free. You'll mainly use it when there are a large number of operations/items, or each operation is slow and can be done independently of the others (imagine hashing the contents of a bunch of files).
Thanks @Michael-F-Bryan - My input is around 2000 items in the Vec; I gave a minimal example here with 2 items. It takes 15 minutes to process the entire Vec.
I am using Mutex to share the same value between 2 variables. The reason is that I have a lot of different parent structs, and deep down, each one has one to many S structs. Without the shared handles, I would have to go into each parent struct to update the value of its S structs. So, I decided to have a Vec on each parent containing pointers to its S structs; therefore, I don't have to drill down to update S.
In my code I do
let mut all = vec![];
all.append(&mut parent1.all);
all.append(&mut parent2.all);
....
It almost sounds like you're walking a tree (or many trees, so... a forest?). Instead of explicitly using par_iter(), what about using the more general rayon::scope() to run a closure which recursively walks the tree, using Scope::spawn() to do the next level of recursion (possibly) in parallel?
As long as updating an S can be done independently of its neighbouring S's, you should be able to take mutable references to each field instead of needing any form of synchronisation (Mutex).
It feels like you're trying to fight your architecture here.... Is it possible to restructure your data in a way that is more amenable to parallelism?
Usually Arc<Mutex<T>> hurts parallelism when you want to frequently access the T, because a Mutex's purpose is to make sure only one thread of execution can access it at a time.
I'm guessing the update_a_var_in_s() from your example isn't the real code? Because 2000 items isn't that many; a computer can chew through hundreds of thousands of increment operations in a single second.
If the individual operations are slow, you'll probably get better performance gains by using smarter algorithms (e.g. O(n log n) instead of O(n^2)), not collecting into temporary Vec's, and doing it on a single thread so you can get mutable access without the cost of synchronisation.
In my day job I work with a lot of moderately sized data and a pretty tight latency budget (operations need to complete within a couple of hundred milliseconds or the user gets frustrated), and I have found that algorithmic improvements can give orders of magnitude better performance than doing the naive thing in parallel. Of course, depending on your task there may not be any smarter algorithm available (e.g. updating every item in an array will always require touching every item in the array).