Rayon and join for multiple results


I'm using rayon and often find myself wanting to "divide" the work into more than two pieces. I know that Scope and spawn() exist for that. They have however two downsides:

  • It's cumbersome to return values, since outer variables have to be defined beforehand and updated inside the scope, making for longer code.
  • It's documented as slower than join.

I have found myself using a macro such as the one in this Rust Playground, allowing me (to my understanding) to split the work efficiently, and with short code, into multiple parts:

fn my_func() -> (i32, i64, f32, f64) {
    join!(|| calc1(), || calc2(), || calc3(), || calc4())
}

Is there any downside I'm not seeing to using such a macro/structure? I would almost assume so, since I'm surprised there is nothing similar available in core rayon.

(And as a bonus - is it possible to write that macro recursively to allow any number of arguments? I have tried in vain and resorted to spelling out each of the cases from 2 to 8 arguments.)
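For what it's worth, a recursive version is straightforward if you accept *nested* tuples as the result, e.g. (a, (b, (c, d))); flattening to (a, b, c, d) is the part that seems to force spelling out each arity by hand. A minimal sketch (the `mod rayon` below is a sequential stand-in so the snippet compiles without the crate; with the real rayon dependency you would delete it and the recursion parallelizes as expected):

```rust
// Sequential stand-in for rayon::join so this sketch is self-contained.
// With the real rayon crate as a dependency, remove this module.
mod rayon {
    pub fn join<A, B, RA, RB>(a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA,
        B: FnOnce() -> RB,
    {
        (a(), b())
    }
}

macro_rules! join {
    // Base case: exactly two closures map directly onto rayon::join.
    ($a:expr, $b:expr $(,)?) => {
        rayon::join($a, $b)
    };
    // Recursive case: peel off the first closure, join it against the rest.
    // Note the result nests: join!(a, b, c) yields (a, (b, c)).
    ($a:expr, $($rest:expr),+ $(,)?) => {
        rayon::join($a, || join!($($rest),+))
    };
}

fn main() {
    let (a, (b, (c, d))) = join!(|| 1i32, || 2i64, || 3.0f32, || 4.0f64);
    assert_eq!((a, b, c, d), (1, 2, 3.0, 4.0));
}
```

The nesting also means the joins form a chain rather than a balanced tree; for a handful of closures that difference is unlikely to matter.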


Why don't you just write an array of your arguments and call into_par_iter?

Your example would be:


Firstly, because I honestly did not consider the possibility - but now that I think about it, I'm struggling to see how to do that nicely/shortly/efficiently. Do you mean something like this: Rust Playground?
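My understanding of the suggestion, sketched minimally (shown with a plain iterator; with rayon's prelude in scope, `.iter()` would become `.into_par_iter()` to actually parallelize; the `calc*` functions are placeholders):

```rust
// Placeholder tasks standing in for the real computations.
fn calc1() -> i32 { 1 }
fn calc2() -> i32 { 2 }
fn calc3() -> i32 { 3 }

fn main() {
    // An array of same-typed tasks (fn pointers here). With rayon,
    // `.iter()` becomes `.into_par_iter()` from rayon::prelude.
    let tasks: [fn() -> i32; 3] = [calc1, calc2, calc3];
    let results: Vec<i32> = tasks.iter().map(|f| f()).collect();
    assert_eq!(results, vec![1, 2, 3]);
}
```

Note that this only type-checks because every element of the array has the same type.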

I'm struggling to get that to work without adding type annotations (which can be lengthy for many closures).

(To clarify - the sum is just a silly example; the types and calcs involved are far more complex, which is why the join! macro solution becomes nicer.)

Furthermore - and most critically, when I think about it - the array solution requires all the closures to return the same type, and they typically won't...

You've got some computations you want to run in parallel. They take a bunch of arguments and return different things; maybe they are all different. If they are, write a function that takes an enum of those different argument types, dispatches them to the intended function, and returns another result enum. Meanwhile your "main" function only consists of:

    .map(|args| dispatcher(args))

(Maybe you don't want to collect into anything, that's only an example)
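Spelled out, the dispatcher pattern might look like the following (the enums and tasks here are hypothetical stand-ins; the map is shown sequentially, and with rayon `into_iter()` would become `into_par_iter()`):

```rust
// One variant per kind of task the parallel batch may contain.
enum Task {
    Sum(Vec<i32>),
    Len(String),
}

// One variant per kind of result a task can produce.
#[derive(Debug, PartialEq)]
enum Out {
    Int(i32),
    Count(usize),
}

// Routes each task to the intended computation.
fn dispatcher(task: Task) -> Out {
    match task {
        Task::Sum(v) => Out::Int(v.iter().sum()),
        Task::Len(s) => Out::Count(s.len()),
    }
}

fn main() {
    let tasks = vec![Task::Sum(vec![1, 2, 3]), Task::Len("abcd".into())];
    // With rayon: tasks.into_par_iter().map(dispatcher).collect()
    let results: Vec<Out> = tasks.into_iter().map(dispatcher).collect();
    assert_eq!(results, vec![Out::Int(6), Out::Count(4)]);
}
```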

Thanks - agreed that it would work / is a different way to do it.

However - this way would force an allocation of a Vec (and most likely one for each of the enum values as well), and would force me to write a type and a dispatcher for each place in the code where I want to run multiple different things in parallel. I would also argue that it's less clear in the code.

Again, maybe I am missing something - but what, in your view, is the advantage of this approach? Or why is the join! macro a bad idea?

(I do appreciate the input though - I hope I don't come across as unwilling to try something else - I just want to find the best and clearest way.)



Really - my point with the join! macro is that it's really easy to "spray" all over the code whenever something can be split up, without adding any complexity or harder-to-read code.

I.e. just the way the join function from rayon allows - but letting me split into more than 2 pieces without having to type out recursive/multilayer closure code each time.
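For context, this is the multilayer closure code being avoided, here splitting four ways (the `mod rayon` is a sequential stand-in so the snippet compiles without the crate; the literals stand in for real computations):

```rust
// Sequential stand-in for rayon::join; delete when depending on the real crate.
mod rayon {
    pub fn join<A, B, RA, RB>(a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA,
        B: FnOnce() -> RB,
    {
        (a(), b())
    }
}

fn main() {
    // The nested-closure boilerplate needed to split four ways by hand:
    let ((a, b), (c, d)) = rayon::join(
        || rayon::join(|| 1, || 2),
        || rayon::join(|| 3, || 4),
    );
    assert_eq!((a, b, c, d), (1, 2, 3, 4));
}
```

Nesting as a balanced tree like this (pairs of pairs) keeps the depth logarithmic in the number of pieces, rather than chaining one join per closure.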

In general, when you have a bunch of parallel computations to do, they should be pretty "homogeneous" with respect to their types. That's why I prefer to lean on the type system rather than macros.

In your particular case, if you really only have something like 6-10 things, chained joins may be your best solution. Above that number, I would reframe the initial problem to get a nicely typed source iterator.


Unsurprisingly, there are others who have already been wondering the same.

Yes, that should definitely be possible. By the way, I am not sure what kind of nesting for multiple join calls is the best / most efficient. It might make sense to get familiar with the actual implementation of the thing to judge that best; and possibly compare against the design decisions for parallel iterators, I guess…

A join would, as far as I can tell, also add a layer of dynamism (in its interaction with the thread pool, work queue and such), so adding another layer and producing &mut dyn FnMut() callbacks could be feasible. I'd be curious whether such an approach (yet to be implemented - I'm trying it at the moment…), using parallel iterators, would perform better or worse than the "use lots of rayon::join" approach.


Here we go: Rust Playground

Thank you (I think :slight_smile:) - but I'll need a while to even start to understand what is going on here.

The most difficult part w.r.t. the macro stuff is understanding the dance of creating a bunch of identifiers recursively, distinguished only by hygiene. The implementation then merely packs all the closures up into nice dyn FnMut() + Send values, which involves some Options to still allow FnOnce on the one hand, and to pass back the return value on the other; the reason why FnMut is used in the first place is to avoid boxing.
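The Option dance described here can be shown in isolation, stripped of the macro machinery. This is my own minimal reconstruction of the trick, not the playground code itself:

```rust
// Drive an FnOnce through a &mut dyn FnMut() handle, without boxing the
// closure itself. One Option lets the FnMut body *take* (and so consume)
// the FnOnce; the other carries the return value back out of a () -> ()
// callback.
fn call_via_dyn_fnmut(f: impl FnOnce() -> String + Send) -> Option<String> {
    let mut slot = Some(f);
    let mut result = None;
    {
        let mut wrapper = || result = Some(slot.take().expect("called twice")());
        // The type-erased handle a scheduler could store and invoke:
        let cb: &mut (dyn FnMut() + Send) = &mut wrapper;
        cb();
    }
    result
}

fn main() {
    assert_eq!(
        call_via_dyn_fnmut(|| String::from("done")).as_deref(),
        Some("done")
    );
}
```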

As always with understanding macros, feel free to add trace_macros!(true); (a nightly-only feature) to the beginning of the file and follow a trace of the macro evaluation, though note that the output printed there will make all those hygiene-distinguished identifiers look indistinguishable. (Lots of "f" and "r".)


When you just want to apply the same operation to all the items in a collection or iterator - like the normal Iterator methods, just in parallel - into_par_iter() is your friend.

For something where the operations all do different things and return different types, I'd reach for the "thread pool"-like approach you get with Scope and spawn().
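The shape of that approach, sketched with std::thread::scope as a dependency-free stand-in (rayon::scope has the same outer-slot shape, with its spawn closures taking a `|_|` scope argument, and it reuses pool threads instead of spawning OS threads; the computations here are placeholders):

```rust
use std::thread;

// Declare outer slots, fill them inside the scope - the pattern the
// original question called cumbersome, but which handles differing
// return types without any macro.
fn two_results() -> (Option<i32>, Option<usize>) {
    let (mut a, mut b) = (None, None);
    thread::scope(|s| {
        s.spawn(|| a = Some(40 + 2));
        s.spawn(|| b = Some("hello".len()));
    });
    (a, b)
}

fn main() {
    assert_eq!(two_results(), (Some(42), Some(5)));
}
```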

If that feels cumbersome, then I'd normally interpret that as my code saying there's some sort of "impedance mismatch" that I could resolve by taking a different approach. I normally shy away from using macros to resolve awkward code because they tend to just sweep the problem under the rug.

To bastardize some well-known Go Proverbs,

Clear is better than clever.

Macros are never clear.

