I’m contemplating a breaking change to the Rayon parallel iterator API and would appreciate some input. Currently, the parallel iterator API assumes that each individual work item is cheap, and therefore it tries to group together multiple work items onto one thread to reduce overhead. The weight() method can be used to control this, by making Rayon use a multiplier to consider some items as more expensive.
However, I’ve noticed in practice that most parallel iterators seem to just call weight_max(), which basically forces all items to be (potentially) launched onto their own threads. Moreover, a lot of people come to Rayon with a small list of expensive tasks and then get surprised when they don’t see any parallelization (because they are not calling weight_max()).
To try and address this problem, I am contemplating changing the default so that Rayon assumes tasks are expensive by default. PR 81 deprecates weight_max() and introduces a new method, sequential_threshold(), which can be used instead. sequential_threshold() defaults to 1 but can be set higher: when the number of items falls below the threshold, Rayon will try not to parallelize.
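To make the idea of a sequential cutoff concrete, here is a minimal sketch using only the standard library. It is not Rayon's implementation and not the code in the PR: sum_with_threshold is a hypothetical helper, and treating "at most threshold items" as the cutoff is my assumption for the sake of the example. It splits a slice in half until a piece is small enough, then processes that piece on the current thread.

```rust
use std::thread;

/// Hypothetical helper, not part of Rayon: applies `work` to every item and
/// sums the results, splitting the slice in half until a piece holds at most
/// `sequential_threshold` items.
fn sum_with_threshold<F>(items: &[u64], sequential_threshold: usize, work: &F) -> u64
where
    F: Fn(u64) -> u64 + Sync,
{
    // Treat 0 like 1 so the recursion always terminates.
    let threshold = sequential_threshold.max(1);

    if items.len() <= threshold {
        // Small enough: stay on the current thread.
        return items.iter().map(|&x| work(x)).sum();
    }

    // Otherwise split in half and run the left half on another thread.
    let (left, right) = items.split_at(items.len() / 2);
    thread::scope(|s| {
        let handle = s.spawn(|| sum_with_threshold(left, threshold, work));
        let right_sum = sum_with_threshold(right, threshold, work);
        handle.join().unwrap() + right_sum
    })
}
```

With a threshold of 1 every item can end up on its own thread (the "assume expensive" default); with a threshold of, say, 4, an 8-item slice is split once and each half runs sequentially. A real runtime would hand work to a pool rather than spawn OS threads, so this only illustrates where the cutoff applies.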
The questions at hand
- Which do you think is the right default?
- Can you think of a more elegant way to control the sequential cutoff point?
A couple of further thoughts
One problem with sequential threshold is that it does not compose as cleanly as weight did. For example, what should the threshold be if you chain two iterators together, and one has a threshold of 32 and the other 64? Really, the threshold makes the most sense as something you set as the last step only.
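A small sketch of why the two knobs compose differently. Everything here is hypothetical: Side is not a Rayon type, and the cost model (length times weight, summed across a chain) is an assumption made for illustration rather than a description of Rayon's internals.

```rust
/// Hypothetical summary of one side of a chain; not a Rayon type.
#[derive(Clone, Copy)]
struct Side {
    len: usize,
    weight: f64,      // per-item cost multiplier
    threshold: usize, // sequential cutoff requested for this side
}

/// Weights compose: under the assumed cost model, the chain's estimated cost
/// is simply the sum of the two sides' costs.
fn chained_cost(a: Side, b: Side) -> f64 {
    a.len as f64 * a.weight + b.len as f64 * b.weight
}

/// Thresholds do not: the caller has to pick a rule for combining them
/// (min, max, ...), and the rules can disagree about the same chain.
fn run_sequentially(a: Side, b: Side, combine: fn(usize, usize) -> usize) -> bool {
    a.len + b.len < combine(a.threshold, b.threshold)
}
```

For two sides of 20 items each, with thresholds 32 and 64, combining with usize::min says "parallelize" (40 is not below 32) while usize::max says "stay sequential" (40 is below 64). Neither answer is obviously right, which is the composition problem.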
In theory, using Rayon with a threshold of 1 (that is, assuming expensive) is what you ought to do. We ought to tune the runtime so that this is low overhead and/or dynamically adapt to workloads. But in practice it doesn’t always work out that way and I’m not sure that we’ll be able to achieve it, so for the time being I’d like to keep the manual knob.