Rayon's parallel iterators work great when each calculation is *independent* from every other calculation. For example, we can multiply each number in an array by two and then `sum()`

the products, where summation is an example of an *associative* operation we can use with `reduce_with()`

.

```
let data: Vec<i32> = vec![1, 2, 3, 4];
let result: i32 = data.par_iter().map(|x| x * 2).sum();
assert!(result == 1*2 + 2*2 + 3*2 + 4*2);
```

This gets to be a little trickier if we need some sequential dependency between calculations. Suppose we want to multiply each number by two, * then take the difference* with the previous product, and then sum the results? This is sort of operation is very common, for example, when computing a derivative or convolving with some kernel.

Solution:There is no way to express this kind of sequential dependency in Rayon as of now. There is a feature request to convert parallel iterators to sequential iterators without`collect()`

that would make this possible, but it dates back to 2017 and won't likely me merged any time soon.If your first operation is distributive (scalar multiplication is)

andyou have a slice of the data you want to work on already in memory:`let data: Vec<i32> = vec![1, 2, 3, 4]; let result: i32 = data.par_windows(2) // requires a slice .map(|v| 2 * (v[1] - v[0])) .sum();`

Or if you don't have the slice in memory you can use a sequential iterator like

`Itertools::tuple_windows()`

to feed a parallel iterator:`let data: Vec<i32> = vec![1, 2, 3, 4]; let data = data.into_iter(); // don't need all data in memory at once let result: i32 = data.tuple_windows::<(_,_)>() .par_bridge() .map(|(x, y)| 2 * (y - x)) .sum();`

If you can't distribute the first operation or you don't have all the data in memory then it's about trade-offs. If you're memory constrained then be prepared to do the first op serially or redundantly in parallel:

`// Serial scalar multiplication, then parallel reduction. data.tuple_windows::<(_,_)>().map(|x| x*2).par_bridge().map(|(x,y)| y-x).sum(); // Everything's parallel, but multiplication done twice. data.tuple_windows::<(_,_)>().par_bridge().map(|x,y| (2*x,2*y)).map(|(x,y)| y-x).sum();`

If the first op is computationally expensive and you're not memory constrained then

`collect()`

partway through like this:`let data: Vec<i32> = data.par_iter().map(|x| x*2).collect(); let result: i32 = data.par_windows(2).map(|v| 2 v[1] - v[0]).sum();`

See also https://users.rust-lang.org/t/parallel-work-collected-sequentially/13504.

Thanks to @dthul for pointing out

`ParallelSlice::par_windows()`

and @drewkett for pointing out`Itertools::tuple_windows()`

, which make these concise workarounds possible. The rest of my original post continues below for posterity.

The obvious (to me) solution is to perform the first operation (multiplication) in parallel, * then collect the results*, and then perform the dependent (difference) operation in parallel:

```
let data: Vec<i32> = vec![1, 2, 3, 4];
let products: Vec<i32> = data.par_iter().map(|x| x * 2).collect();
let result: i32 = products.iter().zip(products.iter().skip(1)).par_bridge().map(|(x,y)| y - x).sum();
assert!(result == (2*2 - 1*2) + (3*2 - 2*2) + (4*2 - 3*2));
```

However, imagine the data are *very long* and actually come from a lazily-evaluated iterator instead of a vector. Collecting them into the intermediate `products`

vector could cost a lot of memory! Is there a more efficient/idiomatic way to achieve this sort of non-associative reduction, or to deal with sequential dependencies between parallel calculations in general?