Rayon: how to get a parallel mut iterator over sub-slices obtained from a given slice and indices

Hi.
I have a slice of some type T data: &[T] and slice of indices idx: &[usize].
The idx - is a indices that indicate where each subslice start and one extra element at the end which is the count of the data
for example:

let mut data = [1, 2, 3, 1, 2, 3, 4, 1, 2];
let idx = [0, 3, 7, 9]; 

// Idx is just encoded ranges 0..3 , 3..7 , 7..9 for indexing 'data'
// Result must be iterator by mut slices: &mut data[0..3], &mut data[3..7], &mut data[7..9] 

Is it possible to make par iterator like this to mutate 'data'?

... without having to hold the whole file in memory to, I suppose.

Not necessary. At least for now

If you want to actually mutate data in-place in parallel, you'll have to wrap it in a Mutex.

If you need to get the slices and mutate them : playground
If you just need the slices but no mutation : playground

You don't need a mutex as long as the ranges don't overlap. Are the indexes in sorted order?

Yes , indices always in sorted order. And not overlap

You can use this with rayon:

pub fn subslices<'a, T>(
    arr: &'a mut [T],
    idxs: impl IntoIterator<Item = (usize, usize)>,
) -> impl Iterator<Item = &'a mut [T]> {
    idxs.into_iter().scan((0, arr), |(off, tail), (a, b)| {
        let (piece, new_tail) = std::mem::take(tail)[a - *off..].split_at_mut(b - a);
        *off = b;
        *tail = new_tail;
        Some(piece)
    })
}

fn main() {
    let mut data = [1, 2, 3, 1, 2, 3, 4, 1, 2];
    let idxs = [(0, 3), (7, 9)];
    
    for arr in subslices(&mut data, idxs.iter().copied()) {
        println!("{:?}", arr);
    }
}
2 Likes

Thanks! But how i can use this with rayon? Via par_bridge()?

Yes, I think that's what it's called.

(hijacking the thread a little) I wanted to answer that in this case it'd be actually viable to directly implement ParallelIterator instead of regular Iterator + par_bridge, I've took a look at rayon internal readme and looks like it should be fairly possible to implement Producer (it requires implementation of split_at to "drive" the parallel iterator, which should be doable with split + split_at_mut), but... what now? I couldn't find a way of going from Producer to ParallelIterator. @alice, any ideas?

I agree that it would be possible.

1 Like

You can use rayon::iter::split for this:

#[derive(Debug)]
pub struct SubSlices<'a, 'b, T> {
    idx: &'a [usize],
    data: &'b mut [T],
}

impl<'a, 'b, T> SubSlices<'a, 'b, T> {
    fn splitter(self) -> (Self, Option<Self>) {
        if self.idx.len() <= 2 {
            return (self, None);
        }
        let mid = self.idx.len() / 2;
        let (idx_r, idx_l) = (&self.idx[0..mid + 1], &self.idx[mid..]);
        let (data_r, data_l) = self.data.split_at_mut(idx_l[0] - idx_r[0]);
        (
            Self {
                idx: idx_r,
                data: data_r,
            },
            Some(Self {
                idx: idx_l,
                data: data_l,
            }),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 1, 2, 3, 4, 1, 2];
    let idx = [0, 3, 7, 9];
    rayon::iter::split(
        SubSlices {
            idx: &idx,
            data: &mut data,
        },
        SubSlices::splitter,
    )
    .for_each(|s| {
        dbg!(s);
    });
}
5 Likes

Oh, that's indeed helpful! And it's even linked from module docs:

If you’d like to build a custom parallel iterator, or to write your own combinator, then check out the split function and the plumbing module.

Perhaps too easy to miss though?

Anyway, looking at the source of Split showed me how to get ParallelIterator from Producer. I've even tried to implement IndexedParallelIterator. Seems to work! – Playground. I guess this should be slightly better than split, as rayon should be able to fall back to plain regular iteration for small chunks. (Or perhaps, if the chunks are not evenly sized, implementing IndexedParallelIterator is not a good idea?)

Anyway, there's really a lot of boilerplate – even though the implementation contains basically just three functions:

  • Iterator::next
  • Producer::split_at
  • len (separate impl in 4 traits :upside_down_face:).

Perhaps I'm still missing some helper from rayon that would allow to cut on boilerplate?

1 Like

Wow! Thanks!
Is it possible to turn this into IndexedParallelIterator ?
Because zip and enumerate is not worked without it..

The solution of @krdln should work for IndexedParallelIterator. I don't know of a way to reduce the boilerplate.

1 Like

Thanks!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.