Writing a (parallel) zip for a const generic number of (parallel) iterators

I was playing around with zipping a const generic number of iterators. With a little bit of (hopefully sound?) unsafe and some unstable features, I got (including DoubleEndedIterator and ExactSizeIterator implementation, which are used below):

#![feature(array_map)]
#![feature(maybe_uninit_uninit_array)]
#![feature(maybe_uninit_array_assume_init)]

use std::mem::MaybeUninit;

pub struct ConstGenericZip<I, const N: usize>([I; N]);

impl<I, const N: usize> ConstGenericZip<I, N> {
    pub fn new<J>(iters: [J; N]) -> Self
    where
        J: IntoIterator<IntoIter = I>,
    {
        Self(iters.map(J::into_iter))
    }
}

impl<I, const N: usize> Iterator for ConstGenericZip<I, N>
where
    I: Iterator,
{
    type Item = [I::Item; N];

    fn next(&mut self) -> Option<Self::Item> {
        let mut dst = MaybeUninit::uninit_array();

        for (i, iter) in self.0.iter_mut().enumerate() {
            dst[i] = MaybeUninit::new(iter.next()?);
        }

        // SAFETY: If we reach this point, `dst` has been fully initialized
        unsafe { Some(MaybeUninit::array_assume_init(dst)) }
    }
}

impl<I, const N: usize> ExactSizeIterator for ConstGenericZip<I, N>
where
    I: ExactSizeIterator,
{
    fn len(&self) -> usize {
        self.0.iter().map(|x| x.len()).min().unwrap()
    }
}

impl<I, const N: usize> DoubleEndedIterator for ConstGenericZip<I, N>
where
    I: DoubleEndedIterator,
{
    fn next_back(&mut self) -> Option<Self::Item> {
        let mut dst = MaybeUninit::uninit_array();

        for (i, iter) in self.0.iter_mut().enumerate() {
            dst[i] = MaybeUninit::new(iter.next_back()?);
        }

        // SAFETY: If we reach this point, `dst` has been fully initialized
        unsafe { Some(MaybeUninit::array_assume_init(dst)) }
    }
}

I was hoping that it would be possible also to write the analogous parallel version of this iterator using rayon, assuming that the inner I is a parallel iterator. I got this far at first:

use rayon::iter::{
    plumbing::{bridge, Consumer, Producer, ProducerCallback, UnindexedConsumer},
    IndexedParallelIterator, IntoParallelIterator, ParallelIterator,
};

use std::array::IntoIter;

pub struct ParConstGenericZip<I, const N: usize>([I; N]);

impl<I, const N: usize> ParConstGenericZip<I, N> {
    pub fn new<J>(iters: [J; N]) -> Self
    where
        J: IntoParallelIterator<Iter = I>,
    {
        Self(iters.map(J::into_par_iter))
    }
}

impl<I, const N: usize> ParallelIterator for ParConstGenericZip<I, N>
where
    I: IndexedParallelIterator,
{
    type Item = [I::Item; N];

    fn drive_unindexed<C>(self, consumer: C) -> C::Result
    where
        C: UnindexedConsumer<Self::Item>,
    {
        bridge(self, consumer)
    }
}

impl<I, const N: usize> IndexedParallelIterator for ParConstGenericZip<I, N>
where
    I: IndexedParallelIterator,
{
    fn drive<C>(self, consumer: C) -> C::Result
    where
        C: Consumer<Self::Item>,
    {
        bridge(self, consumer)
    }

    fn len(&self) -> usize {
        self.0.iter().map(|x| x.len()).min().unwrap()
    }

    fn with_producer<CB>(self, callback: CB) -> CB::Output
    where
        CB: ProducerCallback<Self::Item>,
    {
        todo!()
    }
}

However, I'm having trouble expanding the todo!() above. I figure I have to write a Producer, and assuming I could get the Producer of each I, I came up with this:

pub struct ParConstGenericZipProducer<P, const N: usize>([P; N]);

impl<P, const N: usize> Producer for ParConstGenericZipProducer<P, N>
where
    P: Producer,
{
    type Item = [P::Item; N];
    type IntoIter = ConstGenericZip<P::IntoIter, N>;

    fn into_iter(self) -> Self::IntoIter {
        ConstGenericZip::new(self.0.map(Producer::into_iter))
    }

    fn split_at(self, index: usize) -> (Self, Self) {
        let mut left_array = MaybeUninit::uninit_array();
        let mut right_array = MaybeUninit::uninit_array();

        for (i, producer) in IntoIter::new(self.0).enumerate() {
            let (left, right) = producer.split_at(index);
            left_array[i] = MaybeUninit::new(left);
            right_array[i] = MaybeUninit::new(right);
        }

        // SAFETY: Arrays are guaranteed to be fully initialised at length `N`
        let left_array = unsafe { MaybeUninit::array_assume_init(left_array) };
        let right_array = unsafe { MaybeUninit::array_assume_init(right_array) };

        (
            ParConstGenericZipProducer(left_array),
            ParConstGenericZipProducer(right_array),
        )
    }
}

However, I can't find a way to create the Producer of each I in order to initialize a ParConstGenericZipProducer in the with_producer method. Is there a good way forward here? Or another, alternative way to accomplish something like this?

(Playground.)

It's definitely not trivial to deal with ProducerCallback, unfortunately. The plumbing README goes into what that is, and why, and I'd also recommend looking at the regular Zip::with_producer.

So it will require recursive calls to the inner I::with_producer, but since all your Is are the same type, I think it might be possible to do this with one generic ProducerCallback type and a bunch of MaybeUninit. But, the callback type will also be nesting, so I'm not sure that will work.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.