Is there a better way to implement this transposing iterator?

I'm trying to write an iterator that receives an array of "column" iterators and interleaves the columns into an iterator over rows. The rows themselves should be iterators. The goal is to avoid allocating intermediate vectors, but I'm trying to figure out if it's possible to also avoid the Rc<RefCell<_>>.

Playground Link

It would also be nice to avoid peeking the row iterator in the loop.


struct Transpose<'a, I>
where
    I: Iterator,
    I::Item: 'a,
{
    iters: Rc<RefCell<Vec<I>>>,
    row: Row<'a, I>,
}

impl<'a, I> Transpose<'a, I>
where
    I: Iterator,
    I::Item: 'a,
{
    fn new(iters: Rc<RefCell<Vec<I>>>) -> Self {
        Self {
            row: Row::new(Rc::clone(&iters)),
            iters,
        }
    }

    fn next(&mut self) -> Option<&mut Row<'a, I>> {
        if self.row.empty {
            return None;
        }
        self.row = Row::new(Rc::clone(&self.iters));
        Some(&mut self.row)
    }
}

pub struct Row<'a, I>
where
    I: Iterator,
    I::Item: 'a,
{
    index: usize,
    empty: bool,
    iters: Rc<RefCell<Vec<I>>>,
    phantom: PhantomData<&'a I>,
}

impl<'a, I> Row<'a, I>
where
    I: Iterator,
    I::Item: 'a,
{
    fn new(iters: Rc<RefCell<Vec<I>>>) -> Self {
        Self {
            index: 0,
            empty: false,
            iters,
            phantom: PhantomData,
        }
    }
}

impl<'a, I> Iterator for Row<'a, I>
where
    I: Iterator,
    I::Item: 'a,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let mut iters = self.iters.borrow_mut();
        if self.index == iters.len() {
            return None;
        }

        let item = iters[self.index].next();
        self.index += 1;
        self.empty = item.is_none();
        item
    }
}

Things I've tried so far:

  • streaming_iterator: Must be Option<&Self::Item>, but the inner iterator needs to be &mut
  • itertools::IntoChunks: I'd like to be able to pass iterator created by transpose around, composing it with more transposing iterators, but I had difficulty storing/composing IntoChunks since it is consumed when you call into_iter.

Would the following suit your needs?

fn transpose<'iter, Column, Item : 'iter> (columns: &'iter [Column])
  -> impl Iterator<Item = impl Iterator<Item = &'iter Item>>
where
    &'iter Column : IntoIterator<Item = &'iter Item>,
{
    (0 ..)
        .scan((), move |&mut (), row_idx| Some({
            let mut columns_iterator = columns.iter();
            let first_column = columns_iterator.next()?;
            let first: &'iter Item =
                first_column
                    .into_iter()
                    .nth(row_idx)?
            ;
            Iterator::chain(
                ::core::iter::once(first),
                columns_iterator
                    .map(move |column| {
                        column
                            .into_iter()
                            .nth(row_idx)
                            .unwrap() // assumes the columns are of equal length
                    })
                ,
            )
        }))
}
2 Likes

Perhaps studying the implementation of group_by and chunks from itertools may help here. So far, it seems that your Transpose is not even an iterator. If it was one, the user of that iterator could use e.g. the second row before the first one; group_by and chunks implementations seem to solve this by conditionally buffering elements in case of multiple groups/chunks being kept around in parallel.

With the next funtion’s return type lifetime bound by the mutable borrow like you have it, Iterator cannot be implemented, but also you can make do without any Rc or whatever. For example like this: →playground. The last row will always be shorter than the previous ones in my implementation, it can even be empty. To avoid empty rows, one would need to introduce some peeking to make sure that there’s a next element before the next row is returned.

Edit: I totally forgot that I wanted (and need) to add a Drop implementation for Row. I will write another post once that one’s done.

1 Like

Wow, I have a lot to learn from these examples. Thank you!

@Yandros your example does what I was trying to achieve.

I was hoping I might be able to compose them something like this:

// columns1 and 2 are columns of integers
let a = transpose(&columns1).map(|row| row.sum());
let b = transpose(&columns2).map(|row| row.product());

let columns3 = [a, b];
let c = transpose(&columns3).map(|row| row.sum());
// ... and so forth

But looks like I might have to Box the closures or the iterators because of the closures being different.

1 Like

fixed now: [updated playground]

By the way, there are still opportunities for optimization here that I haven’t taken, e.g. by manually implementing things like fold and size_hint for Row.

Since your closures capture no state, you can coerce them to fn(_) -> _ pointers, which is a unifying type:

// columns1 and 2 are columns of integers
let a = transpose(&columns1).map::<_, fn(_) -> _>(|row| row.sum());
let b = transpose(&columns2).map::<_, fn(_) -> _>(|row| row.product());

let columns3 = [a, b];

should your closures capture some state, then you should use &mut (dyn FnMut(_) -> _) as your unifying type:

let (mut storage1, mut storage2) = (None, None); // stack allocations
let a = transpose(&columns1).map::<_, &mut (dyn FnMut(_) -> _)>(
    storage1.get_or_insert(|row| row.sum())
);
let b = transpose(&columns2).map::<_, &mut (dyn FnMut(_) -> _)>(
    storage2.get_or_insert(|row| row.product())
);

let columns3 = [a, b];
  • (or Box<dyn FnMut(_) -> _> should you need to return them).

Note that for your let c = transpose(&columns3) you may need to tweak the code of my previous post to be using &mut references instead, and thus use here let c = transpose (&mut columns3);.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.