Nth with multiple sorted indexes?


#1

For some iterators I’d like a function like nth() that accepts an ordered sequence of more than one index. This is useful to avoid scanning the same iterable more than once.

A eager and rough version of it:

fn multi_nth<It1, It2>(mut seq: It1, idxs: It2) -> Vec<u64>
where It1: Iterator<Item=u64>, It2: Iterator<Item=usize> {
    let mut result = vec![];
    let mut si = 0;

    'OUTER: for i in idxs {
        while si <= i {
            if let Some(x) = seq.next() {
                if si == i { result.push(x); }
                si += 1;
            } else {
                break 'OUTER;
            }
        }
    }
    result
}

Do you know where to find a lazy version of this, and if it’s a good idea to add it to std lib or itertools?


#2

Here’s a rough version I just threw together: playground

pub struct MultiNth<I, N> {
    iter: I,
    indexes: N,
    iter_idx: usize
}

impl <I: Iterator, N:Iterator<Item=usize>> MultiNth<I, N> {
    fn new(iter: I, indexes: N) -> Self {
        MultiNth {iter, indexes, iter_idx: 0}
    }
}

impl<I: Iterator, N:Iterator<Item=usize>> Iterator for MultiNth<I, N> {
    type Item = I::Item;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        match self.indexes.next() {
            None => None,
            Some(idx) => {
                let mut res = match self.iter.next() {
                    None => return None,
                    Some(next) => next
                };
                self.iter_idx += 1;
                while self.iter_idx < idx {
                    self.iter_idx += 1;
                    res = match self.iter.next() {
                        None => return None,
                        Some(next) => next
                    };
                }
                Some(res)
            }
        }
    }
}

#3

I think it would be good for itertools


#4

I’ve tried to improve your code a little, now it’s usable in an iterator chain. But the handling of indexes is still wrong, and now I’m too much sleepy to debug it (debugging from other people is welcome):

#[derive(Debug)]
struct MultiNthState<I, N>
    where I: Iterator
{
    indexes: N,
    iter_idx: usize,
    underlying: I,
}

trait MultiNth: Iterator {
    fn multi_nth<N>(self, indexes: N) -> MultiNthState<Self, N>
        where N: Iterator<Item=usize>,
        Self: Sized,
    {
        MultiNthState { indexes, iter_idx: 0, underlying: self }
    }
}

impl<I> MultiNth for I where I: Iterator {}

impl<I, N> Iterator for MultiNthState<I, N>
    where I: Iterator,
          N: Iterator<Item=usize>,
{
    type Item = I::Item;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        match self.indexes.next() {
            None => None,
            Some(idx) => {
                let mut res = match self.underlying.next() {
                    None => return None,
                    Some(next) => next
                };
                self.iter_idx += 1;
                while self.iter_idx < idx {
                    self.iter_idx += 1;
                    res = match self.underlying.next() {
                        None => return None,
                        Some(next) => next
                    };
                }
                Some(res)
            }
        }
    }
}

fn main() {
    // Indexes must be ordered and strictly increasing?
    for f in [10, 20, 30, 40, 50, 60, 70]
             .iter()
             .multi_nth([1, 2, 5].iter().cloned()) {
        println!("{}", f); // Should print: 20 30 60
    }
}

#5

Alright, I figured out how to actually make it work by just using enumerate to do the index tracking for us:
playground

use std::iter::Enumerate;

pub struct MultiNth<I, N> {
    iter: I,
    indexes: N,
}

impl <I: Iterator, N:Iterator<Item=usize>> MultiNth<Enumerate<I>, N> {
    fn new(iter: I, indexes: N) -> Self {
        MultiNth {iter: iter.enumerate(), indexes}
    }
}

impl<B, I: Iterator<Item=(usize, B)>, N:Iterator<Item=usize>> Iterator for MultiNth<I, N> {
    type Item = B;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        match self.indexes.next() {
            None => None,
            Some(idx) => {
                loop {
                    match self.iter.next() {
                        None => return None,
                        Some((idx2, next)) => {
                            if (idx2 == idx) {
                                return Some(next)
                            }
                        }
                    }
                }
            }
        }
    }
}

#6

Repeated application of .nth() could help some iterators turn this into a faster operation.

But then a good way to handle errors is needed (indices not sorted)?


#7

A possible solution is to assume the indexes are few (and they are Copy), you can define an Increasing struct with a smart constructor that gives the error:

[10, 20, 30, 40, 50, 60, 70]
.iter()
.multi_nth(Increasing([1, 2, 5].iter().cloned()).unwrap())

Other ideas are welcome.


#8

Here’s a simple implementation using nth: playground.

pub struct MultiNth<I, N> {
    iter: I,
    indexes: N,
    offset: usize,
}

impl <I: Iterator, N:Iterator<Item=usize>> MultiNth<I, N> {
    fn new(iter: I, indexes: N) -> Self {
        MultiNth {iter, indexes, offset: 0}
    }
}

impl<I: Iterator, N:Iterator<Item=usize>> Iterator for MultiNth<I, N> {
    type Item = I::Item;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        match self.indexes.next() {
            None => None,
            Some(idx) => {
                let o = self.offset;
                let res = self.iter.nth(idx - o);
                self.offset = idx + 1;
                res
            }
        }
    }
}

You could keep a prev field in the Multi struct to detect if an out of date index is given.


#9

But the point of multi_nth is to avoid calling nth many times on the input sequence.


#10

Well, it keeps the Iterator and offset, so it’s not scanning from the beginning every time. So complexity-wise it should have the same efficiency.


#11

Using nth makes it (algorithmically at least) faster with iterators that can turn that into an O(1) step instead of O(n). For example slice iterators.