I'm trying to implement a custom iterator for a struct, as follows:
#[derive(Clone)]
pub struct MovingWindow<'a, T, N> {
    slice: &'a [T],
    n: N,
    count: usize,
    init_size: usize,
}

impl<'a, 'b, T> Iterator for MovingWindow<'a, T, &'b Vec<usize>> {
    type Item = &'a [T];

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        if self.count <= self.init_size {
            let i = self.count;
            let win = self.n[i - 1];
            let start_i = if i < win { 0usize } else { i - win };
            // println!("i: {i}, win: {win}, start_i: {start_i}");
            let ret = Some(&self.slice[start_i..i]);
            self.count += 1;
            ret
        } else {
            None
        }
    }
}

pub trait Rolling {
    type T;
    fn rolling<N>(&self, n: N) -> MovingWindow<Self::T, N>;
}

impl<N> Rolling for [N] {
    type T = N;
    fn rolling<K>(&self, n: K) -> MovingWindow<Self::T, K> {
        MovingWindow { slice: self, n, count: 1, init_size: self.len() }
    }
}
The iterator's Item is a slice whose length depends on the MovingWindow's field n (a Vec<usize>).
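To make the window rule concrete, here is a minimal sketch of the same index arithmetic on a small input. The helper `window_at` is a hypothetical name used only for illustration; it is not part of the code above, and it uses `saturating_sub` in place of the explicit `if`/`else` clamp.

```rust
// Hypothetical helper mirroring the index arithmetic in `next`:
// the window for the element at 0-based position `pos` ends just after
// `pos` and reaches back at most `win` elements, clamped at the slice start.
fn window_at<T>(slice: &[T], pos: usize, win: usize) -> &[T] {
    let end = pos + 1;
    let start = end.saturating_sub(win);
    &slice[start..end]
}

fn main() {
    let data = [1.0f32, 2.0, 3.0, 4.0, 5.0];
    let n = [1usize, 2, 3, 2, 1];
    // Prints [1.0], [1.0, 2.0], [1.0, 2.0, 3.0], [3.0, 4.0], [5.0]
    for (pos, &win) in n.iter().enumerate() {
        println!("{:?}", window_at(&data, pos, win));
    }
}
```

So each window ends at the current element, and its length is the corresponding entry of n, truncated near the start of the slice.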
I also have the following trait, implemented for [f32] and Vec<f32>, which does the same thing as above but with an explicit loop:
pub trait RollWindow: AsRef<[f32]> {
    fn roll_window_sum(&self, window: &[usize]) -> Vec<f32> {
        let data = self.as_ref();
        let mut res = vec![f32::NAN; data.len()];
        for (i, win) in window.iter().enumerate() {
            let start_i = if i + 1 < *win { 0 } else { i + 1 - win };
            let data_part = &data[start_i..i + 1];
            res[i] = data_part.iter().sum::<f32>()
        }
        res
    }
}

impl RollWindow for [f32] {}
impl RollWindow for Vec<f32> {}
I tested the performance of both methods as follows:
let data1 = vec![1f32; 100_000_000];
let data2 = vec![10usize; 100_000_000];
t!(data1.rolling(&data2).map(|x| x.iter().sum::<f32>()).collect::<Vec<f32>>()); // test the iter method
// 344.632407ms
t!(data1.roll_window_sum(&data2)); // test the loop method
// 280.973004ms
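The definition of `t!` is not shown in this post. As a minimal sketch, assuming it simply times a single evaluation of an expression with `std::time::Instant` and prints the elapsed time, it could look like this (the real macro may differ):

```rust
// Hypothetical stand-in for the `t!` macro used above: evaluates the
// expression once, prints the elapsed wall-clock time, and returns the value.
macro_rules! t {
    ($e:expr) => {{
        let start = std::time::Instant::now();
        let out = $e;
        println!("{:?}", start.elapsed());
        out
    }};
}

fn main() {
    // Times a trivial computation; the printed duration varies per run.
    let total = t!((1..=10u32).sum::<u32>());
    println!("{total}");
}
```

A macro like this measures a single run, so the numbers are only meaningful for an optimized (`--release`) build and can vary between runs.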
This is the playground
I use both iterators (the standard ones, not ones I implemented myself) and loops a lot, and I have never found a significant performance difference between them. What am I missing in the implementation above that makes the iterator version slower?