How to do a better job of reusing the code base

This is related to a previous post on how to implement num_traits Zero for an enum used to unify number-like primitives.

At a high level, the function has the type Series -> Series. The following code computes the decile for every data type where that computation makes sense. The underlying decile function is Series -> Vec<u32>.

There are two dimensions to the question:

  1. Is there a way to write this code more generically?
  2. How might I re-use the code for functions other than decile?

pub fn decile_series(series: &Series, name: &str) -> Result<Series> {
    //
    // execution requires type specific code
    // return value for decile is Vec<u32>
    //
    let (idxs, data): (Vec<usize>, Vec<u32>) = match series.dtype() {
        DataType::Int8 => {
            let mut data_with_idx: Vec<(usize, i8)> =
                series.i8()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::Int16 => {
            let mut data_with_idx: Vec<(usize, i16)> =
                series.i16()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::Int32 => {
            let mut data_with_idx: Vec<(usize, i32)> =
                series.i32()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::Int64 => {
            let mut data_with_idx: Vec<(usize, i64)> =
                series.i64()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::UInt8 => {
            let mut data_with_idx: Vec<(usize, u8)> =
                series.u8()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::UInt16 => {
            let mut data_with_idx: Vec<(usize, u16)> =
                series.u16()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::UInt32 => {
            let mut data_with_idx: Vec<(usize, u32)> =
                series.u32()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::UInt64 => {
            let mut data_with_idx: Vec<(usize, u64)> =
                series.u64()?.into_no_null_iter().enumerate().collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::Float32 => {
            let mut data_with_idx: Vec<(usize, AnyNumber<f32>)> = series
                .f32()?
                .into_no_null_iter()
                .map(AnyNumber)
                .enumerate()
                .collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        DataType::Float64 => {
            let mut data_with_idx: Vec<(usize, AnyNumber<f64>)> = series
                .f64()?
                .into_no_null_iter()
                .map(AnyNumber)
                .enumerate()
                .collect();
            data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

            // extract vectors
            let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
            let data: Vec<u32> = decile(&data);
            Ok((idxs, data))
        }
        _ => Err(eyre!(
            "The underlying type is not a AnyNumber: {}",
            series.dtype()
        )),
    }?;

    // zip and sort by the df idxs
    let mut data_with_idx: Vec<(_, _)> = std::iter::zip(idxs, data).collect();
    data_with_idx.sort_unstable_by(|(idx_a, _), (idx_b, _)| idx_a.cmp(idx_b));

    // extract data (drop idxs)
    let (_, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
    let new_series = Series::new(name, data);

    Ok(new_series)
}

Note: AnyNumber is a wrapper used to implement Zero, Ord and others not already implemented by the float primitives.

struct AnyNumber<T>(T);

I've never worked with polars before, but there are two observations here that can direct us to a general purpose solution to this kind of problem:

  1. You have basically two "flavors" of code in decile_series, integer and floating point. There aren't any differences in how you handle the different integers, and the same is true for the floating point variants.
  2. The fact that you're working with different data types in each arm of the match complicates things. Closures seem like an obvious solution to problems like this, but since you're working with different data types you can't use them directly. A closure can't be generic in the same way a function can.

A simple way of extracting your repeated code is using generic functions in a similar way to how you'd use a closure.
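To make that point concrete without pulling in polars, here is a minimal std-only sketch. The `Column` enum and the `descending_order` and `order_of` names are hypothetical stand-ins invented for illustration; they are not polars types or functions. It only shows that a nested generic function can be called from every match arm, where a single closure could not.

```rust
/// Hypothetical stand-in for a dynamically typed column (not a polars type).
enum Column {
    Small(Vec<i8>),
    Wide(Vec<i64>),
    Unsigned(Vec<u32>),
}

/// Returns the original indices of the values, ordered from largest to smallest.
fn descending_order(column: &Column) -> Vec<usize> {
    // A closure is tied to one element type. A nested generic function is
    // instantiated separately for each element type it is called with.
    fn order_of<T: Ord + Copy>(values: &[T]) -> Vec<usize> {
        let mut with_idx: Vec<(usize, T)> = values.iter().copied().enumerate().collect();
        // Stable sort, so equal values keep their original relative order.
        with_idx.sort_by(|(_, a), (_, b)| b.cmp(a));
        with_idx.into_iter().map(|(idx, _)| idx).collect()
    }

    // Each arm hands a differently typed slice to the same helper.
    match column {
        Column::Small(v) => order_of(v.as_slice()),
        Column::Wide(v) => order_of(v.as_slice()),
        Column::Unsigned(v) => order_of(v.as_slice()),
    }
}
```

The match still has one arm per variant, but each arm shrinks to a single call, which is the same shape as the `handle_int` helper in the full listing.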

Note: I had to wing it as far as how you were implementing AnyNumber and decile, so it may not apply directly to your code as written without some tweaking.
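Since decile is left as a placeholder in the listings, here is one possible shape for it, purely as an assumption on my part: the real definition was not posted. The hypothetical `decile_by_position` below expects its input already sorted from largest to smallest and assigns buckets 1 through 10 by position alone, so it ignores ties.

```rust
/// Hypothetical decile assignment (an assumption, not the original function).
/// `sorted_desc` must already be sorted from largest to smallest. The value at
/// position 0 lands in decile 1, and later positions land in higher deciles.
fn decile_by_position<T>(sorted_desc: &[T]) -> Vec<u32> {
    let n = sorted_desc.len();
    // For an empty slice the range is empty, so the division never runs.
    (0..n).map(|pos| (pos * 10 / n) as u32 + 1).collect()
}
```

Because this version looks only at positions, it needs no numeric bounds on T at all. A definition that compares or averages values would need bounds such as Num or Ord, which is where the wrapper type comes in.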

Function implementation
use eyre::{eyre, Result};
use num_traits::{Num, One, Zero};
use polars::{
    prelude::{ChunkedArray, DataType, NamedFrom, PolarsNumericType},
    series::Series,
};
use std::ops::{Add, Div, Mul, Rem, Sub};

fn decile<T: num_traits::Num>(_: &[T]) -> Vec<u32> {
    todo!();
}

pub fn decile_series(series: &Series, name: &str) -> Result<Series> {
    //
    // execution requires type specific code
    // return value for decile is Vec<u32>
    //

    /// All of the integer types are handled the same, so we just accept a generic chunked array with the constraints we need on the native type it contains.
    fn handle_int<A: PolarsNumericType>(data: &ChunkedArray<A>) -> Result<(Vec<usize>, Vec<u32>)>
    where
        A::Native: Num + Ord,
    {
        let mut data_with_idx: Vec<(usize, _)> = data.into_no_null_iter().enumerate().collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));
        let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
        let data: Vec<u32> = decile(&data);
        Ok((idxs, data))
    }

    /// The floats need to be handled slightly differently, so they need their own function
    fn handle_float<A: PolarsNumericType>(data: &ChunkedArray<A>) -> Result<(Vec<usize>, Vec<u32>)>
    where
        A::Native: Float + Num,
    {
        let mut data_with_idx: Vec<(usize, AnyNumber<_>)> = data
            .into_no_null_iter()
            .map(AnyNumber)
            .enumerate()
            .collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

        // extract vectors
        let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
        let data: Vec<u32> = decile(&data);
        Ok((idxs, data))
    }

    let (idxs, data): (Vec<usize>, Vec<u32>) = match series.dtype() {
        DataType::Int8 => handle_int(series.i8()?),
        DataType::Int16 => handle_int(series.i16()?),
        DataType::Int32 => handle_int(series.i32()?),
        DataType::Int64 => handle_int(series.i64()?),
        DataType::UInt8 => handle_int(series.u8()?),
        DataType::UInt16 => handle_int(series.u16()?),
        DataType::UInt32 => handle_int(series.u32()?),
        DataType::UInt64 => handle_int(series.u64()?),
        DataType::Float32 => handle_float(series.f32()?),
        DataType::Float64 => handle_float(series.f64()?),
        _ => Err(eyre!(
            "The underlying type is not a AnyNumber: {}",
            series.dtype()
        )),
    }?;

    // zip and sort by the df idxs
    let mut data_with_idx: Vec<(_, _)> = std::iter::zip(idxs, data).collect();
    data_with_idx.sort_unstable_by(|(idx_a, _), (idx_b, _)| idx_a.cmp(idx_b));

    // extract data (drop idxs)
    let (_, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
    let new_series = Series::new(name, data);

    Ok(new_series)
}

/// Trait for abstracting over the float types which don't implement [Ord]
///
/// Depending on how you're actually implementing [Ord] for [AnyNumber], you may need to modify the signature of this trait
trait Float: Num + PartialOrd {}

impl Float for f32 {}
impl Float for f64 {}

struct AnyNumber<T>(T);

// Float is used as a bound on AnyNumber's impls, which is why it works as a bound in the handle_float fn
impl<T: Float> Eq for AnyNumber<T> {}
impl<T: Float> PartialEq for AnyNumber<T> {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}

impl<T: Float> PartialOrd for AnyNumber<T> {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        self.0.partial_cmp(&other.0)
    }
}

impl<T: Float> Ord for AnyNumber<T> {
    fn cmp(&self, _: &Self) -> std::cmp::Ordering {
        todo!()
    }
}

impl<T> Zero for AnyNumber<T>
where
    T: Zero,
{
    fn zero() -> Self {
        Self(T::zero())
    }

    fn is_zero(&self) -> bool {
        self.0.is_zero()
    }
}

impl<T> One for AnyNumber<T>
where
    T: One,
{
    fn one() -> Self {
        Self(T::one())
    }
}

impl<T> Add for AnyNumber<T>
where
    T: Add<Output = T>,
{
    type Output = Self;

    fn add(self, rhs: Self) -> Self::Output {
        Self(self.0 + rhs.0)
    }
}

impl<T> Sub for AnyNumber<T>
where
    T: Sub<Output = T>,
{
    type Output = Self;

    fn sub(self, rhs: Self) -> Self::Output {
        Self(self.0 - rhs.0)
    }
}

impl<T> Mul for AnyNumber<T>
where
    T: Mul<Output = T>,
{
    type Output = Self;

    fn mul(self, rhs: Self) -> Self::Output {
        Self(self.0 * rhs.0)
    }
}

impl<T> Div for AnyNumber<T>
where
    T: Div<Output = T>,
{
    type Output = Self;

    fn div(self, rhs: Self) -> Self::Output {
        Self(self.0 / rhs.0)
    }
}

impl<T> Rem for AnyNumber<T>
where
    T: Rem<Output = T>,
{
    type Output = Self;

    fn rem(self, rhs: Self) -> Self::Output {
        Self(self.0 % rhs.0)
    }
}

impl<T> Num for AnyNumber<T>
where
    T: Num,
    Self: PartialEq,
{
    type FromStrRadixErr = T::FromStrRadixErr;

    fn from_str_radix(str: &str, radix: u32) -> std::result::Result<Self, Self::FromStrRadixErr> {
        T::from_str_radix(str, radix).map(Self)
    }
}

To abstract over other kinds of operations, you can replace those functions with a trait. That generalizes the solution into something that is easier to reuse for other operations.

Trait implementation
use eyre::{eyre, Result};
use num_traits::{Num, One, Zero};
use polars::{
    prelude::{ChunkedArray, DataType, NamedFrom, PolarsNumericType},
    series::Series,
};
use std::ops::{Add, Div, Mul, Rem, Sub};

fn decile<T: num_traits::Num>(_: &[T]) -> Vec<u32> {
    todo!();
}

trait SeriesHandler {
    type Output;

    /// The actual data processing work is done here by the implementor.
    fn process<T: Num>(&mut self, data: &[T]) -> Self::Output;

    // The logic for converting into the Vec is provided by the default trait methods below so different implementations can just implement process rather than having to re-write the common code over and over.

    /// All of the integer types are handled the same, so we just accept a generic chunked array with the constraints we need on the native type it contains.
    fn handle_int<A: PolarsNumericType>(
        &mut self,
        data: &ChunkedArray<A>,
    ) -> Result<(Vec<usize>, Self::Output)>
    where
        A::Native: Num + Ord,
    {
        let mut data_with_idx: Vec<(usize, _)> = data.into_no_null_iter().enumerate().collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));
        let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
        let data = self.process(&data);

        Ok((idxs, data))
    }

    /// The floats need to be handled slightly differently, so they need their own function
    fn handle_float<A: PolarsNumericType>(
        &mut self,
        data: &ChunkedArray<A>,
    ) -> Result<(Vec<usize>, Self::Output)>
    where
        A::Native: Float + Num,
    {
        let mut data_with_idx: Vec<(usize, AnyNumber<_>)> = data
            .into_no_null_iter()
            .map(AnyNumber)
            .enumerate()
            .collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

        // extract vectors
        let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
        let data = self.process(&data);
        Ok((idxs, data))
    }

    fn handle(&mut self, series: &Series) -> Result<(Vec<usize>, Self::Output)> {
        match series.dtype() {
            DataType::Int8 => self.handle_int(series.i8()?),
            DataType::Int16 => self.handle_int(series.i16()?),
            DataType::Int32 => self.handle_int(series.i32()?),
            DataType::Int64 => self.handle_int(series.i64()?),
            DataType::UInt8 => self.handle_int(series.u8()?),
            DataType::UInt16 => self.handle_int(series.u16()?),
            DataType::UInt32 => self.handle_int(series.u32()?),
            DataType::UInt64 => self.handle_int(series.u64()?),
            DataType::Float32 => self.handle_float(series.f32()?),
            DataType::Float64 => self.handle_float(series.f64()?),
            _ => Err(eyre!(
                "The underlying type is not a AnyNumber: {}",
                series.dtype()
            )),
        }
    }
}

struct DecileSeries;

impl SeriesHandler for DecileSeries {
    type Output = Vec<u32>;

    fn process<T: Num>(&mut self, data: &[T]) -> Self::Output {
        decile(data)
    }
}

pub fn decile_series(series: &Series, name: &str) -> Result<Series> {
    //
    // execution requires type specific code
    // return value for decile is Vec<u32>
    //

    let mut handler = DecileSeries;
    let (idxs, data): (Vec<usize>, Vec<u32>) = handler.handle(series)?;
    // zip and sort by the df idxs
    let mut data_with_idx: Vec<(_, _)> = std::iter::zip(idxs, data).collect();
    data_with_idx.sort_unstable_by(|(idx_a, _), (idx_b, _)| idx_a.cmp(idx_b));

    // extract data (drop idxs)
    let (_, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
    let new_series = Series::new(name, data);

    Ok(new_series)
}

/// Trait for abstracting over the float types which don't implement [Ord]
///
/// Depending on how you're actually implementing [Ord] for [AnyNumber], you may need to modify the signature of this trait
trait Float: Num + PartialOrd {}

impl Float for f32 {}
impl Float for f64 {}

struct AnyNumber<T>(T);

// Float is used as a bound on AnyNumber's impls, which is why it works as a bound in the handle_float fn
impl<T: Float> Eq for AnyNumber<T> {}
impl<T: Float> PartialEq for AnyNumber<T> {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}

impl<T: Float> PartialOrd for AnyNumber<T> {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        self.0.partial_cmp(&other.0)
    }
}

impl<T: Float> Ord for AnyNumber<T> {
    fn cmp(&self, _: &Self) -> std::cmp::Ordering {
        todo!()
    }
}

impl<T> Zero for AnyNumber<T>
where
    T: Zero,
{
    fn zero() -> Self {
        Self(T::zero())
    }

    fn is_zero(&self) -> bool {
        self.0.is_zero()
    }
}

impl<T> One for AnyNumber<T>
where
    T: One,
{
    fn one() -> Self {
        Self(T::one())
    }
}

impl<T> Add for AnyNumber<T>
where
    T: Add<Output = T>,
{
    type Output = Self;

    fn add(self, rhs: Self) -> Self::Output {
        Self(self.0 + rhs.0)
    }
}

impl<T> Sub for AnyNumber<T>
where
    T: Sub<Output = T>,
{
    type Output = Self;

    fn sub(self, rhs: Self) -> Self::Output {
        Self(self.0 - rhs.0)
    }
}

impl<T> Mul for AnyNumber<T>
where
    T: Mul<Output = T>,
{
    type Output = Self;

    fn mul(self, rhs: Self) -> Self::Output {
        Self(self.0 * rhs.0)
    }
}

impl<T> Div for AnyNumber<T>
where
    T: Div<Output = T>,
{
    type Output = Self;

    fn div(self, rhs: Self) -> Self::Output {
        Self(self.0 / rhs.0)
    }
}

impl<T> Rem for AnyNumber<T>
where
    T: Rem<Output = T>,
{
    type Output = Self;

    fn rem(self, rhs: Self) -> Self::Output {
        Self(self.0 % rhs.0)
    }
}

impl<T> Num for AnyNumber<T>
where
    T: Num,
    Self: PartialEq,
{
    type FromStrRadixErr = T::FromStrRadixErr;

    fn from_str_radix(str: &str, radix: u32) -> std::result::Result<Self, Self::FromStrRadixErr> {
        T::from_str_radix(str, radix).map(Self)
    }
}

This strategy is similar to the visitor pattern, as used by serde and rustc, though I didn't call it a visitor since it's really only operating on a single case at a time.

You can of course tweak the trait to abstract over whatever other details might vary between operations too.
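As a rough illustration of that reuse, here is a std-only sketch with two different operations plugged into the same trait. Everything in it (`Column`, `ColumnHandler`, `DistinctCount`, `HasDuplicates`) is a hypothetical name made up for this example, and the `Ord + Copy` bound stands in for whatever bounds your real operations need.

```rust
/// Hypothetical stand-in for a dynamically typed column (not a polars type).
enum Column {
    Int(Vec<i64>),
    Unsigned(Vec<u32>),
}

trait ColumnHandler {
    type Output;

    /// The only method an implementor has to write.
    fn process<T: Ord + Copy>(&mut self, sorted_desc: &[T]) -> Self::Output;

    /// Shared sorting logic, written once as a default method.
    fn handle_values<T: Ord + Copy>(&mut self, values: &[T]) -> Self::Output {
        let mut sorted = values.to_vec();
        sorted.sort_unstable_by(|a, b| b.cmp(a));
        self.process(&sorted)
    }

    /// Shared dispatch over the runtime type tag.
    fn handle(&mut self, column: &Column) -> Self::Output {
        match column {
            Column::Int(v) => self.handle_values(v.as_slice()),
            Column::Unsigned(v) => self.handle_values(v.as_slice()),
        }
    }
}

/// Counts the distinct values in a column.
struct DistinctCount;

impl ColumnHandler for DistinctCount {
    type Output = usize;

    fn process<T: Ord + Copy>(&mut self, sorted_desc: &[T]) -> usize {
        if sorted_desc.is_empty() {
            return 0;
        }
        // Equal values are adjacent after sorting, so count the boundaries.
        1 + sorted_desc.windows(2).filter(|w| w[0] != w[1]).count()
    }
}

/// Reports whether any value occurs more than once.
struct HasDuplicates;

impl ColumnHandler for HasDuplicates {
    type Output = bool;

    fn process<T: Ord + Copy>(&mut self, sorted_desc: &[T]) -> bool {
        sorted_desc.windows(2).any(|w| w[0] == w[1])
    }
}
```

Each new operation costs one small impl block, while the match and the sorting code live in the trait's default methods and are never repeated.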


In the future please include code snippets that compile[1]. It makes it much easier to help you.


  1. or at least only fail to compile in the specific places related to your question


My codebase now uses the first approach from your answer (the generic helper functions). With a better understanding of the task, I was able to streamline the types and the code even further.

pub fn decile_series(series: &Series, name: &str) -> Result<Series> {
    //
    // execution requires type specific code
    // return value for decile is Vec<u32>
    //
    fn handle_int<A>(data: &ChunkedArray<A>) -> Result<(Vec<usize>, Vec<u32>)>
    where
        A: PolarsNumericType,
        A::Native: Num + Ord,
    {
        let mut data_with_idx: Vec<(usize, _)> = data.into_no_null_iter().enumerate().collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));
        Ok(compute(data_with_idx))
    }
    fn handle_float<A>(data: &ChunkedArray<A>) -> Result<(Vec<usize>, Vec<u32>)>
    where
        A: PolarsNumericType,
        A::Native: Num,
        WithOrd<A::Native>: Ord, // <<< new for me; reads well
    {
        let mut data_with_idx: Vec<(usize, _)> =
            data.into_no_null_iter().map(WithOrd).enumerate().collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));
        Ok(compute(data_with_idx))
    }

    fn compute<T: Number>(data_with_idx: Vec<(usize, T)>) -> (Vec<usize>, Vec<u32>) {
        let (idxs, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
        let data: Vec<u32> = decile(&data);
        (idxs, data)
    }

    let (idxs, data): (Vec<usize>, Vec<u32>) = match series.dtype() {
        DataType::Int8 => handle_int(series.i8()?),
        DataType::Int16 => handle_int(series.i16()?),
        DataType::Int32 => handle_int(series.i32()?),
        DataType::Int64 => handle_int(series.i64()?),
        DataType::UInt8 => handle_int(series.u8()?),
        DataType::UInt16 => handle_int(series.u16()?),
        DataType::UInt32 => handle_int(series.u32()?),
        DataType::UInt64 => handle_int(series.u64()?),
        DataType::Float32 => handle_float(series.f32()?),
        DataType::Float64 => handle_float(series.f64()?),
        _ => Err(eyre!(
            "The underlying type is not number-like: {}",
            series.dtype()
        )),
    }?;

    // zip and sort by the df idxs
    let mut data_with_idx: Vec<(_, _)> = std::iter::zip(idxs, data).collect();
    data_with_idx.sort_unstable_by(|(idx_a, _), (idx_b, _)| idx_a.cmp(idx_b));

    // extract data (drop idxs)
    let (_, data): (Vec<_>, Vec<_>) = data_with_idx.into_iter().unzip();
    let new_series = Series::new(name, data);

    Ok(new_series)
}

Here is a highlight of the WithOrd implementation, which is generic over every T for which FloatOrd<T> implements Ord. Note that FloatOrd is a struct with separate impl blocks for each T (f32 and f64). All in all, a nice composition of structs and traits using the num_traits and float_ord crates.

#[derive(Copy, Clone, Debug, PartialEq)]
pub(crate) struct WithOrd<T>(pub T);

impl<T> Ord for WithOrd<T>
where
    T: Float,
    FloatOrd<T>: Ord, // <<< key
{
    fn cmp(&self, other: &Self) -> Ordering {
        let v1 = FloatOrd(self.0);
        let v2 = FloatOrd(other.0);
        Ord::cmp(&v1, &v2)
    }
}
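For readers who want to try the same idea without the float_ord crate, here is a std-only sketch that swaps in `f64::total_cmp` from the standard library. `TotalF64` and `sort_descending` are hypothetical names, the wrapper is fixed to f64 rather than generic, and the resulting order is not necessarily identical to the one FloatOrd produces.

```rust
use std::cmp::Ordering;

/// Hypothetical wrapper that gives `f64` a total order, so it can be used
/// with APIs that require `Ord`.
#[derive(Copy, Clone, Debug)]
struct TotalF64(f64);

impl Ord for TotalF64 {
    fn cmp(&self, other: &Self) -> Ordering {
        // `total_cmp` never fails, even when NaN is involved.
        self.0.total_cmp(&other.0)
    }
}

impl PartialOrd for TotalF64 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality is derived from `cmp` so that `Eq` and `Ord` stay consistent.
impl PartialEq for TotalF64 {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for TotalF64 {}

/// Sorts from largest to smallest by wrapping, sorting, and unwrapping.
fn sort_descending(values: &[f64]) -> Vec<f64> {
    let mut wrapped: Vec<TotalF64> = values.iter().copied().map(TotalF64).collect();
    wrapped.sort_unstable_by(|a, b| b.cmp(a));
    wrapped.into_iter().map(|w| w.0).collect()
}
```

One thing to be aware of with this variant: under `total_cmp`, -0.0 sorts before +0.0 and a NaN compares equal to an identical NaN, which differs from the usual float comparison operators.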

Where I landed with the generic trait:

pub type Index = Vec<usize>;

trait SeriesHandler {
    type Output;

    /// The actual data processing work is done here by the implementor.
    fn apply<T: Number>(&self, data: &[T]) -> Vec<Self::Output>;

    /// Integers
    fn handle_int<A>(&self, data: &ChunkedArray<A>) -> Result<(Index, Vec<Self::Output>)>
    where
        A: PolarsNumericType,
        A::Native: Num + Ord,
    {
        let mut data_with_idx: Vec<(usize, _)> = data.into_no_null_iter().enumerate().collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));
        let (idxs, data): (Index, Vec<_>) = data_with_idx.into_iter().unzip();
        let data: Vec<Self::Output> = self.apply(&data);
        Ok((idxs, data))
    }

    /// The floats need to be augmented with Ord implementations
    fn handle_float<A>(&self, data: &ChunkedArray<A>) -> Result<(Index, Vec<Self::Output>)>
    where
        A: PolarsNumericType,
        A::Native: Num,
        WithOrd<A::Native>: Ord,
    {
        let mut data_with_idx: Vec<(usize, WithOrd<_>)> =
            data.into_no_null_iter().map(WithOrd).enumerate().collect();
        data_with_idx.sort_unstable_by(|(_, a), (_, b)| b.cmp(a));

        let (idxs, data): (Index, Vec<_>) = data_with_idx.into_iter().unzip();
        let data: Vec<Self::Output> = self.apply(&data);
        Ok((idxs, data))
    }

    fn handle(&self, series: &Series) -> Result<Vec<Self::Output>> {
        let (idxs, data): (Index, Vec<Self::Output>) = match series.dtype() {
            DataType::Int8 => self.handle_int(series.i8()?),
            DataType::Int16 => self.handle_int(series.i16()?),
            DataType::Int32 => self.handle_int(series.i32()?),
            DataType::Int64 => self.handle_int(series.i64()?),
            DataType::UInt8 => self.handle_int(series.u8()?),
            DataType::UInt16 => self.handle_int(series.u16()?),
            DataType::UInt32 => self.handle_int(series.u32()?),
            DataType::UInt64 => self.handle_int(series.u64()?),
            DataType::Float32 => self.handle_float(series.f32()?),
            DataType::Float64 => self.handle_float(series.f64()?),
            _ => Err(eyre!(
                "The underlying type is not a AnyNumber: {}",
                series.dtype()
            )),
        }?;

        // post-processing to return sorted data
        let mut data_with_idx: Vec<(usize, _)> = std::iter::zip(idxs, data).collect();
        data_with_idx.sort_unstable_by(|(idx_a, _), (idx_b, _)| idx_a.cmp(idx_b));
        let (_, data): (Index, Vec<Self::Output>) = data_with_idx.into_iter().unzip();
        Ok(data)
    }
}

/// Implement the series handler for Decile
struct DecileSeries;

impl SeriesHandler for DecileSeries {
    type Output = Decile;

    fn apply<N: Number>(&self, data: &[N]) -> Vec<Self::Output> {
        decile(data)
    }
}

/// Run the code using DecileSeries
pub fn decile_series(series: &Series, name: &str) -> Result<Series> {
    let handler = DecileSeries;
    let new_series = Series::new(name, handler.handle(series)?);

    Ok(new_series)
}

In the end, to allow generic use of zip and unzip, I had to make the output more concrete, so I used Vec<Self::Output>. A bit of a downer from a learning perspective, but practical: Vec is a reasonable constraint.

For next time, if anyone knows how to use zip and unzip the way I have here while keeping the output type fully generic, I'd love to see it in action!
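One possible direction, offered as a sketch rather than a tested drop-in for the trait above: bound the output collection by IntoIterator and FromIterator, and replace unzip with map plus collect. The `restore_order` function is a hypothetical name, and whether the result can then be passed to Series::new depends on what polars accepts, which I have not checked.

```rust
/// Puts `data` back into its original order.
/// `idxs[k]` is the original position of the k-th element of `data`.
/// Works for any collection that can be consumed and rebuilt from an iterator.
fn restore_order<C, T>(idxs: Vec<usize>, data: C) -> C
where
    C: IntoIterator<Item = T> + std::iter::FromIterator<T>,
{
    // Pair every element with its original index, then sort by that index.
    let mut pairs: Vec<(usize, T)> = idxs.into_iter().zip(data).collect();
    pairs.sort_unstable_by_key(|(idx, _)| *idx);
    // Drop the indices and rebuild the same kind of collection.
    pairs.into_iter().map(|(_, value)| value).collect()
}
```

With this shape the associated type could be the whole collection (for example `type Output: IntoIterator + FromIterator<...>`) instead of the element type. The unzip route would instead need `Default + Extend` on the output, which is part of why Vec felt like the path of least resistance.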
