Is there a better way to classify data with enums?

I have a collection of data and need to perform two operations on it. The first classifies each item individually using an enum T; the second classifies subsets of the items using an enum Q.

Which variant of Q applies to a subset depends on the T variants of its members. The order of the T variants within a subset doesn't matter, only how many of each there are.

Since I have to produce a T for every item anyway, I thought it made sense to use the From trait to generate the Q for each subset from the corresponding Ts. I came up with the following trait implementation, which works in two steps but seems rather naive:

#![allow(unused)]
/// Each data item produces some variant of T 
#[derive(Debug, PartialEq)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

/// Each subset of data produces some variant of Q
#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

// A combination of a fixed number of &T (3) determines the variant of Q
impl From<[&T; 3]> for Q {
    fn from(t: [&T; 3]) -> Q {
        // 1. Fold over the set `t`, counting how often each T variant occurs
        let (cuss_count, curse_count, spit_count, oath_count_and_strs, swear_count) =
            t.iter().enumerate().fold(
                (0, 0, 0, (0, [Option::<&str>::None; 3]), 0),
                |(
                    mut cuss_count,
                    mut curse_count,
                    mut spit_count,
                    mut oath_count_and_strs,
                    mut swear_count,
                ),
                 (index, t)| {
                    match t {
                        T::Cuss => {
                            cuss_count += 1;
                        }
                        T::Curse => {
                            curse_count += 1;
                        }
                        T::Spit => {
                            spit_count += 1;
                        }
                        T::Oath { oath_text } => {
                            oath_count_and_strs.0 += 1;
                            oath_count_and_strs.1[index] = Some(oath_text);
                        }
                        T::Swear => {
                            swear_count += 1;
                        }
                    }
                    (
                        cuss_count,
                        curse_count,
                        spit_count,
                        oath_count_and_strs,
                        swear_count,
                    )
                },
            );
        // 2. Match on the counts; only certain combinations of T variants map to a specific Q
        match spit_count {
            3 => Q::AllSpit,
            spit_count_again => match cuss_count {
                3 | 2 => Q::MostlyCusses,
                _ => {
                    let (oath_count, oath_opt_strs) = oath_count_and_strs;
                    match (oath_count, spit_count_again) {
                        (1, 2) | (2, 1) => Q::SpittingOath {
                            all_oaths_text: String::from_iter(oath_opt_strs.iter().filter_map(
                                |x| {
                                    if let Some(oath_text) = x {
                                        Some(*oath_text)
                                    } else {
                                        None
                                    }
                                },
                            )),
                        },
                        _ => Q::GeneralRudeness,
                    }
                }
            },
        }
    }
}

fn main() {
    assert_eq!(
        Q::from([&T::Cuss, &T::Cuss, &T::Spit]), 
        Q::MostlyCusses
    );

    assert_eq!(
        Q::from([&T::Curse, &T::Swear, &T::Spit]),
        Q::GeneralRudeness
    );

    assert_eq!(
        Q::from([
            &T::Oath {
                oath_text: "@#(".into()
            },
            &T::Oath {
                oath_text: "!&$".into()
            },
            &T::Spit
        ]),
        Q::SpittingOath {
            all_oaths_text: "@#(!&$".into()
        }
    );
}

(Playground)

TL;DR of how it works:
Step 1: Fold over a subset's items, counting the number of each variant.
Step 2: Match on the combination of counts from step 1 to categorize the subset.

Is there a better way to do this with some built-in language feature? Matching the subset itself was not going well because I had to account for so many combinations, even with the use of wildcard "_".

I'd suggest using a for loop and if/else:

(Playground)

Not tested as I typed it on my phone, but hopefully you can see how it's simplified.
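The linked playground code is not reproduced in this thread. As a rough illustration only, and not the code from that link, a for loop with if/else might look like the sketch below. It assumes the T and Q enums from the question; the local variable names are made up for the example.

```rust
#[derive(Debug, PartialEq)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

impl From<[&T; 3]> for Q {
    fn from(items: [&T; 3]) -> Q {
        // Only the counts that influence the result are tracked.
        let (mut cusses, mut spits, mut oaths) = (0, 0, 0);
        // Oath texts are concatenated in input order as they are seen.
        let mut all_oaths_text = String::new();

        for item in items {
            match item {
                T::Cuss => cusses += 1,
                T::Spit => spits += 1,
                T::Oath { oath_text } => {
                    oaths += 1;
                    all_oaths_text.push_str(oath_text);
                }
                // Curses and swears never change the outcome.
                T::Curse | T::Swear => {}
            }
        }

        // Same priority order as the nested match in the question.
        if spits == 3 {
            Q::AllSpit
        } else if cusses >= 2 {
            Q::MostlyCusses
        } else if (oaths == 1 && spits == 2) || (oaths == 2 && spits == 1) {
            Q::SpittingOath { all_oaths_text }
        } else {
            Q::GeneralRudeness
        }
    }
}
```

Compared with the fold, the mutable locals avoid threading a five-element tuple through the closure, and building the string inside the loop removes the need for the array of optional string slices.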

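You can sort your inputs first and then match on the sorted array, so each combination only has to be written in one order. Note that this will need a little more work if you want to avoid reordering your oaths. With a derived Ord, the sort order follows the declaration order of your enum variants, and oaths are sorted lexicographically by their text.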

The unsorted version is dense but doesn't seem like the end of the world... if the sample is somewhat representative.
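A minimal sketch of the sorting idea, assuming the enums from the question with Eq, PartialOrd and Ord added to T's derive list (that addition is an assumption of this example, not part of the original code). After sorting, each interesting combination appears in exactly one order, so a handful of array patterns cover them.

```rust
// Derived Ord orders variants by declaration: Cuss < Curse < Spit < Oath < Swear.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

impl From<[&T; 3]> for Q {
    fn from(mut items: [&T; 3]) -> Q {
        // Sorting the references compares the pointed-to values.
        items.sort();
        match items {
            [T::Spit, T::Spit, T::Spit] => Q::AllSpit,
            // Cuss sorts first, so two or three cusses always lead the array.
            [T::Cuss, T::Cuss, _] => Q::MostlyCusses,
            [T::Spit, T::Spit, T::Oath { oath_text }] => Q::SpittingOath {
                all_oaths_text: oath_text.clone(),
            },
            // Two oaths arrive here in lexicographic order, not input order.
            [T::Spit, T::Oath { oath_text: a }, T::Oath { oath_text: b }] => Q::SpittingOath {
                all_oaths_text: format!("{}{}", a, b),
            },
            _ => Q::GeneralRudeness,
        }
    }
}
```

With the question's third test case, this version yields "!&$@#(" rather than "@#(!&$", because "!&$" sorts before "@#(". Preserving the input order of the oaths would need extra bookkeeping, such as sorting with a key that ignores the text.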

Here is another fun one based on itertools::Itertools::counts and a wrapper type implementing Eq and Hash to ignore the oath text: Playground

I'm not sure it makes things clearer for this number of variants, but it could make sense if their number increases.
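The playground code is not reproduced here, so the following is only a sketch of the same idea. To keep it free of external crates, a plain HashMap stands in for itertools' counts, and std::mem::discriminant is used to compare variants. The wrapper Kind and the helper count_of are hypothetical names for this example.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::mem::discriminant;

#[derive(Debug, PartialEq)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

/// Hypothetical wrapper: two `Kind`s are equal when they hold the same
/// variant of `T`, regardless of any oath text.
struct Kind<'a>(&'a T);

impl PartialEq for Kind<'_> {
    fn eq(&self, other: &Self) -> bool {
        discriminant(self.0) == discriminant(other.0)
    }
}

impl Eq for Kind<'_> {}

impl Hash for Kind<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash only the variant so it stays consistent with `eq`.
        discriminant(self.0).hash(state);
    }
}

/// Number of counted items that share the variant of `probe`.
fn count_of(counts: &HashMap<Kind<'_>, usize>, probe: &T) -> usize {
    counts.get(&Kind(probe)).copied().unwrap_or(0)
}

impl From<[&T; 3]> for Q {
    fn from(items: [&T; 3]) -> Q {
        // Hand-rolled stand-in for `Itertools::counts`.
        let mut counts: HashMap<Kind<'_>, usize> = HashMap::new();
        for item in items {
            *counts.entry(Kind(item)).or_insert(0) += 1;
        }

        // Any oath works as a lookup key because the text is ignored.
        let any_oath = T::Oath { oath_text: String::new() };
        let cusses = count_of(&counts, &T::Cuss);
        let spits = count_of(&counts, &T::Spit);
        let oaths = count_of(&counts, &any_oath);

        match (cusses, spits, oaths) {
            (_, 3, _) => Q::AllSpit,
            (2 | 3, _, _) => Q::MostlyCusses,
            (_, 2, 1) | (_, 1, 2) => Q::SpittingOath {
                // Concatenate oath texts in their original input order.
                all_oaths_text: items
                    .iter()
                    .filter_map(|t| match t {
                        T::Oath { oath_text } => Some(oath_text.as_str()),
                        _ => None,
                    })
                    .collect(),
            },
            _ => Q::GeneralRudeness,
        }
    }
}
```

The counting step no longer needs a match arm per variant, which is where this approach would start to pay off as T grows; the cost is the extra wrapper type and its trait implementations.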
