Is there a better way to classify data with enums?

plasticartsshow · March 18, 2022, 12:44am

I have a collection of data on which I need to perform two operations. One classifies each item individually using an enum T, and the other classifies each subset of items using an enum Q.

The variant of Q chosen for a subset depends on the T variants of its members. The order of the T variants within a subset doesn't matter, only how many of each there are.

Given the above and the necessity of producing Ts for all items, I thought it good to use the From trait to generate Q for each subset from the corresponding Ts. I came up with the following trait implementation, which works in two steps but seems rather naive:

#![allow(unused)]
/// Each data item produces some variant of T 
#[derive(Debug, PartialEq)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

/// Each subset of data produces some variant of Q
#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

// A combination of a fixed number of &T (3) determines the variant of Q
impl From<[&T; 3]> for Q {
    fn from(t: [&T; 3]) -> Q {
        //1. I fold to quantify the different T variants in the set "t"
        let (cuss_count, curse_count, spit_count, oath_count_and_strs, swear_count) =
            t.iter().enumerate().fold(
                (0, 0, 0, (0, [Option::<&str>::None; 3]), 0),
                |(
                    mut cuss_count,
                    mut curse_count,
                    mut spit_count,
                    mut oath_count_and_strs,
                    mut swear_count,
                ),
                 (index, t)| {
                    match t {
                        T::Cuss => {
                            cuss_count += 1;
                        }
                        T::Curse => {
                            curse_count += 1;
                        }
                        T::Spit => {
                            spit_count += 1;
                        }
                        T::Oath { oath_text } => {
                            oath_count_and_strs.0 += 1;
                            oath_count_and_strs.1[index] = Some(oath_text);
                        }
                        T::Swear => {
                            swear_count += 1;
                        }
                    }
                    (
                        cuss_count,
                        curse_count,
                        spit_count,
                        oath_count_and_strs,
                        swear_count,
                    )
                },
            );
        //2. I only care to recognize certain combinations of T variants to produce Q via match
        match spit_count {
            3 => Q::AllSpit,
            spit_count_again => match cuss_count {
                3 | 2 => Q::MostlyCusses,
                _ => {
                    let (oath_count, oath_opt_strs) = oath_count_and_strs;
                    match (oath_count, spit_count_again) {
                        (1, 2) | (2, 1) => Q::SpittingOath {
                            // Keep only the oath texts that were recorded and concatenate them
                            all_oaths_text: String::from_iter(
                                oath_opt_strs.iter().flatten().copied(),
                            ),
                        },
                        _ => Q::GeneralRudeness,
                    }
                }
            },
        }
    }
}

fn main() {
    assert_eq!(
        Q::from([&T::Cuss, &T::Cuss, &T::Spit]), 
        Q::MostlyCusses
    );

    assert_eq!(
        Q::from([&T::Curse, &T::Swear, &T::Spit]),
        Q::GeneralRudeness
    );

    assert_eq!(
        Q::from([
            &T::Oath {
                oath_text: "@#(".into()
            },
            &T::Oath {
                oath_text: "!&$".into()
            },
            &T::Spit
        ]),
        Q::SpittingOath {
            all_oaths_text: "@#(!&$".into()
        }
    );
}

(Playground)

TLDR of how it works:
Step 1: Fold over a subset's items, counting the number of each variant
Step 2: Match the combinations of counts from step 1 to categorize the subset.

Is there a better way to do this with some built-in language feature? Matching the subset itself was not going well because I had to account for so many combinations, even with the use of wildcard "_".

droundy · March 18, 2022, 2:36am

I'd suggest using a for loop and if/else:

(Playground)

Not tested as I typed it on my phone, but hopefully you can see how it's simplified.
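
The linked playground is not reproduced in this thread. As a rough sketch only, and not necessarily what droundy wrote, a for loop with if/else could look something like the following. It assumes the T and Q definitions from the original post; the local names (cusses, spits, oaths) are made up for illustration.

```rust
#[derive(Debug, PartialEq)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

impl From<[&T; 3]> for Q {
    fn from(items: [&T; 3]) -> Q {
        // Hypothetical counters: only the variants that influence Q are tracked.
        let mut cusses = 0;
        let mut spits = 0;
        let mut oaths = 0;
        let mut all_oaths_text = String::new();

        for item in items {
            match item {
                T::Cuss => cusses += 1,
                T::Spit => spits += 1,
                T::Oath { oath_text } => {
                    oaths += 1;
                    // Oath texts are concatenated in input order.
                    all_oaths_text.push_str(oath_text);
                }
                // Curses and swears never change the outcome on their own.
                T::Curse | T::Swear => {}
            }
        }

        // Same priority order as the nested match in the original post.
        if spits == 3 {
            Q::AllSpit
        } else if cusses >= 2 {
            Q::MostlyCusses
        } else if matches!((oaths, spits), (1, 2) | (2, 1)) {
            Q::SpittingOath { all_oaths_text }
        } else {
            Q::GeneralRudeness
        }
    }
}
```

Called the same way as in the original main, for example Q::from([&T::Cuss, &T::Cuss, &T::Spit]), this should give the same results as the fold version, because the if/else chain checks the counts in the same order.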

quinedot · March 18, 2022, 2:48am

You can sort your inputs. Note that this will need a little more work if you want to avoid reordering your oaths. With a derived Ord, the sort order follows the order of the variants in your enum declaration, and oaths are sorted lexicographically by their text.

The unsorted version is dense but doesn't seem like the end of the world... if the sample is somewhat representative.
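
quinedot's code is not included in this thread. A minimal sketch of the sorting idea, assuming Eq, PartialOrd and Ord are derived on T (an addition to the original definitions), might look like this. Once the array is sorted, each interesting combination has exactly one layout, so a handful of array patterns is enough.

```rust
// Ord is derived so that variants sort in declaration order:
// Cuss < Curse < Spit < Oath < Swear.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

impl From<[&T; 3]> for Q {
    fn from(mut items: [&T; 3]) -> Q {
        // Sorting the references puts equal variants next to each other.
        items.sort();
        match items {
            [T::Spit, T::Spit, T::Spit] => Q::AllSpit,
            // Cuss sorts first, so two or three cusses always lead the array.
            [T::Cuss, T::Cuss, _] => Q::MostlyCusses,
            // Spit sorts before Oath, so spits come first here.
            [T::Spit, T::Spit, T::Oath { oath_text }] => Q::SpittingOath {
                all_oaths_text: oath_text.clone(),
            },
            [T::Spit, T::Oath { oath_text: a }, T::Oath { oath_text: b }] => Q::SpittingOath {
                // Caveat: the two oaths are now in sorted order, not input order.
                all_oaths_text: format!("{}{}", a, b),
            },
            _ => Q::GeneralRudeness,
        }
    }
}
```

This sketch shows the reordering caveat directly: for the oaths "@#(" and "!&$" from the original post, it produces "!&$@#(" rather than "@#(!&$", because '!' sorts before '@'. Keeping the input order would need extra bookkeeping, such as sorting by variant only.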

RobinH · March 18, 2022, 9:30am

Here is another fun one based on itertools::Itertools::counts and a wrapper type implementing Eq and Hash to ignore the oath text: Playground

I'm not sure it makes things clearer for this number of variants, but it could make sense if the number of variants grows.
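
RobinH's playground is not reproduced here either. The sketch below illustrates the same counting idea using only the standard library: a plain HashMap stands in for itertools::Itertools::counts, and a hypothetical field-less Kind enum (with a hypothetical kind() helper) stands in for the wrapper type that ignores the oath text. It is not RobinH's actual code.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum T {
    Cuss,
    Curse,
    Spit,
    Oath { oath_text: String },
    Swear,
}

#[derive(Debug, PartialEq)]
enum Q {
    MostlyCusses,
    AllSpit,
    SpittingOath { all_oaths_text: String },
    GeneralRudeness,
}

// Hypothetical key type: the variant of T with its payload stripped,
// so that all oaths hash and compare as equal.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Kind {
    Cuss,
    Curse,
    Spit,
    Oath,
    Swear,
}

impl T {
    // Hypothetical helper mapping a T to its payload-free Kind.
    fn kind(&self) -> Kind {
        match self {
            T::Cuss => Kind::Cuss,
            T::Curse => Kind::Curse,
            T::Spit => Kind::Spit,
            T::Oath { .. } => Kind::Oath,
            T::Swear => Kind::Swear,
        }
    }
}

impl From<[&T; 3]> for Q {
    fn from(items: [&T; 3]) -> Q {
        // Count how many items of each kind the subset contains.
        let mut counts: HashMap<Kind, usize> = HashMap::new();
        for item in items.iter() {
            *counts.entry(item.kind()).or_insert(0) += 1;
        }
        // Kinds that are absent from the map count as zero.
        let count = |kind: Kind| counts.get(&kind).copied().unwrap_or(0);

        match (count(Kind::Spit), count(Kind::Cuss), count(Kind::Oath)) {
            (3, _, _) => Q::AllSpit,
            (_, 2 | 3, _) => Q::MostlyCusses,
            (2, _, 1) | (1, _, 2) => Q::SpittingOath {
                // Concatenate the oath texts in input order.
                all_oaths_text: items
                    .iter()
                    .filter_map(|t| match t {
                        T::Oath { oath_text } => Some(oath_text.as_str()),
                        _ => None,
                    })
                    .collect(),
            },
            _ => Q::GeneralRudeness,
        }
    }
}
```

For three items the map is arguably overkill, but the counting step stays the same however many variants T gains; only the final match needs new arms.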

