Alternative to extending enum (to unnest enums)

Extending enums is not possible, but within my own crate I'd like to support behavior that I can only think of implementing by

  • extending enums (not possible, but maybe there's an alternative, since these are all types in my crate),
  • much unwrapping in match statements
  • or repeatedly implementing functions down the "tree" of enums.

I'm working on a parser for specifying a set by an operator (typically a comparison, but I've got others) and a typed element, e.g.

let s = ">0".to_string();
let positives = Set::<OrdOperator<u64>, u64>::from_str(&s).unwrap();
// design not constrained to need ^this type, but as possible illustration.

assert_eq!(positives, Set::new(OrdOperator::Range(RangeOperator::Greater), 0u64));
assert!(positives.contains(1u64));

But I'd also like to support other operators on other types, e.g.

let s = "[[hello".to_string();
let greetings = Set::<ExtendedOrdOperator<Sentences>, Sentences>::from_str(&s).unwrap();

assert_eq!(greetings, Set::new(ExtendedOrdOperator::String(StringOperator::StartsWith), "hello"));
assert!(greetings.contains("hello world"));

Below is the mess that defines all the operator kinds I use as enums, followed by the enums for the simple operators that aren't nested.

// operators that are unions of simple operators
pub enum ExtendedOrdOperator {
    Ordering(OrdOperator),
    String(StringOperator),
}
pub enum OrdOperator {
    /// Specifies a range
    Range(RangeOperator),
    /// Specifies an exact
    Exact(EqualityOperator),
}

// simple operator
pub enum RangeOperator {
    Greater,
    GreaterEquals,
    Less,
    LessEquals,
}
pub enum StringOperator {
    StartsWith,
    NotStartsWith,
    Contains,
    NotContains,
}
pub enum EqualityOperator {
    Equals,
    NotEquals,
}

From this, I feel like my options to implement similar behavior are to have

  1. trees of match statements each time
  2. implement a method with a similar signature for each of these enums, for parsing, membership, and whatever else I may need them all to do. This would be at the union-of-operators level as well as the simple-operator level
  3. unit enums for all the operators that are leaves, and somehow be smart about std::ops::Not for the parsed representation of operators that aren't explicitly specified.

I looked at this post, since maybe that macro is a good place to start for one that'll generate the enums. After writing all this, option 3 above seems the most promising, but I'd like some input from others here on how I'm thinking about this. I don't have a background in programming with interfaces, and I think my excitement to use them sometimes makes things more complex than needed.

Since the set of operators is open (you want to support arbitrary operators) and they don't have much to do with each other, you should probably not use enums at all. Use separate unit structs to represent each operator separately.

That makes some sense to me, but I'll be implementing PartialEq and PartialOrd and I'll want the usual relationships between those operators. Would that motivate an enum? Or are these operators more disparate than I think they are?

I've now got,

enum Less {
  /// < operator
  Affirm,
  /// >= operator, as that's just sugar
  Negate,
}
// similarly for the other two Ord operators and for the two string operators

I don't plan on implementing operators complex enough to have an algebra (I think I'm using that term right, but I'm certainly not enough of a mathematician to be confident) that isn't simply 2 mutually exclusive options. (Seems like that's closely related to my idea of a set: it's in it, or it's not :person_shrugging: )

I think your answer along with writing this focused my thoughts pretty well. I definitely won't be needing to rely on the tree of enums anymore.

Oh wait, this is silly, we already have bool. I'll use a struct of a bool

Sometimes enums with two variants make more sense than a boolean.

You could certainly make an enum of such related families of operators, but I think Less::Affirm and Less::Negate are both less clear than Less and GreaterOrEq would be.
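For what it's worth, here is a minimal sketch of what "an enum of a related family" could look like with plainly named variants. The variant names come from the question; the Not impl and the holds helper are my own illustrative assumptions, not something from the thread. Note that the complement relation only holds for comparable values: for something like a float NaN, both an operator and its negation report false.

```rust
use std::ops::Not;

/// The four range operators as one flat enum (variant names from the question).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RangeOperator {
    Greater,
    GreaterEquals,
    Less,
    LessEquals,
}

impl Not for RangeOperator {
    type Output = Self;

    /// Maps each operator to its complement, so `!Less` is `GreaterEquals`.
    fn not(self) -> Self {
        match self {
            RangeOperator::Greater => RangeOperator::LessEquals,
            RangeOperator::GreaterEquals => RangeOperator::Less,
            RangeOperator::Less => RangeOperator::GreaterEquals,
            RangeOperator::LessEquals => RangeOperator::Greater,
        }
    }
}

impl RangeOperator {
    /// Hypothetical helper: does `candidate <op> bound` hold?
    /// Incomparable values (partial_cmp returns None) are never members.
    pub fn holds<T: PartialOrd>(self, candidate: &T, bound: &T) -> bool {
        match (self, candidate.partial_cmp(bound)) {
            (_, None) => false,
            (RangeOperator::Greater, Some(o)) => o.is_gt(),
            (RangeOperator::GreaterEquals, Some(o)) => o.is_ge(),
            (RangeOperator::Less, Some(o)) => o.is_lt(),
            (RangeOperator::LessEquals, Some(o)) => o.is_le(),
        }
    }
}
```

With this shape, "Less" and "GreaterEquals" stay readable at the call site, and the relationship between them lives in one place.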

What do these Sets and their Operators do roughly?

It could be that a flat set (no pun intended) of top level operators is all that's needed.

pub struct Greater;
pub struct GreaterEquals;
pub struct Less;
pub struct LessEquals;

pub struct StartsWith;
pub struct NotStartsWith;
pub struct Contains;
pub struct NotContains;

pub struct Equals;
pub struct NotEquals;

The Set would have to be generic.

pub struct Set<O, T> {
    operator: O,
    value: T,
}

Whatever these Sets and Operators do for certain types could be defined via traits, which avoids the need to match everything all the time.

It's hard to define the traits without knowing what the operators do of course.

Oh and you don't have to go completely flat! You might decide that some of these operators make sense grouped as an enum.

The generic approach means that you can mix and match enum and struct Operators.

A downside is that creating a Vec of Sets with different Operators wouldn't really work, the compiler will force the sets to all have the same generic parameters. Trait objects can overcome this, but this might be a good reason to stick to nested enums.
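To make the trait-object workaround concrete, here is a rough sketch. The Membership trait and its contains method are hypothetical names I made up for illustration; only the Set struct and the unit structs come from the post above.

```rust
/// Hypothetical membership trait, used only to erase the operator type.
pub trait Membership<T> {
    fn contains(&self, candidate: &T) -> bool;
}

pub struct Greater;
pub struct LessEquals;

pub struct Set<O, T> {
    pub operator: O,
    pub value: T,
}

impl Membership<u64> for Set<Greater, u64> {
    fn contains(&self, candidate: &u64) -> bool {
        *candidate > self.value
    }
}

impl Membership<u64> for Set<LessEquals, u64> {
    fn contains(&self, candidate: &u64) -> bool {
        *candidate <= self.value
    }
}

/// Builds one Vec whose entries have different operator types.
/// Vec<Set<O, u64>> could not do this, because O would have to be one type.
pub fn mixed_sets() -> Vec<Box<dyn Membership<u64>>> {
    let mut sets: Vec<Box<dyn Membership<u64>>> = Vec::new();
    sets.push(Box::new(Set { operator: Greater, value: 10 }));
    sets.push(Box::new(Set { operator: LessEquals, value: 100 }));
    sets
}

/// True when the candidate is a member of every set in the slice.
pub fn in_all(sets: &[Box<dyn Membership<u64>>], candidate: u64) -> bool {
    sets.iter().all(|set| set.contains(&candidate))
}
```

The cost is a heap allocation and a virtual call per set, which is the trade-off against nested enums mentioned above.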

It is pretty generic, I've got this,

pub trait Set<Element> {
    fn is_member(&self, elem: &Element) -> bool;
}

pub struct OperatorConstraint<Element, Op> {
    op: Op,
    elem: Element,
}

impl<Element, Op> Set<Element> for OperatorConstraint<Element, Op>
where
    Op: Operator<Element>,
{
    fn is_member(&self, elem: &Element) -> bool {
        self.op.compares(&self.elem, elem)
    }
}

So that a set is defined by a membership operation, and I can also say that a pairing of comparator and element can behave like the set they specify.
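The snippet above doesn't show the Operator trait itself, so here is one way the pieces might fit together. The compares signature, the new constructor, and the two operator impls are assumptions on my part; the Set trait and OperatorConstraint are as posted.

```rust
pub trait Set<Element> {
    fn is_member(&self, elem: &Element) -> bool;
}

/// Assumed shape of the Operator trait (not shown in the post).
/// `bound` is the element stored in the constraint, `candidate` the one tested.
pub trait Operator<Element> {
    fn compares(&self, bound: &Element, candidate: &Element) -> bool;
}

pub struct Greater;
pub struct Equals;

impl<E: PartialOrd> Operator<E> for Greater {
    fn compares(&self, bound: &E, candidate: &E) -> bool {
        candidate > bound
    }
}

impl<E: PartialEq> Operator<E> for Equals {
    fn compares(&self, bound: &E, candidate: &E) -> bool {
        candidate == bound
    }
}

pub struct OperatorConstraint<Element, Op> {
    op: Op,
    elem: Element,
}

impl<Element, Op> OperatorConstraint<Element, Op> {
    /// Hypothetical constructor, added so the example can build values.
    pub fn new(op: Op, elem: Element) -> Self {
        Self { op, elem }
    }
}

impl<Element, Op> Set<Element> for OperatorConstraint<Element, Op>
where
    Op: Operator<Element>,
{
    fn is_member(&self, elem: &Element) -> bool {
        self.op.compares(&self.elem, elem)
    }
}
```

Because Operator is generic over the element, adding a new operator is just a new unit struct plus an impl, with no match to update.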

I think I'll go with @H2CO3 initial suggestion of unit structs for all of them.

I don't need an explicit implementation of not, but there's just something so odd about not being able to declare the 6 PartialOrd comparators with all of their relations.

(I have complement, but no union or intersect. Maybe that's my issue? But I feel like I'm already getting far from the problem domain.)

I also notice that partial_cmp(...).is_ok() seems another perfectly suitable way to describe is_member. Maybe I've gone too far.

You might want to look at the std library type for comparisons: Ordering in std::cmp - Rust

It uses only 3 variants and has methods to determine things like <= (Ordering::Less | Ordering::Equal). So don't use it as your operator type, but it's useful to be aware of.

partial_cmp returns Option<Ordering>: PartialOrd in std::cmp - Rust
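As a small illustration of both points: Ordering has helper methods such as is_le, and because partial_cmp returns an Option, the check for "these two values can be compared at all" is is_some rather than is_ok. The two function names here are hypothetical.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: `a <= b` expressed through partial_cmp.
/// None means the values are incomparable (for example a float NaN).
pub fn less_or_equal<T: PartialOrd>(a: &T, b: &T) -> bool {
    a.partial_cmp(b).map_or(false, Ordering::is_le)
}

/// Hypothetical helper: true when the two values have a defined ordering.
pub fn is_comparable<T: PartialOrd>(a: &T, b: &T) -> bool {
    a.partial_cmp(b).is_some()
}
```

Note that is_comparable answers a different question than set membership: it says nothing about which side of the bound the value falls on.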

I think a trait for Operators will work very well! (Edit: I first suggested an associated type for the element, but that was wrong; the trait must be generic to allow multiple implementations.) That way, Contains and StartsWith can implement it for String (and not, say, u64), while GreaterEquals can implement the same trait for u64, and perhaps String too (alphabetical ordering?). Full flexibility! I'll be back with a playground.

Blimey that (playground) was harder than I thought it would be, had to dig into the docs for Borrow so I could make Element=String work with &str for the comparisons. This was a really interesting problem!

I dislike how many repeated trait bounds are used, but sometimes that's the way it goes. Perhaps some more experienced Rustaceans will be able to improve it.

Borrow trait discussion

I wanted to reach for a Borrow-like trait that used an associated type, so that a type could have a single "canonical" borrowed type. With such a trait I could ditch the BorrowedElem generic parameter. It's not so bad once you get used to it, though, and it does allow for more flexibility whilst mostly being hidden from the user!

In my mind, "CanonBorrow" (or BorrowDefault?) would be implemented for (among others) u64 -> u64 and String -> str.
Sort of answering the question "what type do I get if I stick & in front of this type?".
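A rough sketch of what such a trait might look like. To be clear, CanonBorrow does not exist in std; the trait, its method name, and the two helper functions are all hypothetical.

```rust
/// Hypothetical trait: each type names exactly one "canonical" borrowed form.
pub trait CanonBorrow {
    type Borrowed: ?Sized;
    fn canon_borrow(&self) -> &Self::Borrowed;
}

impl CanonBorrow for u64 {
    type Borrowed = u64;
    fn canon_borrow(&self) -> &u64 {
        self
    }
}

impl CanonBorrow for String {
    type Borrowed = str;
    fn canon_borrow(&self) -> &str {
        self.as_str()
    }
}

/// Works for any element whose canonical borrow is str; no extra generic needed.
pub fn starts_with<T: CanonBorrow<Borrowed = str>>(prefix: &T, candidate: &str) -> bool {
    candidate.starts_with(prefix.canon_borrow())
}

/// Works for any element whose canonical borrow can be ordered.
pub fn greater<T>(bound: &T, candidate: &T::Borrowed) -> bool
where
    T: CanonBorrow,
    T::Borrowed: PartialOrd,
{
    candidate > bound.canon_borrow()
}
```

The associated type is what removes the second generic parameter, at the price of allowing only one borrowed form per type, which std's Borrow deliberately does not restrict.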

See if you can make StartsWith work for a set of PathBufs :slight_smile: and enjoy!

Edit: Whelp, I've had a fun morning :slight_smile:

FWIW, the advantage of such a trait is that it can be used as a dyn Trait, in case you need to type-unify these operators for some reason.

If all of your operators are unit structs (which seems like the proper transposition from enum variants), then you'll be able to get &'static references to them (e.g., &Greater: &'static (impl Set<…>)).

Which means that using dyn won't imply needing Boxing or losing the Copy properties of your enum variants: &'static dyn Set<…> : 'static + Copy thereby making them very convenient!
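Here is a small sketch of that idea. I'm using an Operator trait with an assumed compares signature (the thread never shows its definition), and parse_operator is a hypothetical stand-in for the real parser.

```rust
/// Assumed operator trait; `bound` is the stored element, `candidate` the tested one.
pub trait Operator<E> {
    fn compares(&self, bound: &E, candidate: &E) -> bool;
}

pub struct Greater;
pub struct Less;

impl Operator<u64> for Greater {
    fn compares(&self, bound: &u64, candidate: &u64) -> bool {
        candidate > bound
    }
}

impl Operator<u64> for Less {
    fn compares(&self, bound: &u64, candidate: &u64) -> bool {
        candidate < bound
    }
}

/// Hypothetical parser stub: returns a type-erased operator without any Box.
/// `&Greater` and `&Less` are promoted to 'static because unit structs are constants.
pub fn parse_operator(token: &str) -> Option<&'static dyn Operator<u64>> {
    let op: &'static dyn Operator<u64> = match token {
        ">" => &Greater,
        "<" => &Less,
        _ => return None,
    };
    Some(op)
}
```

Since shared references are Copy, the returned value can be passed around and duplicated freely, much like a fieldless enum variant.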

Moreover, by virtue of being 'static, you'd also be able to compare for equality between them:

  1. slap a : Any super-trait on Set<…>,
  2. impl<E>        Eq for dyn Set<E> {}
    impl<E> PartialEq for dyn Set<E> {
        fn eq(self: &Self, other: &Self) -> bool {
            self.type_id() == other.type_id()
        }
    }
    
    impl<E> dyn Set<E> {
        fn is<T : Set<E>>(&self) -> bool {
            self.type_id() == TypeId::of::<T>()
        }
    }
    

That way you'd be able to write stuff such as if operator.is::<StartsWith>() { which ought to operate quite a bit, at a high-level, like if operator == Operator::StartsWith {
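Putting the two steps together, a self-contained sketch could look like this. Positive and Even are hypothetical unit structs chosen so the example compiles on its own, I've added E: 'static bounds to be on the safe side, and describe is only there to show the catch-all branch.

```rust
use std::any::{Any, TypeId};

/// Step 1: Any as a super-trait, so every dyn Set<E> can report its concrete type.
pub trait Set<E>: Any {
    fn is_member(&self, elem: &E) -> bool;
}

pub struct Positive; // hypothetical operator
pub struct Even; // hypothetical operator

impl Set<i64> for Positive {
    fn is_member(&self, elem: &i64) -> bool {
        *elem > 0
    }
}

impl Set<i64> for Even {
    fn is_member(&self, elem: &i64) -> bool {
        elem % 2 == 0
    }
}

/// Step 2: two trait objects are equal when they wrap the same concrete type.
impl<E: 'static> PartialEq for dyn Set<E> {
    fn eq(&self, other: &Self) -> bool {
        self.type_id() == other.type_id()
    }
}

impl<E: 'static> Eq for dyn Set<E> {}

impl<E: 'static> dyn Set<E> {
    /// True when the erased value is exactly a `T`.
    pub fn is<T: Set<E>>(&self) -> bool {
        self.type_id() == TypeId::of::<T>()
    }
}

/// Unlike a match on an enum, this chain needs a catch-all branch.
pub fn describe(op: &'static dyn Set<i64>) -> &'static str {
    if op.is::<Positive>() {
        "positive"
    } else if op.is::<Even>() {
        "even"
    } else {
        "unknown"
    }
}
```

Comparing by TypeId only makes sense here because the operators are unit structs; types carrying data would compare equal regardless of their contents.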

Obviously the drawback of dyn … vs. an enum is precisely this extensibility: you will never be able to know you're being exhaustive in your matches, so you'll keep needing a default/catch-all case.

Ah, thanks for this alternative as well. I also have a question about something you said here,

I'm still getting into Rust, so I'm not sure what the type for &Greater is, as Greater could not implement Set as it is now, I'm wondering what the &(impl Set<...>) refers to., what the (impl Set <...>) means in &Greater: &'static (impl Set<…>).


Your comparison of using dyn and enum at the end helped me realize that not using match could be better as I'm just going to rely on impl Set after parsing.

I can't parse the question from that sentence, but the type of the expression &Greater is &Greater. Unit structs' constructor happens to be spelled identically to their plain type name.

I'm having trouble seeing why any aspect of Borrow would need to be made explicit here. Am I missing something, or did you want to make something explicit? (I did not know that Borrow was a trait; I just assumed that if something was Sized it could be borrowed, and I'm not claiming the converse either.) I see the requirement to be able to borrow, but not how it couples to the larger problem.

Does it have to do with the remark you made here?

Maybe I just need to understand what motivated you to head down that rabbit hole.

edit for clarity,

impl Trait is either:

  1. an unnameable generic type parameter that is constrained to implement Trait, when used in argument position (APIT);
  2. a placeholder for an opaque type that's guaranteed to implement Trait but otherwise unknown to the caller; i.e., a so-called "existential type" (i.e., NOT a generic type parameter), when in function return position (RPIT);
  3. or colloquial pseudo-syntax for any type/set of types that implement(s) Trait, when written in prose.

The context of the answer above probably implies #3.
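For the two forms that are actual syntax (#1 and #2), a tiny sketch may help. The Positive struct and both function names are hypothetical; Set is the trait from earlier in the thread.

```rust
pub trait Set<E> {
    fn is_member(&self, elem: &E) -> bool;
}

pub struct Positive; // hypothetical unit struct

impl Set<i64> for Positive {
    fn is_member(&self, elem: &i64) -> bool {
        *elem > 0
    }
}

/// #1, argument position (APIT): the caller chooses the concrete type.
/// This is roughly `fn count_members<S: Set<i64>>(set: &S, ...)`.
pub fn count_members(set: &impl Set<i64>, candidates: &[i64]) -> usize {
    candidates.iter().filter(|c| set.is_member(*c)).count()
}

/// #2, return position (RPIT): the function chooses the concrete type,
/// and the caller only learns that it implements Set<i64>.
pub fn positives() -> impl Set<i64> {
    Positive
}
```

In both cases the code is still statically dispatched; neither form creates a trait object the way dyn Set<i64> does.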

It's interesting, the use of Borrow is needed to enable Operators to use String borrowed as str.

You could avoid the Borrow trait by simply taking a reference &String but that's not as convenient an API for a user. Consider this example:

With Borrow

#[test]
fn starts_with_works() {
    let set = OperatorConstraint::new(StartsWith, String::from("_"));
    assert!(!set.is_member("snake_case"));
    assert!(set.is_member("_var"));
    assert!(!set.is_member("camelCase"));
}

Without Borrow (playground),

#[test]
fn starts_with_works() {
    let set = OperatorConstraint::new(StartsWith, String::from("_"));
    assert!(!set.is_member(&String::from("snake_case")));
    assert!(set.is_member(&"_var".to_owned()));
    assert!(!set.is_member(&"camelCase".to_string()));
}

This is because referencing (&String) and borrowing (&str) are subtly different for certain types such as String.

You can think of the Borrow trait as letting you reference String the "smart way", as &str. Doing so is specific to the String type, so we need its trait impl so that String can "tell us" how it should be borrowed. Or at least, that's my understanding at the moment :slight_smile:
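For readers without the playground open, here is a minimal sketch of how a Borrow bound can make the "With Borrow" test compile. This is my reconstruction under assumptions, not the playground's actual code: the Operator signature and the generic parameter name B are made up for illustration.

```rust
use std::borrow::Borrow;

/// Assumed operator trait, generic over the borrowed form of the element.
pub trait Operator<B: ?Sized> {
    fn compares(&self, bound: &B, candidate: &B) -> bool;
}

pub struct StartsWith;
pub struct Greater;

impl Operator<str> for StartsWith {
    fn compares(&self, bound: &str, candidate: &str) -> bool {
        candidate.starts_with(bound)
    }
}

impl<B: PartialOrd + ?Sized> Operator<B> for Greater {
    fn compares(&self, bound: &B, candidate: &B) -> bool {
        candidate > bound
    }
}

pub struct OperatorConstraint<Op, Element> {
    op: Op,
    elem: Element,
}

impl<Op, Element> OperatorConstraint<Op, Element> {
    pub fn new(op: Op, elem: Element) -> Self {
        Self { op, elem }
    }

    /// Accepts any borrowed form B of Element, so a String element
    /// can be tested against a plain &str.
    pub fn is_member<B>(&self, candidate: &B) -> bool
    where
        B: ?Sized,
        Element: Borrow<B>,
        Op: Operator<B>,
    {
        // Fully qualified so the Borrow<B> impl is picked unambiguously.
        let bound = <Element as Borrow<B>>::borrow(&self.elem);
        self.op.compares(bound, candidate)
    }
}
```

The same method also covers types like u64, because every type implements Borrow of itself.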

The difference between &String and &str is the same as this diagram for &Vec<_> versus &[_].

Don't get too thrown off by the name of the Borrow trait. It's mainly there to guarantee[1] that two related types hash and compare the same for the sake of data structures like HashSet and BTreeSet. That way you can query a HashSet<String> with a &str for example. It replaced a pre-1.0 trait called Equiv.

It's not implicitly used when you take a reference or anything.

(Deref coercion sometimes comes into place implicitly, but that's something else.)
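The HashSet case mentioned above looks like this in practice. The greetings and is_greeting functions are just illustrative names.

```rust
use std::collections::HashSet;

/// Builds a set that owns its strings.
pub fn greetings() -> HashSet<String> {
    let mut set = HashSet::new();
    set.insert(String::from("hello"));
    set.insert(String::from("hi"));
    set
}

/// Queries the HashSet<String> with a &str, with no String allocated for the lookup.
/// This compiles because String implements Borrow<str>, and contains accepts
/// any &Q where the stored type implements Borrow<Q>.
pub fn is_greeting(word: &str) -> bool {
    greetings().contains(word)
}
```

The lookup is only correct because String and str hash and compare identically, which is exactly the contract Borrow asks implementors to uphold.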


  1. modulo a logical implementation -- and this trait is not an unsafe trait, so you can't rely on these guarantees for soundness ↩︎

Fascinating!

I found a quote from your link to the pre-1.0 Equiv trait:

The Borrow trait captures the borrowing relationship between an owned data structure and both references to it and slices from it -- once and for all. This means that it can be used anywhere we need to program generically over "borrowed" data.

(Emphasis mine)

That puts it better than I did.
