Handling common case and special case when using structs

We have a struct A with some fields. Some of its "instances" are special, and some of the fields do not make sense for them.

Both the normal As and the special As generally need to be processed together in a Vec. They are generally created in different ways, such as by calling different APIs, and then put into the Vec.

As a newbie, I came up with the following solutions:

First: Enum way

#[derive(Default)]
struct NormalA {
    pub val: i32,
    pub created: String,
}

#[derive(Default)]
struct SpecialA {
    pub val: i32,
}

enum A {
    Normal(NormalA),
    Special(SpecialA),
}

fn do_with_a(a: &A) {
    match a {
        A::Normal(inner) => println!("{:#?}", inner.val),
        A::Special(inner) => println!("{:#?}", inner.val),
    }
}

fn main() {
    let many_as = vec![
        A::Normal(NormalA {
            val: 1,
            created: "now".to_owned(),
        }),
        A::Normal(NormalA {
            val: 2,
            created: "yesterday".to_owned(),
        }),
        A::Special(SpecialA { val: -1 }),
    ];

    for a in &many_as {
        do_with_a(a);
    }
}

Second: Option way

#[derive(Default)]
struct A {
    pub val: i32,
    pub created: Option<String>,
}

impl A {
    fn is_special(&self) -> bool {
        self.val < 0
    }
}

fn do_with_a(a: &A) {
    println!("{:#?}", a.val);
}

fn main() {
    let many_as = vec![
        A {
            val: 1,
            created: Some("now".to_owned()),
        },
        A {
            val: 2,
            created: Some("yesterday".to_owned()),
        },
        A {
            val: -1,
            created: None,
        },
    ];

    for a in &many_as {
        do_with_a(a);
    }
}

The first way uses an enum to differentiate the normal ones from the special ones. It generally needs more code to match on the enum and to create the "instances". The type system requires one to check for the special case at every use.

The second way makes some fields optional to accommodate the special ones. Having Some or None on fields is not elegant. It also needs methods like is_special to check whether one is special.

Question:

If

  1. the normal ones and the special ones often need to be put into one collection and processed together,
  2. the special ones sometimes need special processing (not shown in the example),
  3. the number of special "instances" is very small (assume fewer than 3), so I'm not sure they are worth a separate struct,
  4. the number of fields in normal A is very large (assume 20),
  5. about half of the fields are not meaningful for special A (so the second way would contain many Some values),

then which way is better? Or is there any other better way to handle this?

Thanks.

Citation needed. If a field can be missing, then what's wrong with making it optional?


Here's a macro that might be interesting to you: SuperStruct (see the Introduction in the SuperStruct Guide).

  1. the number of special "instances" is very small (assume fewer than 3), so I'm not sure they are worth a separate struct,
  2. the number of fields in normal A is very large (assume 20),
  3. about half of the fields are not meaningful for special A (so the second way would contain many Some values)

Hi, I mean it seems a bit inelegant in this case. There may be thousands of different normal As and only very few special A instances. Then many of these fields become optional and will have to be dealt with via unwrap or ? later.

#[derive(Default)]
struct NormalA {
    pub val: i32,
    pub val1: i32,
    pub val2: i32,
    pub val3: i32,
    pub val4: i32,
    ...
    pub created: String,
}

#[derive(Default)]
struct SpecialA {
    pub val: i32,
}

The first way looks cleaner, not sure...

The fields are not optional for normal A, but some would have to be optional for the "union" A if we treat normal A and special A as the same struct.

I'm not sure if they should be split into different structs.

Thanks, definitely will check!

Two thoughts:

  1. If your worry is that you would have to do a lot of ?/unwrap when only processing "Normal" As, you might also fix that by holding all of the optional data in a single struct, e.g.:
#[derive(Default)]
struct NormalAData {
    pub s: String,
    pub i: i32,
    pub f: f32,
}

#[derive(Default)]
struct A {
    pub val: i32,
    pub general: Option<NormalAData>,
}

That way, once you call unwrap on general, you don't have to worry about unwrapping anything else.
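A minimal sketch of how that grouping might be used, checking the Option once with a match instead of calling unwrap. The fields created and owner and the function describe are made up for illustration and are not part of the original example.

```rust
// Hypothetical normal-only fields, grouped so they are either all present or all absent.
#[derive(Debug, Default)]
struct NormalAData {
    pub created: String,
    pub owner: String,
}

#[derive(Debug, Default)]
struct A {
    pub val: i32,
    // `None` marks a special A; `Some` carries every normal-only field.
    pub general: Option<NormalAData>,
}

// Describe one A, looking at the Option exactly once.
fn describe(a: &A) -> String {
    match &a.general {
        Some(data) => format!("normal {} created {} by {}", a.val, data.created, data.owner),
        None => format!("special {}", a.val),
    }
}

fn main() {
    let items = vec![
        A {
            val: 1,
            general: Some(NormalAData {
                created: "now".to_owned(),
                owner: "me".to_owned(),
            }),
        },
        A { val: -1, general: None },
    ];

    for a in &items {
        println!("{}", describe(a));
    }
}
```

Inside the Some arm every normal-only field is a plain value, so no further unwrapping is needed there.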

  2. That said, I would advise trying to think further about how these two types relate to each other. Are they really the same type, and that type just sometimes doesn't have certain fields? Or are they really two different variants of a type, and you need to think differently about certain things depending on what you're doing?

Without knowing the problem exactly, from what you've said, it sure seems like the latter. You'll either be dealing with special As, or you won't, and you'll want to handle things differently in each case. So I think, as stated, I would probably suggest the enum solution.

(I think the enum option is also nice because it seems like it would be easier to prevent the data from being in an invalid state, i.e. SpecialA with val > 0; but since I'm not really certain what the consequences of that should be, it's harder for me to say so emphatically.)

That said, you can help yourself out a bit by implementing functions/traits on A to cut down on the matches/creations:

#[derive(Default)]
struct NormalA {
    pub val: i32,
    pub created: String,
}

#[derive(Default)]
struct SpecialA {
    pub val: i32,
}

enum A {
    Normal(NormalA),
    Special(SpecialA),
}

impl A {
    pub fn val(&self) -> i32 {
        match self {
            A::Normal(inner) => inner.val,
            A::Special(inner) => inner.val,
        }
    }    
}

impl Default for A {
    fn default() -> Self {
        A::Normal(NormalA::default())
    }
}

fn do_with_a(a: &A) {
    println!("{:#?}", a.val());
}

fn main() {
    let many_as = vec![
        A::Normal(NormalA {
            val: 1,
            created: "now".to_owned(),
        }),
        A::Normal(NormalA {
            val: 2,
            created: "yesterday".to_owned(),
        }),
        A::Special(SpecialA { val: -1 }),
        A::default(),
    ];

    for a in &many_as {
        do_with_a(a);
    }
}

In fact, the X problem is:

I call a REST API and it returns a JSON array of Collection objects. Most of them are user-created collections and have all the fields, like normal A. Two or three of them (in the array) are system-created and contain only a few fields. Most fields related to user operations, like created, are meaningless for them and are missing from the JSON object.

As collections, both kinds may go through many of the same processing steps, like

for collection in &collections {
    do_sth(collection);
    do_sth1(collection);
    ...
    // `collection` could be "normal" or "special"
    // Of course `do_sth` and `do_sth1` will not use those missing fields
    // on the special collections
}

From this view, I think there is a single type Collection (or A in the second way of the original post), with many optional fields.

But sometimes one may need to differentiate the special ones, for example to prevent them from being deleted, renamed, or moved. From this view the normal ones and the special ones seem to be different types/variants. In this case, I agree that

I think the enum option is also nice because it seems like it would be easier to prevent the data from being in an invalid state
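A rough sketch of that second view, where the variant decides whether an operation is allowed. The names can_delete and try_delete are hypothetical, and the rule that special (system-created) entries can never be deleted is an assumption taken from the description above.

```rust
#[derive(Debug)]
struct NormalA {
    val: i32,
    created: String,
}

#[derive(Debug)]
struct SpecialA {
    val: i32,
}

#[derive(Debug)]
enum A {
    Normal(NormalA),
    Special(SpecialA),
}

impl A {
    // Assumption: only normal (user-created) entries may be deleted.
    fn can_delete(&self) -> bool {
        matches!(self, A::Normal(_))
    }
}

// Remove the entry at `index` if it exists and is deletable.
fn try_delete(items: &mut Vec<A>, index: usize) -> Result<A, &'static str> {
    let deletable = match items.get(index) {
        Some(a) => a.can_delete(),
        None => return Err("no such entry"),
    };
    if !deletable {
        return Err("special entries cannot be deleted");
    }
    Ok(items.remove(index))
}

fn main() {
    let mut items = vec![
        A::Normal(NormalA { val: 1, created: "now".to_owned() }),
        A::Special(SpecialA { val: -1 }),
    ];

    println!("{:?}", try_delete(&mut items, 1)); // refused: special entry
    println!("{:?}", try_delete(&mut items, 0)); // allowed: normal entry
    println!("{} left", items.len());
}
```

The shared processing can still go through methods on A, while operations like deleting are gated in one place rather than by checking a field such as val < 0.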

Hrm... that's an interesting case. The fact that you're mapping to JSON means you probably have to concern yourself with how you would model it in JSON as well. I'd have to noodle on it a bit.

The JSON is returned from an external API. Currently I'm using the Option approach to make deserialization easier.

Have you considered just separating the two types of structs into standalone vectors and handling them "separately"?

struct Normal {...} 
struct Special {...}

struct All {
    normals: Vec<Normal>,
    specials: Vec<Special>
}

You can pass around an instance of the "All" struct instead of a Vec of mixed enum items.
This may not fit your use case, of course: you lose the original ordering of the JSON collection, for one. But it might; it depends on your complete requirements.
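For illustration, one way the mixed items could be split into such a struct. This is only a sketch: the Entry enum, the field contents, and the from_entries helper are assumptions, since the struct bodies above are elided.

```rust
#[derive(Debug)]
struct Normal {
    val: i32,
    created: String,
}

#[derive(Debug)]
struct Special {
    val: i32,
}

// Hypothetical mixed representation, e.g. what arrives from the API.
enum Entry {
    Normal(Normal),
    Special(Special),
}

#[derive(Debug, Default)]
struct All {
    normals: Vec<Normal>,
    specials: Vec<Special>,
}

impl All {
    // Split mixed entries into the two vectors.
    // Order is kept within each vector, but the interleaving is lost.
    fn from_entries(entries: impl IntoIterator<Item = Entry>) -> Self {
        let mut all = All::default();
        for entry in entries {
            match entry {
                Entry::Normal(n) => all.normals.push(n),
                Entry::Special(s) => all.specials.push(s),
            }
        }
        all
    }
}

fn main() {
    let all = All::from_entries(vec![
        Entry::Normal(Normal { val: 1, created: "now".to_owned() }),
        Entry::Special(Special { val: -1 }),
        Entry::Normal(Normal { val: 2, created: "yesterday".to_owned() }),
    ]);

    println!("{} normal, {} special", all.normals.len(), all.specials.len());
    println!("{:?}", all);
}
```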


You have asked a good design question that will affect ergonomics and the guarantees that the type system is able to make in the code that is using the types that you are designing.

You described the two basic options that you have:

  • all unique message structures as enum variants
  • making fields that may not be present optional

The difference between these is that the first is more rigid but you only have to match on the enum variant to know the structure. The second is more flexible but for each field you have to deal with the value potentially not being present. Of course, you can mix and match these options.

The right decision depends entirely on the context. Based on what you described earlier (all the different message structures are known and there are only 2), I would go with the enum approach.

When it comes to storing the different messages, you can use an enum (possible if all variants are known by the library), dyn Trait (the only option when library users should be able to add new types), or separate containers as mentioned by @RustyJoeM.
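As a sketch of the dyn Trait option: the trait name Collection and its methods val and kind are invented for this example, not taken from the thread.

```rust
// Hypothetical trait covering what both kinds have in common.
trait Collection {
    fn val(&self) -> i32;
    fn kind(&self) -> &'static str;
}

struct NormalA {
    val: i32,
    created: String,
}

struct SpecialA {
    val: i32,
}

impl Collection for NormalA {
    fn val(&self) -> i32 {
        self.val
    }
    fn kind(&self) -> &'static str {
        "normal"
    }
}

impl Collection for SpecialA {
    fn val(&self) -> i32 {
        self.val
    }
    fn kind(&self) -> &'static str {
        "special"
    }
}

// Shared processing only sees the trait, not the concrete types.
fn total(items: &[Box<dyn Collection>]) -> i32 {
    items.iter().map(|c| c.val()).sum()
}

fn main() {
    let normal = NormalA { val: 3, created: "now".to_owned() };
    println!("created: {}", normal.created);

    let items: Vec<Box<dyn Collection>> = vec![
        Box::new(normal),
        Box::new(SpecialA { val: -1 }),
    ];

    for c in &items {
        println!("{} {}", c.kind(), c.val());
    }
    println!("total = {}", total(&items));
}
```

Compared with the enum, new implementors can be added without touching existing code, but code that needs the normal-only fields has to go through trait methods rather than a match.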


Thanks. I didn't think about the separate containers in this case.

I'm not a serde expert, but the All struct mentioned by @RustyJoeM seems hard to deserialize from this kind of JSON array?

Can serde divide the array into the two groups easily?

If they are tagged in some way it should be straightforward. Otherwise it can probably still be done but you may have to implement your own deserialization logic.
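One possible shape for such custom logic, sketched with the standard library only: keep the Option-based struct as the raw deserialization target and convert it into the enum afterwards. RawA stands in for whatever the JSON is parsed into (the serde part is left out here), and treating a missing created as the marker for a special entry is an assumption.

```rust
// Stand-in for the struct the JSON would be deserialized into.
#[derive(Debug, Default)]
struct RawA {
    val: i32,
    created: Option<String>,
}

#[derive(Debug, PartialEq)]
struct NormalA {
    val: i32,
    created: String,
}

#[derive(Debug, PartialEq)]
struct SpecialA {
    val: i32,
}

#[derive(Debug, PartialEq)]
enum A {
    Normal(NormalA),
    Special(SpecialA),
}

impl From<RawA> for A {
    // Assumption: an entry without `created` is a special one.
    fn from(raw: RawA) -> Self {
        match raw.created {
            Some(created) => A::Normal(NormalA { val: raw.val, created }),
            None => A::Special(SpecialA { val: raw.val }),
        }
    }
}

fn main() {
    // Pretend these came out of the JSON array.
    let parsed = vec![
        RawA { val: 1, created: Some("now".to_owned()) },
        RawA { val: -1, created: None },
    ];

    let items: Vec<A> = parsed.into_iter().map(A::from).collect();
    println!("{:?}", items);
}
```

After the conversion, the rest of the code works with the enum and never sees the Option fields.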


May I ask why you put a Default derive on the Special struct?

My rationale being that the default has to be a Normal struct.

Genuine question from a beginner.

Here is my playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=cffcc610b49020366fe382d699ce89e2

Of course, I understand its meaning if we had called default on SpecialA, but my question refers to the exact example above.

Thanks

#[derive(Default)] only means implementing the Default trait, which effectively gives you an argument-less constructor for your type. It has nothing to do with your (more complicated and domain-specific) notion of "default", it's just a simple library trait.

Some traits should be implemented if possible, because it makes the lives of users of your code easier; Default is one of them.
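Two small illustrations of what implementing Default makes possible for users of a type, both using only the standard library. The function first_or_default is a made-up helper for this sketch.

```rust
#[derive(Debug, Default, PartialEq)]
struct NormalA {
    val: i32,
    created: String,
}

// `Option::unwrap_or_default` is only available because NormalA implements Default.
fn first_or_default(items: Vec<NormalA>) -> NormalA {
    items.into_iter().next().unwrap_or_default()
}

fn main() {
    // Struct update syntax: set one field, take the rest from Default.
    let a = NormalA {
        val: 5,
        ..Default::default()
    };
    println!("{:?}", a);

    // An empty input falls back to NormalA::default().
    let b = first_or_default(Vec::new());
    println!("{:?}", b);
}
```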


Nice, and I concur; so in the original example, it's not really needed because we initialize each structure with a value.
I just want to make sure that nothing alarming is about to happen by not using it in this context.

Ok, read your update, it answers my question. Thanks!
