How do you deal with modeling types with lots of shared fields?

I'm currently writing an application in Rust that extracts data from the GitLab API. As part of the response I receive JSON that contains an "issue" and a "mergeRequest" field. Only one of these actually contains data at a time.

"project": {
    "timelogs": {
       "timeSpent": 1800,
        "nodes": [
            {
                "issue": {
                    "title": "Project Setup",
                },
                "mergeRequest": null
            },

I've been thinking about how to best model these inside Rust. They contain lots of shared fields and in other languages I would just use a base class for the shared fields and then use inheritance to model the fields unique to that type.

My first attempt was to use Option, but that would lead to ugly is_some() checks, so I rather quickly switched to using enums. In the first enum version, I just duplicated all shared fields in the structs, which I didn't like:

pub enum TrackableItem {
    Issue(Issue),
    MergeRequest(MergeRequest),
}

#[serde_as]
#[derive(Debug, PartialEq, Default, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Issue {
    pub title: String,
    #[serde_as(as = "DurationSeconds<i64>")]
    pub time_estimate: Duration,
    #[serde_as(as = "DurationSeconds<i64>")]
    pub total_time_spent: Duration,
    pub assignees: UserNodes,
    pub milestone: Option<Milestone>,
    pub labels: Option<Labels>,
}

#[serde_as]
#[derive(Debug, PartialEq, Default, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct MergeRequest {
    pub reviewers: UserNodes,
    pub title: String,
    #[serde_as(as = "DurationSeconds<i64>")]
    pub time_estimate: Duration,
    #[serde_as(as = "DurationSeconds<i64>")]
    pub total_time_spent: Duration,
    pub assignees: UserNodes,
    pub milestone: Option<Milestone>,
    pub labels: Option<Labels>,
}

My second idea was to model it via a third struct that contains all common fields.
Then, to make the fields easier to access, implement the Deref trait:

pub struct TrackableItemFields {
    pub title: String,
    pub time_estimate: Duration,
    pub total_time_spent: Duration,
    pub assignees: UserNodes,
    pub milestone: Option<Milestone>,
    pub labels: Option<Labels>,
}

pub struct MergeRequest {
    pub reviewers: UserNodes,
    #[serde(flatten)]
    pub merge_request: TrackableItemFields,
}

impl Deref for MergeRequest {
    type Target = TrackableItemFields;
    fn deref(&self) -> &Self::Target {
        &self.merge_request
    }
}

But the problem with that solution is that I still need a match statement to access common fields:

    match node.trackable_item {
        TrackableItem::Issue(issue) => issue.milestone,
        TrackableItem::MergeRequest(mr) => mr.milestone,
    }

These all feel like a bit of a hack, so I would like to ask you how you would model these things inside Rust. I guess I'm currently inside the "knows just enough to be dangerous" territory :slight_smile:

An alternative pattern you can consider is "enum inside struct" instead of "struct inside enum":

pub struct TrackableItem {
    pub common: TrackableItemFields,
    pub kind: ItemKind,
}
pub enum ItemKind {
    Issue(IssueDetails),
    MergeRequest(MergeRequestDetails),
}
pub struct IssueDetails {
    // only issue-specific fields
}
pub struct MergeRequestDetails {
    // only merge-request-specific fields
}

Then, Issue and MergeRequest do not contain TrackableItemFields — it's always separate. If you need types for “definitely an issue, and has all the data” and “definitely a merge request, and has all the data” then you declare those as their own structs and write conversion functions as needed:

pub struct Issue {
    common: TrackableItemFields,
    details: IssueDetails,
}
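As a rough sketch of what those conversion functions could look like, here is one way to do it with TryFrom and From. The field sets are trimmed stand-ins (a single title, and Vec<String> in place of UserNodes), and the choice to hand the original item back as the error is just an assumption for illustration:

```rust
use std::convert::TryFrom;

// Minimal stand-ins for the thread's types; field sets are trimmed for brevity.
#[derive(Debug, Clone, PartialEq)]
pub struct TrackableItemFields {
    pub title: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct IssueDetails; // issue-specific fields would go here

#[derive(Debug, Clone, PartialEq)]
pub struct MergeRequestDetails {
    pub reviewers: Vec<String>, // stand-in for UserNodes
}

#[derive(Debug, Clone, PartialEq)]
pub enum ItemKind {
    Issue(IssueDetails),
    MergeRequest(MergeRequestDetails),
}

#[derive(Debug, Clone, PartialEq)]
pub struct TrackableItem {
    pub common: TrackableItemFields,
    pub kind: ItemKind,
}

/// "Definitely an issue, and has all the data."
#[derive(Debug, Clone, PartialEq)]
pub struct Issue {
    pub common: TrackableItemFields,
    pub details: IssueDetails,
}

// Narrowing conversion: succeeds only for the Issue variant and hands the
// original item back unchanged otherwise.
impl TryFrom<TrackableItem> for Issue {
    type Error = TrackableItem;

    fn try_from(item: TrackableItem) -> Result<Self, Self::Error> {
        match item.kind {
            ItemKind::Issue(details) => Ok(Issue { common: item.common, details }),
            kind => Err(TrackableItem { common: item.common, kind }),
        }
    }
}

// Widening conversion: always succeeds.
impl From<Issue> for TrackableItem {
    fn from(issue: Issue) -> Self {
        TrackableItem {
            common: issue.common,
            kind: ItemKind::Issue(issue.details),
        }
    }
}
```

With this, Issue::try_from(item) gives you the narrowed type when the item really is an issue, and TrackableItem::from(issue) goes back the other way.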

This is not necessarily better than the struct-inside-enum you came up with — it depends on your needs.

If you do go with enum TrackableItem, you should write one match (well, two, for & and &mut), not one per field:

impl TrackableItem {
    pub fn common(&self) -> &TrackableItemFields {
        match self {
            Self::Issue(issue) => &issue.common,
            Self::MergeRequest(mr) => &mr.common,
        }
    }
    pub fn common_mut(&mut self) -> &mut TrackableItemFields {
        match self {
            Self::Issue(issue) => &mut issue.common,
            Self::MergeRequest(mr) => &mut mr.common,
        }
    }
}

Then use it like item.common().milestone. Or, if you really want to, you can impl Deref for TrackableItem with this same match (but I recommend not using Deref so liberally, as it combines method namespaces and can create confusion over what value is actually being operated on). In any case, this means you do not need a method for each common field.
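For completeness, a minimal sketch of the Deref variant mentioned above, reusing the same match through common(). The types are cut down to two shared fields, String and Vec<String> stand in for Milestone and UserNodes, and milestone_of is a hypothetical helper that only exists to show the field access:

```rust
use std::ops::Deref;

// Trimmed stand-ins for the thread's types.
#[derive(Debug, PartialEq)]
pub struct TrackableItemFields {
    pub title: String,
    pub milestone: Option<String>, // stand-in for Option<Milestone>
}

pub struct Issue {
    pub common: TrackableItemFields,
}

pub struct MergeRequest {
    pub common: TrackableItemFields,
    pub reviewers: Vec<String>, // stand-in for UserNodes
}

pub enum TrackableItem {
    Issue(Issue),
    MergeRequest(MergeRequest),
}

impl TrackableItem {
    // The single match that every shared-field access goes through.
    pub fn common(&self) -> &TrackableItemFields {
        match self {
            Self::Issue(issue) => &issue.common,
            Self::MergeRequest(mr) => &mr.common,
        }
    }
}

// Optional: lets callers write `item.title` instead of `item.common().title`.
impl Deref for TrackableItem {
    type Target = TrackableItemFields;

    fn deref(&self) -> &Self::Target {
        self.common()
    }
}

/// Hypothetical helper: reads a shared field without a match at the call site.
pub fn milestone_of(item: &TrackableItem) -> Option<&str> {
    // Field access auto-derefs through the Deref impl above.
    item.milestone.as_deref()
}
```

Both item.common().title and item.title resolve to the same field here; the explicit common() call just makes it more obvious which value is being read.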

The "enum inside struct" is also one that I considered briefly and to be honest, it looks like the cleanest approach if I don't want to repeat the fields across the structs. The main problem I faced is I couldn't get the deserialization with serde_json to work. So if you know how to, I'd be glad to know :slight_smile:

The trick that I keep in my toolbox for deserialization problems is to write a custom Deserialize that delegates to a derived one.

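impl<'de> Deserialize<'de> for TrackableItem {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        // Private mirror types shaped purely for derive(Deserialize).
        #[derive(Deserialize)]
        enum TrackableItemDe {
            Issue(IssueDe),
            MergeRequest(MergeRequestDe),
        }
        #[derive(Deserialize)]
        struct IssueDe {
            #[serde(flatten)]
            common: TrackableItemFields,
            // issue-specific fields
        }
        #[derive(Deserialize)]
        struct MergeRequestDe {
            #[serde(flatten)]
            common: TrackableItemFields,
            reviewers: UserNodes,
            // other merge-request-specific fields
        }

        // Deserialize into the mirror types, then move the data into the
        // application-facing types.
        Ok(match TrackableItemDe::deserialize(deserializer)? {
            TrackableItemDe::Issue(issue) => TrackableItem {
                common: issue.common,
                kind: ItemKind::Issue(IssueDetails {
                    // ...
                }),
            },
            TrackableItemDe::MergeRequest(mr) => TrackableItem {
                common: mr.common,
                kind: ItemKind::MergeRequest(MergeRequestDetails {
                    reviewers: mr.reviewers,
                    // ...
                }),
            },
        })
    }
}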

It's a bit tedious, but it allows you to completely decouple data structures designed to work with derive(Deserialize) from data structures designed to work with your application code.


Here's a nice story of deduplicating data in a complex dataset:

I took the plunge on implementing the "scary" Deserializer and I'm quite happy with the result! But I needed to change TrackableItemDe from an enum into a struct, as merge requests couldn't be deserialized.
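The final code isn't shown in the thread, so the following is only a guess at what the struct-based version might look like: a raw struct with one optional field per JSON key ("issue" and "mergeRequest"), converted into the application type afterwards. The names RawNode, RawIssue and RawMergeRequest are hypothetical, the serde derives are left out so the sketch compiles with std alone, and Vec<String> stands in for UserNodes:

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
pub struct TrackableItemFields {
    pub title: String,
}

#[derive(Debug, PartialEq)]
pub struct IssueDetails; // issue-specific fields would go here

#[derive(Debug, PartialEq)]
pub struct MergeRequestDetails {
    pub reviewers: Vec<String>, // stand-in for UserNodes
}

#[derive(Debug, PartialEq)]
pub enum ItemKind {
    Issue(IssueDetails),
    MergeRequest(MergeRequestDetails),
}

#[derive(Debug, PartialEq)]
pub struct TrackableItem {
    pub common: TrackableItemFields,
    pub kind: ItemKind,
}

// Hypothetical raw types. In real code these would derive Deserialize,
// with #[serde(flatten)] on `common`.
pub struct RawIssue {
    pub common: TrackableItemFields,
}

pub struct RawMergeRequest {
    pub common: TrackableItemFields,
    pub reviewers: Vec<String>,
}

// Mirrors one JSON node: both keys are present, but one of them is null.
pub struct RawNode {
    pub issue: Option<RawIssue>,
    pub merge_request: Option<RawMergeRequest>,
}

impl TryFrom<RawNode> for TrackableItem {
    type Error = String;

    fn try_from(raw: RawNode) -> Result<Self, Self::Error> {
        match (raw.issue, raw.merge_request) {
            (Some(issue), None) => Ok(TrackableItem {
                common: issue.common,
                kind: ItemKind::Issue(IssueDetails),
            }),
            (None, Some(mr)) => Ok(TrackableItem {
                common: mr.common,
                kind: ItemKind::MergeRequest(MergeRequestDetails {
                    reviewers: mr.reviewers,
                }),
            }),
            (None, None) => Err("node has neither an issue nor a merge request".to_string()),
            (Some(_), Some(_)) => Err("node has both an issue and a merge request".to_string()),
        }
    }
}
```

Inside a custom Deserialize impl, the two error cases could be reported with serde::de::Error::custom instead of a plain String.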

Thanks for your guidance! :slight_smile: