How to convert this Python code into Rust?

class Template(BaseModel):
    model_config: ClassVar[ConfigDict] = ConfigDict(extra="forbid", frozen=True)

    name: BBox
    mr_id: BBox | None = None
    diagnoses: BBox
    allergy_history: BBox | None = None
    department: BBox
    rx_id: BBox
    age: BBox
    drugs: list[DrugTemplate]
    preconception_pregnancy_lactation: BBox | None = None
    hepatic_impairment: BBox | None = None
    renal_impairment: BBox | None = None
    insurance_id: BBox | None = None
    follow_up_notes: BBox | None = None
    remarks: BBox | None = None
    date_year: BBox
    date_month: BBox
    date_day: BBox
    hospital: BBox
    sex: BBox | None = None
    sex_male: BBox | None = None
    sex_female: BBox | None = None
    weight: BBox | None = None

    def non_drug_bboxes(self) -> list[NamedBBox]:
        return [
            NamedBBox(name=k, bbox=v)
            for k, v in self.model_dump(mode="python").items()
            if k != "drugs" and v is not None
        ]

The approach I have come up with so far is:

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Template {
    name: BBox,
    mr_id: Option<BBox>,
    diagnoses: BBox,
    allergy_history: Option<BBox>,
    department: BBox,
    rx_id: BBox,
    age: BBox,
    drugs: Vec<DrugTemplate>,
    preconception_pregnancy_lactation: Option<BBox>,
    hepatic_impairment: Option<BBox>,
    renal_impairment: Option<BBox>,
    insurance_id: Option<BBox>,
    follow_up_notes: Option<BBox>,
    remarks: Option<BBox>,
    date_year: BBox,
    date_month: BBox,
    date_day: BBox,
    hospital: BBox,
    sex: Option<BBox>,
    sex_male: Option<BBox>,
    sex_female: Option<BBox>,
    weight: Option<BBox>,
}

impl Template {
    pub fn non_drug_bboxes(&self) -> Vec<NamedBBox> {
        [
            ("name", Some(self.name)),
            ("mr_id", self.mr_id),
            ("diagnoses", Some(self.diagnoses)),
            ("allergy_history", self.allergy_history),
            ("department", Some(self.department)),
            ("rx_id", Some(self.rx_id)),
            ("age", Some(self.age)),
            (
                "preconception_pregnancy_lactation",
                self.preconception_pregnancy_lactation,
            ),
            ("hepatic_impairment", self.hepatic_impairment),
            ("renal_impairment", self.renal_impairment),
            ("insurance_id", self.insurance_id),
            ("follow_up_notes", self.follow_up_notes),
            ("remarks", self.remarks),
            ("date_year", Some(self.date_year)),
            ("date_month", Some(self.date_month)),
            ("date_day", Some(self.date_day)),
            ("hospital", Some(self.hospital)),
            ("sex", self.sex),
            ("sex_male", self.sex_male),
            ("sex_female", self.sex_female),
            ("weight", self.weight),
        ]
        .into_iter()
        .filter_map(|(name, bbox)| {
            if let Some(bbox) = bbox {
                Some(NamedBBox {
                    name: name.to_string(),
                    bbox,
                })
            } else {
                None
            }
        })
        .collect()
    }
}

I wonder if there is a way to simplify the non_drug_bboxes method. Thanks!

macro_rules! generate_array {
    ($this:ident; $($name:ident),*) => {
        [$(
            (
                ::core::stringify!($name),
                ::core::option::Option::from($this.$name),
            ),
        )*]
    };
}
generate_array!(
    self;
    name, mr_id, diagnoses, allergy_history, department,
    rx_id, age, preconception_pregnancy_lactation,
    hepatic_impairment, renal_impairment, insurance_id,
    follow_up_notes, remarks, date_year, date_month,
    date_day, hospital, sex, sex_male, sex_female, weight
)

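Replace the array in non_drug_bboxes with this macro call. (I wrote the field names out by hand, so double-check that none are missing or misspelled.)
Rust doesn't have runtime reflection like Python: a struct's fields cannot be enumerated at runtime, and the field names generally don't exist in the final binary unless something like derive(Debug) embeds them as strings. The list therefore has to be spelled out at compile time.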
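
For anyone who wants to try the macro idea in isolation, here is a minimal sketch on a trimmed-down Template. The BBox and NamedBBox definitions and the sample_template helper are stand-ins I made up for the example, and I pinned the conversion to Option::<BBox>::from to keep type inference simple. That is a small change from the macro above, not a requirement.

```rust
// Stand-in types for this sketch; the real BBox and NamedBBox are defined elsewhere.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BBox(pub f64, pub f64, pub f64, pub f64);

#[derive(Debug, Clone, PartialEq)]
pub struct NamedBBox {
    pub name: String,
    pub bbox: BBox,
}

// Trimmed-down Template: two required fields and two optional ones.
pub struct Template {
    pub name: BBox,
    pub mr_id: Option<BBox>,
    pub age: BBox,
    pub weight: Option<BBox>,
}

// Expands to an array of (field name, Option<BBox>) pairs.
macro_rules! generate_array {
    ($this:ident; $($name:ident),* $(,)?) => {
        [$(
            (
                ::core::stringify!($name),
                // Works for both BBox (wrapped in Some) and Option<BBox> (kept as is).
                ::core::option::Option::<BBox>::from($this.$name),
            ),
        )*]
    };
}

impl Template {
    pub fn non_drug_bboxes(&self) -> Vec<NamedBBox> {
        generate_array!(self; name, mr_id, age, weight)
            .into_iter()
            // Drop the fields that are None and name the rest.
            .filter_map(|(name, bbox)| {
                bbox.map(|bbox| NamedBBox {
                    name: name.to_string(),
                    bbox,
                })
            })
            .collect()
    }
}

// Hypothetical sample data, only used to exercise the method.
pub fn sample_template() -> Template {
    Template {
        name: BBox(0.0, 0.0, 10.0, 2.0),
        mr_id: Some(BBox(0.0, 3.0, 10.0, 2.0)),
        age: BBox(0.0, 6.0, 4.0, 2.0),
        weight: None,
    }
}
```

A misspelled field name in the macro call is a compile error (no such field), so the list cannot silently drift from the struct, although a field that is left out entirely will not be caught.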

Note: What is the BBox type, and why is it the right type for name, hospital, hepatic_impairment, sex, and preconception_pregnancy_lactation? It seems like those should be different types.

Another note:

The filter_map can be rewritten:

.filter_map(|(name, bbox)| {
    bbox.map(|bbox| NamedBBox {
        name: name.to_string(),
        bbox,
    })
})

It may be better to use HashMap<String, TemplateValue> where TemplateValue is an enum with variants for each type in the map.

This more closely approximates what Python is doing. You can iterate over the HashMap, collect it into a vector, index it by string keys, etc. Many of the same operations you have in Python.

Constructing the HashMap is not quite as nice as it is in Python, but if you are only serializing and deserializing, it may not be so bad.

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Template {
    map: HashMap<String, TemplateValue>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
enum TemplateValue {
    BBox(BBox),
    MaybeBBox(Option<BBox>),
    DrugTemplates(Vec<DrugTemplate>),
}

impl Template {
    pub fn non_drug_bboxes(&self) -> Vec<NamedBBox> {
        self.map.iter().filter_map(|(name, value)| {
            match value {
                TemplateValue::BBox(bbox) => Some(NamedBBox {
                    name: name.clone(),
                    bbox: *bbox,
                }),
                TemplateValue::MaybeBBox(bbox) => bbox.map(|bbox| NamedBBox {
                    name: name.clone(),
                    bbox,
                }),
                _ => None,
            }
        }).collect()
    }
}

impl From<HashMap<String, TemplateValue>> for Template {
    fn from(map: HashMap<String, TemplateValue>) -> Self {
        Self { map }
    }
}

An example constructor looks like this:

fn main() {
    let template = Template::from([
        ("name".to_string(), TemplateValue::BBox(BBox::new())),
        ("mr_id".to_string(), TemplateValue::MaybeBBox(Some(BBox::new()))),
        ("diagnoses".to_string(), TemplateValue::BBox(BBox::new())),
        ("allergy_history".to_string(), TemplateValue::MaybeBBox(None)),
        // ...
        ("drugs".to_string(), TemplateValue::DrugTemplates(vec![])),
        // ...
    ].into_iter().collect::<HashMap<_, _>>());
    
    println!("{:#?}", template.non_drug_bboxes());
}

Also, I find this kind of constructor a little nicer than making the caller collect into a HashMap first:

impl<I> From<I> for Template
where
    I: Iterator<Item = (String, TemplateValue)>,
{
    fn from(iter: I) -> Self {
        Self {
            map: iter.collect(),
        }
    }
}

Usage is the same, but you can remove the collection and turbofish:

let template = Template::from([
    ("name".to_string(), TemplateValue::BBox(BBox::new())),
    ("mr_id".to_string(), TemplateValue::MaybeBBox(Some(BBox::new()))),
    ("diagnoses".to_string(), TemplateValue::BBox(BBox::new())),
    ("allergy_history".to_string(), TemplateValue::MaybeBBox(None)),
    // ...
    ("drugs".to_string(), TemplateValue::DrugTemplates(vec![])),
    // ...
].into_iter());
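
A related option, in case it is useful: implementing the standard FromIterator trait instead of From, so callers can simply .collect() into a Template. This is only a sketch under my own assumptions. BBox is a stand-in struct, the enum is trimmed to two variants, and build and get are hypothetical helpers added for the example.

```rust
use std::collections::HashMap;
// In the prelude since the 2021 edition; the explicit import keeps older editions happy.
use std::iter::FromIterator;

// Stand-in for the real BBox type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BBox(pub f64, pub f64, pub f64, pub f64);

// Trimmed-down version of the enum from the post above.
#[derive(Debug, Clone, PartialEq)]
pub enum TemplateValue {
    BBox(BBox),
    MaybeBBox(Option<BBox>),
}

#[derive(Debug, Clone)]
pub struct Template {
    map: HashMap<String, TemplateValue>,
}

impl Template {
    // Hypothetical accessor, only here so the result can be inspected.
    pub fn get(&self, key: &str) -> Option<&TemplateValue> {
        self.map.get(key)
    }
}

// Anything that yields (String, TemplateValue) pairs can be collected into a Template.
impl FromIterator<(String, TemplateValue)> for Template {
    fn from_iter<I: IntoIterator<Item = (String, TemplateValue)>>(iter: I) -> Self {
        Self {
            map: iter.into_iter().collect(),
        }
    }
}

// Hypothetical constructor showing the call site.
pub fn build() -> Template {
    vec![
        ("name".to_string(), TemplateValue::BBox(BBox(0.0, 0.0, 10.0, 2.0))),
        ("allergy_history".to_string(), TemplateValue::MaybeBBox(None)),
    ]
    .into_iter()
    .collect()
}
```

The call site reads the same as collecting into a HashMap, and the return type (or a let binding's type annotation) picks Template as the target.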

What is the BBox type

It is just (f64, f64, f64, f64), defining the (x, y, width, height) of an OCR region.
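
Given that description, BBox could be written as a plain type alias. A tuple of four f64 values is Copy, which is why expressions like Some(self.name) compile inside a &self method. A small sketch, where the two-field Template and the name_and_weight and area helpers are made up for illustration:

```rust
/// (x, y, width, height) of an OCR region.
pub type BBox = (f64, f64, f64, f64);

// Two-field stand-in for the full Template.
pub struct Template {
    pub name: BBox,
    pub weight: Option<BBox>,
}

impl Template {
    /// Copies fields out of `&self`; this is allowed because BBox is Copy.
    pub fn name_and_weight(&self) -> [Option<BBox>; 2] {
        [Some(self.name), self.weight]
    }
}

/// Area of a region, destructuring the tuple by position.
pub fn area(bbox: BBox) -> f64 {
    let (_x, _y, width, height) = bbox;
    width * height
}
```

A struct with named fields would make x, y, width, and height harder to mix up, at the cost of a slightly longer definition.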

Is a macro the only way to avoid writing out all the field names?

Thanks for the suggestion.

One downside might be that every field of the template becomes uncertain (effectively an Option): accessing any field requires checking whether it is None.

That is already the situation with struct fields containing Option.

The real downsides are roughly the same as in Python:

  • HashMap keys may not exist: either never created or have since been deleted.
  • Additional HashMap keys may have been added that your code is not prepared to handle.
  • Duck typing with enum values makes it possible to change the type stored under a key at runtime.
    • Rust has an edge over Python here: every interaction with an enum must go through typed access. A value can never accidentally be treated as an unexpected type, and every variant must be handled.
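
To make those points concrete, here is a small sketch of a lookup over the map. BBox and DrugTemplate are stand-in types, and bbox_for and sample_map are hypothetical helpers I added for the example. A missing key and a key holding the "wrong" variant both have to be handled explicitly before a BBox comes out.

```rust
use std::collections::HashMap;

// Stand-ins for the real types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BBox(pub f64, pub f64, pub f64, pub f64);

#[derive(Debug, Clone, PartialEq)]
pub struct DrugTemplate;

#[derive(Debug, Clone, PartialEq)]
pub enum TemplateValue {
    BBox(BBox),
    MaybeBBox(Option<BBox>),
    DrugTemplates(Vec<DrugTemplate>),
}

/// Returns the bbox stored under `key`, or None if the key is missing,
/// the optional bbox is empty, or the value is not a bbox at all.
pub fn bbox_for(map: &HashMap<String, TemplateValue>, key: &str) -> Option<BBox> {
    // `?` returns early with None when the key does not exist.
    match map.get(key)? {
        TemplateValue::BBox(bbox) => Some(*bbox),
        TemplateValue::MaybeBBox(bbox) => *bbox,
        // The compiler rejects the match if a variant is left unhandled.
        TemplateValue::DrugTemplates(_) => None,
    }
}

// Hypothetical sample data.
pub fn sample_map() -> HashMap<String, TemplateValue> {
    let mut map = HashMap::new();
    map.insert(
        "name".to_string(),
        TemplateValue::BBox(BBox(0.0, 0.0, 10.0, 2.0)),
    );
    map.insert("sex".to_string(), TemplateValue::MaybeBBox(None));
    map.insert("drugs".to_string(), TemplateValue::DrugTemplates(vec![]));
    map
}
```

Nothing prevents a later insert from storing a DrugTemplates value under "name", but the lookup then returns None rather than misinterpreting the data.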