[serde] I couldn't deserialize a serialized object for a recursively nested struct with flatten fields

Hello community!
I am new to Rust and am currently learning the "serde" framework.
I serialized my structure to JSON, but I cannot deserialize that same JSON back. The structure is a recursively nested "struct" with flattened fields.
Error:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("invalid type: integer `20`, expected struct Person", line: 18, column: 1)', src/main.rs:28:75

Given that I would like to keep the "struct" and the JSON format, what do I need to fix, with minimal changes, using the capabilities of the framework?
Thank you in advance! The code and a link to the playground are attached.

(Playground)

use std::collections::BTreeMap;
use serde::{Serialize, Deserialize};
use serde_json;


#[derive(Debug, Serialize, Deserialize)]
struct Person {
    #[serde(flatten)]
    reports: BTreeMap<String, Person>,
    #[serde(flatten)]
    subscriptions: BTreeMap<String, Subscription>,
}

#[derive(Debug, Serialize, Deserialize)]
struct Subscription {
    cost: i32,
    kind: String,
}


fn main() {
    let person_obj = compose_person_example_struct();
    println!("Object!\n{:?}\n", person_obj);

    let person_obj_ser = serde_json::to_string_pretty(&person_obj).unwrap();
    println!("Serialized!\n{}\n", person_obj_ser);

    let person_obj_ser_de: Person = serde_json::from_str(&person_obj_ser).unwrap();
    println!("Deserialized!\n{:?}\n", person_obj_ser_de);
}


fn compose_person_example_struct() -> Person {
/*
Returns object:

Person{
  reports: {
    "Pete": Person{
      reports: {
        "John": Person{
          reports: { },
          subscriptions: {
            "Netflix": Subscription{cost: 20, kind: "movies"},
            "Spotify": Subscription{cost: 10, kind: "music"}
          }
        }
      },
      subscriptions: {
        "Spotify": Subscription{cost: 10, kind: "music"}
      }
    }
  },
  subscriptions: { }
}
*/
    let mut john_subscriptions = BTreeMap::new();
    john_subscriptions.insert("Netflix".to_string(), Subscription{kind: "movies".to_string(), cost: 20});
    john_subscriptions.insert("Spotify".to_string(), Subscription{kind: "music".to_string(), cost: 10});

    let mut pete_subscriptions = BTreeMap::new();
    pete_subscriptions.insert("Spotify".to_string(), Subscription{kind: "music".to_string(), cost: 10});
    let mut pete_reports = BTreeMap::new();
    pete_reports.insert("John".to_string(), Person{reports: BTreeMap::new(), subscriptions: john_subscriptions});
    
    let mut reports = BTreeMap::new();
    reports.insert("Pete".to_string(), Person{reports: pete_reports, subscriptions: pete_subscriptions});

    Person { reports, subscriptions: BTreeMap::new() }
}

This code requires every value to be deserializable as both Person and Subscription, which your two struct definitions do not allow. flatten works by taking all key-value pairs that have not yet been consumed (e.g., by other fields in the struct) and trying to deserialize them, returning an error if that fails. A flattened BTreeMap also does not consume the key-value pairs, so even if a value is successfully deserialized as a Person, it will be tried again as a Subscription.
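
To make the failure mode concrete, here is a rough std-only model of it (no serde involved). The `Value` enum, `person_from`, `map`, and `john_example` are all made up for illustration, and the toy `Person` only has the flattened `reports` map. It is a sketch of the behaviour described above, not serde's actual implementation.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a parsed JSON value.
#[derive(Debug, Clone)]
enum Value {
    Int(i64),
    Str(String),
    Map(BTreeMap<String, Value>),
}

/// Toy version of `Person` with only the flattened `reports` map.
#[derive(Debug)]
struct Person {
    reports: BTreeMap<String, Person>,
}

/// Models a flattened `BTreeMap<String, Person>`: every entry of the
/// enclosing map is offered to it, and every value must itself be a `Person`.
fn person_from(value: &Value) -> Result<Person, String> {
    match value {
        Value::Map(entries) => {
            let mut reports = BTreeMap::new();
            for (key, entry) in entries {
                // Recurse: each nested value is treated as a `Person` too.
                reports.insert(key.clone(), person_from(entry)?);
            }
            Ok(Person { reports })
        }
        Value::Int(n) => Err(format!(
            "invalid type: integer `{}`, expected struct Person",
            n
        )),
        Value::Str(s) => Err(format!(
            "invalid type: string {:?}, expected struct Person",
            s
        )),
    }
}

/// Small helper to build a `Value::Map` from key-value pairs.
fn map(entries: Vec<(&str, Value)>) -> Value {
    Value::Map(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Inner part of the question's data: John with a Netflix subscription.
fn john_example() -> Value {
    let netflix = map(vec![
        ("cost", Value::Int(20)),
        ("kind", Value::Str("movies".to_string())),
    ]);
    map(vec![("John", map(vec![("Netflix", netflix)]))])
}
```

Calling `person_from(&john_example())` descends John, then Netflix, then `cost`, and fails on the integer `20`, which mirrors the message in the panic from the question.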

I think the easiest approach is to deserialize into a BTreeMap<String, PersonOrSubscription> and then split it into the correct BTreeMaps afterwards. An untagged enum allows deserializing different types by trying the variants in the declared order until one succeeds or all fail.

#[derive(Debug, Serialize, Deserialize)]
#[serde(from = "PersonDeser")]
struct Person { /* ... */ }

#[derive(Deserialize)]
struct PersonDeser(BTreeMap<String, PersonOrSubscription>);

impl From<PersonDeser> for Person {
    fn from(p: PersonDeser) -> Self {
        let mut reports = BTreeMap::new();
        let mut subscriptions = BTreeMap::new();
        p.0.into_iter().for_each(|(key, pos)| match pos {
            PersonOrSubscription::Person(person) => {
                reports.insert(key, person);
            }
            PersonOrSubscription::Subscription(subscription) => {
                subscriptions.insert(key, subscription);
            }
        });
        Self {
            reports,
            subscriptions,
        }
    }
}

#[derive(Deserialize)]
#[serde(untagged)]
enum PersonOrSubscription {
    Person(Person),
    Subscription(Subscription),
}

It’s also possible to skip the intermediate BTreeMap<String, PersonOrSubscription> by implementing Deserialize manually:

use serde::{de::Visitor, Deserialize, Serialize};
use std::collections::BTreeMap;

#[derive(Debug, Serialize)]
struct Person {
    #[serde(flatten)]
    reports: BTreeMap<String, Person>,
    #[serde(flatten)]
    subscriptions: BTreeMap<String, Subscription>,
}

impl<'de> Deserialize<'de> for Person {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        struct PersonVisitor;
        impl<'de> Visitor<'de> for PersonVisitor {
            type Value = Person;

            fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                formatter.write_str("a map")
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: serde::de::MapAccess<'de>,
            {
                let mut person = Person {
                    reports: BTreeMap::new(),
                    subscriptions: BTreeMap::new(),
                };
                while let Some((key, value)) = map.next_entry()? {
                    match value {
                        PersonOrSubscription::Person(p) => {
                            person.reports.insert(key, p);
                        }
                        PersonOrSubscription::Subscription(s) => {
                            person.subscriptions.insert(key, s);
                        }
                    }
                }
                Ok(person)
            }
        }
        deserializer.deserialize_map(PersonVisitor)
    }
}

#[derive(Deserialize)]
#[serde(untagged)]
enum PersonOrSubscription {
    Person(Person),
    Subscription(Subscription),
}

Looking at the macro expansion, #[serde(untagged)] seems to be rather costly. It stores the entire input in a type similar to serde_value::Value first (though it uses a Vec for the maps, too, so slightly better); this will probably lead to quadratic deserialization time and lots of allocations.

Yeah, it has to use backtracking.

It's the drawback of trying to support arbitrary stuff. If it's too slow for you, you gotta implement the whole Deserialize yourself rather than rely on the untagged enum.

But don't just go optimizing. Profile it first.

I spent some time on this over the weekend, and thanks to the comments I am now a bit more familiar with the framework!
In the end I settled on a custom deserialize method, since it made it easier for me to understand how to parse more complex and unusual structures :)

By the way, on a very complex, deeply nested struct, I found that an empty JSON object { } deep in the file was parsed as a Person::default() object. I fixed this by adding the following check before the while loop:

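// `size_hint()` returns `None` when the remaining length is unknown, so this only fires when it is known.
if map.size_hint() == Some(0) {
    // `custom` carries a free-form message; `missing_field` expects a field name.
    return Err(serde::de::Error::custom("do not parse an empty map as Person"));
}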
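
For anyone wondering why the guard changes the outcome, here is a rough std-only model of the "try Person first, then Subscription" order with and without the empty-map check. The `Value` enum, `Kind`, `classify`, and the helpers are made up for illustration, and `reject_empty` stands in for the `size_hint()` check. Unlike the real check, it applies at every level, so treat it as a simplification.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a parsed JSON value.
#[derive(Debug, Clone)]
enum Value {
    Int(i64),
    Str(String),
    Map(BTreeMap<String, Value>),
}

/// Which variant of the untagged enum a value would end up as.
#[derive(Debug, PartialEq)]
enum Kind {
    Person,
    Subscription,
}

/// Stand-in for "deserializes as Subscription": needs `cost` and `kind`.
fn looks_like_subscription(value: &Value) -> bool {
    match value {
        Value::Map(m) => {
            matches!(m.get("cost"), Some(Value::Int(_)))
                && matches!(m.get("kind"), Some(Value::Str(_)))
        }
        _ => false,
    }
}

/// Stand-in for "deserializes as Person": a map whose values are all
/// themselves a Person or a Subscription. An empty map passes vacuously
/// unless `reject_empty` is set.
fn looks_like_person(value: &Value, reject_empty: bool) -> bool {
    match value {
        Value::Map(m) => {
            if reject_empty && m.is_empty() {
                return false;
            }
            m.values().all(|entry| classify(entry, reject_empty).is_some())
        }
        _ => false,
    }
}

/// Tries the variants in declared order: Person first, then Subscription.
fn classify(value: &Value, reject_empty: bool) -> Option<Kind> {
    if looks_like_person(value, reject_empty) {
        Some(Kind::Person)
    } else if looks_like_subscription(value) {
        Some(Kind::Subscription)
    } else {
        None
    }
}

/// Small helper to build a `Value::Map` from key-value pairs.
fn map(entries: Vec<(&str, Value)>) -> Value {
    Value::Map(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Small helper to build a subscription-shaped map.
fn subscription(cost: i64, kind: &str) -> Value {
    map(vec![
        ("cost", Value::Int(cost)),
        ("kind", Value::Str(kind.to_string())),
    ])
}
```

In this model, `classify(&map(vec![]), false)` gives `Some(Kind::Person)`, while `classify(&map(vec![]), true)` gives `None`, because the empty map is then rejected as a Person and also lacks the fields of a Subscription.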
