Process duplicate keys in a JSON message

Does anybody know how I can process JSON like this with serde_json?

{
  "field1": "hello",
  "array_field": ["1", "2", "3"],
  "dup_field": 1,
  "dup_field": 2
}

But the message could also come without the duplicate:

{
  "field1": "hello",
  "array_field": ["1", "2", "3"],
  "dup_field": 1
}

My struct:

#[derive(Serialize, Deserialize, Debug)]
pub struct Message {
    field1: Option<String>,
    array_field: Vec<Value>,
    dup_field: i64,
}

How do you want dup_field to be handled in the final result? Should it:

  • Keep all values, if so in what data structure? A Vec?
  • Keep only the first value
  • Keep only the last value

Keeping only the first or the last value is OK.

It seems a bit tricky to achieve with a derived deserializer alone. There are some existing ideas for this.

Otherwise you could use an intermediate type to collect and de-dup the extra fields, then convert that into your target struct.

This has a much simpler solution using Value:

// EXAMPLE is the JSON string shown above. Duplicate keys are
// accepted by Value; the last occurrence wins.
let raw: Value = serde_json::from_str(EXAMPLE).expect("unable to parse JSON");
let msg: Message = serde_json::from_value(raw).expect("dup_field is missing");



TIL that serde_json::Value allows the input to have duplicated keys, thanks!


To be honest, I just assumed that it does. Neither HashMap nor BTreeMap actively checks for duplicate keys; they just blindly overwrite with the latest value, so I figured that's also what a map-based Value::Object would do, and the gamble paid off 😅


I've parsed the fields into a HashMap like below:

let parsed_message: HashMap<String, Value> = serde_json::from_str(message).unwrap();

As @H2CO3 mentioned, it just keeps the latest value for a duplicated key.

That's not exactly what was needed, but it can be used as a workaround 🙂


This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.