Best way to deserialize heterogeneous JSON object

#1

Hi,
I’m currently porting a Go project to Rust and I’m struggling to embrace the Rust way of thinking when it comes to deserializing a configuration file.

The file looks something like this (it is written in SANE https://opensource.bloom.sh/sane, our configuration format, which is much like JSON):

modules = {
  "ports" = [22, 80, 443],
  "domain/whois" = {},
  "http/drupal/CVE_2018_7600" = {},
  "ssltls/cve_2014_0160" = {},
  "ssltls/cve_2014_0224" = {},
}

In Go I could have deserialized to a map[string]interface{}, but I’m not sure how to proceed in Rust.
The purpose of the configuration file is to enable + configure heterogeneous modules in a program.

I know I can deserialize into a giant struct like the following:


struct Modules {
  ports: Option<Vec<u16>>,
  #[serde(rename = "domain/whois")]
  domain_whois: Option<()>,
  #[serde(rename = "http/drupal/CVE_2018_7600")]
  http_drupal_cve_2018_7600: Option<()>,
  // ...
}

//...

// Pseudocode: run() stands in for each module's entry point.
if let Some(ports) = modules.ports {
  findings.push(ports.run())
}

if let Some(()) = modules.domain_whois {
  findings.push(modules.domain_whois.run())
}

if let Some(()) = modules.http_drupal_cve_2018_7600 {
  findings.push(modules.http_drupal_cve_2018_7600.run())
}

but I wanted to know if there is a better way, where instead of creating this giant struct of optional fields I get a collection that I can iterate over:

// let findings = modules.map(|module| module.run())

#2

This is what you are looking for:
https://docs.serde.rs/serde_json/value/index.html

Nevertheless, I actually think having a concrete struct is preferable to having a loosely typed collection like the one you are asking for.

Also, you can add #[serde(default)] to the ports field so that it always deserializes to a vector (empty if the key is missing), which means you don’t have to deal with Option.


#3

Thank you, I will try Value!

The problem with the concrete struct is that it creates a lot of boilerplate code, which can be error-prone. Think of

// if ports module is enabled
if let Some(ports) = modules.ports {
  findings.push(ports.run())
}

repeated for 1000+ modules, and the number is growing.

Thank you for the tip about default, I wasn’t aware of it. The thing is that here we may want to disable a module by not including it in the config, so with my very limited Rust knowledge, I think Option is better in this specific case.
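One way to avoid one if-block per module is a registry keyed by the config names, so that a module missing from the config is simply never looked up. This is only a sketch using the standard library: Finding, Module, the two example modules, and run_enabled are all hypothetical names, and per-module settings (such as the port list) are left out for brevity:

```rust
use std::collections::HashMap;

/// Hypothetical result type; the thread does not show the real one.
#[derive(Debug, PartialEq)]
struct Finding(String);

/// Hypothetical common interface implemented by every module.
trait Module {
    fn run(&self) -> Finding;
}

struct DomainWhois;
impl Module for DomainWhois {
    fn run(&self) -> Finding {
        Finding("domain/whois ran".to_string())
    }
}

struct Heartbleed;
impl Module for Heartbleed {
    fn run(&self) -> Finding {
        Finding("ssltls/cve_2014_0160 ran".to_string())
    }
}

/// Maps config keys to module implementations.
fn registry() -> HashMap<&'static str, Box<dyn Module>> {
    let mut modules: HashMap<&'static str, Box<dyn Module>> = HashMap::new();
    modules.insert("domain/whois", Box::new(DomainWhois));
    modules.insert("ssltls/cve_2014_0160", Box::new(Heartbleed));
    modules
}

/// Runs only the modules named in `enabled`, in the given order.
/// Names without a registered module are silently skipped.
fn run_enabled(enabled: &[&str]) -> Vec<Finding> {
    let registry = registry();
    enabled
        .iter()
        .filter_map(|name| registry.get(*name))
        .map(|module| module.run())
        .collect()
}
```

Adding a module then means one new impl and one insert line, rather than a new struct field plus a new if-block.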


#4

It is unlikely that you will be able to remove all of the boilerplate by using duck typing. :) Take for example Value::get; it returns Option! Another example is type checking with methods like is_array, is_object, and is_string.

A concrete type will help remove some of the boilerplate but only if you are certain that the given struct fields are required. Then there is no need to handle None, or unwrap.

It is also possible to use a hybrid approach where the required fields are specified in a concrete type, and everything else is accessed through duck typing like the Value enum. I use this in one of my projects, but with a much, much smaller data set and a simpler JSON schema.


#5

Acknowledged, thanks!

Is your project open source?


#6

It is! I don’t know how useful it will be for you, though. Here’s the code I’m talking about: https://github.com/rust-console/cargo-n64/blob/master/src/cargo.rs#L104-L112

cargo --message-format=json returns a list of JSON objects separated by \n characters. This code splits the list and filters it into two iterators that are of interest: one that contains warning messages, and another that contains compiler artifacts.

And here are the concrete types for these two JSON objects: https://github.com/rust-console/cargo-n64/blob/master/src/cargo.rs#L51-L70 These types only include fields that I care about. The rest are parsed by serde and ignored.

Again, I’m not sure how much this helps. It’s just an example of how I chose to deal with the complexities of unstructured data in Rust.
