Hi, I posted this on Stack Overflow but got no replies (I foolishly posted on a weekend).
I have some JSON like the example below. The sites field has a few thousand entries, and the file is around 4 MB. If I read it with serde_json::from_str, my remaining 5 or 6 GB of memory is quickly consumed and the app crashes.
I see serde_json has Deserializer::from_str(&contents).into_iter::<T>();
How can I use this to stream just the sites field of the top-level object, since that is where all the repeating objects are?
{
  "debug": {
    "conn": {
      "database": "bmos",
      "host": "carneab4.memset.net",
      "password": "cms823nvc",
      "port": 5432,
      "user": "bmos"
    },
    "interval": 60,
    "name": "debug",
    "sites": {
      "house1": {
        "enable": true,
        "equip": [
          {
            "enable": true,
            "interval": 45,
            "ip": "127.0.0.1",
            "name": "gateway",
            "points": [
              {
                "auto_convert": true,
                "data_type": "float",
                "enable": true,
                "name": "point1",
                "order": "1234",
                "reg_type": "input",
                "register": 22,
                "scale": 1.0,
                "uid": 1,
                "units": "watts"
              },
              {
                "auto_convert": true,
                "data_type": "float",
                "enable": true,
                "name": "point2",
                "order": "1234",
                "reg_type": "input",
                "register": 22,
                "scale": 1.0,
                "uid": 2,
                "units": "watts"
              },
              .......
            ],
            "port": 502
          }
        ],
        "name": "house1",
        "plot": "1"
      },
      .....
    }
  }
}
Ideas I have tried:
1. I tried adding #[serde(deserialize_with = "site_stream_deserialize")] to the sites field and doing a streaming read inside that function, but I don't know whether it has access to the original byte stream.
2. I tried to implement my own Deserialize. However, sites is still expected to be an object and can't just be read as a byte stream or str.
impl<'de> Deserialize<'de> for Campus {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        #[derive(Deserialize)]
        struct Outer {
            pub name: String,
            pub interval: i32,
            pub conn: DBConnection,
            pub sites: String,
        }
        let helper = Outer::deserialize(deserializer)?;
        let stream = Deserializer::from_str(&helper.sites).into_iter::<Site>();
        let mut sites: HashMap<String, Site> = HashMap::new();
        for site in stream {
            let s = site.unwrap();
            sites.insert(s.name.to_string(), s);
        }
        Ok(Campus {
            name: helper.name,
            interval: helper.interval,
            conn: helper.conn,
            sites,
        })
    }
}
I am not sure I can read this JSON as a stream, as the examples I have found all stream top-level repeating entities. Any advice would be great.
I guess I could:
1. try preprocessing the file to leave just the sites object, or
2. create a nom parser,
but 1 seems hacky and 2 is more work and possibly fragile.
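To make the preprocessing idea concrete, here is a rough, std-only sketch of what I had in mind. The helper names extract_object and balanced_object are hypothetical (nothing from serde), keys are compared as raw text without decoding escapes, and it assumes the value under the key is an object. It also still needs the whole file in memory as a &str, so it only narrows down what gets handed to the deserializer.

```rust
/// Hypothetical helper: returns the raw text of the first object value
/// stored under `key`, or None if no such key/object pair is found.
fn extract_object<'a>(json: &'a str, key: &str) -> Option<&'a str> {
    let bytes = json.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] != b'"' {
            i += 1;
            continue;
        }
        // Read a whole string literal, skipping backslash-escaped bytes.
        let start = i + 1;
        i += 1;
        while i < bytes.len() && bytes[i] != b'"' {
            if bytes[i] == b'\\' {
                i += 1;
            }
            i += 1;
        }
        if i >= bytes.len() {
            return None; // unterminated string
        }
        let text = &json[start..i];
        i += 1; // step past the closing quote
        if text != key {
            continue;
        }
        // Only treat the string as a key if it is followed by ':' and '{'.
        let mut j = i;
        while j < bytes.len() && bytes[j].is_ascii_whitespace() {
            j += 1;
        }
        if j >= bytes.len() || bytes[j] != b':' {
            continue;
        }
        j += 1;
        while j < bytes.len() && bytes[j].is_ascii_whitespace() {
            j += 1;
        }
        if j < bytes.len() && bytes[j] == b'{' {
            return balanced_object(json, j);
        }
    }
    None
}

/// Hypothetical helper: given the index of a '{', returns the slice up to
/// and including its matching '}', ignoring braces inside string literals.
fn balanced_object(json: &str, open: usize) -> Option<&str> {
    let bytes = json.as_bytes();
    let mut depth = 0usize;
    let mut in_string = false;
    let mut i = open;
    while i < bytes.len() {
        match (in_string, bytes[i]) {
            (true, b'\\') => i += 1, // skip the escaped byte
            (true, b'"') => in_string = false,
            (false, b'"') => in_string = true,
            (false, b'{') => depth += 1,
            (false, b'}') => {
                depth -= 1;
                if depth == 0 {
                    return Some(&json[open..=i]);
                }
            }
            _ => {}
        }
        i += 1;
    }
    None // braces never balanced
}
```

Calling extract_object(&contents, "sites") would give back just the text of the sites object, but it is a second hand-written scanner to maintain, which is why it feels hacky to me.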
Thanks