I'm trying to understand how to do streaming JSON parsing in Rust.
For example, I have a JSON file containing a serialized array of billions of Point(i32, i32, i32) structs, and I want to aggregate this data. For a small file I could do:
let data = std::fs::read_to_string("file.txt").unwrap();
let points: Vec<Point> = serde_json::from_str(&data).unwrap();
let aggregated_result = points
    .into_iter()
    .fold((0, 0, 0), |(a, b, c), Point(x, y, z)| (a + x, b + y, c + z));
But the problem here is that this parses the entire file and materializes every point in memory, when we only ever need one element at a time.
How could it be done more efficiently? My current implementation just performs nasty string indexOf operations to locate object boundaries and then calls from_str::<Point>() on each substring, but it feels hacky, unreliable, and unwise.