Json parser for test purpose


#1

Hello,

I am new to RUST that i found at benchmark json :
https://github.com/kostya/benchmarks

I need to have the same test done but with a different json to show my boss the good choice is RUST (very fast)
in the example:
https://github.com/kostya/benchmarks/tree/master/json/json.rs/src
json_pull.rs ou json_struct.rs
they use json like :

{ "x": 0.67641911159407, "y": 0.23669492753045707, "z": 0.11020410911663026, "name": "tvgdmj 4114", "opts": { "1": [ 1, true ] } },
but mine is that way :

{
  "errors": [
    {
      "type": "node",
      "nodeId": "abcd-efgh",
      "nodeUri": "http://example.com",
      "tags": {"site": "lon"},
      "error": "Connection refused",
      "internal": true
    },
    {
      "type": "series",
      "tags": {"site": "lon"},
      "error": "Aggregation too heavy, too many rows from the database would have to be fetched to satisfy the request!",
      "internal": true
    }
  ],
  "result": [
    {
      "hash": "deadbeef",
      "tags": {"foo": "bar"},
      "values": [[1300000000000, 42.0]]
    },
    {
      "hash": "beefdead",
      "tags": {"foo": "baz","tutu":"toto","ici":"maintenant"},
      "values": [[1300000000000, 42.0]]
    }
  ],
  "range": {
    "end": 1469816790000,
    "start": 1469809590000
  },
  "statistics": {}
}

any help would be appriciate and therefor i will have time to learn RUST

Thanks !


#2

If I understand your question correctly, you’re trying to see how fast Rust is at parsing JSON, using your own example.

The most common way is to use Serde: https://serde.rs/ (serde, serde_json, serde_derive crates).

You will need to make a Rust struct that matches layout of your JSON, and derive #[derive(Deserialize)] for it. You can probably take an existing example and rename fields/adjust types to match yours. For nested objects you will need to nest multiple Rust structs.

Rust has a small benchmarking framework built in: https://doc.rust-lang.org/1.16.0/book/benchmark-tests.html


#3

it is exactly that.
but even if I did try i need help to write the Rust struct for my json.
do you have example of json Rust struc complicated …?
thanks


#4

You can break it down:

  • {} is a struct if you know the keys. It’s HashMap<String, String> if you don’t know the keys. Or serde_json::Value if it’s completely dynamic.
  • [] is a Vec
  • numbers are f64 (or u32/u64 if they’re always integers).
  • If something can be undefined, then wrap it in Option<…>

#5

http://play.rust-lang.org/?gist=ead6329abf2c2d33b2a91a87c7cffd49&version=stable&mode=debug


#6

Here is a quick comparison of Rust vs Python on your data.

On my computer Rust takes 2.8 microseconds and Python takes 16.9 microseconds, so the Rust one is around 6 times faster. Keep in mind that there may be other considerations in choosing a language. In particular the Rust code here is many more lines of code, the two implementations behave differently in the case of malformed input, there may be a faster JSON library for Python than the one in the standard library, etc.

Rust

// # Cargo.toml
// [dependencies]
// serde = "1.0"
// serde_derive = "1.0"
// serde_json = "1.0"

#![feature(test)]
#![allow(dead_code)]

#[macro_use]
extern crate serde_derive;

extern crate serde;
extern crate serde_json;
extern crate test;

use std::collections::BTreeMap as Map;

static J: &str = r#"{
  "errors": [
    {
      "type": "node",
      "nodeId": "abcd-efgh",
      "nodeUri": "http://example.com",
      "tags": {"site": "lon"},
      "error": "Connection refused",
      "internal": true
    },
    {
      "type": "series",
      "tags": {"site": "lon"},
      "error": "Aggregation too heavy, too many rows from the database would have to be fetched to satisfy the request!",
      "internal": true
    }
  ],
  "result": [
    {
      "hash": "deadbeef",
      "tags": {"foo": "bar"},
      "values": [[1300000000000, 42.0]]
    },
    {
      "hash": "beefdead",
      "tags": {"foo": "baz","tutu":"toto","ici":"maintenant"},
      "values": [[1300000000000, 42.0]]
    }
  ],
  "range": {
    "end": 1469816790000,
    "start": 1469809590000
  },
  "statistics": {}
}"#;

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct Lucilecoutouly {
    errors: Vec<ApiError>,
    result: Vec<ApiResult>,
    range: Range,
    statistics: Statistics,
}

#[derive(Deserialize)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
struct ApiError {
    #[serde(rename = "type")]
    error_type: ErrorType,
    node_id: Option<String>,
    node_uri: Option<String>,
    #[serde(default)]
    tags: Map<String, String>,
    error: String,
    internal: bool,
}

#[derive(Deserialize)]
#[serde(rename_all = "lowercase")]
enum ErrorType {
    Node,
    Series,
}

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct ApiResult {
    hash: String,
    #[serde(default)]
    tags: Map<String, String>,
    values: Vec<(u64, f64)>,
}

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct Range {
    end: u64,
    start: u64,
}

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct Statistics {}

#[bench]
fn bench_json(b: &mut test::Bencher) {
    b.iter(|| serde_json::from_str::<Lucilecoutouly>(J).unwrap());
}

Python

J = """{
  "errors": [
    {
      "type": "node",
      "nodeId": "abcd-efgh",
      "nodeUri": "http://example.com",
      "tags": {"site": "lon"},
      "error": "Connection refused",
      "internal": true
    },
    {
      "type": "series",
      "tags": {"site": "lon"},
      "error": "Aggregation too heavy, too many rows from the database would have to be fetched to satisfy the request!",
      "internal": true
    }
  ],
  "result": [
    {
      "hash": "deadbeef",
      "tags": {"foo": "bar"},
      "values": [[1300000000000, 42.0]]
    },
    {
      "hash": "beefdead",
      "tags": {"foo": "baz","tutu":"toto","ici":"maintenant"},
      "values": [[1300000000000, 42.0]]
    }
  ],
  "range": {
    "end": 1469816790000,
    "start": 1469809590000
  },
  "statistics": {}
}"""

import json, timeit

def test():
    json.loads(J)

if __name__ == '__main__':
    N = 1000 * 1000
    T = timeit.timeit("test()", number=N, setup="from __main__ import test")
    print("%sns" % int(T / N * 1000 * 1000 * 1000))

#7

I agree with this statement. Make sure your team is comfortable writing Rust as well as comfortable integrating it into the rest of your current technology.

I’m not saying not to try! Just saying that make sure the experience is a good one for all developers involved. Rust is fast, but so are many other languages. Make sure it fits your use case and the team’s abilities.

I’m probably preaching to the choir here but I thought I’d add my two sense reguardless!


#8

thank you a lot

but when y try to build in rust i had this error

error[E0554]: #![feature] may not be used on the stable release channel

Quoting David Tolnay rust_lang@discoursemail.com:


#9

Run rustup default nightly to fix this.