Using json() with response data in reqwest is slow

I am trying to obtain HTTP responses asynchronously, but it is slow, and I am not sure whether my asynchronous approach is actually working.
Removing json::<serde_json::Value>().await makes it faster, but then I can't get the returned data.
I would be grateful for any assistance.

    use std::collections::HashMap;
    use std::fs::read_to_string;
    use std::time::Instant;

    use futures::future::join_all;
    use reqwest::Client;
    use serde::Deserialize;

    // Inferred from the destructuring below: both fields map package
    // names to version requirements.
    #[allow(non_snake_case)]
    #[derive(Deserialize)]
    struct PackageJson {
        dependencies: HashMap<String, String>,
        devDependencies: HashMap<String, String>,
    }

    let package_json_string = read_to_string("package.json").expect("read package.json error");
    let mut query_key_list = vec![];
    let PackageJson {
        dependencies,
        devDependencies,
    }: PackageJson = serde_json::from_str(package_json_string.as_str()).expect("parse error");
    for key in dependencies.keys() {
        query_key_list.push(key)
    }
    for key in devDependencies.keys() {
        query_key_list.push(key)
    }
    let client = Client::new();
    let mut task = vec![];
    let start = Instant::now();
    for key in query_key_list {
        let url = format!("{}{}", "https://registry.npmjs.org/", key);
        let client = client.clone();
        task.push(tokio::spawn(async move {
            let result = client.get(url).send().await;
            // Downloading the body and parsing it into a serde_json::Value.
            let _res = result.unwrap().json::<serde_json::Value>().await.unwrap();
        }));
    }
    join_all(task).await;
    let duration = start.elapsed();
    println!("Time elapsed in expensive_function() is: {:?}", duration);

I have also posted this on Stack Overflow: rust - Using json() with response data in reqwest is slow - Stack Overflow

Is it much slower than downloading it manually using curl or your web browser?

It's much slower than the JS fetch version. It seems to be because of the large size of the returned JSON data.

What is your JS fetch code, and how long do the JS and Rust versions take? Also, are you building in release mode?

Yes, the speed difference is approximately five times: read · GitHub

On my 100 Mbps connection your code takes ~9.7 s to download ~100 MB of data, which means it's near optimal. Again, what's your JS fetch code, and how many seconds do the JS and Rust versions take in your attempt?

2 Likes

Are you compiling in release mode? Parsing is a very fast in-memory task; five times slower is a lot.

1 Like

Please add a link to cross-posts on other forums or Q&A sites to avoid duplicated effort by the community.

2 Likes

OK, added.

var pjson = require('./package.json');
const keys = [...Object.keys(pjson.dependencies), ...Object.keys(pjson.devDependencies)];
let resList = [];
console.time('test-node');
keys.forEach(name => {
  const url = "https://registry.npmjs.org/" + name;
  fetch(url).then(res => res.json()).then(res => {
    // Record each package's name and its "latest" dist-tag.
    resList.push(`${res.name}: ${res["dist-tags"].latest}`);
    if (resList.length === keys.length) {
      console.log(resList);
      console.timeEnd('test-node');
    }
  });
});

It takes 2–3 s; the Rust code takes 48 s 😂

Yes, release mode. I've posted the JS version code above.

This might be suboptimal because the Vec starts empty and slowly grows over time, requiring reallocations.
You can try

let mut task = Vec::with_capacity(query_key_list.len());

To allocate the whole Vec once.

1 Like

On which platform are you working (Windows or Linux, for example), or is it running inside a browser using wasm?

It might be useful to investigate whether the requests are really sent in parallel or not.

To test this, you could compare your current code against an approach where all requests are sent synchronously, one after another, for example something like the sketch below.
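A rough sketch of that sequential baseline, reusing the client and query_key_list names from your snippet:

    // Hypothetical sequential baseline: same requests, no tokio::spawn.
    // If this takes roughly N times longer than the spawned version,
    // the requests really are running in parallel.
    let start = Instant::now();
    for key in &query_key_list {
        let url = format!("https://registry.npmjs.org/{}", key);
        let response = client.get(url).send().await.unwrap();
        let _body = response.json::<serde_json::Value>().await.unwrap();
    }
    println!("sequential version: {:?}", start.elapsed());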

Also, how many requests are you sending ? (What is the size of query_key_list?)

1 Like

I ran your Rust code and it only takes 2 seconds for me (3 in debug mode) using your package.json from the gist.

Try not using join_all. Read more: futures::future::join_all on JoinHandle is too slow · Issue #2401 · tokio-rs/tokio · GitHub

EDIT: It seems this issue has been fixed :slight_smile: Try running cargo-flamegraph
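If you still want to sidestep join_all on JoinHandles, here's a sketch using tokio::task::JoinSet (available since tokio 1.21) as one alternative:

    use tokio::task::JoinSet;

    // JoinSet awaits spawned tasks as they complete, instead of
    // collecting JoinHandles and joining them in spawn order.
    let mut set = JoinSet::new();
    for key in query_key_list {
        let url = format!("https://registry.npmjs.org/{}", key);
        let client = client.clone();
        set.spawn(async move {
            client.get(url).send().await?.json::<serde_json::Value>().await
        });
    }
    while let Some(joined) = set.join_next().await {
        // joined: Result<Result<Value, reqwest::Error>, JoinError>
        let _res = joined.unwrap();
    }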

2 Likes

Some of the responses are large and are served compressed (e.g. the Content-Encoding in vite's response header is gzip).

...
[8.4M] https://registry.npmjs.org/tailwindcss: Response { name: "tailwindcss" }
[19.2M] https://registry.npmjs.org/typescript: Response { name: "typescript" }
[35.7M] https://registry.npmjs.org/vite: Response { name: "vite" }

But reqwest doesn't enable any decompression by default, so it should be manually enabled[1].

reqwest = { version = "*", features = ["gzip", "deflate"] }
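With the feature compiled in, automatic decompression is already on for the default ClientBuilder, but it can also be toggled explicitly; a minimal sketch:

    use reqwest::Client;

    // With the "gzip" feature enabled, auto-decompression is the default;
    // .gzip(true) just makes the intent explicit.
    let client = Client::builder()
        .gzip(true)
        .build()
        .expect("failed to build reqwest client");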

Full Rust code is here, along with some minor fixes that are probably unrelated to speed/performance:

  • better error handling by replacing join_all with the Stream / StreamExt APIs
    • before constructing the Stream, you don't have to store the dep names in a Vec
    let deps = dependencies.keys().chain(devDependencies.keys());
    let res = deps
        .map(|dep| /* async block */)
        // turn futures into an unordered stream
        .collect::<FuturesUnordered<_>>()
        // error handling: ignore the failed cases
        .filter_map(|res| async move { res.ok() })
        // collect and await the result of a stream
        .collect::<Vec<_>>()
        .await;
  • json parsing is usually not the bottleneck, but you can do better: parse only the part you're interested in, just as you already do when parsing package.json (see the sketch below).
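For example, a minimal sketch of such a partial parse, with field names assumed from the npm registry responses shown above:

    use std::collections::HashMap;
    use serde::Deserialize;

    // Deserializing into a small struct lets serde skip the huge
    // "versions" object instead of building it into a serde_json::Value.
    #[derive(Debug, Deserialize)]
    struct RegistryPackage {
        name: String,
        #[serde(rename = "dist-tags")]
        dist_tags: HashMap<String, String>,
    }

    // let pkg: RegistryPackage = client.get(url).send().await?.json().await?;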

  1. According to reqwest's docs, gzip and deflate are enabled in the default ClientBuilder (and thus Client) when the features are enabled. ↩︎

Thank you! After enabling the gzip feature, there was a significant improvement in speed. I will study the demo you provided carefully. I'm surprised by how friendly the community is. Thank you all! :smiling_face_with_three_hearts:

1 Like
