How to optimize http client performance?

I used isahc to build a simple HTTP benchmark tool.
The HTTP server runs on actix-web v3.1.0; the Rust version is 1.47.
The client and the server both run on the same machine:
Windows 10, an 8-core Intel CPU at 1.6 GHz, and 8 GB of memory.

Result: thread-num:8,use-time:12388 ms,speed:12915, so about 13000 requests/s. The tool uses 30%+ CPU and the server 15% CPU.
Against the same server, baton (a tool written in Go) reaches about 40000 requests/s while using 15% CPU, with the server also at 15% CPU.
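
For reference, the reported speed can be reproduced from the other two figures. The sketch below mirrors the integer arithmetic in the tool's final `println!`; the function name `requests_per_second` is made up for illustration, and the inputs are the 8 tasks, 20000 requests per task, and 12388 ms from this post.

```rust
/// Requests per second from a request count and an elapsed time in
/// milliseconds, using integer division as the benchmark tool does.
/// (Hypothetical helper name, not part of the original tool.)
fn requests_per_second(total_requests: u64, elapsed_ms: u64) -> u64 {
    total_requests * 1000 / elapsed_ms
}

fn main() {
    // 8 tasks x 20000 requests each, finished in 12388 ms.
    let total: u64 = 8 * 20_000;
    println!("speed: {}", requests_per_second(total, 12_388)); // prints "speed: 12915"
}
```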

This is a very simple scenario, yet the Rust client does not perform well.
I tried the same test with the actix-web client, reqwest, hyper, surf, ...; all of them stay below 15000 requests/s, worse than baton.

Why? And how to optimize it?
Thanks!

Compiled and run with cargo run --release; the profile is:
codegen-units = 8
codegen-threads=8
#debug = true
lto = true
overflow-checks = true
opt-level = 3
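
As a point of comparison, a release profile aimed purely at runtime speed might look like the sketch below. It assumes the settings above sit under `[profile.release]` in Cargo.toml, which the post does not show. `codegen-threads` is not among the profile keys that Cargo documents, so it is left out here. Whether any of this changes the measured request rate is something only a rerun can tell.

```toml
# Sketch only: assumes these settings belong to [profile.release].
[profile.release]
opt-level = 3
lto = true
codegen-units = 1        # a single unit gives the optimizer the widest view; builds are slower
overflow-checks = false  # the release default; `true` adds checks to integer arithmetic
```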

//--------------------------------------------------------------------------------------------
//Code

use isahc::prelude::*;
use isahc::HttpClient;
use isahc::config::ResolveMap;
use async_std::*;
use std::time::SystemTime;

const threadNum : usize = 8;
const runTimes : u64 = 20000;

#[async_std::main]
async fn main() -> Result<(), isahc::Error> {
    let start = SystemTime::now();
    let mut i : usize = 0;
    let mut handlers = vec![];

    while i < threadNum {
        i += 1;
        let no = i;
        let h = task::spawn(async move {
            let client = HttpClient::builder()
                .max_connections(100)
                .max_connections_per_host(100)
                //.dns_resolve(ResolveMap::new()
                // Send requests for example.org on port 80 to 127.0.0.1.
                //.add("www.example.org", 8080, [127, 0, 0, 1]))
                .build().unwrap();

            let mut i : u64 = 0;
            while i < runTimes {
                i += 1;
                let mut response = client.get("http://127.0.0.1:8080/").unwrap();
                // Print some basic info about the response to standard output.
                // println!("Status: {}", response.status());
                // println!("Headers: {:#?}", response.headers());

                // // Read the response body as text into a string and print it.
                // print!("{}-{},{}",no, i, response.text().unwrap());
            }
        });
        handlers.push(h);
    }
    for h in handlers {
        h.await;
    }

    let interval = start.elapsed().unwrap().as_millis() as u64;

    println!("thread-num:{},use-time:{} ms,speed:{}", 
        threadNum, interval, runTimes * 1000 * (threadNum as u64) / interval);

    Ok(())
}
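
A side note on the timing in the listing above: `SystemTime` is a wall clock and is not guaranteed to be monotonic, so `std::time::Instant` is usually the safer choice for measuring elapsed time. A minimal sketch, where `time_it` is a hypothetical helper and the sleep merely stands in for the request loop:

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work` once and returns how long it took.
/// `Instant` is a monotonic clock, so the measurement is not affected
/// if the system clock is adjusted during the run.
fn time_it<F: FnOnce()>(work: F) -> Duration {
    let start = Instant::now();
    work();
    start.elapsed()
}

fn main() {
    // Stand-in workload; the benchmark would run its request loop here.
    let elapsed = time_it(|| std::thread::sleep(Duration::from_millis(20)));
    println!("use-time: {} ms", elapsed.as_millis());
}
```

This is unlikely to affect the throughput itself; it only makes the reported duration more trustworthy.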

I'm guessing the problem is that HttpClient::get blocks the entire thread, meaning no other tasks can proceed and you end up executing the requests sequentially. What if you try HttpClient::get_async instead?


Thanks for your suggestion.
I tried get_async, but it brought no improvement; the result was even worse. :neutral_face:

use isahc::prelude::*;
use isahc::HttpClient;
use isahc::config::{ResolveMap, DnsCache};
use async_std::*;
use std::time::{SystemTime, Duration};

const threadNum : usize = 8;
const runTimes : u64 = 20000;

#[async_std::main]
async fn main() -> Result<(), isahc::Error> {
    // Record the start time of the whole benchmark run.
    let start = SystemTime::now();
    let mut i : usize = 0;
    //let sys = actix::System::new("test");
    let mut handlers = vec![];

    while i < threadNum {
        i += 1;
        let no = i;
        let h = task::spawn(async move {
            let client = HttpClient::builder()
                .max_connections(10)
                .max_connections_per_host(10)
                .connection_cache_size(10)
                .tcp_keepalive(Duration::from_secs(10))
                .dns_cache(DnsCache::Forever)
                //.dns_resolve(ResolveMap::new()
                // Send requests for example.org on port 80 to 127.0.0.1.
                //.add("www.example.org", 8080, [127, 0, 0, 1]))
                .build().unwrap();

            let mut i : u64 = 0;
            while i < runTimes {
                i += 1;
                let mut response = client.get_async("http://127.0.0.1:8080/").await.unwrap();
                // Print some basic info about the response to standard output.
                // println!("Status: {}", response.status());
                // println!("Headers: {:#?}", response.headers());
                response.text_async().await.unwrap();
                // // Read the response body as text into a string and print it.
                // print!("{}-{},{}",no, i, response.text().unwrap());
            }
        });
        handlers.push(h);
    }
    for h in handlers {
        h.await;
    }

    let interval = start.elapsed().unwrap().as_millis() as u64;

    println!("thread-num:{},use-time:{} ms,speed:{}", 
        threadNum, interval, runTimes * 1000 * (threadNum as u64) / interval);


    Ok(())
}

Since you're comparing with another tool, are you sure they're doing the same thing? I notice you're restricting the clients to 8 parallel requests maximum (actually you're restricting to running 8 instances of get_async and text_async combined), which might not be what baton is doing.
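
To make the effect of that limit concrete: when each of N workers keeps exactly one request in flight, throughput is bounded by N divided by the mean time per request. The sketch below turns the figures quoted earlier in the thread into an implied per-request time. It assumes that baton with 8 concurrent requests also keeps exactly 8 requests in flight, which is the very point in question, and `mean_latency_micros` is a name invented for this example.

```rust
/// Mean time per request in microseconds for a closed loop in which
/// `concurrency` workers each keep exactly one request in flight.
/// Follows from throughput = concurrency / latency.
/// (Hypothetical helper, integer division.)
fn mean_latency_micros(concurrency: u64, requests_per_second: u64) -> u64 {
    concurrency * 1_000_000 / requests_per_second
}

fn main() {
    // 12915 req/s is the isahc result, about 40000 req/s the baton result.
    println!("isahc: {} us per request", mean_latency_micros(8, 12_915)); // 619
    println!("baton: {} us per request", mean_latency_micros(8, 40_000)); // 200
}
```

Under that assumption, each isahc request takes roughly three times as long as a baton request, so raising the number of requests in flight is one way to check whether the gap comes from per-request overhead or from the concurrency limit.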


I have not read baton's code.
I ran it with 8 concurrent requests because my CPU has 8 cores.
I am not sure whether that means 8 threads or 8 async requests. As I understand it, it means 8 goroutines in Go.

-c Number of concurrent requests

baton -c 8 -t 30 -u http://localhost:8080/

====================== Results ======================
Total requests:                               1183048
Time taken to complete requests:          30.0014848s
Requests per second:                            39433
===================== Breakdown =====================
Number of connection errors:                        0
Number of 1xx responses:                            0
Number of 2xx responses:                      1183048
Number of 3xx responses:                            0
Number of 4xx responses:                            0
Number of 5xx responses:                            0
=====================================================

Like many programmers, I have a bad habit: I like to compare performance. :joy:
On my other machine, I compared the servers with Vertx (Java):
Vertx (Java) + baton ==> 27000/s
Actix web server + baton ==> 18000/s
Hyper server + baton ==> 18000/s
Tide + baton ==> 9000/s

All are worse than Vertx.
I know the test conditions are very limited.
But the results make me unhappy, because I think Rust should be number one.
I don't know how to optimize it...

I'd be very interested in trying out similar benchmarks, to see what can be improved and to understand how they work.

Can you also share the code you have been using on the server(s)?

Cargo.toml
[dependencies]
actix-web = "*"

The server code is copied from an actix-web example.

use actix_web::{get, middleware, web, App, HttpRequest, HttpResponse, HttpServer};
#[get("/resource1/{name}/index.html")]
async fn index(req: HttpRequest, name: web::Path<String>) -> String {
    println!("REQ: {:?}", req);
    format!("Hello: {}!\r\n", name)
}

async fn index_async(req: HttpRequest) -> &'static str {
    println!("REQ: {:?}", req);
    "Hello world!\r\n"
}

#[get("/")]
async fn no_params() -> &'static str {
    "Hello world!\r\n"
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    //std::env::set_var("RUST_LOG", "actix_server=info,actix_web=info");
    //env_logger::init();
    HttpServer::new(|| {
        App::new()
            .wrap(middleware::DefaultHeaders::new().header("X-Version", "0.2"))
            .wrap(middleware::Compress::default())
            .wrap(middleware::Logger::default())
            .service(index)
            .service(no_params)
            .service(
                web::resource("/resource2/index.html")
                    .wrap(
                        middleware::DefaultHeaders::new().header("X-Version-R2", "0.3"),
                    )
                    .default_service(
                        web::route().to(|| HttpResponse::MethodNotAllowed()),
                    )
                    .route(web::get().to(index_async)),
            )
            .service(web::resource("/test1.html").to(|| async { "Test\r\n" }))
    })
    .bind("127.0.0.1:8080")?
    .workers(1) // if set to 8, isahc reaches 18000/s and baton reaches 70000/s
    .run()
    .await
}

Tested on my home PC: Windows 10, an 8-core Intel CPU at 1.6 GHz, 8 GB of memory.
VertX + baton ==> 65000/s
VertX + Isahc ==> 25000/s
Actix-Web(8 workers) + baton ==> 70000/s
Actix-Web(8 workers) + Isahc ==> 18000/s

The Vertx server code is copied from https://vertx.io/, without any optimization.
The server and the client both run on the same PC.

package vertx;

import io.vertx.core.AbstractVerticle;
import io.vertx.core.Vertx;

public class ServerTest extends AbstractVerticle {
    private static Vertx vertx = Vertx.vertx();

    // Convenience method so you can run it in your IDE
    public static void main(String[] args) throws Exception {
        vertx.deployVerticle(new ServerTest());
    }

    @Override
    public void start() throws Exception {
        vertx.createHttpServer().requestHandler(req -> {
            req.response().putHeader("content-type", "text/html").end("Hello world");
        }).listen(8080);
    }
}

Tested on my office PC (for reference):
Vertx + baton ==> 27000/s
Vertx + Hyper client ==> about 18000/s
Vertx + reqwest (based on Hyper) ==> about 18000/s
Vertx + Actix web client ==> about 4000/s. It can't be used across threads because its request/response types are not 'Send', unlike the hyper client.
Vertx + Surf (based on Isahc) ==> about 4000/s
Vertx + Isahc ==> about 4000/s. I like Isahc's API; it supports a DNS resolver override and self-signed certificates, which match my requirements.

Maybe this is related:

Thanks for sharing the info.
I've been a bit busy, but I will eventually get back to my own attempts at the benchmark.

Maybe Rust is fine and the actix-web server is fine.
But the gap really exists.
Here is a study by the author of 'isahc'.