Why is my hyper gateway service slow?

As part of a hackday at work, I decided to apply my fledgling Rust skills toward porting a very thin gateway service from Node/Express to Rust/Hyper. This service acts as a CORS proxy: it forwards any request it receives to the URL specified in a "url" query parameter and adds CORS headers to the response.
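For concreteness, a request to the proxy looks roughly like this (the port matches the code below; the upstream URL and Origin are made up):

```shell
# Hypothetical request: the upstream target is passed, URL-encoded,
# in the "url" query parameter, and an Origin header triggers the
# Access-Control-Allow-Origin echo in the response.
curl -i 'http://localhost:3000/?url=https%3A%2F%2Fexample.com%2Fapi' \
  -H 'Origin: https://my.app'
```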

Code:

use std::collections::HashMap;

use hyper_tls::HttpsConnector;
use hyper::{HeaderMap, header, service::{make_service_fn, service_fn}};
use hyper::{Body, Client, Method, Request, Response, Server, StatusCode, Uri};

type HttpClient = Client<HttpsConnector<hyper::client::HttpConnector>>;

// Response messages
const MISSING: &[u8] = b"Route not found.";
const OK: &[u8] = b"OK";
const MISSING_URL_PARAMETER: &[u8] = b"Required \"url\" query parameter not found.";
const INVALID_URL: &[u8] = b"Invalid \"url\" query parameter provided.";

// Parameters
const URL_PARAM: &str = "url";

fn respond_with_message_body(status: StatusCode, body: &'static [u8]) -> Response<Body> {
    Response::builder()
        .status(status)
        .body(body.into())
        .unwrap()
}

fn parse_query_params(req: &Request<Body>) -> HashMap<String, String> {
    req
        .uri()
        .query()
        .map(|query_string| {
            url::form_urlencoded::parse(query_string.as_bytes())
                .into_owned()
                .collect()
        })
        .unwrap_or_else(HashMap::new)
}

fn apply_cors_headers(res_headers: &mut HeaderMap, req_headers: HeaderMap) {
    res_headers.insert(header::ACCESS_CONTROL_ALLOW_CREDENTIALS, header::HeaderValue::from_static("true"));
    res_headers.insert(header::ACCESS_CONTROL_ALLOW_HEADERS, header::HeaderValue::from_static("authorization"));
    res_headers.insert(header::ACCESS_CONTROL_ALLOW_METHODS, header::HeaderValue::from_static("GET,HEAD,PUT,PATCH,POST,DELETE"));
    if let Some(origin) = req_headers.get(header::ORIGIN) {
        res_headers.insert(header::ACCESS_CONTROL_ALLOW_ORIGIN, origin.clone());
    }
}

async fn proxy_request(mut req: Request<Body>, client: HttpClient) -> Result<Response<Body>, hyper::Error> {
    let params = parse_query_params(&req);
    let proxied_url = match params.get(URL_PARAM) {
        Some(url_param) => {
            if let Ok(url) = url::Url::parse(url_param) {
                url
            } else {
                return Ok(respond_with_message_body(StatusCode::BAD_REQUEST, INVALID_URL));
            }
        }
        None => {
            return Ok(respond_with_message_body(StatusCode::BAD_REQUEST, MISSING_URL_PARAMETER))
        }
    };

    // The url crate accepts some URLs that hyper's Uri rejects, so don't unwrap here.
    let uri = match proxied_url.as_str().parse::<Uri>() {
        Ok(uri) => uri,
        Err(_) => return Ok(respond_with_message_body(StatusCode::BAD_REQUEST, INVALID_URL)),
    };
    let req_headers = req.headers().clone();

    *req.uri_mut() = uri;

    let mut response = match client.request(req).await {
        Ok(res) => res,
        Err(e) => {
            return Ok(Response::builder()
                .status(StatusCode::INTERNAL_SERVER_ERROR)
                .body(format!("Could not complete request: \"{}\"", e).into())
                .unwrap())
        }
    };

    apply_cors_headers(response.headers_mut(), req_headers);
    Ok(response)
}

async fn route_request(req: Request<Body>, client: HttpClient) -> Result<Response<Body>, hyper::Error> {
    match (req.method(), req.uri().path()) {
        (&Method::OPTIONS, "/") => {
            let mut response = Response::builder()
                .status(StatusCode::NO_CONTENT)
                .body(Body::empty())
                .unwrap();
            let req_headers = req.headers().clone();
            apply_cors_headers(response.headers_mut(), req_headers);

            Ok(response)
        }
        (_, "/") => proxy_request(req, client).await,
        (&Method::GET, "/healthcheck") => Ok(respond_with_message_body(StatusCode::OK, OK)),
        _ => Ok(respond_with_message_body(StatusCode::NOT_FOUND, MISSING))
    }
}

#[tokio::main]
async fn main() {
    // We'll bind to 127.0.0.1:3000
    let addr = "127.0.0.1:3000".parse().unwrap();

    let https = HttpsConnector::new();
    let client = Client::builder().build::<_, hyper::Body>(https);

    let make_svc = make_service_fn(move |_| {
        let client = client.clone();

        async {
            Ok::<_, hyper::Error>(service_fn(move |req| {
                route_request(req, client.clone())
            }))
        }
    });

    let server = Server::bind(&addr).serve(make_svc);

    // Run this server for... forever!
    if let Err(e) = server.await {
        eprintln!("server error: {}", e);
    }
}

In some rudimentary benchmarks, my Rust service takes about 100ms longer on average than its Node counterpart to complete the same request. My hunch is that the biggest problem is awaiting the response from the Hyper client, which presumably buffers the entire response before returning it, but I'm at a loss for what to do given that I need to modify the response headers.

Would love any guidance on how to improve what's here, even if it's just idiomatic improvements.

Standard question: have you enabled optimizations by building with the --release flag?

The Body object is an abstraction over a stream, so if you're just passing it through, it should nicely stream the response with minimal buffering.

Yep, release is enabled.

What is the full type of response.body()? The docs say:

The body component is generic, enabling arbitrary types to represent the HTTP body. For example, the body could be Vec<u8>, a Stream of byte chunks, or a value that has been deserialized.

and I'm wondering whether client.request(req) reads everything into a Vec<u8>.

"What is the full type of response.body()?"

I'm not sure. This is a gateway/proxy so I don't know the type ahead of time.

I meant the Rust type, not the content/MIME type. I want to check whether it's a Vec<u8>, in which case the body is read fully into memory before being sent back, or a stream, in which case it'd work the way @kornel described.

I believe it's a stream of bytes? response.body() just returns a &Body, but as far as I can tell it's a stream.

A Body is indeed a stream of bytes.

For debugging concurrency stuff like this I'd recommend using tracing and tracing_futures to instrument your code and see what the biggest bottleneck is.

Then if I can't figure it out from that, I'll hook up tracing with Jaeger to visualize requests.