AWS Lambda cold start performance with reqwest

I'm testing cargo lambda for deploying Rust AWS Lambdas. With just the default project that cargo lambda generates, the cold start times were quite fast, but after adding reqwest the cold starts take over a hundred milliseconds. I found a blog post that hinted that the problem might be native-tls, so I've tried to remove all traces of native-tls and am using rustls instead. This is what my Cargo.toml looks like:

[package]
name = "rust-test-lambda"
version = "0.1.0"
edition = "2021"

[dependencies]

lambda_runtime = "0.8.3"
#openapi = "0.1.5"
#opentelemetry = { version = "0.21.0"}
openssl = { version = "0.10.35", features = ["vendored"], default-features = false }
# Due to https://www.cargo-lambda.info/guide/cross-compiling.html#known-cross-compilation-issues
# we must enable `native-tls-vendored`
reqwest = { version = "0.11", features = [ "serde_json", "blocking", "json", "rustls-tls"], default-features = false }
serde = { version = "1.0.136", default-features = false }
serde-this-or-that = { version = "0.4.2",default-features = false}
serde_derive = { version = "1.0.193", default-features = false}
serde_json = { version = "1.0.108", default-features = false}
serde_with = { version = "3.4.0", default-features = false}
#tokio = { version = "1", features = ["macros"], default-features = false }
tracing = { version = "0.1", features = ["log"], default-features = false }
tracing-subscriber = { version = "0.3", features = ["fmt"], default-features = false }

When I run cargo tree, there is no mention of native-tls:

rust-test-lambda v0.1.0 (/Users/tylerthrailkill/Documents/dev/work/rust-test-lambda)
├── lambda_runtime v0.8.3
│   ├── async-stream v0.3.5
│   │   ├── async-stream-impl v0.3.5 (proc-macro)
│   │   │   ├── proc-macro2 v1.0.69
│   │   │   │   └── unicode-ident v1.0.12
│   │   │   ├── quote v1.0.33
│   │   │   │   └── proc-macro2 v1.0.69 (*)
│   │   │   └── syn v2.0.39
│   │   │       ├── proc-macro2 v1.0.69 (*)
│   │   │       ├── quote v1.0.33 (*)
│   │   │       └── unicode-ident v1.0.12
│   │   ├── futures-core v0.3.29
│   │   └── pin-project-lite v0.2.13
│   ├── base64 v0.20.0
│   ├── bytes v1.5.0
│   ├── futures v0.3.29
│   │   ├── futures-channel v0.3.29
│   │   │   ├── futures-core v0.3.29
│   │   │   └── futures-sink v0.3.29
│   │   ├── futures-core v0.3.29
│   │   ├── futures-executor v0.3.29
│   │   │   ├── futures-core v0.3.29
│   │   │   ├── futures-task v0.3.29
│   │   │   └── futures-util v0.3.29
│   │   │       ├── futures-channel v0.3.29 (*)
│   │   │       ├── futures-core v0.3.29
│   │   │       ├── futures-io v0.3.29
│   │   │       ├── futures-macro v0.3.29 (proc-macro)
│   │   │       │   ├── proc-macro2 v1.0.69 (*)
│   │   │       │   ├── quote v1.0.33 (*)
│   │   │       │   └── syn v2.0.39 (*)
│   │   │       ├── futures-sink v0.3.29
│   │   │       ├── futures-task v0.3.29
│   │   │       ├── memchr v2.6.4
│   │   │       ├── pin-project-lite v0.2.13
│   │   │       ├── pin-utils v0.1.0
│   │   │       └── slab v0.4.9
│   │   │           [build-dependencies]
│   │   │           └── autocfg v1.1.0
│   │   ├── futures-io v0.3.29
│   │   ├── futures-sink v0.3.29
│   │   ├── futures-task v0.3.29
│   │   └── futures-util v0.3.29 (*)
│   ├── http v0.2.11
│   │   ├── bytes v1.5.0
│   │   ├── fnv v1.0.7
│   │   └── itoa v1.0.9
│   ├── http-body v0.4.5
│   │   ├── bytes v1.5.0
│   │   ├── http v0.2.11 (*)
│   │   └── pin-project-lite v0.2.13
│   ├── http-serde v1.1.3
│   │   ├── http v0.2.11 (*)
│   │   └── serde v1.0.193
│   │       └── serde_derive v1.0.193 (proc-macro)
│   │           ├── proc-macro2 v1.0.69 (*)
│   │           ├── quote v1.0.33 (*)
│   │           └── syn v2.0.39 (*)
│   ├── hyper v0.14.27
│   │   ├── bytes v1.5.0
│   │   ├── futures-channel v0.3.29 (*)
│   │   ├── futures-core v0.3.29
│   │   ├── futures-util v0.3.29 (*)
│   │   ├── h2 v0.3.22
│   │   │   ├── bytes v1.5.0
│   │   │   ├── fnv v1.0.7
│   │   │   ├── futures-core v0.3.29
│   │   │   ├── futures-sink v0.3.29
│   │   │   ├── futures-util v0.3.29 (*)
│   │   │   ├── http v0.2.11 (*)
│   │   │   ├── indexmap v2.1.0
│   │   │   │   ├── equivalent v1.0.1
│   │   │   │   └── hashbrown v0.14.2
│   │   │   ├── slab v0.4.9 (*)
│   │   │   ├── tokio v1.34.0
│   │   │   │   ├── bytes v1.5.0
│   │   │   │   ├── libc v0.2.150
│   │   │   │   ├── mio v0.8.9
│   │   │   │   │   └── libc v0.2.150
│   │   │   │   ├── num_cpus v1.16.0
│   │   │   │   │   └── libc v0.2.150
│   │   │   │   ├── pin-project-lite v0.2.13
│   │   │   │   ├── socket2 v0.5.5
│   │   │   │   │   └── libc v0.2.150
│   │   │   │   └── tokio-macros v2.2.0 (proc-macro)
│   │   │   │       ├── proc-macro2 v1.0.69 (*)
│   │   │   │       ├── quote v1.0.33 (*)
│   │   │   │       └── syn v2.0.39 (*)
│   │   │   ├── tokio-util v0.7.10
│   │   │   │   ├── bytes v1.5.0
│   │   │   │   ├── futures-core v0.3.29
│   │   │   │   ├── futures-sink v0.3.29
│   │   │   │   ├── pin-project-lite v0.2.13
│   │   │   │   ├── tokio v1.34.0 (*)
│   │   │   │   └── tracing v0.1.40
│   │   │   │       ├── log v0.4.20
│   │   │   │       ├── pin-project-lite v0.2.13
│   │   │   │       ├── tracing-attributes v0.1.27 (proc-macro)
│   │   │   │       │   ├── proc-macro2 v1.0.69 (*)
│   │   │   │       │   ├── quote v1.0.33 (*)
│   │   │   │       │   └── syn v2.0.39 (*)
│   │   │   │       └── tracing-core v0.1.32
│   │   │   │           └── once_cell v1.18.0
│   │   │   └── tracing v0.1.40 (*)
│   │   ├── http v0.2.11 (*)
│   │   ├── http-body v0.4.5 (*)
│   │   ├── httparse v1.8.0
│   │   ├── httpdate v1.0.3
│   │   ├── itoa v1.0.9
│   │   ├── pin-project-lite v0.2.13
│   │   ├── socket2 v0.4.10
│   │   │   └── libc v0.2.150
│   │   ├── tokio v1.34.0 (*)
│   │   ├── tower-service v0.3.2
│   │   ├── tracing v0.1.40 (*)
│   │   └── want v0.3.1
│   │       └── try-lock v0.2.4
│   ├── lambda_runtime_api_client v0.8.0
│   │   ├── http v0.2.11 (*)
│   │   ├── hyper v0.14.27 (*)
│   │   ├── tokio v1.34.0 (*)
│   │   └── tower-service v0.3.2
│   ├── serde v1.0.193 (*)
│   ├── serde_json v1.0.108
│   │   ├── itoa v1.0.9
│   │   ├── ryu v1.0.15
│   │   └── serde v1.0.193 (*)
│   ├── serde_path_to_error v0.1.14
│   │   ├── itoa v1.0.9
│   │   └── serde v1.0.193 (*)
│   ├── tokio v1.34.0 (*)
│   ├── tokio-stream v0.1.14
│   │   ├── futures-core v0.3.29
│   │   ├── pin-project-lite v0.2.13
│   │   └── tokio v1.34.0 (*)
│   ├── tower v0.4.13
│   │   ├── futures-core v0.3.29
│   │   ├── futures-util v0.3.29 (*)
│   │   ├── pin-project v1.1.3
│   │   │   └── pin-project-internal v1.1.3 (proc-macro)
│   │   │       ├── proc-macro2 v1.0.69 (*)
│   │   │       ├── quote v1.0.33 (*)
│   │   │       └── syn v2.0.39 (*)
│   │   ├── pin-project-lite v0.2.13
│   │   ├── tower-layer v0.3.2
│   │   ├── tower-service v0.3.2
│   │   └── tracing v0.1.40 (*)
│   └── tracing v0.1.40 (*)
├── openssl v0.10.60
│   ├── bitflags v2.4.1
│   ├── cfg-if v1.0.0
│   ├── foreign-types v0.3.2
│   │   └── foreign-types-shared v0.1.1
│   ├── libc v0.2.150
│   ├── once_cell v1.18.0
│   ├── openssl-macros v0.1.1 (proc-macro)
│   │   ├── proc-macro2 v1.0.69 (*)
│   │   ├── quote v1.0.33 (*)
│   │   └── syn v2.0.39 (*)
│   └── openssl-sys v0.9.96
│       └── libc v0.2.150
│       [build-dependencies]
│       ├── cc v1.0.83
│       │   └── libc v0.2.150
│       ├── openssl-src v300.1.6+3.1.4
│       │   └── cc v1.0.83 (*)
│       ├── pkg-config v0.3.27
│       └── vcpkg v0.2.15
├── reqwest v0.11.22
│   ├── base64 v0.21.5
│   ├── bytes v1.5.0
│   ├── encoding_rs v0.8.33
│   │   └── cfg-if v1.0.0
│   ├── futures-core v0.3.29
│   ├── futures-util v0.3.29 (*)
│   ├── h2 v0.3.22 (*)
│   ├── http v0.2.11 (*)
│   ├── http-body v0.4.5 (*)
│   ├── hyper v0.14.27 (*)
│   ├── hyper-rustls v0.24.2
│   │   ├── futures-util v0.3.29 (*)
│   │   ├── http v0.2.11 (*)
│   │   ├── hyper v0.14.27 (*)
│   │   ├── rustls v0.21.9
│   │   │   ├── log v0.4.20
│   │   │   ├── ring v0.17.5
│   │   │   │   ├── getrandom v0.2.11
│   │   │   │   │   ├── cfg-if v1.0.0
│   │   │   │   │   └── libc v0.2.150
│   │   │   │   └── untrusted v0.9.0
│   │   │   │   [build-dependencies]
│   │   │   │   └── cc v1.0.83 (*)
│   │   │   ├── rustls-webpki v0.101.7
│   │   │   │   ├── ring v0.17.5 (*)
│   │   │   │   └── untrusted v0.9.0
│   │   │   └── sct v0.7.1
│   │   │       ├── ring v0.17.5 (*)
│   │   │       └── untrusted v0.9.0
│   │   ├── tokio v1.34.0 (*)
│   │   └── tokio-rustls v0.24.1
│   │       ├── rustls v0.21.9 (*)
│   │       └── tokio v1.34.0 (*)
│   ├── ipnet v2.9.0
│   ├── log v0.4.20
│   ├── mime v0.3.17
│   ├── once_cell v1.18.0
│   ├── percent-encoding v2.3.0
│   ├── pin-project-lite v0.2.13
│   ├── rustls v0.21.9 (*)
│   ├── rustls-pemfile v1.0.4
│   │   └── base64 v0.21.5
│   ├── serde v1.0.193 (*)
│   ├── serde_json v1.0.108 (*)
│   ├── serde_urlencoded v0.7.1
│   │   ├── form_urlencoded v1.2.0
│   │   │   └── percent-encoding v2.3.0
│   │   ├── itoa v1.0.9
│   │   ├── ryu v1.0.15
│   │   └── serde v1.0.193 (*)
│   ├── system-configuration v0.5.1
│   │   ├── bitflags v1.3.2
│   │   ├── core-foundation v0.9.3
│   │   │   ├── core-foundation-sys v0.8.4
│   │   │   └── libc v0.2.150
│   │   └── system-configuration-sys v0.5.0
│   │       ├── core-foundation-sys v0.8.4
│   │       └── libc v0.2.150
│   ├── tokio v1.34.0 (*)
│   ├── tokio-rustls v0.24.1 (*)
│   ├── tower-service v0.3.2
│   ├── url v2.4.1
│   │   ├── form_urlencoded v1.2.0 (*)
│   │   ├── idna v0.4.0
│   │   │   ├── unicode-bidi v0.3.13
│   │   │   └── unicode-normalization v0.1.22
│   │   │       └── tinyvec v1.6.0
│   │   │           └── tinyvec_macros v0.1.1
│   │   └── percent-encoding v2.3.0
│   └── webpki-roots v0.25.3
├── serde v1.0.193 (*)
├── serde-this-or-that v0.4.2
│   └── serde v1.0.193 (*)
├── serde_derive v1.0.193 (proc-macro) (*)
├── serde_json v1.0.108 (*)
├── serde_with v3.4.0
│   └── serde v1.0.193 (*)
├── tracing v0.1.40 (*)
└── tracing-subscriber v0.3.18
    ├── sharded-slab v0.1.7
    │   └── lazy_static v1.4.0
    ├── thread_local v1.1.7
    │   ├── cfg-if v1.0.0
    │   └── once_cell v1.18.0
    └── tracing-core v0.1.32 (*)

I am still seeing cold start times of at least 100ms, though. My lambda does need to call out to an HTTPS service, so I cannot avoid TLS.

Here is most of the code, excluding structs. If there's anything else I can provide, please let me know...

mod models;
use reqwest::header::{HeaderMap, HeaderValue, USER_AGENT, CONTENT_TYPE, AUTHORIZATION};
use lambda_runtime::{run, service_fn, Error, LambdaEvent};
use crate::models::redacted::api::{LoginForm, LoginResponse, CalculateRequest, CalculateResponse};
use serde::{Deserialize, Serialize};
use crate::models::UnifiedInput;


static EXTERNAL_COMPANY_ID: &str = "redacted";

/// This is a made-up example. Requests come into the runtime as unicode
/// strings in json format, which can map to any structure that implements `serde::Deserialize`
/// The runtime pays no attention to the contents of the request payload.
#[derive(Deserialize)]
struct Request {
    command: String,
}

/// This is a made-up example of what a response structure may look like.
/// There is no restriction on what it can be. The runtime requires responses
/// to be serialized into json. The runtime pays no attention
/// to the contents of the response payload.
#[derive(Serialize)]
struct Response {
    req_id: String,
    msg: String,
}



fn construct_headers(token: String) -> HeaderMap {
    let mut headers = HeaderMap::new();
    headers.insert(USER_AGENT, HeaderValue::from_static("reqwest"));
    headers.insert(CONTENT_TYPE, HeaderValue::from_static("application/vnd.tri.redacted.idt+json"));
    let bearer_token = format!("Bearer {}", token);
    headers.insert(AUTHORIZATION, HeaderValue::from_str(&bearer_token).unwrap());
    headers.insert("Correlation-Id", "redacted".parse().unwrap());
    headers
}

/// This is the main body for the function.
/// Write your code inside it.
/// There are some code examples at the following URLs:
/// - https://github.com/awslabs/aws-lambda-rust-runtime/tree/main/examples
/// - https://github.com/aws-samples/serverless-rust-demo/
async fn function_handler(event: LambdaEvent<UnifiedInput>) -> Result<CalculateResponse, Error> {
    // Extract some useful info from the request
    let command = event.payload.opportunity.prospect_id;


    let client = reqwest::Client::new();
    let form = LoginForm {
        client_id: "redacted".to_string(),
        scopes: "redacted".to_string(),
        grant_type: "client_credentials".to_string(),
        client_secret: "redacted".to_string(),
    };

    use std::time::Instant;
    let now = Instant::now();

    let res = client.post("https://redacted.com/oauth2/v1/token")
        .form(&form)
        .send()
        // .await.unwrap().text().await.unwrap();
        .await;


    let elapsed = now.elapsed();
    println!("Elapsed: {:.2?}", elapsed);

    let response = res.unwrap();
    let token = match response.status() {
        reqwest::StatusCode::OK => {
            // on success, parse our JSON to an APIResponse
            match response.json::<LoginResponse>().await {
                Ok(parsed) => {
                    println!("Login Succeeded! {:?}", parsed);
                    Ok(parsed.token)
                },
                Err(_) => {
                    println!("Unable to login");
                    Err("Unable to login")
                },
            }
        }
        other => {
            panic!("Uh oh! Something unexpected happened: {:?}", other);
        }
    };

    println!("{:#?}", token);
    let access_token = token.unwrap();

    let calculate_json = r#"
    {
    "key": "value"
  }
  "#;

    let calculate_request: CalculateRequest = serde_json::from_str(calculate_json).unwrap();

    let now = Instant::now();
    let calculate_response = client.post("https://redacted")
        .json(&calculate_request)
        .headers(construct_headers(access_token))
        .send()
        // .await.unwrap().text().await.unwrap();
        .await
        .unwrap();
    let elapsed = now.elapsed();
    println!("Elapsed: {:.2?}", elapsed);

    match calculate_response.status() {
        reqwest::StatusCode::OK => {
            let response_body = calculate_response.text().await.unwrap();
            let response_body_clone = response_body.clone();

            match serde_json::from_str::<CalculateResponse>(&response_body_clone) {
                Ok(parsed) => {
                    println!("Calculate and deserialize succeeded! {:?}", parsed);
                    Ok(parsed)
                },
                Err(err) => {
                    println!("Unable to deserialize {:?} {:?}", err, response_body);
                    Err(Error::try_from("Unable to deserialize").unwrap())
                },
            }
        }
        other => {
            println!("lkdf {:?}", other);
            Err(Error::try_from(calculate_response.text().await.unwrap()).unwrap())
        }
    }
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    tracing_subscriber::fmt()
        .with_max_level(tracing::Level::INFO)
        // disable printing the name of the module in every log line.
        .with_target(false)
        // disabling time is handy because CloudWatch will add the ingestion time.
        .without_time()
        .init();
    run(service_fn(function_handler)).await
}
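
The handler above times only the two HTTP calls. To tell apart the time spent constructing the client from the time spent on the first request (which includes the TLS handshake), each phase can be timed on its own. Below is a minimal std-only sketch; `timed` is a hypothetical helper, not part of lambda_runtime or reqwest, and the closure is a stand-in for real work such as building a client:

```rust
use std::time::{Duration, Instant};

/// Runs `f` and returns its result together with the wall-clock time it took.
/// Hypothetical helper for illustration only.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

fn main() {
    // Stand-in for a synchronous setup step (an assumption; swap in the real one).
    let (sum, init_time) = timed(|| (0..1_000u64).sum::<u64>());
    println!("init: {:.2?} (result {})", init_time, sum);
}
```

This only wraps synchronous closures. For the `.await` calls in the handler, the inline `Instant::now()` / `elapsed()` pattern already used above does the same job.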

One factor is that cold start is very sensitive to binary size, and the default reqwest feature config pulls in quite a bit, for example HTTP/2 support via the h2 crate (from memory, about a megabyte?).

Another, although it strictly isn't counted as cold start time proper, is that the lower-end lambda sizes are just way slower than you might expect, and TLS isn't free. I haven't tracked it down exactly, but I've seen TLS establishment take in the 100ms range in Node at the bottom-end sizes (below 512MB). Since that's mostly in openssl, I don't think that's the fault of JS!

I went ahead and configured the lambda to have 1024MB of memory and I'm still seeing 100ms+ times.

The binary size doesn't seem that big, though I haven't written much Rust, so I'm not sure what size you would expect for the above code and a good number of structs.

❯ ls -al target/lambda/rust-test-lambda/
Permissions Size User            Date Modified Name
.rwxr-xr-x@ 4.6M tylerthrailkill 28 Nov 12:32  bootstrap
.rw-r--r--@ 2.1M tylerthrailkill 29 Nov 15:04  bootstrap.zip

Yep, that's actually pretty good cold start for a 2MB payload: see "Size is (almost) all that matters for optimizing AWS Lambda cold starts" by Adrian Tanasa on Medium.

TLDR: start a weight loss program, I guess! I think using ClientBuilder::http1_only will avoid linking in HTTP/2 support, which is pretty chunky. You probably also want to switch back to native-tls so you don't include the rustls code.

After that it gets a bit tough to slim down. You might find dtolnay/cargo-llvm-lines on GitHub (it counts lines of LLVM IR per generic function) handy, for example.
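
Since binary size is the lever being discussed, one common starting point is Cargo's release profile. The fragment below is a sketch of size-oriented settings, assuming a reasonably recent toolchain (the `strip` option needs Rust 1.59 or newer). Whether these settings change cold start time for this particular function has not been measured here.

```toml
[profile.release]
opt-level = "z"     # optimize for size rather than speed
lto = true          # link-time optimization across crates
codegen-units = 1   # slower builds, more room for optimization
panic = "abort"     # drop unwinding support; a panic then kills the process
strip = true        # strip symbols from the final binary
```

Note that `panic = "abort"` changes behavior, not just size: a panic ends the process instead of unwinding, so it is worth leaving out if the function relies on panics being reported as errors.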

Yep, that's actually pretty good cold start for a 2MB payload:

Are you sure? Pretty much all of our lambdas are written in Kotlin and compiled to native using GraalVM, and they're around 50MB each (we use Drools and have tens of thousands of rules). Most of them have around 500ms-1s cold start times. 100ms for 2MB sounds absolutely horrendous to me.

I'll try out the other things you mention, but I'm quite bad at Rust, so I'll be back to ask more questions, I'm sure.

It seems like the majority of the size actually comes from serde_json, which I'm not sure I can get rid of (or want to get rid of). Hm, that stinks. I really thought it would be faster than that. I'm pretty sure I've seen our Kotlin lambdas have just as fast a cold start, meaning we wouldn't gain much by using Rust for some things.

You could give the serde_lite crate a try?

Never tried it, but it seems to be a decent approach.