Axum request limits

My job involves building and supporting an archive of data files hosted on S3, and I recently used Rust and axum to write a WebDAV interface to this archive. My colleague has been running performance tests against it (yes, built with --release), mostly by using rclone to download a directory containing 188 files. Increasing the number of concurrent requests that rclone uses decreases the total download time, but only up to about 20 concurrent requests; past that point the improvement stops and the runtime plateaus. I can't figure out why the plateau happens where it does, but it seems the server only supports so many concurrent requests, and we'd like to raise this limit somehow.

Some details:

  • The machine the server is running on has 32 CPUs (according to Python's os.cpu_count() function). If I understand tokio's defaults correctly, that means the runtime has 32 worker threads, so the plateau at about 20 sits below even the thread count.

  • I haven't checked exactly what rclone is doing, but the download process should basically go like this:

    • rclone makes a PROPFIND request to the download directory URL on the WebDAV server to get a list of the directory's entries.
    • For each entry, rclone makes a GET request to the corresponding path under the directory URL, and the WebDAV server replies with a redirect to an S3 download URL. This is the part that rclone can parallelize.
  • The WebDAV server is implemented as a tower::Service via tower::service_fn() wrapping a function that operates on an Arc<MyHandler>; a stripped-down sketch of the wiring appears after this list.

  • The server serves two top-level hierarchies, implemented with different code paths that serve the same files in different ways. We've tested downloads from both hierarchies, and both exhibit the apparent 20-request limit.

  • One hierarchy — let's call it /apple — handles requests by making a request to our data archive's REST API and then (for the specific cases under test) making a request to S3 to fetch further file-specific details.

    • The requests to the data archive go through a single reqwest::Client instance.
    • The requests to S3 go through an aws_sdk_s3::Client that (for reasons I won't go into) is instantiated on first use and cached in a moka::future::Cache that should only ever hold one S3 client instance throughout the runtime of the server.
  • The other hierarchy — let's call it /banana — handles requests of the form /banana/:id/:path by making a request to a file server (distinct from the other servers mentioned so far) to download a JSON file for the given :id that describes a tree of files on S3; the details for the given :path are then extracted from the JSON file and returned.

    • The requests to the file server go through another reqwest::Client instance.
    • The JSON files for the last 16 :ids requested are cached in another moka::future::Cache. (Both moka usages are sketched below.)
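
For concreteness, here's a stripped-down sketch of how the service is wired up. MyHandler's fields and the real dispatch logic are elided, and handle() is an illustrative stand-in, but the shape is accurate:

```rust
use std::{convert::Infallible, sync::Arc};

use axum::{body::Body, extract::Request, response::Response, ServiceExt};
use tower::service_fn;

struct MyHandler {
    // reqwest clients, moka caches, etc. elided
}

impl MyHandler {
    // Illustrative stand-in for the real logic, which dispatches on the
    // request path to the /apple and /banana code paths.
    async fn handle(self: Arc<Self>, req: Request) -> Response {
        Response::new(Body::from(format!("requested {}", req.uri().path())))
    }
}

#[tokio::main]
async fn main() {
    let handler = Arc::new(MyHandler {});
    let service = service_fn(move |req: Request| {
        let handler = Arc::clone(&handler);
        async move { Ok::<_, Infallible>(handler.handle(req).await) }
    });

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    axum::serve(listener, ServiceExt::<Request>::into_make_service(service))
        .await
        .unwrap();
}
```

Both moka caches follow the same pattern, roughly like this (the config loading and value types here are placeholders for what the real code does):

```rust
use aws_sdk_s3::Client as S3Client;
use moka::future::Cache;

// Single-entry cache for the S3 client: keyed by (), capacity 1.
fn new_s3_client_cache() -> Cache<(), S3Client> {
    Cache::new(1)
}

// get_with() coalesces concurrent initial calls, so even under load only
// one client should ever be constructed. (The config loading here is a
// placeholder for what the real code does.)
async fn s3_client(cache: &Cache<(), S3Client>) -> S3Client {
    cache
        .get_with((), async {
            let config = aws_config::load_from_env().await;
            S3Client::new(&config)
        })
        .await
}

// The /banana JSON cache, holding entries for the last 16 :ids requested.
// (serde_json::Value stands in for the real value type.)
fn new_json_cache() -> Cache<String, serde_json::Value> {
    Cache::new(16)
}
```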

So, is the apparent 20-request limit due to axum, reqwest, moka, or something else? How can we raise it, or at least analyze the problem further?

  • I haven't been able to find any details about axum's default rate limiting; it seems like it may not even have any. (One way to test this, sketched below, is to impose an explicit concurrency limit and see whether the plateau follows it.)

  • The documentation for reqwest::Client states that it uses connection pooling, but I haven't been able to find details on the limits of this pool. (Should I create a pool of Clients somehow for more concurrency?) The pool settings I could find are sketched below.

  • The moka docs say the cache is "lock-free", so that's probably not limiting anything, but who knows?
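
For reference, these are the pool-related knobs I can find on reqwest's ClientBuilder; as far as I can tell they only govern how many idle connections are kept warm per host, not how many requests can be in flight at once:

```rust
use std::time::Duration;

// Hypothetical tuning, not something we currently do: keep more warm
// connections per host. pool_max_idle_per_host bounds *idle* connections;
// reqwest does not appear to cap in-flight requests by default.
fn make_client() -> reqwest::Client {
    reqwest::Client::builder()
        .pool_max_idle_per_host(64)
        .pool_idle_timeout(Duration::from_secs(90))
        .build()
        .expect("failed to build reqwest client")
}
```

And one experiment I'm considering in order to rule the axum/tower layer in or out: impose an explicit tower concurrency limit and check whether the plateau tracks it. A sketch, with a trivial service standing in for the real one:

```rust
use std::convert::Infallible;

use axum::{body::Body, extract::Request, response::Response, ServiceExt};
use tower::{service_fn, ServiceBuilder};

#[tokio::main]
async fn main() {
    // If the runtime still plateaus at ~20 concurrent requests with this
    // limit set to 64, the cap is coming from somewhere other than the
    // axum/tower layer. Clones of the limited service share one semaphore,
    // so the limit applies across all connections.
    let service = ServiceBuilder::new()
        .concurrency_limit(64)
        .service(service_fn(|_req: Request| async {
            Ok::<_, Infallible>(Response::new(Body::from("hello")))
        }));

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    axum::serve(listener, ServiceExt::<Request>::into_make_service(service))
        .await
        .unwrap();
}
```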

Were you able to expand the connection pool buffer?

No. The only progress I've made is confirming that axum doesn't have any request limits by default.