Minimal Rocket app seems to generate an enormous number of heap allocations

Hey Rust lang community,

I'm playing around with the Rocket web framework. I need a really simple yet fast HTTP application that puts some data from HTTP requests into Redis.

To achieve this I took Rocket (as it looks easy to start with), deadpool for handling connections to Redis, and serde_json for serializing the data from HTTP so it can be stored as a message in a Redis Stream.
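For context, here is a trimmed sketch of roughly what my handler looks like (the `Event` fields, route name, and `events` stream key are made up for illustration, and error handling is reduced to `expect`):

```rust
use deadpool_redis::{redis::cmd, Pool, Runtime};
use rocket::serde::json::Json;
use rocket::{post, routes, State};
use serde::{Deserialize, Serialize};

// Hypothetical payload; the real one has a few more fields.
#[derive(Serialize, Deserialize)]
struct Event {
    user: String,
    action: String,
}

#[post("/event", data = "<event>")]
async fn put_event(pool: &State<Pool>, event: Json<Event>) -> &'static str {
    // Take a pooled connection, re-serialize the body with serde_json,
    // and append it as a message to a Redis Stream.
    let mut conn = pool.get().await.expect("redis connection");
    let payload = serde_json::to_string(&event.into_inner()).expect("serialize event");
    let _id: String = cmd("XADD")
        .arg("events")
        .arg("*")
        .arg("data")
        .arg(&payload)
        .query_async(&mut conn)
        .await
        .expect("XADD failed");
    "ok"
}

#[rocket::launch]
fn rocket() -> _ {
    let cfg = deadpool_redis::Config::from_url("redis://127.0.0.1/");
    let pool = cfg.create_pool(Some(Runtime::Tokio1)).expect("redis pool");
    rocket::build().manage(pool).mount("/", routes![put_event])
}
```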

Once I started to benchmark my few-line application with wrk, I noticed that it does not run fast enough. On my Ubuntu laptop the release build serves only 60K requests/sec. When I threw away all my code and kept a basic "hello world" GET responder, the app gave roughly 80K requests/sec.
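For reference, the wrk invocation looks something like this (the thread/connection counts are just what I happened to pick, not tuned values):

```sh
# 4 threads, 64 connections, 15 seconds against the release build on loopback
wrk -t4 -c64 -d15s http://127.0.0.1:8000/
```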

What I want to achieve is roughly 120-150K requests/sec, so I need a nearly 2x increase in throughput. I tried to tune here and there (the number of workers, for instance) and quickly realized that I don't understand my app's bottlenecks well enough. Digging further, I discovered that the number of heap allocations looks abnormal: Valgrind shows 1.9M allocations after a wrk run of 23K requests, which is about 82 allocations per request. Does that seem reasonable for an app that handles TCP connections, deserializes HTTP headers, then serializes the data and sends it over the next wire?
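In case it matters, this is roughly how I counted: run the server under Valgrind's default memcheck tool, drive it with the same wrk command as above, and read the heap summary it prints when the process exits (the binary name below is hypothetical):

```sh
# memcheck makes the server much slower, which is why this run
# only served ~23K requests
valgrind ./target/release/myapp
# after stopping the server, memcheck's exit summary reports a line like
# "total heap usage: <allocs> allocs, <frees> frees, <bytes> bytes allocated"
```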

Thank you for reading this far. I'm asking for suggestions on what to focus on when profiling and improving the application's performance. Should I swap the web framework for warp or axum? How do I properly hunt for bottlenecks in such a simple app sitting in a complicated environment? Could you share improvements from your past experience with Rust and web frameworks?

Define “reasonable”. I remember a time when my friend had performance trouble with his Android app (in the early days, when it was Harmony/Dalvik based) and found out that one single printf in his code created more than 200 temporary objects.

I would say that 82 allocations per request is more than adequate for a convenience-oriented framework, but if you are planning to set speed records then you, of course, need something else.

That's most definitely achievable (Cloudflare replaced NGINX with the Rust-based Pingora for a reason), but I wouldn't expect to see numbers like that with a general-purpose web framework.

These tend to be oriented more toward “good performance, great ergonomics” than toward the smallest possible overhead.

The general expectation is that your pages take some “reasonable time” to be generated, not that you mostly serve static content.

From my experience most web frameworks are not geared toward such extreme numbers. I suspect Actix-Web may achieve 120-150K; I'm not sure about Axum.

But web servers geared toward serving static content, like NGINX, may do 10x that… they are just not very usable as general-purpose frameworks.

Thank you, @khimru, for pointing to the post about Pingora.
I decided to play with lower-level solutions. Hyper looks good enough. Will post an update on my progress if anybody is interested.
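For anyone curious, the kind of lower-level starting point I mean is roughly this: a minimal sketch assuming hyper 0.14 on the tokio runtime, before any Redis or JSON work is added back in:

```rust
use std::convert::Infallible;
use std::net::SocketAddr;

use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Request, Response, Server};

// Bare-bones responder: no routing, no JSON, just one answer per request.
async fn handle(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new(Body::from("hello")))
}

#[tokio::main]
async fn main() {
    let addr = SocketAddr::from(([127, 0, 0, 1], 8000));
    // One service instance per connection, one `handle` call per request.
    let make_svc =
        make_service_fn(|_conn| async { Ok::<_, Infallible>(service_fn(handle)) });

    if let Err(e) = Server::bind(&addr).serve(make_svc).await {
        eprintln!("server error: {e}");
    }
}
```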
