Techniques for high perf server

What are the best known, tried and tested techniques for designing/implementing a server that needs to deal with a large number of requests?
At the moment I'm using tokio which basically does the following:

fn main()

Any advice more than welcome.
Thank you

What kind of large is a "large number of requests" for you?

I think the answer to your question depends more on what is inside your process_connection() function than the performance of Tokio in socket handling.

If you mean LARGE then think the likes of google search. How many machines can you afford to run this on?


Hi, yeah, good point. Large is about 30k requests per second.
In the process_connection fn what I'm doing is basically parsing the incoming data and creating objects from it, which are then sent "somewhere".

It seems to me that at the bottom of a server you have the operating system, its network drivers, router, TCP/IP stack, process scheduler etc. Already a huge amount of code. At your application there is that process_connection() thing, which does who knows what. In the middle there is Tokio, which is likely negligible in comparison to the rest.

Whatever you do you will end up binding to a socket, listening to it for connections, then spawning some work to handle the connection.
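That bind/listen/spawn skeleton, in std-only form with one OS thread per connection (a sketch; the handle_connection name and its echo body are placeholders for illustration):

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Placeholder for the real per-connection work: read once, echo it back.
fn handle_connection(mut socket: TcpStream) -> std::io::Result<()> {
    let mut buf = [0u8; 4096];
    let n = socket.read(&mut buf)?;
    socket.write_all(&buf[..n])
}

fn serve(listener: TcpListener) -> std::io::Result<()> {
    for stream in listener.incoming() {
        let socket = stream?;
        // One OS thread per connection: simple and spreads compute across
        // cores, but each thread costs a stack, so it scales less far than
        // an async design when most connections just sit waiting.
        thread::spawn(move || {
            let _ = handle_connection(socket);
        });
    }
    Ok(())
}
```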

There are two extreme possibilities here:

  1. Your process_connection() ends up doing a lot of computation which could hog the machine for a long while. Likely it is better to use a synchronous server here, with actual threads, such that the compute work can be done on the other cores in your machine, spreading the load.

  2. Your process_connection() actually does very little work, parsing requests and formatting responses, but relies on something else to do the real work, a database or whatever, in which case it will spend a lot of time waiting on replies from other services. Likely an async server, using Tokio for example, is appropriate, as it can juggle very many of those async tasks that are just waiting on things most of the time.

In short: use sync for doing a lot of computational work, use async for doing a lot of waiting.

The situation is clouded somewhat as Tokio can spread its async tasks to other cores as well.

Bottom line is that you are going to have to measure this. Build some approximation to the job you actually want to do and see how it flies.

Yes, I am aware of the fact that I'll have to measure. That goes without saying. My question was rather: are there some well known, tried and tested techniques in designing a server architecture that help create a high performance app? Kind of a "design pattern", if you like.

I don't know really.

I observe that back in the day it was common for servers to fork a whole new process to handle each connection.

That was a bit heavy duty, so as loads got higher with the internet and web servers, things moved to spawning threads, which had less overhead than processes. I believe that is how the Apache web server evolved.

As loads continued to go up a lighter weight solution for handling lots of connections was found by going asynchronous. Like the NGINX web server.

I guess these are all "patterns" for building servers in themselves.

The "C10K Problem" is a good read about this evolution: C10K Problem: Understanding and Overcoming the 10,000 Concurrent Connections Challenge | Web Hosting Geeks' Blog.

That is a really nice read. Thanks.

You didn't even specify absolute basics. TCP? UDP? HTTP? TLS?

E.g. if you were doing unencrypted UDP you could probably do millions of packets per second on a single core without using async.

As far as the network stack goes this isn't all that much, as long as the connections are well-behaved.
Assuming TCP, you might run into issues if they're malicious, e.g. slow-read attacks: you may be able to handle 30k connection initiations per second, but that's not the same as handling the millions of connections that accumulate if they don't get closed swiftly.
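One cheap mitigation for slow or silent peers (a sketch in plain std; the deadline value passed in is an arbitrary choice, not a recommendation) is to put a read deadline on each accepted socket so a stalled connection errors out instead of accumulating:

```rust
use std::io::Read;
use std::net::TcpStream;
use std::time::Duration;

// Read with a deadline: a peer that dribbles bytes or sends nothing gets a
// WouldBlock/TimedOut error instead of holding the connection open forever.
fn read_with_deadline(
    socket: &mut TcpStream,
    buf: &mut [u8],
    deadline: Duration,
) -> std::io::Result<usize> {
    socket.set_read_timeout(Some(deadline))?;
    socket.read(buf)
}
```

The error kind is platform-dependent (WouldBlock on Unix, TimedOut on Windows), so callers should treat both as a timeout.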

That's kinda vague, that could easily be more work than the network part. Or not.

The techniques depend on where your bottlenecks are, what edge-cases you want to deal with, how you want to trade off latency vs. throughput etc.

Just start writing a prototype and load-test it. If it doesn't hold up then post a concrete issue you're encountering.


Yes, you are right: It is TCP

I suggest reading here:

and a follow-up here:

[ I have no connection to Cloudflare at all ]


Thank you. Indeed, very interesting reads.