Is this the fastest idiomatic code for a simple counter TCP server?

Hello! I was experimenting with std::net for writing a TCP server as part of a personal project. While working on this, I was going to try out

  • a single-threaded server using std::net::TcpListener
  • a multi-threaded server using a single thread for the listener, but spawning threads to handle each stream
  • a multi-threaded server with multiple threads listening (using reuse_port and reuse_address, probably using the net2 crate)
  • a server using Tokio’s TcpListener and tasks to handle each stream
  • a server using Tokio’s TcpListener and multiple threads listening, as in this gist by Alex Crichton
  • [edit] After reading about Redis’ design, also considering directly using mio’s non-blocking TcpListener

While trying to familiarise myself with all these available options, I decided to see how fast the simplest architecture, a single-threaded server, would be for my use case: a request counter. My project was going to be a metric collector where clients could increment and retrieve the current value of a metric, but I decided to build the simplest case of a single counter first.

After some trial and error, I wrote the code below. Using another small Rust binary that spawns 4 threads and makes 100k requests on each thread, I measured its performance at around 30k requests/second. I was wondering whether:

  1. I’m making some silly mistakes in the server code, the measurement or this idea of testing speed in the first place.
  2. The code below is the fastest (while still being somewhat idiomatic) single-threaded TCP server that counts the number of requests made.

Sending “inc” increments the counter and returns “ok” and sending “read” returns the current value of the counter.

use std::io;
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

fn main() {
    run_server().unwrap();
}

fn run_server() -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:9000")?;
    let mut c = 0i64;
    for stream in listener.incoming() {
        handle_client(stream?, &mut c);
    }
    Ok(())
}

fn handle_client(mut stream: TcpStream, counter: &mut i64) {
    // In my local tests, all input was less than 8 bytes, so I thought using
    // an 8 byte buffer would be fastest?
    let mut buf = [0u8; 8];
    // Also tried BufReader, but in this case we're dealing with <16 bytes
    // per request, so it probably doesn't make sense?
    let num_bytes_read = stream.read(&mut buf).unwrap();
    // Noticed that operating on the buffer was faster than String, probably
    // because of the overhead of from_utf8?
    if num_bytes_read >= 6 && &buf[0..4] == b"read" {
        // A little worried about the overhead of UTF-8 strings here.
        stream.write(counter.to_string().as_bytes()).unwrap();
    } else if num_bytes_read >= 5 && &buf[0..3] == b"inc" {
        *counter = *counter + 1;
        stream.write("ok".as_bytes()).unwrap();
    } else {
        // Not actually hitting this branch in the tests I was running.
        stream.write("unknown".as_bytes()).unwrap();
    }
}

Thanks!

Note that there’s no guarantee you’re going to get a whole message in one read operation. Since TCP is a streaming protocol, you might get this from two successive reads:

  • b"rea"
  • b"d\r\ninc\r\n"

A loop that keeps reading until some end-of-frame condition is met is usually required, or read_exact if you know the number of bytes in advance.
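To make that concrete, here is a minimal sketch of such a loop, assuming each message ends in "\n" (optionally preceded by "\r"), as in the example above. The names read_frame, read_all_frames and Chunked are made up for illustration; Chunked is only a stand-in for a socket that delivers the bytes in awkward pieces. BufRead::read_until does the looping: it keeps pulling from the underlying reader until it sees the delimiter or hits end of stream.

```rust
use std::collections::VecDeque;
use std::io::{self, BufRead, BufReader, Read};

/// Reads one newline-terminated frame into `frame`, without the terminator.
/// Returns Ok(false) on a clean end of stream (no more frames).
fn read_frame<R: BufRead>(reader: &mut R, frame: &mut Vec<u8>) -> io::Result<bool> {
    frame.clear();
    // read_until calls the underlying reader as often as needed,
    // so a frame split across several reads is reassembled here.
    let n = reader.read_until(b'\n', frame)?;
    if n == 0 {
        return Ok(false);
    }
    if frame.last() != Some(&b'\n') {
        // The peer closed the connection in the middle of a frame.
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "stream ended mid-frame",
        ));
    }
    frame.pop(); // drop '\n'
    if frame.last() == Some(&b'\r') {
        frame.pop(); // drop optional '\r'
    }
    Ok(true)
}

/// Collects every frame from `source` (for example a TcpStream).
fn read_all_frames<R: Read>(source: R) -> io::Result<Vec<Vec<u8>>> {
    let mut reader = BufReader::new(source);
    let mut frame = Vec::new();
    let mut frames = Vec::new();
    while read_frame(&mut reader, &mut frame)? {
        frames.push(frame.clone());
    }
    Ok(frames)
}

/// Hypothetical test double: hands out one pre-cut chunk per read call,
/// imitating a socket that splits messages at arbitrary points.
struct Chunked {
    chunks: VecDeque<Vec<u8>>,
}

impl Read for Chunked {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let chunk = match self.chunks.pop_front() {
            Some(chunk) => chunk,
            None => return Ok(0), // end of stream
        };
        let n = chunk.len().min(buf.len());
        buf[..n].copy_from_slice(&chunk[..n]);
        if n < chunk.len() {
            // Keep whatever did not fit for the next call.
            self.chunks.push_front(chunk[n..].to_vec());
        }
        Ok(n)
    }
}
```

In the server, the same read_frame could be called on a BufReader wrapped around the TcpStream. Whether the extra buffering costs anything measurable for requests this small is something you would have to benchmark.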

Similarly, write doesn’t necessarily write the whole reply; use write_all for that. write makes exactly one syscall, which might not be able to write everything if the send buffer fills up. write_all loops internally until everything is sent or an error occurs.
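A small sketch of the difference, using a made-up ShortWriter type that accepts at most a few bytes per call as a stand-in for a socket whose send buffer is nearly full (the type and the send_reply helper are assumptions for illustration, not part of std):

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a congested socket: every write call
/// accepts at most `limit` bytes and reports how many it took.
struct ShortWriter {
    sent: Vec<u8>,
    limit: usize,
}

impl Write for ShortWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.limit);
        self.sent.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Sends the whole reply; write_all retries until every byte has been
/// accepted or an error is returned.
fn send_reply<W: Write>(out: &mut W, reply: &[u8]) -> io::Result<()> {
    out.write_all(reply)
}
```

With a limit of 3, a single write of b"unknown" only gets b"unk" out and returns 3, while send_reply delivers all seven bytes. In handle_client this would mean replacing each stream.write(...) with stream.write_all(...).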

These conditions are rarely met while testing locally, which is why everything seems to work fine until the messages get bigger, the two endpoints are far apart, packets get lost, and so on.


Ah, of course - that makes sense. Thanks for pointing this out, will fix the code!