Basic HTTP server implementation

Hello

I have used the Actix-web framework quite a lot in previous projects. For learning purposes, and to experiment a bit, I wanted to implement a simple HTTP server myself with fewer abstractions and at a lower level. By "low-level" I mean not relying on many libraries or frameworks but implementing as much as possible on my own. I did this by using TCP streams and reading the content of each stream into a buffer, after which I manually parse the resulting string and process it.
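
To illustrate the manual parsing step, here is a minimal sketch of reading the method and path from the first line of a raw request. The names `RequestLine` and `parse_request_line` are hypothetical and not taken from the code further down. Unlike that code, the sketch returns `None` for an incomplete line instead of indexing into a vector.

```rust
/// Method and path taken from the first line of a raw HTTP request.
#[derive(Debug, PartialEq)]
struct RequestLine<'a> {
    method: &'a str,
    path: &'a str,
}

/// Parse a request line such as "GET /welcome/foobar HTTP/1.1".
/// Returns None instead of panicking when the line is incomplete.
fn parse_request_line(raw: &str) -> Option<RequestLine<'_>> {
    let first_line = raw.lines().next()?;
    let mut parts = first_line.split_whitespace();
    let method = parts.next()?;
    let path = parts.next()?;
    // The third token is the protocol version; require it to be present.
    parts.next()?;
    Some(RequestLine { method, path })
}
```

A caller could answer with a 400 response whenever this returns `None`.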

I now have a working implementation that runs smoothly. Just like in a real framework, the programmer can predefine routes and route functions. The main function holds a vector of these routes, which the programmer can easily edit.

The current version supports multiple HTTP methods such as GET and POST, custom URL parameters, ...
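
As a rough sketch of how custom URL parameters can be matched, the function below compares a route pattern and a request path segment by segment, without a regex. The name `match_route` is hypothetical. It assumes a placeholder is a whole segment written as `{name}`, and it accepts any non-empty segment as a value, which is looser than the `\w+` check used in the code further down.

```rust
use std::collections::HashMap;

/// Match `path` against `pattern`; segments like `{name}` capture a value.
/// Returns the captured parameters, or None when the path does not match.
fn match_route(pattern: &str, path: &str) -> Option<HashMap<String, String>> {
    let pattern_segments: Vec<&str> = pattern.split('/').collect();
    let path_segments: Vec<&str> = path.split('/').collect();
    if pattern_segments.len() != path_segments.len() {
        return None;
    }

    let mut params = HashMap::new();
    for (pat, seg) in pattern_segments.iter().zip(&path_segments) {
        // `{name}` yields Some("name"); a literal segment yields None.
        let name = pat.strip_prefix('{').and_then(|rest| rest.strip_suffix('}'));
        match name {
            // A placeholder captures any non-empty segment.
            Some(name) if !seg.is_empty() => {
                params.insert(name.to_string(), seg.to_string());
            }
            Some(_) => return None,
            // A literal segment must match exactly.
            None if pat == seg => {}
            None => return None,
        }
    }
    Some(params)
}
```

Matching and extraction happen in one pass here, so the route lookup would not need a separate check step followed by an extraction step.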

Notes:

  • This is purely experimental learning code. There is no checking for edge-cases or malformed requests. I kept it basic.
  • In my code I have tagged some comments with [REVIEW] in places where I think the code can be improved or where I have questions.
  • Currently everything is in one file for simplicity. In a real project I would put Routes in a separate module, so the programmer has a dedicated file for defining their route functions.

Questions:

  • In my code I have tagged comments with questions using [REVIEW]. Most of them ask about better ways to implement something or about more elegant solutions.
  • I would also like to know whether the performance of my code can be improved anywhere. For example, I use clone() in the goto function to turn the mutable reference to the request into an owned value for the route function. In the route functions I also use cloned() to get owned copies of the URL-parameter values. Can this be improved?
  • Most important: given that we agree this code shouldn't be used in production, how is the overall quality and readability of my code? Would you consider it somewhat professional, or rather the code of a total beginner?
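
On the clone() and cloned() question, one possible direction is sketched below. It assumes the dispatcher can take the request by value, and that a borrowed `&str` is enough for formatting. `Request`, `welcome` and `dispatch` are simplified, hypothetical stand-ins for the Http struct, a route function and goto.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the request struct; only the parameters matter here.
struct Request {
    params: HashMap<String, String>,
}

/// Borrow the parameter as `&str` instead of cloning the String.
fn welcome(req: Request) -> String {
    let name = req.params.get("name").map(String::as_str).unwrap_or("");
    format!("Welcome {}", name)
}

/// Taking the request by value lets the dispatcher set `params` and then
/// move the request into the handler, so no `clone()` is needed.
fn dispatch(
    mut req: Request,
    params: HashMap<String, String>,
    handler: fn(Request) -> String,
) -> String {
    req.params = params;
    handler(req)
}
```

Whether this matters for performance is a separate question; for requests this small the clones are probably cheap, so the main gain would be in readability.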

Example output:

$ curl http://127.0.0.1:8080
This is the homepage.
$ curl http://127.0.0.1:8080/welcome/foobar
Welcome foobar
$ curl http://127.0.0.1:8080/welcome/foobar/25
Welcome foobar, your age is 25
$ curl -X POST http://127.0.0.1:8080/print -d '{"name": "foobar"}'
# Server output:
# Name foobar

Code:

// Tokio
use tokio_tungstenite::{connect_async, tungstenite::protocol::Message};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};
// Futures
use futures_util::{future, pin_mut, StreamExt};
use futures::FutureExt;
use futures::future::BoxFuture;
// String manipulation (JSON, serialization, regexes,...)
use serde::{Deserialize, Serialize};
use regex::Regex;
use serde_json;
// std
use std::sync::Arc;
use std::error::Error;
// Other
use async_trait::async_trait;
use reqwest;
use url;

// Some seemingly unused imports are actually used in the excluded modules.
//mod api;
//use api::ExchangeAPI;

// Define type for tuple containing routes. (HTTP-method, path, fn pointer to route function).
//
// [REVIEW] I have this type defined so the programmer can define routes in the main function like
// this:
// ("GET", "/my/path", |r, h| Routes::index(r, h).boxed())
// This seemed the cleanest way to do it? Is there a better way? Would you maybe use a struct
// instead?
type Route = (&'static str, &'static str, fn(&Routes, Http) -> BoxFuture<Result<HttpResponse, String>>);

// Macros
//
// This macro improves the readability of the code. Instead of
// ("GET", "/", |r, h| Routes::index(r, h).boxed()) the programmer can just
// type route!("GET", "/", Routes::index).
macro_rules! route {
    ($m:expr, $p:expr, $fn:path) => {
        ($m, $p, |r, h| $fn(r, h).boxed())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    
    // Start server.
    let server = TcpListener::bind("127.0.0.1:8080").await.expect("Failed to bind to 127.0.0.1:8080");
    println!("Listening on 127.0.0.1:8080...");

    // Define all your routes here.
    let routes: Arc<std::vec::Vec<Route>> = Arc::new(vec![
        route!("GET", "/", Routes::index),
        route!("GET", "/welcome/{name}", Routes::welcome),
        route!("GET", "/welcome/{name}/{age}", Routes::welcome_age),
        route!("POST", "/print", Routes::print_name),
        route!("GET", "/html-page", Routes::html_page),
    ]);

    // Spawn a task to handle each incoming stream.
    // [REVIEW] I currently pass the vector of routes as a parameter to the
    // functions which require the routes. Would using lazy_static be cleaner?
    loop {
        let cloned_routes = routes.clone();
        let (mut stream, _) = server.accept().await.unwrap();
        tokio::spawn(async move {
            if let Err(e) = process(&mut stream, &cloned_routes).await {
                eprintln!("Error: {}", e);
            }
        });
    }

}

// Process the incoming streams.
async fn process(stream: &mut TcpStream, routes: &std::vec::Vec<Route>) -> Result<(), Box<dyn Error>> {
   
    // Buffer holding the stream in bytes.
    let mut buffer = [0; 1024];
    // `read` returns how many bytes were actually written into the buffer.
    let n = match stream.read(&mut buffer).await {
        Ok(n) => n,
        Err(e) => {
            eprintln!("Error: {}", e);

            let response = Routes::internal_server_error().await;
            stream.write_all(response.to_string().as_bytes()).await.unwrap();
            stream.flush().await.unwrap();

            return Ok(());
        }
    };

    // Create a string from the bytes that were read, not from the whole buffer.
    let stream_string = String::from_utf8_lossy(&buffer[..n]);

    // Turn the raw stream as string into a workable Http-object.
    let mut http = Http::from_str(&stream_string);
    
    // Check if there is a route matching the HTTP-method and path.
    let response = match Routes::goto(&mut http, routes).await {
        Ok(r) => r,
        Err(e) => {
            eprintln!("Error: {}", e);
            Routes::internal_server_error().await
        }
    };

    // Write response to stream as answer.
    stream.write_all(format!("{}\n", response).as_bytes()).await.unwrap();
    stream.flush().await.unwrap();

    Ok(())
}

// Struct for incoming HTTP requests.
#[derive(Clone)]
struct Http {
    method: String,
    path: String,
    host: String,
    params: std::collections::HashMap<String, String>,
    body: String,
}

impl Http {
    // Generate Http-object from raw stream string.
    // Note: This function assumes that all incoming requests are well formed in the expected
    // format.
    fn from_str(str: &str) -> Self {
        let lines : std::vec::Vec<&str> = str.lines().collect();
        let request_line : std::vec::Vec<&str> = lines[0].split_whitespace().collect();
        
        let method = request_line[0];
        let path = request_line[1];
        let host = lines[1].split_whitespace().collect::<std::vec::Vec<&str>>()[1];
      
        // [REVIEW] Here I extract the body from the request. The logic is that in the request
        // the headers are always beneath each other and the body comes after the headers,
        // following an empty line (AKA the separator). Is there a more elegant way to do this?
        let mut body = "".to_string();
        let mut sep_found = false;
        for line in lines {
            if line == "" {
                sep_found = true;
                continue;
            }

            if sep_found {
                body = format!("{}\n{}", body, line);
            }
        }

        // Remove trailing zero-bytes.
        body = body.chars().filter(|&c| c != '\0').collect();

        Http {
            method: method.into(),
            path: path.into(),
            host: host.into(),
            // Params will be set in Route::goto where we find the matching route.
            params: std::collections::HashMap::new(),
            body: body,
        }
    }
}

// Struct for a HTTP Response.
// This way we can easily define a response and use the string formatter to write
// the response to the stream as string.
struct HttpResponse {
    status: u16,
    content_type: String,
    body: String,
}

// Convert HttpResponse to string.
impl std::fmt::Display for HttpResponse {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        // HTTP separates header lines with CRLF ("\r\n"), not a bare "\n".
        write!(
            f,
            "HTTP/1.1 {status}\r\nContent-Type: {content_type}\r\n\r\n{body}",
            status = self.status,
            content_type = self.content_type,
            body = self.body
        )
    }
}

// Struct Routes holds the following:
// - A method `goto` to find and execute the fn pointer of the route matching the HTTP method
// and path.
// - Defined custom routes.
// - Predefined standard routes (e.g. 404, 500)
struct Routes;
impl Routes {

    // Find fn pointer to execute the route given the Http-method and path from the incoming
    // request.
    async fn goto(req: &mut Http, routes: &std::vec::Vec<Route>) -> Result<HttpResponse, String> {
        
        if let Some(r) = routes.iter().find(|&&(m, p, _)| m == req.method && check_path(p, &*req.path)) {
            let (_, p, ro) = *r;
            req.params = extract_params(p, &*req.path);
            return ro(&Routes, req.clone()).await;
        } else {
            return Ok(Routes::not_found().await);
        }

        // A route can contain custom parameters e.g /user/{name}. By formatting
        // the defined routes into a regex string we can check if the defined path
        // matches the given path incoming from the request.
        //
        // :param p: Path we predefined in our routes.
        // :param path: Path extracted from the incoming request.
        fn check_path(p: &str, path: &str) -> bool {
            
            let re_custom_param = Regex::new(r"\{(\w+)\}").unwrap();
            let formatted_path = re_custom_param.replace_all(p, "(\\w+)");
            let regex_str_path = format!("^{}$", formatted_path);

            let re = Regex::new(&regex_str_path).unwrap();
            re.is_match(path)
        }

        // Extract the parameters from the path given in the request and make it
        // workable in a HashMap.
        // E.g:
        // p = /user/{firstname}/{lastname}, path = /user/foo/bar, becomes:
        // {"firstname": "foo", "lastname": "bar"}
        //
        // :param p: Path we predefined in our routes.
        // :param path: Path extracted from the incoming request.
        fn extract_params(p: &str, path: &str) -> std::collections::HashMap<String, String> {
            let mut params : std::collections::HashMap<String, String> = std::collections::HashMap::new();

            // [REVIEW] This code works. But I think there should be a more elegant, functional and
            // 'Rusty' way of achieving this instead of splitting two strings and looping over
            // them.

            // Split both paths in pieces.
            let p_pieces = p.split("/");
            let path_pieces = path.split("/").collect::<std::vec::Vec<&str>>();

            // Loop over the pieces of our predefined route path. If a piece
            // contains {...} AKA is a custom parameter:
            // Add the string between the brackets as key and add the value of the
            // path from the request as value to the hashmap.
            let mut i = 0;
            for p_piece in p_pieces {
                let re = Regex::new(r"\{(\w+)\}").unwrap();
                if re.is_match(p_piece) {
                    let param_name = p_piece.replace("{", "").replace("}", "");
                    params.insert(param_name, path_pieces[i].into());
                }

                i = i + 1;
            }

            params
        }
    }

    // Custom routes.
    // 
    async fn index(&self, req: Http) -> Result<HttpResponse, String> {
        Ok(HttpResponse {
            status: 200,
            content_type: "text/plain".into(),
            body: "This is the homepage.".into(),
        })
    }
   
    // [REVIEW] (extra) How would I implement something like this?
    //
    // A kind of macro to do the following:
    // #[Route("GET", "/welcome/{name}")]
    async fn welcome(&self, req: Http) -> Result<HttpResponse, String> {
        let name = req.params.get("name").cloned().unwrap_or_else(|| "".to_string());

        Ok(HttpResponse {
            status: 200,
            content_type: "text/plain".into(),
            body: format!("Welcome {}", name.to_string()),
        })
    }
    
    async fn welcome_age(&self, req: Http) -> Result<HttpResponse, String> {
        let name = req.params.get("name").cloned().unwrap_or_else(|| "".to_string());
        let age = req.params.get("age").cloned().unwrap_or_else(|| "".to_string());

        Ok(HttpResponse {
            status: 200,
            content_type: "text/plain".into(),
            body: format!("Welcome {}, your age is {}", name.to_string(), age.to_string()),
        })
    }
    
    async fn print_name(&self, req: Http) -> Result<HttpResponse, String> {
        // Define the expected data from the POST-request.
        #[derive(Serialize, Deserialize)]
        struct Data {
            name: String,
        }

        // Make Data object from POST request body. Return error 400
        // if sent data is malformed.
        let data: Data = match serde_json::from_str(&req.body) {
            Ok(d) => d,
            Err(_) => {
                return Ok(HttpResponse {
                    status: 400,
                    content_type: "text/plain".into(),
                    body: "".to_string(),
                });
            }
        };

        // Print out the name. Normally we would do something advanced
        // like a database query or smth.
        println!("Name {}", data.name);

        Ok(HttpResponse {
            status: 200,
            content_type: "text/plain".into(),
            body: "".to_string(),
        })
    }

    // [REVIEW] Not all route-functions use the parameter 'req'. But all route-functions must have it because
    // of the defined type Route. How can I solve this, or suppress these warnings at function level, without using
    // #![allow(unused)], which makes the compiler ignore them for all my code?
    async fn html_page(&self, req: Http) -> Result<HttpResponse, String> {
        let html = 
        r#"
            <h1>Hello</h1>
            <p>This is html.</p>
        "#;

        Ok(HttpResponse {
            status: 200,
            content_type: "text/html".into(),
            body: html.into(),
        })
    }

    // Standard routes.
    //
    async fn not_found() -> HttpResponse {
        HttpResponse {
            status: 404,
            content_type: "text/plain".into(),
            body: "Not found.".into(),
        }
    }

    async fn internal_server_error() -> HttpResponse {
        HttpResponse {
            status: 500,
            content_type: "text/plain".into(),
            body: "Internal server error.".into(),
        }
    }
}
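
Regarding the [REVIEW] comment about extracting the body in Http::from_str: one option is `str::split_once`, which splits at the first blank line and so avoids the loop and the flag. The helper below is a hypothetical sketch, not part of the code above. It assumes the whole request is available as a single string.

```rust
/// Split a raw HTTP request into its head (request line plus headers) and body.
fn split_head_body(raw: &str) -> (&str, &str) {
    raw.split_once("\r\n\r\n")
        // Fall back to a bare "\n\n" for clients that do not send CRLF.
        .or_else(|| raw.split_once("\n\n"))
        // No blank line at all: treat everything as head, with an empty body.
        .unwrap_or((raw, ""))
}
```

Unlike the loop, this keeps the body exactly as sent, without the extra leading newline, and nothing is dropped if the body itself contains blank lines.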

Cargo.toml:

[package]
name = "rust-api"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
reqwest = "0.11.23"
futures = "0.3.0"
base64 = "0.21.5"
hmac = "0.12.1"
sha2 = "0.10.8"
hex = "0.4"
async-trait = "0.1.74"
tokio = { version = "1.35.1", features = ["macros", "io-std", "io-util", "net", "rt-multi-thread"] }
tokio-tungstenite = { version = "0.21.0", features = ["native-tls"] }
futures-util = "0.3.29"
futures-channel = "0.3"
url = "2.5.0"
serde_json = "1.0.108"
serde = { version = "1.0.193", features = ["derive"] }
tokio-util = "0.7.10"
tokio-stream = "0.1.14"
lazy_static = "1.4"
regex = "1.10.2"
