Infantile HTTP Parsing in Rust

I've been eyeing Rust for a while. It seems great, but when I write code, the language's rich feature set often leaves me with analysis paralysis, much more than languages like Go and C do.
Recently, I started writing an application to communicate with mpv remotely via form data on websites and an HTTP API. Rather than reaching for a mature crate for HTTP parsing, I decided to implement the features I need myself in Rust (namely parsing POST request data). My main intent for the parser is to use Rust's composite types and type system to ensure correctness of the data, since this approach differs from the languages I primarily use.
Feel free to take a look below and share your thoughts; I know it's nothing novel.
There were design decisions that I would appreciate input on, such as:

  • Using String types in structs (instead of Cow<str> or Rc<str>)
  • Using Vec<String>
  • Composite enums (Headers enum)
  • Maybe converting Header.version to an enum for easier and faster matching
  • Effectiveness of enforcing HTTP in the type system (so many options)
use std::io::{Error,ErrorKind};
use std::io::{BufRead, BufReader};
// parser doesn't aim to be complete
// just enough so it can read data from browser requests
// and reply
#[derive(Debug)]
enum Method {
    POST,
}
//struct URL<'a>(&'a str); // maybe do some processing to make a real string type
#[derive(Debug)]
enum ContentTypes {
   URLENCODED,
}
#[derive(Debug)]
enum Connections{
    KeepAlive,
}
#[derive(Debug)]
enum Headers {
    ContentType(ContentTypes),
    Connection(Connections),
    ContentLength(usize), // u8 would cap the body length at 255 bytes
    Host(String),
}
#[derive(Debug)]
pub struct Header{ // pub because the pub fn parse_header returns it
    method: Method,
    location: String,
    version: String,
    headers: Vec<Headers>,
}
/// Given any byte-like input (anything implementing `AsRef<[u8]>`), parse the request line and the headers that follow it
pub fn parse_header<T: AsRef<[u8]>>(data:T) -> std::io::Result<Header>{
    let mut reader = BufReader::new(data.as_ref());
    let mut line: String = String::with_capacity(1024);
    reader.read_line(&mut line)?;
    let mut line = line.split_ascii_whitespace();
    if let (Some(method), Some(location), Some(version)) = (line.next(),line.next(),line.next()) {
        let method = match method {
            "POST" => Method::POST,
            &_ => return Err(Error::new(ErrorKind::InvalidData,"not an implemented method"))
        };
        let (location,version) = (location.to_string(),version.to_string());
        let mut headers: Vec<Headers> = Vec::new();
        for lines in reader.lines() {
            let loop_line: String = lines?;
            if loop_line.trim().is_empty() { // handle end of headers
                break;
            }
            let mut it = loop_line.trim().split_ascii_whitespace();
            let verb = it.find(|&ctx| {
                ctx.ends_with(":")
            }).ok_or(Error::new(ErrorKind::InvalidData,"Missing semicolon:verb"))?; // return error if missing
            let optional_data = it.next();
            match (verb,optional_data.as_deref()) {
                ("Content-Type:",Some("application/x-www-form-urlencoded")) => {
                    headers.push(Headers::ContentType(ContentTypes::URLENCODED))
                }
                ("Content-Type:",Some(_)) => { // other content types; verb always ends with ':'
                    todo!("Capture and parse other content types")
                },
                ("Host:",Some(host)) => {
                   headers.push(Headers::Host(host.to_owned()))
                }
                (&_,_) => todo!("Add additional header parsing"),
            }
        }
        return Ok(Header {
            method,
            location,
            version,
            headers
        })
    }
    Err(Error::from(ErrorKind::InvalidData))
}
#[cfg(test)]
pub mod tests {
    use super::*;
    #[test]
    fn test_parse_header() {
       let form_req =  "POST /api HTTP/1.1\r
Content-Type: application/x-www-form-urlencoded\r
\r
command=skip&time=30
";
        assert!(parse_header(form_req).is_ok());
        println!("{:?}",parse_header(form_req).unwrap());
        let get_req =  "GET /api HTTP/1.1\r";
        assert!(parse_header(get_req).is_err());
        assert_eq!(parse_header(get_req).err().unwrap().kind(),ErrorKind::InvalidData);
    }
}

I'm looking forward to all the feedback, both good and bad :)

I'd recommend implementing FromStr for each type that you parse, or at least writing the parsing logic as associated functions. That would make the code much more readable.
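
To make that concrete, here is a minimal sketch of `FromStr` for two of the enums from the post. `ParseError` is a hypothetical stand-in I made up for the example (the original code returns `std::io::Error`), so treat it as an assumption rather than a recommendation.

```rust
use std::str::FromStr;

// Hypothetical error type for this sketch only; the original post uses std::io::Error.
#[derive(Debug, PartialEq)]
struct ParseError(&'static str);

#[derive(Debug, PartialEq)]
enum Method {
    POST,
}

impl FromStr for Method {
    type Err = ParseError;

    // Map the method token of the request line onto the enum.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "POST" => Ok(Method::POST),
            _ => Err(ParseError("not an implemented method")),
        }
    }
}

#[derive(Debug, PartialEq)]
enum ContentTypes {
    URLENCODED,
}

impl FromStr for ContentTypes {
    type Err = ParseError;

    // Map the value of a Content-Type header onto the enum.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "application/x-www-form-urlencoded" => Ok(ContentTypes::URLENCODED),
            _ => Err(ParseError("unsupported content type")),
        }
    }
}
```

With these in place, the big `match` in `parse_header` could shrink to calls such as `method.parse::<Method>()`, and each type can be tested on its own.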

  1. Use clippy.
  2. I would replace the if-let with a let-else to get rid of one level of indentation.
  3. You can avoid calling trim twice.
  4. I think "Missing semicolon" should say "Missing colon"
  5. I'd add more tests
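
A rough sketch of points 2 and 3, in case it helps. The helper names `split_request_line` and `split_header_line` are made up for illustration, and I swapped the `find`/`ends_with` search for `str::split_once`, which is my own substitution and not what the original code does.

```rust
use std::io::{Error, ErrorKind};

/// Split a request line into (method, location, version).
/// let-else keeps the happy path at the top indentation level.
fn split_request_line(line: &str) -> std::io::Result<(&str, &str, &str)> {
    let mut parts = line.split_ascii_whitespace();
    let (Some(method), Some(location), Some(version)) =
        (parts.next(), parts.next(), parts.next())
    else {
        return Err(Error::from(ErrorKind::InvalidData));
    };
    Ok((method, location, version))
}

/// Split one header line into (name, value).
/// Returns Ok(None) for the blank line that ends the headers.
fn split_header_line(line: &str) -> std::io::Result<Option<(&str, &str)>> {
    let line = line.trim(); // trim once, then reuse the trimmed slice
    if line.is_empty() {
        return Ok(None);
    }
    // Split at the first colon only, so values like "localhost:8080" stay intact.
    let Some((name, value)) = line.split_once(':') else {
        return Err(Error::new(ErrorKind::InvalidData, "Missing colon"));
    };
    Ok(Some((name.trim(), value.trim())))
}
```

Small functions like these are also easier to cover with the extra tests from point 5, since each one can be checked without building a whole request.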

Possibly after doing it manually to figure out what they're doing for you, check out strum for string/enum conversions, and thiserror (or anyhow if you don't need to match on the error type) for friendly error messages, which should help make the code a lot prettier.
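
For the "doing it manually" part, a hand-written error type might look roughly like the sketch below. The `ParseError` enum and its variants are hypothetical names chosen for this example; as far as I know, thiserror's derive generates the `Display` and `Error` impls from `#[error("...")]` attributes, so this is approximately the boilerplate it saves you.

```rust
use std::fmt;

// Hypothetical error enum for the parser; the variant names are illustrative only.
#[derive(Debug, PartialEq)]
enum ParseError {
    UnsupportedMethod(String),
    MissingColon,
    MalformedRequestLine,
}

impl fmt::Display for ParseError {
    // One human-readable message per variant.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::UnsupportedMethod(m) => {
                write!(f, "method `{m}` is not implemented")
            }
            ParseError::MissingColon => write!(f, "header line is missing a colon"),
            ParseError::MalformedRequestLine => {
                write!(f, "request line needs a method, a location and a version")
            }
        }
    }
}

// An empty impl is enough; Debug and Display above satisfy the trait's requirements.
impl std::error::Error for ParseError {}
```

Callers can then match on the variant instead of comparing `ErrorKind::InvalidData` plus a string, and the type still works with `?` in functions that return `Box<dyn std::error::Error>`.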

Thanks for the input. I was using a linter, whichever one I set up in my Emacs config; I forget which.

Thanks, I'll take a look at those!

Thanks, I will give it a try!
