Performance comparison

At work we had a piece of code that parsed part of an HTTP response, and we decided to write it in C++ and Rust to see which was quicker. I ended up with the two programs below. The only issue is that my C++ code takes 800 ms and the Rust code takes 2.1 seconds. I know this is a silly micro-benchmark, but I am interested in how I would dig into it to see what is taking the time.

#include <iostream>
#include <string>
#include <string_view>
#include <tuple>
#include <chrono>

using namespace std;
using namespace std::chrono;

std::tuple<std::string_view,std::string_view,std::string_view,bool> split_http(std::string_view s){

        std::string_view version = "";
        std::string_view status = "";
        std::string_view description = "";
        bool error = true;

        size_t found = s.find(' ');
        if (found != std::string::npos) {
                version = s.substr(0,found);
        } else {
                return std::make_tuple(version,status,description,error);
        }
        size_t found1 = s.find(' ',found +1);
        if (found1 != std::string::npos) {
                // length excludes the separating space at found1
                status = s.substr(found+1,found1 - found - 1);
        } else {
                return std::make_tuple(version,status,description,error);
        }
        error = false;
        description = s.substr(found1+1,s.length());
        return std::make_tuple(version,status,description,error);
}

int main(){

        std::string_view s = "HTTP/1.1 418 I'm a teapot\r\n";
        //std::string_view s = "HTTP/1.1 418 ";

        std::string_view version,status,description;
        bool error;

        auto start = high_resolution_clock::now();
        for (int i = 0 ; i < 100000000 ; i++){
                std::tie(version,status,description,error) = split_http(s);
                if (error == true) {
                        //do something here
                        continue;
                }
        }
        auto stop = high_resolution_clock::now();
        auto duration = duration_cast<microseconds>(stop - start);
        std::cout << "Time taken by function: " << duration.count() << " microseconds" << std::endl;
        std::cout << "Version: " << version << std::endl;
        std::cout << "status: " << status << std::endl;
        std::cout << "description: " << description << std::endl;

        return 0;
}

The Rust one:

use std::time::Instant;
fn main() {
    let response = String::from("HTTP/1.1 418 I'm a teapot\r\n");
    let mut res: (&str, &str, &str) = ("", "", "");
    let start = Instant::now();
    for _ in 0..100_000_000 {
        res = match parse_http(&response) {
            Ok(data) => data,
            Err(_) => {
                continue;
            }
        };
    }
    let duration = start.elapsed();

    println!("version:{}\ncode:{}\ndescription:{}\n", res.0, res.1, res.2);
    println!("Time elapsed in parse_response() is: {:?}", duration);
}

fn parse_http(s: &str) -> Result<(&str, &str, &str), &str> {
    let mut parts = s.splitn(3, ' ');
    let version = parts.next().ok_or("No Version")?;
    let code = parts.next().ok_or("No status code")?;
    let description = parts.next().ok_or("No description")?;
    Ok((version, code, description))
}

I tried using perf and found it is spending most of its time in <core::str::iter::SplitN as core::iter::traits::iterator::Iterator>::next, but I'm not sure whether there is anything I can do about this.

Any suggestions as to why there is such a big time difference, and how I could have found it?

Are you compiling both pieces of code with optimizations enabled? (E.g. -O2 -flto for C++ and cargo build --release for Rust.)


I think the str::split methods are not very well optimized. This version, which iterates over the bytes of the string and works more like the C++ code, is about 2.5x faster than the original Rust on my machine:

fn parse_http(s: &str) -> Result<(&str, &str, &str), &str> {
    let mut bytes = s.as_bytes().iter();
    let i = bytes.position(|b| *b == b' ').ok_or("No Version")?;
    let j = bytes.position(|b| *b == b' ').ok_or("No status code")? + i + 1;

    let version = &s[..i];
    let code = &s[(i + 1)..j];
    let description = &s[(j + 1)..];
    Ok((version, code, description))
}

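You could do something similar by using s.as_bytes().splitn(3, |b| *b == b' ') and then calling str::from_utf8_unchecked on the results. This would result in simpler (and possibly even faster) code but would require the unsafe keyword. Fortunately, the safety is easy to verify: the input is already valid UTF-8, and the ASCII space byte never appears inside a multi-byte UTF-8 sequence, so every piece is valid UTF-8 too.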
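A minimal sketch of that byte-splitting idea follows. The function name parse_http_bytes is made up for illustration, and I have not benchmarked it, so treat any speed-up as something to measure rather than assume.

```rust
/// Splits an HTTP status line into (version, code, description) by
/// splitting the underlying bytes on the ASCII space.
/// `parse_http_bytes` is a hypothetical name, not from the thread.
fn parse_http_bytes(s: &str) -> Result<(&str, &str, &str), &'static str> {
    // Slice `splitn` takes the maximum number of pieces and a predicate.
    let mut parts = s.as_bytes().splitn(3, |b| *b == b' ');
    let version = parts.next().ok_or("No Version")?;
    let code = parts.next().ok_or("No status code")?;
    let description = parts.next().ok_or("No description")?;
    // SAFETY: `s` is valid UTF-8 and we only cut at the byte 0x20, which
    // never occurs inside a multi-byte UTF-8 sequence, so each piece is
    // itself valid UTF-8.
    unsafe {
        Ok((
            std::str::from_utf8_unchecked(version),
            std::str::from_utf8_unchecked(code),
            std::str::from_utf8_unchecked(description),
        ))
    }
}
```

It takes the same argument as the original parse_http, so it can be swapped into the benchmark loop unchanged.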

I think that the standard library's str::splitn implementation could be optimized to do less unnecessary UTF-8 decoding, and become competitive with these byte-by-byte versions. Some similar work was already done for other string functions; see rust-lang/rust issue #46693, "str::find(char) is slower than it ought to be".


Thanks @mbrubeck, that makes sense. Trying your version sped up the Rust code to 576 ms. Maybe it's time to file my first Rust issue.

Even with speed improvements the Rust version is unlikely to be faster than the C++ version because of the difference in strings between the languages.

In Rust, strings are guaranteed to be valid UTF-8, so when you receive or create the HTTP line, something has to validate that the sequence of bytes is valid UTF-8. Then, when parsing the line, Rust works in terms of UTF-8 rather than raw bytes, so again it's more work.
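To make the validation step concrete, here is a small sketch of what turning received bytes into a &str involves. The helper name status_line_as_str is hypothetical; std::str::from_utf8 is the standard library call that checks the input and fails on the first invalid sequence.

```rust
use std::str::Utf8Error;

/// Converts raw bytes (e.g. read from a socket) into a string slice.
/// `from_utf8` checks that the input is valid UTF-8 before returning.
/// `status_line_as_str` is a hypothetical helper name.
fn status_line_as_str(raw: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(raw)
}
```

For a line containing a stray 0xFF byte, the returned Utf8Error reports via valid_up_to() how many leading bytes were valid.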

A better approach could be to use byte slices or arrays (instead of strings and string slices), treat the HTTP line as plain ASCII bytes, and only use strings when an actual Unicode string is needed (after determining the encoding).
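As a rough sketch of that approach, the parser below works on &[u8] throughout and only decodes the description when a caller asks for text. Both function names are made up for illustration, and decoding the description as UTF-8 is an assumption of this sketch (the real encoding would have to be determined first).

```rust
/// Splits a status line held as raw bytes into (version, code, description).
/// No UTF-8 validation happens here. Hypothetical name.
fn parse_status_line(line: &[u8]) -> Result<(&[u8], &[u8], &[u8]), &'static str> {
    // Position of the first space, ending the version field.
    let i = line.iter().position(|&b| b == b' ').ok_or("No version")?;
    let rest = &line[i + 1..];
    // Position of the second space, relative to `rest`.
    let j = rest.iter().position(|&b| b == b' ').ok_or("No status code")?;
    Ok((&line[..i], &rest[..j], &rest[j + 1..]))
}

/// Decodes the description only when text is actually needed.
/// Assumes UTF-8 for the sake of the example; returns None if invalid.
fn description_text(description: &[u8]) -> Option<&str> {
    std::str::from_utf8(description).ok()
}
```

The status code can be compared or parsed directly as bytes, so the validation cost is paid only for the fields that end up being displayed.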

Bytes seems to be commonly used in crates when dealing with owned binary data.
