Rust program slower than expected

Hi, I am a complete newbie with Rust, just playing with it.
I tried to create a simple program just to get a feeling of the language and compare it to others.
On my first attempt I'm not getting the expected speed, so I'm wondering whether I'm doing something wrong.
The release-compiled version runs a little slower than a PHP script that does the same thing.
Rust: 0m2.071s. PHP: 0m1.770s.

The program just de/serializes an incrementally growing JSON.

use serde_json::{Map, Value, json};

fn main() {
    println!("Wait...");
    let mut last = r#"
    {
        "counter": 0
    }"#
    .to_string();
    for _i in 0..3000 {
        let mut x: Map<String, Value> = serde_json::from_str(last.as_str()).unwrap();
        let counter = x["counter"].to_string();
        let mut node = &mut x;
        for key in counter.chars() {
            if !node.contains_key(&key.to_string()) {
                node.insert(key.to_string(), Value::from(serde_json::Map::new()));
            }
            node = node
                .get_mut(&key.to_string())
                .unwrap()
                .as_object_mut()
                .unwrap();
        }
        x["counter"] = Value::from(x["counter"].as_i64().unwrap() + 1);
        let next = json!(x);
        last = next.to_string();
    }
    println!("{}", last.len());
}

Here's the PHP version:

<?php

echo 'Wait...' . PHP_EOL;
$last = '
    {
		"counter": 0
    }';
for( $i = 0; $i < 3000; $i++ ) {
	$x = json_decode( $last );
	$node = &$x;
	foreach( str_split( $x->counter ) as $key ) {
		$node->$key ??= new stdClass();
		$node = &$node->$key;
	}
	$x->counter++;
	$last = json_encode( $x );
}
echo strlen( $last ) . PHP_EOL;

Did you compile your Rust program with the --release flag to enable compiler optimisations?

Edit: found my answer:

@jofas Yes, like this

But you are not “comparing it to others”. You are comparing PHP-program-in-PHP to PHP-program-in-Rust.

The expectation was, probably, that “magic-in-Rust” would make PHP programs faster… but it doesn't work that way, of course.

A typical Rust program is 100 times faster than the equivalent PHP program not because it does useless work (like repeated conversion from JSON to string and back) faster, but because it doesn't perform that useless work at all!

Rust is not a magic bullet: if you want to write a PHP program, it's better to write it in PHP, and a Java program is best written in Java, too!

Yup. And in doing that it compared a serializer/deserializer written in C (the one used by PHP) to a serializer/deserializer written in Rust (the one used by the Rust program).

C and Rust have roughly comparable performance characteristics, and the Rust serializer/deserializer also does more work… why are you surprised that it's slower?

Actually, yes, because I'd assume the parser doesn't just read bytes from the input string, but also instantiates (and eventually destroys) a lot of memory objects. And PHP's memory model is not the same as C's.

Sure. In PHP every variable is like a serde_json::value::Value, while in C and Rust there are other, more efficient types.

But you are not using them; you are bringing PHP's memory model into Rust.

As I have said: these are two PHP programs, just one written directly in PHP and one transpiled to Rust… they should have similar characteristics… and they show them.

Nothing surprising.

Making your CPU waste a whole lot of time, then - for no reason in particular. The "speed" part of Rust (or any other lower-level language) is in the compiler's ability to remove (needless) layers upon layers of indirection and/or checks that are otherwise unavoidable in languages like Python/PHP/JS (unless JIT-ed, which is a whole other topic).

Instead of checking at runtime whether any particular object instance has a given method or operation, once compiled to machine code only the bare essentials are (hopefully) left: which bytes to move from where to where. Nothing else - that's what gives it the overall "speed" boost.

Your test, by its very nature, prevented any and all optimization. Not only are you constantly turning your Map back and forth into/from a String, you're also keeping the Map itself opaque, with no clear (low-level) data types for the compiler to optimize for. Account for that, and suddenly:

use serde_json::{Map, Value, json};
use std::time::Instant;

fn main() {
    println!("Wait...");
    let start = Instant::now();
    let last = r#"{ "counter": 0 }"#;
    // (1) keep the map outside of the loop, otherwise you're 
    // just wasting CPU cycles on repeated (de)serialization into/from `String`
    let mut x: Map<String, Value> = serde_json::from_str(&last).unwrap();
    for _i in 0..3000 {
        let counter = x["counter"].to_string();
        let mut node = &mut x;
        for char in counter.chars() {
            let key = char.to_string();
            if !node.contains_key(&key) {
                let new = Value::Object(Map::new());
                node.insert(key.clone(), new);
            }
            node = node
                .get_mut(&key).unwrap()
                .as_object_mut().unwrap();
        }
        let counter = x["counter"].as_i64().unwrap();
        x["counter"] = Value::from(counter + 1);
    }
    // (2) turn it back into a `String` only at the very end
    let json = json!(x).to_string(); 
    println!("{}", json.len());
    let time_taken = Instant::now().duration_since(start);
    println!("elapsed: {:?}", time_taken);
}

- the whole thing now only takes 2-6 ms (check the playground version).

12 Likes

Not only are you constantly turning your Map back and forth into/from a String

That was what I wanted to measure, though.

And that's why I wrote what I wrote. A Rust program is not supposed to constantly turn an untyped Map into a string and back. And if you want to write a PHP program, then PHP really is the best and fastest language for it. Really.

Think about it: if mechanically converting a program to Rust and then compiling it with rustc could magically produce faster code… wouldn't someone have made an "Enterprise PHP" that does precisely that?

Languages like Rust win not because they do the useless busywork that slows down everything in languages like PHP or Python faster, but because they make said busywork unnecessary.

Removing said busywork is still the human's responsibility, though.

I think a large part of what this is measuring is the difference in hashmap performance between PHP and Rust. PHP's hashmap implementation seems to use DJBX33A as its hasher, which is a very fast but low-quality hasher without HashDOS protection. Rust, however, uses SipHash 1-3 with a random key as its default hasher, which behaves much better on low-entropy inputs and has HashDOS protection. It is slower than DJBX33A in non-pathological cases, though.
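As an aside, when HashDOS protection isn't needed, a std HashMap can be given a faster hasher. A minimal sketch, assuming the rustc-hash crate (this applies to plain HashMaps; serde_json::Map doesn't expose its hasher, so it's not a drop-in change for the code above):

use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use rustc_hash::FxHasher;

fn main() {
    // A HashMap keyed with a fast, non-cryptographic hasher instead of SipHash.
    let mut counts: HashMap<String, i64, BuildHasherDefault<FxHasher>> = HashMap::default();
    counts.insert("counter".to_string(), 0);
    println!("{:?}", counts.get("counter"));
}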

13 Likes

The Rust version is also forced to use UTF-32, which makes this "version of PHP" even slower than PHP 6, which was abandoned because it was too slow. But as @00100011 has shown, that difference is very minor compared to the other things.

It's also worth mentioning that Rust tends to perform poorly on these quick performance comparisons for tasks that involve short-lived data structures, when compared to GC-backed languages. This is because Rust has to do the cleanup for every single object, while GC-backed languages can batch the cleanup.
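If that churn shows up in a profile, one common mitigation is to reuse allocations across iterations instead of creating and dropping fresh ones each time. A minimal sketch, reusing a single output buffer for serialization (the loop body is illustrative only):

use serde_json::{Map, Value};

fn main() {
    let x: Map<String, Value> = serde_json::from_str(r#"{ "counter": 0 }"#).unwrap();
    // One buffer, allocated once and reused, instead of a fresh String per iteration.
    let mut buf: Vec<u8> = Vec::new();
    for _ in 0..3000 {
        buf.clear();
        serde_json::to_writer(&mut buf, &x).unwrap();
        // ... read or process `buf` here ...
    }
    println!("{}", buf.len());
}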

4 Likes

Guys, I think I understand the points you made in your replies. What I am trying to say is that JSON de/serialization is a valid thing to test. My day job is supporting a backend that reads and writes a lot of it, and it is often limited by CPU. So I am genuinely curious how Rust would behave under these conditions, so that if I propose to my boss that we rewrite something in Rust, I can point to this or that advantage over the PHP and Java we use currently.
I love how little memory Rust is taking, but honestly we have enough RAM on our server, unlike CPU cores.

To do this, I don't understand why you wouldn't use the Rust implementation that @00100011 provided. If you were going to use Rust in production, you would probably use it this way.

2 Likes

Nonetheless, it's hard to justify choosing a language based solely on microbenchmarks. If someone on my team came up with such a bold suggestion, it would certainly raise concerns.

Choosing a language for a rewrite has to be justified on a wide range of metrics such as the ergonomics of the language for the task in question, the maintenance costs of having another language being used in production, the maturity of the language's ecosystem, the maturity of the libraries for the specific program being rewritten, and so on.

3 Likes

What the previous answers tried to highlight is that there's a difference between trying to decode arbitrary JSON inputs into nested data structures vs. decoding JSON into typed structs.

Compare the untyped JSON values and strongly typed data structures sections in the serde_json documentation. The latter is generally faster because it has a more compact representation in memory and only has to retain the things you declare necessary.

So you're only testing a very specific form of JSON deserialization, not what many production applications use.
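For illustration, a minimal sketch of the strongly typed approach, assuming serde with the derive feature (the struct here is made up for the example):

use serde::{Deserialize, Serialize};

// Deserializing into a typed struct avoids building a tree of Value objects.
#[derive(Serialize, Deserialize)]
struct State {
    counter: i64,
}

fn main() {
    let mut state: State = serde_json::from_str(r#"{ "counter": 0 }"#).unwrap();
    state.counter += 1;
    println!("{}", serde_json::to_string(&state).unwrap());
}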

1 Like

Which is a valid point (pun intended), in its own way. What is lacking in your original example, however, is context.

Does your backend have to routinely (1) de/serialize the same exact Map from/into a JSON String over a few thousand iterations, all the while (2) updating a counter stored within it, which is, for some reason, itself used to (3) populate the map with a whole bunch of other inner Maps?

If (1) doesn't hold, you can easily de/serialize your JSON in parallel on the (few) cores you do have, using a lib like rayon (see the sketch below). If (2) isn't the case, you can keep the counter outside the map as a regular i32 throughout the whole process, plugging it back in at the very end. If (3) has little to do with the data you're actually reading/writing, you can further streamline your implementation by preinitializing and/or keeping some of that data around in a const or a static.
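For the parallel case, a minimal sketch assuming the rayon crate (the batch of documents is made up for the example):

use rayon::prelude::*;
use serde_json::Value;

fn main() {
    // A batch of independent JSON documents (hypothetical input).
    let docs = vec![r#"{ "counter": 0 }"#; 3000];
    // Deserialize them in parallel across the available cores.
    let values: Vec<Value> = docs
        .par_iter()
        .map(|s| serde_json::from_str(s).unwrap())
        .collect();
    println!("{}", values.len());
}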

Out of any "simple program" you could have chosen, you ended up picking one of the most spherical cows to ever traverse the vacuum of CPU-bound computation.

Context matters. Whether that context is representative of the (actual) task, even more so.

Why don't you try to rewrite some critical portion of your actual production code, then? Even if your first program ended up being twice or five or ten times as fast, I can't quite imagine the kind of boss that would happily accept an example so ... detached? - from any meaningful work, as a proof of Rust's definite "advantage".

At best, what you end up comparing here are the implementation details of two specific functions (serde_json::from_str and to_string) of a feature (JSON/UTF-8 en/decoding) of a library (serde_json) that's not even part of Rust's std, against the matching built-in functionality of PHP. How would that be representative of the language as a whole?

2 Likes

You unnecessarily serialize to JSON twice in every loop iteration with json!() + .to_string().

This version is twice as fast on my machine (and I also made it a bit more idiomatic using the .entry() method):

use serde_json::{Map, Value};

fn main() {
    println!("Wait...");
    let mut last = r#"
    {
        "counter": 0
    }"#
    .to_string();
    for _i in 0..3000 {
        let mut x: Map<String, Value> = serde_json::from_str(&last).unwrap();
        let counter = x["counter"].to_string();
        let mut node = &mut x;
        for key in counter.chars() {
            node = node.entry(key)
                .or_insert_with(|| Value::from(serde_json::Map::new()))
                .as_object_mut()
                .unwrap();
        }
        x["counter"] = Value::from(x["counter"].as_i64().unwrap() + 1);
        last = serde_json::to_string(&x).unwrap();
    }
    println!("{}", last.len());
}
3 Likes

I'm a little bit puzzled by the large time difference between your PHP and Rust versions.

On my system I get about 600 milliseconds for the PHP (8.1) version, while your original Rust version is about 50 milliseconds slower, at 650 milliseconds.

The fixed version from @frol runs in about 300 milliseconds, twice as fast as PHP.

Just out of curiosity, I tried another JSON implementation, sonic-rs, which runs in about 120 milliseconds, 5 times faster than PHP:

use sonic_rs::{JsonValueMutTrait, JsonValueTrait, Object, Value};

fn main() {
    println!("Wait...");
    let mut last = r#"
    {
        "counter": 0
    }"#
    .to_string();
    for _i in 0..3000 {
        let mut x: Value = sonic_rs::from_str(last.as_str()).unwrap();
        let counter = x["counter"].to_string();
        let mut node = x.as_object_mut().unwrap();
        for key in counter.chars() {
            let key = key.to_string();
            if !node.contains_key(&key) {
                node.insert(&key, Object::new());
            }
            node = node.get_mut(&key).unwrap().as_object_mut().unwrap();
        }
        x["counter"] = Value::from(x["counter"].as_i64().unwrap() + 1);
        last = sonic_rs::to_string(&x).unwrap();
    }
    println!("{}", last.len());
}
1 Like