Blog post: Making Slow Rust Code Fast

ashishnegi · October 22, 2021, 4:52am

Online services serving many concurrent users/requests do spend a lot of time deserializing user requests and serializing responses, since the data is mostly cached in memory. That is why JSON is not considered a good choice for communication between internal services. I have seen this in many production services.

Fredrik · October 22, 2021, 12:05pm

I like that you added JSON and Message Pack for comparison. Maybe also add CBOR?

You can't make such a conclusion when comparing apples to oranges. You have to look into the details.

rmp_serde::to_vec achieves high performance by not serializing field names. You have to use rmp_serde::to_vec_named for a fair comparison.
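To make the size difference concrete, here is a minimal std-only sketch that hand-encodes the two Message Pack layouts for a struct with a single field `i: i64`. It is not rmp_serde's own code: the helper names (`write_i64`, `encode_compact`, `encode_named`) are hypothetical, and the integer is always written in the fixed 9-byte "int 64" form for simplicity, whereas a real encoder would normally pick a shorter form for small values.

```rust
// Hand-rolled Message Pack bytes for `struct S { i: i64 }`, std only.
// Illustration of the two layouts; all function names are made up for this sketch.

/// Write an i64 in the fixed "int 64" form: marker 0xd3 + 8 big-endian bytes.
fn write_i64(out: &mut Vec<u8>, v: i64) {
    out.push(0xd3);
    out.extend_from_slice(&v.to_be_bytes());
}

/// Compact layout: the struct becomes a 1-element array, no field name.
fn encode_compact(i: i64) -> Vec<u8> {
    let mut out = vec![0x91]; // fixarray with 1 element
    write_i64(&mut out, i);
    out
}

/// Named layout: the struct becomes a 1-entry map keyed by the field name.
fn encode_named(i: i64) -> Vec<u8> {
    let mut out = vec![0x81]; // fixmap with 1 entry
    out.push(0xa0 | 1); // fixstr of length 1
    out.extend_from_slice(b"i"); // the field name itself
    write_i64(&mut out, i);
    out
}

fn main() {
    let compact = encode_compact(42);
    let named = encode_named(42);
    // 10 bytes without the field name, 12 bytes with it
    println!("compact: {} bytes, named: {} bytes", compact.len(), named.len());
}
```

With a one-letter field name the overhead is only two bytes, but it grows with the length and number of field names, and the decoder also has to match each name against the struct's fields.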

Also consider (after the important detail above) that Message Pack is not BSON. In Message Pack:

Arrays are prefixed with the number of items, so arrays can be allocated upfront with the correct capacity when deserializing.
Byte lengths of arrays and objects are not encoded, so no need to backtrack when serializing.
Keys of array elements are not encoded, significantly reducing the number of bytes to serialize and deserialize.
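The three points above can be sketched with two hand-written encoders for an array of `i64` values. This is a simplified illustration under stated assumptions, not code from any BSON or Message Pack crate: the function names are hypothetical, the Message Pack side only handles arrays of fewer than 16 items (the one-byte "fixarray" header), and every integer uses the fixed 9-byte "int 64" form.

```rust
/// Message Pack style: element count first, then the values. No keys, no byte length.
fn msgpack_array(items: &[i64]) -> Vec<u8> {
    assert!(items.len() < 16, "sketch only handles the fixarray header");
    let mut out = vec![0x90 | items.len() as u8];
    for v in items {
        out.push(0xd3); // "int 64" marker
        out.extend_from_slice(&v.to_be_bytes());
    }
    out
}

/// Because the count comes first, the decoder can allocate once with the right capacity.
fn decode_msgpack_array(bytes: &[u8]) -> Vec<i64> {
    let count = (bytes[0] & 0x0f) as usize;
    let mut items = Vec::with_capacity(count);
    for chunk in bytes[1..].chunks_exact(9).take(count) {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&chunk[1..]); // skip the 0xd3 marker
        items.push(i64::from_be_bytes(raw));
    }
    items
}

/// BSON style: an array is a document keyed by "0", "1", ... with a total byte length in front.
fn bson_array(items: &[i64]) -> Vec<u8> {
    let mut out = vec![0u8; 4]; // placeholder for the int32 total length
    for (idx, v) in items.iter().enumerate() {
        out.push(0x12); // element type: 64-bit integer
        out.extend_from_slice(idx.to_string().as_bytes()); // key as decimal string
        out.push(0x00); // key terminator
        out.extend_from_slice(&v.to_le_bytes());
    }
    out.push(0x00); // document terminator
    let len = out.len() as i32;
    out[..4].copy_from_slice(&len.to_le_bytes()); // go back and patch the length
    out
}

fn main() {
    let items = [1i64, 2, 3];
    let mp = msgpack_array(&items);
    let bs = bson_array(&items);
    // 28 bytes vs 38 bytes for three items
    println!("msgpack: {} bytes, bson: {} bytes", mp.len(), bs.len());
    assert_eq!(decode_msgpack_array(&mp), items.to_vec());
}
```

The BSON encoder has to return to the start of the buffer once the final length is known, and it spends extra bytes per element on the key. Its length prefix counts bytes rather than items, so a decoder cannot size the target vector exactly before reading the elements.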

For further performance improvements in deserializing arrays, they could be allocated in a bump arena. This reduces the number of allocations (and deallocations) for any format, but is particularly helpful in making BSON competitive with Message Pack, because arrays could have a preallocated capacity for a pretty large number of items, and any capacity that is left over after the end of one array can be immediately used for the next array.
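A rough sketch of the idea, assuming all arrays hold `i64` items: one backing buffer is allocated up front, and each decoded array is just a range inside it, so the space left after one array is used directly by the next. The `Arena` type and its methods are hypothetical and much simpler than a general-purpose bump allocator, which would hand out references to values of arbitrary types.

```rust
use std::ops::Range;

/// A very small bump arena for i64 items: one backing buffer, arrays are ranges into it.
struct Arena {
    buf: Vec<i64>,
}

impl Arena {
    /// Reserve room for `cap` items in a single allocation.
    fn with_capacity(cap: usize) -> Self {
        Arena { buf: Vec::with_capacity(cap) }
    }

    /// Append one decoded array and return where it lives in the arena.
    /// No new allocation happens as long as the reserved capacity is not exceeded.
    fn push_array<I: IntoIterator<Item = i64>>(&mut self, items: I) -> Range<usize> {
        let start = self.buf.len();
        self.buf.extend(items);
        start..self.buf.len()
    }

    /// Borrow a previously stored array.
    fn get(&self, range: &Range<usize>) -> &[i64] {
        &self.buf[range.clone()]
    }
}

fn main() {
    let mut arena = Arena::with_capacity(1024);
    let cap_before = arena.buf.capacity();

    // Two "deserialized" arrays of unknown final length share the same buffer.
    let first = arena.push_array(vec![1, 2, 3]);
    let second = arena.push_array(vec![4, 5]);

    println!("first = {:?}, second = {:?}", arena.get(&first), arena.get(&second));
    // The backing buffer was never reallocated.
    assert_eq!(arena.buf.capacity(), cap_before);
}
```

Dropping the arena frees everything at once, which is where the saving in deallocations comes from.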

There's async for that. So what matters depends on what you're measuring.

univerz · October 24, 2021, 5:40am

Thank you for the careful read; this mistake is on me. The new numbers (EDIT: for an older, different benchmark) are less impressive, but there is still a ~2.8x difference even for a simple struct {i: i64} (and potentially for i: ObjectId, which was the main reason for the benchmark, as it was an order of magnitude slower).

univerz · October 24, 2021, 6:55am

Bioinformatics is on the heavy side of postprocessing. On my websites and lighter number-crunching projects, the BSON cost was really visible, and the resulting performance was worse than (C++,) Python and even one old PHP webpage I rewrote in Rust. It's significantly better now, but the numbers in this thread suggest there is still room for several double-digit improvements.

