Zerompk - Extremely fast MessagePack serializer for Rust

Hello!

I'd like to introduce zerompk, a new MessagePack implementation for Rust! It's far faster than rmp_serde, currently the most popular implementation, and is built without any external crate dependencies (it doesn't even require std).

use zerompk::{FromMessagePack, ToMessagePack};

#[derive(FromMessagePack, ToMessagePack)]
pub struct Person {
    pub name: String,
    pub age: u32,
}

fn main() {
    let person = Person {
        name: "Alice",
        age: 18,
    };
    
    let msgpack: Vec<u8> = zerompk::to_msgpack_vec(&person).unwrap();
    let person: Person = zerompk::from_msgpack(&msgpack).unwrap();
}

Have fun!

repo: nuskey8/zerompk on GitHub
crate: zerompk on crates.io


For reference, the results of some benchmarks are listed below. Please refer to the repository for the actual code.

Also, note that while msgpacker claims to be a MessagePack serializer, it does not actually generate correctly formatted MessagePack binaries. msgpacker serializes structs as arrays, but it omits the required array header, making accurate comparisons impossible.

Serialize/Deserialize Struct (with 4 fields, array format) 1000 times

| Crate | Serialize | Deserialize |
|---|---|---|
| serde_json (JSON) | 98.33 μs | 329.12 μs |
| msgpacker | 25.41 μs | 134.37 μs |
| rmp_serde | 56.22 μs | 97.00 μs |
| zerompk | 28.82 μs | 72.27 μs |

Serialize/Deserialize Struct (with 4 fields, map format) 1000 times

| Crate | Serialize | Deserialize |
|---|---|---|
| serde_json (JSON) | 98.33 μs | 329.12 μs |
| rmp_serde | 92.63 μs | 98.31 μs |
| zerompk | 35.81 μs | 71.19 μs |
| msgpacker | N/A | N/A |

Serialize/Deserialize Array (struct with 2 fields, 1000 elements) 1000 times

| Crate | Serialize | Deserialize |
|---|---|---|
| serde_json (JSON) | 22,369.22 μs | 37,034.55 μs |
| rmp_serde | 9,803.24 μs | 10,839.79 μs |
| msgpacker | 10,981.52 μs | 4,608.72 μs |
| zerompk | 6,310.66 μs | 4,074.17 μs |

Serialize/Deserialize Struct (with 2 fields, no-copy) 1000 times

| Crate | Serialize | Deserialize |
|---|---|---|
| rmp_serde | 15.47 μs | 16.82 μs |
| zerompk | 8.57 μs | 10.33 μs |

How does it compare to using rmp directly? I'm wondering whether the overhead is in the rmp library or in the serde interface.

Since rmp is only a low-level API over memory operations, it should be possible to achieve the same performance with it if implemented correctly. However, it would be difficult to build a well-optimized and attack-resistant implementation on top of it alone.

Also, during the development of zerompk, I created a prototype implementation using serde with the zerompk low-level API (this is not yet publicly available, and I'm not sure if I'll include it in the future), and the result was slightly faster than rmp_serde, but slower than zerompk.

From this, it seems that while there is still some room for optimization in rmp_serde, the performance cost imposed by the serde interface itself is significant enough that it cannot be ignored.

The (in the vast majority of cases, at least) unnecessary and extremely frequent use of unsafe is honestly kind of worrying, and would definitely put me off using this crate if I needed MessagePack deserialization.

let mut slice = [0u8; 9];
slice[0] = 0xcf;
unsafe {
    core::ptr::copy_nonoverlapping(
        (u as u64).to_be_bytes().as_ptr(),
        slice.as_mut_ptr().add(1),
        8,
    );
}

could trivially be written as

let mut slice = [0u8; 9];
let [head, tail @ ..] = &mut slice;
*head = 0xcf;
*tail = (u as u64).to_be_bytes();

and you should just get yourself a take_array to use instead of take_slice when you need a compile-time length.
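For illustration, such a `take_array` helper (the name and signature here are hypothetical, sketched from the suggestion above, not zerompk's actual API) can be written with const generics and no unsafe:

```rust
// Hypothetical helper: split a compile-time-sized prefix off a byte slice.
// Returns None if the input is too short.
fn take_array<const N: usize>(input: &[u8]) -> Option<([u8; N], &[u8])> {
    if input.len() < N {
        return None;
    }
    let (head, tail) = input.split_at(N);
    // `try_into` on a slice of length exactly N cannot fail here.
    Some((head.try_into().ok()?, tail))
}

fn main() {
    // 0xcf header followed by a big-endian u64, then one trailing byte.
    let data = [0xcf, 0, 0, 0, 0, 0, 0, 0, 42, 0xaa];
    let (bytes, rest): ([u8; 9], &[u8]) = take_array(&data).unwrap();
    assert_eq!(bytes[0], 0xcf);
    assert_eq!(u64::from_be_bytes(bytes[1..].try_into().unwrap()), 42);
    assert_eq!(rest, &[0xaa][..]);
}
```

The fixed-size return type lets the caller destructure headers with array patterns like the one above, and the bounds check happens exactly once.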


Yes, currently most memory operations are implemented using unsafe. However, all unsafe sections are clearly safe, so this shouldn't be a problem.

However, I agree that equivalent code can be written without using unsafe. I plan to remove as many unnecessary unsafes as possible, but this will take some time as I want to ensure the generated LLVM IR is equivalent.


zerompk v0.2.0 has been released!

Many unsafe blocks have been removed from the code. This was done with careful attention to the impact on performance, so there is no performance regression.

Furthermore, by redesigning the zerompk::Write trait and its implementation, we have achieved a dramatic improvement in serialization speed. It is 1.5-2.0 times faster than 0.1.0!

Serialize Struct (with 4 fields, array format) 1000 times

| Crate | Serialize |
|---|---|
| serde_json (JSON) | 98.33 μs |
| msgpacker | 25.41 μs |
| rmp_serde | 56.22 μs |
| zerompk 0.1.0 | 28.82 μs |
| zerompk 0.2.0 | 12.38 μs |

Currently, the write module still contains a significant number of unsafe operations. These remain because raw pointers and set_len() are essential for efficient manipulation of Vec<T>.

This could potentially be reduced by using spare_capacity_mut(). However, this work is on hold because I haven't investigated how this change would actually impact performance.
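As a sketch of what the spare_capacity_mut() approach might look like (the function name and shape here are illustrative, not zerompk's actual Write implementation), using the 0xcf/u64 write from earlier in the thread:

```rust
// Sketch: appending a 9-byte header+payload to a Vec via spare_capacity_mut()
// instead of raw pointer writes. set_len() is still unsafe, but every write
// goes through MaybeUninit, so no uninitialized byte is ever read.
fn write_u64_marker(vec: &mut Vec<u8>, marker: u8, payload: [u8; 8]) {
    vec.reserve(9);
    let spare = vec.spare_capacity_mut();
    spare[0].write(marker);
    for (slot, byte) in spare[1..9].iter_mut().zip(payload) {
        slot.write(byte);
    }
    let new_len = vec.len() + 9;
    // SAFETY: the first 9 spare slots were just initialized above.
    unsafe { vec.set_len(new_len) };
}

fn main() {
    let mut out = Vec::new();
    write_u64_marker(&mut out, 0xcf, 42u64.to_be_bytes());
    assert_eq!(out.len(), 9);
    assert_eq!(out[0], 0xcf);
    assert_eq!(u64::from_be_bytes(out[1..].try_into().unwrap()), 42);
}
```

Whether this matches the raw-pointer version's codegen is exactly the open question; the optimizer usually turns the loop into a single copy, but that would need to be verified against the generated IR.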

the improvements are really nice to see.

a note on this :

        let mut str_buf = alloc::vec::Vec::with_capacity(len);
        unsafe {
            str_buf.set_len(len);
            self.read_exact(str_buf.as_mut_slice())?;
        }

This could be UB simply because Self::read_exact takes a &mut [u8] (AFAIK whether passing uninitialized memory there is UB on its own is still undecided), but it definitely is UB because you are calling <R as Read>::read_exact, and while it is discouraged, an arbitrary read_exact implementation is allowed to read from the buffer.

This is definitely std's fault for not providing an uninitialized-buffer API, but the solution, the read_buf feature, is still unstable. I would advise measuring the performance cost of zeroing out the buffer in the meantime.
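Concretely, the zero-initialized alternative looks something like this (a minimal sketch, not zerompk's actual code; the function name is illustrative):

```rust
use std::io::Read;

// Pay the cost of zero-filling the buffer up front, then let read_exact
// overwrite it. The buffer handed to the reader is fully initialized, so
// no unsafe is needed and any Read implementation is safe to call.
fn read_exact_vec<R: Read>(reader: &mut R, len: usize) -> std::io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len]; // zero-initialized
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    let mut cursor = std::io::Cursor::new(b"hello world".to_vec());
    let out = read_exact_vec(&mut cursor, 5)?;
    assert_eq!(out, b"hello");
    Ok(())
}
```

The zeroing is a single memset over `len` bytes, which for typical string lengths is cheap relative to the read itself.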


I've created and merged a fix for this issue (#8). There's a cost to zero-initialization, but it's probably acceptable. (It also improves performance, since a misimplementation of read_string was causing unnecessary allocations.) 🙂

Thank you!


`reader.by_ref().take(n).read_to_end(&mut vec)? == n` does it safely.

That API is clunky, and ReadBuf doesn't fix it: it's even clunkier, and it doesn't even work safely with Vec (it has its own separate filled-vs-initialized capacity tracking and can't use Vec's spare capacity).
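Spelled out, the `take`-based pattern looks like this (a sketch; the helper name is made up for illustration):

```rust
use std::io::Read;

// Safely read exactly `n` bytes into a Vec without pre-initializing it:
// take(n) caps the reader, and read_to_end only ever appends initialized
// bytes, so Vec's invariants are never bypassed and no unsafe is needed.
fn read_n<R: Read>(reader: &mut R, n: u64) -> std::io::Result<Vec<u8>> {
    let mut vec = Vec::with_capacity(n as usize);
    let got = reader.by_ref().take(n).read_to_end(&mut vec)? as u64;
    if got != n {
        return Err(std::io::Error::new(
            std::io::ErrorKind::UnexpectedEof,
            "short read",
        ));
    }
    Ok(vec)
}

fn main() -> std::io::Result<()> {
    let mut cursor = std::io::Cursor::new(b"abcdef".to_vec());
    assert_eq!(read_n(&mut cursor, 4)?, b"abcd");
    assert!(read_n(&mut cursor, 4).is_err()); // only 2 bytes remain
    Ok(())
}
```

Note that `with_capacity` is only a hint here; a hostile length prefix can't force a huge allocation before any data actually arrives if you cap the initial reservation.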

Nice catch, sorry I missed that.

zerompk 0.2.1 has been released.

Fixed an issue where reading with std::io::Read::read_exact() could cause undefined behavior, and improved read_msgpack()'s resilience to excessively large header lengths.

Hi! You claim that

> msgpacker is described as a MessagePack serializer, but it does not produce correct MessagePack binaries. In msgpacker, structs are always represented as arrays, but the discriminating header is omitted. Therefore, binaries serialized by msgpacker are not compatible with properly implemented MessagePack serializers, making strict comparisons invalid.

Can you point to where the MessagePack protocol specifies this? Also, can you make the library open source?

zerompk is available on GitHub under the MIT license, which is OSI approved.

Yes, but I meant adding a repository value here (on crates.io).

The minimum code to demonstrate this is as follows:

use msgpacker::{MsgPacker, Packable};
use zerompk::ToMessagePack;

#[derive(MsgPacker, ToMessagePack)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };

    let mut bin = vec![];
    point.pack(&mut bin);
    println!("[{}]", to_hex(&bin)); // [0x01, 0x02]

    let msgpack = zerompk::to_msgpack_vec(&point).unwrap();
    println!("[{}]", to_hex(&msgpack)); // [0x92, 0x01, 0x02]
}

fn to_hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("0x{:02x}", b))
        .collect::<Vec<_>>()
        .join(", ")
}

[0x01, 0x02] is not a valid msgpack binary. If you want to represent the structure as an array, you need to add the appropriate header.
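The header rule from the spec, written out as code (a sketch, not zerompk's internals): an array of N elements is prefixed by a marker that encodes N. For N ≤ 15 this is a single "fixarray" byte 0x90 | N, which is where the 0x92 in the zerompk output above comes from; larger arrays use the 0xdc (16-bit length) or 0xdd (32-bit length) markers.

```rust
// Write a MessagePack array header per the array format family in spec.md.
fn write_array_header(out: &mut Vec<u8>, len: usize) {
    match len {
        // fixarray: 0x90 .. 0x9f, length stored in the low 4 bits
        0..=15 => out.push(0x90 | len as u8),
        // array 16: 0xdc followed by a big-endian u16 length
        16..=0xFFFF => {
            out.push(0xdc);
            out.extend_from_slice(&(len as u16).to_be_bytes());
        }
        // array 32: 0xdd followed by a big-endian u32 length
        _ => {
            out.push(0xdd);
            out.extend_from_slice(&(len as u32).to_be_bytes());
        }
    }
}

fn main() {
    let mut buf = Vec::new();
    write_array_header(&mut buf, 2);
    assert_eq!(buf, [0x92]); // the header msgpacker omits for the 2-field struct
}
```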

This issue can be resolved by merging PR #19, although, unfortunately, some performance degradation is unavoidable.

Oh, I forgot to add the repository URL. I'll add it later.

Is it anywhere in the protocol spec, though?

The representation of msgpack arrays is described in the array format family section of spec.md.