I'd like to introduce zerompk, a new MessagePack implementation for Rust! It is significantly faster than rmp_serde, currently the most popular implementation, and is built without any external crate dependencies (it does not even require std).
```rust
use zerompk::{FromMessagePack, ToMessagePack};

#[derive(FromMessagePack, ToMessagePack)]
pub struct Person {
    pub name: String,
    pub age: u32,
}

fn main() {
    let person = Person {
        name: "Alice".to_string(),
        age: 18,
    };
    let msgpack: Vec<u8> = zerompk::to_msgpack_vec(&person).unwrap();
    let _person: Person = zerompk::from_msgpack(&msgpack).unwrap();
}
```
For reference, the results of some benchmarks are listed below. Please refer to the repository for the actual code.
Also, note that while msgpacker claims to be a MessagePack serializer, it does not actually generate correctly formatted MessagePack binaries. msgpacker serializes structs as arrays, but it fails to emit the required array headers, which makes an accurate comparison impossible.
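To illustrate what is missing (this helper is illustrative, not taken from zerompk or msgpacker): the MessagePack spec requires an array of N ≤ 15 elements to begin with a fixarray header byte `0x90 | N`.

```rust
// Illustrative helper, not from any of the crates above: per the
// MessagePack spec, an array of N <= 15 elements starts with the
// fixarray header byte 0x90 | N (bit pattern 1001XXXX).
fn fixarray_header(n: u8) -> u8 {
    assert!(n <= 15, "larger arrays need array16/array32 headers");
    0x90 | n
}

fn main() {
    // A struct with 2 fields serialized in array format must begin
    // with 0x92; omitting this byte yields an invalid document.
    assert_eq!(fixarray_header(2), 0x92);
    assert_eq!(fixarray_header(4), 0x94);
}
```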
Serialize/Deserialize Struct (with 4 fields, array format) 1000 times

| Crate | Serialize | Deserialize |
|---|---|---|
| serde_json (JSON) | 98.33 μs | 329.12 μs |
| msgpacker | 25.41 μs | 134.37 μs |
| rmp_serde | 56.22 μs | 97.00 μs |
| zerompk | 28.82 μs | 72.27 μs |
Serialize/Deserialize Struct (with 4 fields, map format) 1000 times

| Crate | Serialize | Deserialize |
|---|---|---|
| serde_json (JSON) | 98.33 μs | 329.12 μs |
| rmp_serde | 92.63 μs | 98.31 μs |
| zerompk | 35.81 μs | 71.19 μs |
| msgpacker | N/A | N/A |
Serialize/Deserialize Array (struct with 2 fields, 1000 elements) 1000 times

| Crate | Serialize | Deserialize |
|---|---|---|
| serde_json (JSON) | 22,369.22 μs | 37,034.55 μs |
| rmp_serde | 9,803.24 μs | 10,839.79 μs |
| msgpacker | 10,981.52 μs | 4,608.72 μs |
| zerompk | 6,310.66 μs | 4,074.17 μs |
Serialize/Deserialize Struct (with 2 fields, no-copy) 1000 times
Since rmp is only a low-level API for memory operations, it should be possible to achieve the same performance if it is used correctly. However, it would be difficult to build a well-optimized and attack-resistant implementation on top of that API alone.
Also, during the development of zerompk, I prototyped a serde implementation on top of the zerompk low-level API (it is not yet public, and I am not sure whether I will release it in the future). It was slightly faster than rmp_serde, but slower than zerompk.
From this, it seems that while rmp_serde still has some room for optimization, the performance overhead introduced by the serde interface itself is significant enough that it cannot be ignored.
The (in the vast majority of cases) unnecessary and extremely frequent use of unsafe is honestly kind of worrying, and would definitely put me off using this crate if I needed MessagePack deserialization.
```rust
let mut slice = [0u8; 9];
slice[0] = 0xcf;
unsafe {
    core::ptr::copy_nonoverlapping(
        (u as u64).to_be_bytes().as_ptr(),
        slice.as_mut_ptr().add(1),
        8,
    );
}
```
could trivially be written as
```rust
let mut slice = [0u8; 9];
let [head, tail @ ..] = &mut slice;
*head = 0xcf;
*tail = (u as u64).to_be_bytes();
```
And you should just write yourself a take_array to use instead of take_slice for the cases where you need a compile-time length.
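A take_array along the lines the commenter suggests might look like this; note that `take_array` and the `&mut &[u8]` cursor model are assumptions for the sketch, not zerompk's actual API.

```rust
// Hypothetical sketch of the suggested take_array: consume exactly N
// bytes from a slice cursor, returning a compile-time-sized array
// reference. This is not zerompk's actual API.
fn take_array<'a, const N: usize>(input: &mut &'a [u8]) -> Option<&'a [u8; N]> {
    let s: &'a [u8] = *input;
    // split_first_chunk (stable since Rust 1.77) performs the length
    // check and the fixed-size conversion in one safe call.
    let (head, tail) = s.split_first_chunk::<N>()?;
    *input = tail;
    Some(head)
}

fn main() {
    let mut cursor: &[u8] = &[0xcf, 0, 0, 0, 0, 0, 0, 0, 42];
    let header: &[u8; 1] = take_array(&mut cursor).unwrap();
    assert_eq!(header[0], 0xcf); // uint64 marker
    let payload: &[u8; 8] = take_array(&mut cursor).unwrap();
    assert_eq!(u64::from_be_bytes(*payload), 42);
    assert!(cursor.is_empty());
}
```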
Yes, most memory operations are currently implemented with unsafe. However, every unsafe block is clearly sound, so this should not be a problem in practice.
That said, I agree that equivalent code can be written without unsafe. I plan to remove as many unnecessary unsafe blocks as possible, but this will take some time, since I want to verify that the generated LLVM IR stays equivalent.
Many unsafe blocks have been removed from the code. This was done with careful attention to the impact on performance, so there is no performance regression.
Furthermore, by redesigning the zerompk::Write trait and its implementation, serialization speed has improved dramatically: it is 1.5-2.0 times faster than 0.1.0!
Serialize Struct (with 4 fields, array format) 1000 times
Currently, the write module still contains a significant number of unsafe operations. They remain because raw pointers and set_len() are essential for efficient manipulation of Vec<T>.
This could potentially be reduced by using spare_capacity_mut(). However, that work is on hold because I have not yet measured how the change would actually affect performance.
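A spare_capacity_mut()-based append might look like the following sketch (`write_bytes` is a hypothetical helper, not zerompk's actual Write implementation); only the final set_len() call remains unsafe, with no raw pointer arithmetic.

```rust
// Hypothetical sketch, not zerompk's actual Write implementation:
// appending bytes through spare_capacity_mut() confines the unsafe
// surface to a single set_len() call.
fn write_bytes(buf: &mut Vec<u8>, src: &[u8]) {
    buf.reserve(src.len());
    for (dst, &byte) in buf.spare_capacity_mut().iter_mut().zip(src) {
        dst.write(byte); // MaybeUninit::write initializes the slot
    }
    // SAFETY: the first src.len() spare slots were just initialized,
    // and reserve() guaranteed the capacity exists.
    unsafe { buf.set_len(buf.len() + src.len()) };
}

fn main() {
    let mut buf = vec![0xcf];
    write_bytes(&mut buf, &42u64.to_be_bytes());
    assert_eq!(buf.len(), 9);
    assert_eq!(buf[8], 42);
}
```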
```rust
let mut str_buf = alloc::vec::Vec::with_capacity(len);
unsafe {
    str_buf.set_len(len);
    self.read_exact(str_buf.as_mut_slice())?;
}
```
This could be UB simply because Self::read_exact takes a &mut [u8] to uninitialized memory (AFAIK whether that alone is UB is still undecided), but it definitely is UB because you are calling <R as Read>::read_exact, and while it is discouraged, a read_exact implementation is allowed to read from the buffer it is given.
This is definitely std's fault for not providing an uninitialized-buffer API, but the solution, the read_buf feature, is still unstable. I would advise measuring the performance cost of zeroing out the buffer in the meantime.
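The zeroed-buffer variant the commenter suggests could look like this (`read_string` and its signature are assumptions for the sketch, not zerompk's actual API):

```rust
use std::io::{Error, ErrorKind, Read};

// Sketch of the zero-initialized alternative; `read_string` and its
// signature are assumptions, not zerompk's actual API. vec![0u8; len]
// hands read_exact a fully initialized buffer, so even a misbehaving
// Read impl that inspects the buffer cannot observe uninit memory.
fn read_string<R: Read>(reader: &mut R, len: usize) -> std::io::Result<String> {
    let mut buf = vec![0u8; len];
    reader.read_exact(&mut buf)?;
    String::from_utf8(buf).map_err(|e| Error::new(ErrorKind::InvalidData, e))
}

fn main() {
    // &[u8] implements Read, so a byte slice works as a test reader.
    let mut input: &[u8] = b"hello world";
    let s = read_string(&mut input, 5).unwrap();
    assert_eq!(s, "hello");
}
```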
I've created and merged a fix for this issue (#8). There is a cost to zero initialization, but it is probably acceptable. (It also improves performance, as a misimplementation of read_string was causing unnecessary allocations.)
`reader.by_ref().take(n).read_to_end(&mut vec)? == n` does it safely.
That API is clunky, and ReadBuf doesn't fix it. It is even clunkier, and it doesn't even work safely with Vec (it has its own special fill-vs-init capacity tracking and can't use Vec's spare capacity).
Fixed an issue where reading with std::io::Read::read_exact() could cause undefined behavior, and improved the tolerance of read_msgpack() for excessively large header lengths.
> msgpacker is described as a MessagePack serializer, but it does not produce correct MessagePack binaries. In msgpacker, structs are always represented as arrays, but the discriminating header is omitted. Therefore, binaries serialized by msgpacker are not compatible with properly implemented MessagePack serializers, making strict comparisons invalid.
Can you point to where in the MessagePack protocol this is specified? Also, can you make the library open source?