MsgPacker - A Rust msgpack implementation


This crate targets simplicity and performance. No dependencies are used, just the standard Rust library.

We have two main structures available:

  • Message - Owned parsed values
  • MessageRef - Message parsed by reference, bound to the lifetime of the reader's source

Snippet example

use msgpacker::prelude::*;
use std::io::{Cursor, Seek};

let buffer = vec![0u8; 4096];

let mut cursor = Cursor::new(buffer);

let key = Message::string("some-key");
let value = Message::integer_signed(-15);
let entry = MapEntry::new(key, value);
let message = Message::map(vec![entry]);

// Write the message to the cursor
message.pack(&mut cursor).expect("Message pack failed");

// Reset the cursor to the beginning
cursor.rewind().expect("Cursor rewind failed");

// Read the message from the cursor
let restored = Message::unpack(&mut cursor).expect("Message unpack failed");

// Alternatively, we can use the index implementation
let value = restored["some-key"]
    .as_integer()
    .expect("The value was an integer")
    .as_i64()
    .expect("The value was a negative number");

assert_eq!(value, -15);

Did you benchmark it?


Your code trusts the number of array elements specified by the input and allocates whatever capacity it requests, which opens the door to parser bombs.
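For illustration, here is a minimal sketch of the vulnerable pattern (hypothetical code, not msgpacker's actual internals):

// Hypothetical naive header handling: the element count comes straight
// from attacker-controlled input and is handed to `Vec::with_capacity`,
// so 5 bytes of input can request gigabytes of memory before a single
// element has been parsed.
fn naive_read_array_header(header: &[u8; 5]) -> Vec<u64> {
    let claimed = u32::from_be_bytes([header[1], header[2], header[3], header[4]]) as usize;
    Vec::with_capacity(claimed) // reserves space for `claimed` elements up front
}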


Yes. It's not really the protocol's responsibility to check for that, right? As long as the reader is able to provide enough data, it will just create the messages.

As per the protocol spec, an array can take up to 4,294,967,295 items.

I'm finishing adding some helper implementations (From, Index, iterators, etc.) and then I'll bench.

edit: bench done

Benchmarks

Results obtained with Intel(R) Core(TM) i9-9900X CPU @ 3.50GHz

$ cargo bench
msgpack nil             time:   [4.6849 ns 4.6898 ns 4.6961 ns]
msgunpack nil           time:   [23.654 ns 23.675 ns 23.707 ns]
msgunpack ref nil       time:   [20.253 ns 20.280 ns 20.318 ns]
msgpack int             time:   [6.0831 ns 6.0921 ns 6.1045 ns]
msgunpack int           time:   [26.375 ns 26.415 ns 26.465 ns]
msgunpack ref int       time:   [25.163 ns 25.202 ns 25.258 ns]
msgpack map 1           time:   [17.411 ns 17.432 ns 17.461 ns]
msgunpack map 1         time:   [118.56 ns 118.67 ns 118.83 ns]
msgunpack ref map 1     time:   [59.850 ns 59.898 ns 59.972 ns]
msgpack map 5           time:   [73.763 ns 73.881 ns 74.049 ns]
msgunpack map 5         time:   [539.42 ns 539.91 ns 540.56 ns]
msgunpack ref map 5     time:   [161.39 ns 161.58 ns 161.86 ns]
msgpack map 10          time:   [133.03 ns 133.18 ns 133.39 ns]
msgunpack map 10        time:   [1.0574 us 1.0583 us 1.0597 us]
msgunpack ref map 10    time:   [289.05 ns 289.43 ns 289.98 ns]
msgpack map 100         time:   [1.2123 us 1.2135 us 1.2150 us]
msgunpack map 100       time:   [9.3964 us 9.4076 us 9.4214 us]
msgunpack ref map 100   time:   [2.6246 us 2.6283 us 2.6334 us]

You're trusting that all the data will be provided. It's possible to create a five-byte message that doesn't actually contain the data but fools the parser into allocating space for it. These five bytes can even be repeated to allocate any number of nested arrays without providing the data for any of them.
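Concretely, under the MessagePack spec such a message is just an array 32 marker followed by a big-endian length:

// 0xdd is the msgpack "array 32" marker; the next 4 bytes are the
// big-endian element count. No element data follows.
let mut bomb = vec![0xdd];
bomb.extend_from_slice(&u32::MAX.to_be_bytes());
assert_eq!(bomb.len(), 5); // 5 bytes claiming 4_294_967_295 elements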


You're still not benchmarking any competing crates, so the benchmark doesn't really say anything. For reference, look at the benchmarking code in the post I previously linked to.

Posting the violin chart of the benchmark makes the results a lot easier to read. Or even better, publish the full report in HTML format, perhaps through CI.

That is true. I have two naive spots in the implementation. The first is this pre-allocation, and the second is the unrestricted recursion for arrays and maps.

The second is an even cheaper attack to produce. For the pre-alloc I'd need several requests, because each of them would allocate around 4 GB - that by itself is not enough to kill a modern server, but with a couple of others in parallel we might be talking about some serious damage. With the second, I can explode the call stack with just a bunch of bytes. I'll release a new version with fixes for both.
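For reference, a common mitigation looks something like this (a sketch with assumed names and limits, not the actual patch): cap the up-front reservation and bound the recursion depth.

use std::io::{self, Read};

// Hypothetical constants; real limits would be tuned or configurable.
const MAX_PREALLOC: usize = 1024;
const MAX_DEPTH: usize = 64;

fn read_array<R: Read>(reader: &mut R, claimed: usize, depth: usize) -> io::Result<Vec<u8>> {
    // Bound the recursion so nested headers can't explode the call stack.
    if depth > MAX_DEPTH {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "nesting too deep"));
    }
    // Reserve at most MAX_PREALLOC slots and grow lazily past that, so the
    // allocation is driven by data actually read, not by the header.
    let mut items = Vec::with_capacity(claimed.min(MAX_PREALLOC));
    for _ in 0..claimed {
        let mut byte = [0u8; 1];
        reader.read_exact(&mut byte)?; // placeholder for a real element parse
        items.push(byte[0]);
    }
    Ok(items)
}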

Also, do you have any recommendations for other performant MessagePack implementations for Rust? I'll create a more comprehensive benchmark.

Thanks for the feedback, appreciate it!

update:

rmpv is by far the most downloaded one. I'll write the benches comparing against it, but if there's any other relevant crate I'm missing, please let me know and I'll include it!

Each array can allocate up to u32::MAX (4_294_967_295) * std::mem::size_of::<Message>() bytes. With a brief look at the definition of Message, I estimate its size is 40 bytes. The total is then 160 GiB, not 4 GiB.
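A quick back-of-the-envelope check of that figure, assuming the 40-byte estimate:

// (2^32 - 1) elements * 40 bytes each:
let total: u64 = u32::MAX as u64 * 40;
assert_eq!(total, 171_798_691_800); // exactly 40 bytes short of 160 GiB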

As I pointed out in my last message, there's no need to make several requests in parallel to trigger several big allocations. One request can repeat the same five-byte sequence several times, each time allocating a nested array.
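In other words, repeating the header from the earlier sketch is enough:

// One request, many allocations: each repetition opens another nested
// array that a naive parser pre-allocates before reading any element.
let payload: Vec<u8> = [0xdd, 0xff, 0xff, 0xff, 0xff].repeat(1_000);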

As you're mentioning servers as a use case, I'm wondering what's the motivation for a design that works with an enum type representing data of arbitrary form. Many (if not most) use cases, including the typical server, work with data in a specific form, which in Rust means using the type system to define the form of the data. Other serialization/deserialization libraries for Rust can work directly with statically typed data without going through a soup of enums.

There's rmp-serde, which has five times as many downloads as the one you claim to be the most downloaded. I recommend also including libraries for other formats in the benchmark, such as serde_json and serde_cbor. When using rmp-serde, keep in mind that you have to choose whether to include field names in the serialized data, in order to make a fair comparison, as discussed in the other thread.
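For context, here is roughly what the statically typed path looks like with rmp-serde (a sketch; the `to_vec` vs `to_vec_named` split is the field-name choice mentioned above, and the `Entry` type is made up for illustration):

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct Entry {
    key: String,
    value: i64,
}

let entry = Entry { key: "some-key".into(), value: -15 };

// Compact form: fields are encoded positionally, as a msgpack array.
let compact = rmp_serde::to_vec(&entry).unwrap();
// Named form: fields are encoded as a map with string keys.
let named = rmp_serde::to_vec_named(&entry).unwrap();
assert!(compact.len() < named.len());

let restored: Entry = rmp_serde::from_slice(&named).unwrap();
assert_eq!(restored, entry);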

Does it make sense to compare different protocols? MessagePack is known to be faster than JSON, while the latter provides better readability. It's the headline of their website, and there are also a bunch of benchmarks out in the wild comparing the two.

Unless, of course, there is an implementation error that makes MessagePack slower.

They don't aim at the same problem, right? rmpv will create a Value, while rmp_serde will just move bytes around, leaving all the protocol functionality and implementation (e.g. Boolean, Integer, Float, Map, etc.) to the user. The goal of this crate is to provide a struct that encapsulates the behavior of the MessagePack protocol.

We fall into the same problem of comparing different protocols.

As with everything in life, that depends. The decision to use enum-based values might be optimal for implementations that require abstraction without additional overhead - with the trade-off that the enum is more expensive memory-wise than concrete struct variants.
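To make the memory point concrete, an enum value always pays for its largest variant (illustrative types below; msgpacker's actual Message layout may differ):

use std::mem::size_of;

// Illustrative value enum, not msgpacker's actual definition.
enum Value {
    Nil,
    Integer(i64),
    Str(String),
    Array(Vec<Value>),
}

// Every `Value` occupies the size of the largest variant plus the
// discriminant, even when it holds just Nil.
let size = size_of::<Value>(); // 32 bytes on a typical 64-bit target
assert!(size >= size_of::<String>());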

Update:

I made a preliminary bench comparing rmpv and msgpacker, and msgpacker is faster. I'll also include rmps.

The serialization difference is small, but the deserialization-by-reference difference is quite big.

Yes. It's not unheard of that libraries for supposedly faster formats are slower than libraries for JSON, because popular JSON libraries are highly optimized. Comparing libraries for different protocols helps find opportunities for optimization.

On the contrary, rmp_serde will create Value or better. I suggest you add it twice to the benchmark, both using Value and using serde_derive.

It depends on the use case, and as I mentioned before, the most likely use case is one where a soup of enums is a lot worse in every way. I don't see what kind of abstraction would make enum values optimal.

It's more like on par. More interesting would be if you can go twice as fast. You could for example try the bump arena trick I suggested in the other thread.
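The arena idea, roughly (a sketch using the bumpalo crate; the exact trick from the other thread is assumed): allocate parsed nodes in one bump arena so each node costs a pointer bump and the whole tree is freed at once.

use bumpalo::Bump;

let arena = Bump::new();
// Each `alloc` is a cheap pointer bump into the arena...
let node: &mut i64 = arena.alloc(-15);
*node += 1;
assert_eq!(*node, -14);
// ...and everything is freed in one shot when `arena` is dropped.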
