Pot v1.0.0-rc.1: A concise, binary serialization format for serde

ecton · December 24, 2021, 1:00am

I released v1.0.0-rc.1 of my serialization crate, Pot. The name comes from the format being invented as the storage format for BonsaiDb, and bonsai trees need pots, of course.

Bad puns aside, what does this new format offer?

Fully self-describing, and comes with a Value type that allows you to deserialize arbitrarily encoded payloads.
Very concise. Compared to CBOR, it is more concise in most situations. This is achieved by not repeating the field names more than once in the serialized data. It can even compete with Bincode in some situations.
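To illustrate the "don't repeat field names" idea, here is a toy sketch. It is not Pot's actual wire format: the `Key` enum and the `intern_keys` helper are hypothetical names made up for this example. The first time a field name appears it is emitted in full, and later occurrences are replaced by a small numeric back-reference.

```rust
use std::collections::HashMap;

/// Either a field name seen for the first time, or a back-reference
/// to a name that was already written. (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq)]
enum Key {
    New(String),
    Seen(usize),
}

/// Replaces repeated field names with the index at which they first appeared.
/// (Hypothetical helper; a real format would write bytes, not build a Vec.)
fn intern_keys(names: &[&str]) -> Vec<Key> {
    let mut ids: HashMap<&str, usize> = HashMap::new();
    let mut out = Vec::with_capacity(names.len());
    for &name in names {
        // Copy the id out so the map is free to be mutated below.
        let existing = ids.get(name).copied();
        match existing {
            Some(id) => out.push(Key::Seen(id)),
            None => {
                // Ids are handed out in order of first appearance.
                let next_id = ids.len();
                ids.insert(name, next_id);
                out.push(Key::New(name.to_string()));
            }
        }
    }
    out
}
```

When serializing a list of many structs with the same fields, each name is then paid for once and every later struct only costs a few bytes per field for the reference.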

This is a pre-release, but it comes after a significant push to cover edge cases with unit tests. However, it hasn't had much real-world testing yet. I'm hoping that publicizing it a little more will bring more testing, so that any remaining bugs can be squashed.

josh · December 24, 2021, 9:13am

Nice!

Is being self-describing the primary advantage over bincode?

Is the format sufficiently self-describing that it can handle async decoding natively, without requiring an additional wrapper like async-bincode?

ecton · December 24, 2021, 2:39pm

Yes. If a self-describing format isn't needed, bincode is incredibly efficient. And, if users enable variable integer encoding in the options, it will consistently encode smaller than Pot.

Another advantage is compatibility. With Pot, there's a small header and magic code to enable fast format checking as well as future version compatibility if we want to add more features to the format without breaking backwards compatibility.

Async decoding isn't affected by being self-describing, nor is this a limitation of bincode itself. It's a limitation of serde's API. The traits that power serialization and deserialization are not async (nor would that be a good idea for speed reasons). If input isn't available, the only way to wait for it is to block until it's received. Ultimately, the reason this can't be done safely currently is the same reason AsyncRead/AsyncWrite can't be automatically implemented for Read/Write.

That being said, there's no reason async-bincode needs to be limited to Bincode. All it does is pre-serialize the payload, write a length header, and then write the entire payload. On the receiving end, it reads the length header and then waits until enough bytes have arrived before invoking the deserializer. This is a universal strategy that can be adapted to every serialization strategy, even ones that aren't serde-powered.
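A minimal sketch of that length-prefix strategy, using only blocking `std::io` so it stays self-contained. The 4-byte big-endian header and the `write_frame`/`read_frame` names are assumptions for this example, not async-bincode's actual framing; an async version would do the same two reads with an async reader and only call the deserializer once the whole payload is buffered.

```rust
use std::io::{self, Read, Write};

/// Writes one frame: a 4-byte big-endian length header, then the payload.
/// The payload is assumed to be already serialized (by Pot, Bincode, ...).
fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload too large for a 4-byte length header",
        ));
    }
    let len = payload.len() as u32;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame: the length header first, then exactly that many bytes.
/// The returned buffer can be handed to any synchronous deserializer.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    reader.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header) as usize;

    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Because the deserializer only ever sees a complete in-memory buffer, it never has to wait for input, which is why the approach doesn't depend on the serialization format. A production version would likely also cap the accepted length before allocating.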

I guess I know what project idea I'm working on next!

Yandros · December 24, 2021, 3:24pm

Nice crate!

A nit regarding the README.md on GH: the plots are not very readable when using a dark theme.

ecton · December 24, 2021, 9:31pm

As a person who gets blinded by non-dark-mode sites, I'm aware. Unfortunately, I don't know how to get those plots out of Criterion in an image format other than SVG, and they're emitted with no background color. Now that I think about it, I suppose I can patch the SVGs in CI... but yuck! I'll figure something out eventually, but I opted to leave the graphs there in the meantime.

Thank you for the feedback and checking the crate out!

Yandros · December 24, 2021, 11:58pm

Yuck indeed. We shouldn't need to do these things, but looking at Criterion, it looks like:

github.com/bheisler/criterion.rs

Set background color in plots

opened 10:40AM - 18 Mar 21 UTC

Systemcluster

The generated plots are created with transparent backgrounds. This makes them impossible to read when embedded on a page with anything but white background color. This manifests even [in the official book](https://bheisler.github.io/criterion.rs/book/user_guide/plots_and_graphs.html#madmeanmediansdslope) when another theme than "light" is selected (for example "navy", which was selected by default for me). It would be very useful to have the option to configure a background color for plots.

hasn't had any response since; I wonder how hard it would be to submit a PR with a tentative fix.

