Pot v1.0.0-rc.1: A concise, binary serialization format for serde

I released v1.0.0-rc.1 of my serialization crate, Pot. The name comes from its origin as the storage format for BonsaiDb, and bonsai trees need pots, of course.

Bad puns aside, what does this new format offer?

  • Fully self-describing, and comes with a Value type that allows you to deserialize arbitrarily encoded payloads.
  • Very concise. In most situations it is more concise than CBOR. This is achieved by writing each field name only once in the serialized data. It can even compete with Bincode in some situations.
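
To illustrate the idea of writing each field name only once, here is a minimal sketch of a name table. It is not Pot's actual wire format: the `SymbolTable` type, the tag bytes, and the one-byte length and id fields are all assumptions made up for this example.

```rust
use std::collections::HashMap;

/// Hypothetical name table: remembers which field names were already written.
struct SymbolTable {
    ids: HashMap<String, u8>,
}

impl SymbolTable {
    fn new() -> Self {
        SymbolTable { ids: HashMap::new() }
    }

    /// Appends a field name to `out`.
    /// First occurrence: tag 0, one length byte, then the name bytes.
    /// Later occurrences: tag 1, then the one-byte id assigned earlier.
    /// Assumption for brevity: fewer than 256 names, each under 256 bytes.
    fn encode_name(&mut self, name: &str, out: &mut Vec<u8>) {
        if let Some(&id) = self.ids.get(name) {
            out.push(1);
            out.push(id);
        } else {
            assert!(name.len() < 256 && self.ids.len() < 256);
            let id = self.ids.len() as u8;
            self.ids.insert(name.to_owned(), id);
            out.push(0);
            out.push(name.len() as u8);
            out.extend_from_slice(name.as_bytes());
        }
    }
}
```

With this scheme, a list of many structs pays for each name in full once and a couple of bytes per field afterwards, which is where a self-describing format can claw back space.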

This is a pre-release, but it follows a significant push to cover edge cases with unit tests. It hasn't had much real-world testing yet, though. I'm hoping that publicizing it a little more will bring more testing, so any remaining bugs can be squashed.


Nice!

Is being self-describing the primary advantage over bincode?

Is the format sufficiently self-describing that it can handle async decoding natively, without requiring an additional wrapper like async-bincode?

Yes. If a self-describing format isn't needed, bincode is incredibly efficient. And, if users enable variable integer encoding in the options, it will consistently encode smaller than Pot.

Another advantage is compatibility. With Pot, there's a small header and magic code to enable fast format checking as well as future version compatibility if we want to add more features to the format without breaking backwards compatibility.

This isn't affected by being self-describing, nor is it a limitation of bincode itself. It's a limitation of serde's API. The traits that power serialization and deserialization are not async (nor would that be a good idea for speed reasons). If input isn't available, the only way to wait for it is to block until it's received. Ultimately, the reason this can't currently be done safely is the same reason AsyncRead/AsyncWrite can't be automatically implemented for Read/Write.

That being said, there's no reason async-bincode needs to be limited to Bincode. All it does is pre-serialize the payload, write a length header, and then write the entire payload. On the receiving end, it reads the length header and then waits until enough bytes have arrived before invoking the deserializer. This is a universal strategy that can be adapted to every serialization strategy, even ones that aren't serde-powered.
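
A rough sketch of that length-prefix strategy, using only the standard library. The 4-byte big-endian header and the names `write_frame` and `try_take_frame` are assumptions for illustration, not async-bincode's actual wire format or API. The payload is whatever bytes the serializer of your choice produced.

```rust
use std::io::{self, Write};

/// Sender side: write a 4-byte big-endian length header, then the
/// already-serialized payload.
fn write_frame<W: Write>(mut writer: W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    writer.write_all(&(payload.len() as u32).to_be_bytes())?;
    writer.write_all(payload)
}

/// Receiver side: `buffer` holds whatever bytes have arrived so far.
/// Returns `None` until a whole frame is present, so the deserializer is
/// only ever invoked on a complete payload and never has to block.
fn try_take_frame(buffer: &mut Vec<u8>) -> Option<Vec<u8>> {
    if buffer.len() < 4 {
        return None; // length header not complete yet
    }
    let len = u32::from_be_bytes([buffer[0], buffer[1], buffer[2], buffer[3]]) as usize;
    if buffer.len() < 4 + len {
        return None; // payload not complete yet
    }
    let payload = buffer[4..4 + len].to_vec();
    buffer.drain(..4 + len); // keep any bytes belonging to the next frame
    Some(payload)
}
```

An async receiver would append each chunk it reads to `buffer`, call `try_take_frame` in a loop, and hand each returned payload to the synchronous deserializer.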

I guess I know what project idea I'm working on next!

Nice crate! :ok_hand:

A nit regarding the README.md on GH: the plots are not very readable when using a dark theme:

As a person who gets blinded by non-dark-mode sites, I'm aware. Unfortunately, I don't know how to get those plots out of Criterion in an image format other than SVG, and they're emitted with no background color. Now that I think about it, I suppose I can patch the SVGs in CI... but yuck! :slight_smile: I'll figure something out eventually, but I opted to leave the graphs there in the meantime.

Thank you for the feedback and checking the crate out!


Yuck indeed :grinning_face_with_smiling_eyes: We shouldn't need to do these things, but when looking at criterion, it looks like:

hasn't had any response since :sweat_smile: ; I wonder how hard it would be to submit a PR with a tentative fix :thinking:
