What's everyone working on this week (13/2021)?

New week, new Rust! What are you folks up to?

Today I released the first version of the Neodyn Exchange serialization format. It is implemented as a Serde serializer/deserializer and is inspired by various other formats. Its main highlights are:

  • It is a self-describing, schemaless format.
  • Just like LLVM IR, it features three equivalent and equally powerful representations:
    1. A human-readable textual format, inspired by JSON;
    2. A machine-readable, compact, byte-oriented binary format with variable-width encoding;
    3. An in-memory tree of Values.
  • Along with the usual primitive types (bool, integers, floating-point numbers, strings, arrays, maps), it can natively encode binary blobs even in the text format, using a readable, hexadecimal representation.
  • Map keys can be of any type, not only strings. The Value enum implements many common traits including Hash and Ord, so it can be used as a key in many data structures.
  • It explicitly represents optional values and supports nested optionals, allowing you to correctly round-trip types like Option<Option<u64>>.
  • Like Apple's Property List format, the binary encoding applies string and blob interning in order to save space. This means that frequently repeated struct field names and enum variant names will not take up additional space beyond their first occurrence, making the result asymptotically as small as a schemaful binary format.
  • Accordingly, fields and variants are identified by their name even in the binary format, so adding or removing fields and variants in the middle of a data type does not break forward compatibility.
  • The serializer and deserializer keep track of the use count of each interned symbol. This means that if you are decoding from a stream reader, the deserializer will give out the owned buffer upon the last occurrence of a string or a blob, sparing you an unnecessary clone.
  • Trailing commas at the end of arrays and maps in the text format are allowed but not required. This makes the text format more diff-friendly and easier to machine-generate than JSON.
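
To make the points about non-string map keys and nested optionals more concrete, here is a minimal sketch using only the standard library. It is not the crate's actual Value type or API: the enum below, its variant names, and the helpers encode_nested and decode_nested are all hypothetical, and floats are left out so that Eq, Ord and Hash can simply be derived.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified value tree (NOT the real Neodyn Exchange type).
// Optionals get their own variant, so `None` and `Some(None)` stay distinct.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
enum Value {
    Opt(Option<Box<Value>>),
    Bool(bool),
    Uint(u64),
    String(String),
    Blob(Vec<u8>),
    Array(Vec<Value>),
    Map(BTreeMap<Value, Value>),
}

// Encode each level of optionality as its own `Opt` node.
fn encode_nested(x: Option<Option<u64>>) -> Value {
    Value::Opt(x.map(|inner| {
        Box::new(Value::Opt(inner.map(|n| Box::new(Value::Uint(n)))))
    }))
}

// Decode by peeling off one `Opt` node per level.
fn decode_nested(v: &Value) -> Result<Option<Option<u64>>, &'static str> {
    match v {
        Value::Opt(None) => Ok(None),
        Value::Opt(Some(inner)) => match &**inner {
            Value::Opt(None) => Ok(Some(None)),
            Value::Opt(Some(leaf)) => match &**leaf {
                Value::Uint(n) => Ok(Some(Some(*n))),
                _ => Err("expected an unsigned integer"),
            },
            _ => Err("expected an inner optional"),
        },
        _ => Err("expected an outer optional"),
    }
}

fn main() {
    // Because the enum is Ord, arrays and blobs can be map keys too.
    let mut map = BTreeMap::new();
    map.insert(
        Value::Array(vec![Value::Uint(1), Value::Uint(2)]),
        Value::String("pair".to_owned()),
    );
    map.insert(Value::Blob(vec![0xde, 0xad]), Value::Bool(true));
    println!("{:?}", Value::Map(map));

    // All three states of Option<Option<u64>> survive the round trip.
    for x in [None, Some(None), Some(Some(42))] {
        assert_eq!(decode_nested(&encode_nested(x)), Ok(x));
    }
}
```

The key design point is that an optional is a node of its own rather than a special "null" leaf. A format with only a single null value cannot tell None apart from Some(None), which is exactly what breaks round-tripping nested optionals.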
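
The interning and use-count idea can also be sketched in a few lines. Again, this is only an illustration under my own assumptions, not the crate's real implementation or wire format: Interner, SymbolTable, and their methods are hypothetical names. The encoder side assigns each distinct string an index and counts how often it is used; the decoder side clones the string while uses remain and moves the owned buffer out on the last one.

```rust
use std::collections::HashMap;

// Encoder side (hypothetical): one table entry per distinct string.
#[derive(Default)]
struct Interner {
    indices: HashMap<String, usize>,
    symbols: Vec<(String, usize)>, // (text, use count)
}

impl Interner {
    // Returns the index of `s`, adding it on first sight and
    // bumping its use count on every later occurrence.
    fn intern(&mut self, s: &str) -> usize {
        if let Some(&idx) = self.indices.get(s) {
            self.symbols[idx].1 += 1;
            idx
        } else {
            let idx = self.symbols.len();
            self.indices.insert(s.to_owned(), idx);
            self.symbols.push((s.to_owned(), 1));
            idx
        }
    }

    fn into_table(self) -> Vec<(String, usize)> {
        self.symbols
    }
}

// Decoder side (hypothetical): tracks the remaining uses of each symbol.
struct SymbolTable {
    slots: Vec<(Option<String>, usize)>,
}

impl SymbolTable {
    fn new(table: Vec<(String, usize)>) -> Self {
        let slots = table.into_iter().map(|(s, n)| (Some(s), n)).collect();
        SymbolTable { slots }
    }

    // Clones while more uses remain; on the last use the owned
    // buffer is moved out, so no clone is needed.
    fn take(&mut self, idx: usize) -> Option<String> {
        let (slot, remaining) = self.slots.get_mut(idx)?;
        match *remaining {
            0 => None,
            1 => {
                *remaining = 0;
                slot.take()
            }
            _ => {
                *remaining -= 1;
                slot.clone()
            }
        }
    }
}

fn main() {
    // Field names repeated across three records are stored only once.
    let mut enc = Interner::default();
    let mut refs = Vec::new();
    for _ in 0..3 {
        refs.push(enc.intern("name"));
        refs.push(enc.intern("age"));
    }
    let table = enc.into_table();
    println!("{} references, {} table entries", refs.len(), table.len());

    let mut dec = SymbolTable::new(table);
    for idx in refs {
        let field = dec.take(idx).expect("use count exhausted");
        println!("{}", field);
    }
}
```

With this layout, each repeated field name costs only a small index after its first occurrence, which is where the space saving comes from.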