Zero-copy deserialize dynamically sized data


I am currently working on a no_std, zero-copy, zero-allocation parser. Parsing structures that consist only of primitives and fixed-length arrays is straightforward, but I have hit a roadblock with the following byte layout, sketched in Rust-ish notation:

struct Foo {
    size: u8,
    bar: Bar,
}

Where Bar is dynamically sized: it occupies the number of bytes given in the size field.

struct Bar {
    list: [Baz],
    padding: u8,
}

Where the list field is an array of zero or more Baz structs, each of which is itself dynamically sized, and padding is a single byte that is always zero.

struct Baz {
    size: u8,
    buf: [u8],
}

Where the buf field is an array of raw bytes whose length is given by the size field.

Any ideas on how to neatly parse this would be great. Thanks!

I've never worked much with defining custom DSTs, but I can see a few problems:

[T] where T is dynamically sized is impossible. Elements need a static size so the slice can compute where each element is.

The Bar structure is forbidden as written. A struct can have at most one dynamically sized field, and it must be the last field in that struct; here the unsized list comes before padding. Also, DST fields are "infectious" in that they turn any structure they're in into a DST.

The Foo and Baz structures are not possible the way you seem to want. Because Baz contains [u8], you can't have values of type Baz, only pointers. Also, those pointers will carry the length of the [u8] with them as a usize. So, because the length must already be stored on the pointer, there's no benefit to having an explicit size field, and you're definitely not going to be able to shrink it to a u8.
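To make the fat-pointer point concrete, here is a small sketch that compares pointer sizes. The Baz definition and the pointer_sizes helper are just illustrations I made up for this post, and the exact byte counts depend on the target, so the code only compares them to each other.

```rust
use core::mem::size_of;

/// Hypothetical DST shaped like the question's `Baz` (illustration only).
#[allow(dead_code)]
pub struct Baz {
    size: u8,
    buf: [u8], // the unsized field must come last
}

/// Returns the sizes in bytes of a thin pointer (`&u8`), a slice pointer
/// (`&[u8]`), and a pointer to the custom DST (`&Baz`).
pub fn pointer_sizes() -> (usize, usize, usize) {
    (size_of::<&u8>(), size_of::<&[u8]>(), size_of::<&Baz>())
}
```

On the usual targets the second and third values are twice the first: the extra word is the slice length carried by the pointer, which is why the in-struct size field ends up redundant.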

Hopefully someone has a better answer than this for you, but if I had to do this, I'd do one of two things:

  1. Accept I can't avoid allocation and use Vec<T> or Box<[T]> for the dynamic arrays.

  2. Accept I can't represent this structure directly, and use "views" into the unparsed data that parse on the fly.
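For option 2, a minimal sketch of what such a view could look like for a single Baz. The names BazView and parse_baz are hypothetical, and I'm assuming the size byte counts only the bytes of buf, not itself. It only uses slice methods from core, so it should be fine under no_std.

```rust
/// Borrowed view of one `Baz`. The length byte has been consumed and `buf`
/// points straight into the original input, so nothing is copied.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BazView<'a> {
    pub buf: &'a [u8],
}

/// Splits one `Baz` off the front of `input` and returns it together with
/// the remaining bytes. Returns `None` if `input` is empty or shorter than
/// its length byte claims.
pub fn parse_baz(input: &[u8]) -> Option<(BazView<'_>, &[u8])> {
    // The first byte is the length of the payload that follows.
    let (&size, rest) = input.split_first()?;
    let size = usize::from(size);
    if rest.len() < size {
        return None; // truncated input
    }
    let (buf, rest) = rest.split_at(size);
    Some((BazView { buf }, rest))
}
```

Calling parse_baz repeatedly on the returned remainder walks the list one element at a time without allocating.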

Yep, that's what I am currently doing. I hold a byte slice reference to the original data and iterate through it manually on the fly, after an initial pass over the slice to ensure that the input is not malformed.
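Roughly, the validate-once-then-iterate approach looks like the sketch below. The names (FooView, BazIter, ParseError) are made up for this post, and it assumes that the Baz list fills everything in Bar except the final padding byte, which the original layout description does not spell out.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Truncated,
    BadPadding,
}

/// Validated view over one `Foo`; holds only a borrowed slice.
#[derive(Debug, Clone, Copy)]
pub struct FooView<'a> {
    list: &'a [u8], // bytes of the `Baz` list, without the padding byte
}

impl<'a> FooView<'a> {
    /// Checks the whole record once so later iteration cannot run out of bounds.
    pub fn parse(input: &'a [u8]) -> Result<Self, ParseError> {
        let (&size, rest) = input.split_first().ok_or(ParseError::Truncated)?;
        let bar = rest.get(..usize::from(size)).ok_or(ParseError::Truncated)?;
        // Assumption: the last byte of `Bar` is the zero padding byte.
        let (&padding, list) = bar.split_last().ok_or(ParseError::Truncated)?;
        if padding != 0 {
            return Err(ParseError::BadPadding);
        }
        // Walk the list once; every length byte must fit in what remains.
        let mut cursor = list;
        while let Some((&len, tail)) = cursor.split_first() {
            cursor = tail.get(usize::from(len)..).ok_or(ParseError::Truncated)?;
        }
        Ok(FooView { list })
    }

    /// Iterates over the `buf` of each `Baz`, borrowing from the input.
    pub fn bazs(&self) -> BazIter<'a> {
        BazIter { rest: self.list }
    }
}

pub struct BazIter<'a> {
    rest: &'a [u8],
}

impl<'a> Iterator for BazIter<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        let (&len, tail) = self.rest.split_first()?;
        let len = usize::from(len);
        if tail.len() < len {
            return None; // not reachable for a view built by `parse`
        }
        let (buf, rest) = tail.split_at(len);
        self.rest = rest;
        Some(buf)
    }
}
```

Since the up-front pass guarantees the lengths are consistent, the iterator never has to report an error, and each item is just a sub-slice of the original buffer.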
