In-place parsing of network bytes


#1

Hi,

I am very new to Rust. I really like the safety it offers and chose it as the main language for my next project, but I soon hit a blocker. Could someone please shed some light?

I need to write a server that responds to client requests. The format is predefined, so defining a new one is not possible.

The protocol looks something like this:

<tag (big-endian number, 3 bytes)> <type (1 byte)> {type-specific body; its size can be very large, with multiple things inside}

In C I can simply define a struct with bit fields to interpret the data in place, and I can easily convert 3 bytes to an int (big or little endian).

I am not able to find anything that lets me do such things in Rust. The data rates are very high, around 20 Gbps, and requests can be very large, so converting everything with a copy might not be an option.

Could someone let me know my options (serde etc., or C code)?

Please excuse me if this is not the right place for this question.

Thanks in advance,
Tirumalesh.


#2

That’s exactly the use case for the untrusted library. See also chomp and nom which are also zero-copy, IIRC.


#3

Are you looking for something like nom (a parser combinator library) or byteorder (if you want to write it manually)? Something like this is actually quite easy to do in Rust. I’ve written parsers for similar high-throughput applications and never had any issues. For such a simple packet format it’s not worth using something really heavyweight like serde.
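To give an idea of the manual route, here is a minimal sketch using only the standard library (byteorder would offer similar helpers). The names `Request` and `parse_request` are made up for illustration, and it assumes the body is simply everything after the 4-byte header, since the real type-specific layout isn't described in this thread. Only the 4 header bytes are decoded; the body stays a borrowed slice of the receive buffer, so nothing large is copied.

```rust
/// Borrowed view over one request. The header fields are decoded,
/// the body is just a sub-slice of the input buffer (no copy).
#[derive(Debug, PartialEq)]
pub struct Request<'a> {
    pub tag: u32,       // 24-bit big-endian tag, widened to u32
    pub kind: u8,       // raw type byte, not validated yet
    pub body: &'a [u8], // type-specific payload, still in place
}

/// Parse the 4-byte header. Returns None if the buffer is too short.
pub fn parse_request(buf: &[u8]) -> Option<Request<'_>> {
    if buf.len() < 4 {
        return None;
    }
    let (header, body) = buf.split_at(4);
    // Pad the 3 tag bytes with a leading zero and decode as big endian.
    let tag = u32::from_be_bytes([0, header[0], header[1], header[2]]);
    Some(Request { tag, kind: header[3], body })
}
```

Calling `parse_request(&buf)` on a received buffer returns a `Request` whose `body` points into `buf`, so the borrow checker keeps the buffer alive for as long as the parsed view is in use.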

If you are really desperate and want to do things the C way, you can use mem::transmute() to tell Rust to blindly reinterpret a selection of bytes (effectively what you do in C with a cast) but that’s a really unsafe thing to do. Imagine if you got an invalid type and then blindly cast it to an enum which has no corresponding variant (UB in Rust). Any match statements you do on that type are now going to be invalid and could possibly corrupt your application’s state.
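A safer alternative to transmuting the type byte is to check it explicitly. The sketch below is only an illustration: the `MsgType` enum and its `Read`/`Write` variants are hypothetical, since the real type values aren't given here. An unknown byte becomes an ordinary error instead of an invalid enum value.

```rust
use std::convert::TryFrom;

/// Hypothetical message types; the real protocol values are not known here.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
pub enum MsgType {
    Read = 1,
    Write = 2,
}

impl TryFrom<u8> for MsgType {
    /// The rejected byte is handed back so the caller can log it.
    type Error = u8;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            1 => Ok(MsgType::Read),
            2 => Ok(MsgType::Write),
            other => Err(other),
        }
    }
}
```

The match costs about as much as the branch you would need anyway to dispatch on the type, so the check should not be a concern even at high data rates.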

Just out of curiosity, what network protocol uses a 24-bit integer? I can’t say I’ve ever encountered them before out in the real world.


#4

Thanks for the suggestions.

  1. nom seems interesting. Can I parse optional fields with nom?

  2. Does nom also offer something for serialization (struct -> bytes)?
    Maybe it is as simple as implementing to_bytes on the struct and on all inner structs/enums,
    so it may be about as easy to write on my own.

  3. Will it be lightweight? I am worried about debugging when something goes wrong.


#5

You might find http://spw17.langsec.org/papers/chifflier-parsing-in-2017.pdf interesting - it uses nom and has some example parsers (including parsing optional fields, which you asked about).


#6

Thanks, that looks very helpful.


#7

Nom AND byteorder are awesome, definitely use them if you can!

If you find yourself running into their limits, a nice, unsafe new feature in Rust 1.19 is union support. This will allow the horrible C practice of reinterpreting bytes as different types! Horrible! (But potentially very useful).
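A rough sketch of what that looks like, with made-up names (`Word`, `reinterpret`). Reading a union field requires `unsafe`; it is sound here only because every bit pattern is a valid `u32`. The result comes out in the machine's native byte order, so a big-endian wire value still needs `u32::from_be`.

```rust
/// Four bytes and a u32 sharing the same storage.
#[repr(C)]
union Word {
    bytes: [u8; 4],
    value: u32,
}

/// Reinterpret 4 bytes as a native-endian u32 via the union.
pub fn reinterpret(bytes: [u8; 4]) -> u32 {
    let word = Word { bytes };
    // Sound only because any 4 bytes form a valid u32.
    unsafe { word.value }
}
```

For plain integers the safe `u32::from_ne_bytes` does the same job, so a union like this is mostly worth reaching for with more complex layouts.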