Reading and parsing TLV data from a TCP stream

Hi!

I'm pretty new to rust, and still working my way through tutorials and books and such.

One thing I wasn't able to find a clear answer to is the most "rust-y" way of reading and parsing data out of a TCP stream (and later on, writing to it). I don't want to force a particular serialization scheme (such as JSON or another serde format), and in some scenarios I can't.

My data is a relatively simple TLV construct, for the sake of the example let's assume type: u16, length: u32, data - according to length.

I would like to read the 6 byte header from the stream, then read the amount of data I'm supposed to - then do some additional parsing based on the type.

In python I'd use struct.pack or unpack. In C I'd read the bytes into a struct representing what I want.
I'm not sure what the Rust way of doing such a thing is. From what I read:

  • serde & friends didn't seem like a good choice since it forces a protocol. (unless I'm mistaken? Should I impl. my own?)
  • Converting memory representations also didn't feel quite right as the default for rust is not to assume a certain memory layout on a struct.
  • comparing lists of bytes seems cumbersome. Can I somehow do a nice match-enum on the "legal" types and get the length from the struct?

I don't care much for speed in this context - it's mostly a learning exercise. I want to know how an experienced rust programmer would tackle this problem.

I wasn't able to find what I was looking for while digging the internet, so any references would be very welcome.

Thanks a lot, I appreciate the help!

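You can read the frame in the following way:

use std::io::Read; // brings read_exact into scope

// Read the fixed 6-byte header: 2 bytes of type, then 4 bytes of length.
let mut header = [0u8; 6];
tcp.read_exact(&mut header)?;
// from_be_bytes decodes the fields as big-endian.
let frame_type = u16::from_be_bytes([header[0], header[1]]);
let frame_len = u32::from_be_bytes([header[2], header[3], header[4], header[5]]);

// Read exactly frame_len bytes of payload.
let mut frame = vec![0u8; frame_len as usize];
tcp.read_exact(&mut frame)?;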
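
If it helps, here is a rough sketch of the same steps wrapped in a function that works on anything implementing Read, so it can be tried against an in-memory buffer as well as a TcpStream. The names Frame, read_frame and MAX_FRAME_LEN are made up for illustration, and the 16 MiB cap is an arbitrary assumption; it is only there so that a bogus length field can't trigger a huge allocation.

```rust
use std::io::{self, Read};

/// Hypothetical upper bound on the payload size; pick whatever suits your protocol.
const MAX_FRAME_LEN: u32 = 16 * 1024 * 1024;

/// One TLV frame: the decoded type plus the raw payload bytes.
#[derive(Debug, PartialEq)]
struct Frame {
    frame_type: u16,
    payload: Vec<u8>,
}

/// Reads a single frame (u16 type, u32 length, then `length` payload bytes),
/// with both header fields decoded as big-endian.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Frame> {
    let mut header = [0u8; 6];
    reader.read_exact(&mut header)?;
    let frame_type = u16::from_be_bytes([header[0], header[1]]);
    let frame_len = u32::from_be_bytes([header[2], header[3], header[4], header[5]]);

    // Reject oversized frames before allocating the payload buffer.
    if frame_len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }

    let mut payload = vec![0u8; frame_len as usize];
    reader.read_exact(&mut payload)?;
    Ok(Frame { frame_type, payload })
}
```

Calling read_frame(&mut tcp) in a loop would then yield one frame per call, and a connection that closes mid-frame surfaces as an UnexpectedEof error from read_exact.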

To match on the type, you could use a pattern like this:

mod frame_types {
    // `pub` is needed so the constants are visible outside the module.
    pub const TYPE1: u16 = 1;
    pub const TYPE2: u16 = 2;
}

match frame_type {
    frame_types::TYPE1 => { ... }
    frame_types::TYPE2 => { ... }
    _ => { /* invalid frame type */ }
}
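
If you would rather match on an enum than on bare constants, one possible approach is to convert the raw u16 with TryFrom. This is only a sketch: the FrameType enum, its variants, and the describe helper are hypothetical names, and the values 1 and 2 simply mirror the constants above.

```rust
use std::convert::TryFrom;

/// Hypothetical set of legal frame types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FrameType {
    Type1,
    Type2,
}

impl TryFrom<u16> for FrameType {
    /// On failure, hand back the unrecognized raw value.
    type Error = u16;

    fn try_from(raw: u16) -> Result<Self, Self::Error> {
        match raw {
            1 => Ok(FrameType::Type1),
            2 => Ok(FrameType::Type2),
            other => Err(other),
        }
    }
}

/// Example of dispatching on the converted type.
fn describe(raw: u16) -> &'static str {
    match FrameType::try_from(raw) {
        Ok(FrameType::Type1) => "type 1",
        Ok(FrameType::Type2) => "type 2",
        Err(_) => "invalid frame type",
    }
}
```

The benefit is that once the conversion has succeeded, later matches on FrameType are checked for exhaustiveness by the compiler, so adding a new variant flags every place that needs to handle it.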

Thanks @alice !
This definitely looks like a solid solution.

I guess my next question would be: what should I do if my structure contains more fields than just type and length? E.g. let's say I've got 10 fields of different types. I could follow this same example, but parsing 10 fields this way seems a little cumbersome. Is there a nicer way to "quickly" turn this into a struct, or some other equivalent?

Thanks again!

Well, you could define a utility to read various data types from the byte array. For example:

use std::io::{Result, ErrorKind};

// A cursor over a byte slice; each read consumes bytes from the front.
struct BufferParser<'a> {
    data: &'a [u8],
}

impl<'a> BufferParser<'a> {
    fn new(data: &'a [u8]) -> Self {
        Self { data }
    }

    // Reads a big-endian u32 and advances past it.
    fn next_u32(&mut self) -> Result<u32> {
        if self.data.len() < 4 {
            Err(ErrorKind::UnexpectedEof.into())
        } else {
            let mut u32_data = [0u8; 4];
            u32_data.copy_from_slice(&self.data[..4]);
            self.data = &self.data[4..];
            Ok(u32::from_be_bytes(u32_data))
        }
    }
}

Then you can read the data like this:

let mut parser = BufferParser::new(&frame);
let my_struct = MyStruct {
    field1: parser.next_u32()?,
    field2: parser.next_u32()?,
    field3: parser.next_u32()?,
};

Rust evaluates the field expressions of a struct literal in the order they are written, so the next_u32 calls consume the buffer in exactly that order.
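
For fields of different widths, the same idea can be generalized with one helper that splits off N bytes, plus a thin method per integer type. The following is a minimal sketch using only the standard library; FieldReader, Record, parse_record and the field layout are all hypothetical, and big-endian encoding is assumed throughout.

```rust
use std::io::{Error, ErrorKind, Result};

/// Hypothetical cursor over a payload that supports several integer widths.
struct FieldReader<'a> {
    data: &'a [u8],
}

impl<'a> FieldReader<'a> {
    fn new(data: &'a [u8]) -> Self {
        Self { data }
    }

    /// Splits off the next N bytes as an array, or fails if too few remain.
    fn take<const N: usize>(&mut self) -> Result<[u8; N]> {
        if self.data.len() < N {
            return Err(Error::from(ErrorKind::UnexpectedEof));
        }
        let (head, rest) = self.data.split_at(N);
        self.data = rest;
        let mut out = [0u8; N];
        out.copy_from_slice(head);
        Ok(out)
    }

    fn next_u8(&mut self) -> Result<u8> {
        Ok(self.take::<1>()?[0])
    }
    fn next_u16(&mut self) -> Result<u16> {
        Ok(u16::from_be_bytes(self.take::<2>()?))
    }
    fn next_u32(&mut self) -> Result<u32> {
        Ok(u32::from_be_bytes(self.take::<4>()?))
    }
    fn next_u64(&mut self) -> Result<u64> {
        Ok(u64::from_be_bytes(self.take::<8>()?))
    }
}

/// Made-up record layout: u8, u16, u32, u64, in that order on the wire.
#[derive(Debug, PartialEq)]
struct Record {
    version: u8,
    flags: u16,
    id: u32,
    timestamp: u64,
}

fn parse_record(payload: &[u8]) -> Result<Record> {
    let mut r = FieldReader::new(payload);
    Ok(Record {
        version: r.next_u8()?,
        flags: r.next_u16()?,
        id: r.next_u32()?,
        timestamp: r.next_u64()?,
    })
}
```

With this shape, a 10-field struct is ten one-line reads, and a payload that is too short produces an UnexpectedEof error at whichever field runs out of bytes.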

There are crates that can provide something like the utility above. For example, the Buf trait from the bytes crate can do it.