What would be the best strategy to represent different types of network messages?

I am writing a UDS server that receives messages of unspecified (but bounded) size.
The first byte of a message indicates the message group, the second byte indicates the subgroup, and the subsequent bytes are optional. What is the best way to represent this in the Rust type system?

Additional Information:

  • It is designed for an embedded system, so it is no_std, and I am also trying to make it no_alloc if possible.
  • I looked into smoltcp; it represents each message field as a separate, independent const range, like
mod Amsg {
    use core::ops::Range;
    // Byte ranges of each field inside the raw message buffer.
    const FieldA: Range<usize> = 0..2;
    const FieldB: Range<usize> = 2..4;
}

The message is then represented as a plain &[u8], and fields are extracted or set via separate methods. But I feel this is more error-prone.

So I tried to represent the message as

#[repr(packed)]
struct Amsg{
    fieldA: u8,
    fieldB: u8,
}

First I check the ID of the message, and based on that I transmute the buffer to the given type. But the Rust book suggests that repr(packed) should be avoided.

Is there a better way to interpret an arriving message and respond (call a handler) based on the type of message received?

Why do you need repr(packed) here?

So that I can directly transmute the received data.

smoltcp defines an "abstract representation" type and a "raw bytes" type for each supported protocol. See the documentation of the wire module.

Take Ethernet headers for example: EthernetFrame wraps a raw byte slice and provides methods to read (decode) and write (encode) individual fields on the fly, meaning that each time you call the methods, raw bytes are read or written. The constant ranges you mentioned serve this purpose: they are the offsets into the raw bytes.
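To make the wrapper idea concrete, here is a minimal sketch in the same spirit. The names (Frame, field::GROUP, field::LENGTH), the 4-byte header and the big-endian length field are all assumptions made up for illustration; this is not smoltcp's actual code or layout.

```rust
// Hypothetical field offsets for a 4-byte header (assumed layout).
mod field {
    use core::ops::Range;
    pub const GROUP: usize = 0;
    pub const SUBGROUP: usize = 1;
    pub const LENGTH: Range<usize> = 2..4;
}

/// Thin wrapper over raw bytes; every accessor decodes on the fly.
pub struct Frame<T: AsRef<[u8]>> {
    buffer: T,
}

impl<T: AsRef<[u8]>> Frame<T> {
    pub const HEADER_LEN: usize = field::LENGTH.end;

    /// Checks the length once so the accessors below cannot panic.
    pub fn new_checked(buffer: T) -> Option<Self> {
        if buffer.as_ref().len() < Self::HEADER_LEN {
            None
        } else {
            Some(Frame { buffer })
        }
    }

    pub fn group(&self) -> u8 {
        self.buffer.as_ref()[field::GROUP]
    }

    pub fn subgroup(&self) -> u8 {
        self.buffer.as_ref()[field::SUBGROUP]
    }

    /// Reads the length field, assumed here to be big-endian.
    pub fn length(&self) -> u16 {
        let bytes = &self.buffer.as_ref()[field::LENGTH];
        u16::from_be_bytes([bytes[0], bytes[1]])
    }
}

impl<T: AsRef<[u8]> + AsMut<[u8]>> Frame<T> {
    /// Writes the length field back into the raw bytes.
    pub fn set_length(&mut self, value: u16) {
        self.buffer.as_mut()[field::LENGTH].copy_from_slice(&value.to_be_bytes());
    }
}
```

Each offset is written exactly once, in the field module, so the accessors cannot drift apart from each other. No allocation is involved, and only core is used.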

Accessing the raw bytes directly is not efficient, for reasons like alignment and endianness, and the bytes might contain malformed data. The high-level abstract type EthernetRepr, on the other hand, is parsed from the raw bytes and always holds a valid packet. The protocols typically work with the high-level representation and only convert to low-level raw bytes when necessary.

For egress traffic, the protocol starts with a Repr and does the protocol work (padding, checksum, fragmentation, routing, etc.). Only when all the work is done is the Repr emitted into low-level bytes and sent to the network interface.

For ingress traffic, the protocol checks the validity of the incoming data, converts the good packets into a Repr (together with its raw payload bytes), and passes the Repr with the payload to the next layer for processing.

No packed struct is ever used in the whole process. Packed structs should be avoided for most use cases, and you can always work with the raw bytes if you need to parse a raw network packet.
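Applied to the group/subgroup format from the question, the Repr idea could look roughly like the sketch below. The message kinds, the byte values 0x01/0x02 and the big-endian rpm field are invented for illustration; only the "first byte = group, second byte = subgroup, rest optional" shape comes from the question.

```rust
/// High-level representation: only valid messages can be constructed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Message<'a> {
    /// group 0x01, subgroup 0x00, no payload (hypothetical)
    Ping,
    /// group 0x01, subgroup 0x01, optional payload bytes (hypothetical)
    Echo { payload: &'a [u8] },
    /// group 0x02, subgroup 0x00, one big-endian u16 (hypothetical)
    SetSpeed { rpm: u16 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Truncated,
    BadLength,
    UnknownType { group: u8, subgroup: u8 },
}

/// Writes group, subgroup and body into `out`; None if `out` is too small.
fn write(out: &mut [u8], group: u8, subgroup: u8, body: &[u8]) -> Option<usize> {
    let len = 2 + body.len();
    let dst = out.get_mut(..len)?;
    dst[0] = group;
    dst[1] = subgroup;
    dst[2..].copy_from_slice(body);
    Some(len)
}

impl<'a> Message<'a> {
    /// Ingress: validate the raw bytes once and borrow the payload.
    pub fn parse(buf: &'a [u8]) -> Result<Self, ParseError> {
        match buf {
            [0x01, 0x00] => Ok(Message::Ping),
            [0x01, 0x01, payload @ ..] => Ok(Message::Echo { payload }),
            [0x02, 0x00, hi, lo] => Ok(Message::SetSpeed {
                rpm: u16::from_be_bytes([*hi, *lo]),
            }),
            // Known type, but the number of trailing bytes is wrong.
            [0x01, 0x00, ..] | [0x02, 0x00, ..] => Err(ParseError::BadLength),
            [group, subgroup, ..] => Err(ParseError::UnknownType {
                group: *group,
                subgroup: *subgroup,
            }),
            _ => Err(ParseError::Truncated),
        }
    }

    /// Egress: emit into a caller-provided buffer, returning the bytes written.
    pub fn emit(&self, out: &mut [u8]) -> Option<usize> {
        match *self {
            Message::Ping => write(out, 0x01, 0x00, &[]),
            Message::Echo { payload } => write(out, 0x01, 0x01, payload),
            Message::SetSpeed { rpm } => write(out, 0x02, 0x00, &rpm.to_be_bytes()),
        }
    }
}
```

A handler can then match on Message and let the compiler check that every variant is covered. Nothing here needs alloc, transmute or a packed struct.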

I still don't understand what you hope to achieve with repr(packed). How much do you understand about alignment and data layout? repr(packed) only affects how the data in the struct is laid out.
For example

#[repr(C)]
struct Example {
    a: u8,
    b: u16,
}

will have an alignment of 2 bytes because its largest field, b, has an alignment of 2 bytes (16 bits = 2 bytes). Furthermore, there will be 1 byte of padding between a and b so that b is correctly aligned to 2 bytes. Plotted by address modulo 2, it looks like this:

0 mod 2    field a
1 mod 2    padding
0 mod 2    field b
1 mod 2    field b

I used repr(C) because the default Rust layout is not defined, and Rust can optimize the layout as it sees fit.

If you use packed, like

#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u16,
}

It will be laid out without padding, and the alignment is set to 1.

0 mod 1    field a
0 mod 1    field b
0 mod 1    field b
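Both layouts can be checked at compile time. A small sketch, assuming a compiler recent enough to allow assert! in const context; the struct names are the ones used above:

```rust
use core::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
struct Example {
    a: u8,
    b: u16,
}

#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u16,
}

// The build fails if any of these layout claims is wrong.
const _: () = assert!(size_of::<Example>() == 4); // a + 1 padding byte + b
const _: () = assert!(align_of::<Example>() == 2);
const _: () = assert!(size_of::<Packed>() == 3); // no padding
const _: () = assert!(align_of::<Packed>() == 1);
```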

This is super error-prone because creating a reference to an unaligned field and dereferencing it with * is undefined behaviour (UB), so you need to do stuff like
core::ptr::read_unaligned(core::ptr::addr_of!(packed.b)).
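A minimal sketch of reading and writing such a field without ever forming a reference to it. The function names are made up. As far as I know, copying a Copy field out of a packed struct by value is also fine, because the compiler emits an unaligned load for it, while current compilers reject &p.b outright (error E0793).

```rust
use core::ptr;

#[repr(C, packed)]
pub struct Packed {
    pub a: u8,
    pub b: u16,
}

/// Copies the field out by value; no reference to `b` is created.
pub fn read_b_by_value(p: &Packed) -> u16 {
    p.b
}

/// Raw-pointer form: `addr_of!` yields a pointer without an intermediate `&u16`.
pub fn read_b_via_ptr(p: &Packed) -> u16 {
    // SAFETY: the pointer is derived from a live `Packed`, and
    // `read_unaligned` has no alignment requirement.
    unsafe { ptr::read_unaligned(ptr::addr_of!(p.b)) }
}

/// A by-value store into the packed field.
pub fn write_b(p: &mut Packed, value: u16) {
    p.b = value;
}
```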

And again I don't see how this will help you with your problem.
Also

#[repr(C)]
struct Amsg{
    fieldA: u8,
    fieldB: u8,
}

will already not have any padding bytes, so packed is not needed. Also, transmuting data from something like a &[u8] of length 3 to a struct that has fields larger than 1 byte will cause problems because the fields won't be aligned properly, again causing UB as soon as you dereference them. Even just struct.field is UB.

Edit:
That last sentence might not be true.

Yeah, you are right about the performance penalty on unaligned access. I didn't know that a repr(packed) struct can still be laid out in an optimized way, causing problems. I thought that representing the message packet as a struct would be less prone to programming errors, but it has its own disadvantages.

I'm not talking about a performance penalty, although from what I read that's the case too. I'm talking about undefined behaviour.

What did you think repr(packed) does? "Packed" stands for "closely packed together", as in "without padding bytes".

I don't understand this. Help me out. Do you think that

#[repr(packed)]
struct Amsg{
    fieldA: u8,
    fieldB: u8,
}

can somehow hold messages of different size?

"Packed" stands for "closely packed together", as in "without padding bytes".

I knew this, but I didn't know that it requires repr(C) along with it. I also knew that unaligned access is bad, but not so bad that I should always avoid it.

I don't understand this. Help me out.

#[repr(packed)]
struct Amsg{
   fieldA: u8,
   fieldB: u8,
}

can somehow hold messages of different size?

No no. If I have

struct Amsg{
   fieldA: u8,
   fieldB: u8,
   fieldC: u16,
}

without repr(C, packed), I have to specify the range of each field individually and extract it from the &[u8], and in a big struct I can easily get a field's offset wrong. With repr(C, packed) I can take the address of the message and transmute it (with read_unaligned).
Edit: At least that was my plan. That won't work, so I am not going to do that.
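One way to avoid hand-written offsets without a packed struct is a small cursor that advances as it reads, so each field's position follows from the order of the reads. This is only a sketch: Reader and its methods are hypothetical, the fields are renamed to snake_case, and big-endian is assumed for fieldC.

```rust
/// Hypothetical cursor over a byte slice; each read consumes what it returns.
pub struct Reader<'a> {
    rest: &'a [u8],
}

impl<'a> Reader<'a> {
    pub fn new(buf: &'a [u8]) -> Self {
        Reader { rest: buf }
    }

    /// Reads one byte, or None if the buffer is exhausted.
    pub fn u8(&mut self) -> Option<u8> {
        let (&first, rest) = self.rest.split_first()?;
        self.rest = rest;
        Some(first)
    }

    /// Reads two bytes as a big-endian u16 (assumed byte order).
    pub fn u16_be(&mut self) -> Option<u16> {
        Some(u16::from_be_bytes([self.u8()?, self.u8()?]))
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct Amsg {
    pub field_a: u8,
    pub field_b: u8,
    pub field_c: u16,
}

impl Amsg {
    /// Field expressions are evaluated in the order written,
    /// which is made to match the wire order here.
    pub fn parse(buf: &[u8]) -> Option<Self> {
        let mut r = Reader::new(buf);
        Some(Amsg {
            field_a: r.u8()?,
            field_b: r.u8()?,
            field_c: r.u16_be()?,
        })
    }
}
```

The resulting Amsg is an ordinary, naturally aligned struct, so its fields can be borrowed and passed around freely.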

Ahhh, now I get it. Yeah, then AFAIK transmuting to the repr(C, packed) struct should be OK, but I am not entirely sure. Anyway, as I think you understood about 5 messages earlier than I did, you then have a struct that is a nuisance to handle because it has unaligned fields.

I am not 100 percent sure about this, but I think that without repr(C) the order of the fields is still not guaranteed, which makes it useless for your purpose. I concluded this from the following sentence in the Rustonomicon's section about repr(packed):

This repr is a modifier on repr(C) and repr(Rust).
