Crc of a struct

Hello.

I'm quite new to Rust and I'm wondering how I can calculate the CRC of a struct that itself contains a crc field.
Let's say I have struct like this one:
struct Data {
field1 : i32,
field2 : f64,
field3 : u8,
// ...
crc : u16
}
In C you usually pass the address of that struct and then the sizeof(struct Data) - sizeof(Data.crc) and assign that to crc. How is this done in Rust? Should I pass all fields one by one to crc function and then assign sum of those to crc?

Thank you and best regards.

In C you usually pass the address of that struct and then the sizeof(struct Data) - sizeof(Data.crc) and assign that to crc

You can do the same in (unsafe) Rust; however, the result will not be portable (as in: return the same value) between little- and big-endian architectures.

Make sure you use #[repr(C)] because otherwise the layout is not guaranteed and crc may not be the last field.

Should I pass all fields one by one to crc function and then assign sum of those to crc?

That's the only way to do it with safe Rust. Unless there's a strong performance reason to want to CRC the entire structure at once, it's probably preferable.
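A minimal sketch of the field-by-field approach, in case it helps. The CRC variant (CRC-16/CCITT-FALSE, polynomial 0x1021, initial value 0xFFFF), the little-endian field encoding, and the names crc16_update, compute_crc, seal and is_valid are all my assumptions for illustration; use whatever your protocol actually specifies.

```rust
/// Feeds `bytes` into a running CRC-16/CCITT-FALSE value (polynomial 0x1021).
fn crc16_update(mut crc: u16, bytes: &[u8]) -> u16 {
    for &byte in bytes {
        crc ^= u16::from(byte) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 { (crc << 1) ^ 0x1021 } else { crc << 1 };
        }
    }
    crc
}

struct Data {
    field1: i32,
    field2: f64,
    field3: u8,
    crc: u16,
}

impl Data {
    /// CRC over every field except `crc`, each encoded as little-endian bytes
    /// (the byte order is an assumption, pick the one your format requires).
    fn compute_crc(&self) -> u16 {
        let mut crc = 0xFFFF; // initial value for CCITT-FALSE
        crc = crc16_update(crc, &self.field1.to_le_bytes());
        crc = crc16_update(crc, &self.field2.to_le_bytes());
        crc = crc16_update(crc, &self.field3.to_le_bytes());
        crc
    }

    /// Stores the freshly computed CRC in the `crc` field.
    fn seal(&mut self) {
        self.crc = self.compute_crc();
    }

    /// Returns true if the stored CRC matches the current field values.
    fn is_valid(&self) -> bool {
        self.crc == self.compute_crc()
    }
}
```

Because each field is converted with an explicit byte order, the result does not depend on the host's endianness or on how the compiler lays out the struct.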

No, you can't do that in C either, unless there is no padding between the fields. Let's assume we have a C struct like this:

struct Data {
    int32_t field1;
    double field2;
    uint8_t field3;
};

How big would this struct be? 4 + 8 + 1 = 13? No. On typical platforms this struct occupies 24 bytes in memory. Why? Because the compiler inserts padding between fields (and after the last one) to fulfill each field's alignment requirement.

What's alignment? Most types require their address to be a multiple of some value when stored in memory, so they can be processed quickly. For example double, an IEEE 754 double-precision floating-point type, wants its address to be a multiple of 8.

Unless told otherwise, a C compiler lays out a struct so that every field's alignment is satisfied. Let's get back to your struct Data. The struct's own alignment is the largest alignment of its fields, in this case 8 because of double field2. int32_t field1 has alignment 4, so it fits at offset 0 and takes 4 bytes. Now let's pick an offset for field2. Offset 4 can't satisfy its alignment of 8, so we skip the next 4 bytes and place field2 at offset 8. Those 4 skipped bytes are padding. field3 then lands at offset 16, and 7 more bytes of trailing padding round the size up to 24, a multiple of the struct's alignment.
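You can check the layout reasoning above from Rust as well. This is only a sketch: the type names Padded and Packed and the helper layout_report are made up for illustration. On a typical 64-bit target I would expect the repr(C) struct to report size 24 and alignment 8, but the exact numbers depend on the target, so treat them as an assumption.

```rust
use std::mem::{align_of, size_of};

// Same fields as the C struct, laid out with C rules (padding included).
#[repr(C)]
struct Padded {
    field1: i32,
    field2: f64,
    field3: u8,
}

// Same fields again, but with all padding removed.
#[repr(C, packed)]
struct Packed {
    field1: i32,
    field2: f64,
    field3: u8,
}

/// Returns (size, alignment) for the padded and the packed layout, in that order.
fn layout_report() -> [(usize, usize); 2] {
    [
        (size_of::<Padded>(), align_of::<Padded>()),
        (size_of::<Packed>(), align_of::<Packed>()),
    ]
}
```

The packed variant is always 13 bytes with alignment 1; the difference between the two sizes is exactly the padding discussed above.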

According to the C spec, reading padding bytes is the same as reading uninitialized memory, and making a decision based on a value read from uninitialized memory is clearly UB. This means the C code crc(&data, sizeof(Data)) is UB, so the compiler may "optimize" it into faster but incorrect code, quite possibly no code at all.

But what about Rust? It's similar, but worse. Unless the struct is #[repr(C)], the order of its fields is not specified. That means not only that field3 may come before field2 in memory, but also that the order may change between compiler versions.


You can #[derive(Hash)] to automatically generate code to compute a hash of the data.
Then you will need something to compute the CRC (implementing Hasher).

Finally you will need to separate data and hash, as the hash is a function of the data.
A wrapper containing the data and the hash would work.
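Here is a rough sketch of that idea. Crc16Hasher, Checked and crc_of are hypothetical names, and CRC-16/CCITT-FALSE is just an assumed variant. Two caveats: f64 does not implement Hash, so field2 is left out of this Data, and the default Hasher methods feed integers in the host's native byte order, so the value is not portable across architectures. The standard library also does not promise that the exact bytes a derived Hash feeds stay the same across Rust versions, so I would not rely on this for a wire format.

```rust
use std::hash::{Hash, Hasher};

/// A Hasher that computes CRC-16/CCITT-FALSE over everything written to it.
struct Crc16Hasher {
    state: u16,
}

impl Crc16Hasher {
    fn new() -> Self {
        Crc16Hasher { state: 0xFFFF }
    }
}

impl Hasher for Crc16Hasher {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.state ^= u16::from(byte) << 8;
            for _ in 0..8 {
                self.state = if self.state & 0x8000 != 0 {
                    (self.state << 1) ^ 0x1021
                } else {
                    self.state << 1
                };
            }
        }
    }

    fn finish(&self) -> u64 {
        u64::from(self.state)
    }
}

#[derive(Hash)]
struct Data {
    field1: i32,
    field3: u8,
}

/// Wrapper that keeps the data and its CRC side by side.
struct Checked<T> {
    data: T,
    crc: u16,
}

/// Runs the value's Hash implementation through the CRC hasher.
fn crc_of<T: Hash>(value: &T) -> u16 {
    let mut hasher = Crc16Hasher::new();
    value.hash(&mut hasher);
    hasher.finish() as u16
}

impl<T: Hash> Checked<T> {
    fn new(data: T) -> Self {
        let crc = crc_of(&data);
        Checked { data, crc }
    }

    fn is_valid(&self) -> bool {
        self.crc == crc_of(&self.data)
    }
}
```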


You might consider using zerocopy::AsBytes to safely convert a struct to a &[u8]. It will ensure, for example, that your struct has no padding and has a defined representation (such as repr(C)).


For such cases I take digest of the struct encoded with bincode, as it guarantees a stable representation with no uninitialized padding.

Note that bincode can write to a smallvec or any &mut [u8] slice, so the "encoding" can be quite cheap and without heap allocations.


May I ask why you're calculating cyclic redundancy checks? In 2020 they are not used that often in error detection anymore, in favor of proper hash functions.

Is it an intellectual exercise?


@jjpe
No, it is not an exercise. It's embedded code, so it has to be this way. There is a requirement that the last field is the crc. :slight_smile:

@Hyeonu
Cool, but C people already solved this by adding the packed attribute. :slight_smile:

As I've already mentioned, this is a C/C++ project with strict requirements about structure fields etc., but I was wondering if I could add some spice to it by writing my part in Rust, since only functionality counts and nobody will look at my code. :wink: So this is a purely theoretical discussion right now. :slight_smile: I want to see what is out there. :wink:
Thank you for all the replies. I'll take a look at the possible solutions in a moment; right now I need some rest. :slight_smile:


Rust also has a packed repr modifier. See the reference.


In terms of struct layouts, Rust can define pretty much everything that C can so you can definitely write code which will look identical to the rest of the project!

It sounds like you want the #[repr(packed)] specifier as @Riateche said. This is identical to packed in C, and gives you full control over a struct's layout.
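For comparison, this is roughly what the C-style "CRC everything except the last field" looks like with only the standard library. It is a sketch under assumptions: the helpers as_bytes and payload_crc are hypothetical, CRC-16/CCITT-FALSE is an arbitrary choice, and the bytes are in the host's native byte order, so the result is not portable between little- and big-endian machines.

```rust
use std::mem::size_of;

/// CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF).
fn crc16(bytes: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &byte in bytes {
        crc ^= u16::from(byte) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 { (crc << 1) ^ 0x1021 } else { crc << 1 };
        }
    }
    crc
}

// repr(C) fixes the field order, packed removes all padding.
#[repr(C, packed)]
struct Data {
    field1: i32,
    field2: f64,
    field3: u8,
    crc: u16,
}

/// Views the struct as raw bytes.
fn as_bytes(data: &Data) -> &[u8] {
    // SAFETY: `Data` is repr(C, packed) and holds only plain numeric fields,
    // so it has no padding and every one of its bytes is initialized.
    unsafe { std::slice::from_raw_parts(data as *const Data as *const u8, size_of::<Data>()) }
}

/// CRC over everything except the trailing `crc` field.
fn payload_crc(data: &Data) -> u16 {
    let bytes = as_bytes(data);
    crc16(&bytes[..bytes.len() - size_of::<u16>()])
}

impl Data {
    fn new(field1: i32, field2: f64, field3: u8) -> Self {
        let mut data = Data { field1, field2, field3, crc: 0 };
        data.crc = payload_crc(&data);
        data
    }
}
```

One thing to watch with packed structs: you can copy a field out (let stored = data.crc;), but taking a reference to it (&data.crc) is rejected by the compiler because the field may be misaligned.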

I'd also look at the zerocopy::AsBytes trait (as @BurntSushi mentioned) so you can view your data as a bunch of bytes for CRC calculation.

One way you could implement this is by splitting the data and the CRC.

use byteorder::{BigEndian, ByteOrder};
use zerocopy::AsBytes;

// `repr(C)` fixes the field order; `packed` removes padding.
#[repr(C, packed)]
struct Message {
  /// The message's body.
  pub data: Data,
  /// The message's checksum.
  //
  // A 2-byte array is used instead of a `u16` so we can explicitly handle
  // endianness.
  crc: [u8; 2],
}

impl From<Data> for Message {
  fn from(data: Data) -> Message {
    let mut crc = [0; 2];
    // `write_u16` comes from the `ByteOrder` trait, which must be in scope.
    BigEndian::write_u16(&mut crc, data.calculate_crc());

    Message { data, crc }
  }
}

// `AsBytes` can only be derived for types with a defined layout and no
// padding, hence `repr(C, packed)`.
#[derive(AsBytes)]
#[repr(C, packed)]
pub struct Data {
  field1 : i32,
  field2 : f64,
  field3 : u8,
  // ...
}

impl Data {
  pub fn calculate_crc(&self) -> u16 { crc16(self.as_bytes()) }
}

fn crc16(bytes: &[u8]) -> u16 { ... }

You may want to be wary of endianness issues, especially if these structs are being sent across a network to another machine (the main reason you might want a CRC). An f64 or a u16 in your struct will be written with the host machine's endianness, and there's a chance the receiver will read the fields "backwards".
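To make the endianness point concrete, here is a small std-only sketch that pins the byte order explicitly instead of relying on the host. The function names encode_be and decode_be and the 13-byte layout (i32, then f64, then u8, all big-endian) are assumptions for illustration, not anything your protocol mandates.

```rust
/// Encodes the three fields as big-endian bytes: i32, then f64, then u8.
fn encode_be(field1: i32, field2: f64, field3: u8) -> [u8; 13] {
    let mut out = [0u8; 13];
    out[0..4].copy_from_slice(&field1.to_be_bytes());
    out[4..12].copy_from_slice(&field2.to_be_bytes());
    out[12] = field3;
    out
}

/// Decodes the layout produced by `encode_be`, on any host.
fn decode_be(bytes: &[u8; 13]) -> (i32, f64, u8) {
    let mut f1 = [0u8; 4];
    f1.copy_from_slice(&bytes[0..4]);
    let mut f2 = [0u8; 8];
    f2.copy_from_slice(&bytes[4..12]);
    (i32::from_be_bytes(f1), f64::from_be_bytes(f2), bytes[12])
}
```

A CRC computed over the output of encode_be is the same on every machine, which is not true of a CRC computed over the struct's in-memory bytes.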

Abstractions-wise, I would try to get something like:

type Crc = u16; // or [u8; 2], or newtype, or whatever needed

struct Crced<T> {
   data: T,
   crc: Crc
}
plus AsBytes/packed or whatever else is needed. Then you can have nice and generic handling for any struct you like.
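Fleshed out a little, that could look like the sketch below. The WireBytes trait is hypothetical (it stands in for AsBytes or whatever encoding you choose), CRC-16/CCITT-FALSE is an assumed variant, and the Vec is only there for brevity; on an embedded target you would probably write into a fixed buffer instead.

```rust
type Crc = u16;

/// CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF).
fn crc16(bytes: &[u8]) -> Crc {
    let mut crc: Crc = 0xFFFF;
    for &byte in bytes {
        crc ^= u16::from(byte) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 { (crc << 1) ^ 0x1021 } else { crc << 1 };
        }
    }
    crc
}

/// Hypothetical trait: types that can append a stable byte encoding of themselves.
trait WireBytes {
    fn write_bytes(&self, out: &mut Vec<u8>);
}

struct Crced<T> {
    data: T,
    crc: Crc,
}

impl<T: WireBytes> Crced<T> {
    fn checksum(data: &T) -> Crc {
        let mut buf = Vec::new();
        data.write_bytes(&mut buf);
        crc16(&buf)
    }

    fn new(data: T) -> Self {
        let crc = Self::checksum(&data);
        Crced { data, crc }
    }

    fn verify(&self) -> bool {
        self.crc == Self::checksum(&self.data)
    }

    /// Payload bytes followed by the CRC in big-endian order.
    fn to_bytes(&self) -> Vec<u8> {
        let mut buf = Vec::new();
        self.data.write_bytes(&mut buf);
        buf.extend_from_slice(&self.crc.to_be_bytes());
        buf
    }
}

struct Data {
    field1: i32,
    field2: f64,
    field3: u8,
}

impl WireBytes for Data {
    fn write_bytes(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.field1.to_be_bytes());
        out.extend_from_slice(&self.field2.to_be_bytes());
        out.push(self.field3);
    }
}
```

Any type that implements the trait gets the CRC handling for free, and the checksum logic lives in exactly one place.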
