Struct padding rules in Rust

Recently, while learning about type layout in Rust (Type layout - The Rust Reference), I saw that structs support the #[repr(C)] attribute, so I wanted to see the difference between the default (Rust) representation and the C representation. Here is the code:

use type_layout::TypeLayout;

#[derive(TypeLayout)]
struct ACG1 {
    time1: u16, // 2
    time2: u16, // 2
    upper: u32, // 4
    lower: u16, // 2
}

#[derive(TypeLayout)]
#[repr(C)]
struct ACG2 {
    time1: u16, // 2
    time2: u16, // 2
    upper: u32, // 4
    lower: u16, // 2
}

fn main() {
    println!("ACG1: {}", ACG1::type_layout());
    println!("ACG2: {}", ACG2::type_layout());
}

and I get the following output:


I understand the padding rules and the size of the struct under the C representation. What confuses me is struct ACG1 under the Rust representation: I can't find any clear documentation on Rust's padding rules. I would expect the padding to be included in the overall size of the struct, so why is the size of ACG1 only 12 bytes?

BTW, this is the crate I used to print the layout of the struct: https://crates.io/crates/type-layout

The padding in Rust structs is not only unspecified, it is not even guaranteed to be stable across compilations.
That said, type-layout internally uses the memoffset crate to calculate the offsets, which constructs a pointer to the type and uses that to calculate each offset. I suspect this introduces problems, but I cannot put my finger on where.
Edit: Look at the answer from Hyeonu.

The clear documentation is here.

The memory layout of a struct is undefined by default to allow for compiler optimizations like field reordering, but it can be fixed with the repr attribute.

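In other words, the default layout is unspecified: code that relies on any observation of it is not guaranteed to keep working, and unsafe code that assumes a particular layout can end up in UB.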

Also, it seems the result you got from type-layout is incorrect.
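For what it's worth, the total size and alignment of a type can be queried with std::mem::size_of and std::mem::align_of; it is the field order and padding inside a default-representation struct that carry no guarantee. A minimal sketch reusing the two structs from the question. Only the numbers for ACG2 are something to rely on; whatever it prints for ACG1 is just what this particular build chose.

```rust
use std::mem::{align_of, size_of};

// Same fields as in the question, default (Rust) representation:
// the compiler is free to reorder the fields.
#[allow(dead_code)]
struct ACG1 {
    time1: u16,
    time2: u16,
    upper: u32,
    lower: u16,
}

// Same fields, C representation: declaration order is kept.
#[allow(dead_code)]
#[repr(C)]
struct ACG2 {
    time1: u16,
    time2: u16,
    upper: u32,
    lower: u16,
}

fn main() {
    // Size and alignment as chosen by this compiler for this build.
    println!("ACG1: size {}, align {}", size_of::<ACG1>(), align_of::<ACG1>());
    // Size and alignment that follow from the C layout rules.
    println!("ACG2: size {}, align {}", size_of::<ACG2>(), align_of::<ACG2>());
}
```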

Well that clears it up.

@RedDocMD @Hyeonu
Thank you both very much for your answers. What I actually want to know is what rules Rust's compiler uses to lay out a struct. My previous language was C++, so I'm trying to figure out something more fundamental. But when I check Rust's documentation, the most detailed explanation I can find is that the layout of a struct is unstable and unobservable, which I find a bit difficult to accept :frowning:
I used type-layout because I couldn't find any official tools. It seems to me that Rust's compiler shouldn't hide this information from developers; rather, it should always provide a way to observe it, even if the result may be different each time.

It's okay, you'll get used to it.

It does - that's what #[repr(C)] is for - when you need to depend on the struct layout.
When you don't need to, you don't care, and Rust does whatever it finds convenient and thinks will do the task.
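To make the #[repr(C)] case concrete, here is a rough sketch of the rule it follows: fields stay in declaration order, each field is placed at the next offset that is a multiple of its alignment, and the total size is rounded up to the largest field alignment. The helpers align_up, c_layout and actual_offsets are names made up for this example, not part of any library, and the sketch only covers plain structs like the one in the question.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

#[repr(C)]
struct ACG2 {
    time1: u16,
    time2: u16,
    upper: u32,
    lower: u16,
}

/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Apply the C layout rule to (size, align) pairs given in declaration order.
/// Returns the predicted field offsets and the total struct size.
fn c_layout(fields: &[(usize, usize)]) -> (Vec<usize>, usize) {
    let mut offset = 0;
    let mut struct_align = 1;
    let mut offsets = Vec::new();
    for &(size, align) in fields {
        offset = align_up(offset, align); // padding before the field, if any
        offsets.push(offset);
        offset += size;
        struct_align = struct_align.max(align);
    }
    (offsets, align_up(offset, struct_align)) // trailing padding, if any
}

/// Offsets of the real fields, measured on an actual value.
fn actual_offsets() -> Vec<usize> {
    let v = ACG2 { time1: 0, time2: 0, upper: 0, lower: 0 };
    let base = addr_of!(v) as usize;
    vec![
        addr_of!(v.time1) as usize - base,
        addr_of!(v.time2) as usize - base,
        addr_of!(v.upper) as usize - base,
        addr_of!(v.lower) as usize - base,
    ]
}

fn main() {
    let fields = [
        (size_of::<u16>(), align_of::<u16>()),
        (size_of::<u16>(), align_of::<u16>()),
        (size_of::<u32>(), align_of::<u32>()),
        (size_of::<u16>(), align_of::<u16>()),
    ];
    let (offsets, size) = c_layout(&fields);
    println!("predicted: offsets {:?}, size {}", offsets, size);
    println!("actual:    offsets {:?}, size {}", actual_offsets(), size_of::<ACG2>());
}
```

The same prediction cannot be made for a struct without #[repr(C)], because nothing fixes the order in which the fields are placed.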

The slippery slope you risk there is that people will start to depend on implementation details, often rationalizing it as "it's okay because this is just a toy library and I don't care if it breaks", or "I really need to do X, so I'll use unsafe to calculate layouts anyway".

To quote Hyrum's Law:

With a sufficient number of users of an API,
it does not matter what you promise in the contract:
all observable behaviors of your system
will be depended on by somebody.

By not providing any official tools for inspecting the layout of a #[repr(Rust)] type, you make it simpler for people to follow the correct path of using #[repr(C)].

The result of type-layout is indeed wrong; it only works for structs whose representation is C-like. See: Enforce repr(C)? · Issue #4 · LPGhatguy/type-layout · GitHub

I have found some useful links:
1. Optimizing Rust Struct Size: A 6-month Compiler Development Project | Blindly Coding
2. struct field reordering and optimization by ahicks92 · Pull Request #37429 · rust-lang/rust · GitHub
These articles explain why and how Rust does this; anyone interested in this issue may find useful information there :thinking:

Looking at how type_layout is implemented, in particular type-layout/lib.rs at aacc3f053ee6cce278a723dcade584e55d8f91c7 · LPGhatguy/type-layout · GitHub, I think it works only if the fields are not reordered. Reordering is exactly what happens for ACG1: upper is moved before time1 and time2. So what's happening is:

  • it sees time1 at offset 4 (because upper comes before it) and thinks there are 4 bytes of padding in front of it (wrong!)
  • it sees time2 after time1 (correct)
  • it sees upper after time2 (wrong, it's before time1!) and resets the last offset to byte 4 (while it should be at byte 8, a difference of 4 bytes)
  • it sees lower after upper, sees the 4-byte difference and thinks it's padding (wrong)
  • finally it correctly sees 2 padding bytes after lower
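The steps above can be replayed with a small sketch. To be clear, this is a hypothetical reimplementation of a declaration-order walk written for illustration, not the crate's actual code, and naive_layout is a made-up name. It is fed the real offsets of ACG1 (upper at 0, time1 at 4, time2 at 6, lower at 8) in declaration order and reports every gap it thinks it sees as padding.

```rust
/// One row of a printed layout: offset, label, size in bytes.
type Row = (usize, String, usize);

/// Hypothetical declaration-order walk (NOT the crate's real code).
/// `fields` holds (name, real offset, size) in *declaration* order.
fn naive_layout(fields: &[(&str, usize, usize)], struct_size: usize) -> Vec<Row> {
    let mut rows = Vec::new();
    let mut last_end = 0;
    for &(name, offset, size) in fields {
        // Any gap between the previous field's end and this field is
        // reported as padding, even when another field lives there.
        if offset > last_end {
            rows.push((last_end, "[padding]".to_string(), offset - last_end));
        }
        rows.push((offset, name.to_string(), size));
        // The cursor follows the current field, even if that moves it backwards.
        last_end = offset + size;
    }
    // Whatever is left up to the struct size is reported as trailing padding.
    if struct_size > last_end {
        rows.push((last_end, "[padding]".to_string(), struct_size - last_end));
    }
    rows
}

fn main() {
    // Real offsets of ACG1 as discussed in this thread, in declaration order.
    let acg1: [(&str, usize, usize); 4] =
        [("time1", 4, 2), ("time2", 6, 2), ("upper", 0, 4), ("lower", 8, 2)];
    for (offset, name, size) in naive_layout(&acg1, 12) {
        println!("{:>2} {:<9} {}", offset, name, size);
    }
}
```

In this sketch the reported rows add up to 20 bytes for a 12-byte struct, which is the kind of mismatch that makes the padding look like it isn't counted in the size.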

The correct layout should be:

ACG1: ACG1 (size 12, alignment 4)
| Offset | name      | Size |
| ------ | --------- | ---- |
| 0      | upper     | 4    |
| 4      | time1     | 2    |
| 6      | time2     | 2    |
| 8      | lower     | 2    |
| 10     | [padding] | 2    |
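A table like this can also be checked on your own toolchain. One way, as a sketch and not an official tool, is to take the address of each field of a real value and sort the fields by offset instead of assuming declaration order. acg1_fields is a made-up helper for this example, and whatever it prints only describes the build that produced it.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

// Default (Rust) representation, same fields as in the question.
struct ACG1 {
    time1: u16,
    time2: u16,
    upper: u32,
    lower: u16,
}

/// (name, offset, size) for every field, sorted by the offset this build chose.
fn acg1_fields() -> Vec<(&'static str, usize, usize)> {
    let v = ACG1 { time1: 0, time2: 0, upper: 0, lower: 0 };
    let base = addr_of!(v) as usize;
    let mut fields = vec![
        ("time1", addr_of!(v.time1) as usize - base, size_of::<u16>()),
        ("time2", addr_of!(v.time2) as usize - base, size_of::<u16>()),
        ("upper", addr_of!(v.upper) as usize - base, size_of::<u32>()),
        ("lower", addr_of!(v.lower) as usize - base, size_of::<u16>()),
    ];
    // Sorting by offset is what makes the padding calculation valid
    // when the compiler has reordered the fields.
    fields.sort_by_key(|&(_, offset, _)| offset);
    fields
}

fn main() {
    println!("ACG1 (size {})", size_of::<ACG1>());
    let mut end = 0;
    for (name, offset, size) in acg1_fields() {
        if offset > end {
            println!("{:>2} {:<9} {}", end, "[padding]", offset - end);
        }
        println!("{:>2} {:<9} {}", offset, name, size);
        end = offset + size;
    }
    if size_of::<ACG1>() > end {
        println!("{:>2} {:<9} {}", end, "[padding]", size_of::<ACG1>() - end);
    }
}
```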

Edit: typo

Yes! After I ran cargo rustc -- -Zprint-type-sizes, I got the same output as you wrote :slightly_smiling_face:
