Confused by monomorphizations and memory layout

I am reading the repr(Rust) section of the Rustonomicon, and I am finding it hard to get the point that is being made regarding monomorphizations and memory layout.

This is what I understand thus far.

In that section it says

By default, composite structures have an alignment equal to the maximum of their fields' alignments. Rust will consequently insert padding where necessary to ensure that all fields are properly aligned and that the overall type's size is a multiple of its alignment.

Then it gave this as an example

struct A {
    a: u8,
    b: u32,
    c: u16,
}

So the overall type size should be a multiple of 32, right? Given that the largest field is 32 bits. It then gave examples of how this type can be laid out so that this holds true. The first example was

struct A {
    a: u8,
    _pad1: [u8; 3], // to align `b`
    b: u32,
    c: u16,
    _pad2: [u8; 2], // to make overall size multiple of 4
}

Which eventually gives 8 + 24 + 32 + 16 + 16 = 96 and 96 is a multiple of 32, so it checks out.

But if we sum the sizes of the individual fields, we get 8 + 32 + 16 = 56, which is 8 less than 64, and 64 is also a multiple of 32. So another way to do this is to pad with just 8. And this was given as the second example

struct A {
    b: u32,
    c: u16,
    a: u8,
    _pad: u8,
}

Up till now, this sort of makes sense.

Then it introduced the interplay of monomorphizations and memory layout by giving the generic struct

struct Foo<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

and makes the argument that struct Foo<u16, u32> is already aligned while struct Foo<u32, u16> is not.

So for struct Foo<u16, u32> that would be

struct Foo<u16, u32> {
    count: u16,
    data1: u16,
    data2: u32,
}

Largest field is 32, so total size should be a multiple of 32, and this is the case with 16+16+32 = 64.

While it says struct Foo<u32, u16> would have to be padded this way

struct Foo<u32, u16> {
    count: u16,
    _pad1: u16,
    data1: u32,
    data2: u16,
    _pad2: u16,
}

and the book makes the case that this wastes space. But I don't get why it even needs to be padded this way? Given that struct Foo<u32, u16> is

struct Foo<u32, u16> {
    count: u16,
    data1: u32,
    data2: u16,
}

This also adds up to 64, a multiple of 32. So why does struct Foo<u32, u16> require padding but struct Foo<u16, u32> does not?

Alignment of the fields. Here's the first two paragraphs from your link:

First and foremost, all types have an alignment specified in bytes. The alignment of a type specifies what addresses are valid to store the value at. A value with alignment n must only be stored at an address that is a multiple of n. So alignment 2 means you must be stored at an even address, and 1 means that you can be stored anywhere. Alignment is at least 1, and always a power of 2.

Primitives are usually aligned to their size, although this is platform-specific behavior.
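If you want to see these numbers on your own machine, here is a small sketch using std::mem::align_of and std::mem::size_of. The helper is_aligned_for is just a name I made up for illustration, and the printed alignments are whatever your target uses (commonly 1, 2 and 4 for u8, u16 and u32, but the language does not fix that).

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: true if `addr` is a valid address for a `T`,
/// i.e. if it is a multiple of `T`'s alignment.
fn is_aligned_for<T>(addr: usize) -> bool {
    addr % align_of::<T>() == 0
}

fn main() {
    // Alignment is at least 1 and always a power of two.
    assert!(align_of::<u8>().is_power_of_two());
    assert!(align_of::<u16>().is_power_of_two());
    assert!(align_of::<u32>().is_power_of_two());

    // Platform-specific values; primitives are usually aligned to their size.
    println!("u8:  size {}, align {}", size_of::<u8>(), align_of::<u8>());
    println!("u16: size {}, align {}", size_of::<u16>(), align_of::<u16>());
    println!("u32: size {}, align {}", size_of::<u32>(), align_of::<u32>());

    // A live value always sits at an address that satisfies its alignment.
    let x: u32 = 7;
    let addr = &x as *const u32 as usize;
    assert!(is_aligned_for::<u32>(addr));
}
```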

And compare

// alignment of u16: 2
// alignment of u32: 4
// alignment of `Foo` = max(2, 4) = 4
struct Foo<u32, u16> { // Address mod 4     Address mod 2
    count: u16,        // 0                 0
    data1: u32,        // 2 (!)             0
    data2: u16,        // 2                 0
} // EndOfStruct       // 0                 0

struct Foo<u32, u16> { // Address mod 4     Address mod 2
    count: u16,        // 0                 0
    _pad1: u16,        // 2                 0
    data1: u32,        // 0                 0
    data2: u16,        // 0                 0
    _pad2: u16,        // 2                 0
} // EndOfStruct       // 0                 0

Due to their alignments, a u32 cares about its address mod 4 and a u16 cares about its address mod 2. In the first layout of Foo, data1 is not properly aligned. Everything would work if data1 and data2 were swapped though:

struct Foo<u32, u16> { // Address mod 4     Address mod 2
    count: u16,        // 0                 0
    data2: u16,        // 2                 0
    data1: u32,        // 0                 0
} // EndOfStruct       // 0                 0

But the point of the example in the Nomicon was what would happen "if Rust didn't reorder fields", and it concludes that

An optimal use of space requires different monomorphizations to have different field orderings.
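To check the two fixed orderings yourself, one option is to pin the field order with #[repr(C)], which lays fields out in declaration order. This is only a sketch: DeclOrder and Swapped are names I picked for the two orderings of Foo<u32, u16>, and the sizes in the comments assume a target where u32 has alignment 4.

```rust
use std::mem::{align_of, size_of};

// `#[repr(C)]` keeps fields in declaration order, which stands in for
// "Rust without field reordering" in this experiment.
#[allow(dead_code)]
#[repr(C)]
struct DeclOrder {
    count: u16,
    data1: u32, // needs 2 bytes of padding in front of it
    data2: u16, // followed by 2 bytes of tail padding
}

#[allow(dead_code)]
#[repr(C)]
struct Swapped {
    count: u16,
    data2: u16,
    data1: u32, // already at offset 4, no padding needed
}

fn main() {
    // On targets where u32 has alignment 4: size 12, align 4.
    println!(
        "DeclOrder: size {}, align {}",
        size_of::<DeclOrder>(),
        align_of::<DeclOrder>()
    );
    // On the same targets: size 8, align 4.
    println!(
        "Swapped:   size {}, align {}",
        size_of::<Swapped>(),
        align_of::<Swapped>()
    );
}
```

Same three fields, but the declaration-order version carries 4 bytes of padding and the swapped one carries none.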

Maybe here's a better way to visualize it.

       +---+---+---+---+---+---+---+---+---+---+---+---+
bytes  |   |   |   |   |   |   |   |   |   |   |   |   |
       +---+---+---+---+---+---+---+---+---+---+---+---+
mod 2: 0   1   0   1   0   1   0   1   0   1   0   1   0
       +-------+-------+---------------+-------+-------+
Foo:   | count | _pad1 | data1         | data2 | _pad2 |
       +-------+-------+---------------+-------+-------+
mod 4: 0   1   2   3   0   1   2   3   0   1   2   3   0

The u16 fields have to start and end on 0 mod 2; the u32 and Foo itself need to start and end on 0 mod 4.
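The rule in the diagram can also be written out as a tiny calculation. The sketch below is not how rustc is implemented; it just applies "round each field's offset up to its alignment, then round the total up to the struct's alignment" to a list of (size, align) pairs in a fixed order. The function names round_up and layout_in_order are made up for this example.

```rust
/// Rounds `offset` up to the next multiple of `align`.
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) / align * align
}

/// Lays out fields in the given order. Each field is `(size, align)` in bytes.
/// Returns every field's offset plus the total size including tail padding.
fn layout_in_order(fields: &[(usize, usize)]) -> (Vec<usize>, usize) {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut offset = 0;
    let mut struct_align = 1;
    for &(size, align) in fields {
        // Insert padding so the field starts at a multiple of its alignment.
        offset = round_up(offset, align);
        offsets.push(offset);
        offset += size;
        // The struct's alignment is the maximum of its fields' alignments.
        struct_align = struct_align.max(align);
    }
    // Tail padding makes the total size a multiple of the struct's alignment.
    (offsets, round_up(offset, struct_align))
}

fn main() {
    // count: u16, data1: u32, data2: u16 (the order in the diagram)
    let (offsets, size) = layout_in_order(&[(2, 2), (4, 4), (2, 2)]);
    println!("declared order: offsets {:?}, size {}", offsets, size); // [0, 4, 8], 12

    // count: u16, data2: u16, data1: u32 (data1 and data2 swapped)
    let (offsets, size) = layout_in_order(&[(2, 2), (2, 2), (4, 4)]);
    println!("swapped order:  offsets {:?}, size {}", offsets, size); // [0, 2, 4], 8
}
```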

Unlike repr(C), repr(Rust) does not guarantee any field order, which means the compiler has the freedom to optimize field placement to minimize the necessary alignment padding.
This is done separately for each concrete set of generic arguments, i.e. for each monomorphization.
This also means that Foo<u32, [u8; 4]> and Foo<u32, [NonZeroU8; 4]> may end up having different layouts, because they're different concrete types, which means you can't reliably transmute between them.
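You can ask the compiler what it picked for each monomorphization with size_of and align_of. A minimal sketch, using the Foo from the question with the default repr: since repr(Rust) makes no layout promises, the printed sizes are only what your compiler happens to choose. I would expect current rustc to report 8 bytes for both, because it is free to reorder the fields of each instantiation, but that is an observation and not a guarantee.

```rust
use std::mem::{align_of, size_of};

// Default repr(Rust): the compiler may order the fields differently
// for each concrete choice of `T` and `U`.
#[allow(dead_code)]
struct Foo<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

fn main() {
    println!(
        "Foo<u16, u32>: size {}, align {}",
        size_of::<Foo<u16, u32>>(),
        align_of::<Foo<u16, u32>>()
    );
    println!(
        "Foo<u32, u16>: size {}, align {}",
        size_of::<Foo<u32, u16>>(),
        align_of::<Foo<u32, u16>>()
    );

    // What does always hold: the size is a multiple of the alignment.
    assert_eq!(size_of::<Foo<u32, u16>>() % align_of::<Foo<u32, u16>>(), 0);
}
```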
