Compact memory layout for struct containing enum

This is very similar to a question posted a few years ago about optimizing the size of a struct containing an enum by storing the discriminant in otherwise unused padding:

For context, I have something like the following, which I want to be 32 bytes in size for cache-efficiency purposes:

// 24 bytes
struct SomeData {
    v: [f32; 6]
}

// 4 bytes + 1 byte for tag -> aligned up to 8 bytes w/ 4-byte alignment
enum AnEnum {
    B1(u32),
    B2(f32)
}

// 35 bytes, 4 byte alignment -> 36
struct MyStruct {
    data: SomeData, // 24
    an_enum: AnEnum, // 8
    short: u16, // 2
    byte: u8 // 1
}

enum MyEnum {
    // each variant is 31 bytes w/ 4-byte alignment
    B1 {
        data: SomeData,
        b1: u32,
        short: u16,
        byte: u8 
    },
    B2 {
        data: SomeData,
        b2: f32,
        short: u16,
        byte: u8 
    }
}

fn main() {
    println!("AnEnum size: {}", std::mem::size_of::<AnEnum>()); // 8
    println!("MyStruct size: {}", std::mem::size_of::<MyStruct>()); // 36
    println!("MyEnum size: {}", std::mem::size_of::<MyEnum>()); // 32
}

In the case of MyEnum, with its duplicated fields, it looks like the discriminant fits into the last otherwise unused byte before the size is rounded up to the alignment.

I'm curious whether there have been any proposals or features introduced that would allow the compiler to produce a more compact representation for MyStruct.

Thanks for reading!

The compiler can't do this, because it must be possible to take a reference to any of the fields, and this sort of reordering would prevent that. Note that with your MyEnum representation you need to check the discriminant to access any of the fields (a minor cost).
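To make the reference argument concrete, here is a minimal sketch. It reuses the type names from the question but drops the `data` field for brevity; the helpers `overwrite` and `short_of` are hypothetical names introduced only for this illustration. A function that receives `&mut AnEnum` knows nothing about the surrounding struct, so the enum has to occupy its own self-contained bytes, while reading a shared field of `MyEnum` has to go through a `match`.

```rust
use std::mem::size_of;

#[derive(Debug, Clone, Copy, PartialEq)]
enum AnEnum {
    B1(u32),
    B2(f32),
}

// Simplified MyStruct: the `data` field from the question is omitted.
struct MyStruct {
    an_enum: AnEnum,
    short: u16,
    byte: u8,
}

// Simplified MyEnum with the shared fields duplicated in each variant.
#[allow(dead_code)]
enum MyEnum {
    B1 { b1: u32, short: u16, byte: u8 },
    B2 { b2: f32, short: u16, byte: u8 },
}

// Hypothetical helper: it only sees an AnEnum and may rewrite every byte
// that belongs to that value, tag included.
fn overwrite(slot: &mut AnEnum) {
    *slot = AnEnum::B2(1.5);
}

// Hypothetical helper: a field shared by all variants is still reached
// through the discriminant.
fn short_of(e: &MyEnum) -> u16 {
    match e {
        MyEnum::B1 { short, .. } | MyEnum::B2 { short, .. } => *short,
    }
}

fn main() {
    let mut s = MyStruct { an_enum: AnEnum::B1(7), short: 42, byte: 9 };

    // Taking a reference to the field is always allowed...
    overwrite(&mut s.an_enum);
    assert_eq!(s.an_enum, AnEnum::B2(1.5));

    // ...and writing through it must leave the neighbouring fields alone.
    assert_eq!((s.short, s.byte), (42, 9));

    assert_eq!(short_of(&MyEnum::B1 { b1: 0, short: 5, byte: 0 }), 5);
    println!("AnEnum occupies {} bytes wherever it is stored", size_of::<AnEnum>());
}
```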

I guess that is possible. For example, in stable Rust size_of::<Option<Option<Option<bool>>>>() == 1

Struct/enum layout is explicitly documented as unstable, and the compiler has been improving it from version to version. For example, in Rust v1.0 size_of::<Option<Option<Option<bool>>>>() == 4
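The sizes quoted above can be checked with a few assertions. This is a small sketch that reflects what recent stable compilers do; apart from the `Option<&T>` case, these sizes are not guaranteed by the language and could change.

```rust
use std::mem::size_of;

fn main() {
    // bool only uses the bit patterns 0 and 1, so the unused patterns of
    // the same byte can encode the `None` cases of each Option layer.
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(size_of::<Option<bool>>(), 1);
    assert_eq!(size_of::<Option<Option<Option<bool>>>>(), 1);

    // u8 uses all 256 bit patterns, so Option<u8> needs a separate tag byte.
    assert_eq!(size_of::<Option<u8>>(), 2);

    // This one is guaranteed: a reference is never null, so Option<&T>
    // has the same size as &T.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());

    println!("all size checks passed");
}
```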

I think I follow your explanation, but could you expand on it a bit? Is it a fundamental limitation that the entire enum appear contiguously in memory, or is it possible that the compiler could break up the memory layout of the enum payload and its discriminant when used as a field inside another struct? Maybe this is a question better asked on Rust Internals?

Thanks for taking the time to answer.

This is the better forum for your question. Rust internals is for discussion of language and tooling enhancements / changes.

In terms of your original question, your definition of AnEnum requires that it be allocated as a unitary subfield of MyStruct. Thus the compiler has no choice but to define an 8 B layout, either representing the enum discriminant in 4 B, or in 1 B with 3 B of pad. Then when you incorporate AnEnum in MyStruct, it has to be loadable and storable without affecting any other fields, in particular without the potential that a store of those 8 B would change the short and byte fields of MyStruct.
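If the 32-byte size matters more than being able to borrow the enum as a field, one workaround is to pack the struct by hand. The following is only a sketch under assumptions: `PackedStruct`, its `payload` and `tag` fields, the tag constants, and the accessor names are all hypothetical and not part of the original code. The payload bits and the tag are stored as separate fields, so the tag lands in the byte that would otherwise be padding, and `AnEnum` is exposed by value only.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SomeData {
    pub v: [f32; 6],
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum AnEnum {
    B1(u32),
    B2(f32),
}

// Hypothetical tag values for the hand-rolled discriminant.
const TAG_B1: u8 = 0;
const TAG_B2: u8 = 1;

/// Hypothetical hand-packed variant of MyStruct. `repr(C)` fixes the field
/// order so the layout below is the one actually used.
#[repr(C)]
#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
pub struct PackedStruct {
    pub data: SomeData, // 24 bytes
    payload: u32,       // 4 bytes: raw bits of the u32 or the f32
    pub short: u16,     // 2 bytes
    pub byte: u8,       // 1 byte
    tag: u8,            // 1 byte, in what would otherwise be padding
}

fn encode(e: AnEnum) -> (u8, u32) {
    match e {
        AnEnum::B1(x) => (TAG_B1, x),
        AnEnum::B2(x) => (TAG_B2, x.to_bits()),
    }
}

impl PackedStruct {
    pub fn new(data: SomeData, an_enum: AnEnum, short: u16, byte: u8) -> Self {
        let (tag, payload) = encode(an_enum);
        Self { data, payload, short, byte, tag }
    }

    /// Rebuilds the enum by value; a `&AnEnum` cannot be handed out because
    /// no contiguous AnEnum exists inside the struct.
    pub fn an_enum(&self) -> AnEnum {
        match self.tag {
            TAG_B1 => AnEnum::B1(self.payload),
            _ => AnEnum::B2(f32::from_bits(self.payload)),
        }
    }

    pub fn set_an_enum(&mut self, e: AnEnum) {
        let (tag, payload) = encode(e);
        self.tag = tag;
        self.payload = payload;
    }
}

fn main() {
    let mut s = PackedStruct::new(SomeData { v: [0.0; 6] }, AnEnum::B1(7), 42, 9);
    assert_eq!(s.an_enum(), AnEnum::B1(7));

    s.set_an_enum(AnEnum::B2(1.5));
    assert_eq!(s.an_enum(), AnEnum::B2(1.5));
    assert_eq!((s.short, s.byte), (42, 9));

    assert_eq!(size_of::<PackedStruct>(), 32);
    assert_eq!(align_of::<PackedStruct>(), 4);
}
```

The trade-off is the one described above: callers get a copy of the enum rather than a reference into the struct, and every update goes through the setter.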

Thanks, that's a pretty clear explanation!

More in-depth discussion here: https://internals.rust-lang.org/t/optimizing-layout-of-nested-enums/5098