Compact memory layout for struct containing enum

123v124rv12 · September 8, 2019, 3:21am

This is very similar to a question posted a few years ago about optimizing the size of a struct containing an enum by storing the discriminant in otherwise unused padding.

For context, I have something like the following that I want to be exactly 32 bytes in size for cache-efficiency purposes:

// 24 bytes
struct SomeData {
    v: [f32; 6]
}

// 4 bytes + 1 byte for tag -> aligned up to 8 bytes w/ 4-byte alignment
enum AnEnum {
    B1(u32),
    B2(f32)
}

// 24 + 8 + 2 + 1 = 35 bytes, rounded up to 36 by 4-byte alignment
struct MyStruct {
    data: SomeData, // 24
    an_enum: AnEnum, // 8
    short: u16, // 2
    byte: u8 // 1
}

enum MyEnum {
    // each variant is 31 bytes w/ 4-byte alignment
    B1 {
        data: SomeData,
        b1: u32,
        short: u16,
        byte: u8,
    },
    B2 {
        data: SomeData,
        b2: f32,
        short: u16,
        byte: u8,
    },
}

fn main() {
    println!("AnEnum size: {}", std::mem::size_of::<AnEnum>()); // 8
    println!("MyStruct size: {}", std::mem::size_of::<MyStruct>()); // 36
    println!("MyEnum size: {}", std::mem::size_of::<MyEnum>()); // 32
}

In the case of MyEnum, with its duplicated fields, it looks like the discriminant manages to fit in the one byte that would otherwise be padding: each variant needs 31 bytes, so the tag brings the total to exactly 32.

I'm curious whether there have been any proposals or features introduced that would allow the compiler to produce a more compact representation for MyStruct.

Thanks for reading!

RustyYato · September 8, 2019, 4:16am

We can't, because we need to be able to get a reference to any of the fields, including an_enum as a whole, and this sort of reordering would prevent that. Note that with your MyEnum representation you need to check the discriminant to access any of the fields, which is a minor cost.
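To make the "check the discriminant" cost concrete, here is a minimal sketch. The `short()` accessor is a hypothetical helper, not something from the original post. Even though `short` appears in every variant, the compiler is free to lay out each variant independently, so reading it goes through a `match`:

```rust
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct SomeData {
    v: [f32; 6],
}

#[allow(dead_code)]
enum MyEnum {
    B1 { data: SomeData, b1: u32, short: u16, byte: u8 },
    B2 { data: SomeData, b2: f32, short: u16, byte: u8 },
}

impl MyEnum {
    // Hypothetical accessor: the field is shared by name only, so the
    // discriminant has to be inspected to find it.
    fn short(&self) -> u16 {
        match self {
            MyEnum::B1 { short, .. } | MyEnum::B2 { short, .. } => *short,
        }
    }
}

fn main() {
    let e = MyEnum::B2 {
        data: SomeData { v: [0.0; 6] },
        b2: 1.5,
        short: 7,
        byte: 1,
    };
    println!("short = {}", e.short());
}
```

With MyStruct, by contrast, `s.short` is a plain field access with no branch.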

Hyeonu · September 8, 2019, 4:26am

I guess that is possible. For example, in stable Rust, size_of::<Option<Option<Option<bool>>>>() == 1.

Struct/enum layout is explicitly documented as not stable, and the compiler keeps optimizing it from version to version. For example, in Rust v1.0, size_of::<Option<Option<Option<bool>>>>() == 4.
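A quick way to check this on your own toolchain is sketched below. The sizes for nested `Option<bool>` are what current compilers produce rather than a language guarantee; only the `Option<&T>` case is documented as guaranteed.

```rust
use std::mem::size_of;

fn main() {
    // `bool` only uses the bit patterns 0 and 1, so the remaining values of
    // its byte are free to encode `None` at each level of nesting.
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(size_of::<Option<bool>>(), 1);
    assert_eq!(size_of::<Option<Option<Option<bool>>>>(), 1);

    // `u32` has no invalid bit patterns, so `Option<u32>` needs extra space
    // for a separate tag.
    assert!(size_of::<Option<u32>>() > size_of::<u32>());

    // References are never null, so `None` can reuse the null pointer value.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());

    println!("all layout checks passed");
}
```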

123v124rv12 · September 8, 2019, 4:56am

I think I follow your explanation, but could you expand on it a bit? Is it a fundamental limitation that the entire enum appear contiguously in memory, or is it possible that the compiler could break up the memory layout of the enum payload and its discriminant when used as a field inside another struct? Maybe this is a question better asked on Rust Internals?
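For concreteness, here is roughly what such a split would look like if done by hand. This is only a sketch: `Packed`, `is_b2`, `bits`, and the accessor names are made up for illustration, and `#[repr(C)]` is used just to make the field order (and so the 32-byte size) deterministic.

```rust
use std::mem::size_of;

#[derive(Debug, Clone, Copy, PartialEq)]
enum AnEnum {
    B1(u32),
    B2(f32),
}

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct SomeData {
    v: [f32; 6],
}

// Hypothetical hand-packed layout: the enum's payload and its tag are stored
// as two separate fields, so the tag can sit next to `short` and `byte`.
#[allow(dead_code)]
#[repr(C)]
struct Packed {
    data: SomeData, // 24 bytes
    bits: u32,      // 4 bytes: payload of B1, or the raw bits of B2's f32
    short: u16,     // 2 bytes
    byte: u8,       // 1 byte
    is_b2: bool,    // 1 byte: hand-rolled discriminant
}

impl Packed {
    fn new(data: SomeData, e: AnEnum, short: u16, byte: u8) -> Self {
        let mut p = Packed { data, bits: 0, short, byte, is_b2: false };
        p.set_enum(e);
        p
    }

    // The enum is rebuilt by value; there is no contiguous `AnEnum` in
    // memory to borrow, so `&AnEnum` cannot be handed out.
    fn an_enum(&self) -> AnEnum {
        if self.is_b2 {
            AnEnum::B2(f32::from_bits(self.bits))
        } else {
            AnEnum::B1(self.bits)
        }
    }

    fn set_enum(&mut self, e: AnEnum) {
        match e {
            AnEnum::B1(x) => {
                self.bits = x;
                self.is_b2 = false;
            }
            AnEnum::B2(x) => {
                self.bits = x.to_bits();
                self.is_b2 = true;
            }
        }
    }
}

fn main() {
    let mut p = Packed::new(SomeData { v: [0.0; 6] }, AnEnum::B1(7), 1, 2);
    p.set_enum(AnEnum::B2(1.5));
    println!("Packed size: {}, enum: {:?}", size_of::<Packed>(), p.an_enum());
}
```

The price is visible in the API: the enum can only be read and written by value through accessors, never borrowed in place.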

Thanks for taking the time to answer.

TomP · September 8, 2019, 5:10am

This is the better forum for your question. Rust internals is for discussion of language and tooling enhancements / changes.

TomP · September 8, 2019, 5:23am

In terms of your original question, your definition of AnEnum requires that it be allocated as a single, self-contained field of MyStruct. Thus the compiler has no choice but to give it an 8-byte layout, representing the enum discriminant either in 4 bytes or in 1 byte followed by 3 bytes of padding. Then, when you incorporate AnEnum in MyStruct, it has to be loadable and storable without affecting any other fields; in particular, a store of those 8 bytes must never be able to change the short and byte fields of MyStruct.
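A small sketch of that constraint, using the types from the original post plus a hypothetical `overwrite` function. The function only sees a `&mut AnEnum`, so it is entitled to write every byte of that value, padding included, and the neighbouring fields must survive:

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct SomeData {
    v: [f32; 6],
}

#[derive(Debug, PartialEq)]
enum AnEnum {
    B1(u32),
    B2(f32),
}

#[allow(dead_code)]
struct MyStruct {
    data: SomeData,
    an_enum: AnEnum,
    short: u16,
    byte: u8,
}

// Hypothetical helper: it knows nothing about MyStruct and may store all
// size_of::<AnEnum>() bytes through the reference it was given.
fn overwrite(e: &mut AnEnum) {
    *e = AnEnum::B2(1.0);
}

fn main() {
    let mut s = MyStruct {
        data: SomeData { v: [0.0; 6] },
        an_enum: AnEnum::B1(7),
        short: 0xBEEF,
        byte: 0xAB,
    };

    overwrite(&mut s.an_enum);

    // The neighbouring fields are untouched, which is only possible because
    // they do not live inside AnEnum's own bytes.
    assert_eq!(s.an_enum, AnEnum::B2(1.0));
    assert_eq!(s.short, 0xBEEF);
    assert_eq!(s.byte, 0xAB);

    println!(
        "AnEnum: size {}, align {}",
        size_of::<AnEnum>(),
        align_of::<AnEnum>()
    );
}
```

If `short` or `byte` were tucked into AnEnum's padding, `overwrite` would be allowed to clobber them.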

123v124rv12 · September 8, 2019, 5:36am

Thanks, that's a pretty clear explanation!

123v124rv12 · September 8, 2019, 7:11am

More in-depth discussion here: "Optimizing layout of nested enums?" (compiler category, Rust Internals)

