Need workaround for Rust memory model to read potentially padding bytes of the enum

I have an enum with variants of different sizes. For performance reasons (this is very hot code), I'd really like to read the first N bytes of the variant's contents regardless of whether the particular enum variant uses those bytes or not.

The problem is that there doesn't seem to be an ergonomic way to do that within the Rust memory model. Adding explicit padding fields does work, but makes things very ugly.
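For illustration, the explicit-padding approach might look roughly like the sketch below. `PaddedEnum`, `PAYLOAD_OFFSET`, and `read_payload_prefix` are hypothetical names, and the sketch assumes the `#[repr(C, u8)]` layout, where the payload union follows the tag at the first offset aligned for `u64`.

```rust
use std::mem::align_of;

#[derive(Copy, Clone, Debug)]
#[repr(C, u8)]
enum PaddedEnum {
    // The second field is explicit padding, so the first two payload bytes
    // are initialized in every variant.
    Small(u8, [u8; 1]),
    Large(u64),
}

// Assumption: with `#[repr(C, u8)]` the payload union follows the `u8` tag at
// the first offset that satisfies the union's alignment (that of `u64` here).
const PAYLOAD_OFFSET: usize = align_of::<u64>();

/// Reads the first two payload bytes without matching on the variant.
fn read_payload_prefix(value: &PaddedEnum) -> [u8; 2] {
    let base = std::ptr::from_ref(value).cast::<u8>();
    // SAFETY: both variants initialize payload bytes 0 and 1, and
    // `PAYLOAD_OFFSET + 1` is within `size_of::<PaddedEnum>()`.
    unsafe {
        [
            base.add(PAYLOAD_OFFSET).read(),
            base.add(PAYLOAD_OFFSET + 1).read(),
        ]
    }
}

fn main() {
    // The padding leaks into every construction site, which is the ugly part.
    let small = PaddedEnum::Small(42, [0]);
    let large = PaddedEnum::Large(0x0102);
    println!("{:?}", read_payload_prefix(&small));
    println!("{:?}", read_payload_prefix(&large));
}
```

The read itself is branch-free, but every place that constructs `Small` now has to spell out the padding value.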

I thought of a workaround: copying a value with potentially uninitialized contents over an initialized memory region. But it looks like, according to the current Rust memory model, the overwrite removes the "initialized" status, even though that makes no sense from the hardware's point of view.

Something like this:

#![expect(dead_code)]

use std::mem::MaybeUninit;
use std::{ptr, slice};

#[derive(Copy, Clone, Debug)]
#[repr(C, u8)]
enum BadEnum {
    Small(u8),
    Large(u64),
}

#[inline(never)]
fn copy_enum_to_memory(mem: &mut MaybeUninit<BadEnum>, value: BadEnum) {
    mem.write(value);
}

fn main() {
    let mut mem = MaybeUninit::<BadEnum>::zeroed();

    let value = BadEnum::Small(42);
    copy_enum_to_memory(&mut mem, value);

    let initialized = unsafe { mem.assume_init() };

    let bytes = unsafe {
        slice::from_raw_parts(
            ptr::from_ref(&initialized).cast::<u8>(),
            std::mem::size_of::<BadEnum>(),
        )
    };

    let checksum: u32 = bytes.iter().map(|&b| b as u32).sum();
    println!("Checksum: {checksum}");
    println!("No UB found");
}

Right now it is UB:

error: Undefined Behavior: reading memory at alloc191[0x1..0x2], but memory is uninitialized at [0x1..0x2], and this operation requires initialized memory
  --> src/main.rs:33:44
   |
33 |     let checksum: u32 = bytes.iter().map(|&b| b as u32).sum();
   |                                            ^ Undefined Behavior occurred here
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
   = note: stack backtrace:
           0: main::{closure#0}
               at src/main.rs:33:44: 33:45
           1: std::iter::adapters::map::map_fold::<&u8, u32, u32, {closure@src/main.rs:33:42: 33:46}, {closure@<u32 as std::iter::Sum>::sum<std::iter::Map<std::slice::Iter<'_, u8>, {closure@src/main.rs:33:42: 33:46}>>::{closure#0}}>::{closure#0}
               at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/adapters/map.rs:88:28: 88:34
           2: <std::slice::Iter<'_, u8> as std::iter::Iterator>::fold::<u32, {closure@std::iter::adapters::map::map_fold<&u8, u32, u32, {closure@src/main.rs:33:42: 33:46}, {closure@<u32 as std::iter::Sum>::sum<std::iter::Map<std::slice::Iter<'_, u8>, {closure@src/main.rs:33:42: 33:46}>>::{closure#0}}>::{closure#0}}>
               at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/iter/macros.rs:279:27: 279:85
           3: <std::iter::Map<std::slice::Iter<'_, u8>, {closure@src/main.rs:33:42: 33:46}> as std::iter::Iterator>::fold::<u32, {closure@<u32 as std::iter::Sum>::sum<std::iter::Map<std::slice::Iter<'_, u8>, {closure@src/main.rs:33:42: 33:46}>>::{closure#0}}>
               at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/adapters/map.rs:128:9: 128:50
           4: <u32 as std::iter::Sum>::sum::<std::iter::Map<std::slice::Iter<'_, u8>, {closure@src/main.rs:33:42: 33:46}>>
               at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/traits/accum.rs:52:17: 56:18
           5: <std::iter::Map<std::slice::Iter<'_, u8>, {closure@src/main.rs:33:42: 33:46}> as std::iter::Iterator>::sum::<u32>
               at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:3674:9: 3674:23
           6: main
               at src/main.rs:33:25: 33:62

Uninitialized memory occurred at alloc191[0x1..0x2], in this allocation:
alloc191 (stack variable, size: 16, align: 8) {
    00 __ __ __ __ __ __ __ 2a __ __ __ __ __ __ __ │ .░░░░░░░*░░░░░░░
}

I found that LLVM has a freeze instruction, but there doesn't seem to be a way to use it from Rust. Ideally I'd like to read uninitialized bytes as zeroes, but as a last resort, returning any value will do.

The goal is to do this ergonomically: not changing enum variants and not changing enum variant instantiation.

Is this really not possible today at all even on nightly?

It's not possible in the memory model at the moment; and it only might be added[1]. (Which means it is UB in inline assembly too.)

Could you explain how the checksum works in your code, considering that padding may change at any moment, especially when the value is moved?


  1. Most probable to happen when the volatile access interface is remade

the hardware is allowed to skip writing to, or write garbage to padding bytes.
this is useful and (imo) makes sense.
for example in your code, here is what could be happening hardware-wise:


fn main() {
    let mut mem = MaybeUninit::<BadEnum>::zeroed(); // mem is 0x00_00_00_00_00_00_00_00__00_00_00_00_00_00_00_00

    let value = BadEnum::Small(42); // value is 0x00_XX_XX_XX_XX_XX_XX_XX__2A_XX_XX_XX_XX_XX_XX_XX
    copy_enum_to_memory(&mut mem, value);
    // mem is 0x00_XX_XX_XX_XX_XX_XX_XX__2A_XX_XX_XX_XX_XX_XX_XX
    let initialized = unsafe { mem.assume_init() };

    let bytes = unsafe {
        slice::from_raw_parts(
            ptr::from_ref(&initialized).cast::<u8>(),
            std::mem::size_of::<BadEnum>(),
        )
    };
    // reads XX (uninitialized) bytes, which may contain any sort of garbage and are definitely not deterministic, so they are nonsense as a checksum
    let checksum: u32 = bytes.iter().map(|&b| b as u32).sum(); 
    println!("Checksum: {checksum}");
    println!("No UB found");
}

note that freeze really wouldn't be helpful against this issue.

if you want to serialize your enum, for example to get untyped memory to do checksums on, you should either do it by hand or use a serialization library. some are really fast.

What I do when I need better control over enum representations is to have a manually written packed representation as a repr(C) struct and an ergonomic unpacked representation as an enum with only inline methods.
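A minimal sketch of that split, reusing the shape of `BadEnum` from the question. `PackedEnum`, `pack`, `unpack`, `to_bytes`, and `checksum` are names made up for illustration, and the field order is just one arrangement that leaves no implicit padding.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum BadEnum {
    Small(u8),
    Large(u64),
}

/// Hypothetical packed form: every byte is covered by a field, so the struct
/// has no implicit padding.
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct PackedEnum {
    payload: u64,
    tag: u8,
    _pad: [u8; 7],
}

impl PackedEnum {
    #[inline]
    fn pack(value: BadEnum) -> Self {
        let (tag, payload) = match value {
            BadEnum::Small(v) => (0, u64::from(v)),
            BadEnum::Large(v) => (1, v),
        };
        Self { payload, tag, _pad: [0; 7] }
    }

    #[inline]
    fn unpack(self) -> BadEnum {
        match self.tag {
            0 => BadEnum::Small(self.payload as u8),
            _ => BadEnum::Large(self.payload),
        }
    }

    /// Safe byte view: payload, then tag, then zeroed padding.
    #[inline]
    fn to_bytes(self) -> [u8; 16] {
        let mut out = [0u8; 16];
        out[..8].copy_from_slice(&self.payload.to_ne_bytes());
        out[8] = self.tag;
        out
    }
}

fn checksum(value: BadEnum) -> u32 {
    PackedEnum::pack(value)
        .to_bytes()
        .iter()
        .map(|&b| u32::from(b))
        .sum()
}

fn main() {
    for value in [BadEnum::Small(42), BadEnum::Large(42)] {
        println!("{value:?}: checksum {}", checksum(value));
    }
}
```

The packed form is what gets stored and read byte-wise; the enum only exists at the edges, where the ergonomic view is wanted.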

@nazar-pc i made a mistake.
i wrote :

let mut mem = MaybeUninit::<BadEnum>::zeroed(); // mem is 0x00_00_00_00_00_00_00_00__00_00_00_00_00_00_00_00

this is incorrect, as the documentation of zeroed says.
it is in fact

let mut mem = MaybeUninit::<BadEnum>::zeroed(); // mem is 0x00_XX_XX_XX_XX_XX_XX_XX__00_00_00_00_00_00_00_00

so you got uninitialized bytes from the very start

With care, you can create a block of memory which contains one or more values of type BadEnum (without modifying the enum definition), and which has every byte initialized. The key ingredients are:

  • The type of that memory, in the view of its owner, must not be BadEnum itself, but a type of the same size which contains no padding.
  • You must manually write the enum’s discriminant and fields to that memory, without copying the entire enum value. If any of the fields’ types have possibly-uninit bytes, you must manually copy those fields separately too. In general, you must be certain that you are only copying initialized bytes; in the following example, this is ensured by using safe code to copy the results of to_ne_bytes() into a [u8].
use std::mem::MaybeUninit;
use std::{ptr, slice};

#[derive(Copy, Clone, Debug)]
#[repr(C, u8)]
enum BadEnum {
    Small(u8),
    Large(u64),
}

#[derive(Copy, Clone, Debug)]
#[repr(C, align(8))] // alignment must be sufficient for BadEnum's fields
struct Holder([u8; size_of::<BadEnum>()]);

impl AsRef<BadEnum> for Holder {
    fn as_ref(&self) -> &BadEnum {
        // SAFETY: we only hold valid values of the enum
        unsafe { &*(&raw const self.0).cast::<BadEnum>() }
    }
}

#[inline(never)]
fn copy_enum_to_memory(mem: &mut MaybeUninit<Holder>, value: BadEnum) {
    let mut holder = Holder([0; _]);

    match value {
        BadEnum::Small(ref field_value) => {
            let field_offset = (&raw const *field_value as usize) - (&raw const value as usize);

            holder.0[0] = 0; // discriminant
            holder.0[field_offset..][..size_of_val(field_value)]
                .copy_from_slice(&field_value.to_ne_bytes());
        }
        BadEnum::Large(ref field_value) => {
            let field_offset = (&raw const *field_value as usize) - (&raw const value as usize);

            holder.0[0] = 1; // discriminant
            holder.0[field_offset..][..size_of_val(field_value)]
                .copy_from_slice(&field_value.to_ne_bytes());
        }
    }

    mem.write(holder);
}

fn main() {
    for value in [BadEnum::Small(42), BadEnum::Large(42)] {
        let mut mem = MaybeUninit::<Holder>::zeroed();
        copy_enum_to_memory(&mut mem, value);

        let initialized_holder: Holder = unsafe { mem.assume_init() };
        let initialized_enum: &BadEnum = initialized_holder.as_ref();
        println!("Value: {initialized_enum:?}");

        let bytes = unsafe {
            slice::from_raw_parts(
                ptr::from_ref(initialized_enum).cast::<u8>(),
                std::mem::size_of::<BadEnum>(),
            )
        };

        let checksum: u32 = bytes.iter().map(|&b| b as u32).sum();
        println!("Checksum: {checksum}");
    }
}

There is no checksum in the actual code I use, that is just a simple demonstration of UB of the same nature I have in real code. In real code I need to read two u8 fields, which may or may not be physically present in some variants.

I don't care what value is returned in general: even if it is garbage, even if it is a different value on every read. What I care about is reading values without any branches. Serialization is not an option here. This code is executed in a RISC-V interpreter on every RISC-V instruction. Every extra instruction on the critical path has a massive cost in this case.

I know, but that means modifying instantiation in a non-trivial way. I could generate the necessary code with a macro that allocates initialized memory and then writes the discriminant and all the fields explicitly, but I wanted a more generic way where I don't need to distinguish between enum variants at all.

I guess from the memory model's perspective I want to copy initialized bytes over and skip uninitialized bytes, so they don't poison previously fully initialized memory.

I have a suspicion that, since LLVM would have to prove that the value in a particular call sometimes has uninitialized bytes in order to apply destructive optimizations, an opaque input value will not trigger UB in practice. But Miri will certainly complain, since it can track memory at runtime, and I like to keep Miri happy in my code base.

Currently it's the prevalent opinion, but IMO it's very much debatable.

When you write inline assembly you deal with a "real" hardware which does not have the notion of "uninitialized memory" (MADV_FREE and other shenanigans are not relevant in this particular case), so on the assembly level reading "uninitialized" memory is absolutely fine as long as it does not trigger exceptions. Now the problem arises when inline assembly starts to interact with the rest of the code which operates in AM. The easiest way to reconcile them is to describe inline assembly in terms of AM operations and in this context reading uninitialized memory in inline assembly is indeed considered UB. But it's not the only way to do it!

We could consider the inline assembly as a "black box" which returns N "random" bytes, and as long as we do not assign any wrong meaning to those bytes we should be fine (for example, if we just print them). Now, correctly assigning meaning to data which was read from memory with padding bytes can be really tricky, and it should be done with extreme care.

It would be fine; however, then its return value has no relationship to enum's bytes in initialized case... Or would a story "this inline-asm block is equivalent to a match on enum's discriminant, reading a field or minting a random byte" work? I'm not sure which stories are admissible.

Separately: is it true that the uninit byte will under no conditions contain a secret key part/such? If it can, then is its value guaranteed to be unobservable to the emulated code, including via timings?

I assume that the inline asm block would need to have read access over the enum's discriminant (perhaps simply by having read access over the whole enum) for that story to work (even if the asm doesn't actually read the discriminant).

Yes, but only from the compiler's point of view. But we know that our inline assembly reads some "materialized" representation of enum and we can interpret the resulting bytes accordingly (which can easily result in a mistake if you are not extremely careful).

In other words, I believe that the following code by itself should be considered safe:

pub enum Foo {
    A(u8),
    B(u32),
}

pub fn read_foo(foo: &Foo) -> u64 {
    const { assert!(size_of::<Foo>() >= size_of::<u64>()) }

    let res: u64;
    unsafe {
        core::arch::asm!(
            // note: unaligned reads are allowed by `mov`
            "mov {res}, qword ptr [{in_ptr}]",
            in_ptr = in(reg) foo,
            res = out(reg) res,
            options(nostack, readonly, pure, preserves_flags),
        )
    }
    res
}

That is exactly my thinking. When the machine executes instructions, something must be there, even if there are no guarantees about what exactly. The problem is that Miri doesn't support inline assembly.

And writing conditional code that matches on enum under Miri and uses inline assembly otherwise partially defeats the purpose of running tests under Miri in the first place.
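To make that trade-off concrete, here is a rough sketch of such a split. `Instr` and `read_operands` are hypothetical, and the enum is given `#[repr(u8)]` so that field offsets are defined. Whether the inline-asm read of padding bytes is sound is exactly what this thread is debating, so treat the first function as an illustration of the idea, not as something established to be correct.

```rust
#[derive(Copy, Clone, Debug)]
#[repr(u8)]
enum Instr {
    Add { rs1: u8, rs2: u8, rd: u8 },
    Ld { rs1: u8, rd: u8, imm: i16 },
    Ebreak,
}

/// Branch-free read of the two bytes after the tag; only built outside Miri
/// on x86-64.
#[cfg(all(not(miri), target_arch = "x86_64"))]
fn read_operands(instr: &Instr) -> (u8, u8) {
    let ptr = std::ptr::from_ref(instr).cast::<u8>();
    let raw: u32;
    // ASSUMPTION (debated in this thread): the asm block is treated as a
    // black box that yields arbitrary bytes wherever the enum has padding.
    // Offsets 1 and 2 are within `size_of::<Instr>()` for this layout.
    unsafe {
        core::arch::asm!(
            "movzx {raw:e}, word ptr [{ptr} + 1]",
            ptr = in(reg) ptr,
            raw = out(reg) raw,
            options(nostack, readonly, preserves_flags),
        );
    }
    // Little-endian: the byte at offset 1 is the low byte.
    (raw as u8, (raw >> 8) as u8)
}

/// Fallback that Miri can execute: matches on the variant and returns zero
/// for operands that do not exist.
#[cfg(any(miri, not(target_arch = "x86_64")))]
fn read_operands(instr: &Instr) -> (u8, u8) {
    match *instr {
        Instr::Add { rs1, rs2, .. } => (rs1, rs2),
        Instr::Ld { rs1, .. } => (rs1, 0),
        Instr::Ebreak => (0, 0),
    }
}

fn main() {
    let add = Instr::Add { rs1: 3, rs2: 7, rd: 1 };
    println!("{:?}", read_operands(&add));
}
```

The two versions only agree on operands that actually exist, so tests run under Miri exercise the `match` path and say nothing about the asm path.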

then you should read the data as MaybeUninit<[u8;N]>.
you get to read the value, no matter what it is, without branches.

MaybeUninit is potentially uninitialized memory, I can't get uninitialized bytes out of it without UB. But I do need the bytes, I just don't care what they are if they were not initialized properly.

that's not quite accurate. it is potentially initialized memory. the initialized bytes are still initialized, so it is not UB to read them.
if your algorithm relies on the value of the uninitialized bytes, then it is wrong, but if it doesn't, then there is no problem.
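A minimal sketch of what the `MaybeUninit` view does and does not allow, using `BadEnum` from the question. `as_maybe_uninit_bytes` and `tag_and_first_payload_byte` are hypothetical helpers, and `PAYLOAD_OFFSET` assumes the `#[repr(C, u8)]` layout, where the payload starts at the alignment of `u64`.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

#[derive(Copy, Clone, Debug)]
#[repr(C, u8)]
enum BadEnum {
    Small(u8),
    Large(u64),
}

const SIZE: usize = size_of::<BadEnum>();
// Assumption: the payload union follows the tag at the alignment of `u64`.
const PAYLOAD_OFFSET: usize = align_of::<u64>();

/// Views the enum as possibly-uninitialized bytes; this step alone asserts
/// nothing about which bytes are initialized.
fn as_maybe_uninit_bytes(value: &BadEnum) -> &[MaybeUninit<u8>; SIZE] {
    // SAFETY: same size, alignment 1, and `MaybeUninit<u8>` is valid for
    // uninitialized bytes.
    unsafe { &*std::ptr::from_ref(value).cast::<[MaybeUninit<u8>; SIZE]>() }
}

/// Reads only bytes that every variant initializes: the tag and the first
/// payload byte.
fn tag_and_first_payload_byte(value: &BadEnum) -> (u8, u8) {
    let bytes = as_maybe_uninit_bytes(value);
    // SAFETY: byte 0 is the tag; `Small` and `Large` both write the byte at
    // `PAYLOAD_OFFSET`. Calling `assume_init` on any other byte of a `Small`
    // value would be UB, which is the limitation discussed here.
    unsafe { (bytes[0].assume_init(), bytes[PAYLOAD_OFFSET].assume_init()) }
}

fn main() {
    println!("{:?}", tag_and_first_payload_byte(&BadEnum::Small(42)));
    println!("{:?}", tag_and_first_payload_byte(&BadEnum::Large(0x0102)));
}
```

The view can be copied around freely without branches, but turning a padding byte into a plain `u8` still requires `assume_init`, so it does not by itself give a value for bytes the variant never wrote.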

I think you may misunderstand the question raised in this thread. I don't need MaybeUninit to read initialized bytes and MaybeUninit doesn't help me read uninitialized bytes.

And no, my algorithm is not wrong, but it might be impossible to express in Rust today in an ergonomic way, hence the title of this thread.

Using inline asm is not UB in this case. You just have to pick a story that is not UB. For example, the story could be to read the discriminant and then either read the value in the enum, or produce some other random value. The inline asm shared previously would be a correct implementation of that story.

Anyway, perhaps you could say more about how your code is actually using the uninit value in a correct way?

In this case I have a RISC-V interpreter implementation with instructions represented by variants of an enum. Something like this:

enum Instruction {
    Add {
        rs1: Reg,
        rs2: Reg,
        rd: Reg,
    },
    Sub {
        rs1: Reg,
        rs2: Reg,
        rd: Reg,
    },
    // ...
    Ld {
        rs1: Reg,
        rd: Reg,
        imm: i16,
    },
    // ...
    Ebreak,
    Unimp,
}

For each instruction I want to extract rs1/rs2 register operands (u8) to load them upfront before executing an instruction. I need that because I hit a wall in LLVM's ability to reason about large functions that makes it impossible to promote RISC-V registers to native registers with SROA.

Now the problem is that not all instructions have rs1 or rs2 operands. I can easily guarantee that all variants will have rs1 as the first field and rs2 as the second if they exist at all, but they might be missing too.

The instruction execution then takes rs1 and rs2 values, treats them as registers and reads corresponding values. After matching on enum variant those values may or may not be used. For instructions that do have those operands they will be meaningful and used, but for instructions that don't I am fine with reading arbitrary register value as rs1 and rs2 since they will not be looked at anyway.

Reading register values inside instruction handler works, but results in large code duplication with certain optimization approaches and it generally confuses LLVM. So I want to move reading of those fields unconditionally outside of the match.

This happens on every single RISC-V instruction, so I can't afford to match on instruction variant or do any other logic since it will defeat the performance benefits from the refactoring I'm trying to do. I do really want to just read two bytes out of the enum contents and interpret them as register indices. I don't care if they are random for uninitialized data, though zero would be nice since there is a fast path for reading zero register.

There's no way to do that which miri will accept.

The only thing I found that seemed relevant was Emit noundef LLVM attribute · Issue #74378 · rust-lang/rust · GitHub, but it is really old and has not seen comments since 2022.