How to modify possibly-uninitialized memory without UB?

I'm trying to create a MangledBox which would store the bytes of T on the heap, XORed with a random key. Unfortunately, once I write a T there, its padding bytes may become uninitialized, and it no longer seems possible to soundly XOR the bytes back.

struct MangledBox<T: Sized> {
    data: NonNull<MaybeUninit<T>>,
    key: MaybeUninit<T>,
}

impl<T: Sized> MangledBox<T> {
    pub fn new() -> Self {
        let mut key = MaybeUninit::uninit();
        getrandom::fill_uninit(key.as_bytes_mut()).expect("no keygen");
        Self {
            data: Box::into_non_null(Box::new_zeroed()),
            key,
        }
    }

    pub fn with_unmangled<F, R>(&mut self, f: F) -> R
    where
        F: FnOnce(NonNull<T>) -> R,
    {
        let data_ptr = self.data.as_ptr().cast::<u8>();
        let key_ptr = self.key.as_ptr().cast::<u8>();
        
        for i in 0..size_of::<T>() {
            let key_byte = unsafe {*key_ptr.wrapping_add(i)};
            let data_byte = ???;
            // Write to the i-th byte rather than always to byte 0.
            unsafe { data_ptr.add(i).write_volatile(data_byte ^ key_byte); }
        }
        todo!("omitted for brevity")
    }
}

Is there no freeze or similar function which would allow working with the range?

There is, as of yet, no non-UB way to read uninitialized data in a typed fashion.[1] LLVM's freeze instruction is not exposed in Rust. So you need to restrict T to types that have no uninitialized bytes, with something like bytemuck's NoUninit. You probably just want to use bytemuck's or zerocopy's functionality, as they're designed for this sort of use case.
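To make the restriction concrete, here is a minimal std-only sketch of the idea. The `NoPadding` trait and the `mangle`/`unmangle` functions are hypothetical names invented for illustration; `NoPadding` merely stands in for the promise that `bytemuck::NoUninit` encodes (in bytemuck itself, `bytes_of` is the function that turns such a value into a byte slice). It is an assumption-laden illustration, not how either crate is implemented.

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical stand-in for `bytemuck::NoUninit`: the implementor promises
/// that the type contains no padding and no otherwise uninitialized bytes.
unsafe trait NoPadding: Copy {}

unsafe impl NoPadding for u8 {}
unsafe impl NoPadding for u32 {}
unsafe impl NoPadding for u64 {}

/// Returns the bytes of `value` XORed with `key`.
fn mangle<T: NoPadding>(value: T, key: &[u8]) -> Vec<u8> {
    assert_eq!(key.len(), size_of::<T>());
    // SAFETY: `T: NoPadding` promises that every byte of `value` is
    // initialized, so viewing it as a `u8` slice is sound.
    let bytes = unsafe {
        std::slice::from_raw_parts((&value as *const T).cast::<u8>(), size_of::<T>())
    };
    bytes.iter().zip(key).map(|(b, k)| b ^ k).collect()
}

/// Reverses `mangle`.
///
/// # Safety
/// `mangled` must have been produced by `mangle::<T>` with the same `key`,
/// so that the restored bytes form a valid `T`.
unsafe fn unmangle<T: NoPadding>(mangled: &[u8], key: &[u8]) -> T {
    assert_eq!(mangled.len(), size_of::<T>());
    assert_eq!(key.len(), size_of::<T>());
    let mut out = MaybeUninit::<T>::uninit();
    let dst = out.as_mut_ptr().cast::<u8>();
    for (i, (m, k)) in mangled.iter().zip(key).enumerate() {
        // SAFETY: `i < size_of::<T>()`, so the write stays inside `out`.
        unsafe { dst.add(i).write(m ^ k) };
    }
    // SAFETY: all `size_of::<T>()` bytes were just written.
    unsafe { out.assume_init() }
}
```

The mangled form is kept as plain bytes rather than as a `T`, because XORed bytes are not necessarily a valid `T` even when the type has no padding.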

Here's one related issue. There are other more direct conversations, but they're a pain to search for.


  1. Some want it, some want to never allow it.

In fact, I don't need to read uninitialized memory into Rust per se (if there were intrinsics to just do the arithmetic and put the result back in place), but that stance is understandable.

bytemuck works fine, a better solution than I expected! Incidentally, Strings and references don't implement bytemuck::NoUninit, which fits my MangledBox perfectly: there's not much point in masking pointers rather than data.

A follow-up on this: I realized that assembler intrinsics do not have any problems with LLVM-"uninitialized" memory, so they can be used! They serve as the basis of https://docs.rs/secretmangle/latest/secretmangle/struct.MangledBoxArbitrary.html.

Unfortunately, that is not very cross-platform, and hand-written code is likely slower than what vector instructions could do, but it is still something.
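As a rough illustration of that approach (not the actual secretmangle code, which I have not reproduced here), the sketch below XORs a byte in memory from inside an `asm!` block, so the byte is never loaded into a Rust value. The function names `xor_byte_in_place` and `xor_region_in_place` are hypothetical, only x86_64 and aarch64 are covered, and whether this is fully sanctioned for uninitialized bytes rests on the claim made above.

```rust
use std::arch::asm;

/// XORs the byte at `data` with `key` entirely inside an assembly block.
///
/// # Safety
/// `data` must be valid for reads and writes of one byte.
#[cfg(target_arch = "x86_64")]
unsafe fn xor_byte_in_place(data: *mut u8, key: u8) {
    unsafe {
        asm!(
            // Read-modify-write directly on memory (Intel syntax).
            "xor byte ptr [{ptr}], {k}",
            ptr = in(reg) data,
            k = in(reg_byte) key,
            options(nostack),
        );
    }
}

/// Same operation for aarch64, using a scratch register.
///
/// # Safety
/// `data` must be valid for reads and writes of one byte.
#[cfg(target_arch = "aarch64")]
unsafe fn xor_byte_in_place(data: *mut u8, key: u8) {
    unsafe {
        asm!(
            "ldrb {tmp:w}, [{ptr}]",
            "eor {tmp:w}, {tmp:w}, {k:w}",
            "strb {tmp:w}, [{ptr}]",
            ptr = in(reg) data,
            k = in(reg) key,
            // `out` (not `lateout`) keeps the scratch register distinct
            // from the inputs, which are still needed after the load.
            tmp = out(reg) _,
            options(nostack),
        );
    }
}

/// XORs `key.len()` bytes starting at `data` with `key`, byte by byte.
///
/// # Safety
/// `data` must be valid for reads and writes of `key.len()` bytes.
unsafe fn xor_region_in_place(data: *mut u8, key: &[u8]) {
    for (i, &k) in key.iter().enumerate() {
        // SAFETY: `i < key.len()`, so the pointer stays in the region.
        unsafe { xor_byte_in_place(data.add(i), k) };
    }
}
```

Applying the same key twice restores the original bytes, which is what a `with_unmangled`-style wrapper would rely on. A per-byte loop like this is simple but slow, in line with the performance caveat in the post.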

In fact, I don't need to read uninitialized memory into Rust per se (if there were intrinsics to just do the arithmetic and put the result back in place), but that stance is understandable.

Uninitialized bytes are not guaranteed to have the same value each time you read them, so there isn't a "place" to put them back until you initialize them. See the MaybeUninit documentation.
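A small sketch of where such bytes come from in the original question. The `Padded` struct and the helper functions are hypothetical, chosen only to show that a zeroed slot is fully initialized, while a typed write of a padded struct leaves only the fields safe to read back.

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical example type: with `repr(C)`, three padding bytes sit
/// between `tag` and `value` so that `value` is 4-byte aligned.
#[repr(C)]
#[derive(Clone, Copy)]
struct Padded {
    tag: u8,
    value: u32,
}

/// Number of padding bytes in `Padded`.
fn padding_bytes() -> usize {
    size_of::<Padded>() - size_of::<u8>() - size_of::<u32>()
}

/// Reads every byte of a freshly zeroed slot.
fn zeroed_bytes() -> [u8; size_of::<Padded>()] {
    let slot = MaybeUninit::<Padded>::zeroed();
    // SAFETY: `zeroed` initializes all bytes, padding included, so reading
    // the whole slot as a byte array is sound at this point.
    unsafe { slot.as_ptr().cast::<[u8; size_of::<Padded>()]>().read() }
}

/// Writes a `Padded` into a slot and reads back only its fields.
fn fields_after_write() -> (u8, u32) {
    let mut slot = MaybeUninit::<Padded>::zeroed();
    slot.write(Padded { tag: 7, value: 42 });
    // After the typed write, the padding bytes may be uninitialized again,
    // so reading the whole slot as bytes would no longer be sound.
    // Reading the fields themselves is fine.
    // SAFETY: the slot was just fully written as a `Padded`.
    let p = unsafe { slot.assume_init() };
    (p.tag, p.value)
}
```

This is why zero-initializing the allocation up front, as `Box::new_zeroed` does in the question, does not help once a `T` has been written into it.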

For my use case, it would have been totally fine to put uninit/poison bytes back into the allocation: if T has padding in that place, then they are not going to be read either.
