Sources of uninitialized memory in Rust

jgarvin · May 4, 2022, 8:59pm

Here are the ways you can get uninitialized memory (undef in LLVM) that I am aware of:

Padding between struct fields is considered uninitialized
MaybeUninit<T>
mem::uninitialized() (deprecated)
Padding between enum's hidden tag field and the variant actually contained

Are there any other circumstances where it comes up?

kpreid · May 4, 2022, 9:08pm

Calling the allocator returns a pointer to uninitialized memory.
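A rough sketch of the allocator case, using the real `std::alloc` API (the function name `alloc_then_init` is made up for illustration). The bytes behind the returned pointer are uninitialized until something is written to them, so the only sound first access is a write:

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates room for one `u32`, initializes it, reads it back, and frees it.
fn alloc_then_init() -> u32 {
    let layout = Layout::new::<u32>();
    unsafe {
        // `alloc` makes no promise about the contents of the block.
        let p = alloc(layout) as *mut u32;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        // Reading `*p` here would read uninitialized memory.
        p.write(7); // raw write: does not read or drop the old contents
        let value = p.read(); // fine now, the bytes are initialized
        dealloc(p as *mut u8, layout);
        value
    }
}
```

If you want defined contents from the start, `std::alloc::alloc_zeroed` exists for that.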

Also, this is not another item for your list, but note that “between” is too specific; padding can exist at the end of a type due to alignment requirements (the size is always a multiple of the alignment).
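To make the padding cases concrete, here is a small sketch. It assumes `#[repr(C)]` / `#[repr(u8)]` so the layout is actually specified, and a target where `u32` has alignment 4; the type and function names are invented for the example:

```rust
use std::mem::{align_of, size_of};

// Padding *between* fields: 1 byte of `a`, 3 padding bytes, 4 bytes of `b`.
#[allow(dead_code)]
#[repr(C)]
struct Between {
    a: u8,
    b: u32,
}

// Padding at the *end*: 4 bytes of `a`, 1 byte of `b`, 3 trailing padding bytes.
#[allow(dead_code)]
#[repr(C)]
struct Trailing {
    a: u32,
    b: u8,
}

// Padding after the tag: 1 tag byte, 3 padding bytes, then the `u32` payload.
#[allow(dead_code)]
#[repr(u8)]
enum Tagged {
    Small(u8),
    Wide(u32),
}

/// Bytes of `T` that are not covered by `payload` bytes of real data.
fn padding_in<T>(payload: usize) -> usize {
    size_of::<T>() - payload
}

/// The size of a type is always a multiple of its alignment.
fn size_is_multiple_of_align<T>() -> bool {
    size_of::<T>() % align_of::<T>() == 0
}
```

Under those assumptions, `padding_in::<Between>(5)` and `padding_in::<Trailing>(5)` both come out as 3, even though only the first struct has padding "between" anything.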

LegionMammal978 · May 4, 2022, 9:19pm

The unused space in a union (which is how MaybeUninit works)
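A minimal sketch of the union case (the union `ByteOrWord` and the function names are hypothetical; `MaybeUninit` is the real standard-library type). Writing the small field leaves the rest of the union's bytes unwritten:

```rust
use std::mem::{size_of, MaybeUninit};

#[allow(dead_code)]
#[repr(C)]
union ByteOrWord {
    byte: u8,
    word: u32,
}

/// The union is as large as its largest field.
fn union_size() -> usize {
    size_of::<ByteOrWord>()
}

fn write_small_field() -> u8 {
    let u = ByteOrWord { byte: 1 };
    // Only the first byte was written. Reading `u.word` here would read
    // three bytes that were never initialized.
    unsafe { u.byte }
}

fn init_through_maybe_uninit() -> u32 {
    // `MaybeUninit<u32>` has the same size as `u32` but may hold no value yet.
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(0xDEAD_BEEF);
    // Sound only because of the write on the previous line.
    unsafe { slot.assume_init() }
}
```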

alice · May 4, 2022, 9:22pm

The unused capacity of a Vec. (a special case of calling the allocator)
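For illustration, `Vec::spare_capacity_mut` exposes exactly that unused capacity as a slice of `MaybeUninit<T>`. A sketch (the helper name `fill_spare` is made up):

```rust
use std::mem::MaybeUninit;

/// Builds `[0, 1, .., n-1]` by writing directly into the unused capacity.
fn fill_spare(n: usize) -> Vec<u32> {
    let mut v: Vec<u32> = Vec::with_capacity(n);
    // Everything past `len` is allocated but uninitialized.
    let spare: &mut [MaybeUninit<u32>] = v.spare_capacity_mut();
    // The real capacity may be larger than requested, so only touch `n` slots.
    for (i, slot) in spare.iter_mut().take(n).enumerate() {
        slot.write(i as u32);
    }
    // Sound only because the first `n` elements were just initialized.
    unsafe { v.set_len(n) };
    v
}
```

Calling `set_len` without the writes would expose the uninitialized elements through a safe `&[u32]`, which is where the UB would come from.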

quinedot · May 4, 2022, 10:29pm

Moving a non-Copy value leaves its old location uninitialized.

Writing to a pointer directly (*p = something) drops the old value, and that Drop implementation may be handed uninitialized memory.

Cyborus · May 4, 2022, 10:37pm

Calling a naughty foreign function

LegionMammal978 · May 4, 2022, 10:39pm

Wouldn't this be UB in the case of an explicit Drop impl?

quinedot · May 4, 2022, 10:47pm

Yeah, true, it's immediate UB because a &mut T is created.
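A sketch of how to avoid that when the destination is not initialized yet: use `ptr::write` (here via the `write` method on raw pointers), which does not drop the old contents. The function name is invented for the example:

```rust
use std::mem::MaybeUninit;

fn init_string_in_place() -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    let p: *mut String = slot.as_mut_ptr();
    unsafe {
        // `*p = String::from("hello");` would first drop the "old" String
        // at `p`, which was never initialized.
        p.write(String::from("hello")); // overwrites without dropping
        slot.assume_init()
    }
}
```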

chrefr · May 5, 2022, 12:12am

That is a special case of padding.

chrefr · May 5, 2022, 12:13am

Are you sure about that? I remember it being discussed whether it makes the memory uninitialized or just duplicates it. Better to be safe, of course.

quinedot · May 5, 2022, 12:15am

If it wasn't invalidated somehow, you could have aliasing, say.

The Nomicon says

If a value is moved out of a variable, that variable becomes logically uninitialized if the type of the value isn't Copy.
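In safe code that sentence shows up as a borrow-checker rule rather than as anything you can observe at runtime. A small sketch with made-up names (`Plain`, `consume`):

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Plain(i32);

/// Takes its argument by value and hands it back.
fn consume<T>(value: T) -> T {
    value
}

fn copy_stays_usable() -> (Plain, Plain) {
    let a = Plain(1);
    let b = consume(a); // `Plain` is Copy, so `a` is still initialized
    (a, b)
}

fn move_invalidates() -> String {
    let s = String::from("x");
    let t = consume(s); // `String` is not Copy: `s` is now logically uninitialized
    // println!("{s}"); // error[E0382]: borrow of moved value: `s`
    t
}
```

Whether the bytes at the old location count as uninitialized for unsafe code that still holds a raw pointer to them is the open question discussed further down.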

chrefr · May 5, 2022, 12:20am

If you don't use the original value everything is fine. Of course you should not use it in case it invalidates something.

Here it is:

github.com/rust-lang/unsafe-code-guidelines

Are `Copy` implementations semantically relevant (besides specialization)?

opened 04:20PM - 25 Nov 21 UTC

steffahn

The question comes basically from https://users.rust-lang.org/t/is-this-undefine…d-behaviour/67886

I’m wondering whether the fact that a type `A` implements `A: Copy` or not can make a semantic difference, i.e.: can code using the struct that compiles either way, with the implementation or without it, behave differently depending on that `impl` alone? In particular, can it trigger UB in only one of the cases? (I do not consider any code that specializes on a `T: Copy` bound, because then the bound can trivially make a difference.)

The canonical code in mind is something like

```rs
#[derive(Clone)]
struct A(i32);
impl Copy for A {}

fn main() {
    let x = A(42);
    let p: *const A = &x;
    drop(x);
    println!("{}", unsafe { (*p).0 });
}
```

which is obviously sound; compared to

```rs
#[derive(Clone)]
struct A(i32);
// REMOVED
// impl Copy for A {}

fn main() {
    let x = A(42);
    let p: *const A = &x;
    drop(x);
    println!("{}", unsafe { (*p).0 });
}
```

Does the latter code example trigger UB? (Miri doesn’t think it does.)

quinedot · May 5, 2022, 12:33am

The direct issue is

github.com/rust-lang/unsafe-code-guidelines

What about: use-after-move and (maybe) use-after-drop

opened 01:08PM - 31 May 19 UTC

RalfJung

T-uninitialized T-memory C-open-question

Miri currently considers `Copy` and `Move` the same operation. There are some pr…oposals in particular by @eddyb to make them not the same: it might be beneficial for optimizations to be able to rely on the data not being read again after a `Move`. This would correspond to replacing the data by `Undef`. It seems reasonable to do similar things for drop -- though that requires a precise definition of what is actually considered a drop (does that include just the `Drop` terminator or also calls to `drop_in_place`?).

(Miri concern: One reason why I am a bit hesitant about this is that this makes read have a side-effect, so we'd have to make all read methods in Miri take the interpreter context by `&mut self`. Reads already have a side-effect in Stacked Borrows but that is fairly local and we just use a `RefCell` there.)

@eddyb what would be such optimizations that benefit from this, where `StorageDead` is not happening so we need to rely on `Move`?

Potential blockers that would prevent deinit-on-move:
- https://github.com/rust-lang/rust/issues/91029

Reasons to do deinit-on-move:
- https://github.com/rust-lang/rust/issues/71117#issuecomment-864592461 (but might be solved by aliasing restrictions?)

Looks like Miri will gain the ability to treat moves as deinit. There are also some examples where that's probably not possible, though, or where it requires more nuance.

chrefr · May 5, 2022, 12:36am

Before Miri can do it, we should ask whether it's invalid at all.

2e71828 · May 5, 2022, 5:04am

Given how moves work semantically, and the nomicon entry that @quinedot mentioned, I’ve always assumed that the compiler reuses the space of a moved-out stack variable for later variables¹.

In particular, it feels silly for a statement like this to not reuse the space:

let initial: T = T::new();
let modified: T = initial.method_that_takes_ownership();

This kind of optimization is only possible if accessing the original location after a move is UB, because the actual value stored there at any particular time is unpredictable to the programmer.

¹ I don’t know if the compiler actually has this kind of optimization

LegionMammal978 · May 5, 2022, 5:17am

IIRC it doesn't; space is reserved for every variable that can possibly occur at the start of the function, except for variables which can be stored in registers. Notably, every branch of a switch takes up its own stack space, which I recall has caused stack-usage issues in functions with many println!s.

chrefr · May 5, 2022, 5:39am

It's valid after regular moves because it's an error to access them after, but I'm talking about things like ptr::read().

Michael-F-Bryan · May 5, 2022, 7:33am

I was hoping a let with no initializer would allocate an undef value, but it looks like rustc has a MIR pass that removes never-used variables before LLVM IR is emitted:

fn main() {
    let x: u64;
}

cuviper · May 5, 2022, 4:05pm

They all get distinct space from rustc, but LLVM will try to merge them in the StackColoring pass.

scottmcm · May 5, 2022, 5:39pm

ptr::read is not a move, because it leaves the original data alone.

(It has to, or things like ptr::reading the String in an Option<String> would break niche optimizations.)
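A sketch of that behaviour (the function name is hypothetical). `ptr::read` makes a bitwise copy and leaves the source bytes as they were, so it is on the caller to make sure the value is not dropped twice; `ManuallyDrop` is one way to do that:

```rust
use std::mem::ManuallyDrop;
use std::ptr;

fn read_out() -> (String, usize) {
    // `ManuallyDrop` stops the original from being dropped automatically.
    let original = ManuallyDrop::new(String::from("hi"));
    // Bitwise copy of the String's (pointer, capacity, length) triple.
    let copy: String = unsafe { ptr::read(&*original) };
    // The source location was left alone, so its length field is still there.
    let len_still_there = original.len();
    // Only `copy` is ever dropped, so the heap buffer is freed exactly once.
    (copy, len_still_there)
}
```

Without the `ManuallyDrop` wrapper, both `original` and `copy` would free the same buffer at the end of their scopes.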
