I must read what Rust considers uninitialized memory. How to do it without undefined behavior?

Hi!

I am writing a logger for an embedded system (STM32, but the details don't really matter here). The logger writes the log messages to a ring buffer in system memory (RAM), which can then be read at a later time.

I need the log messages to persist when the system is reset, because then they are of most interest (for example after a panic or watchdog reset). Hardware-wise this is not a problem because the embedded system does not clear its memory during reset. All I need to do here is to put the ring buffer into a linker section that is not initialized at program start.
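
A minimal sketch of such a declaration, not the code from my repository. `LogArea`, `LOG_AREA` and `log_area_ptr` are hypothetical names, and the `.uninit` section name is an assumption that only works if the linker script defines a section of that name which the startup code neither zeroes nor copies. The attribute is applied only on bare-metal targets so that the same file still builds on a desktop host:

```rust
use core::mem::MaybeUninit;
use core::ptr::addr_of_mut;

/// Hypothetical layout of the persistent log area.
/// `repr(C)` keeps the field order and offsets stable across builds.
#[repr(C)]
pub struct LogArea {
    pub signature: u32,
    pub write_pos: u32,
    pub data: [u8; 1024],
}

// Assumption: the linker script provides a no-init section called `.uninit`.
// On hosted targets the attribute is skipped and the static lands in normal memory.
#[cfg_attr(target_os = "none", link_section = ".uninit")]
static mut LOG_AREA: MaybeUninit<LogArea> = MaybeUninit::uninit();

/// Returns a raw pointer to the area without ever creating a reference to it.
#[allow(unused_unsafe)] // newer compilers treat `addr_of_mut!` on a static as safe
pub fn log_area_ptr() -> *mut LogArea {
    unsafe { addr_of_mut!(LOG_AREA).cast::<LogArea>() }
}
```

Handing out only a raw pointer means no `&mut LogArea` exists before the contents have been checked.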

However, in Rust there seems to be absolutely no valid way to read what it considers uninitialized memory. Even using core::ptr::read_volatile on "uninitialized" memory is considered undefined behavior.

I've published my code on GitHub at surban/defmt-ringbuf-miri. This version can be run as a normal executable for testing, i.e. using cargo run. The code works fine, but running Miri on it produces the following error:

$ cargo +nightly miri run
Preparing a sysroot for Miri (target: x86_64-unknown-linux-gnu)... done
   Compiling defmt-ringbuf-miri v0.2.0 (/data/surban/dev/defmt-ringbuf-miri)
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `/home/surban/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri runner target/miri/x86_64-unknown-linux-gnu/debug/defmt-ringbuf-miri`
error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
  --> src/ring_buffer.rs:44:29
   |
44 |             let signature = (addr_of!((*ptr).signature) as *const u32).read_volatile();
   |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ using uninitialized data, but this operation requires initialized memory
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
   = note: BACKTRACE:
   = note: inside `ring_buffer::RingBuffer::<8192>::init` at src/ring_buffer.rs:44:29: 44:87
note: inside `main`
  --> src/main.rs:13:27
   |
13 |     let buffer = unsafe { RingBuffer::init(&mut BUFFER) };
   |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error

What can I do in Rust to read the contents of memory that is considered uninitialized without invoking undefined behavior?

Thanks for any ideas!

Rust provides no way to read uninitialized memory.

However, Rust doesn't know how your hardware works. If you read the memory as if it were initialized, the compiler will have to assume that you know that the hardware has initialized it to some value.

Could MaybeUninit::assume_init help you here?

I am not sure if this is true, considering "What The Hardware Does" is not What Your Program Does: Uninitialized Memory.

I am already using that. However the question is what to do before calling assume_init to ensure that no undefined behavior is involved.

It's true that you definitely can't do what I suggested to memory you created with MaybeUninit::uninit(). In that case, the memory truly is uninitialized no matter what your hardware thinks. However, when you are dealing with special memory registers/locations that have special behavior controlled by your hardware, then it's a different story.

Of course, you could never test something like this in miri, because the machine that miri emulates has no such special memory locations.

I understand, but how can I tell the Rust compiler that this memory region is pre-initialized when my program starts? I am afraid that otherwise it may apply some clever and wrong optimizations as described in the blog post by ralfj.

Could you look at my code and tell me what you think about my RingBuffer::init function?

Have you considered initializing your static like this?

#[link_section = ".uninit"]
static mut BUFFER: RingBuffer<8192> = unsafe { MaybeUninit::zeroed().assume_init() };

Based on what you told me, I'm assuming that the value you give it is ignored in practice.

The code as posted in the repository already appears to work fine on the embedded system. My problem is that I want to be sure that it is indeed correct :wink:

When I change uninit() to zeroed() as you suggest, how can I be sure then that the compiler doesn't assume the struct to be all zeros during the call to RingBuffer::init()? This seems to be trading one potential problem for another to me.

Assuming your platform is one which inline assembly is available for, this might be a good place to use it - somewhere where you don't want the semantics of the rust abstract machine getting in the way.
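
A rough sketch of what that could look like, assuming a target where stable `asm!` is available. `assume_externally_written` is a hypothetical helper name. The asm block emits nothing but an assembler comment; because it is not marked `nomem`, the compiler has to assume it may have read or written memory reachable through the pointer. Whether this is enough to make the bytes count as initialized for the abstract machine is an open question, so treat it as an idea, not a guarantee:

```rust
use core::arch::asm;

/// Passes `ptr` through an asm block that contains no instructions.
/// The compiler cannot see into the block, so it must assume that the
/// memory behind the pointer may have been written by it.
pub fn assume_externally_written(ptr: *mut u8) -> *mut u8 {
    let mut p = ptr;
    unsafe {
        // The template is only an assembler comment that mentions the operand.
        // No `nomem` option: the block counts as possibly touching memory.
        asm!("/* {0} */", inout(reg) p, options(nostack, preserves_flags));
    }
    p
}
```

All later accesses would then go through the returned pointer instead of the original one.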

I am not sure that would help. Even if I do the initialization and validity checking of the ring buffer in assembly, how would I tell the Rust compiler that it is now properly initialized?

The way that you tell the compiler that something is initialized is by reading it. There's no special step here.

I don't think MaybeUninit is the way to go, but if you wanted to, a custom union type might separate out being in an unknown state. (But probably not worth the effort.)

If you have a pointer "somehow" I would think sending it through black_box would remove any capability for the compiler to assume any of it is uninitialized. (Not tried.)

It's too far outside my knowledge to say whether link_section will always do the right thing for a target platform.

That does sound plausible; a volatile read would stop the compiler from gaining anything from such an assumption.

I would think so as well, but the docs say that we cannot rely on it:

Note however, that black_box is only (and can only be) provided on a “best-effort” basis. The extent to which it can block optimisations may vary depending upon the platform and code-gen backend used. Programs cannot rely on black_box for correctness in any way.

If the link_section attribute can result in your hardware doing special things to the memory, then the compiler should know that it cannot assume that the value is unchanged when you reach main.

It's not a special thing :wink: It is normal behavior of RAM to stay unchanged unless cleared. It's just the operating system in the non-embedded world that clears memory pages before handing them to newly started programs.

Unfortunately, and just like C in this regard, the exact details of when you can access externally supplied memory safely are not yet fully defined. Using #[link_section] and ptr::read_volatile or ptr::write_volatile is the best you can do, but comes with the caveat that this is similar to using volatile char * in C, where the meaning is not well defined (depends on the implementation doing what you want).

There is not (yet) a fully standardised way to say "this bit of RAM is platform defined, and is externally initialized", not in C or in Rust. Instead, we kinda depend on compilers not doing something too crazy when optimizing volatile reads and writes.

Thanks!

If you don't mind, could you also have a quick look at my RingBuffer::init function and say if it is sound? Or do I have to remove MaybeUninit as @jonh says, because it might confuse the compiler?

Isn't that something worth reporting as a bug? Because if Rust wants to be big in embedded, then that's something Miri definitely needs to support, somehow.

Plus, if you report it, Ralf may provide a way to write such code so that it works with Miri in the future, once Miri supports "hardware devices".

I'm not an expert here, but I believe it's OK if you're using MaybeUninit::uninit for the memory area. The struct is repr(C), so the layout is well-defined, and MaybeUninit::uninit stops the compiler making assumptions about the content before the reads. Because all your accesses are _volatile, the compiler knows it can't make assumptions about not having external accesses.
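
To make that pattern concrete, here is a minimal sketch of it. This is not the code from the repository: the `SIGNATURE` value, the field names and the `push` helper are all made up for illustration. Every access goes through a raw pointer with a volatile read or write, and a reference is only created once every field is known to hold a value:

```rust
use core::mem::MaybeUninit;
use core::ptr::addr_of_mut;

/// Hypothetical magic value that marks a buffer left over from a previous boot.
const SIGNATURE: u32 = 0xB528_A25D;

#[repr(C)]
pub struct RingBuffer<const N: usize> {
    signature: u32,
    write_pos: u32,
    data: [u8; N],
}

impl<const N: usize> RingBuffer<N> {
    /// Returns the buffer and whether its previous contents were kept.
    ///
    /// # Safety
    /// `mem` must be aligned, valid for reads and writes, and not accessed
    /// through any other pointer while the returned reference is alive.
    pub unsafe fn init<'a>(mem: *mut MaybeUninit<Self>) -> (&'a mut Self, bool) {
        unsafe {
            let ptr = mem.cast::<Self>();
            // Volatile reads through raw pointers; no reference exists yet.
            let sig = addr_of_mut!((*ptr).signature).read_volatile();
            let pos = addr_of_mut!((*ptr).write_pos).read_volatile();
            let preserved = sig == SIGNATURE && (pos as usize) < N;
            if !preserved {
                // Start fresh: write every field, and the signature last.
                addr_of_mut!((*ptr).write_pos).write_volatile(0);
                let data = addr_of_mut!((*ptr).data).cast::<u8>();
                for i in 0..N {
                    data.add(i).write_volatile(0);
                }
                addr_of_mut!((*ptr).signature).write_volatile(SIGNATURE);
            }
            (&mut *ptr, preserved)
        }
    }

    /// Appends one byte, wrapping around at the end of the buffer.
    pub fn push(&mut self, byte: u8) {
        let pos = self.write_pos as usize;
        unsafe {
            addr_of_mut!(self.data[pos]).write_volatile(byte);
            addr_of_mut!(self.write_pos).write_volatile(((pos + 1) % N) as u32);
        }
    }

    pub fn write_pos(&self) -> u32 {
        self.write_pos
    }
}
```

On real hardware `mem` would point at the static in the no-init section. On a host, or under Miri, the first reads are only defined if the memory was initialized beforehand, for example with MaybeUninit::zeroed().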

That said, you're in the liminal space between definitely defined and guaranteed UB, where the rules are, shall we say, murky right now, and can only really be established by testing. Miri is testing your code against a machine where reads of any address that has not yet been written by the program are definitionally UB, and on that machine, your code is UB.

However, you're actually interested in the behaviour of the program on your real hardware, using a toolchain that might (or might not, who knows?) understand that RAM it's not yet written to is still valid, where such behaviour may or may not be UB, depending on all sorts of details that haven't yet been formalized.

That said, the issue "Adding FFI support to Miri" (rust-lang/miri issue #2365, by emarteca) looks like a related piece of work in Miri, which may be enough to support your use case in Miri - or you may be better off filing a new issue, explaining your use case, and seeing if Miri can be extended to understand it.

Unfortunately, this whole area is a mess because Rust initially just said "memory semantics are like C11", and we're only now unpicking that and discovering that important chunks of C11 memory semantics boil down to "compiler writers have been getting away with this unstated assumption without users noticing", and not a formally defined semantics. The Strict Provenance experiment is an attempt to have a fully defined set of semantics that are guaranteed not to be UB, but where breaking the rules might still not be UB once we have a complete definition - the idea is that Strict Provenance is stricter than the combination of any sensible compiler and hardware.