LTO removal of memory written but not read in embedded baremetal app

Our team is doing some baremetal no_std Rust on an embedded ARM platform. We aren't really using any of the embedded WG code, but that's not really relevant. One thing we are doing is writing some system memory with data and data structures that are not read back by the application itself (they're intended to be read by another application, which could be written in C or Rust or whatever, but will run at a later time). So this could be considered to be FFI-related, I guess.

The fundamental problem is that when LTO is enabled (as it was for our release build), the toolchain sees that this memory is written but never read back, and considers the writes ripe for optimizing away. We had a similar problem earlier where this was happening within a compilation unit, in our dev/debug build with lto=false, and we addressed that using black_box. But this issue is now happening at the link stage with LTO.

After some research, we now think only volatile can really address this (based on Volatile | lokathor.github.io and the referenced LLVM doc, LLVM Language Reference Manual — LLVM 22.0.0git documentation), and prototypes show that it works. But we wanted to see if the community had thoughts on using unsafe core::ptr::write_volatile, or whether there is another suggested approach. The only other idea we had was using some dummy inline asm, but that really seems like a hack.

If you write the same value twice in a row, do you care if it's actually written twice? Or is once enough?

Because volatile is for when the act of the write is important. If you just need it to happen for some other (magic) thing to read, then consider whether atomics would be sufficient.


We only need to write it once. It is just going into a particular fixed location of RAM; there aren't any side effects (i.e. this is not MMIO register stuff). But without any code (visible to the compiler & linker) reading this content after it is written, LTO assumes those values don't matter (even though they definitely do, just not to the program that the compiler & linker are building) and removes the code that does the writes.

So I understand the semantics of volatile don't quite apply, but the side effects definitely do.

And does LTO remove it if you write using the weakest possible atomic ordering, rather than just using ptr::write?

(Also, how are you making this pointer in the first place? Linker script generating a static at a particular address? from_exposed_provenance? …)

We don't write it using ptr::write, and never have. We are writing to fields of a data struct that is linked at that location using a #[unsafe(link_section = "<section_name>")] attribute. So no explicit atomic ordering ever comes into play here. When we go to use ptr::write_volatile, in the attempt to prevent the LTO issue, we get the pointer using the addr_of_mut! macro on the data struct and/or its fields.

I think your data is not exported properly, so the linker thinks it is private data and discards it because it is unused in the code. or, since you are on bare metal, maybe there's a bug in your linker script.

how do you access the data from the C code? e.g. do you mark the symbol with #[unsafe(no_mangle)] or #[unsafe(export_name)]? how is the section handled in the linker script? do you allocate a fixed load address in the linker script for that specific section?

volatile writes to a static variable are unusual to say the least, and potentially problematic since they need unsafe.

typically, the linker only knows the dependency graph among all the sections; it doesn't care whether a code section references a data section using volatile or atomic accesses. in fact, volatile doesn't exist during the "normal" linking process, since code sections already contain machine code, and there are no "volatile" instructions at the hardware level: volatile only exists in the frontend (and middle end) to disable certain optimizations, so it should not affect the linking process.

this happened to "work" in your case because of how lto works. lto is not just linking: the actual machine code generation phase is deferred too! so for lto, code sections actually contain llvm ir instead of machine code. a volatile write prevents the code (as opposed to the data) from being elided, and then the data section is kept because it is a dependency of the code section.

If by this you mean marking the symbol using #[unsafe(export_name = "")], this doesn't actually work. The data sections are still optimized away by LTO.

This program and the external program (could be written in C, not saying it is, doesn't matter) are both linked to know this data is at a fixed address.

Whether it worked because the code that references the memory wasn't elided, or because LLVM marked that memory a certain way (since the accesses were volatile), isn't really material: so far it is the only thing that has worked to prevent LTO from removing the data that the code is writing.

explicitly exported symbols should not be eliminated by the linker. either this is an LTO linker bug, or the linker script has some issue.

wait, the rust code and external code are not linked together as a single program? I didn't expect a bare metal platform would support loading multiple ELF files at once?

are you allowed to share your linker scripts, please? I'm very interested in your setup now.

Yeah, sorry if this wasn't made clear. These programs are effectively time multiplexed: this Rust program runs first, does its business including populating this data (I should mention it is repr(C)) in memory, then terminates. At a later time this external program will run and access this data. Both programs are linked such that they know this content is at a known fixed address.

I think this is very unlikely: we use the same linker file for both the dev and release profile builds (literally the only difference in the build options is lto=false vs lto=true; we made sure everything else was consistent), and the dev build does exactly the right thing.
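For concreteness, the sole profile difference described above would look something like this in Cargo.toml (a sketch; every other setting is assumed identical between profiles):

```toml
# dev profile: LTO off, the data writes survive
[profile.dev]
lto = false

# release profile: fat LTO on, the unread writes get optimized away
[profile.release]
lto = true
```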

I'll see if I can put together a minimal test example to illustrate. But yeah export_name doesn't do anything except change the name in the dev build's map file - it doesn't force LTO to keep those symbols in the release builds.

oh, that's an interesting setup. in that case, I would say a volatile write is actually the correct way to do it. from the view of the rust program, the destination memory location is no different than an mmio region; it's completely out of the scope of the rust abstract machine.

on that note, I don't think you should define the destination location as a variable with a #[link_section] annotation in rust; rather, it should be declared as an external symbol, which can be defined in the linker script directly.

for example, here's how I would probably do it in similar scenario:

use core::cell::UnsafeCell;
use core::mem::MaybeUninit;

#[repr(C)]
struct InitData {
    //...
}

// use an external static to import the symbol into rust
unsafe extern "C" {
    static INIT_DATA: UnsafeCell<MaybeUninit<InitData>>;
}

and to write to the location using volatile:

fn main() {
    let init_data = InitData{
        //...
    };

    // SAFETY:
    //   single threaded, exclusive access
    //   `MaybeUninit` is `#[repr(transparent)]`
    unsafe {
        core::ptr::write_volatile(INIT_DATA.get().cast(), init_data);
    }
}

alternatively, the data can be constructed in place too, but it's more tricky to get right.

in linker script, assign a fixed address to the symbol:

INIT_DATA = 0x80200000;

We generally don't like to do the allocation in the linker command file; we prefer that the allocation happen in the generated object file, so that proper alignment and padding between the items in this output memory region are handled by the toolchain. But I'm hearing you say "yes, use write_volatile" (regardless of the allocation mechanism), so that is what we will do (it really seemed like the only possible guarantee in pure Rust code, short of some inline asm hiding what was going on from the compiler).
