Is #[repr(Rust)] layout deterministic given identical compiler and rlib?

Hi everyone,

First of all, thank you so much for taking the time to read this.

I am currently in the design phase of a toy monolithic kernel in Rust. I am trying to implement support for Loadable Kernel Modules (LKMs).

I haven't started the implementation yet because I want to validate the theoretical soundness of my architecture before I go down a rabbit hole. I must admit that my understanding of rustc internals and LLVM behavior is limited. I have built this design on my current understanding, but I am worried that it might rely on unspecified behavior that happens to work by coincidence rather than on solid principles. I would greatly value your expertise on this.

The Context (Constraints & Alternatives):

  1. No Stable ABI needed: I am not trying to support modules compiled with different rustc versions. The Kernel and Modules are always compiled in lockstep (exact same compiler, exact same flags).
  2. Ergonomics first: I want to avoid #[repr(C)] or Opaque Pointers everywhere. I want to use native Rust structs and Traits as much as possible.
  3. No Generics: I assume the strategies below cover my needs, so I do not plan to support exporting generic functions across the boundary.
  4. Why not existing crates? I am aware of crates like thin_trait_object, abi_stable, and safer_ffi. However, I find them too "heavy" for my kernel context. I aim for a minimal solution that leverages the compiler's behavior rather than introducing complex abstraction layers or dependencies.

The Proposed Architecture:

I plan to split the project into a library (ABI), the kernel (Runtime), and the modules.

/workspace
β”‚
β”œβ”€β”€ kernel_abi/ (Lib, crate-type="rlib")
β”‚   └── Defines structs (repr-Rust), traits, and interface functions.
β”‚       Pure logic/interface, NO static state.
β”‚
β”œβ”€β”€ kernel/ (Bin)
β”‚   └── Statically links `kernel_abi`.
β”‚       Holds global state (allocator, etc.).
β”‚       Contains my custom ELF Loader.
β”‚
└── driver_net/ (Lib, emits ".o")
    └── Depends on `kernel_abi`.
        Acts as a leaf node (device driver).

The Intended Workflow:

  1. kernel_abi is compiled once to libkernel_abi.rlib.
  2. kernel links libkernel_abi.rlib (with LTO enabled).
  3. module compiles using metadata from libkernel_abi.rlib but emits a relocatable object (.o). (LTO disabled).
  4. Runtime: My kernel loads the module's .o file, parses it, and manually resolves undefined symbols (U) against the kernel's internal symbol table.

My Hypotheses & Questions:

1. Structs & Layout (The "Determinism" Gamble)
I plan to use default #[repr(Rust)] for most structs in kernel_abi.

  • My Hypothesis: Since both the Kernel and the Module compile against the exact same rlib metadata and use the exact same compiler version, the memory layout (field offsets) should be deterministic and identical in both artifacts.
  • The Goal: If this holds true, I hope to pass &MyStruct across the boundary safely without forcing repr(C).
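To make the gamble concrete, here is a sketch of what a `kernel_abi` type might look like (the type and fields are invented for illustration). With the default `#[repr(Rust)]`, rustc is free to reorder fields; the hypothesis is only that the kernel and the module would both see the *same* chosen layout via the shared rlib:

```rust
use std::mem;

// Hypothetical kernel_abi type. Under #[repr(Rust)] the field order in
// memory is unspecified; the bet is that both sides of the boundary
// read one identical layout from the shared rlib's metadata.
pub struct NetBuffer {
    pub len: usize,
    pub flags: u8,
    pub data: *mut u8,
}

fn main() {
    // The exact offsets are unspecified, but within one compilation
    // they are a single fixed answer.
    let size = mem::size_of::<NetBuffer>();
    let align = mem::align_of::<NetBuffer>();
    println!("size = {size}, align = {align}");
    // Only lower bounds are guaranteed, not the concrete numbers.
    assert!(size >= mem::size_of::<usize>() + mem::size_of::<*mut u8>() + 1);
    assert_eq!(size % align, 0);
}
```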

2. Function Boundary (No Inlining)
This applies to both standard impl methods and statically dispatched trait methods (e.g. obj.trait_method()) called by the module but implemented in the kernel.
I intend to use a macro to enforce #[inline(never)] and #[export_name].

  • Intended Mechanism: The module generates a standard call SymbolName.
  • Relocation: My kernel loader handles patching these calls at runtime to point to the implementation inside the kernel.
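A minimal sketch of what the macro might expand to (the symbol name and function body are invented): `#[inline(never)]` keeps the call from being inlined into the module's object file, so the module emits a plain call relocation, and `#[export_name]` pins the unmangled symbol that the loader resolves against:

```rust
// Illustrative expansion of the boundary macro.
// #[inline(never)] forces an out-of-line call site in the module;
// #[export_name] fixes the symbol name the kernel loader patches.
#[inline(never)]
#[export_name = "kabi_console_write"]
pub fn console_write(msg: &str) -> usize {
    // Placeholder body; the real implementation lives in the kernel.
    msg.len()
}

fn main() {
    assert_eq!(console_write("hello"), 5);
}
```

Note that this pins only the symbol *name*, not the (unstable) Rust calling convention; covering the latter is exactly what the same-compiler assumption is for.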

3. Dynamic Traits & VTables (The LTO Nightmare)
I need to pass dyn Trait objects bidirectionally across the boundary. While Module-to-Kernel calls can often be handled via static dispatch (exported symbols), the critical path is Kernel-to-Module calls (e.g., the Kernel calling driver.read() on a trait object provided by the Module).

The possibility of LTO in the Kernel silently reordering VTable entries gives me a massive headache. I am worried that LTO might perform Virtual Function Elimination (VFE) or reorder pointers based on usage, causing a fatal layout mismatch with the Module (which generates a full VTable).

  • My Guess: If I compile the Kernel with -C link-arg=-export-dynamic, I suspect this might force LTO to preserve all public symbols (including VTables) intact, because the linker assumes they might be used externally.
  • The Proposed Hack: Since I don't actually want the bloat of dynamic symbol tables in my raw kernel binary, I plan to discard the .dynsym section in the Kernel's Linker Script (/DISCARD/).
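The discard step might look like the following linker-script fragment (the exact section list is a guess; whether stripping these is safe depends on how the kernel image is loaded):

```
/* Hypothetical /DISCARD/ rule: drop the dynamic-symbol machinery
   that -export-dynamic would otherwise keep in the kernel image. */
SECTIONS
{
    /DISCARD/ : {
        *(.dynsym)
        *(.dynstr)
        *(.hash)
        *(.gnu.hash)
    }
}
```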

I am seeking validation on these points:

  1. Is my assumption about #[repr(Rust)] layout determinism theoretically safe? (Given: exact same compiler + shared rlib metadata).

  2. Does -export-dynamic actually stop LLVM/LTO from altering VTable layouts (e.g. trimming unused methods)? Or am I misunderstanding how LTO interacts with exported symbols?

  3. The Dealbreaker: Does LLVM/LTO ever reorder VTable entries for #[repr(Rust)] traits?

     • Context: If I can ensure no entries are trimmed (via Q2), can I assume the order remains deterministic (e.g., declaration order)? If LTO reorders VTables, my kernel will explode and I will abandon this approach immediately.

I sincerely appreciate any guidance you can provide. I want to know if this architecture is sound in principle, or if I should stop trying to be clever and stick to repr(C). Thank you!

(P.S. Please forgive the AI-like formatting of this post. Trying to organize all these architectural details into readable text manually is a disaster, so I used some help to clean it up! :-))

I think I should give up on LTO optimizationβ€”it's making things significantly worse

From a formal standpoint: when a layout is left unspecified, it is unspecified. So no, you cannot conclude that the memory layout is deterministic.

From a practical standpoint: an optimizing compiler may choose different memory layouts depending on usage in distinct programs, compilation units, binaries, and so on. A compiler could even use randomized optimization algorithms that produce a different outcome on each run.

Sure, you can validate how a compiler version X works. But you have absolutely no guarantee that version X+1 behaves identically.

The result is: you are building your house on sand, and it is only a matter of time until it collapses.

So, what's wrong with #[repr(C)] ?

1 Like

The answer is no. The nightly -Z randomize-layout option exists (see randomize_layout in The Rust Unstable Book). (I think it is there mostly to prove the point that the layout is not stable.)

5 Likes

Thanks for the pointer on -Z randomize-layout.

However, my scenario is strictly constrained: the Kernel and Module share the exact same libkernel_abi.rlib file.

My assumption is that for non-generic structs, the layout is computed once when the rlib is built and stored in its metadata. Downstream crates (Kernel/Module) should read this fixed layout from the metadata rather than re-computing it.

Does the randomization risk still apply when consuming the same pre-compiled artifact for non-generic types?

If I have misunderstood how layout randomization interacts with pre-compiled metadata, please correct me.

Yeah. "Building on sand" is exactly what I want to avoid.

I agree with you regarding Structs: I will switch to enforcing #[repr(C)] for all data types crossing the boundary. As you said, the aesthetic cost is minor compared to the peace of mind it brings. Or I might enforce opaque pointers where possible.

However, for Traits, abandoning native dyn Trait for C-style manual vtables feels like a huge ergonomic regression. Since I am controlling the entire build pipeline (same compiler, no LTO), I might still try to rely on native Traits, but I acknowledge this is the riskiest part of the architecture.
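A sketch of that compromise (the struct, trait, and impl are invented for illustration): #[repr(C)] for plain data crossing the boundary, native dyn Trait kept for driver objects:

```rust
use std::mem;

// Invented example of a boundary-crossing data type.
#[repr(C)] // field order is now guaranteed: `line` first, then `count`
pub struct IrqInfo {
    pub line: u32,
    pub count: u64,
}

// Invented driver trait, kept as a native Rust trait.
pub trait Driver {
    fn read(&self, buf: &mut [u8]) -> usize;
}

struct NullDriver;

impl Driver for NullDriver {
    fn read(&self, _buf: &mut [u8]) -> usize {
        0
    }
}

fn main() {
    // repr(C) pins the field offsets.
    assert_eq!(mem::offset_of!(IrqInfo, line), 0);

    // A &dyn Driver is a fat pointer: (data pointer, vtable pointer).
    // The risk discussed above is whether both sides agree on the
    // vtable's *contents*, not on this two-word shape.
    let d: &dyn Driver = &NullDriver;
    assert_eq!(mem::size_of_val(&d), 2 * mem::size_of::<usize>());
    assert_eq!(d.read(&mut []), 0);
}
```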

Thanks

If everything is built in the same workspace invocation I assume it should be possible to use dylib targets for this (not cdylib, just dylib).

This can be abstracted over with macros, which is what I believe stabby and similar crates do.
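If the dylib route is taken, the crate-type switch is just a Cargo.toml change (fragment below; untested in a kernel context):

```toml
# kernel_abi/Cargo.toml: a Rust dylib keeps the Rust ABI and crate
# metadata, unlike cdylib, which exposes only a C interface.
[lib]
crate-type = ["dylib"]
```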

For types defined in the rlib, the layout will be deterministic given an identical rlib. All information that determines the layout is stored inside the rlib or its dependencies, and an rlib records the exact versions of all of its dependencies.

LTO currently can't ever do that. Vtable layouts are fixed once the trait definition is in an unchanged rlib, just like type layouts.

5 Likes

Thanks for the correction!

I think I was confusing this with concepts related to LLVM VFE (Virtual Function Elimination). I mistakenly thought that if LTO eliminated the unused functions, it would also alter the vtable layout itself.

Good to know that's not the case.

I think so too

I've considered macros too, but I assume they eventually just lower everything to a C-ABI layer. I'm trying to see if there's a more Rust-native way to do this first.

That optimization is incompatible with dylibs. And kernel modules are effectively dylibs.

1 Like

The critical part here is the shared rlib. (Since rlibs aren't portable between compiler versions, that automatically requires the same-compiler part.)

Remember that all cargo compilation with a crate graph is multiple rustc invocations, so given that separate compilation works, this has to work too.

Re-compiling the shared crate with the types can introduce issues, as you discuss, but in some sense you're essentially just building your own build system, which is absolutely doable.

8 Likes

Whether a crate was compiled with randomize-layout or not is stored in the rlib and will be properly handled by dependent crates. This plan will still work.

For other compiler options, it depends on the option for whether mixed values are supported in a crate graph.

3 Likes

I think even if this worked in practice, it wouldn't work in theory.