Data in &'static kernel entry point argument seems corrupted

It's hard to diagnose without more context :grimacing:, or the actual code you used (the one you've given has several typos)

I understand the limitations of the info I've given. This is a closed-source project, which is why I haven't provided the entire source code. I'll try to provide better context in a bit, my apologies.


@Yandros That took a bit. I've updated the gist with more code. Posting it again just in case: Kernel & boot loader code. (github.com).

Thanks, that should already allow for more knowledgeable people in this area than I to help you :slightly_smiling_face:

In my case I don't get the exact semantics of the memory mappings shenanigans, so I'll assume they're fine.

Could you move the info!("{:#X?}", explosion) right before calling the entry()?

  • If you have access to some debugger of sorts, could you inspect the layout of that explosion static reference (and that of its referent) before and after the call to entry()?

  • If you don't, you can try to mock a debugger through debug printing, by using stuff such as:

    unsafe // Safety: `*it` must not contain padding or uninit bytes
    fn debug_bytes (it: &'_ impl ?Sized)
    {
        let bytes = ::core::slice::from_raw_parts(
            <*const _>::cast::<u8>(it),
            ::core::mem::size_of_val(it),
        );
        info!("{:#X?}", bytes);
    }
    macro_rules! debug_bytes {( $place:expr $(,)? ) => (
        debug_bytes(&$place)
    )}
    

    so as to do:

    let at_explosion = Box::leak(explosion);
    unsafe {
        debug_bytes!(at_explosion); // shallow-inspect the `&'static` itself
        debug_bytes!(*at_explosion); // shallow-inspect the struct
    }
    

    right before the entry() call, and, ideally, do something similar right at the start of kernel_main().

From there, two cases:

  • either the addresses match, and there is something wrong with the virtual memory setup / the memory mappings;

  • or the addresses don't match, and there is something wrong with the call ABI and/or the layout of these things.

    • In that regard, you can reduce the chances of ABI issues by using #[repr(C)] things at as many layers as possible (using, for instance, SharedSlice<'lt, T> or slice_ref<'lt, T> instead of &'lt [T]), together with extern "C" functions (mainly for kernel_main and the EntryPoint type definition).
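To make that last point a bit more concrete, here is a minimal sketch, not the actual kaboom code. `RawSlice`, `BootInfo` and `example_entry` are hypothetical names made up for illustration; the idea is only that a `#[repr(C)]` pointer-plus-length pair replaces `&'lt [T]`, and that the entry point names its ABI explicitly.

```rust
use core::marker::PhantomData;

/// Hypothetical `#[repr(C)]` stand-in for `&'a [T]`: a pointer plus a length,
/// in a field order that both sides are guaranteed to agree on.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct RawSlice<'a, T> {
    ptr: *const T,
    len: usize,
    _lifetime: PhantomData<&'a [T]>,
}

impl<'a, T> RawSlice<'a, T> {
    pub fn new(slice: &'a [T]) -> Self {
        Self { ptr: slice.as_ptr(), len: slice.len(), _lifetime: PhantomData }
    }

    pub fn as_slice(&self) -> &'a [T] {
        // Safety: `ptr` and `len` were taken from a valid `&'a [T]` in `new`.
        unsafe { core::slice::from_raw_parts(self.ptr, self.len) }
    }
}

/// Hypothetical boot-info struct; the field names are made up for this sketch.
#[repr(C)]
pub struct BootInfo<'a> {
    pub revision: u64,
    pub tags: RawSlice<'a, u64>,
}

/// Entry point with an explicitly chosen ABI instead of the default Rust one.
/// (A real kernel entry point would return `!`; this one returns a value so
/// that it can be exercised on a hosted target.)
pub extern "C" fn example_entry(info: &BootInfo<'_>) -> u64 {
    info.revision + info.tags.as_slice().iter().sum::<u64>()
}
```

With this shape, neither the struct layout nor the argument passing depends on choices rustc is free to change between compilations.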

Thanks for your suggestions. I will try them soon and report my results

Before / After (screenshots not preserved)

No corruption happening

debug_bytes!(explosion) and debug_bytes!(*explosion)
Before / After (screenshots not preserved)

The debug_bytes!(explosion) output before and after is clearly not the same, so it's the 2nd case.

@Yandros, I changed the entry point ABI to extern "C", but the addresses still don't match, even if I change the argument to take just a u64:

[ INFO]:  src/main.rs@245: 0x0000000006538018
Fuse ignition begun.
Boot loader data ptr: 0x6537918

EDIT: Just as an experiment, I decided to call the entry point manually using inline assembly, like so:

unsafe {
    asm!("call {}", in(reg) entry, in("rdi") explosion);
}

and...

[ INFO]:  src/main.rs@245: 0x0000000006538018
Fuse ignition begun.
Boot loader data ptr: 0x6538018
Fuse initialization complete.

Huh, it works.
Does this mean I've found a compiler bug?
I'm suspecting that rustc might be ignoring the function's calling convention and using the entry point convention. Since UEFI uses the efiapi ABI, that would only make sense.
EDIT 2: Bingo! I've just found a compiler bug!

[ INFO]:  src/main.rs@248: 0x0000000006538018
Fuse ignition begun.
Boot loader data ptr: 0x6538018
Fuse initialization complete.

Above output is with extern "efiapi"
Final edit: I made an issue in the rust GitHub repo: Rust ignores extern ABI used for called function if is different than the caller's · Issue #88749 · rust-lang/rust (github.com)


In your transmute you explicitly told rustc to cast the entrypoint to a function with the default extern "Rust" calling convention, but if it's actually been defined as an extern "C" or extern "efiapi" function and they pass around arguments differently that would explain the "corrupted" arguments you've been seeing.
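As a rough illustration of why the ABI has to be spelled out in the pointer type you transmute to, here is a minimal sketch. `fake_kernel_main` and `entry_from_addr` are hypothetical helpers invented for this example (they are not from the gist), and the entry point returns a value only so that it can run on a hosted target.

```rust
use core::mem::transmute;

/// Stand-in for a kernel entry point that was compiled with the C ABI.
/// (Hypothetical; a real entry point would take boot data and never return.)
pub extern "C" fn fake_kernel_main(boot_data: u64) -> u64 {
    boot_data
}

/// The ABI is part of the function pointer type.
pub type EntryPoint = extern "C" fn(u64) -> u64;

/// Turns a raw entry address (as a loader might read it from an ELF header)
/// back into a callable function pointer.
///
/// # Safety
/// `addr` must be the address of a function whose real signature and ABI
/// match `EntryPoint` exactly; `transmute` checks neither.
pub unsafe fn entry_from_addr(addr: usize) -> EntryPoint {
    unsafe { transmute::<usize, EntryPoint>(addr) }
}
```

If `EntryPoint` were declared as plain `fn(u64) -> u64`, the transmute would still compile, but the call would be made with the Rust ABI of the caller's target, and calling a function through a pointer with the wrong ABI is undefined behavior.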

Here is a list of all calling conventions supported by Rust:

https://doc.rust-lang.org/nomicon/ffi.html#foreign-calling-conventions


The examples in the post itself are out of date. See the gist instead.
But, both the entry point and the transmute used the same signature and ABI.
So yeah, it's a compilation bug.
EDIT: note: I'm using rust nightly

https://gist.github.com/VisualDevelopment/9260154493fa8b6b006a9a1ff9cfa017 is missing the source for kaboom.

Yup.
Here is lib.rs:

kaboom source code
#![no_std]

pub mod tags;

pub const CURRENT_REVISION: u64 = 0x5;

pub type EntryPoint = fn(&'static ExplosionResult) -> !;

#[derive(Debug)]
pub struct ExplosionResult<'a> {
    pub revision: u64,
    pub tags: &'a [tags::TagType<'a>],
}

impl<'a> ExplosionResult<'a> {
    pub fn new(tags: &'a [tags::TagType<'a>]) -> Self {
        Self {
            revision: CURRENT_REVISION,
            tags,
        }
    }
}

I've also added this:

static _KERNEL_MAIN_CHECK: kaboom::EntryPoint = kernel_main;

to kernel/src/main.rs in order to make sure both use the same signature

EDIT: Updated the gist with more code

ExplosionResult and TagType need a #[repr(C)] to ensure that the two versions of kaboom (one compiled for the bootloader, the other for the kernel) use the same layout for them. This is not necessarily the fix though.

Why would the layout be different? As far as I know, the layout should not be compiled differently each time you run rustc.

And you need to use extern "C" fn both in the bootloader, when calling the kernel entry point, and in the kernel, when defining it. The default Rust ABI is unstable.

The -Cmetadata argument passed by cargo may be different. It is possible that this causes a different layout.

I see.

I had no idea. Thanks for all the information

I think it will actually need to be extern "sysv" fn or extern "fastcall" fn. The UEFI target you use for the bootloader uses the windows fastcall calling convention as default C calling convention, while the target for your kernel uses the SystemV calling convention as default C calling convention.
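A minimal sketch of pinning the convention by name, assuming an x86_64 target (the `abi_demo` module and its function names are made up for illustration). On x86_64, Rust spells the System V convention `extern "sysv64"` and the Microsoft x64 convention `extern "win64"`, whereas `extern "C"` means whichever of the two the current target defaults to.

```rust
// `sysv64` and `win64` only exist on x86_64, so the sketch is gated on that.
#[cfg(target_arch = "x86_64")]
pub mod abi_demo {
    /// Kernel side: defined with the System V AMD64 convention,
    /// regardless of what the target's default C ABI is.
    pub extern "sysv64" fn kernel_entry(boot_data: u64) -> u64 {
        boot_data
    }

    /// Shared type: the ABI is fixed here, so both crates agree on it.
    pub type EntryPoint = extern "sysv64" fn(u64) -> u64;

    /// Loader side: this function itself uses the Microsoft x64 convention
    /// (as code built for a UEFI target would by default), yet the call
    /// through `entry` is emitted with the convention named in `EntryPoint`.
    pub extern "win64" fn loader_calls(entry: EntryPoint, boot_data: u64) -> u64 {
        entry(boot_data)
    }
}
```

Because the convention is named in the shared `EntryPoint` type, it no longer matters that the bootloader and the kernel are built for targets with different default C ABIs.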


Using extern "sysv64" resolved the issue. Thanks a lot.
I guess I expected the crate to use a different calling convention than the crate using it.
Closing the issue on the rust repository. Have a nice day


Rust reserves the right to re-optimize the layout of each non-repr(C) struct and enum each time it compiles, even though it appears to you that the layout(s) is/are the same. See this recent URLO comment for an example of why this could happen.
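One way to guard against this, sketched here with a hypothetical `BootHeader` struct (not a type from kaboom), is to mark the shared type `#[repr(C)]` and pin its layout with compile-time assertions, so that a mismatch fails the build instead of corrupting data at boot.

```rust
use core::mem::{offset_of, size_of};

/// Hypothetical shared header; with `#[repr(C)]` the fields are laid out
/// in declaration order, with C-style padding.
#[repr(C)]
pub struct BootHeader {
    pub revision: u64,
    pub tag_count: u32,
    pub flags: u8,
}

// Evaluated at compile time in every crate that includes this definition.
const _: () = {
    assert!(offset_of!(BootHeader, revision) == 0);
    assert!(offset_of!(BootHeader, tag_count) == 8);
    assert!(offset_of!(BootHeader, flags) == 12);
    assert!(size_of::<BootHeader>() == 16);
};
```

Without `#[repr(C)]`, rustc would be free to reorder these fields, so the same assertions could pass in one build and fail in another.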

