Direct access to argc and argv?

I have an unusual topic that (understandably!) not many people seem to be talking about.

I know every process on UNIX and Windows has argc and argv stored in the stack portion of the process's memory. I also know that Rust doesn't have an option to access these directly from a normal main; instead, it converts them to OsString and then exposes those via std::env::args().

I'd like to access these because they point to strings which are already encoded in the exact form I need (e.g. null-terminated utf-8 on UNIX, null-terminated widechar on Windows) for passing to some C functions via FFI which expect that format anyway.

I'd like to avoid having Rust convert them from the format I already need into a format I don't need (OsString, which isn't null-terminated and doesn't store widechar on Windows) and then have to take that OsString and convert it back into a null-terminated OS-appropriate string...when the process's stack memory has the data in the exact format I already wanted right there!
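
For reference, the round trip described above looks roughly like this on UNIX. This is a minimal sketch: `to_c_strings` and `to_argv` are hypothetical helper names, not std APIs, and the `expect` on interior NULs is just one possible way to handle that error.

```rust
use std::ffi::{CString, OsString};
use std::os::raw::c_char;

/// Convert owned OS strings into NUL-terminated C strings (UNIX only).
#[cfg(unix)]
fn to_c_strings<I: IntoIterator<Item = OsString>>(args: I) -> Vec<CString> {
    use std::os::unix::ffi::OsStringExt;
    args.into_iter()
        // into_vec() hands over the raw bytes without re-encoding;
        // CString::new then checks for interior NULs and appends the terminator.
        .map(|arg| CString::new(arg.into_vec()).expect("argument contains an interior NUL"))
        .collect()
}

/// Build a NULL-terminated, argv-style pointer array that borrows from `owned`.
fn to_argv(owned: &[CString]) -> Vec<*const c_char> {
    owned
        .iter()
        .map(|s| s.as_ptr())
        .chain(std::iter::once(std::ptr::null()))
        .collect()
}

#[cfg(unix)]
fn main() {
    // std has already copied the process arguments into OsStrings here...
    let owned = to_c_strings(std::env::args_os());
    // ...and this builds the `char **` shape a C function would expect.
    let argv = to_argv(&owned);
    // `argv.as_ptr()` is only valid while `owned` is alive.
    println!("argc = {}, argv has {} slots", owned.len(), argv.len());
}

#[cfg(not(unix))]
fn main() {}
```

So the data is copied once by std when it builds the OsStrings, and `CString::new` then scans each argument and may reallocate to append the terminator, all to end up with the same bytes the process started with.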

I do realize that from a performance perspective this is not a big deal in practice, but it bugs me enough that I've gone searching for a way to get direct access to argc and argv without having to go through OsString. This works, but with an unfortunate downside:

#![no_main]

use std::os::raw::{c_char, c_int};

#[cfg(target_os = "macos")] // other targets need other symbols
#[no_mangle]
pub extern "C" fn main(argc: c_int, argv: *const *const c_char) -> c_int {
    // ...code goes here
    0 // exit status returned to the C runtime
}

It compiles and runs, and argc and argv arrive exactly as the C runtime passes them, but none of Rust's normal runtime setup logic runs. One consequence is that a panic prints its message (including the backtrace if RUST_BACKTRACE=1 is set) and then aborts with "fatal runtime error: failed to initiate panic, error 5". I suspect that panicking in this setup might actually be UB.

Does anyone have any info on this topic? I couldn't find a language proposal about direct access to argc and argv, and the workaround I have now isn't worth it because it requires skipping the runtime setup (which I don't want to skip). I'd appreciate any ideas for how to get this to work!

I think this is the intended feature. But see also. I don't know much about the details (like what else you're missing out on -- the panic handler at least it seems from that issue, and wouldn't surprise me if there's more), so consider this a hint towards something to research more and not necessarily the end answer.

You can certainly cause UB with no_mangle.

For Windows, is there a reason you can't use GetCommandLineW + CommandLineToArgvW?
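
A rough sketch of what that could look like from Rust, with the two Win32 functions declared by hand rather than pulled from a bindings crate. `wide_len` is a hypothetical helper, and the non-Windows `main` only exists so the file builds elsewhere. The `unsafe extern` syntax assumes Rust 1.82 or later; on older toolchains, drop the `unsafe` keyword from the extern blocks.

```rust
/// Count the UTF-16 code units before the terminating NUL.
///
/// # Safety
/// `ptr` must point to a readable, NUL-terminated sequence of u16 values.
unsafe fn wide_len(ptr: *const u16) -> usize {
    let mut len = 0;
    while unsafe { *ptr.add(len) } != 0 {
        len += 1;
    }
    len
}

#[cfg(windows)]
mod win {
    use std::ffi::c_void;

    #[link(name = "kernel32")]
    unsafe extern "system" {
        // Returns a pointer to the process's existing command-line buffer.
        pub fn GetCommandLineW() -> *const u16;
        pub fn LocalFree(mem: *mut c_void) -> *mut c_void;
    }

    #[link(name = "shell32")]
    unsafe extern "system" {
        // Allocates a new argv-style array; release it with LocalFree.
        pub fn CommandLineToArgvW(cmd_line: *const u16, num_args: *mut i32) -> *mut *mut u16;
    }
}

#[cfg(windows)]
fn main() {
    unsafe {
        let mut argc: i32 = 0;
        let argv = win::CommandLineToArgvW(win::GetCommandLineW(), &mut argc);
        assert!(!argv.is_null(), "CommandLineToArgvW failed");
        for i in 0..argc as usize {
            // Each entry is a NUL-terminated wide string, ready for FFI.
            let arg = *argv.add(i);
            println!("argv[{i}] has {} UTF-16 units", wide_len(arg));
        }
        win::LocalFree(argv.cast());
    }
}

#[cfg(not(windows))]
fn main() {
    // Stand-in so the sketch builds off Windows: measure a hand-made wide string.
    let demo: Vec<u16> = "demo".encode_utf16().chain(std::iter::once(0)).collect();
    println!("{}", unsafe { wide_len(demo.as_ptr()) });
}
```

Note that only GetCommandLineW is copy-free: CommandLineToArgvW parses the single command-line string into a freshly allocated array, so this avoids the OsString round trip but not every allocation.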

I suspect that's not true for Windows processes. I vaguely recall that Windows processes don't even have a return-to-operating system address on the stack; that calling ExitProcess is required to avoid a fault.

TIL about GetCommandLineW, thanks for the link!

I guess technically Windows stores it in the PEB portion of the process's memory rather than the stack, but I didn't know you could call a (non-syscall) function to access it!

I know about macOS's _NSGetArgc() and _NSGetArgv(), but as far as I'm aware, the only way to do something like that on Linux is reading /proc/self/cmdline, which requires a syscall that could easily be more costly than doing the redundant conversions.
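
For comparison, here is a minimal sketch of the /proc/self/cmdline route on Linux. `split_cmdline` is a hypothetical helper, and the sketch assumes the file holds the arguments back to back, each terminated by a NUL byte.

```rust
use std::ffi::CStr;

/// Split a /proc/self/cmdline-style buffer into NUL-terminated arguments.
/// A trailing chunk without a terminator is dropped rather than guessed at.
fn split_cmdline(buf: &[u8]) -> Vec<&CStr> {
    buf.split_inclusive(|&b| b == 0)
        // Each chunk ends with its NUL, so it can be viewed as a CStr in place.
        .filter_map(|chunk| CStr::from_bytes_with_nul(chunk).ok())
        .collect()
}

fn main() -> std::io::Result<()> {
    #[cfg(target_os = "linux")]
    {
        // Opening, reading, and closing the file costs several syscalls
        // plus a heap allocation for the buffer.
        let buf = std::fs::read("/proc/self/cmdline")?;
        for (i, arg) in split_cmdline(&buf).iter().enumerate() {
            println!("argv[{i}] = {arg:?}");
        }
    }
    Ok(())
}
```

The resulting `&CStr` values are already NUL-terminated, so `as_ptr()` can go straight to FFI, but they point into a new heap buffer rather than the original argv memory, which is the overhead mentioned above.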
