Differing stack allocation just by adding a function call

This question may not be specific to embedded development, or even to Rust (though it could be: all tests described here were run with Rust code built for and run on an ARM Cortex-M4 target).

I have the following scenario:

  1. I can call a library function keygen() which works just fine, and the program finishes successfully
  2. If I add another library function call, sign(), after keygen(), then keygen() crashes, probably with a stack overflow

I've watched the stack pointer for both cases:

Indeed, the value of the stack pointer (which is deterministic for a given compilation) differs by a substantial amount (ca. 12k) between the two cases. So I assume that's what causes keygen() to crash when it allocates a specific amount of memory on the stack: it's okay in the first case, but causes a stack overflow in the second case because less stack memory is available.

#[entry]
fn name_of_this_function_desnt_matter() -> ! {
    // workaround is necessary for debugging, see https://users.rust-lang.org/t/debugging-with-probe-rs-and-vs-code-and-nrf52840-cant-set-breakpoints/108566
    main()
}

fn main() -> ! { // breakpoint 1 to check SP
    defmt::println!("main()");
    let _board = dk::init().unwrap(); // breakpoint 2 to check SP
    /* some more code */
    keygen();
    /* some more code */
    sign(); // called only in second case, otherwise commented out 
}
  • breakpoint at // breakpoint 1: SP is equal in both cases
  • breakpoint at // breakpoint 2: SP differs by the same amount, ca. 12k (and continues to do so until the crash)

What's causing the stack allocation to change when the sign() function is used, even before it is called?

Usually LLVM would just create one stack frame for all local variables. That's faster and uses less code.

I'm not sure if it even has a mode where it may [try to] use the stack sparingly.

Could you check whether #[inline(never)] makes a difference?

But (as I understand) that would then only be w.r.t. the "unconditionally used" local variables in the main() scope. There are a few more variables when adding the sign() call. But those are not even close to consuming 12k of memory.

I'm sure (please correct me if I'm wrong) we are not talking about local variables in another scope (those that are used/allocated after entering sign()), because that's how the stack and stack frames for function calls (jumps) work.

That's why I was talking about #[inline(never)]: if these functions are inlined then their local variables are embedded in the stack frame whether they are “conditional” or “unconditional”.
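To make the mechanism concrete, here is a minimal host-side sketch, not the code from the question. The names approx_sp() and sign_like() are hypothetical, and the stack pointer is only approximated by the address of a local variable; on the Cortex-M4 target one would read the real SP register instead (for example in the debugger, as in the question). With #[inline(never)] on the function that owns the large buffer, the buffer is reserved only while that function runs. If the compiler inlined it, the buffer could become part of the caller's frame and be reserved as soon as the caller is entered.

```rust
use std::hint::black_box;

/// Approximates the current stack pointer by returning the address of a local.
/// Host-side stand-in (assumption): on the embedded target, read SP directly.
#[inline(never)]
fn approx_sp() -> usize {
    let marker = 0u8;
    black_box(&marker) as *const u8 as usize
}

/// Hypothetical stand-in for a library function that needs a large stack buffer.
/// Because it is never inlined, the 12 KiB buffer lives in its own frame.
#[inline(never)]
fn sign_like() -> (usize, u8) {
    let mut scratch = [0u8; 12 * 1024];
    // Keep the optimizer from removing the buffer.
    black_box(&mut scratch);
    // Measure the stack position while the buffer is alive.
    let sp = approx_sp();
    (sp, black_box(&scratch)[0])
}

fn main() {
    // Stack position as seen from the caller, before the big frame exists.
    let outer = approx_sp();
    // Stack position as seen from inside the function with the big buffer.
    let (inner, _) = sign_like();
    println!(
        "sign_like() frame costs roughly {} bytes",
        outer.abs_diff(inner)
    );
}
```

Note that the attribute only helps if it sits on the function whose locals are large. If the big buffers belong to functions further down inside the library, marking only the outermost wrapper may change nothing.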

The worst scenario I have seen was an instruction-decoding function that called hundreds of tiny helper functions, each of which created a few small local variables. LLVM inlined everything and gave each local variable a separate slot on the stack, which produced a stack frame larger than a megabyte!

The solution, in our case, was to put everything in a union (with hundreds of separate possibilities) and pass a reference to that union to these hundreds of tiny (and, thankfully, autogenerated) functions.

You may be experiencing something like this, just not as drastic.
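A rough sketch of that union approach, with only two variants instead of hundreds. All names here (Scratch, AddScratch, ShiftScratch, decode_add, decode_shift) are made up for illustration and are not from the original project. The point is that every helper borrows the same scratch storage, so even if all helpers are inlined there is only one slot on the stack, sized for the largest variant.

```rust
// Per-helper temporaries; each helper uses exactly one of these.
#[derive(Clone, Copy)]
struct AddScratch {
    lhs: u32,
    rhs: u32,
}

#[derive(Clone, Copy)]
struct ShiftScratch {
    value: u32,
    amount: u8,
}

// One shared storage area, as large as its largest variant.
union Scratch {
    add: AddScratch,
    shift: ShiftScratch,
}

fn decode_add(scratch: &mut Scratch, word: u32) -> u32 {
    // Writing a Copy field of a union is safe.
    scratch.add = AddScratch { lhs: word >> 16, rhs: word & 0xFFFF };
    // SAFETY: the `add` variant was written on the line above.
    let s = unsafe { scratch.add };
    s.lhs + s.rhs
}

fn decode_shift(scratch: &mut Scratch, word: u32) -> u32 {
    scratch.shift = ShiftScratch { value: word >> 8, amount: (word & 0x1F) as u8 };
    // SAFETY: the `shift` variant was written on the line above.
    let s = unsafe { scratch.shift };
    s.value << s.amount
}

fn main() {
    // A single scratch area is allocated once and reused by every helper.
    let mut scratch = Scratch { add: AddScratch { lhs: 0, rhs: 0 } };
    println!("add:   {}", decode_add(&mut scratch, 0x0003_0004));
    println!("shift: {}", decode_shift(&mut scratch, 0x0000_0102));
}
```

The cost is the unsafe reads: each helper has to be sure it only reads the variant it has just written, which is easier to guarantee when the helpers are autogenerated.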

Thanks for the idea and the explanations, they make sense to me.

I've now tried to add #[inline(never)] before the implementation of the sign() function, but the stack pointer shows the same values as before when calling sign().
