Figure out stack config on uController

I'm working with an nRF52840 controller, which has 256k of RAM.

I'd like to know the memory configuration.

I assume that the program is executed in place from flash, so the code itself does not need any RAM.

So mostly I'm looking at

  • heap (the code avoids using heap)
  • stack
  • and probably some memory for globals (obviously they can't stay with the program in the NVM)

I'm looking for some sort of solution as here: I think it should be possible to find the stack configuration somewhere in the toolchain, linker script or similar.

I'm using Ferrous Systems' rust-exercises to build my application, the respective toolchain is thumbv7em-none-eabihf.

I've had the following ideas so far:

  • find config in toolchain / linker script: I expected some kind of linker script to be somewhere in .cargo but haven't found anything yet
  • find clues via verbose output of cargo -vv
  • configuration in Cargo.toml or .cargo/config.toml
  • some configuration in the rust-exercises repository

All I've found so far is a demo program in the aforementioned exercises repo which is supposed to cause a stack overflow (I could use a trial-and-error approach with different values to see when it crashes...)

I'm not quite sure what you mean. cortex-m is no_std, there's no heap, and the ram size for the stack and statics depends on the actual program, but it is restricted by the linker script. if the linker cannot fit all the segments into the capacity specified by the linker script, the build will fail.

the default linker script (from the nrf52840-hal crate) contains only one rom region and one ram region, which together cover all of the available memory. your application can override this to suit your specific use case if needed. see:

this script only defines the memory resources, actual placement for individual segments is determined by other means. for example, assuming you are not writing your own low level startup code and use cortex-m-rt, the linker script template can be found here:

as for stack overflow, the nrf52 chips have no MMU but only an MPU, and there's flip-link which utilizes the MPU and works pretty well in practice. see:

First of all, thanks for the response. I'll try to clarify some things and add some information that I gathered by now.

Yes, our code is no_std.
Why would there be no heap (not that I'm using it, but it could still be there)? The cortex-m linker scripts I've found (see below) at least mention heap.

actual placement for individual segments is determined by other means.

I've found by now the linker script which I assume is configuring the regions within the RAM:

.cargo/registry/src/index.crates.io-6f17d22bba15001f/cortex-m-rt-0.7.3/link.x.in

which also mentions heap:

/* Place the heap right after `.uninit` in RAM */
  PROVIDE(__sheap = __euninit);

and also reads:

Symbols that start with a single underscore (_) are considered "semi-public"; they can be overridden in a user linker script,

this remark probably refers to such "memory.x" scripts:

.cargo/registry/src/index.crates.io-6f17d22bba15001f/nrf52840-hal-0.16.1/memory.x
.cargo/registry/src/index.crates.io-6f17d22bba15001f/nrf52840-hal-0.16.0/memory.x

Those all just declare the physical Flash and RAM regions, which are then probably used by the link.x.in script.

MEMORY
{
  FLASH : ORIGIN = 0x00000000, LENGTH = 1024K
  RAM   : ORIGIN = 0x20000000, LENGTH = 256K
}
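
To make the numbers in such a MEMORY block concrete, here is a minimal sketch that turns one region line into an origin and a length in bytes. The helper names parse_size and parse_region are made up for this illustration and only handle the simple `NAME : ORIGIN = ..., LENGTH = ...` form shown above; the one fact it relies on is that GNU ld scales the K and M suffixes by 1024 and 1024 * 1024.

```rust
/// Parse a linker-script constant such as `256K`, `1M` or `0x20000000` into a number.
/// GNU ld scales the `K` and `M` suffixes by 1024 and 1024 * 1024.
fn parse_size(text: &str) -> Option<u32> {
    let text = text.trim();
    let (digits, factor) = match text.chars().last()? {
        'K' | 'k' => (&text[..text.len() - 1], 1024),
        'M' | 'm' => (&text[..text.len() - 1], 1024 * 1024),
        _ => (text, 1),
    };
    let value = match digits.strip_prefix("0x").or_else(|| digits.strip_prefix("0X")) {
        Some(hex) => u32::from_str_radix(hex, 16).ok()?,
        None => digits.parse::<u32>().ok()?,
    };
    value.checked_mul(factor)
}

/// Parse one region line (hypothetical helper, simple form only) into
/// (name, origin, length in bytes).
fn parse_region(line: &str) -> Option<(String, u32, u32)> {
    let (name, rest) = line.split_once(':')?;
    let (origin, length) = rest.split_once(',')?;
    let origin = parse_size(origin.split_once('=')?.1)?;
    let length = parse_size(length.split_once('=')?.1)?;
    Some((name.trim().to_string(), origin, length))
}
```

For the RAM line above this yields an origin of 0x20000000 and a length of 262144 bytes, so the region ends at 0x20040000.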

In the "HAL" repository there are two more of those "user" scripts:

rust-exercises.git/nrf52-code/boards/dk/memory.x
rust-exercises.git/nrf52-code/boards/dk-solution/memory.x

there's flip-link which utilizes the MPU and works pretty well in practice.

Yes, our repo uses flip-link (and also heapless, but I don't know too much about those).

My question came up because I'm doing some estimation on how much stack our program would need, and to see on which microcontrollers we can run it.

Now looking at the link.x.in script still doesn't tell me how the RAM is configured and how much stack I have. That may be a lack of capability on my side to read such a linker script.

I have played a bit with the stackoverflow application.

To see whether I can change the size of the stack, I changed the heap size setting in memory.x. According to the link.x.in script, the heap is the last configured section in RAM and thus adjacent to the stack, which would comprise the rest of the RAM from higher to lower addresses.

MEMORY
{
  FLASH : ORIGIN = 0x00000000, LENGTH = 1024K
  RAM   : ORIGIN = 0x20000000, LENGTH = 256K
}
_heap_size = 8192;

But stack_overflow.rs always crashed in the same round, so increasing the size of the heap doesn't seem to affect (decrease) the size of the stack.

I'm referring to the heap memory managed by alloc. "heap" really makes sense on a hosted platform, where the size of the heap can only be determined at runtime. for a bare metal environment, some library may reserve a certain amount of (statically configured) memory and provide "heap"-like APIs (e.g. alloc, free etc) and call it a "heap", that's not what I mean. for example, on Linux, there's no "heap" segment in an ELF file, it's requested from the kernel through the brk syscall at runtime, and I would not call static variables (in ELF terms, the data and bss segments) "heap".

ok, now I understand your question better. what you are trying to do is to measure the runtime limit of memory occupied by the stack, is that correct? in that case, the linker script really cannot help much.

the linker script just directs the linker where to place each code and data segment and calculates the final values for the symbolic addresses used in code. for the stack, all the linker can determine is the initial address of the stack pointer symbol (i.e. the _stack_start symbol in the example), and the low level initialization code reads this symbol and sets the CPU register to this value. because the stack size cannot be determined statically (well, sometimes it can, but not in general), it is meaningless to provide a limit in the linker script (even if you define something like _stack_end, there's no MMU to protect you when the stack grows beyond the limit)

what flip-link really does is just put the statically allocated data (data and bss) at the end of the ram region and put the stack below the static data, which is the opposite of the traditional linker allocation method. by doing this, the program will get a protection fault on stack overflow instead of silently corrupting static data.

and the heapless crate provides some data structures as alternatives to the standard containers (e.g. Vec) without heap allocation (in fact they use statically allocated memory and have a fixed capacity)
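
to illustrate the idea of a fixed-capacity container, here is a minimal sketch. it is not the heapless implementation, and the FixedVec name and its methods are made up for this example. the point is only that the storage is an inline array whose size is known at compile time, so the value occupies stack or static memory and never asks an allocator for more.

```rust
/// A tiny stand-in for a fixed-capacity vector (hypothetical type, not from heapless).
/// The storage is an inline array, so the value lives wherever it is declared:
/// on the stack for a local, in static memory for a `static`.
struct FixedVec<T: Copy + Default, const N: usize> {
    items: [T; N],
    len: usize,
}

impl<T: Copy + Default, const N: usize> FixedVec<T, N> {
    fn new() -> Self {
        Self { items: [T::default(); N], len: 0 }
    }

    /// Append a value, handing it back as `Err` when the capacity is exhausted.
    fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    /// View of the elements pushed so far.
    fn as_slice(&self) -> &[T] {
        &self.items[..self.len]
    }
}
```

a FixedVec<u8, 2> accepts two pushes and rejects the third, which is the trade-off such containers make: the memory cost is fixed and visible at compile time, and running out of capacity is an ordinary error instead of an allocation.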

the _heap_size is just a symbolic value (it's a constant value in the example, not even calculated based on linker input), and it's up to the program (or another linker script) to decide how to use the symbol. if you don't use that to allocate a static chunk of memory, it won't affect the calculated _stack_start address by itself.

No, of course data and bss have nothing to do with heap. But microcontrollers can provide something like a heap, with APIs at least similar to the brk syscall (I've worked with C++ and heap on bare metal (TriCore), using placement new).

Measuring would be an option, but I can do some good estimations as well and use configurations that use less data on the stack. But of course I'd still need to know how much stack is available in total.

So the linker script puts the following sections into RAM:

/* ## Sections in RAM */
  .data : ALIGN(4)
  /* ... */
  .gnu.sgstubs : ALIGN(32)
  /* ... */
  .bss (NOLOAD) : ALIGN(4)
  /* ... */
  .uninit (NOLOAD) : ALIGN(4)
  /* ... */
  PROVIDE(__sheap = __euninit);

If flip-link moves those regions so that the stack starts at a lower address, then the stack overflow would happen if the program tries using an address < 0x0. And I could verify that by printing some stack pointer addresses -- the stack should then start at 0x0 + 64k (given that the stack_overflow.rs shows 64k of stack available).

But still the linker script / flip-link would somehow calculate the start address for the stack by consuming the 256k - 64k for other segments in the RAM, wouldn't it?

Understood, thanks.

most of the time, they are just an illusion. most such libraries will reserve a chunk of static memory as a memory pool (presumably in the form of a global variable, the size can often be configured at compile time). so technically the memory would end up in the bss segment anyway.

but API wise, I can see why it is referred to as "heap".
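
a minimal sketch of such a pool, assuming a simple bump strategy (no free, offsets instead of pointers). BumpPool and its methods are made-up names for this illustration, not the API of any particular library. declared as a static, the buffer would be ordinary zero-initialized static data.

```rust
/// A fixed-size pool that hands out offsets into its own buffer
/// (hypothetical type for illustration). As a `static`, the zeroed
/// buffer would be placed with the other static data.
struct BumpPool<const N: usize> {
    buf: [u8; N],
    next: usize,
}

impl<const N: usize> BumpPool<N> {
    const fn new() -> Self {
        Self { buf: [0; N], next: 0 }
    }

    /// Reserve `size` bytes whose offset is a multiple of `align` (a power of two).
    /// Returns the offset of the reservation, or `None` when the pool is exhausted.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        assert!(align.is_power_of_two());
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > N {
            return None;
        }
        self.next = end;
        Some(start)
    }

    /// Access a previously reserved range.
    fn bytes_mut(&mut self, offset: usize, size: usize) -> &mut [u8] {
        &mut self.buf[offset..offset + size]
    }

    /// Bytes not yet handed out.
    fn remaining(&self) -> usize {
        N - self.next
    }
}
```

the capacity N is fixed at compile time, so the "heap" here costs exactly N bytes of static memory whether or not anything is ever allocated from it.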

if your RAM is mapped to address space starting at 0, that is.

but more commonly for a microcontroller, the address space starts with Flash, then unmapped/reserved/gap region, then (potentially multiple banks of) RAM, then more gap, then MMIO. using flip-link, when the stack overflows, it will access unmapped addresses, which should be caught by the MPU and a fault should be raised.

correct. the start address of stack is calculated based on the size of the static data segments.

the traditional method puts the static data at the lowest address of the RAM region, and puts the initial stack pointer at the highest (aligned) address of the RAM region (assuming the stack grows downwards).

while flip-link puts the static data at (end(RAM) - sizeof(data) - sizeof(bss)), and the start address of the stack just below that (plus all the alignment stuff).
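
the two layouts can be sketched as plain address arithmetic. this is a simplified model, not what the linker or flip-link actually computes: the Layout type, the function names, the 8-byte stack alignment and the 1k of static data used in the usage note are all assumptions made for illustration.

```rust
/// Addresses that matter for the stack in a single RAM region (illustrative model).
#[derive(Debug, PartialEq)]
struct Layout {
    /// Lowest address of the static data (data + bss).
    statics_start: u32,
    /// Initial stack pointer; the stack grows downwards from here.
    stack_start: u32,
    /// Lowest address the stack can reach before it hits something else.
    stack_limit: u32,
}

/// Assumed stack alignment in bytes.
const STACK_ALIGN: u32 = 8;

/// Statics at the bottom of RAM, stack starting at the top.
fn traditional(origin: u32, length: u32, statics: u32) -> Layout {
    Layout {
        statics_start: origin,
        stack_start: (origin + length) & !(STACK_ALIGN - 1),
        // Growing past this address overwrites static data.
        stack_limit: origin + statics,
    }
}

/// Statics at the top of RAM, stack starting just below them.
fn flipped(origin: u32, length: u32, statics: u32) -> Layout {
    let statics_start = (origin + length - statics) & !(STACK_ALIGN - 1);
    Layout {
        statics_start,
        stack_start: statics_start,
        // Growing past this address leaves the RAM region entirely.
        stack_limit: origin,
    }
}

/// Room available to the stack in either layout.
fn stack_bytes(layout: &Layout) -> u32 {
    layout.stack_start - layout.stack_limit
}
```

with origin 0x20000000, length 256k and 1k of static data, both layouts leave the same amount of stack. the difference is what sits directly below the stack: static data in the traditional layout, the address just under the RAM region in the flipped one.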

my bad - as stated above, the RAM should start at 0x20000000, so obviously I'd expect this address as end of the stack instead of 0x0 (FLASH would start there).

Small correction for my post regarding the stack_overflow.rs program:

The code reads

    // allocate and initialize one kilobyte of stack memory to provoke stack overflow
    let use_stack = [0xAA; 1024];

I used this to count the stack size, but assumed that the comment was correct (reserve 1024 * u8 = 1024 * 1 byte) and thus concluded that the stack size would be 64k. However, the code allocates 1024 4-byte integers (an unsuffixed integer literal defaults to i32), i.e. 4k bytes of stack, each round.
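
The element size can be checked on any host with size_of_val. This is a minimal sketch; the two function names are made up for the comparison, and only the first array expression is taken from the exercise.

```rust
/// Size in bytes of the array as written in the exercise. With no suffix and
/// no other type constraint, the literal `0xAA` defaults to `i32` (4 bytes).
fn default_literal_array_bytes() -> usize {
    let use_stack = [0xAA; 1024];
    std::mem::size_of_val(&use_stack)
}

/// Size in bytes when the element type is forced to `u8`,
/// which is what the "one kilobyte" comment describes.
fn byte_array_bytes() -> usize {
    let use_stack = [0xAAu8; 1024];
    std::mem::size_of_val(&use_stack)
}
```

The first function returns 4096 and the second 1024, so writing the literal as 0xAAu8 would make the code match its comment.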

Debugging the program and checking the SP (stack pointer) value shows that the first SP value (breakpoint at main) is 0x2003fbb0

Given the content of memory.x:

MEMORY
{
  FLASH : ORIGIN = 0x00000000, LENGTH = 1024K
  RAM   : ORIGIN = 0x20000000, LENGTH = 256K
}

that's 1104 bytes (0x450) below the end address of the RAM region:

0x20000000 + 256k = 0x20040000

The program only crashes when reaching a SP value below 0x20000000.

So the stack seems to comprise all or almost all of the 256k RAM.
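
The arithmetic can be written down as a small sketch. The constants are the values from memory.x and from the debugger session above; the function names are made up, and FRAME_BYTES counts only the 4k array, so the real frame of each round is somewhat larger and the round count is an upper bound.

```rust
/// RAM region from memory.x.
const RAM_ORIGIN: u32 = 0x2000_0000;
const RAM_LENGTH: u32 = 256 * 1024;
/// Stack pointer observed at the breakpoint in main.
const SP_AT_MAIN: u32 = 0x2003_fbb0;
/// Stack consumed per round by the array alone (1024 elements of 4 bytes).
const FRAME_BYTES: u32 = 4096;

/// First address past the RAM region.
fn ram_end() -> u32 {
    RAM_ORIGIN + RAM_LENGTH
}

/// Bytes between the observed stack pointer and the end of RAM.
fn bytes_above_sp() -> u32 {
    ram_end() - SP_AT_MAIN
}

/// Bytes the stack can still grow before leaving the RAM region.
fn stack_room() -> u32 {
    SP_AT_MAIN - RAM_ORIGIN
}

/// Upper bound on the number of complete rounds that fit below main's stack pointer.
fn full_rounds() -> u32 {
    stack_room() / FRAME_BYTES
}
```

This gives 1104 bytes above the stack pointer, 261040 bytes of room below it, and at most 63 complete rounds before the stack pointer drops below 0x20000000.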
