Rust compile for AARCH64 target creates unaligned stack access

Hi there,
I'm developing bare-metal code for a Raspberry Pi using Rust. It has been quite successful in Aarch32. However, I've started adjusting my code to also work with the Aarch64 compilation target.
As the Aarch64 target strictly requires the stack pointer to be 16-byte aligned, I encountered an issue where the compiled code does not comply with this rule.

The disassembly of the corresponding parts looks like this:
80170: f81e0ffc str x28, [sp, #-32]!
80174: a9017bf3 stp x19, x30, [sp, #16]
80178: d108c3ff sub sp, sp, #0x230
[..... some other code where the stack pointer is still aligned]
801e4: 910023e8 add x8, sp, #0x8
801e8: 94000114 bl 80638 <_ZN12ruspiro_uart5uart15Uart13new17hdc68614e1535f836E>
0000000000080638 <_ZN12ruspiro_uart5uart15Uart13new17hdc68614e1535f836E>:
80638: 6f00e400 movi v0.2d, #0x0
8063c: 3904011f strb wzr, [x8, #256]
80640: ad070100 stp q0, q0, [x8, #224] // <-- due to the add x8, sp, #8 the stack pointer is unaligned here !
Is there any way to force Rust to use proper stack alignment when compiling/optimizing the code?
Any hint would be much appreciated.

I'm using a nightly rust with target aarch64-unknown-linux-gnu and the gcc cross compile toolchain aarch64-elf from ARM hosted on a Windows machine.

That is odd. I have built Rust programs on a Raspberry Pi running a 64-bit Debian with no such issues. I was not cross-compiling, mind.

x8 is not a stack pointer, so it doesn't need to be aligned, does it?

Well, it's not a stack pointer as such; it's a general-purpose register. However, it gets its value from the stack pointer, so it points into the stack, and the subsequent stp q0, q0, [x8, #224] instruction throws an exception or just hangs my RPi.

So x8 is used for accessing data on the stack so I would assume the same rules apply.

When activating the MMU on the RPi and disabling the stack alignment checks (register sctlr_el1) the code works, but this should not be necessary as I would expect the compiler to care for the rules applying to the stack pointer in aarch64. Or is the compiler not that clever, knowing it took x8 as an alias for the stack pointer but not seeing the stack pointer has alignment requirements?

the stack alignment requirement only means that code can assume sp to be aligned to 16 to be able to place types that need this in the local frame at alignments <=16, not that any access to the stack needs to be that granularity (that would be wasteful for smaller data)

in this case it seems that the compiler incorrectly assumes that the address of a new instance of the ruspiro_uart::uart type doesn't need 16-alignment

then the constructor uses the stp instruction to store two full 128-bit registers, which does require 16-alignment [on this hardware, i don't think it's a general ARM thing]

this may still be a compiler bug, but an incorrect alignment requirement for the type would be just as much of a problem anywhere else; on the heap, or in the global data

edit: it looks like aarch64-unknown-none has feature +strict-align enabled, but aarch64-unknown-linux-gnu does not; i don't know if this is an oversight, or the explicit assumption that linux-capable aarch64 hardware will support unaligned accesses
it might be possible to enable through Cargo.toml somehow

thanks for your detailed response.
So you would propose adding an explicit alignment requirement to my structure using #[repr(align(16))]?

Nevertheless, I've checked but I'm not able to choose aarch64-unknown-none as a build target. But I managed to pass the +strict-align flag via RUSTFLAGS (RUSTFLAGS = -C target-cpu=cortex-a53 -C target-feature=+strict-align) to build my bare-metal kernel.
The generated binary no longer does the stp q0, q0, [...] thingy but now uses a call to memset to initialize the array in my structure. So the unaligned access issue is gone with this flag.
I guess I'll leave it at this and consider the issue solved for the time being. :slight_smile:
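
If setting the environment variable for every build gets tedious, the same flags can probably be kept in the project instead. A sketch, assuming a .cargo/config file in the project directory and the aarch64-unknown-linux-gnu target mentioned above (untested against the Windows cross-compile setup described here):

```toml
# .cargo/config (assumed location: inside the project directory)
# Applies the flags only when building for this target triple.
[target.aarch64-unknown-linux-gnu]
rustflags = [
    "-C", "target-cpu=cortex-a53",
    "-C", "target-feature=+strict-align",
]
```

Note that, as far as I know, a RUSTFLAGS environment variable takes precedence over rustflags from the config file, so it should be unset when relying on the file.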

only if it's a special requirement for the structure, that the compiler cannot determine by itself; but it doesn't seem that's the case here
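
for reference, a minimal sketch of what #[repr(align(16))] would change. PlainUart, AlignedUart and is_aligned_to are made-up names for illustration; the field layout is only a guess from the #224/#256 offsets in the disassembly, not the real ruspiro_uart type, and the example runs on a normal host rather than bare metal:

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for the UART struct (assumed layout, not the real one).
// With only u8/bool fields the compiler is free to place it at any address.
#[allow(dead_code)]
struct PlainUart {
    buffer: [u8; 256],
    initialized: bool,
}

// Same fields, but every instance must start on a 16-byte boundary,
// whether it lives on the stack, on the heap, or in global data.
#[allow(dead_code)]
#[repr(align(16))]
struct AlignedUart {
    buffer: [u8; 256],
    initialized: bool,
}

// Helper (hypothetical): true if `addr` is a multiple of `align`.
fn is_aligned_to(addr: usize, align: usize) -> bool {
    addr % align == 0
}

fn main() {
    println!(
        "PlainUart:   align = {}, size = {}",
        align_of::<PlainUart>(),
        size_of::<PlainUart>()
    );
    println!(
        "AlignedUart: align = {}, size = {}",
        align_of::<AlignedUart>(),
        size_of::<AlignedUart>()
    );

    let uart = AlignedUart { buffer: [0; 256], initialized: false };
    let addr = &uart as *const AlignedUart as usize;
    println!("AlignedUart instance 16-byte aligned: {}", is_aligned_to(addr, 16));
}
```

the attribute raises the alignment from 1 to 16 and pads the size up to a multiple of 16, so it costs a little memory per instance; that is why it is usually reserved for types that really need it.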

Nevertheless, I've checked but I'm not able to choose aarch64-unknown-none as a build target.

why not ? when building a bare-metal kernel, it would be the first choice

the *-linux-* targets assume that a linux kernel is running (which has already set up the MMU)

anyhow, great that you got it to work by passing that option manually !

Well, rustup target list does not show this option, nor does rustup target add aarch64-unknown-none work. Maybe it is not available as a cross-compilation target from a Windows host platform? However, this might be stuff for another thread :wink:

weird, i don't see it in rustup either; it only shows in rustc
(it does show -none- targets in rustup for RISC-V, ARM32, etc, no idea what this means)

$ rustc +stable --print target-list|grep aarch64-unknown-none
$ rustc +nightly --print target-list|grep aarch64-unknown-none

apparently, they use these for the rust-raspi3-OS-tutorials