Remote debugging an application on an embedded Linux system - illegal instruction

Hello

I'm trying to find a workflow for remotely debugging an application on my device (STM32MP1 based). The device runs an image created with Yocto (kirkstone).

Thanks to these threads:

https://users.rust-lang.org/t/linker-gcc-sysroot/28138/5
https://stackoverflow.com/questions/68888706/remote-debug-of-rust-program-in-visual-studio-code

...I was able to cross-compile my application using the toolchain built with Yocto. I can copy the binary to the device and it runs just fine. However, when I try debugging, VS Code shows only assembly code, and the application does not run at all when I hit continue. It just stops at what VS Code identifies as "Source location: /usr/src/debug/glibc/2.35-r0/git/sysdeps/arm/start.S:79"

; id = {0x00000aff}, range = [0x0000000000004200-0x0000000000004234), name="_start"
; Source location: /usr/src/debug/glibc/2.35-r0/git/sysdeps/arm/start.S:79
00404200: 4F F0 00 0B                mov.w  r11, #0x0
00404204: 4F F0 00 0E                mov.w  lr, #0x0
00404208: 02 BC                      pop    {r1}
0040420A: 6A 46                      mov    r2, sp
0040420C: 04 B4                      push   {r2}
...

I use gdbserver on the device:

gdbserver *:17777 helloworld

In VS Code I use lldb with the following launch configuration:

{
    "type": "lldb",
    "request": "custom",
    "name": "Remote debug executable 'helloworld'",
    "targetCreateCommands": ["target create ${workspaceFolder}/target/armv7-unknown-linux-gnueabihf/debug/helloworld"],
    "processCreateCommands": ["gdb-remote 192.168.0.192:17777"]
}

Further, I noticed that the application can't even be debugged locally on the device itself. I can start and run it using gdb:

root@stm32mp1:~# gdb helloworld 
GNU gdb (GDB) 11.2
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-poky-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from helloworld...
warning: Unsupported auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/root/helloworld.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) r 
Starting program: /home/root/helloworld 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Hello,
world!
123
[Inferior 1 (process 366) exited normally]

It runs as expected. However, as soon as I set a breakpoint and execution hits it, gdb is no longer able to continue after that. I get an 'illegal instruction' message:

(gdb) b 4
Breakpoint 1 at 0x404604: file src/main.rs, line 4.
(gdb) r
Starting program: /home/root/helloworld 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Hello,

Breakpoint 1, helloworld::main () at src/main.rs:4
4	src/main.rs: No such file or directory.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x00404d5e in core::option::Option::unwrap<std::sync::once_lock::{impl#0}::initialize::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::sync::once_lock::{impl#0}::get_or_init::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::io::stdio::stdout::{closure_env#0}>, !>> () at library/core/src/option.rs:775
775	library/core/src/option.rs: No such file or directory.

Any idea what causes this and how I can fix it? Missing libraries? Issue with the cross compilation (but then, why does it run without debugging)?

What is the full backtrace when the SIGILL happens?

Here's the backtrace:

Program received signal SIGILL, Illegal instruction.
0x00404d5e in core::option::Option::unwrap<std::sync::once_lock::{impl#0}::initialize::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::sync::once_lock::{impl#0}::get_or_init::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::io::stdio::stdout::{closure_env#0}>, !>> () at library/core/src/option.rs:775
775	library/core/src/option.rs: No such file or directory.
(gdb) bt
#0  0x00404d5e in core::option::Option::unwrap<std::sync::once_lock::{impl#0}::initialize::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::sync::once_lock::{impl#0}::get_or_init::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::io::stdio::stdout::{closure_env#0}>, !>>
    () at library/core/src/option.rs:775
#1  std::sync::once::{impl#4}::call_once_force::{closure#0}<std::sync::once_lock::{impl#0}::initialize::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::sync::once_lock::{impl#0}::get_or_init::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::io::stdio::stdout::{closure_env#0}>, !>> () at library/std/src/sync/once.rs:334
#2  core::ops::function::FnOnce::call_once<std::sync::once::{impl#4}::call_once_force::{closure_env#0}<std::sync::once_lock::{impl#0}::initialize::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::sync::once_lock::{impl#0}::get_or_init::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::io::stdio::stdout::{closure_env#0}>, !>>, (&std::sync::once::OnceState)> () at library/core/src/ops/function.rs:248
#3  core::ops::function::FnOnce::call_once<std::sync::once::{impl#4}::call_once_force::{closure_env#0}<std::sync::once_lock::{impl#0}::initialize::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::sync::once_lock::{impl#0}::get_or_init::{closure_env#0}<std::sys_common::remutex::ReentrantMutex<core::cell::RefCell<std::io::buffered::linewriter::LineWriter<std::io::stdio::StdoutRaw>>>, std::io::stdio::stdout::{closure_env#0}>, !>>, (&std::sync::once::OnceState)> () at library/core/src/ops/function.rs:248
#4  0x0000007a in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

And this is the test application by the way:

fn main() {
    let some_value = 123;
    println!("Hello,");
    println!("world!");
    println!("{}", some_value);
}

Take this with a grain of salt, but the error and backtrace resemble what happens when building the executable for the wrong target. It looks like the illegal instruction is within a synchronization primitive, which could be something like an atomic instruction that doesn't exist on the CPU.

The device is STM32MP1 based, but how was the executable built? And I suspect it's only running on the A7 core, not the M4, right?
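One way to start answering the "how was it built" question is to look at the ELF header of the binary on the build host. The following is only a minimal sketch: the names ElfSummary and parse_elf32_le are made up for illustration and do not come from any crate, and it assumes a 32-bit little-endian ELF file. It reports the machine type, whether the entry point is marked as Thumb code (bit 0 of e_entry), the EABI version, and whether the hard-float flag is set. It cannot tell ARMv6 from ARMv7; that information lives in the .ARM.attributes section, which readelf -A prints.

```rust
use std::{env, fs};

const EM_ARM: u16 = 40; // e_machine value for 32-bit ARM
const EF_ARM_ABI_FLOAT_HARD: u32 = 0x400; // e_flags bit for the hard-float ABI (EABI v5)

/// The few ELF32 header fields that hint at the build target (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct ElfSummary {
    machine: u16, // e_machine, offset 18
    entry: u32,   // e_entry, offset 24
    flags: u32,   // e_flags, offset 36
}

impl ElfSummary {
    /// On ARM, bit 0 of the entry address marks a Thumb entry point.
    fn entry_is_thumb(&self) -> bool {
        self.entry & 1 == 1
    }
    fn is_hard_float(&self) -> bool {
        self.flags & EF_ARM_ABI_FLOAT_HARD != 0
    }
    /// The top byte of e_flags holds the EABI version.
    fn eabi_version(&self) -> u32 {
        self.flags >> 24
    }
}

/// Parses the header of a 32-bit little-endian ELF file; returns None for anything else.
fn parse_elf32_le(bytes: &[u8]) -> Option<ElfSummary> {
    // An ELF32 header is 52 bytes long and starts with the magic 0x7f 'E' 'L' 'F'.
    if bytes.len() < 52 || &bytes[0..4] != b"\x7fELF" {
        return None;
    }
    // EI_CLASS == 1 means 32-bit, EI_DATA == 1 means little endian.
    if bytes[4] != 1 || bytes[5] != 1 {
        return None;
    }
    let u16_at = |o: usize| u16::from_le_bytes([bytes[o], bytes[o + 1]]);
    let u32_at =
        |o: usize| u32::from_le_bytes([bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]]);
    Some(ElfSummary {
        machine: u16_at(18),
        entry: u32_at(24),
        flags: u32_at(36),
    })
}

fn main() {
    let path = match env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: pass the path to the binary as the first argument");
            return;
        }
    };
    match fs::read(&path).ok().and_then(|b| parse_elf32_le(&b)) {
        Some(s) => {
            println!("machine is ARM:    {}", s.machine == EM_ARM);
            println!("entry point:       {:#010x}", s.entry);
            println!("entry is Thumb:    {}", s.entry_is_thumb());
            println!("EABI version:      {}", s.eabi_version());
            println!("hard-float flag:   {}", s.is_hard_float());
        }
        None => eprintln!("{}: not a readable 32-bit little-endian ELF file", path),
    }
}
```

Running it against target/<triple>/debug/helloworld for each of the two triples and comparing the output would at least show whether the two builds differ in these header fields.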

Yes, it runs on the A7 core. And I agree, it really looks like a wrong-target issue. But then I cannot understand why my executable runs fine when it is not being debugged.

Unfortunately, the SDK created with Yocto does not include the Rust compiler. Hence, I use Rust's precompiled toolchain (armv7-unknown-linux-gnueabihf) in combination with the linker of my SDK:

source /opt/poky/4.0.5/environment-setup-cortexa7t2hf-neon-vfpv4-poky-linux-gnueabi 

export RUSTFLAGS='-Clink-arg=-Wl,-soname=lib${NAME}.${VERSION} -Clink-arg=--sysroot=/opt/poky/4.0.5/sysroots/cortexa7t2hf-neon-vfpv4-poky-linux-gnueabi' 

cargo build --target=arm-unknown-linux-gnueabihf --config target.arm-unknown-linux-gnueabihf.linker=\"arm-poky-linux-gnueabi-gcc\" 

Am I missing something here?

Apparently, the newest Yocto release (langdale) makes it possible to include the Rust compiler in the SDK. Long-term, I assume this is the way to go. But so far I have had no luck building the SDK (honestly, I stopped reading the error message after 'python2'; it frustrated me enough not to look into it any further for now).

--target=arm-unknown-linux-gnueabihf is for ARMv6, Cortex A7 is ARMv7. I don't know whether ARMv7 is backward compatible with ARMv6.

According to the rustc platform support page, the armv7-unknown-linux-gnueabihf target triple might be a better fit?

I don't know, I still feel like I'm grasping at straws here. This could all just be coincidence.

Thanks for taking the time. I don't know about compatibility between the two either. But I actually have tried them both and both executables show the exact same behavior regarding this issue.