Just for fun I had the urge to learn me some assembler on my M1 MacBook, as inline asm in Rust of course, when I discovered this curious phenomenon:
Step 1: I write some inline asm that has a simple one-character mistake:
#![feature(naked_functions)]
#![no_main]
use core::arch::naked_asm;
#[naked]
extern "C" fn print_hex_64(n: u64) {
unsafe {
naked_asm!(
// Make space for output string on stack. Stack pointer must have 16-byte alignment.
// See: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/using-the-stack-in-aarch64-implementing-push-and-pop
"sub sp, sp, #(16) ",
// Number to print passed as parameter in X0
//
// Set X1 to the end of the destination on stack
"mov x1, sp",
"add x1, x1, #15",
// The loop is FOR x5 = 16 TO 1 STEP -1
"mov x5, #16", // 16 digits to print
"42: and x6, x0, #0xf", // mask off the least significant digit
// If x6 >= 10 then goto letter
"CMP x6, #10", // is 0-9 or A-F
"b.ge 1f",
// Else it's a number so convert to an ASCII digit
"add x6, x6, #'0'",
"b 2f", // go to end if
"1:", // handle the digits A to F
"add x6, x6, #('A'-10)",
"2:", // end if
"strb x6, [x1]", // store ascii digit
"sub x1, x1, #1", // decrement address for next digit
"lsr x0, x0, #4", // shift off the digit we just processed
// next x5
"subs x5, x5, #1", // step x5 by -1
"b.ne 42b", // another for loop if not done
//
// Set up the parameters to print our hex number
// and then call macOS to do it.
"mov x0, #1", // 1 = StdOut
"mov x1, sp", // Start of string
"mov x2, #16", // length of our string
"mov x16, #4", // macOS write system call number goes in x16
"svc #0x80", // Call the kernel to output the string
//
// Restore stack and return
"add sp, sp, #(16)",
"ret",
);
}
}
#[unsafe(no_mangle)]
pub extern "C" fn main(_argc: isize, _argv: *const *const u8) -> isize {
let n = 0x0123456789abcdef;
print_hex_64(n);
0
}
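For comparison, here is a minimal sketch of the same digit loop in plain Rust. It is not part of my project: `hex_digits` is a hypothetical helper name, and the sketch only shows what the 16 bytes on the stack should contain just before the write call, assuming the asm loop does what its comments say.

```rust
/// Hypothetical helper (not in my project): the asm digit loop in plain Rust.
/// Returns the 16 ASCII bytes the asm is meant to leave on the stack.
fn hex_digits(mut n: u64) -> [u8; 16] {
    let mut buf = [0u8; 16];
    // Fill from the last byte to the first, like x1 counting down from sp+15.
    for slot in buf.iter_mut().rev() {
        let digit = (n & 0xf) as u8; // and x6, x0, #0xf
        *slot = if digit >= 10 {
            digit + (b'A' - 10) // add x6, x6, #('A'-10)
        } else {
            digit + b'0' // add x6, x6, #'0'
        };
        n >>= 4; // lsr x0, x0, #4
    }
    buf
}
```

Checking the bytes from `hex_digits(n)` against what the asm prints is a quick way to tell a logic error in the loop apart from a stale or broken build.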
This fails to build with:
cargo run
Compiling rust_asm v0.1.0 (/Users/me/rust_asm)
error: invalid operand for instruction
|
note: instantiated into assembly here
--> <inline asm>:16:6
|
16 | strb x6, [X1]
Step 2: I edit my code changing "strb x6, [x1]" to "strb w6, [x1]".
This compiles and runs but produces garbage output:
$ cargo run
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/rust_asm`
`{�k%
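A note on the Step 2 fix, as far as I understand it: strb stores the low byte of a 32-bit W register, which is why the assembler rejects x6 as the data operand. The sketch below shows the same store with ordinary (non-naked) inline asm, where the `:w` template modifier asks for the W name of the register. `store_low_byte` is a hypothetical name, and the non-aarch64 fallback is only there so the snippet builds on other machines.

```rust
/// Hypothetical helper: store the low byte of `value` at `dst`.
#[cfg(target_arch = "aarch64")]
fn store_low_byte(value: u64, dst: &mut u8) {
    let ptr: *mut u8 = dst;
    unsafe {
        core::arch::asm!(
            // `{v:w}` prints the register as w<N> instead of x<N>,
            // which is the form strb accepts for its data operand.
            "strb {v:w}, [{p}]",
            v = in(reg) value,
            p = in(reg) ptr,
            options(nostack, preserves_flags),
        );
    }
}

/// Fallback for other targets: a plain truncating store.
#[cfg(not(target_arch = "aarch64"))]
fn store_low_byte(value: u64, dst: &mut u8) {
    *dst = value as u8;
}
```

Inside naked_asm! there are no register operands to format, so there the register name has to be written by hand as w6.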
Step 3: I run cargo clean and cargo run:
$ cargo clean
$ cargo run
Compiling rust_asm v0.1.0 (/Users/me/rust_asm)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
Running `target/debug/rust_asm`
0123456789ABCDE
Which is the output I want.
This had me scratching my head for a long while. What is going on?
While we are here, does anyone know how to build this as no_std? Since I'm using syscalls to get everything done, I wondered how we could jettison all the unused library junk for a tiny executable.