__aeabi_memclr compiling to suicide loop


#1

The following Rust function:

#[no_mangle]
pub unsafe extern "C" fn __aeabi_memclr(dest: *mut u8, size: usize) {
    let mut i = 0;
    while i < size {
        *dest.offset(i as isize) = 0u8;
        i += 1;
    }
}

Compiles to the following (cargo build --release):

08000980 <__aeabi_memclr>:
8000980:    b580          push    {r7, lr}
8000982:    2900          cmp    r1, #0
8000984:    d001          beq.n    800098a <__aeabi_memclr+0xa>
8000986:    f7ff fffb     bl    8000980 <__aeabi_memclr>
800098a:    bd80          pop    {r7, pc}

This is a suicide loop: for any non-zero size the function calls itself unconditionally, recursing until the stack overflows and the processor signals a fault.

This is in an embedded environment (STM32F0, ARM Cortex-M0). __aeabi_memclr and __aeabi_memset seem to be required. Originally I was using the following implementation:

#[no_mangle]
pub extern fn __aeabi_memclr(dest: *mut u8, size: usize) {
    unsafe {
        use core::intrinsics::volatile_set_memory;
        volatile_set_memory(dest, 0, size);
    }
}

But that was also generating a suicide loop. Debug builds generate verbose suicide loops.

cargo 0.9.0-nightly (6ffd134 2016-01-26)
rustc 1.8.0-nightly (4b615854f 2016-01-26)
arm-none-eabi-gcc 5.2.1 20151202 (release) [ARM/embedded-5-branch revision 231848]

Thank you.


#2

Is the former implementation really generating a suicide loop on debug builds?

The basic problem appears to be that LLVM implements a memset() to zero by calling __aeabi_memclr(). LLVM also recognizes that the while loop is exactly such a memset() and replaces it with a call to __aeabi_memclr(). Since that is the function being compiled, the result is the suicide loop.

As you’re using nightly anyway, inline asm might be the easiest workaround.
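
If inline asm is not an option, another workaround that stays in plain Rust might be to zero the buffer with per-byte volatile stores. Below is a minimal sketch, assuming core::ptr::write_volatile is available in your toolchain (it is stable since Rust 1.9, so newer than the nightly used in this thread). The name memclr_volatile is made up for illustration. As far as I know, LLVM's loop-idiom recognition leaves volatile stores alone, so the loop should not be turned back into a memclr call, but check the disassembly to be sure. Note that this is different from volatile_set_memory, which is a single volatile memset and so still ends up as a library call.

```rust
use core::ptr;

/// Zeroes `size` bytes starting at `dest`, one volatile store per byte.
///
/// `memclr_volatile` is a hypothetical name for this sketch; on the target
/// the same body would go inside the exported `__aeabi_memclr`.
///
/// # Safety
/// `dest` must be valid for writes of `size` bytes.
pub unsafe fn memclr_volatile(dest: *mut u8, size: usize) {
    let mut i = 0;
    while i < size {
        // A volatile store has to be emitted as written, so the optimizer
        // is not expected to fold this loop into one memset/memclr call.
        unsafe { ptr::write_volatile(dest.add(i), 0u8) };
        i += 1;
    }
}
```

The trade-off is that every byte is written individually, so this is likely slower than a word-at-a-time routine.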


#3

> Is the former implementation really generating a suicide loop on debug builds?

Checking again I see that the code using intrinsics::volatile_set_memory turns into a suicide loop for both debug and release builds. The code that uses a raw loop instead of volatile_set_memory turns into a suicide loop in release builds, but it compiles correctly in debug. At least, that’s what it looks like on a cursory glance over the assembly. Dumps are below, using the same toolchain versions.

> As you’re using nightly anyway, inline asm might be the easiest workaround.

Okay, I will give that a try, thank you. I’m trying to avoid as many non-stable things as I can, in the hopes of moving off nightly as soon as possible, but such is the life of embedded Rust.

Besides a workaround, I’m wondering whether this is a legitimate compiler bug that I should open a GitHub issue for. Or am I just doing something wrong? The role of the __aeabi_* functions in embedded Rust is sparsely documented, so it’s hard to tell what is a compiler bug and what is a usage error.

Dumps

Build:

cargo clean
cargo build --target=thumbv6m-none-eabi
arm-none-eabi-objdump -D target/thumbv6m-none-eabi/debug/blink

Code:

#[no_mangle]
pub extern fn __aeabi_memclr(dest: *mut u8, size: usize) {
    unsafe {
        use core::intrinsics::volatile_set_memory;
        volatile_set_memory(dest, 0, size);
    }
}

Disassembly:

08002e6c <__aeabi_memclr>:
 8002e6c:    b580          push    {r7, lr}
 8002e6e:    af00          add    r7, sp, #0
 8002e70:    b084          sub    sp, #16
 8002e72:    460a          mov    r2, r1
 8002e74:    4603          mov    r3, r0
 8002e76:    9003          str    r0, [sp, #12]
 8002e78:    9102          str    r1, [sp, #8]
 8002e7a:    9201          str    r2, [sp, #4]
 8002e7c:    9300          str    r3, [sp, #0]
 8002e7e:    f000 f802     bl    8002e86 <_ZN10lang_items14__aeabi_memclr10__rust_abiE>
 8002e82:    b004          add    sp, #16
 8002e84:    bd80          pop    {r7, pc}

08002e86 <_ZN10lang_items14__aeabi_memclr10__rust_abiE>:
 8002e86:    b580          push    {r7, lr}
 8002e88:    af00          add    r7, sp, #0
 8002e8a:    b084          sub    sp, #16
 8002e8c:    460a          mov    r2, r1
 8002e8e:    4603          mov    r3, r0
 8002e90:    9003          str    r0, [sp, #12]
 8002e92:    9102          str    r1, [sp, #8]
 8002e94:    9803          ldr    r0, [sp, #12]
 8002e96:    9201          str    r2, [sp, #4]
 8002e98:    9300          str    r3, [sp, #0]
 8002e9a:    f7ff ffe7     bl    8002e6c <__aeabi_memclr>
 8002e9e:    b004          add    sp, #16
 8002ea0:    bd80          pop    {r7, pc}
 8002ea2:    ffff b5d0     vsli.64    <illegal reg q13.5>, q0, #63    ; 0x3f

Code:

#[no_mangle]
pub unsafe extern "C" fn __aeabi_memclr(dest: *mut u8, size: usize) {
    let mut i = 0;
    while i < size {
        *dest.offset(i as isize) = 0u8;
        i += 1;
    }
}

Disassembly:

08002e6c <__aeabi_memclr>:
 8002e6c:    b580          push    {r7, lr}
 8002e6e:    af00          add    r7, sp, #0
 8002e70:    b084          sub    sp, #16
 8002e72:    460a          mov    r2, r1
 8002e74:    4603          mov    r3, r0
 8002e76:    9003          str    r0, [sp, #12]
 8002e78:    9102          str    r1, [sp, #8]
 8002e7a:    9201          str    r2, [sp, #4]
 8002e7c:    9300          str    r3, [sp, #0]
 8002e7e:    f000 f803     bl    8002e88 <_ZN10lang_items14__aeabi_memclr10__rust_abiE>
 8002e82:    b004          add    sp, #16
 8002e84:    bd80          pop    {r7, pc}
 8002e86:    ffff b580     vabal.u<illegal width 64>    <illegal reg q13.5>, d31, d0

08002e88 <_ZN10lang_items14__aeabi_memclr10__rust_abiE>:
 8002e88:    b580          push    {r7, lr}
 8002e8a:    af00          add    r7, sp, #0
 8002e8c:    b08a          sub    sp, #40    ; 0x28
 8002e8e:    460a          mov    r2, r1
 8002e90:    4603          mov    r3, r0
 8002e92:    9009          str    r0, [sp, #36]    ; 0x24
 8002e94:    9108          str    r1, [sp, #32]
 8002e96:    2000          movs    r0, #0
 8002e98:    9007          str    r0, [sp, #28]
 8002e9a:    9205          str    r2, [sp, #20]
 8002e9c:    9304          str    r3, [sp, #16]
 8002e9e:    e001          b.n    8002ea4 <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x1c>
 8002ea0:    b00a          add    sp, #40    ; 0x28
 8002ea2:    bd80          pop    {r7, pc}
 8002ea4:    9807          ldr    r0, [sp, #28]
 8002ea6:    9908          ldr    r1, [sp, #32]
 8002ea8:    4288          cmp    r0, r1
 8002eaa:    d2f9          bcs.n    8002ea0 <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x18>
 8002eac:    e7ff          b.n    8002eae <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x26>
 8002eae:    9809          ldr    r0, [sp, #36]    ; 0x24
 8002eb0:    9907          ldr    r1, [sp, #28]
 8002eb2:    f000 f81b     bl    8002eec <_ZN3ptr14_$BP$mut$u20$T6offset21h17255864961057226909E>
 8002eb6:    9006          str    r0, [sp, #24]
 8002eb8:    2100          movs    r1, #0
 8002eba:    7001          strb    r1, [r0, #0]
 8002ebc:    9807          ldr    r0, [sp, #28]
 8002ebe:    1c42          adds    r2, r0, #1
 8002ec0:    2301          movs    r3, #1
 8002ec2:    4282          cmp    r2, r0
 8002ec4:    9203          str    r2, [sp, #12]
 8002ec6:    9302          str    r3, [sp, #8]
 8002ec8:    9101          str    r1, [sp, #4]
 8002eca:    d201          bcs.n    8002ed0 <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x48>
 8002ecc:    9802          ldr    r0, [sp, #8]
 8002ece:    9001          str    r0, [sp, #4]
 8002ed0:    9801          ldr    r0, [sp, #4]
 8002ed2:    43c0          mvns    r0, r0
 8002ed4:    9902          ldr    r1, [sp, #8]
 8002ed6:    4208          tst    r0, r1
 8002ed8:    d003          beq.n    8002ee2 <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x5a>
 8002eda:    e7ff          b.n    8002edc <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x54>
 8002edc:    9803          ldr    r0, [sp, #12]
 8002ede:    9007          str    r0, [sp, #28]
 8002ee0:    e7e0          b.n    8002ea4 <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x1c>
 8002ee2:    4801          ldr    r0, [pc, #4]    ; (8002ee8 <_ZN10lang_items14__aeabi_memclr10__rust_abiE+0x60>)
 8002ee4:    f001 f92a     bl    800413c <_ZN9panicking5panic20h1085ed7747e4bc1bUWLE>
 8002ee8:    08006b58     stmdaeq    r0, {r3, r4, r6, r8, r9, fp, sp, lr}

08002eec <_ZN3ptr14_$BP$mut$u20$T6offset21h17255864961057226909E>:
 8002eec:    b085          sub    sp, #20
 8002eee:    460a          mov    r2, r1
 8002ef0:    4603          mov    r3, r0
 8002ef2:    9004          str    r0, [sp, #16]
 8002ef4:    9103          str    r1, [sp, #12]
 8002ef6:    9804          ldr    r0, [sp, #16]
 8002ef8:    1840          adds    r0, r0, r1
 8002efa:    9002          str    r0, [sp, #8]
 8002efc:    9201          str    r2, [sp, #4]
 8002efe:    9300          str    r3, [sp, #0]
 8002f00:    b005          add    sp, #20
 8002f02:    4770          bx    lr

#4

Looking at rlibc, which provides similar functions, I think you need to add #![no_builtins] to your crate’s attributes.


#5

Thank you, that fixed it. With #![no_builtins] on my library crate that contains __aeabi_memclr and this implementation:

#[no_mangle]
pub unsafe extern "C" fn __aeabi_memclr(dest: *mut u8, size: usize) {
    let mut i = 0;
    while i < size {
        *dest.offset(i as isize) = 0u8;
        i += 1;
    }
}

I get this working disassembly:

08000980 <__aeabi_memclr>:
 8000980:    e003          b.n    800098a <__aeabi_memclr+0xa>
 8000982:    2200          movs    r2, #0
 8000984:    7002          strb    r2, [r0, #0]
 8000986:    1c40          adds    r0, r0, #1
 8000988:    1e49          subs    r1, r1, #1
 800098a:    2900          cmp    r1, #0
 800098c:    d1f9          bne.n    8000982 <__aeabi_memclr+0x2>
 800098e:    4770          bx    lr

I wonder what the “correct” solution is though. I guess either the __aeabi_* functions are intended to be implemented strictly in assembly, or I need to get compiler-rt working on the platform.


#6

#![no_builtins] is the correct solution.
It tells the compiler that, within this crate, it can’t assume the builtins exist, i.e. that they are not provided by the platform you are building your code on. That is exactly the right assumption when you are the one implementing them.
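
For reference, here is a rough sketch of how the crate that provides these functions might look, including the __aeabi_memset mentioned at the top of the thread. The assumptions are mine, not from the thread: the crate-level attributes and #[no_mangle] are shown as comments so the snippet also builds as an ordinary host program (on the target they would be real attributes), and the byte-at-a-time loops are just the simplest thing that works. The AEABI memset takes its arguments in the order (dest, size, value), unlike C's memset.

```rust
// On the target, these two lines go at the very top of the crate root:
// #![no_std]
// #![no_builtins]

/// AEABI memset. Note the argument order (dest, size, value), which differs
/// from C's memset(dest, value, size).
// On the target, add #[no_mangle] so the symbol keeps this exact name.
pub unsafe extern "C" fn __aeabi_memset(dest: *mut u8, size: usize, value: i32) {
    let mut i = 0;
    while i < size {
        // Only the low byte of `value` is stored, as with C's memset.
        unsafe { *dest.add(i) = value as u8 };
        i += 1;
    }
}

/// AEABI memclr, expressed through the memset above.
// On the target, add #[no_mangle] here as well.
pub unsafe extern "C" fn __aeabi_memclr(dest: *mut u8, size: usize) {
    unsafe { __aeabi_memset(dest, size, 0) };
}
```

Keeping these functions in their own small crate limits the effect of #![no_builtins] to the code that actually implements the builtins.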