Reliably working around Rust emitting `memset` when putting a slice on the stack

So I have this problem: I am writing a prototype no-libc, no_std stdlib for Rust on Linux (MarcusGrass/tiny-std on GitHub, a tiny Rust std-lib for Linux x86_64 and aarch64), and I'm now trying to get dynamic symbol relocation to work for static-pie linked executables.

In a very machine-dependent way I've gotten it to work: essentially, look up and remap the symbols at start. The hitch is that this has to be done without relying on any known symbols.

To make this work properly I need to read some dynamic values off the dyn-vector that Linux maps into the application's memory. Up to 37 of these values can be provided, in a key -> value structure where the key is an index.

This is analogous to the aux-values, which mercifully number only 32. It's the 37 that causes the problem.

When a zeroed array with more than 32 elements is put on the stack, Rust uses memset, e.g.:

fn main() {
    let mut my_slice = [0usize; 37];
    let my_ptr = my_slice.as_mut_ptr();
    println!("Initialized {:?}", my_ptr);
}

This produces the asm:

.section .text.test_project::main,"ax",@progbits
        .p2align        4, 0x90
        .type   test_project::main,@function
test_project::main:

        .cfi_startproc
        push rbx
        .cfi_def_cfa_offset 16
        sub rsp, 368
        .cfi_def_cfa_offset 384
        .cfi_offset rbx, -16

        lea rbx, [rsp + 72]

        mov edx, 296
        mov rdi, rbx
        xor esi, esi
        call qword ptr [rip + memset@GOTPCREL]
       ...

The call through memset@GOTPCREL segfaults, since the addresses aren't properly mapped yet.

Funnily enough, this boils down to the same asm:

use std::mem::MaybeUninit;

fn main() {
    let mut my_ptr: MaybeUninit<[usize; 37]> = MaybeUninit::uninit();
    unsafe {
        // Zero the array one element at a time instead of using an array literal.
        for i in 0..37 {
            my_ptr.as_mut_ptr().as_mut().unwrap_unchecked().as_mut_ptr().add(i).write(0);
        }
        //my_ptr.as_mut_ptr().write([0usize; 37]);
        let rf = my_ptr.assume_init_mut();
        println!("Initialized {:?}", rf);
    }
}

But writing, say, black_box(0) instead of 0, or i instead of 0, is fine. However, I'd rather not try to trick the compiler. Some way to get it to not use any symbols, without having to write the entire symbol-remapping procedure in asm, would be amazing. Any tips?
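
A variant of the same idea that feels less like a trick might be volatile writes, since as far as I know LLVM only rewrites plain stores into a memset call and leaves volatile ones alone. A minimal sketch, assuming an optimized build (in debug builds the pointer helpers are not inlined); zero_volatile is a made-up helper name, not something from tiny-std:

```rust
use core::mem::MaybeUninit;

/// Zeroes `buf` element by element with volatile stores and hands back a
/// reference to the now-initialized array, so the array is never copied.
/// (`zero_volatile` is a hypothetical helper, used only for illustration.)
pub fn zero_volatile<const N: usize>(buf: &mut MaybeUninit<[usize; N]>) -> &mut [usize; N] {
    let base = buf.as_mut_ptr().cast::<usize>();
    for i in 0..N {
        // SAFETY: `base` points at N contiguous, aligned, writable usize
        // slots and `i < N`, so the write stays in bounds.
        unsafe { base.add(i).write_volatile(0) };
    }
    // SAFETY: every element was written in the loop above.
    unsafe { buf.assume_init_mut() }
}
```

The caller keeps the MaybeUninit<[usize; 37]> on its own stack and only works through the returned reference. Whether the generated code really contains no call is still worth checking in the asm for each target.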

I would probably recommend that you write this part in assembly. LLVM is clearly making incorrect assumptions, and I don't think you can tell it that memset isn't available yet.

Alright, I think I'll try to condense it down to fewer than 32 values in a struct for the time being, to get under the memset limit. Allocating a slice on the stack through asm turns out to be extremely difficult, since the offset depends on what Rust decides to put on the stack above it, etc.
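
Roughly what I have in mind, as a sketch rather than the final code: walk the dynamic entries until the terminating null tag and keep only the handful of values needed. The struct and function names here are made up, the choice of the three RELA tags is an assumption about what relocation ends up needing, and the tag is stored as usize for simplicity (the tag numbers themselves are from the ELF spec):

```rust
/// One dynamic-section entry, laid out as a tag followed by a value.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct DynEntry {
    pub tag: usize,
    pub val: usize,
}

// Tag numbers as defined by the ELF specification.
pub const DT_NULL: usize = 0;
pub const DT_RELA: usize = 7;
pub const DT_RELASZ: usize = 8;
pub const DT_RELAENT: usize = 9;

/// Only the values wanted for relocation (hypothetical selection).
pub struct RelaInfo {
    pub rela: usize,
    pub relasz: usize,
    pub relaent: usize,
}

/// Scans entries starting at `entry` until DT_NULL and records the wanted ones.
///
/// # Safety
/// `entry` must point at a readable sequence of entries terminated by DT_NULL.
pub unsafe fn collect_rela_info(mut entry: *const DynEntry) -> RelaInfo {
    let mut info = RelaInfo { rela: 0, relasz: 0, relaent: 0 };
    loop {
        // SAFETY: the caller guarantees the sequence is readable up to DT_NULL.
        let e = unsafe { entry.read() };
        match e.tag {
            DT_NULL => break,
            DT_RELA => info.rela = e.val,
            DT_RELASZ => info.relasz = e.val,
            DT_RELAENT => info.relaent = e.val,
            _ => {} // not needed here, skip
        }
        // SAFETY: DT_NULL has not been reached, so the next entry exists.
        entry = unsafe { entry.add(1) };
    }
    info
}
```

A three-field struct should be small enough to be zeroed with plain stores, but that is something I'd still confirm in the generated asm.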

To be clear, I would probably have implemented the entire function in assembly, not just the part that initializes the array. That will give you control over the stack space as well.

Yeah, it'd work, but too much needs to happen in that function for me to want to write it for two architectures without being able to debug.
Even print-debugging with no symbol access is a pain to set up.

LLVM assumes that certain functions are always available if not explicitly told otherwise.

But there's an option for clang which can fix the issue. I wonder if rustc has a similar one, or if it can be added.
