#![no_builtins] does not remove memcpy from pure Rust code

I want to write a shared library for Linux that does not depend on libc. Since it contains only computational (and panic-free) functions, it depends only on the memset and memcpy symbols.

Let's use the following simplified example:

#![no_std]

#[unsafe(no_mangle)]
pub unsafe extern "C" fn my_cpy(buf_in: *const u8, buf_out: *mut u8, len: usize) {
    core::ptr::copy_nonoverlapping(buf_in, buf_out, len);
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn my_set(buf_ptr: *mut u8, buf_len: usize) {
    let s = core::slice::from_raw_parts_mut(buf_ptr, buf_len);
    s.fill(0);
}

As expected, readelf -Ws results in the following symbol table:

     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND memcpy
     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND memset
     7: 00000000000015f0    15 FUNC    GLOBAL DEFAULT   11 my_cpy
     8: 0000000000001600    17 FUNC    GLOBAL DEFAULT   11 my_set

Ideally, I would like to "inline" memcpy and memset into the library without exposing them as global symbols (I am fine with them being slightly less efficient than the system's implementation). I could use the compiler-builtins crate with the mem feature enabled, but unfortunately it can be used only on Nightly.

But recently I encountered the #![no_builtins] attribute, which looks like exactly what I want. And indeed, adding it to the crate removes the memset symbol, but for some reason memcpy is still present:

     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND memcpy
     6: 00000000000015c0    15 FUNC    GLOBAL DEFAULT   11 my_cpy
     7: 00000000000015d0   156 FUNC    GLOBAL DEFAULT   11 my_set

Is this behavior expected?

Your code uses copy_nonoverlapping, which is memcpy under the hood. Since you didn't provide your own memcpy in the library, the compiler expects it to come from an external source (see UND in the Ndx column).

You can provide your own memcpy in your library:

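#[unsafe(no_mangle)]
pub unsafe extern "C" fn memcpy(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    // Placeholder body, used only to inspect the symbol table;
    // a real implementation would copy n bytes and return dest.
    unreachable!()
}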

With this symbol provided, readelf shows the following dynsym entries:

     5: 0000000000001172    17 FUNC    GLOBAL DEFAULT    7 my_set
     6: 0000000000001149    24 FUNC    GLOBAL DEFAULT    7 memcpy
     7: 0000000000001163    15 FUNC    GLOBAL DEFAULT    7 my_cpy

memcpy is no longer UND, but it is still exported 🙁 I think that is the current behavior, though: all no_mangle symbols are exported from a cdylib. I was able to find an issue about it (Private `#[no_mangle]` symbols are exported from a `cdylib` · Issue #98449 · rust-lang/rust · GitHub).
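
For reference, here is a minimal sketch of what a real body for that placeholder could look like. The name memcpy_sketch is hypothetical and chosen only so the function can be compiled and tested next to libc; in the library it would carry the memcpy name and the no_mangle attribute from the snippet above. As discussed further down the thread, the crate containing it would presumably also need #![no_builtins], otherwise the optimizer may turn the loop back into a call to memcpy.

```rust
/// Byte-by-byte copy with the C `memcpy` signature.
/// `memcpy_sketch` is a hypothetical name used for illustration only.
///
/// # Safety
/// `src` must be valid for reads of `n` bytes, `dest` must be valid for
/// writes of `n` bytes, and the two regions must not overlap.
pub unsafe extern "C" fn memcpy_sketch(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    let mut i = 0;
    while i < n {
        // SAFETY: `i < n`, so both accesses stay inside the regions
        // the caller promised are valid.
        unsafe { *dest.add(i) = *src.add(i) };
        i += 1;
    }
    // Like C's memcpy, return the destination pointer.
    dest
}
```

This is deliberately naive (no word-sized copies, no alignment handling), in line with the "slightly less efficient" trade-off mentioned in the question.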

See https://doc.rust-lang.org/core/#how-to-use-the-core-library.

A bunch of things are required even if you never call them -- LLVM will optimize certain code patterns into calls to those library methods that are expected to exist, for example.


The UND part is not relevant to the question. In practice I use #[link(name = "c")] unsafe extern "C" {} to implicitly link to memset/memcpy from libc.

Yes, but fill(0) compiles to memset by default as well and #![no_builtins] forces the compiler to generate appropriate code instead of calling the symbol.

I guess the difference is whether memset/memcpy is called explicitly or as a result of optimization passes. Replacing fill(0) with core::ptr::write_bytes results in memset regardless of whether #![no_builtins] is used.

I wonder what prevents the compiler from replacing the symbols with "naive" code when the attribute is used. In the current form it looks quite useless...

Yeah, a no_mangle variant which does not export the symbol probably could've been useful here. Meanwhile, compiler-builtins looks like the only practical option for creating a libc-free library.

fill only compiles to memset due to LLVM optimizations. The standard library implements it as

for item in self.iter_mut() {
    *item = value;
}

for Copy types. The reference only talks about optimizations to functions that are assumed to exist. I am guessing that memcpy is directly called by the standard library.
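
To make the distinction concrete, here is a small sketch of the two ways of zeroing a buffer discussed above. The function names are made up for illustration. According to the observations in this thread, the loop version only becomes a memset call if LLVM's optimizer recognizes the pattern (which #![no_builtins] suppresses), while the write_bytes version asks for memset explicitly. Which symbols actually end up in the binary depends on the compiler version and optimization level, so it is worth checking with readelf.

```rust
/// Zeroes the buffer with a plain loop, mirroring the shape of the
/// `fill` implementation quoted above. Any `memset` call here would
/// come from an optimization pass, not from the source code.
pub fn zero_with_loop(buf: &mut [u8]) {
    for item in buf.iter_mut() {
        *item = 0;
    }
}

/// Zeroes the buffer through `core::ptr::write_bytes`, which was
/// observed in this thread to produce `memset` with or without
/// `#![no_builtins]`.
pub fn zero_with_write_bytes(buf: &mut [u8]) {
    // SAFETY: the pointer and length come from a valid mutable slice.
    unsafe { core::ptr::write_bytes(buf.as_mut_ptr(), 0, buf.len()) };
}
```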

I think you're right (see Compiler Explorer). The compiler optimized my_set using SSE instructions.

So what is to stop it from optimising your memcpy implementation into calling memcpy? 🙃

And I hope it is not something magic based on the current function name, because that would break if my memcpy calls into private_memcpy_helper...

Optimizing compiler-builtins intrinsics into self-recursive calls is the only thing #![no_builtins] prevents. It doesn't cause memcpy calls that already exist in the input LLVM IR to be turned into copy loops or anything like that. And rustc does emit memcpy for copy_nonoverlapping as well as for moves.
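
As an illustration of the last point, here is a sketch of code that never mentions memcpy or copy_nonoverlapping but still copies a large value. The name copy_block and the 4096-byte size are arbitrary choices for the example. Whether this particular function ends up referencing the memcpy symbol depends on the compiler version, target, and optimization level, so treat it as something to verify with readelf rather than a guarantee.

```rust
/// Copies one 4 KiB array into another with a plain assignment.
/// There is no explicit `memcpy` here, but rustc may lower a copy of
/// a value this large to a `memcpy` call in the generated code.
pub fn copy_block(dst: &mut [u8; 4096], src: &[u8; 4096]) {
    *dst = *src;
}
```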


Why? If you never call memset or memcpy explicitly, then calls to them are not generated.

Providing a memset/memcpy-free core is an entirely separate exercise, but that one doesn't need any special attributes, just some patience.

The simplest way to ensure that this wouldn't happen is to write it in assembler. Easy and simple.
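
A rough sketch of what the assembler route might look like with inline assembly, assuming an x86_64 target and the usual ABI guarantee that the direction flag is clear on function entry. The name copy_rep_movsb is hypothetical, and the fallback for other architectures is there only so the example compiles everywhere; it is not part of the idea. Because the copy is a single instruction the optimizer cannot see into, it cannot be rewritten into a memcpy call.

```rust
/// Copies `n` bytes from `src` to `dest` using `rep movsb`.
/// Hypothetical helper for illustration; x86_64 only.
///
/// # Safety
/// `src` must be valid for reads of `n` bytes, `dest` must be valid for
/// writes of `n` bytes, and the regions must not overlap.
#[cfg(target_arch = "x86_64")]
pub unsafe fn copy_rep_movsb(dest: *mut u8, src: *const u8, n: usize) {
    // SAFETY: the caller guarantees both regions are valid for `n` bytes.
    unsafe {
        core::arch::asm!(
            "rep movsb",
            // rcx = byte count, rdi = destination, rsi = source;
            // all three are clobbered by the instruction.
            inout("rcx") n => _,
            inout("rdi") dest => _,
            inout("rsi") src => _,
            options(nostack, preserves_flags),
        );
    }
}

/// Fallback for other targets so the example still builds there.
/// This plain loop does NOT have the "opaque to the optimizer" property.
#[cfg(not(target_arch = "x86_64"))]
pub unsafe fn copy_rep_movsb(dest: *mut u8, src: *const u8, n: usize) {
    let mut i = 0;
    while i < n {
        // SAFETY: `i < n`, within the regions the caller vouched for.
        unsafe { *dest.add(i) = *src.add(i) };
        i += 1;
    }
}
```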
