ELF Relocations

The short version

I'm working on a BPF program that is loaded via tc, which requires the ELF to be in a particular format. Rust (or LLVM?) uses relocations for local literal values, as well as for some common local variables, which trips up the tc loader because it doesn't know how to handle these relocations.

How is Rust making these determinations / is there a way to control this behavior?

The Longer Version

Specifically, the tc BPF loader requires the ELF binary to use the following format:

  • all BPF map declarations be in an ELF section called maps
  • any tc actions/classifiers be in their own ELF sections (names can be arbitrary, as you tell tc which sections to use at runtime)
  • Relocations can only point to sections .text, maps, or the sections containing your classifier/actions (<-- This is the problem)
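As a rough illustration of the third rule, here is a minimal Rust sketch of deciding whether a relocation target would be accepted. The function names and the choice to pass section names as plain strings are assumptions made for illustration; this is not how iproute2 itself implements the check.

```rust
/// Returns true if `section` is a relocation target the rules above allow:
/// `.text`, `maps`, or one of the classifier/action sections.
pub fn is_allowed_target(section: &str, program_sections: &[&str]) -> bool {
    section == ".text" || section == "maps" || program_sections.contains(&section)
}

/// Collects the relocation targets that would be rejected, in input order.
pub fn rejected_targets<'a>(targets: &[&'a str], program_sections: &[&str]) -> Vec<&'a str> {
    targets
        .iter()
        .copied()
        .filter(|section| !is_allowed_target(section, program_sections))
        .collect()
}
```

Under these assumptions, a relocation against .rodata would be reported as rejected, while one against maps or a classifier section would pass.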

Placing maps in a maps section is easy and works flawlessly with Rust's #[link_section = "maps"], so that isn't the issue.

For a simplified example, assume there is a well defined map like a HashMap<K,V> (meaning assume things like HashMap::init already work as intended) which is declared in a manner like:

#[link_section = "maps"]
static mut my_map: HashMap<i32, u64> = HashMap::init();

The problem is within user code. In a function such as my_action(), if I do something like my_map.set(&1, &count) (assuming count is some u64 I'm trying to update), Rust will place the &1 in its own section, named something like .Lalloc52, with a value of 0x1, and emit a relocation pointing to .Lalloc52. Because the relocation points to .Lalloc52 and not to one of the valid relocation sections listed above, tc loading fails.

However, if I swap &1 for a reference to some existing memory location (assuming type i32 is correct), such as my_map.set(&some_i32, &count), the relocation is gone and everything works.

If I try to make my own local variable binding, something like:

let one = 1;
my_map.set(&one, &count);

The relocation now points to .rodata instead of .Lalloc52, but it still causes tc to fail loading.

I know this is probably difficult to follow and a niche problem, but any help is much appreciated!

Which target are you using to compile your code?

Which relocations are emitted by the compiler is highly dependent on the target and the compiler flags passed. Normally on x86 when generating PIC with the right flags, the only relocations that need to be emitted are for pointers in the .data and .rodata sections (global data structures), which includes vtables for trait objects. Here's a set of linker flags that does that.

Unfortunately, I can't really give you much advice other than try playing around with the linker flags.

I appreciate the help. I'm using the x86_64-unknown-linux-gnu (i.e. default on my machine). The compile process is essentially:

  • Compile with RUSTFLAGS='-C embed-bitcode=yes' cargo rustc --release -- --emit=llvm-bc -C panic=abort -C lto -C link-arg=-nostartfiles -C opt-level=3
  • Compile the resulting bitcode with llc --march=bpf --filetype=obj
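The two steps above could be scripted roughly as follows. This is only a sketch: the flags are copied from the commands above, the function names are made up, and the bitcode and object paths are placeholder parameters because the real bitcode location depends on the build.

```rust
use std::process::Command;

/// Step 1: ask rustc (through cargo) to emit LLVM bitcode instead of a binary.
pub fn rustc_step() -> Command {
    let mut cmd = Command::new("cargo");
    cmd.env("RUSTFLAGS", "-C embed-bitcode=yes")
        .args(["rustc", "--release", "--"])
        .args(["--emit=llvm-bc", "-C", "panic=abort", "-C", "lto"])
        .args(["-C", "link-arg=-nostartfiles", "-C", "opt-level=3"]);
    cmd
}

/// Step 2: lower the bitcode to a BPF object file with llc.
/// `bitcode` and `object` are placeholder paths supplied by the caller.
pub fn llc_step(bitcode: &str, object: &str) -> Command {
    let mut cmd = Command::new("llc");
    cmd.args(["--march=bpf", "--filetype=obj", "-o", object, bitcode]);
    cmd
}
```

Each function only builds the command; calling `.status()` on the result would actually run it.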

There is actually another step in the middle which iterates over the bitcode and ensures all functions are annotated with always-inline due to BPF not allowing function calls, and the abort instructions are replaced with BPF's exit instruction.

I have also tried adding the link options you mentioned, but there is no change to the resulting object file.

All the code examples and code in the wild I've seen use a similar process with C (clang): -emit-llvm followed by llc --march=bpf --filetype=obj. Even though clang has a -target bpf option, it doesn't appear to be used much unless the host machine is 32-bit, even though the kernel documentation recommends using it.

I don't think x86_64-unknown-linux-gnu is an appropriate target to use with rustc, but since you're not letting LLVM do any code generation I don't think it really matters. (I think the proper way to do this is to create a new target bpf-unknown-unknown or so.)

Regardless, I think the flow you're using now doesn't actually put anything through the linker. You're just using the relocatable ELF directly. I don't know if this is standard for BPF? I also think this is not a Rust issue, since I get the same thing with clang -target bpf:

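/* Callee that only needs a pointer to read-only data. */
void my_fn(const int* a) {
}

void test_fn() {
    /* A static constant lives in .rodata, so taking &v needs a relocation. */
    const static int v = 1;
    my_fn(&v);
}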

Maybe you can post-process your ELF somehow to merge your rodata to text?

I think the proper way to do this is to create a new target bpf-unknown-unknown

I've found a fork of Rust that does exactly that, by Solana Labs. So I'm currently in the process of updating it to the latest nightly and will report back if that changes anything.

Regardless, I think the flow you're using now doesn't actually put anything through the linker.

I'm somewhat outside my area of expertise, so forgive me if I'm asking something dumb, but I've noticed that some of the linker args I supply do make changes to the resulting ELF, even if they're not the changes I want. Is it possible that Rust is emitting linker hints through the LLVM bitcode, or that the .bc file contains enough linker info that llc picks it up and follows through?

I don't know if this is standard for BPF?

Depends on which loader you're using. For BCC or bpftrace (libbpf), no. For tc/iproute2, yes. If you're using a custom loader, then I think most people go the libbpf route, but only because the BTF/CO-RE standard that allows pre-compiling the object file and loading it later is still new and in active development.

[example]

That points out something interesting to me. In your example you're using const static, so it makes sense that it ends up in .rodata. However, what I thought I was telling Rust was to make a non-static local stack variable. Again, being outside my area of expertise, is Rust "promoting" the let one = 1; to static data for performance reasons?

Maybe you can post-process your ELF somehow to merge your rodata to text?

That will be something I look into more or less as a last resort :stuck_out_tongue:

Again, thanks for taking the time to help out here!

That's very odd. Passing -C link-arg to rustc should have no impact on the output of --emit=llvm-bc. Arguments passed to llc could of course influence object generation.

Since it's a constant, probably something like that, yes. This would allow it to be merged with other constants with value 1 later.
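A small sketch of the promotion being discussed, in ordinary host Rust rather than BPF. The function names are made up for illustration. The first function compiles only because a reference to a literal is eligible for rvalue static promotion, which is consistent with the `&1` case ending up in its own static allocation. For the `let one = 1;` case the language treats the value as a local, and any move into read-only data would be an optimizer decision, which this sketch does not demonstrate.

```rust
/// Stand-in for an API that takes its key by reference, like `set` in the question.
pub fn read(key: &i32) -> i32 {
    *key
}

/// `&1` is promoted to static memory, so the reference may be `'static`.
pub fn promoted() -> &'static i32 {
    &1
}

/// A named local: returning `&one` here would be rejected by the borrow
/// checker, so at the language level this value is not a static.
pub fn with_local() -> i32 {
    let one = 1;
    read(&one)
}
```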

Perhaps I'm mixing concepts, but for example -C link-args=-z,notext does in fact remove the .text section from the final ELF.

This would allow it to merge it with other constants with value 1 later.

Yeah, that seems to be exactly what is happening. It also matches what I see: if I only have that let one = 1; in a single place in this BPF program, the "optimization" doesn't happen, but as soon as I add a second one or more, it seems like Rust is thinking, "Let's save space and place all these into a single spot with a relocation." Is there a way to keep Rust from trying to do this? I'm imagining something like the benchmarking black box.
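One way to approximate a black box without relying on benchmarking support is a volatile read from a stack slot. The sketch below is an assumption-laden illustration, not a confirmed fix: `opaque_i32` is a made-up helper, and whether the BPF backend then avoids the .rodata relocation has not been verified in this thread.

```rust
use core::ptr;

/// Passes `value` through a volatile read of a stack slot so the compiler
/// cannot treat the result as a known compile-time constant.
pub fn opaque_i32(value: i32) -> i32 {
    let slot = value;
    // SAFETY: `slot` is a valid, initialized, properly aligned i32 on the stack.
    unsafe { ptr::read_volatile(&slot) }
}
```

With a helper like this, the call from the question would become `let one = opaque_i32(1); my_map.set(&one, &count);`.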

Are you looking at the BPF ELF (the one produced by llc) or the x86 ELF (the one produced by rustc)?

I don't think you're going to have luck coaxing rustc to produce different LLVM IR. If it's common for BPF not to have a .rodata section, then LLVM should know that when generating code for the BPF target and avoid it.

Here's another thing you can try: edit the LLVM IR to put all constant globals in the text section. For example:

-@alloc1 = private unnamed_addr constant <{ [4 x i8] }> <{ [4 x i8] c"\01\00\00\00" }>, align 4
+@alloc1 = private unnamed_addr constant <{ [4 x i8] }> <{ [4 x i8] c"\01\00\00\00" }>, align 4, section ".text"
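The edit shown in the diff could be automated on textual LLVM IR with something like the sketch below. It is deliberately naive: it matches lines by substring rather than parsing the IR, the function name is made up, and it assumes one global definition per line in the same shape as the diff.

```rust
/// Appends `, section ".text"` to every private constant global that does
/// not already name a section. All other lines are passed through unchanged.
pub fn move_constants_to_text(ir: &str) -> String {
    ir.lines()
        .map(|line| {
            let is_const_global =
                line.starts_with('@') && line.contains(" private unnamed_addr constant ");
            if is_const_global && !line.contains(", section \"") {
                format!("{}, section \".text\"", line)
            } else {
                line.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

The rewritten IR would then go through llc as before. Note that this sketch drops a trailing newline if the input had one.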

The BPF ELF produced by llc. I don't want you to have to go down a rabbit hole on that one though; it's also entirely possible I'm mixing things up in my mind with all the different experiments I've been doing :stuck_out_tongue:

I'll try that with the LLVM-IR and see, thanks!