How do I find the function pointers for tests from the LLVM IR code of a Rust program?

We are developing a mutation testing system based on LLVM called Mull (see it on GitHub). The system supports C++ projects that use GoogleTest, and I am trying to add support for Rust. To do so, we need to accomplish the following steps:

  1. Compile the language into LLVM IR. Rust supports this.
  2. Find the tests in the LLVM IR.
  3. Run the tests and detect the code that is exercised by the tests ("testees").

The challenge is to find the unit test functions via the LLVM IR API.

Consider the following example. It has 4 tests and one testee function:

pub fn sum(a: i32, b: i32) -> i32 {
    return a + b;
}

pub fn just_print() {
    println!("I am just_print() function. I just say hello!");
}

#[test]
fn rusttest_foo_sum1() {
    assert!(sum(3, 4) == 7);
}

#[test]
fn rusttest_foo_sum2() {
    assert!(sum(4, 5) == 9);
}

#[test]
fn rusttest_foo_sum3() {
    assert!(sum(5, 6) == 11);
}

#[test]
fn rusttest_foo_sum4() {
    assert!(sum(5, 6) == 11);
}

See also the slightly prettified LLVM IR, example.ll.pretty, which is produced when compiling this Rust code.

Having explored that LLVM IR for a while, one can notice that Rust/Cargo runs tests via a function main that invokes the test_main_static function, which receives an array of test descriptions. Each description is a pair of a test function name and a test function pointer. See @ref.e at line 47.
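To make the shape of that table concrete, here is a minimal sketch in plain Rust of a static array of name/function-pointer pairs and a runner that walks it. The names TestEntry, TESTS and run_all are hypothetical and chosen for illustration only; the real descriptors emitted by the compiler carry more fields than this.

```rust
// Simplified, hypothetical model of the table handed to the test runner.
// The real descriptors have more fields; only the name/pointer pair matters here.
struct TestEntry {
    name: &'static str,
    func: fn(),
}

fn sum(a: i32, b: i32) -> i32 {
    a + b
}

fn rusttest_foo_sum1() {
    assert!(sum(3, 4) == 7);
}

fn rusttest_foo_sum2() {
    assert!(sum(4, 5) == 9);
}

// Analogue of the constant array that `main` passes to the runner.
static TESTS: &[TestEntry] = &[
    TestEntry { name: "rusttest_foo_sum1", func: rusttest_foo_sum1 },
    TestEntry { name: "rusttest_foo_sum2", func: rusttest_foo_sum2 },
];

// Stand-in for the runner: calls each function pointer and records its name.
fn run_all(tests: &[TestEntry]) -> Vec<&'static str> {
    let mut ran = Vec::new();
    for t in tests {
        (t.func)();
        ran.push(t.name);
    }
    ran
}

fn main() {
    for name in run_all(TESTS) {
        println!("ran {}", name);
    }
}
```

In the IR, the equivalent of TESTS is a constant global whose initializer nests these pairs inside further structs, which is why the offsets are awkward to hard-code.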

Our challenge is to collect the function pointers to these tests by parsing this sophisticated struct layout, so that we can later run the functions via the LLVM JIT by passing it the pointers we collected.

The obvious brute-force approach we are going to take is to walk through this struct layout, parse the structs carefully, and find the correct offsets of the test functions. This approach does not appear to be portable across different versions of Rust or LLVM IR, since the layout might change in the future.

What is the easiest and at the same time most reliable way of finding the test function pointers, other than the default of parsing the offsets by hand?

This question has also been cross-posted to Stack Overflow: "How do I find the function pointers for tests from the LLVM IR code of a Rust program?".

Why not just collect anything that looks like a function pointer to LLVM by exhaustively searching the whole structure passed into the function called test_main_static?
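As a rough sketch of what such an exhaustive search could look like, the following models a constant initializer as a small tree and recursively collects every function reference in it, without relying on any offsets. The Constant enum, collect_function_refs and example_table are hypothetical stand-ins written for illustration; this is not the LLVM API, and a real implementation would walk LLVM's own constant values instead.

```rust
// Toy stand-in for a constant initializer in the IR; NOT the LLVM API.
#[allow(dead_code)]
#[derive(Debug)]
enum Constant {
    Int(i64),
    Bytes(&'static str),
    FunctionRef(&'static str), // reference to a function symbol
    Aggregate(Vec<Constant>),  // struct or array of nested constants
}

// Depth-first walk that records every function reference it encounters,
// regardless of how deeply it is nested or at which field index it sits.
fn collect_function_refs(c: &Constant, out: &mut Vec<&'static str>) {
    match c {
        Constant::FunctionRef(name) => out.push(*name),
        Constant::Aggregate(fields) => {
            for field in fields {
                collect_function_refs(field, out);
            }
        }
        Constant::Int(_) | Constant::Bytes(_) => {}
    }
}

// Hypothetical table: an array of descriptors, each nesting a name record,
// some flag, and the pointer to the test function.
fn example_table() -> Constant {
    Constant::Aggregate(vec![
        Constant::Aggregate(vec![
            Constant::Aggregate(vec![Constant::Bytes("rusttest_foo_sum1"), Constant::Int(17)]),
            Constant::Int(0),
            Constant::FunctionRef("rusttest_foo_sum1"),
        ]),
        Constant::Aggregate(vec![
            Constant::Aggregate(vec![Constant::Bytes("rusttest_foo_sum2"), Constant::Int(17)]),
            Constant::Int(0),
            Constant::FunctionRef("rusttest_foo_sum2"),
        ]),
    ])
}

fn main() {
    let mut found = Vec::new();
    collect_function_refs(&example_table(), &mut found);
    println!("{:?}", found);
}
```

A search like this does not depend on field positions, but it would also pick up any function pointer in the structure that is not a test, so some filtering may be needed on top.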

In general, I cannot think of any approach that would work if the test suite is compiled with optimisations: LLVM can and will inline even indirect calls just fine if the IR for the body of test_main_static ever ends up in the same module as main.

Thanks for the answer. I have got it working using exactly the brute-force approach that I expected to work in my question (the same as what you called "exhaustively searching the whole structure...").

The approach seems to work. However, our concern was, and still is, that it might not be portable across different versions of Rust/LLVM, hence this question.

https://github.com/mull-project/mull/pull/109/files

My next step will be to try this RustTestFinder on real code bases and see if I have any problems.