Profiling a Library

There are a bunch of posts on the Internet about profiling Rust, but they all seem to focus on profiling a binary… how does one go about profiling a library? My directory layout is as follows:

  -> all my library files, including
  -> a single file that has a #[test] function that runs my benchmark

To run my benchmark, I simply issue: cargo test benchmarks -- --nocapture. How would I go about using something like cargo profiler in such a setup? Also, how can I generate an optimized binary with debug symbols so I can run various other tools against it: valgrind, operf, etc.?

Here is my actual repo if anyone is interested:


Any reason you’re not using the built-in benchmark suite? In that case, cargo tells you exactly which binary it’s running, which you can then feed into your profiler.


I’ve had success with:

  1. writing a benchmark using the built-in test library (in benches/)
  2. running cargo bench so that it prints the executable path for the benchmark
  3. launching that executable in a profiler
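For reference, a minimal sketch of such a benchmark (this uses the unstable test crate, so it needs a nightly toolchain; the file and function names are hypothetical):

```rust
// benches/my_bench.rs -- minimal #[bench] sketch (nightly only)
#![feature(test)]
extern crate test;

use test::Bencher;

#[bench]
fn bench_something(b: &mut Bencher) {
    b.iter(|| {
        // the code under measurement; black_box keeps the result
        // from being optimized away
        test::black_box((0u64..100).sum::<u64>())
    });
}
```

cargo bench then prints the path of the compiled benchmark executable under target/release/deps/, which you can hand straight to your profiler.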

It helps if you add [profile.release] debug = true to your Cargo.toml.
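For reference, that setting goes in Cargo.toml as a fragment like this:

```toml
# keep full optimizations but emit debug symbols for profilers
[profile.release]
debug = true
```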


@KillTheMule, when I use the benchmarking tool, it doesn’t print the time:

      Running target/release/deps/benchmarks-d1fe1b6ff6d839ad

running 1 test
test benchmarks ... bench:           0 ns/iter (+/- 0)

I think the only way to get it to print timing information is to use the Bencher, but because I’m creating files and doing all sorts of other things that don’t really pertain to the benchmark, I don’t want it to be called multiple times.

@kornel, I didn’t know about [profile.release], so that at least lets me inject debug symbols into the benchmark. Thanks!

Can you create the files before letting the Bencher start its iterations? Or use dependency injection, or configure the benched code to use “files” in memory instead?

Nah, I can’t easily set up the files first… this is a key/value store, so I’d want to start with fresh files on each iteration. Is there a way to tell Bencher to only do a single run?

Regardless, I’ve got a pretty good setup for timing my own functions:

fn put(start: u64, end: u64, db: &mut KVS, is_update: bool) {
    let range = start..end;

    let (elapsed, _) = measure_time(|| {
        for i in range {
            let key = format!("KEY_{}", i).as_bytes().to_vec();
            let value = if is_update { format!("{}_VALUE", i) } else { format!("VALUE_{}", i) }.as_bytes().to_vec();

            db.put(key, value);
        }
    });

    println!("Took {} to {} {} records", elapsed, if is_update { "UPDATE" } else { "PUT" }, end - start);
}
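The measure_time helper above appears to come from an external crate, but a hand-rolled equivalent takes only a few lines with std::time::Instant (a sketch, assuming the helper returns the elapsed time together with the closure’s result):

```rust
use std::time::{Duration, Instant};

// Minimal stand-in for a measure_time helper: runs the closure once
// and returns how long it took alongside the closure's return value.
fn measure_time<T, F: FnOnce() -> T>(f: F) -> (Duration, T) {
    let start = Instant::now();
    let result = f();
    (start.elapsed(), result)
}

fn main() {
    let (elapsed, sum) = measure_time(|| (0u64..1_000).sum::<u64>());
    println!("Took {:?} to sum: {}", elapsed, sum);
}
```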

If you’re seeing times of 0 ns, that tends to indicate the compiler has optimized everything away. The benchmarks generated by cargo bench will always report the average iteration time (and that ± spread indicator); otherwise it’s a bug in cargo bench.
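To illustrate, routing the values through a black-box hint is the usual way to stop the compiler from constant-folding the measured work away (a sketch using std::hint::black_box, which is in stable std today; the test crate has an equivalent on nightly):

```rust
use std::hint::black_box;

// Summing through black_box prevents the compiler from folding the
// whole loop into a constant -- the usual cause of 0 ns/iter results.
fn opaque_sum(n: u64) -> u64 {
    let mut acc = 0u64;
    for i in 0..n {
        acc = acc.wrapping_add(black_box(i));
    }
    acc
}

fn main() {
    println!("{}", opaque_sum(1_000));
}
```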