Profiling by thread

Hello Community,

I am developing a multi-threaded Rust application in which many threads continuously run a computationally heavy optimization algorithm. I am looking for performance bottlenecks in each of the algorithms, but I am having a hard time producing correct profiling information and flamegraphs.

Every flamegraph that I create seems to be missing stack information from some of the running threads.

Does anyone know a good way of profiling threads individually in Rust? Currently I am thinking about just learning perf and relying solely on that, but cargo flamegraph seems a little more convenient.

Kind regards,
Christian

Write unit, integration, and benchmark tests. Profiling the whole program is useful for some details, but it hides others. A test or benchmark that exercises a single algorithm gives you a profile of just that code.

If you can put the threading behind a feature, so that the program can also run without multiple threads, you eliminate some of the hassle as well.
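A minimal sketch of that idea, using only the standard library. The names `optimize` and `run_all` are hypothetical stand-ins for the real algorithm and its driver, and the workload is a placeholder:

```rust
use std::thread;

/// Hypothetical stand-in for one run of the optimization algorithm.
fn optimize(seed: u64) -> u64 {
    // Cheap, deterministic placeholder work.
    let mut x = seed;
    for _ in 0..1_000 {
        x = x
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
    }
    x
}

/// Runs every job either on the calling thread or on one thread per job.
/// Results are returned in the same order as `seeds` in both modes.
fn run_all(seeds: &[u64], single_threaded: bool) -> Vec<u64> {
    if single_threaded {
        // Everything stays on one thread, so a profile shows one stack.
        seeds.iter().map(|&s| optimize(s)).collect()
    } else {
        let handles: Vec<_> = seeds
            .iter()
            .map(|&s| thread::spawn(move || optimize(s)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    }
}
```

The flag could be wired to a Cargo feature, for example `run_all(&seeds, cfg!(feature = "single-thread"))`, where `single-thread` is an assumed feature name that would have to be declared in Cargo.toml.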

Be sure to build with debug info and frame pointers enabled, so that the profiler can unwind and symbolize the stacks:
RUSTFLAGS="-g -Cforce-frame-pointers=on" cargo build --release

perf is good to learn. It takes time, but if you want to profile you should be willing to invest it.

Instrumenting your code "intrusively" allows fine-grained control over what gets captured in the flamegraph. I've used the profiling crate for this in the past, but there are also pprof, firestorm, and tracing-flame.
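To illustrate what intrusive instrumentation means, here is a rough standard-library-only sketch. It does not use the API of any of the crates mentioned above; `ScopeTimer`, `Timings`, and `run_workers` are hypothetical names, and the workload is a placeholder. Each worker thread is named, and a scope guard records how long the instrumented section took under a per-thread key:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Shared table of accumulated time, keyed by "thread-name/label".
type Timings = Arc<Mutex<HashMap<String, Duration>>>;

/// Hypothetical scope guard: measures from creation until it is dropped.
struct ScopeTimer {
    label: &'static str,
    start: Instant,
    sink: Timings,
}

impl ScopeTimer {
    fn new(label: &'static str, sink: &Timings) -> Self {
        ScopeTimer { label, start: Instant::now(), sink: Arc::clone(sink) }
    }
}

impl Drop for ScopeTimer {
    fn drop(&mut self) {
        let elapsed = self.start.elapsed();
        let current = thread::current();
        let name = current.name().unwrap_or("unnamed");
        let key = format!("{}/{}", name, self.label);
        // Skip recording if the mutex is poisoned instead of panicking in drop.
        if let Ok(mut map) = self.sink.lock() {
            *map.entry(key).or_insert(Duration::ZERO) += elapsed;
        }
    }
}

/// Spawns `count` named workers and returns the per-thread timings.
fn run_workers(count: usize) -> HashMap<String, Duration> {
    let sink: Timings = Arc::new(Mutex::new(HashMap::new()));
    let handles: Vec<_> = (0..count)
        .map(|i| {
            let sink = Arc::clone(&sink);
            thread::Builder::new()
                .name(format!("opt-{i}"))
                .spawn(move || {
                    let _timer = ScopeTimer::new("optimize", &sink);
                    // Placeholder for the real optimization step.
                    let mut acc = 0u64;
                    for k in 0..100_000u64 {
                        acc = acc.wrapping_add(k * k);
                    }
                    std::hint::black_box(acc);
                })
                .expect("failed to spawn thread")
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    let timings = sink.lock().unwrap().clone();
    timings
}
```

Calling `run_workers(3)` yields entries such as `opt-0/optimize`. Naming the threads may also help with sampling profilers: on Linux, names set through `thread::Builder` are typically what perf shows as the thread's command name, which makes it easier to tell the workers apart.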