https://github.com/bevyengine/bevy/discussions/655
Greater context is in the above discussion, but essentially, when working on Bevy I found that the ecs_bench micro-benchmarks would return drastically different results after extremely minor changes to the benchmarks, such as commenting out unrelated pieces of code. To test the performance of Bevy in more real-life situations, I wanted to create some headless "games" that work essentially like a real game, but run themselves and don't render anything.
For example, I just finished an asteroids game that has the semblance of a real game from the system logic perspective, but doesn't require user input or rendering ( rendering is optional ):
I'm thinking of making at least one more game like this, and probably using the Bevy breakout example as well. Now I want to know the best way to profile and analyze Bevy's performance using this game.
I've set up Linux perf and valgrind, along with their GUIs Hotspot and KCachegrind, but I don't really know how to use them. Also, on Linux ( using perf ) the asteroids example can print out the number of CPU cycles and CPU instructions that were executed over the course of the game, which seems like a useful metric:
cycles / instructions: 4.05076 M / 2.42554 M (1.67 cpi)
We might be able to do other similar things.
Is the best way to test this just to run the game for a certain number of frames and time how long it takes? What kind of strategies can I use to approach this?
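To make the "fixed number of frames" idea concrete, here is a minimal sketch of what I have in mind. It is not tied to any Bevy API: `step_frame` is a hypothetical closure standing in for one update of the headless game, and `time_frames` / `best_of` are names I made up for illustration.

```rust
use std::time::{Duration, Instant};

/// Runs `step_frame` exactly `frames` times and returns the total wall-clock time.
/// `step_frame` is a hypothetical stand-in for one update of the headless game.
fn time_frames<F: FnMut()>(frames: u32, mut step_frame: F) -> Duration {
    let start = Instant::now();
    for _ in 0..frames {
        step_frame();
    }
    start.elapsed()
}

/// Repeats the fixed-length run `runs` times and returns the fastest run.
/// Taking the minimum is one common way to reduce noise from other processes;
/// returns a zero duration if `runs` is 0.
fn best_of<F: FnMut()>(runs: u32, frames: u32, mut step_frame: F) -> Duration {
    (0..runs)
        .map(|_| time_frames(frames, &mut step_frame))
        .min()
        .unwrap_or_default()
}
```

Calling something like `best_of(5, 10_000, || game.update())` (where `game.update()` is again hypothetical) would give one number per build that could be compared across Bevy versions, assuming the game is deterministic enough that every run does the same work.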
The goal is to be able to more effectively determine the effect that changes to the Bevy engine have on its performance. We need to be able to compare one version of Bevy to other versions and measure the difference in performance.
I'm imagining a nice workflow would be to script a benchmarking suite that runs through a few of these benchmark games, collects a certain set of stats on each, and displays the difference in those stats between two runs of the suite. I know that criterion does this, and maybe that would be useful for the timing portion, but we might also want to do the same with CPU instruction counts.
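As a rough sketch of the "diff two runs" part, assuming each benchmark game reports its wall-clock time plus the cycle and instruction counts from perf: the `RunStats` struct, its fields, and the `report` function are all hypothetical names for illustration, not something that exists in Bevy or criterion.

```rust
/// Stats collected from one run of a benchmark game (hypothetical layout).
#[derive(Debug, Clone, Copy)]
struct RunStats {
    seconds: f64,
    cycles: u64,
    instructions: u64,
}

impl RunStats {
    /// Cycles per instruction, the "cpi" figure printed by the asteroids example.
    fn cpi(&self) -> f64 {
        self.cycles as f64 / self.instructions as f64
    }
}

/// Relative change from `before` to `after` in percent (negative means lower).
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Formats a per-metric comparison between a baseline run and a new run.
fn report(base: &RunStats, new: &RunStats) -> String {
    format!(
        "time: {:+.2}%\ncycles: {:+.2}%\ninstructions: {:+.2}%\ncpi: {:.2} -> {:.2}",
        percent_change(base.seconds, new.seconds),
        percent_change(base.cycles as f64, new.cycles as f64),
        percent_change(base.instructions as f64, new.instructions as f64),
        base.cpi(),
        new.cpi(),
    )
}
```

Instruction counts tend to vary much less between runs than wall-clock time, which is why I'd like to report them side by side rather than rely on timing alone.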
Anyway, any guidance or tips would be appreciated, thanks!