Chess engine is faster with debug-assertions = true

I just updated the GUI to the latest Xilem and ran a few performance tests again.

GitHub - StefanSalewski/xilem-chess: First Xilem GUI for the tiny salewski chess engine

With

[profile.release]
debug-assertions = true

the performance increase is about 20%.

Some months ago, we assumed that the debug_assert!() statements might give the compiler invariants that improve performance; see the thread "Funny random performance changes".

But using just assert!() instead of debug_assert!() gives no performance gain. I can even comment out all the debug_assert!() calls in engine.rs and still get the performance gain with debug-assertions = true.

Well, this is on a 64-bit x86 Linux box. It is some funny magic.

I also tried LTO or codegen-units=1 with debug-assertions = false, but that has no significant effect on performance.

Are you making sure the benchmarking environment is consistent? Fair comparisons are notoriously hard on x86: P-cores vs. E-cores, hyper-threading, and plenty of other things can skew the results. Pinning the process with taskset -c 0 might make it fairer.
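To reduce run-to-run noise further, one option is to time the same workload several times and compare the fastest run across build profiles. The following is only a minimal sketch: `workload` is a hypothetical stand-in for the engine's search call, and `best_of` is an illustrative helper, neither is taken from the xilem-chess repository.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the engine's search; replace with the real call.
fn workload(n: u64) -> u64 {
    let mut x = 0u64;
    for i in 0..n {
        x = x.wrapping_mul(6364136223846793005).wrapping_add(i);
    }
    x
}

/// Runs `f` `runs` times and returns the fastest wall-clock time.
/// The minimum is usually less sensitive to scheduler noise than the mean.
fn best_of<F: FnMut() -> u64>(runs: u32, mut f: F) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        // black_box keeps the optimizer from discarding the result.
        black_box(f());
        best = best.min(start.elapsed());
    }
    best
}

fn main() {
    let best = best_of(5, || workload(black_box(10_000_000)));
    println!("best of 5: {:?}", best);
}
```

Running the resulting binary under taskset -c 0 for each profile setting keeps the comparison on one core.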

Perhaps the performance gain is not from debug-assertions per se, but from overflow-checks, which defaults to the same value as debug-assertions. Try

[profile.release]
debug-assertions = false
overflow-checks = true

and see whether that gives you the faster performance. If it does: when I ran into such a problem recently, I switched the problematic arithmetic to the .checked_*() methods to get the performance benefit regardless of build settings.
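As a rough illustration of that approach, the sketch below sums values with `checked_add`, so the overflow check is part of the code itself and does not depend on the overflow-checks setting. The function name `sum_checked` and the sample values are made up for the example and do not come from the chess engine.

```rust
/// Hypothetical helper: sums values with an explicit overflow check.
/// Returns None as soon as an addition would overflow i32.
fn sum_checked(values: &[i32]) -> Option<i32> {
    values
        .iter()
        .try_fold(0i32, |acc, &v| acc.checked_add(v))
}

fn main() {
    // Behaves the same in debug and release builds.
    println!("{:?}", sum_checked(&[100, 300, 500]));
    println!("{:?}", sum_checked(&[i32::MAX, 1]));
}
```

Whether this changes the generated code in a useful way has to be measured case by case.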

Profile the code to find out.
The compiler is probably removing a bounds check in a deep loop.
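For readers unfamiliar with the effect, here is a small sketch of how an up-front assertion can give the optimizer enough information to drop per-iteration bounds checks. The function names are invented for illustration, and whether the checks are actually elided depends on the compiler version and optimization level, so the generated assembly would need to be inspected to confirm it.

```rust
/// Indexes inside the loop; each `data[i]` may carry a bounds check.
fn sum_first_n_indexed(data: &[u32], n: usize) -> u32 {
    let mut total = 0u32;
    for i in 0..n {
        total = total.wrapping_add(data[i]);
    }
    total
}

/// Same loop, but one assertion up front states the invariant `n <= len`,
/// which the optimizer may use to remove the checks inside the loop.
fn sum_first_n_hinted(data: &[u32], n: usize) -> u32 {
    assert!(n <= data.len());
    let mut total = 0u32;
    for i in 0..n {
        total = total.wrapping_add(data[i]);
    }
    total
}

/// Slicing once and iterating avoids indexing in the loop altogether.
fn sum_first_n_sliced(data: &[u32], n: usize) -> u32 {
    data[..n].iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

fn main() {
    let data = [1u32, 2, 3, 4, 5];
    println!("{}", sum_first_n_indexed(&data, 3));
    println!("{}", sum_first_n_hinted(&data, 3));
    println!("{}", sum_first_n_sliced(&data, 3));
}
```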

I guess that is not that easy. Some weeks ago I tried profiling a Bevy app to find a regression and failed, so I assume that successful profiling would take me a longer vacation. See Window resizing delay/lag on latest stable version · Issue #22579 · bevyengine/bevy · GitHub

Using taskset -c 0 or overflow-checks = true makes no significant difference.

But actually, using opt-level = 2 instead of the default 3 for the release build gives the same performance improvement! So the chess engine is one of the rare cases where O2 is better than O3, and I assume that debug-assertions = true prevents a particular O3 optimization that decreases performance in this case.