Rectangle clipping parameters and performance


I have a central routine (perform_transfer_rect_clipping) that is called very often, as are the union and intersection routines of Rect, for which I implemented the traits And, AndAssign, Or, OrAssign.

I was wondering:
What is the best strategy for passing the Rect arguments in the several functions, regarding performance?
Currently they are all passed by reference. The size of Rect is 16 bytes.

I am still in the process of learning the language...

If anyone has suggestions to improve the performance of the clipping routine, let me know :)
I have no complaints about the performance, though. I am just interested.
Any other suggestions welcome as well.


There are several levels on which this question can be answered. Please consider all of them:

  • As a default, you should prefer to pass values that small directly, not by reference. 16 bytes fit in just 2 registers on most common (64-bit) platforms today, and that is much cheaper than accessing memory. And when you do pass a larger value, the ABI usually says that it will be passed by reference anyway. So, forcing a reference is unlikely to gain you anything.

  • But, when your functions are inlined, what the function signature says stops mattering — the code of the two functions is combined and they are optimized as a single unit. So, inlining means the choice is unlikely to matter, except when a function is not inlined (and at that point, the function is likely large enough, and takes long enough to execute, that the details of argument passing are insignificant).

  • But, if you care about performance, you must write benchmarks and try both options. This sort of micro-optimization is full of extremely surprising results; the effect of such a small change to the program cannot be predicted reliably. Run benchmarks; study the assembly output; determine empirically what choice is better.
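As a rough illustration of the "try both options" advice, here is a minimal timing sketch. The `Rect` field names, the two `intersect_*` functions, and the `time_it` helper are all hypothetical and not taken from the original code. `#[inline(never)]` is used so that the calling convention actually comes into play, and `std::hint::black_box` keeps the optimizer from discarding the work. A dedicated benchmarking harness will give more trustworthy numbers than a hand-rolled loop like this; build with `--release` before reading anything into the output.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical 16-byte rectangle: four i32 fields.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Rect {
    left: i32,
    top: i32,
    right: i32,
    bottom: i32,
}

// Intersection with both operands passed by value.
#[inline(never)]
fn intersect_by_value(a: Rect, b: Rect) -> Rect {
    Rect {
        left: a.left.max(b.left),
        top: a.top.max(b.top),
        right: a.right.min(b.right),
        bottom: a.bottom.min(b.bottom),
    }
}

// The same intersection with both operands passed by reference.
#[inline(never)]
fn intersect_by_ref(a: &Rect, b: &Rect) -> Rect {
    Rect {
        left: a.left.max(b.left),
        top: a.top.max(b.top),
        right: a.right.min(b.right),
        bottom: a.bottom.min(b.bottom),
    }
}

// Runs `f` the given number of times and returns the elapsed wall-clock time.
fn time_it(iterations: u32, mut f: impl FnMut()) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    start.elapsed()
}

fn main() {
    let a = Rect { left: 0, top: 0, right: 10, bottom: 10 };
    let b = Rect { left: 5, top: 5, right: 20, bottom: 20 };
    let n = 1_000_000;

    let by_value = time_it(n, || {
        black_box(intersect_by_value(black_box(a), black_box(b)));
    });
    let by_ref = time_it(n, || {
        black_box(intersect_by_ref(black_box(&a), black_box(&b)));
    });

    println!("by value: {:?}", by_value);
    println!("by ref:   {:?}", by_ref);
}
```

The timings vary between machines and runs, so treat a single run as a hint and not as a measurement.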

All that said, I also recommend passing by value as a matter of style. Your code will read better without extra & and *, and you probably won't have any meaningful performance difference from sticking to that.
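To show what the by-value style can look like, here is a small sketch. It assumes a `Rect` made of four `i32` fields (which gives the 16 bytes mentioned in the question) and uses the standard `std::ops::BitAnd` and `std::ops::BitOr` traits for `&` and `|`. The field names, the `is_empty` helper, and the choice of a bounding-box union are assumptions for illustration, not the original code.

```rust
use std::ops::{BitAnd, BitOr};

// Hypothetical layout: four i32 fields, 16 bytes in total.
// Deriving Copy lets callers pass a Rect by value without moving it away.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Rect {
    pub left: i32,
    pub top: i32,
    pub right: i32,
    pub bottom: i32,
}

impl Rect {
    // A rectangle with no area counts as empty.
    pub fn is_empty(self) -> bool {
        self.left >= self.right || self.top >= self.bottom
    }
}

// `a & b`: intersection, both operands taken by value.
impl BitAnd for Rect {
    type Output = Rect;

    fn bitand(self, rhs: Rect) -> Rect {
        Rect {
            left: self.left.max(rhs.left),
            top: self.top.max(rhs.top),
            right: self.right.min(rhs.right),
            bottom: self.bottom.min(rhs.bottom),
        }
    }
}

// `a | b`: smallest rectangle that contains both operands.
impl BitOr for Rect {
    type Output = Rect;

    fn bitor(self, rhs: Rect) -> Rect {
        Rect {
            left: self.left.min(rhs.left),
            top: self.top.min(rhs.top),
            right: self.right.max(rhs.right),
            bottom: self.bottom.max(rhs.bottom),
        }
    }
}

fn main() {
    let a = Rect { left: 0, top: 0, right: 10, bottom: 10 };
    let b = Rect { left: 5, top: 5, right: 20, bottom: 20 };

    // No `&` or `*` at the call sites; `a` and `b` remain usable because Rect is Copy.
    let clipped = a & b;
    let bounds = a | b;

    assert_eq!(clipped, Rect { left: 5, top: 5, right: 10, bottom: 10 });
    assert_eq!(bounds, Rect { left: 0, top: 0, right: 20, bottom: 20 });
    assert!(!clipped.is_empty());
    assert_eq!(std::mem::size_of::<Rect>(), 16);
}
```

Because `Rect` is `Copy`, switching a signature between `Rect` and `&Rect` later is a small, mechanical change if a benchmark ever suggests it is worth it.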

I also prefer by value, but not for big structs of course.
I have never done any benchmarking (apart from timing things myself in release mode).
I cannot read the assembly output in, for example, the Rust Playground. It is impossible to see what is what.
I changed - for now - these rects to "by value" and now the first task is to get my game running. Almost there...