Cache efficient batch search

raidwas · July 29, 2020, 12:26pm

I have been thinking about this for some time and have implemented a more cache-friendly search for sorted slices.

The code can be found here.

The basic idea is that during a binary search in a big sorted slice, the CPU spends most of its time waiting for the next value to compare against to arrive from memory, so that time is wasted.
This can be improved by asking the CPU to prefetch that value and, while the prefetch is in flight, continuing with the searches for the other keys in the batch.
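A minimal sketch of that idea, not the code from the linked repository: several lower-bound searches are advanced one comparison at a time in round-robin order, and each search issues a prefetch hint for its next probe before control moves on to the other searches. The names `prefetch`, `Search` and `batch_lower_bound` are made up for this illustration, and the `u64` element type is an assumption.

```rust
/// Issue a prefetch hint for `slice[idx]`; does nothing on non-x86_64 targets.
#[inline(always)]
fn prefetch<T>(slice: &[T], idx: usize) {
    #[cfg(target_arch = "x86_64")]
    {
        use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
        if idx < slice.len() {
            // SAFETY: the pointer stays in bounds, and a prefetch is only a hint.
            unsafe { _mm_prefetch::<_MM_HINT_T0>(slice.as_ptr().add(idx) as *const i8) };
        }
    }
    #[cfg(not(target_arch = "x86_64"))]
    {
        let _ = (slice, idx);
    }
}

/// State of one in-flight search over the half-open range `[lo, hi)`.
struct Search {
    lo: usize,
    hi: usize,
}

/// For every key, returns the index of the first element `>= key`.
fn batch_lower_bound(data: &[u64], keys: &[u64]) -> Vec<usize> {
    let mut states: Vec<Search> = keys
        .iter()
        .map(|_| Search { lo: 0, hi: data.len() })
        .collect();

    let mut any_active = true;
    while any_active {
        any_active = false;
        // One comparison per search per round; the other searches in the
        // batch run while this search's next probe is being fetched.
        for (s, &key) in states.iter_mut().zip(keys) {
            if s.lo >= s.hi {
                continue; // this search has already finished
            }
            let mid = s.lo + (s.hi - s.lo) / 2;
            if data[mid] < key {
                s.lo = mid + 1;
            } else {
                s.hi = mid;
            }
            if s.lo < s.hi {
                // Request the element this search will compare against next round.
                prefetch(data, s.lo + (s.hi - s.lo) / 2);
                any_active = true;
            }
        }
    }
    states.into_iter().map(|s| s.lo).collect()
}
```

Calling `batch_lower_bound(&data, &keys)` should give the same indices as `data.partition_point(|&x| x < key)` for each key. In practice the batch would probably be kept small, so that a prefetched cache line is still resident when its search comes around again.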

I implemented this concept in three ways:

a handcrafted state machine
using futures
using generators
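To give a feel for the futures variant on stable Rust, here is a rough sketch under my own assumptions, not the implementation from the repository. The search is an ordinary `async fn` that issues a prefetch hint and then yields once before each comparison, and a tiny driver polls all searches in round-robin order with a waker that does nothing. All names (`prefetch_hint`, `YieldOnce`, `lower_bound_async`, `batch_lower_bound_async`) are hypothetical.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Issue a prefetch hint for `slice[idx]`; does nothing on non-x86_64 targets.
fn prefetch_hint<T>(slice: &[T], idx: usize) {
    #[cfg(target_arch = "x86_64")]
    {
        use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
        if idx < slice.len() {
            // SAFETY: the pointer stays in bounds, and a prefetch is only a hint.
            unsafe { _mm_prefetch::<_MM_HINT_T0>(slice.as_ptr().add(idx) as *const i8) };
        }
    }
    #[cfg(not(target_arch = "x86_64"))]
    {
        let _ = (slice, idx);
    }
}

/// Future that is `Pending` on its first poll and `Ready` on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Reads like a plain binary search, apart from the prefetch and the yield.
async fn lower_bound_async(data: &[u64], key: u64) -> usize {
    let (mut lo, mut hi) = (0usize, data.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        prefetch_hint(data, mid);
        YieldOnce(false).await; // let the other searches run during the fetch
        if data[mid] < key {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    lo
}

/// Waker that does nothing; the driver below polls unconditionally.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

/// Polls one search per key in round-robin order until all have finished.
fn batch_lower_bound_async(data: &[u64], keys: &[u64]) -> Vec<usize> {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut tasks: Vec<_> = keys
        .iter()
        .map(|&k| Box::pin(lower_bound_async(data, k)))
        .collect();
    let mut results: Vec<Option<usize>> = vec![None; keys.len()];
    let mut remaining = keys.len();
    while remaining > 0 {
        for (task, slot) in tasks.iter_mut().zip(results.iter_mut()) {
            if slot.is_none() {
                if let Poll::Ready(idx) = task.as_mut().poll(&mut cx) {
                    *slot = Some(idx);
                    remaining -= 1;
                }
            }
        }
    }
    results.into_iter().map(|r| r.unwrap()).collect()
}
```

The boxing and the `Option` bookkeeping are only there to keep the sketch short; a tuned version would likely avoid the per-search allocation.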

I personally prefer the two versions using generators, as they stay extremely close to the original (the binary search from std) while greatly improving performance on big slices. Sadly, generators cannot be used on stable Rust yet...

I am not looking for a code review; I mainly wanted to share this, hear your opinions, and answer any questions you have.

notriddle · July 30, 2020, 7:32pm

I'm curious how it compares to Eytzinger Binary Search, which takes a completely different approach to defining a cache-friendly algorithm.

raidwas · July 30, 2020, 8:06pm

As far as I understand it, the text you linked does not prefetch the data for the immediately following comparison, but for a comparison further down the line.

Either way, here are my thoughts after skimming the link:

Inserting or removing values might require rebuilding the whole array in a pattern that is far more complicated than a memcpy (which is what a sorted array would use for insertion or deletion), making the Eytzinger representation unsuitable for many situations.
The text only tries to speed up a single search, so the setting is a bit different. If I tried to speed up a single search, I would have to prefetch multiple locations in the array beforehand without knowing which of them will be required, which would waste a lot of memory bandwidth.
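For readers who have not seen the layout under discussion, here is a rough sketch of an Eytzinger (breadth-first) array and a lower-bound search over it. It is a textbook-style illustration and not taken from the linked text; the function names are hypothetical, the `u64` element type is an assumption, and prefetching is left out entirely. The children of slot `k` live at `2k` and `2k + 1`, with slot 0 unused.

```rust
/// Rearranges a sorted slice into Eytzinger order (1-based; slot 0 is unused).
fn eytzinger(sorted: &[u64]) -> Vec<u64> {
    // An in-order walk of the implicit tree visits the slots in sorted order,
    // so the sorted values are handed out in that order.
    fn fill(sorted: &[u64], out: &mut [u64], next: &mut usize, k: usize) {
        if k < out.len() {
            fill(sorted, out, next, 2 * k);
            out[k] = sorted[*next];
            *next += 1;
            fill(sorted, out, next, 2 * k + 1);
        }
    }
    let mut out = vec![0; sorted.len() + 1];
    let mut next = 0;
    fill(sorted, &mut out, &mut next, 1);
    out
}

/// Returns the slot of the first element `>= key`, or 0 if there is none.
fn eytzinger_lower_bound(tree: &[u64], key: u64) -> usize {
    let mut k: usize = 1;
    while k < tree.len() {
        // Go right (2k + 1) if the current element is too small, else left (2k).
        k = 2 * k + (tree[k] < key) as usize;
    }
    // Undo the trailing right turns and the last left turn to get back to
    // the last node where the search went left, which holds the answer.
    k >>= k.trailing_ones() + 1;
    k
}
```

For the sorted input `[1, 3, 5, 7, 9]` this builds `[0, 7, 3, 9, 1, 5]`. Inserting one new value can change the position of almost every element, which is the rebuilding cost mentioned in the first point above.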
