So I thought about this some time and I implemented a more cache friendly search (in sorted slices).

The code can be found here.

The basic idea is that during a binary search in (big) sorted slices the cpu is mainly waiting for the next value to compare to and therefore wasting time.

This can be improved by asking the cpu to prefetch the data and during that prefetching start searching for the next values.

I implemented this concept in three ways:

- a handcrafted state machien
- using futures
- using generators

I personally prefer the two versions using generators as they stay extremely close to the original (the binary search from the `std`

) while greatly improving performance (on big slices). Sadly generators cannot be used yet in stable...

I am not looking for a code review but mainly wanted to share it and ask for your opinions or try to answer questions you have.