Sbbf-rs Fastest (without asterisks) bloom filter library in Rust

Hey! I implemented a bloom filter library which seems to be the fastest in Rust according to my benchmarks, it is also probably one of the lightest since it doesn't have any dependencies (not even hashing functions).

There is a low level, unsafe, version that doesn't allocate any memory and is no_std: GitHub - ozgrakkurt/sbbf-rs: Split block bloom filter implementation in Rust

And there is a wrapper that handles memory and gives a safe api: GitHub - ozgrakkurt/sbbf-rs-safe: Split block bloom filter implementation

This particular implementation is a split block bloom filter and it is optimized using cpu specific instructions on x86, x86_64, aarch64 and wasm. It loads the best implementation of a filter on runtime based on the capabilities of the cpu, so no need to do target-cpu=native or anything like that to get the best performance.

The filter also produces the same byte buffers regardless of the system so it can be used to implement persistent bloom filters for databases or file formats, or to send filters over the wire.

It is an exact implementation the the bloom filter from the parquet parquet spec: parquet-format/BloomFilter.md at master · apache/parquet-format · GitHub

There are a lot of variants and implementations of bloom filters. But this variant is known to be the fastest among all of them, because it is very cache(loads a single small section of memory per query) and cpu(doesn't have any branches and is fully vectorized) friendly.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.