Announcing TeardownTree

Hi all!

I have just pushed a new project to crates.io, I've been working on it over the past few weeks. Admittedly, it's quite a niche thing, but read on and see if you have any use for it.

TeardownTree is a data structure that is basically a binary search tree, with a quirk that it supports efficient teardown scenarios, i.e. the typical usage pattern is to build a master copy of the tree, then

  1. clone the master copy
  2. tear the tree down with a series of delete-range operations (and do something with the retrieved items)
  3. rinse, repeat

It does not support insertion and does not make any attempt to remain balanced. Fast cloning is achieved by the virtue of implicit representation (like in Binary Heap, we store all values in the array and access nodes by calculating their indices on the fly). As for delete_range(), I have designed an algorithm specifically for that purpose. It works in O(k + log n) time, where n is the initial number of nodes in the tree and k is the number of nodes deleted. The algorithm is quite straightforward and applicable to the normal binary search tree, but, to my surprise, I have not found anything like it on the web. (I intend to implement it for the regular BST at some point and then compare it against an algorithm based on split/merge).

I have run some benchmarks against the BTreeSet in the standard library and got nice results - which is not very surprising, considering that BTreeSet does not support an O(k + log n) bulk deletion (issue #34666). Anyway, to give you an idea of the current state of affairs: I ran TeardownTree::delete_range() on a tree with 1,000,000 items repeatedly, deleting 1000 items at a time. Then I did the same with BTreeSet (removing items one-by-one). TeardownTree is ~14 times faster. As a bonus, TeardownTree also consumes 45% less memory (but, as opposed to BTreeSet, it only frees memory when the whole tree is dropped, not after deletions). You can see many more benchmarks by building the project and running the "benchmarks" binary.

https://github.com/kirillkh/rs_teardown_tree

9 Likes

Some benchmarks:

https://raw.githubusercontent.com/kirillkh/rs_teardown_tree/master/benchmarks/full_refill_teardown_100.png

https://raw.githubusercontent.com/kirillkh/rs_teardown_tree/master/benchmarks/full_refill_teardown_1000.png

The above shows the comparison between a full refill/teardown cycle when performed using 4 methods: TeardownTree::delete_range() (baseline), Treap::delete_range(), BTreeSet::remove() and TeardownTree::delete(). Take with a grain of salt, as BTreeSet lacks a way to delete a range of items in O(k + log n) time, and the Treap implementation that I used is not optimized. If you happen to know of a solid AVL tree/Treap/... implementation to benchmark against, please let me know.