I have recently been building a Fibonacci heap in Rust and was surprised at how performant std's BinaryHeap is.
Especially the push method, which states:
Time complexity
The expected cost of push, averaged over every possible ordering of the elements being pushed, and over a sufficiently large number of pushes, is O(1)
I always thought that a binary heap had a theoretical insert time complexity of O(log n), so I was wondering if I could get an explanation as to how std's implementation can claim O(1)?
It seems to be the distinction between randomized average complexity vs worst-case complexity. Wikipedia does claim the exact same bounds â I don't think std is doing anything special here.
Simple algorithms often outperform complicated ones on real machines, since the sizes involved are rarely big enough to overcome constant factors. BinaryHeap is a simple wrapper on a Vec, so there's no fancy pointer chasing or node allocating going on, which helps a bunch.
As for the amortized O(1), it's clearly O(1) to insert an element that's smaller than the element above it in the heap, since there's no sift-up needed. And half the elements are in the leaves of a binary tree, so for sufficient randomness half the elements will hit that "no sift-up" condition.
Then I forget the exact details, but it probably boils down to the "pie series" [1]. Half the elements need the one operation, Half of that need one more, etc, making a geometric series of finite sum, meaning O(1).
As an aside, don't forget that heapify is O(n), so you can get this more reliably if you can do it all at once.
Insertion is implemented by adding to the end of the tree and propagating up the tree. However, the tree is naturally bottom-heavy since half of all elements in a complete binary tree are leaves. On average, they don't move very far up the tree.
Contrast this with removal which must be O(log(n)) average -- otherwise, we would have O(n)-average comparison-based sorting using the heap. The O(log(n)) happens because the removal process removes from the root, moving the last element there and propagating it down the tree, typically traversing O(height) nodes since the tree is bottom-heavy.