I have recently been building a Fibonacci heap in Rust and was surprised at how performant std's BinaryHeap is. The documentation for its push method states:
> The expected cost of push, averaged over every possible ordering of the elements being pushed, and over a sufficiently large number of pushes, is O(1).
I always thought that a binary heap had a theoretical insert time complexity of O(log n), so I was wondering if I could get an explanation as to how std's implementation can claim O(1)?
It seems to be the distinction between randomized average complexity vs worst-case complexity. Wikipedia does claim the exact same bounds – I don't think std is doing anything special here.
Simple algorithms often outperform complicated ones on real machines, since the sizes involved are rarely big enough to overcome constant factors.
BinaryHeap is a simple wrapper around a Vec, so there's no fancy pointer chasing or node allocation going on, which helps a bunch.
As for the amortized O(1), it's clearly O(1) to insert an element that's smaller than the element above it in the heap, since there's no sift-up needed. And half the elements are in the leaves of a binary tree, so for sufficient randomness half the elements will hit that "no sift-up" condition.
Then I forget the exact details, but it probably boils down to a geometric series: half the elements need one operation, half of those need one more, and so on. That series has a finite sum, meaning O(1) on average.
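To make the "no sift-up needed" fast path concrete, here's a minimal sketch of a Vec-backed max-heap push (the names `sift_up` and `push` here are my own illustration, not std's internal code). When the new element is no larger than its parent, the loop exits immediately with zero swaps:

```rust
// Sift the element at `pos` up toward the root of a Vec-backed max-heap.
// Returns the number of swaps performed (0 in the common fast-path case).
fn sift_up(heap: &mut Vec<i32>, mut pos: usize) -> usize {
    let mut swaps = 0;
    while pos > 0 {
        let parent = (pos - 1) / 2;
        if heap[pos] <= heap[parent] {
            break; // already in place: this is the O(1) fast path
        }
        heap.swap(pos, parent);
        pos = parent;
        swaps += 1;
    }
    swaps
}

// Push appends to the end of the Vec, then sifts up.
fn push(heap: &mut Vec<i32>, value: i32) -> usize {
    heap.push(value);
    let last = heap.len() - 1;
    sift_up(heap, last)
}
```

Since half the slots in a complete binary tree are leaves, a random new element lands in the `swaps == 0` case about half the time, a quarter of the time it needs one swap, and so on — the geometric series above.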
As an aside, don't forget that heapify is O(n), so you can get this more reliably if you can do it all at once.
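For example, `BinaryHeap::from` does the O(n) heapify in one shot, and produces the same heap contents as n individual pushes:

```rust
use std::collections::BinaryHeap;

// Building from a full Vec uses the O(n) heapify.
fn heap_sorted(data: Vec<i32>) -> Vec<i32> {
    let heap = BinaryHeap::from(data); // O(n)
    heap.into_sorted_vec() // ascending order
}

// Pushing one at a time: O(1) expected per push, O(log n) worst case.
fn heap_sorted_incremental(data: Vec<i32>) -> Vec<i32> {
    let mut heap = BinaryHeap::new();
    for x in data {
        heap.push(x);
    }
    heap.into_sorted_vec()
}
```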
And you might be interested in On Modern Hardware the Min-Max Heap beats a Binary Heap | Probably Dance
The intuition is this:
- Insertion is implemented by adding to the end of the tree and propagating up the tree. However, the tree is naturally bottom-heavy since half of all elements in a complete binary tree are leaves. On average, they don't move very far up the tree.
- Contrast this with removal which must be O(log(n)) average -- otherwise, we would have O(n)-average comparison-based sorting using the heap. The O(log(n)) happens because the removal process removes from the root, moving the last element there and propagating it down the tree, typically traversing O(height) nodes since the tree is bottom-heavy.
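The sorting argument in the second bullet can be seen directly: repeatedly popping a heap is a comparison sort, so if pop averaged better than O(log n), the whole loop would beat the O(n log n) comparison-sort lower bound. A small sketch:

```rust
use std::collections::BinaryHeap;

// Heapsort via BinaryHeap. If pop were O(1) on average, this would be an
// O(n)-average comparison sort, which is impossible -- so pop must
// average O(log n).
fn heap_sort_desc(data: Vec<i32>) -> Vec<i32> {
    let mut heap = BinaryHeap::from(data); // O(n) heapify
    let mut out = Vec::with_capacity(heap.len());
    while let Some(max) = heap.pop() { // each pop sifts down: O(log n)
        out.push(max);
    }
    out // descending order
}
```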
Actually, you know what, StackOverflow has this covered in far more detail.
Thanks, the link does a good job of explaining the probability aspects.
Coupled with the array structure I can see how a push comes out O(1) on average.
I think I may have been conflating the worst case with the average push time.
Yeah, the precise thing being reported is very important here -- it's like how Vec::push is amortized O(1), but worst-case O(n).
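You can watch that amortization happen by counting how often the Vec's capacity changes (the exact growth factors are an implementation detail, but growth is geometric, so reallocations are rare):

```rust
// Count how many pushes out of `n` triggered a reallocation.
// Most pushes are O(1) appends into spare capacity; the occasional
// push pays O(n) to reallocate and copy, which amortizes to O(1).
fn count_reallocations(n: usize) -> usize {
    let mut v: Vec<usize> = Vec::new();
    let mut reallocs = 0;
    let mut cap = v.capacity();
    for i in 0..n {
        v.push(i);
        if v.capacity() != cap {
            reallocs += 1;
            cap = v.capacity();
        }
    }
    reallocs
}
```

For 1000 pushes you should see on the order of 10 reallocations, not 1000.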