How does BinaryHeap achieve O(1) push?

I have recently been building a Fibonacci heap in Rust and was surprised at how performant std's BinaryHeap is.
Especially the push method, which states:

Time complexity
The expected cost of push, averaged over every possible ordering of the elements being pushed, and over a sufficiently large number of pushes, is O(1)

I always thought that a binary heap had a theoretical insert time complexity of O(log n), so I was wondering if I could get an explanation as to how std's implementation can claim O(1)?

It seems to be the distinction between randomized average complexity vs worst-case complexity. Wikipedia does claim the exact same bounds – I don't think std is doing anything special here.

1 Like

Simple algorithms often outperform complicated ones on real machines, since the sizes involved are rarely big enough to overcome constant factors. BinaryHeap is a simple wrapper on a Vec, so there's no fancy pointer chasing or node allocating going on, which helps a bunch.

As for the amortized O(1), it's clearly O(1) to insert an element that's smaller than the element above it in the heap, since there's no sift-up needed. And half the elements are in the leaves of a binary tree, so for sufficient randomness half the elements will hit that "no sift-up" condition.

Then I forget the exact details, but it probably boils down to the "pie series" [1]. Half the elements need the one operation, Half of that need one more, etc, making a geometric series of finite sum, meaning O(1).


As an aside, don't forget that heapify is O(n), so you can get this more reliably if you can do it all at once.

And you might be interested in On Modern Hardware the Min-Max Heap beats a Binary Heap | Probably Dance


  1. if you eat half a pie, then a quarter of a pie, then an eighth of a pie, etc, in the limit you eat one pie. ↩ī¸Ž

6 Likes

The intuition is this:

  • Insertion is implemented by adding to the end of the tree and propagating up the tree. However, the tree is naturally bottom-heavy since half of all elements in a complete binary tree are leaves. On average, they don't move very far up the tree.
  • Contrast this with removal which must be O(log(n)) average -- otherwise, we would have O(n)-average comparison-based sorting using the heap. The O(log(n)) happens because the removal process removes from the root, moving the last element there and propagating it down the tree, typically traversing O(height) nodes since the tree is bottom-heavy.

Actually, you know what, StackOverflow has this covered in far more detail.

4 Likes

Thanks, the link does a good job of explaining the probability aspects.

Coupled with the array structure I can see how a push comes out O(1) on average.

I think I may have been conflating the worst case with the average push time.

Excellent read!

Yeah, the precise thing being reported is very important here -- it's like how Vec::push is amortized O(1), but worst-case O(n).