Simple algorithms often outperform complicated ones on real machines, since the sizes involved are rarely big enough to overcome constant factors. BinaryHeap is a simple wrapper around a Vec, so there's no fancy pointer chasing or node allocation going on, which helps a bunch.
As for the amortized O(1), it's clearly O(1) to insert an element that's smaller than its parent in the heap, since there's no sift-up needed. And half the elements of a complete binary tree are in the leaves, so for sufficiently random input, about half of all inserts will hit that "no sift-up" condition immediately.
I forget the exact details, but it boils down to a geometric series: half the elements need one sift-up step, half of those need one more, etc., making a geometric series with a finite sum, meaning O(1) amortized.
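You can sanity-check that series numerically. Under the assumption above (half of inserts stop after 0 steps, a quarter after 1 step, an eighth after 2 steps, and so on), the expected sift-up step count per insert converges to a constant:

```rust
fn main() {
    // E[steps] = sum over k >= 1 of k / 2^(k+1), which converges to 1.
    // (P(at least k steps) = 1/2^k, so the expectation is 1/2 + 1/4 + ... = 1.)
    let expected: f64 = (1..64).map(|k| k as f64 / 2f64.powi(k + 1)).sum();
    assert!((expected - 1.0).abs() < 1e-9);
    println!("expected sift-up steps per insert ~ {expected}");
}
```

One step on average, regardless of n, which is where the amortized O(1) comes from.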
As an aside, don't forget that heapify is O(n), so you can get this more reliably if you can do it all at once.
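In Rust this is as simple as converting the Vec directly; `BinaryHeap` implements `From<Vec<T>>` and heapifies in place in O(n):

```rust
use std::collections::BinaryHeap;

fn main() {
    let data = vec![3, 1, 4, 1, 5, 9, 2, 6];
    // BinaryHeap::from(Vec<T>) heapifies in O(n), versus O(n log n)
    // worst case for pushing the elements one at a time.
    let mut heap = BinaryHeap::from(data);
    assert_eq!(heap.pop(), Some(9)); // max-heap: largest element comes out first
}
```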
Insertion is implemented by adding to the end of the tree and propagating up the tree. However, the tree is naturally bottom-heavy since half of all elements in a complete binary tree are leaves. On average, they don't move very far up the tree.
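A minimal sketch of that textbook insertion (the real `BinaryHeap` uses the same idea, plus a "hole" optimization to avoid repeated swaps):

```rust
// Hand-rolled max-heap push: append at the end, then sift up.
fn push(heap: &mut Vec<i32>, value: i32) {
    heap.push(value);
    let mut i = heap.len() - 1;
    // Swap with the parent while the new element is larger.
    while i > 0 {
        let parent = (i - 1) / 2;
        if heap[i] <= heap[parent] {
            break; // heap property restored -- often immediately, per the argument above
        }
        heap.swap(i, parent);
        i = parent;
    }
}

fn main() {
    let mut heap = Vec::new();
    for v in [3, 9, 2, 7, 1] {
        push(&mut heap, v);
    }
    assert_eq!(heap[0], 9); // the maximum sits at the root
}
```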
Contrast this with removal, which must be O(log n) on average -- otherwise we would have an O(n)-average comparison-based sort using the heap. The O(log n) happens because removal takes the element at the root, moves the last element into its place, and propagates it down the tree, typically traversing O(height) nodes since the tree is bottom-heavy.
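The sorting connection is easy to see in code: push everything, then pop everything, and the n pops at O(log n) each are exactly what makes this an O(n log n) comparison sort.

```rust
use std::collections::BinaryHeap;

fn main() {
    let mut heap: BinaryHeap<i32> = vec![5, 1, 4, 2, 3].into_iter().collect();
    // Each pop removes the root and sifts the last leaf down O(log n) levels,
    // so draining the whole heap is O(n log n) -- a comparison sort (heapsort).
    let mut sorted = Vec::new();
    while let Some(x) = heap.pop() {
        sorted.push(x);
    }
    assert_eq!(sorted, vec![5, 4, 3, 2, 1]); // max-heap pops in descending order
}
```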