I was kinda expecting to find this in the documentation, but it's not there. I'm particularly interested in the worst-case complexity (amortized/average is also interesting, but I assume that's going to be O(1)). In general, is there a go-to place to find this kind of information (besides reading the source)?
My impression is that Rust currently doesn't try to pin down algorithmic complexities super precisely. That is, I expect that, in theory, Rust should be able to change the hash table implementation to have O(1) worst-case insert (doing lazy resizing, a la Go) or contains (cuckoo hashing) without breaking any guarantees. So, O(1)-ish is probably the best you can really depend on.
That said, I am surprised that even this is not documented. The docs seem to assume that the reader knows how hash maps work in general. I think sending a PR to improve the docs in this area would be welcome. For example, BTreeMap does a better job of explaining its current performance characteristics.
As for the current implementation, I think it depends on hash-function quality (it's possible to DoS a hash map if a bad hash function is used). This means that operations can be O(N) worst-case. And yeah, reading the source is probably your best bet here.
The worst case should be O(n) if every key has the same hash. However, it should be noted that by default the stdlib's HashSet uses a hasher that provides resistance against HashDoS attacks, so you should be safe.
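Concretely, the stdlib's default hasher is driven by `RandomState`, which is seeded randomly per instance, so an attacker can't precompute colliding keys. A small sketch showing that two independently created states hash the same key differently, while a single state stays deterministic:

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};

// Hash a value with a given RandomState (the stdlib's default
// BuildHasher for HashMap/HashSet).
fn hash_with(state: &RandomState, value: &str) -> u64 {
    let mut h = state.build_hasher();
    value.hash(&mut h);
    h.finish()
}

fn main() {
    let s1 = RandomState::new();
    let s2 = RandomState::new();

    // Within one state, hashing is deterministic...
    assert_eq!(hash_with(&s1, "key"), hash_with(&s1, "key"));

    // ...but two states are seeded independently, so the same key
    // (almost certainly) hashes to different values under each.
    // This randomness is what makes collision sets unpredictable.
    println!("{} vs {}", hash_with(&s1, "key"), hash_with(&s2, "key"));
}
```

So the O(n)-per-key worst case still exists in principle, but an attacker can't reliably trigger it without knowing the random seed.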
This is actually present in the documentation of the collections module, in particular here.
Ah, I was wrong, nice!
Thanks! I couldn't find that part (I was looking at the specific page for HashSet I think).
This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.