HashSet capacity is more than it was requested

Hi folks,

Today I've found that an actual capacity of a HashSet is more than it was requested via HashSet::with_capacity. Is it something expected? If yes, would you mind to explain why it is so?

use std::collections::HashSet;

fn main() {
    let set_usize: HashSet<usize> = HashSet::with_capacity(10);
    println!("set_usize capacity: {}", set_usize.capacity()); // set_usize capacity: 14
    
    let vec_usize: Vec<usize> = Vec::with_capacity(10);
    println!("vec_usize capacity: {}", vec_usize.capacity()); // vec_usize capacity: 10
}

(Playground)

Output:

set_usize capacity: 14
vec_usize capacity: 10

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.60s
     Running `target/debug/playground`

with_capacity only guarantees that you will get at least the specified capacity. I think you want a with_capacity_exact, but this doesn't exist right now. on any collection To work around this you would use reserve_exact like this,

let mut set = HashSet::<usize>::new();
set.reserve_exact(exact_capacity);

But unfortunately, it look like HashSet doesn't have reserve_exact for some reason.

According to this talk a HashMap (and HashSet) uses groups of 7 elements, so it makes sense for it to not have reserve_exact.

8 Likes

The raw capacity (buckets) is always a power of 2, so it's a simple bitmask to get the raw index from a hash value. Combining this with a maximum load factor of 7/8 means the available capacity is 7*2n, except for sizes smaller than 8.

5 Likes

Thank you guys a lot. It get much more clear for me after your hints :handshake:

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.