Iterating through HashMap approx twice the time of Vec?

    let mut v = vec![];
    let mut h = HashMap::new();

    let n = 1_00_000_000;

    for i in 0_u64..n {
        v.push((i, i));
        h.insert(i, i);}

    let start = std::time::Instant::now();
    let mut t = 0_u64;
    for i in v.iter() {
        t = t + i.1;}
    let end = std::time::Instant::now();
    println!("duration: {:?}", end - start);

    let start = std::time::Instant::now();
    let mut t = 0_u64;
    for i in h.iter() {
        t = t + i.1;}
    let end = std::time::Instant::now();
    println!("duration: {:?}", end - start);

Running through the hashmap appears to take twice as long as running through the Vec.

Any intuitions on why / whether this is reliable rule of thumb ?

I don't know if it's reliable measurement, but it's expected that sequential iteration on hashmap will be slower than on vector because the map is more sparse, hurting the cache.

2 Likes

Iterating over a hashtable requires looking at the empty slots as well as the populated ones, because there’s no other way to tell whether the slot contains data. I don’t know what load factor HashMap targets, but 50% isn’t unreasonable. If that’s the case, you’re just seeing the effects of the hash map inspecting twice as many slots as the vector.

2 Likes

Maximum 7/8 load, capacity is a power of 2. All of this is considered implementation detail AFAIK, not a guarantee.

3 Likes

Hashbrown uses 7/8 load factor, and the raw capacity must also be a power of 2.

There's a good chance that the Vec iterator can be auto-vectorized (SIMD) as well, while that's harder on map buckets when they're conditionally filled.

6 Likes

You may also consider using indexmap for an hashmap that's as fast to iterate as a Vec.

@SkiFire13 : Interesting. What is the tradeoff / downside of indexmap compared to the existing HashMap (swiss table design?) ?

It stores extra information to allow iteration in insertion order. Also, unless it uses hashbrown internally, it's quite likely to be slower at normal hashmap operations.

1 Like

indexmap does use hashbrown as its raw table implementation nowadays. It is generally a bit slower due to the extra metadata it keeps track of, but it's always been competitive in performance (just not necessarily best-in-class).

3 Likes

Yeah, reading how its implemented now it seems like it is a pretty fast implementation.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.