use std::collections::HashMap;
use std::time::Instant;

fn main() {
    let n = 100_000_000_u64;
    let mut v = Vec::with_capacity(n as usize);
    let mut h = HashMap::new();
    for i in 0..n {
        v.push((i, i));
        h.insert(i, i);
    }

    let start = Instant::now();
    let mut t = 0_u64;
    for i in v.iter() {
        t += i.1;
    }
    // Print the sum so the loop isn't optimized away.
    println!("vec duration: {:?} (sum: {})", start.elapsed(), t);

    let start = Instant::now();
    let mut t = 0_u64;
    for i in h.iter() {
        t += i.1;
    }
    println!("map duration: {:?} (sum: {})", start.elapsed(), t);
}
Iterating through the HashMap appears to take twice as long as iterating through the Vec.
Any intuitions on why, or on whether this is a reliable rule of thumb?
I don't know if it's a reliable measurement, but it's expected that sequential iteration over a hashmap will be slower than over a vector: the map's storage is sparser, which hurts cache utilization.
Iterating over a hash table requires looking at the empty slots as well as the populated ones, because there's no other way to tell whether a slot contains data. I don't know what load factor HashMap targets, but 50% isn't unreasonable. If that's the case, you're just seeing the effect of the hash map inspecting roughly twice as many slots as the vector.
Hashbrown uses a 7/8 load factor, and the raw bucket count must also be a power of two.
There's also a good chance that the Vec iteration can be auto-vectorized (SIMD), while that's harder for map buckets, since they're only conditionally filled.
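To make the load-factor point concrete, here's a small sketch. It assumes std's hashbrown-based HashMap, where `capacity()` reports how many entries fit before reallocation (7/8 of the bucket count); the exact numbers are an implementation detail and may vary between versions:

```rust
use std::collections::HashMap;

fn main() {
    let mut h = HashMap::new();
    for i in 0_u64..100 {
        h.insert(i, i);
    }
    // len() is the number of live entries; capacity() is how many entries
    // fit before the next reallocation. The backing bucket array is larger
    // still: a power of two, filled to at most 7/8. For example, with 128
    // buckets at a 7/8 load factor, capacity() would report 112 — so
    // iterating yields 100 entries but has to scan all 128 slots.
    println!("len = {}, capacity = {}", h.len(), h.capacity());
}
```

The Vec, by contrast, scans exactly `len()` contiguous elements, which is the cache-friendly best case.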
It stores extra information to allow iteration in insertion order. Also, unless it uses hashbrown internally, it's quite likely to be slower at normal hashmap operations.
indexmap does use hashbrown as its raw table implementation nowadays. It is generally a bit slower due to the extra metadata it keeps track of, but it's always been competitive in performance (just not necessarily best-in-class).