use std::collections::HashMap;
use std::time::Instant;

fn main() {
    let n = 100_000_000_u64;
    let mut v = Vec::with_capacity(n as usize);
    let mut h = HashMap::new();
    for i in 0..n {
        v.push((i, i));
        h.insert(i, i);
    }

    let start = Instant::now();
    let mut t = 0_u64;
    for i in v.iter() {
        t += i.1;
    }
    // Print the sum so the loop isn't optimized away.
    println!("vec duration: {:?} (sum: {})", start.elapsed(), t);

    let start = Instant::now();
    let mut t = 0_u64;
    for i in h.iter() {
        t += i.1;
    }
    println!("map duration: {:?} (sum: {})", start.elapsed(), t);
}
Iterating through the HashMap appears to take twice as long as iterating through the Vec.
Any intuitions on why, or on whether this is a reliable rule of thumb?
I don't know if it's a reliable measurement, but it's expected that sequential iteration over a hashmap will be slower than over a vector: the map's storage is sparser, which hurts cache utilization.
Iterating over a hash table requires looking at the empty slots as well as the populated ones, because there's no other way to tell whether a slot contains data. I don't know what load factor HashMap targets, but 50% isn't unreasonable. If that's the case, you're just seeing the effect of the hash map inspecting roughly twice as many slots as the vector.
Hashbrown uses a 7/8 load factor, and the raw bucket count must also be a power of two.
There's also a good chance that the Vec iteration can be auto-vectorized (SIMD), while that's harder for map buckets, since they're only conditionally filled.
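To make the load-factor point concrete, here's a small sketch. It assumes std's hashbrown-based HashMap, where `capacity()` reports how many entries fit before reallocation (7/8 of the bucket count); the exact numbers are an implementation detail and may vary between versions:

```rust
use std::collections::HashMap;

fn main() {
    let mut h = HashMap::new();
    for i in 0_u64..100 {
        h.insert(i, i);
    }
    // len() is the number of live entries; capacity() is how many entries
    // fit before the next reallocation. The backing bucket array is larger
    // still: a power of two, filled to at most 7/8. For example, with 128
    // buckets at a 7/8 load factor, capacity() would report 112 — so
    // iterating yields 100 entries but has to scan all 128 slots.
    println!("len = {}, capacity = {}", h.len(), h.capacity());
}
```

The Vec, by contrast, scans exactly `len()` contiguous elements, which is the cache-friendly best case.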
It stores extra information to allow iteration in insertion order. Also, unless it uses hashbrown internally, it's quite likely to be slower at normal hashmap operations.
indexmap does use hashbrown as its raw table implementation nowadays. It is generally a bit slower due to the extra metadata it keeps track of, but it's always been competitive in performance (just not necessarily best-in-class).