Another for loop vs iterator performance question

When I use these different solutions on leetcode the "for loop" version is 4ms and the "iterator" version is 44ms fairly consistently. I know the pseudo-benchmark I'm using on the playground is not equivalent, but it consistently seems to give a faster result with the "iterator" version. Can anyone explain why one might be faster or slower than the other?

use std::collections::HashMap;

fn main() {
    let nums = vec![1, 2, 3, 4, 5];
    let k = 1i32;
    let start1 = time::precise_time_s();
    for _ in 0..100000 {
        find_pairs(nums.clone(), k);
    }
    let end1 = time::precise_time_s();
    for _ in 0..100000 {
        iter_find_pairs(nums.clone(), k);
    }
    let end2 = time::precise_time_s();
    println!("for loop time: {}", end1 - start1);
    println!("iterator approach time: {}", end2 - end1);
}

pub fn find_pairs(nums: Vec<i32>, k: i32) -> i32 {
    if nums.is_empty() || k < 0 {
        return 0;
    }
    let mut tally = HashMap::with_capacity(nums.len());
    for num in nums {
        *tally.entry(num).or_insert(0) += 1;
    }
    let mut count = 0;
    for (num, _val) in tally.iter() {
        if let Some(paired_num) = tally.get(&(num - k)) {
            if k > 0 {
                count += 1
            } else if k == 0 && *paired_num > 1 {
                count += 1
            }
        }
    }

    count
}

pub fn iter_find_pairs(nums: Vec<i32>, k: i32) -> i32 {
    let mut tally = HashMap::with_capacity(nums.len());
    for num in nums.iter() {
        *tally.entry(num).or_insert(0) += 1;
    }
    match k {
        k if k < 0 => 0,
        k if k == 0 => tally.iter().filter(|(_, val)| **val > 1).count() as i32,
        _ => tally
            .iter()
            .filter(|(x, _)| nums.contains(&(**x - k)))
            .count() as i32,
    }
}

(Playground)

Output:

for loop time: 1.5166003638878465
iterator approach time: 1.0781516963616014

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 1.02s
     Running `target/debug/playground`

I think there's a few things to be mentioned here:
  • Your use of a HashMap makes things abit weird. Iterating over a HashMap will get your values, while your get(&(num - k)) requires hashing it.
  • This can be wildly optimized
  • ... especially if release mode is enabled, but the pattern still shows.

    Your HashMap makes things difficult to actually time because an iterator is essentially
let internal_HashMap = vec![(1, 1), (2, 1), (3, 1), (4, 1)];
for i in internal_HashMap {}

while the other requires something like this

let internal_HashMap = ///
let val_to_compare = {
    //Hash the value we `get` on
};
for (hash, data) in internal_HashMap {
    if hash == val_to_compare {return Some(&data);}
}
None

your other methods are fast but do not do the same thing. The goal is to find complementary pairs such that the difference between them is k. you are just finding matches of k.

1 Like

Sorry, it seems I might've overlooked what the algorithm was trying to actually do :cold_sweat:
In any case, my point on how the iterator would be faster still somewhat stands, in that an iterator would essentially require iterating over a set of key/value pairs and checking equality, while a for loop would force you to hash each value and then look for it in the key/value pairs.

Make sure to benchmark the code in release mode, it makes a big difference:

   Compiling playground v0.0.1 (/playground)
    Finished release [optimized] target(s) in 1.68s
     Running `target/release/playground`
Standard Output
for loop time: 0.032227625139057636
iterator approach time: 0.029460886493325233

But as others have said, the code is quite different in the two versions, so I would start by making it more equal and then do benchmarking afterwards.

1 Like

I didn't do a good job in the original post by not providing the prompt for the problem. In essence the goal is to find number pairs with an absolute difference of k in a set of numbers, but each pair is only counted once. So for k = 2 and nums = [1,3,4,1] the count is 1 because (1,3) and (3,1) are duplicates. This is why a HashMap is used, to ensure uniqueness but still be able to find pairs for k = 0.

If I understand your point correctly, you are saying the iterator should be expected to be faster than the for loop because the hash function is applied in each iteration of the for loop. That makes sense and is consistent with what I get on the playground but contradicts the output I get from the leetcode site.

1 Like

Thanks. With regard to release vs debug mode, that shouldn't affect the relative efficiency of the different methods to one another much, should it? For instance even in your output the for loop is still slower than the iterator method. Which is in contrast to what the leetcode site returns. I'm just trying to figure out what it is about the different approaches that would make one more efficient than the other. So far the major differentiating factor seems to be HashMap::get() vs slice::contains() and the precedence for checking k == 0.

UPDATE:
I think the get() vs contains() was the biggest differentiator. Using the same match syntax and precedence for checking k == 0 did not have much impact, but using the slice::contains() vs HashMap::get() in the for loop made it slightly faster than the iterator version. Changing the iterator version to use fold() vs chaining filter().count() made no impact. I wonder if some of the test cases in the leetcode problem actually make the get() approach more efficient than the contains() one. Maybe for a very large sequence of numbers searching the HashMap is more efficient than searching the slice, but in smaller sequences the slice is faster.

Just the opposite, actually.

Iterators depend on optimizations, particularly inlining, to be fast. They're often slower in debug even when they're faster in release.

6 Likes

thanks for educating me on that point! Seems like a really important thing to be aware of.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.