How can I learn to use iterators effectively

I have started doing the Rust track on Exercism, but I am writing in a very "C++-like" style. Especially when I need to iterate over a list of stuff, I only get to think in for and while loops, not all the iterator adaptors. Where and how can I get to learn how to use iterators instead? Here is a working program, but I want to know if I could do it more idiomatically and efficiently. It is summing all the unique multiples of a list of numbers below a certain limit (exclusive), e.g. all the multiples of 3 and 5 below 20 is 3, 5, 6, 9, 10, 12, 15, 18 and their sum is 78.

pub fn sum_of_multiples(limit: u32, factors: &[u32]) -> u32 {
    let mut multiples = vec![0];

    for i in factors.iter() {
        if *i != 0 {
            for j in (*i..limit).step_by(*i as usize) {
                if !(multiples.contains(&j)) {
                    multiples.push(j);
                }
            }
        }
    }
    multiples.iter().sum()
}

My mother tongue is Fortran, so I had the same problem with functional style. My strategy was to write the loop version and then translate it to functional. At first, those translations took an inordinate amount of time. After about 6 months, the loop versions started looking odd. Now when I write an explicit loop I feel compelled to add a comment explaining why I did it.

3 Likes

But how though? My problem is that I start trying to translate the loop into iterators, but I hit a wall, especially with the code I posted above. I simply don't know how to translate it into using iterators.

That's where the inordinate amount of time comes in. Start with the documentation for std::iter::Iterator and try various methods. In your case, map() looks interesting, but flat_map() or flatten() might be useful. Since you have an if test, perhaps you can use filter(). Another method that looks relevant is fold().

The point is to use trial and error until you get something that works. Then go back and see if there's a better way to do it. Eventually, you won't even have to think about it; you'll just know the solution.

2 Likes

I would say that this use of loops is reasonable. But if you want to go full iterator as an exercise:

  • You can use flat_map to produce an Iterator of all j values that would ever appear.
  • The next bit is not so straight-forward because it involves state. But Itertools has adaptors for many common patterns that require state; in this case, it has Itertools::unique.

More generally, any loop involving state can be rewritten using .fold(), but I wouldn't say the result is always better than the loop.


Personally, here's how I might write it without depending on Itertools, by using filter with a predicate that keeps mutable state:

pub fn sum_of_multiples(limit: u32, factors: &[u32]) -> u32 {
    factors.iter().cloned()
        .filter(|&i| i != 0)
        .flat_map(|i| (i..limit).step_by(i as usize))
        .filter({
            let mut seen = HashMap::new();
            move |&m| seen.insert(m)
        })
        .sum()
}

(actually, that's a lie; I'd write it almost exactly as you have written it!)

4 Likes

Thanks for the replies! It seems I'll be spending an inordinate amount of time then. :wink: And yes, I think this specific case is not too bad using loops, but I still never know if I am just ignorant of all the tools in the Iterator trait or if loops is actually the cleanest way to go.

Of course I can't help but notice that very unefficient contains call. I would recommend not checking for duplicates, then calling sort_unstable and dedup on the vector. This would also make it easier to turn the loop into iterators.

let mut factors = factors.iter().cloned().filter(|i| i>0).flat_map(|i| (i..limit).step_by(i as usize)).collect();
factors.sort_unstable();
factors.dedup();
factors.into_iter().sum()
2 Likes

I agree with your solution, just one minor improvement. I would collect() into a HashSet so sorting and deduppling are no longer required.

2 Likes

My suggestion: try writing with some FP language, idaelly in Haskell, but even JS + rambda would do the job. It's not like Rust is functional, but this Iterator concept is at least for me very functionalish, and when you would use to this mindset, your Iterator chains would be natural.

1 Like

Thanks to all who replied! It is much appreciated! I have done the following after your suggestions, and I think it is very clean and readable:

use std::collections::HashSet;

pub fn sum_of_multiples(limit: u32, factors: &[u32]) -> u32 {
    factors
        .iter()
        .filter(|&&x| x != 0)
        .flat_map(|&f| (f..limit).step_by(f as usize))
        .collect::<HashSet<_>>()
        .iter()
        .sum()
}
1 Like

After an inordinate amount of time ( :wink: ) I converted the commented out part to an iterator-based solution:

pub fn build_proverb(list: &[&str]) -> String {
    /*let mut result = String::new();
    if list.len() > 0 {
        for i in 1..list.len() {
            result = format!("{}For want of a {} the {} was lost.\n", result, list[i-1], list[i]);
        }
        result = format!("{}And all for the want of a {}.", result, list[0]);
    }
    result*/
    if list.len() > 0 {
        list
            .iter()
            .zip(list[1..].iter())
            .map(|(a, b)| format!("For want of a {} the {} was lost.", a, b))
            .collect::<Vec<_>>()
            .iter()
            .chain(vec![format!("And all for the want of a {}.", list[0])].iter())
            .cloned()
            .collect::<Vec<_>>()
            .join("\n")
    } else {
        String::new()
    }
}

Now I am wondering: which is most efficient? I am thinking the loop-based one, but I would like to know what do you think?

The iterator-based one does far less work joining strings. result = format!("{}...stuff", result) does quadratic work (you could have possibly used push_str to greater effect). Meanwhile, join copies each byte exactly once. (it even precomputes the exact length of the final string, so that no intermediate reallocations are necessary; this is why it requires a slice of strings)

I would replace .cloned() with .map(|x| &x[..]), to build a Vec<&str>.

Now I am wondering: which is most efficient? I am thinking the loop-based one, but I would like to know what do you think?

There's no use guessing. Benchmark it! (try criterion or the bencher crate)

1 Like

For speed in loop replace format! with write! (+ minor changes)

Quite a bit of redundancy; can be replaced with just

.chain(std::iter::once(format!("And all for the want of a {}.", list[0])))
1 Like

You can think to form your intuition, but the real answer is... measure it!

3 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.