Are iterators even efficient?

I always thought that iterators in Rust are quite optimized and in most cases they outprform manual loops. But it seems that I wrong. At least if measuring in lines of resulting ASM code:

  1. First version - 1000 lines: https://godbolt.org/z/Wpjyfd
  2. Second version - replaced a..=b with a..b+1 - 800 lines: https://godbolt.org/z/htXHMF
  3. Replace collect with mutable vector and extend - 500 lines: https://godbolt.org/z/v-W7Iu
  4. Completely manual iterator - 500 lines: https://godbolt.org/z/-orLjK

I didn't benchmark it, although I think that manual loops are of the same performance as version #3.

What are good practices for that? I have chosen rust to write a clean code instead of manual loops but I'm returning to where I started, writing unnecessary stuff to make it performant.

Or there is a way to write in in a nice and clean way without paying for "zero" cost abstractions?

Naturally the optimizer can not optimize every case perfectly, but it is still rather good at it. There are some things that it is known that the optimizer is not too happy with, e.g.:

  1. Sometimes a .. b+1 is better optimized than a ..= b. Note that these are not equivalent as only the latter works correctly for b = INT_MAX.
  2. The chain combinator works rather poorly in for loops compared to a for_each call. The for_each impl for chain directly gives the compiler two consecutive loops instead of one loop with a weird chain condition.

In this case I'd also like to mention that you should not be surprised if 4. is the most performant as it's the only one with a tight size_hint. This is usually combatted by a Vec::with_capacity followed by an extend.

Additionally note that instruction count is definitely not the same as performance.

7 Likes

I just noticed that when going from 2. to 3. you did more than just replace collect with extend — you also replaced a flat_map with an ordinary for loop.

This likely had some impact similar to what happens with chain, because both collect and extend ultimately end up here, where the items are added by what is essentially a for loop.

Essentially you may have ended up in a situation where every iteration in the inner loop has to go through the outer loop (since it's wrapped in the iterator of the outer loop), which could add extra cost compared to the two nested loops you have in 3 and 4.

I recommend trying out for_each instead of the for loop, since this allows the generic implementations of for_each to format the loops as it wishes, instead of being forced into being a single loop.

Could you be more specific about for_each method? I intentionally avoid it since it violates "statements for effects, expressions for values" principle I'm following.

There are performance considerations against regular loops?

Unfortunately yes. The clearest example is the chain iterator, so I will use that as an example.

For loop: When using a chain iterator in a for loop, you're repeatedly calling next. This means that the code will have to check if you're in the first or second half for every iteration.

for_each: When using for_each, the custom implementation of for_each for the chain iterator will simply generate two loops, one after the other, so there is no need to check which half you're in for every iteration.

To be precise, for_each goes through the fold method, which is defined here. As you can see it, simply calls fold twice.

If you're lucky, the optimizer can fix it, but it's not very good at handling this case.

2 Likes

This principle is valuable in a lot of languages, but won't serve you terribly well in Rust; the for_each case @alice is explaining is only one of many such situations.

5 Likes

Hmm, that's interesting. But I don't quite get it: it only happens to chain, isn't it? I mean I didn't use chain anywhere and I doubt flat_map uses it internally.

Why did you reference it? Just as an example?

The flat_map combinator has a similar problem, it's just easier to explain for chain.

Basically the issue is that with flat_map, you are creating a nested loop, but when using it in a for loop, it has to look at every layer of loop at every iteration.

This is becaus in Rust almost every statement is also a value.

But in the latter case I only iterate a slice. Should i use for_each in this case as well?

I miss the guide when I have to use one or another. Or should I just avoid for anywhere?

If you use combinators that change the control flow, use for_each, otherwise for loops are fine. This includes flat_map, flatten, chain, and a few others.

When using for_each with a flat_map, it corresponds to manually writing nested loops, and in the case of the slice, you actually did one level of this optimization manually, so the slice for loop is better than the flat_map.

A very good question, and one that has been troubling me as a newbie to Rust.

As an old school, long time, programmer in procedural languages like C and many others I hardy know what an iterator is. All that map, fold, zip, zip, functional programming style is nothing I even recognize as software. My first Rust programs had proper loops and array and indexing spelled out as they should be.

Under advice from "cargo clippy" and the good folks here I started to replace all that with iterators and the rest.

So far my conclusion is:

  1. The move to the iterator/functional style made no noticeable difference to the run time of my programs.

  2. Except, using 'enumerate' was a killer of performance in at least one case.

  3. I'm still not convinced any of that functional style makes what I'm trying to express any clearer. The cost of the "zero cost" abstraction is that nobody can understand what you wrote.

All in all, what you get out of a compiler now a days is very unpredictable. Including C++. If performance is key then you just have to measure it.

4 Likes

I think it's something you need to mentor people to use to. I mean iterator are much easier to understand as you can see data flow, and with loops doing anything it's not that easy. Although it's something contradicting with how schools gives programming to people.

I always prefer data flow style but in exactly this case I needed a good performance for reasons.
Thanks for answering.

Except, not for me apparently. I do not see the data flow. I even wonder where the data is. What type is it and what do all those words mean anyway?

Often times I don't want my data flowing anywhere. I want it to stay just where it is, where I can see it, whilst I bash it into shape with a mutation hammer.

This article is a bit older, now, but it might still be of interest to you.

4 Likes

That's quite an overstatement. Just because you don't recognize the iterator-based style, doesn't mean that nobody else does.

Moderator note: Let's focus on the performance issues raised in the original topic here. Style preference around iterator chaining vs. imperative control flow is a perpetual debate that is already being discussed in several recent threads; we don't need to do it again here.

10 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.