Composing iterators and number of iterations

Hello everyone,
I have a question regarding iterators. I know that iterators are lazy, but are they also composable the way that Elixir Streams are?
For example, will the following code lead to two iterators

vals.map(|x| x * 2).map(|x| x + 1);

or will it be transformed to something like this:

vals.map(|x| (x * 2) + 1);

If yes, are there any guarantees around this?

Thanks!

At the point where your code is being optimized, there are not “two iterators” or one iterator. There are just functions (that we know as implementations of Iterator::next()) and data that they operate on. Unlike garbage-collected languages where there is often a concept of “object identity” that has to be preserved through run time, in Rust, there is no additional cost to “another object” beyond the data it carries and the code in its functions.

So, yes, it is very likely that the former code will be transformed to be equivalent to the latter code. But no, that transformation does not work on a principle that is anything like 'reducing the number of iterators'.

The standard library does sometimes have special cases for things like specific iterator combinations; for example, some_vec.into_iter().map(|x| x + 1).collect::<Vec<i32>>() will reuse the memory of the input vector rather than unnecessarily discarding it and allocating new memory. But most of the performance that the compiler gets you does not come from manipulating high-level entities like that, but from breaking everything down to individual operations and rewriting those into equivalent but more efficient forms.

5 Likes

And note that you can always use something like Matt Godbolt's Compiler Explorer to take a look at what codegen actually does.

For example, Compiler Explorer has your two example iterators - but t2 has compiled down into .set example::t2, example::t1 - saying that the compiler has determined that they're identical, and it need not output two sets of code.

2 Likes

Incidentally, if you compile this particular example using the default optimization settings and examine the output, you will see that Rust compiles each map into a closure.

The assembly code for 2 maps:

.LBB113_9:
        mov     rdi, qword ptr [rsp + 88]
        call    _Unwind_Resume@PLT
        ud2

example::f_2maps::{{closure}}:
        push    rax
        mov     qword ptr [rsp], rsi
        mov     rax, rdi
        mov     rdi, qword ptr [rsp]
        mov     esi, 2
        call    <&i32 as core::ops::arith::Mul<i32>>::mul
        pop     rcx
        ret

example::f_2maps::{{closure}}:
        push    rax
        inc     esi
        mov     dword ptr [rsp + 4], esi
        seto    al
        test    al, 1
        jne     .LBB115_2
        mov     eax, dword ptr [rsp + 4]
        pop     rcx
        ret

versus 1 map:

.LBB111_8:
        mov     rdi, qword ptr [rsp + 72]
        call    _Unwind_Resume@PLT
        ud2

example::f_1map::{{closure}}:
        sub     rsp, 24
        mov     qword ptr [rsp + 8], rsi
        mov     rax, rdi
        mov     rdi, qword ptr [rsp + 8]
        mov     esi, 2
        call    <&i32 as core::ops::arith::Mul<i32>>::mul
        inc     eax
        mov     dword ptr [rsp + 20], eax
        seto    al
        test    al, 1
        jne     .LBB112_2
        mov     eax, dword ptr [rsp + 20]
        add     rsp, 24
        ret

So the compiler hasn't performed any magic to merge the two closures into one, and therefore f_1map is very marginally more efficient than f_2maps.

If you're interested in what the compiler does with closures, it's also worth checking out the Rust Compiler Development Guide.

1 Like

Keep in mind that the codegen is completely different with -C opt-level=3. LLVM is smart enough to recognize that both functions reduce to the same exact machine code, so one copy is eliminated entirely. Compiler explorer

Change f_2maps just slightly, like making the addition + 2, and suddenly it will be output as its own function.

3 Likes

Interesting, I completely underestimated LLVM, so didn't even bother changing the optimization level. :neutral_face:

1 Like

In general, if you're looking at Rust codegen, you need (at least) -O in your rustc command line - by default, there's very little optimization done, and you need to ask for optimizations if you want to see what "real" output looks like.

This is also why you're recommended to only deploy the result of a release build - the difference is significant.

4 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.