.collect() vs reduce()

Hi!
I was learning rust by doing the rustling exercises and I encountered an exercise where iterators were used. My solution used .reduce() to concatenate all the strings and their solution used simply .collect() which feels a bit like magic. I assume that the right implementation for the conversion is chosen to be something similar to reduce.

// my code
fn capitalize_words_string(words: &[&str]) -> String {
    words.iter().map(|x| capitalize_first(x)).reduce(| acc, e| acc + &e).unwrap()
}
// their solution
fn capitalize_words_string(words: &[&str]) -> String {
    words.iter().map(|x| capitalize_first(x)).collect()
}

I wanted to check how smart the compiler is and if the generated assembly code is the same for both versions.

I was a bit surprised to see that there are some differences in the generated x86 assembly code.
Here is the Godbolt link of the comparison: Compiler Explorer

I am no compiler expert, nor x86 expert but I was wondering if anyone knows why there is a difference in the code generated. I added opt-level 3 to both code versions.

Intuitively, to me, it would make sense that the reduce version is better optimized because I express the intent better but maybe .collect() calls a better-optimized version of my simple reduce.
I am curious what your thoughts are about this.

impl FromIterator<String> for String simply concatenates all strings. (The same goes with impl FromIterator<&str> for String).

No, why would it "express intent better"? The one based on collect is defined to concatenate as well.

4 Likes

TBF, I don't see how is reduce expressing the intent better. collect knows you are, well, collecting, and reduce have to somehow derive the idea and that's "hard".

Oh I see. It just feels a bit "magical" that collect figures out what to do based on the type. Now I understand.

For reference, here's the actual implementation:

#[cfg(not(no_global_oom_handling))]
#[stable(feature = "extend_string", since = "1.4.0")]
impl FromIterator<String> for String {
    fn from_iter<I: IntoIterator<Item = String>>(iter: I) -> String {
        let mut iterator = iter.into_iter();

        // Because we're iterating over `String`s, we can avoid at least
        // one allocation by getting the first string from the iterator
        // and appending to it all the subsequent strings.
        match iterator.next() {
            None => String::new(),
            Some(mut buf) => {
                buf.extend(iterator);
                buf
            }
        }
    }
}

So there's a slight optimization (reusing allocations) in the collect implementation.

Technically speaking same optimization applies to the code using .reduce() via this impl in std:

impl Add<&str> for String {
    type Output = String;

    fn add(mut self, other: &str) -> String {
        self.push_str(other);
        self
    }
}

As a result each addition preserves the allocation of self during the whole loop.

1 Like

Yours only works if there is at least one element. You can make them the same by using unwrap_or_default instead of unwrap. Other than that, they're essentially equal. But collect is a very popular method, so it'll be worthwhile to understand how it works.

In the general case collect can make use of an iterator's size_hint and preallocate space in the collection for the expected number of items, which can be more efficient if the iterator has some knowledge of its size. Whereas methods like fold and reduce are too broad to tell a collection that information at init-time.

1 Like

The core thing is that's happening here is that it's generally a bad idea to thread a value through a fold (which is what reduce is under the covers) when you don't actually need move access to it. Sometimes the optimizer can figure that out, but not always.

The collect version is doing something closer to

fn capitalize_words_string(words: &[&str]) -> String {
    let mut full = String::new();
    words.iter().map(|x| capitalize_first(x)).for_each(|e| full += &e);
    full
}

with a stateful closure, and which therefore isn't threading the temporary String into and out of the closure again and again.

TL/DR: Ideally the optimizer would always know that things are equivalent, but sometimes it doesn't, and this is a known place where LLVM isn't yet smart enough to know how to simplify it. (At least last I dug into it deeply, for the fold_mut conversation. It's possible it's smarter already now.)

1 Like

.collect() is generic; it doesn't "figure out" anything. iterator.collect::<T>() is literally just different syntax for <T as FromIterator<_>>::from_iter(iterator).

The "change" (or rather, definition) of behavior lies in the impl FromIterator for T impl of the specific collection type; that's the one that decides how to create a Self from an Iterator.

2 Likes

I see. I would have imagined that there some sort of specialized implementantion for this use case.
I think that the collect method is really nice and is probably more idiomatic than a reduce in this use case.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.