Where is collect() implemented?

Hey, I am just learning Rust using Exercism. Could you help me understand a detail?

If input is a &str, the one-liner

input.graphemes(true).rev().collect()

reverses it as a String.

There, graphemes() comes from unicode-segmentation and provides an iterator over extended grapheme clusters. Then, the docs say rev() "reverses an iterator's direction", though I have read the source code in rev.rs and what I see is that rev() returns a new iterator that wraps the previous one, right?

Alright. What I fail to see is where collect() is implemented. Where is the logic that says "I'll concatenate what the rev() iterator yields into a String"?

The graphemes iterator contains items of type &str, and .rev() preserves the item type.

The question is thus, how does an iterator of &str items collect into a String?

The .collect() method is just a thin wrapper[1] around calling FromIterator::from_iter. The FromIterator implementations are listed on that trait's documentation page; and since we want to produce a String from an iterator, the page of String also lists the relevant impl FromIterator<…> for String.
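To make the "wrapper" claim concrete, here is a minimal sketch showing that calling collect() and calling FromIterator::from_iter directly give the same String. The function names are made up for this illustration, and a plain slice of &str stands in for the grapheme iterator so that no external crate is needed.

```rust
/// Reverses the pieces and concatenates them via `collect()`.
/// (Hypothetical helper, not part of any library.)
fn reverse_with_collect(pieces: &[&str]) -> String {
    // `copied()` turns the `&&str` items of the slice iterator into `&str`.
    pieces.iter().rev().copied().collect()
}

/// Same result, but calling the trait method that `collect()` forwards to.
fn reverse_with_from_iter(pieces: &[&str]) -> String {
    std::iter::FromIterator::from_iter(pieces.iter().rev().copied())
}
```

In both functions the String return type is what selects the FromIterator<&str> for String implementation.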

As with most things in Rust documentation, if you want to see the implementation, just click on the “source” button, and it takes you there. In this case, the logic that says “I'll concatenate what the rev() iterator yields (or more generally any iterator of &str items) into a String” looks as follows:

impl<'a> FromIterator<&'a str> for String {
    fn from_iter<I: IntoIterator<Item = &'a str>>(iter: I) -> String {
        let mut buf = String::new();
        buf.extend(iter);
        buf
    }
}

Ah well, that just delegates to Extend, another useful trait for collecting iterators into collections, just that with Extend the idea is that you can provide an existing collection that can be extended instead of creating a new one. Sure enough, the relevant impl Extend<&str> for String can be found using the same general approach, and we can see its source just as easily:

impl<'a> Extend<&'a str> for String {
    fn extend<I: IntoIterator<Item = &'a str>>(&mut self, iter: I) {
        iter.into_iter().for_each(move |s| self.push_str(s));
    }

    #[inline]
    fn extend_one(&mut self, s: &'a str) {
        self.push_str(s);
    }
}

So there’s your full logic. It creates a new string with String::new (in the from_iter implementation), then calls .for_each(move |s| self.push_str(s)) (in the extend implementation) on your iterator to push all the grapheme clusters (of type &str) into that String using String’s push_str method.
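For illustration, the two standard-library implementations quoted above can be collapsed by hand into a single function. This is only a sketch of the same steps (String::new, then push_str for every item), not the actual std code path; reverse_by_hand is a hypothetical name, and any double-ended iterator of &str can play the role of the grapheme iterator.

```rust
/// Hand-written equivalent of `pieces.rev().collect::<String>()`.
/// (Hypothetical helper written for this explanation.)
fn reverse_by_hand<'a, I>(pieces: I) -> String
where
    I: DoubleEndedIterator<Item = &'a str>,
{
    // Step 1: what `from_iter` does, start with an empty String.
    let mut buf = String::new();
    // Step 2: what `extend` does, push every &str item onto it.
    pieces.rev().for_each(|s| buf.push_str(s));
    buf
}
```

Calling reverse_by_hand("he ll o".split(' ')) gives the same result as "he ll o".split(' ').rev().collect::<String>().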


  1. Literally implemented as

    fn collect<B: FromIterator<Self::Item>>(self) -> B
    where
        Self: Sized,
    {
        FromIterator::from_iter(self)
    }
    

Impressive answer, so well explained. Thanks @steffahn!

By the way, this fact that it produces a String is not clear from your one-liner alone. That one could also produce, for example, a Vec<&str> or HashSet<&str>, etc... all types that implement FromIterator<&str>.

The fact that String is chosen is determined by type inference. If the context where this one-liner appears expects an expression of type String, then type inference picks the FromIterator implementation so that the return type of collect matches the expected type. In other cases, it's always possible to choose it explicitly by writing e.g. ….collect::<String>().
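As a small sketch of this, the same iterator chain can be collected into different types depending only on what the surrounding code expects. The function names are invented for this example, and split(' ') is used in place of graphemes(true) to avoid the external crate.

```rust
use std::collections::HashSet;

// The three bodies are identical; only the declared return type differs,
// and that is what selects the FromIterator implementation.
fn as_string(input: &str) -> String {
    input.split(' ').rev().collect()
}

fn as_vec(input: &str) -> Vec<&str> {
    input.split(' ').rev().collect()
}

fn as_set(input: &str) -> HashSet<&str> {
    input.split(' ').rev().collect()
}

// Here nothing in the context fixes the target type, so it is named
// explicitly with the turbofish syntax.
fn reversed_len(input: &str) -> usize {
    input.split(' ').rev().collect::<String>().len()
}
```

Removing the ::<String> in reversed_len makes the compiler ask for a type annotation, because .len() alone does not tell it which collection to build.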

@steffahn good point. This one-liner is the implementation of this function:

pub fn reverse(input: &str) -> String {
    input.graphemes(true).rev().collect()
}

I guess type inference gets help from the return type in the function signature?

Yes, exactly!
