Populating a Vec of String from string literal?

Sorry if this is a naïve question. I'm only just learning Rust.

In the code snippets below, WORDS is a const string literal containing a long list of whitespace-separated words. To collect these words into a Vec<String>, the following works just fine:

let mut vec_words = Vec::new();
for word in WORDS.split_whitespace() {
    vec_words.push(word.to_string());
}

But this doesn't, and I don't see any way to make it work with .to_string(), .to_owned(), or similar:

let vec_words: Vec<String> = WORDS.split_whitespace().collect();

It gives the error message:

error[E0277]: a value of type `Vec<String>` cannot be built from an iterator over elements of type `&str`
  --> src/main.rs:15:57
   |
15 |     let vec_words: Vec<String> = WORDS.split_whitespace().collect();
   |                                                           ^^^^^^^ value of type `Vec<String>` cannot be built from `std::iter::Iterator<Item=&str>`
   |
   = help: the trait `FromIterator<&str>` is not implemented for `Vec<String>`

Is there any way to achieve this without a for loop, as iterator methods are shorter, more readable, and as I understand it, more idiomatic and (at least sometimes) more efficient?

Any help would be much appreciated. Thanks. :slightly_smiling_face:

You just need to call .map(|word| word.to_owned()) after splitting the string and before collecting. The same as in your for loop version.
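
A minimal, self-contained sketch of that fix. The contents of WORDS and the helper name collect_words are made up for illustration, since the thread does not show the real word list:

```rust
// Hypothetical stand-in for the real word list, which the thread does not show.
const WORDS: &str = "apple  banana\ncherry";

// split_whitespace yields borrowed &str items; map turns each one into an
// owned String so that collect can build a Vec<String>.
fn collect_words() -> Vec<String> {
    WORDS
        .split_whitespace()
        .map(|word| word.to_owned())
        .collect()
}

fn main() {
    let vec_words = collect_words();
    println!("{:?}", vec_words);
}
```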

Aha! Yes, that works. Thanks very much. It didn't occur to me to use map() here. :slight_smile:

In case anyone is curious, the words also need to be filtered through a validation function, which checks that each word has the required length, among various other things. So, the final version with iterator methods looks like this:

let vec_words: Vec<String> = WORDS
        .split_whitespace()
        .filter(|word| validate(word, word_len))
        .map(|word| word.to_owned())
        .collect();

As opposed to the for loop version:

let mut vec_words = Vec::new();
for word in WORDS.split_whitespace() {
    if validate(word, word_len) {
        vec_words.push(word.to_string());
    }
}

Not really much difference in length of code, but it's nice to stick to the more idiomatic form.

Alternative ways of “spelling” this, without the need to introduce an argument explicitly, include:

  • .map(ToOwned::to_owned)
  • .map(<_>::to_owned)
  • .map(str::to_owned)
  • .map(String::from)
  • .map(<_>::into)
  • etc...

I think .map(str::to_owned) looks nice :slight_smile:
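
As a quick check that these spellings are interchangeable, here is a small sketch that runs each one over the same input. The WORDS value and the function name all_spellings are invented for the example:

```rust
// Hypothetical sample input; the real WORDS constant is not shown in the thread.
const WORDS: &str = "alpha beta gamma";

// Builds the same Vec<String> six times, once per spelling of the conversion.
fn all_spellings() -> Vec<Vec<String>> {
    let closure: Vec<String> = WORDS.split_whitespace().map(|w| w.to_owned()).collect();
    let trait_path: Vec<String> = WORDS.split_whitespace().map(ToOwned::to_owned).collect();
    let inferred_owned: Vec<String> = WORDS.split_whitespace().map(<_>::to_owned).collect();
    let type_path: Vec<String> = WORDS.split_whitespace().map(str::to_owned).collect();
    let from: Vec<String> = WORDS.split_whitespace().map(String::from).collect();
    let inferred_into: Vec<String> = WORDS.split_whitespace().map(<_>::into).collect();
    vec![closure, trait_path, inferred_owned, type_path, from, inferred_into]
}

fn main() {
    for words in all_spellings() {
        println!("{:?}", words);
    }
}
```

Note that the `<_>::into` spelling relies on the `Vec<String>` annotation to pin down the target type.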

Thanks! That's useful. I hadn't learned about these sorts of shortcuts yet. :+1:

If you also filter, then there's also the option to use filter_map together with bool::then, in case you'd prefer that. E.g.

let vec_words: Vec<String> = WORDS
    .split_whitespace()
    .filter_map(|word| validate(word, word_len).then(|| word.to_owned()))
    .collect();

Not necessarily strictly better than

let vec_words: Vec<String> = WORDS
    .split_whitespace()
    .filter(|word| validate(word, word_len))
    .map(str::to_owned)
    .collect();

but these methods can be useful to know about nonetheless.
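
For anyone who wants to run the two versions side by side, here is a self-contained sketch. The validate function below is only a guess at what such a check might look like (exact length, lowercase ASCII only), and the WORDS value is sample data; neither comes from the original code:

```rust
// Hypothetical sample data.
const WORDS: &str = "crane Slate pear stone 12345 grape";

// Hypothetical validation: exact length and lowercase ASCII letters only.
fn validate(word: &str, word_len: usize) -> bool {
    word.len() == word_len && word.chars().all(|c| c.is_ascii_lowercase())
}

// filter_map + bool::then: `then` returns Some(owned word) when the check
// passes and None otherwise, so filtering and mapping happen in one step.
fn with_filter_map(word_len: usize) -> Vec<String> {
    WORDS
        .split_whitespace()
        .filter_map(|word| validate(word, word_len).then(|| word.to_owned()))
        .collect()
}

// The separate filter and map steps, for comparison.
fn with_filter_and_map(word_len: usize) -> Vec<String> {
    WORDS
        .split_whitespace()
        .filter(|word| validate(word, word_len))
        .map(str::to_owned)
        .collect()
}

fn main() {
    println!("{:?}", with_filter_map(5));
    println!("{:?}", with_filter_and_map(5));
}
```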

Thanks again :slight_smile: I didn't expect to learn so much from this one question!

Nice. TIL <_>::into and <_>::to_owned.

One question. How does the compiler infer the type in these cases? Does it start with the method name, into for example, and work its way backwards?

Actually, I'm not quite sure. I do know that these only work with trait methods. It's allowed somehow alongside all the more commonly known forms like str::to_owned(...), ToOwned::to_owned, <str>::to_owned(...), or <str as ToOwned>::to_owned. This last one also supports <_ as ToOwned>::to_owned, and the trait can have its parameters elided too, e.g. let s = String::new(); let x: &str = <_ as AsRef<_>>::as_ref(&s);. Perhaps somewhere in all of this system, <_>::to_owned and the like became allowed, too. This is sparsely documented (if at all), so I'm not even sure whether it is 100% an intentional language feature, but it works, so it's effectively stable.
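
To make the different path forms concrete, here is a small sketch that calls to_owned through each of them and uses the elided AsRef form on a borrowed String. The function names owned_spellings and as_str_elided are made up for the example:

```rust
// Every entry names the same function: the ToOwned implementation for str.
fn owned_spellings() -> Vec<String> {
    vec![
        str::to_owned("hi"),              // type path
        ToOwned::to_owned("hi"),          // trait path, Self inferred from the argument
        <str>::to_owned("hi"),            // qualified type path
        <str as ToOwned>::to_owned("hi"), // fully qualified
        <_ as ToOwned>::to_owned("hi"),   // fully qualified, Self elided
        <_>::to_owned("hi"),              // both the type and the trait left to inference
    ]
}

// Both the Self type (String) and the trait parameter (str) are elided here;
// they are inferred from the argument and from the return type.
fn as_str_elided(s: &String) -> &str {
    <_ as AsRef<_>>::as_ref(s)
}

fn main() {
    println!("{:?}", owned_spellings());
    let s = String::from("hello");
    println!("{}", as_str_elided(&s));
}
```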


Here's an interesting test case...

trait Foo {
    fn foo(&self);
}
trait Foo2 {
    fn foo(&self);
}

struct Bar;

impl Foo for Bar {
    fn foo(&self) {}
}

fn test(x: Bar) {
    x.foo(); // <- works
    <_>::foo(&x); // <- doesn't work ("multiple applicable items in scope")
}

Apparently it's less smart than method resolution: Foo2 isn't even implemented for Bar, yet its foo still counts as a candidate. But that kind of makes sense if this is a corner case of the disambiguation syntax; if it's disambiguation syntax, it won't resolve the ambiguities for you. By this test, it seems like it's considering all the traits in scope (simple to test that traits not in scope don't apply) and all their methods, and if there's only a single one matching the name, then it works.
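
For completeness, a sketch of how the ambiguous call can be written so that it compiles: naming the trait removes the ambiguity while still leaving the Self type to inference. The return values are added here only so that the calls can be observed; the original methods return nothing:

```rust
trait Foo {
    fn foo(&self) -> &'static str;
}

// Same method name, but never implemented for Bar.
#[allow(dead_code)]
trait Foo2 {
    fn foo(&self) -> &'static str;
}

struct Bar;

impl Foo for Bar {
    fn foo(&self) -> &'static str {
        "Foo::foo"
    }
}

fn call_all(x: &Bar) -> [&'static str; 4] {
    [
        x.foo(),               // method resolution only considers implemented traits
        <_ as Foo>::foo(x),    // trait named explicitly, Self inferred
        Foo::foo(x),           // shorthand for the line above
        <Bar as Foo>::foo(x),  // fully qualified
    ]
}

fn main() {
    println!("{:?}", call_all(&Bar));
}
```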

Yup -- the same way that you can't call trait methods from traits that aren't in scope.

Basically it's the same as the trait part of normal method resolution, just having skipped the inherent method part. That's the same as what happens during inference on something that's still an unknown inference variable -- you can call trait methods, but it won't look at inherent methods because that would make any conflicting inherent method anywhere an inference-breaking change.
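
A small sketch of the scoping rule, using a made-up module, trait, and type (shapes, Describe, Circle). Both the method-call form and the `<_>::` form depend on the trait being imported:

```rust
mod shapes {
    pub trait Describe {
        fn describe(&self) -> String;
    }

    pub struct Circle;

    impl Describe for Circle {
        fn describe(&self) -> String {
            "circle".to_string()
        }
    }
}

// Removing `Describe` from this import makes both calls below fail to
// compile, because the trait would no longer be in scope.
use shapes::{Circle, Describe};

fn describe_both_ways(c: &Circle) -> (String, String) {
    // `<_>::describe` finds the method by searching the traits in scope.
    (c.describe(), <_>::describe(c))
}

fn main() {
    println!("{:?}", describe_both_ways(&Circle));
}
```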

Left-swimming turbofish FTW

It's not really a "shortcut", though. It basically falls straight out of the uniformity of the language: methods (inherent as well as trait) are just functions. Wherever a function-like value is expected, you can use them. There's not much special about closures (from this point of view, at least), so you usually don't need a closure if you already have a matching free function or method with the same signature.
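
A short sketch of that point: a free function, an inherent method, and another inherent method that returns a String can all be passed to map directly, with no closure. The function shout and the sample words are invented for the example:

```rust
// A hypothetical free function with the signature map expects here.
fn shout(word: &str) -> String {
    format!("{}!", word)
}

fn demo() -> (Vec<String>, Vec<usize>, Vec<String>) {
    let words = ["left", "swimming", "turbofish"];

    // Free function used as the mapping function.
    let shouted: Vec<String> = words.iter().copied().map(shout).collect();
    // Inherent method of str, named by path.
    let lengths: Vec<usize> = words.iter().copied().map(str::len).collect();
    // Another inherent method of str, this one returning a String.
    let upper: Vec<String> = words.iter().copied().map(str::to_uppercase).collect();

    (shouted, lengths, upper)
}

fn main() {
    let (shouted, lengths, upper) = demo();
    println!("{:?}\n{:?}\n{:?}", shouted, lengths, upper);
}
```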