Idea for RFC to avoid needing to use super fish or type hint with collect()

Needing to call collect with a super fish[0], like so:

let x = iterator.collect::<Vec<String>>();

Or with a type hint:

let x: Vec<String> = iterator.collect();

feels like an unnecessary paper cut. It's merely annoying if you already know how to write a super fish or the correct type hint, but I think for a new Rust programmer it might be more than annoying.

If so, then maybe it would be a good idea to provide easy-to-use functions which allow collecting into vecs, sets, and maps:

use std::collections::{BTreeMap, BTreeSet};

trait IntoVec<T>: Sized {
  fn into_vec(self) -> Vec<T>;
}

trait IntoSet<T>: Sized {
  fn into_set(self) -> BTreeSet<T>;
}

trait IntoMap<K, V>: Sized {
  fn into_map(self) -> BTreeMap<K, V>;
}

// blanket implementations can be provided for
// iterators, for example:
impl<T, I: Iterator<Item = T>> IntoVec<T> for I {
  fn into_vec(self) -> Vec<T> {
    self.collect()
  }
}
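
A fuller sketch of the same idea, with blanket implementations for all three traits. The `Ord` bounds and the `(K, V)` item type are assumptions about what the set and map versions would need, and `demo` is a hypothetical helper; none of this is an existing std API.

```rust
use std::collections::{BTreeMap, BTreeSet};

trait IntoVec<T>: Sized {
    fn into_vec(self) -> Vec<T>;
}
trait IntoSet<T>: Sized {
    fn into_set(self) -> BTreeSet<T>;
}
trait IntoMap<K, V>: Sized {
    fn into_map(self) -> BTreeMap<K, V>;
}

// Blanket impls: every iterator gets the methods for free.
impl<T, I: Iterator<Item = T>> IntoVec<T> for I {
    fn into_vec(self) -> Vec<T> {
        self.collect()
    }
}
// Assumption: BTreeSet needs its elements to be `Ord`.
impl<T: Ord, I: Iterator<Item = T>> IntoSet<T> for I {
    fn into_set(self) -> BTreeSet<T> {
        self.collect()
    }
}
// Assumption: the iterator yields `(key, value)` pairs with `Ord` keys.
impl<K: Ord, V, I: Iterator<Item = (K, V)>> IntoMap<K, V> for I {
    fn into_map(self) -> BTreeMap<K, V> {
        self.collect()
    }
}

// Hypothetical usage: no super fish and no type hint at the call sites.
fn demo() -> (Vec<String>, BTreeSet<i32>, BTreeMap<&'static str, i32>) {
    let names = ["a", "b"].iter().map(|s| s.to_string()).into_vec();
    let set = vec![3, 1, 3, 2].into_iter().into_set();
    let map = vec![("one", 1), ("two", 2)].into_iter().into_map();
    (names, set, map)
}
```

The target type is fixed by the method name, so the compiler never has to ask for an annotation.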

I think that since this is a very common case, it will save a lot of typing and compile-time errors, and be a usability win for new Rust programmers. They'll need to learn about super fish and type hints eventually, but by making it unlikely to be an early stumbling block, we can help lower the slope of the learning curve.

The choice of BTreeSet and BTreeMap instead of HashSet and HashMap is intentional. If a user needs an ordered map, using a hash map will produce an incorrect program. If a user doesn't need an ordered map, then using an ordered map will merely produce a slower-than-necessary program. Also, ordered maps and sets have a deterministic, implementation-independent iteration order, which is useful because bugs appear consistently and program behavior doesn't change from run to run.
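
To illustrate the ordering point, here is a small sketch using only std collections; the function names are made up for the example. A BTreeMap always iterates in key order, while a HashMap's iteration order is unspecified and has to be sorted explicitly if order matters.

```rust
use std::collections::{BTreeMap, HashMap};

// Keys come back sorted regardless of insertion order.
fn keys_via_btree() -> Vec<&'static str> {
    let pairs = vec![("pear", 3), ("apple", 1), ("fig", 2)];
    let ordered: BTreeMap<_, _> = pairs.into_iter().collect();
    let keys: Vec<_> = ordered.keys().copied().collect();
    keys
}

// With a HashMap the iteration order is unspecified, so a caller
// that needs ordered output has to sort after the fact.
fn keys_via_hash_then_sort() -> Vec<&'static str> {
    let pairs = vec![("pear", 3), ("apple", 1), ("fig", 2)];
    let unordered: HashMap<_, _> = pairs.into_iter().collect();
    let mut keys: Vec<_> = unordered.keys().copied().collect();
    keys.sort();
    keys
}
```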

If it's worth it, we could also expose collecting into HashSets and HashMaps as into_unordered_X, or into_hash_X.

What do y'all think? Is this worth an RFC?

[0] ::<> (not sure if this is actually called a super fish but I'm rolling with it anyways)

A to_vec() is a handy function to add to iterators.

I like this idea. I can never seem to get the superfish syntax right on the first try and 90% of the time I'm collecting into a vec anyway.

That is all well and good, until you need to collect into something that isn't a Vec.
Maybe I'm not creative enough to think of an alternative, but I think this is something that needs to stay.
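
For reference, a few non-Vec targets that collect() already handles. This is only a sketch with made-up function names; in each case the function's return type drives the inference, so no turbofish is needed.

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

// Collect chars into a String.
fn shout(s: &str) -> String {
    s.chars().map(|c| c.to_ascii_uppercase()).collect()
}

// Collect (key, value) pairs into a HashMap.
fn word_lengths(words: &[&str]) -> HashMap<String, usize> {
    words.iter().map(|w| (w.to_string(), w.len())).collect()
}

// Collect an iterator of Results into a Result of a Vec,
// stopping at the first error.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```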

In any case, I personally prefer the type hint way, since it is easier to read.

I felt the urge recently, hence to_vec - Rust

But generally? Is this little paper cut worth adding the extra traits/implementations to the stdlib? I don't know.

For what it's worth, the itertools library includes collect_vec for this purpose.

I would be in favor of bringing it into the stdlib as it is far and away the most common use of collect.

I suggested adding FromIterator to the prelude so that many uses of items.collect::<Vec<_>>() could easily be rewritten as Vec::from_iter(items).

The main reason I haven’t written an actual RFC and PR for this yet is that lack of default type parameter fallback causes somewhat confusing errors if you use this pattern on certain types like HashSet.
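
A minimal sketch of that pattern, with hypothetical function names. The explicit import is kept so the example does not depend on which edition's prelude is in use. The HashSet case shows the problem mentioned above: as far as I can tell, the hasher type parameter is not filled in from its default, so something else has to pin the type.

```rust
use std::collections::HashSet;
use std::iter::FromIterator;

// Works as-is: Vec has no extra type parameters to infer.
fn as_vec(items: impl Iterator<Item = u32>) -> Vec<u32> {
    Vec::from_iter(items)
}

fn as_set(items: impl Iterator<Item = u32>) -> HashSet<u32> {
    // A bare `let set = HashSet::from_iter(items);` with nothing else
    // constraining the type is expected to fail with a "type annotations
    // needed" error, because the hasher parameter stays unknown.
    // Writing the type out lets the default hasher apply.
    let set: HashSet<u32> = HashSet::from_iter(items);
    set
}
```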

I suggested adding FromIterator to the prelude so that many uses of items.collect::<Vec<_>>() could easily be rewritten as Vec::from_iter(items).

But to_vec() is better because it doesn't break the iterators chain...

Yeah, in my Pre-RFC, I wrote:

Note that collect may still be more readable when it appears at the end of a chain

It's not obvious to me which of to_vec or into_vec is the correct one w.r.t. the API guidelines, but I would lean towards into_vec().

I have a literal short list of things in itertools to go into std, basically:

  • .into_vec() -> Vec<Self::Item>
  • .join(separator) -> String
  • .flatten() -> impl Iterator<Self::Item::Item>
  • .format(separator) -> impl (Formatting Traits)

To have it in std, we just need to come up with a vehicle for that -- can't add methods using Vec or String to libcore's Iterator.
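
For readers unfamiliar with these methods, here is a rough std-only approximation of what join and flatten do. The function names are made up, and this is not how itertools implements them: the join version goes through an intermediate Vec and uses the slice join method, and the flatten version relies on Iterator::flatten, which is available in std in current Rust.

```rust
// Approximates `iter.join(sep)` by collecting first and then
// using the slice `join` method from std.
fn join_upper(words: &[&str], sep: &str) -> String {
    words
        .iter()
        .map(|w| w.to_uppercase())
        .collect::<Vec<_>>()
        .join(sep)
}

// Flattens one level of nesting: an iterator of Vecs becomes
// an iterator of their elements.
fn flatten_nested(nested: Vec<Vec<i32>>) -> Vec<i32> {
    nested.into_iter().flatten().collect()
}
```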

That said, I can understand if we don't want to break the floodgates open for adding methods for specific collections. After Vec there will be requests for every collection type including boxed slices and whatnot.

Are streaming iterators going to improve design of some useful things?

I agree with moving some itertools things to std, the list could be short, but perhaps not as short as yours :)

BTW, you can use let x: Vec<_> = to avoid typing the full type.
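
A short sketch of the two spellings side by side (the function name is made up). In both, the `_` lets the compiler infer the element type, so only the container has to be named.

```rust
fn lengths(words: &[&str]) -> Vec<usize> {
    // Type hint on the binding, element type inferred.
    let hinted: Vec<_> = words.iter().map(|w| w.len()).collect();

    // Turbofish on the call, element type inferred.
    let turbofished = words.iter().map(|w| w.len()).collect::<Vec<_>>();

    // Both spellings produce the same Vec<usize>.
    assert_eq!(hinted, turbofished);
    hinted
}
```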

That would be my concern as well. Preselecting the list of collections to have convenience for is bound to run into this. Vec is very common so perhaps making it easier is fine and good (we already have things like vec! just for it).

Also, @rodarmor - the operator is called the turbofish, although super fish sounds pretty cool :)

That is all well and good, until you need to collect into something that isn’t a Vec.
Maybe I’m not creative enough to think of an alternative, but I think this is something that needs to stay.

Definitely, I wouldn't want to remove collect() in favor of this.

That said, I can understand if we don’t want to break the floodgates open for adding methods for specific collections. After Vec there will be requests for every collection type including boxed slices and whatnot.

I think that's a reasonable concern, but maps, sets, and vecs are close to primitive types that people expect to be well supported (they get first-class syntax in Python, for example), so I think it's a reasonable line to hold.

Also, @rodarmor - the operator is called the turbofish although super fish sounds pretty cool

Damn, I had a feeling I got it wrong :P

I was thinking you might be able to use default type parameters to get around this in a backwards compatible way, giving a solution which is arguably nicer than adding specialised methods.

You could adjust the collect() function signature to look something like this:

fn collect<B = Vec<Self::Item>>(self) -> B
where
    B: FromIterator<Self::Item> {...}

However, after trying it out on the playground, it looks like default type parameters don't do anything when there's also a where clause... Theoretically this should work, so could that be a bug in rustc?

Type parameter default fallback is not in stable rust. The feature flag to use is #![feature(default_type_parameter_fallback)] but still the same outcome -- no effect.

Maybe it's just me, but all this "let's add everything and the kitchen sink" mentality is exactly what ultimately spoiled C++ in the sense of being too complicated to easily reason about. Simplicity is a big part of what made C great, and I think that that is a worthwhile aspect to keep in mind.
In particular, rules of the shape "A holds, unless you want to do X, in which case B or C may hold" are something I personally would like to avoid, as they incur a pretty big mental overhead in day-to-day programming. If that means that some "papercuts" remain, then that has my preference.

But by the same logic, iterators have fold so we don't need sum or product. My concern is more that it becomes a little too easy for beginners to fall into a Rust anti-pattern (always allocating a vector) which is the idiomatic way of doing things in more carefree languages.
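
Both points can be sketched in a few lines; the function names are made up for the example. sum and product are conveniences over fold, and the "always allocate a vector" habit usually has a lazy equivalent that never builds the intermediate Vec.

```rust
// `sum` and `product` written out with `fold`.
fn sum_with_fold(xs: &[i64]) -> i64 {
    xs.iter().fold(0, |acc, x| acc + x)
}
fn product_with_fold(xs: &[i64]) -> i64 {
    xs.iter().fold(1, |acc, x| acc * x)
}

// The built-in convenience method.
fn sum_builtin(xs: &[i64]) -> i64 {
    xs.iter().sum()
}

// Allocates an intermediate Vec just to iterate over it again.
fn even_squares_allocating(xs: &[i64]) -> usize {
    let squares: Vec<i64> = xs.iter().map(|x| x * x).collect();
    squares.iter().filter(|s| **s % 2 == 0).count()
}

// Same result, but the chain stays lazy and nothing is allocated.
fn even_squares_lazy(xs: &[i64]) -> usize {
    xs.iter().map(|x| x * x).filter(|s| *s % 2 == 0).count()
}
```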

And I'd argue that both sum and product belong in a 3rd party crate like itertools exactly because of that.

Now there is some validity to the concern about beginning Rust programmers. But not too much validity: anyone who programs enough in Rust won't stay a beginner for long, and this is something that the Book is especially suited to teach.

I'd also argue that Rust is a poor choice as a first language anyhow (beginning programmers are usually better served by languages like Python and JS, rather than dealing with types, generics, static vs. dynamic dispatch, pointer reasoning, etc.), so there's not too much need to focus on programming newbies either. Instead I'd argue that long term the language needs to be optimized for people who need Rust for things for which, 10 years ago, C and C++ were pretty much the only game in town. In other words: the language should be optimized for people who know what allocation is, what its role is in software, why you want to avoid it when you can, etc. I don't expect people like that to be tripped up by something as simple as "the types don't match when trying to do iteration, I'll just allocate in the meantime", especially if there's a chapter about it in the Book.

I guess what I'm trying to say is: some problems are best solved with social solutions, not technical ones.
