Collect_into SmallVec

A pattern I sometimes wish I could use (and would like to see in the standard library) is:

some_iterator.collect_into(SmallVec::<[u32; 6]>::new())

This is similar to a regular collect(), but when the iterator is short it doesn't heap-allocate (which should be the most common case if the SmallVec inline buffer length is well chosen). SmallVec has a from_iter, but it's less ergonomic.

I guess the solution is:

extern crate smallvec;
use smallvec::SmallVec;
fn main() {
    let a = (0 .. 5).collect::<SmallVec<[u32; 6]>>();
}

I'm not 100% sure what signature for collect_into() you're looking for, but it almost sounds like you want the Extend trait. The only difference is that you'd write things "backwards" (my_small_vec.extend(some_iterator) instead of some_iterator.collect_into(&mut my_small_vec)).
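To make the comparison with Extend concrete, here is a minimal sketch of how the "forwards" spelling could be layered on top of it. The trait name ExtendInto and the method extend_into are hypothetical (they are not part of std or smallvec), and the demo uses Vec and VecDeque only so that it builds without extra crates; any collection implementing Extend, SmallVec included, should work the same way.

```rust
use std::collections::VecDeque;

/// Hypothetical extension trait (not in std or smallvec): it simply forwards
/// to `Extend`, so the call reads in iterator order.
trait ExtendInto: Iterator + Sized {
    fn extend_into<C: Extend<Self::Item>>(self, collection: &mut C) -> &mut C {
        collection.extend(self);
        collection
    }
}

// Blanket impl: every sized iterator gets the method.
impl<I: Iterator> ExtendInto for I {}

/// Small usage demo: appends to an existing Vec and fills a fresh VecDeque.
fn demo() -> (Vec<u32>, VecDeque<u32>) {
    let mut v = vec![100];
    (0..3u32).map(|x| x * 2).extend_into(&mut v);

    let mut d = VecDeque::new();
    (1..4u32).extend_into(&mut d);

    (v, d)
}
```

Note that, like Extend itself, this appends to whatever is already in the collection rather than replacing it.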

You mention you want to eliminate heap allocations when collecting items, but it seems like SmallVec's FromIterator implementation already uses Iterator::size_hint() to make sure there's enough space. This means it shouldn't need to do any allocation if the buffer is correctly sized and the size_hint() is accurate.
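As a rough illustration of what "uses size_hint()" means, the sketch below reserves space from the lower bound of the hint before filling a plain Vec. This is not SmallVec's actual code; collect_with_hint and hints are hypothetical helpers, and Vec stands in for SmallVec so the example needs no extra crates.

```rust
/// Hypothetical helper: reserve from the iterator's lower size-hint bound,
/// then fill. Illustrates the general pattern only, not SmallVec's internals.
fn collect_with_hint<I: Iterator>(iter: I) -> Vec<I::Item> {
    let (lower, _upper) = iter.size_hint();
    let mut out = Vec::with_capacity(lower);
    out.extend(iter);
    out
}

/// Returns the size hints of a plain range and of the same range filtered.
fn hints() -> ((usize, Option<usize>), (usize, Option<usize>)) {
    // A range knows its exact length up front.
    let exact = (0..5u32).size_hint();
    // `filter` cannot know how many items survive, so its lower bound is 0.
    let filtered = (0..5u32).filter(|x| x % 2 == 0).size_hint();
    (exact, filtered)
}
```

The second hint shows why the "size_hint() is accurate" caveat matters: adapters such as filter can only promise a lower bound of zero, so a collection cannot pre-size itself from the hint alone in that case.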

I'm curious to find out why you feel collect() and the FromIterator trait could be unergonomic?

https://github.com/rust-lang/rust/issues/45840


I'm still iterating on the design; so far I have something like this:

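// Explicit lifetimes are used here because the in_band_lifetimes
// feature has since been removed from the compiler.
trait CollectExtras: Iterator {
    // Clears `buffer`, refills it from the iterator, and returns its contents.
    fn collect_into_vec<'a>(&mut self, buffer: &'a mut Vec<Self::Item>) -> &'a [Self::Item];
    // Writes items into `slice` until either side runs out; returns the filled prefix.
    fn copy_into_slice<'a>(&mut self, slice: &'a mut [Self::Item]) -> &'a [Self::Item];
}

impl<I: ?Sized> CollectExtras for I where I: Iterator {
    fn collect_into_vec<'a>(&mut self, buffer: &'a mut Vec<Self::Item>) -> &'a [Self::Item] {
        buffer.clear();
        buffer.extend(self);
        &buffer[..]
    }

    fn copy_into_slice<'a>(&mut self, slice: &'a mut [Self::Item]) -> &'a [Self::Item] {
        // Count how many slots were actually written.
        let len = slice.iter_mut().zip(self).fold(0, |count, (dest, item)| {
            *dest = item;
            count + 1
        });
        &slice[.. len]
    }
}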

collect_into_vec is useful when you perform many collect() calls inside a loop, to re-use the same heap memory. copy_into_slice is handy for collecting into a fixed-size array to avoid heap allocations (it's similar, but not identical, to https://crates.io/crates/collect_slice, which I used as a base).
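The two use cases can be sketched with plain std calls, without the trait. The function names row_sums and fill_prefix are hypothetical, and the scratch capacity of 8 and the array length of 4 are arbitrary choices for the example.

```rust
/// Hypothetical helper: sums the squares of each row while reusing one
/// scratch Vec. Also reports whether the scratch buffer kept its allocation.
fn row_sums(rows: &[Vec<u32>]) -> (Vec<u32>, bool) {
    let mut scratch: Vec<u32> = Vec::with_capacity(8);
    let start = scratch.as_ptr();
    let mut sums = Vec::new();
    for row in rows {
        scratch.clear(); // drops the items but keeps the heap allocation
        scratch.extend(row.iter().map(|x| x * x));
        sums.push(scratch.iter().sum());
    }
    // Same pointer means no reallocation happened (true while rows fit in 8 items).
    (sums, scratch.as_ptr() == start)
}

/// Hypothetical helper: copies items into a fixed-size array, with no heap
/// allocation, and returns the prefix that was actually filled.
fn fill_prefix(buf: &mut [u32; 4], iter: impl Iterator<Item = u32>) -> &[u32] {
    let mut len = 0;
    for (dest, item) in buf.iter_mut().zip(iter) {
        *dest = item;
        len += 1;
    }
    &buf[..len]
}
```

If the iterator yields more items than the array holds, fill_prefix stops at the array length and leaves the remaining items unconsumed.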