How efficitent is functional code in Rust?

I would like to turn a vector into multiple disjoint subvectors such that the elements of the source vector v are alternately put into the destination vectors x and y:

fn f<T>(v: Vec<T>) -> (Vec<T>, Vec<T>) {
    let mut x = Vec::with_capacity((v.len() + 1) / 2);
    let mut y = Vec::with_capacity(v.len() / 2);

    let mut i = 0;
    for e in v {
        if i % 2 == 0 {
            x.push(e);
        } else {
            y.push(e);
        }
        i += 1;
    }
    (x, y)
}

However, I would rather use a functional style:

use itertools::Itertools;

fn g<T>(v: Vec<T>) -> (Vec<T>, Vec<T>) {
    v.into_iter().tuples().unzip()
}

(Yes, these two solutions don't produce the exact same results if the length of v is odd but that shouldn't matter here)

So what code does the compiler generate? As far as I can tell, it could pull some really neat tricks but it could also blindly allocate a temporary vector of tuples, copy the contents of v into that temporary vector and then lastly allocate another two vectors that will hold the results.

What if I happen to write this?

fn h<T: Clone>(v: Vec<T>) -> (Vec<T>, Vec<T>) {
    (
        v.iter().step_by(2).cloned().collect(),
        v.iter().skip(1).step_by(2).cloned().collect(),
    )
}

Or even this?

fn i<T: Clone>(v: Vec<T>) -> (Vec<T>, Vec<T>) {
    v.iter()
        .step_by(2)
        .cloned()
        .zip(v.iter().skip(1).step_by(2).cloned())
        .unzip()
}

Yes, the first two solutions are more generic (don't require T: Clone) but actually all of these functions describe the same computation. If the compiler was able to see that the two iterators operate on disjoint parts of v, the calls to .cloned() were unnecessary.

So how efficient functional Rust code tends to be in general?

It is definitely not going to allocate an intermediate vector unless you put a .collect() somewhere in the middle of the iterator chain. That said, since the Tuple iterator from itertools does not provide a size hint, it will use the normal reallocation strategy rather than allocating the right size up-front.

Generally the compiler will do it using the method you see by looking at the source of the methods, except massively inlined.

2 Likes

That sounds an awful lot like "Splitting an array into evens and odds, reusing the original allocation", the example in https://doc.rust-lang.org/nightly/std/vec/struct.Vec.html#method.drain_filter, though that method is unstable.

The general form of "make two containers from an iterator" is Iterator::partition (or its slightly-nicer cousin Itertools::partition_map.

So if tuples and unzip didn't exist, you could do this:

use itertools::*;
pub fn f<T>(v: Vec<T>) -> (Vec<T>, Vec<T>) {
    v.into_iter().enumerate().partition_map(|(i, x)| {
        if i % 2 == 0 {
            Either::Left(x)
        } else {
            Either::Right(x)
        }
    })
}

Still no cloning, though it does allocate two vectors instead of the 1 that drain_filter could.

Or purely in stdlib

pub fn f<T>(v: Vec<T>) -> (Vec<T>, Vec<T>) {
    let mut index = 0;
    v.into_iter().partition(|_| {
        index += 1;
        index % 2 == 1
    })
}

exact_chunks or array_chunks were other approaches I thought of (for the sake of iterator size hints). But I think the real takeaway of this thread should be "measure and see if it's worth it"; in most circumstances I suspect simplicity will be worth more than the difference between any of these approaches. Premature optimization and all that.

2 Likes

I started using Rust like 4-5 months ago after using mostly Scala for about 8 years. I switched mostly to avoid the overhead that comes with Scala - JVM, garbage collection, large objects, large Docker images, long run times, big Google Cloud bills). That, and Scala 3 made it clear beyond doubt that Scala's priorities were primarily teaching and research, and less creating production software.

In Scala, I have been, like most Scala-users, writing heavily functional code, that is, long chains of filter, map, flatMap and the like. Any collection-like Scala object usually has these methods.

Then, in Rust, I had a Vec, and I was looking for the filter and map methods. And I thought to myself, I know they exist, I've seen them before, why I can't find them.

Then I inspected a previous code sample that was using map and filter, and then I realized that those methods belonged to Iterator, not Vec.

And suddenly, it all made sense.

5 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.