Is IntoIterator<Item = T> a zero cost generic?

Hi folks,

I wonder if I pass Vec<String> into the following functions:

fn non_generic_fn(s: Vec<String>) -> Vec<String> {
    s
}
fn generic_fn<V, T>(s: V) -> Vec<String>
where
    V: IntoIterator<Item = String>,
{
    s.into_iter().collect()
}
fn even_more_generic_fn<V, T>(s: V) -> Vec<String>
where
    V: IntoIterator<Item = T>,
    T: Into<String>,
{
    s.into_iter().map(|s| s.into()).collect()
}

will I get any performance penalty for the generic versions? Or is the compiler clever enough to see that I'm already passing a Vec, so that there is nothing to do?

Thank you.

You can test that in the Compiler Explorer.

The generic versions produce a simple move:

example::main::hd1eb589a1adcc77a:
        mov     rax, rdi
        movups  xmm0, xmmword ptr [rsi]
        mov     rcx, qword ptr [rsi + 16]
        movups  xmmword ptr [rdi], xmm0
        mov     qword ptr [rdi + 16], rcx
        ret

whereas the non-generic version doesn't produce anything at all, meaning it's entirely simplified away.

collect seems to go through all the motions to generate a new list, so I suppose it's too much for the compiler to simplify that part. The cost is pretty low, though.

(Sorry, I was interrupted) Note that you can also simplify the generics. I hadn't done it in the link above, which forced me to specify the parameter T for generic_fn. The simplification below will spare you that:

pub fn generic_fn<V>(s: V) -> Vec<String>
where
    V: IntoIterator<Item = String>,
{
    s.into_iter().collect()
}

pub fn even_more_generic_fn<V>(s: V) -> Vec<String>
where
    V: IntoIterator<Item: Into<String>>
{
    s.into_iter().map(|s| s.into()).collect()
}
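
To illustrate what the simplified signatures buy you, here is a minimal sketch of calling both functions without naming any type parameter. The helper `demo` is hypothetical and only there for illustration. The second function spells its bound as a `where` clause on `V::Item`, which is the older way of writing `IntoIterator<Item: Into<String>>` and should also compile on toolchains that predate the shorthand.

```rust
/// Accepts anything that iterates over owned `String`s.
pub fn generic_fn<V>(s: V) -> Vec<String>
where
    V: IntoIterator<Item = String>,
{
    s.into_iter().collect()
}

/// Accepts anything whose items convert into `String`.
/// Same bound as `IntoIterator<Item: Into<String>>`, written on `V::Item`.
pub fn even_more_generic_fn<V>(s: V) -> Vec<String>
where
    V: IntoIterator,
    V::Item: Into<String>,
{
    s.into_iter().map(Into::into).collect()
}

/// Hypothetical helper: every call below relies on type inference alone.
pub fn demo() -> (Vec<String>, Vec<String>, Vec<String>) {
    let owned = vec![String::from("a"), String::from("b")];
    let from_vec = generic_fn(owned); // Vec<String> in, Vec<String> out
    let from_strs = even_more_generic_fn(["a", "b"]); // array of &str
    let from_chars = even_more_generic_fn(vec!['a', 'b']); // Vec<char>
    (from_vec, from_strs, from_chars)
}
```

All three calls should yield the same two-element Vec, and none of them needs a turbofish.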

Finally, this was an extremely trivial example. In reality, if you do anything at all to the list's content, chances are you'll get the same code with or without generics. As you can see, it was already simplified to its most basic operation: (EDIT: to clarify) moving the Vec's stack content, that is, the pointer, the capacity, and the length, which makes 24 bytes on a 64-bit architecture. It doesn't move the list's content.
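
If you want to check the "24 bytes" figure yourself, here is a small sketch. Both helper names are hypothetical. It relies on the documented layout of Vec as a (pointer, capacity, length) triplet, so the header should be three machine words, and it shows that moving a Vec leaves the heap buffer where it was.

```rust
use std::mem::size_of;

/// Hypothetical helper: size in bytes of the Vec "header" that lives on the stack.
pub fn vec_header_size() -> usize {
    size_of::<Vec<String>>()
}

/// Hypothetical helper: moves a Vec and reports whether the heap buffer
/// stayed at the same address.
pub fn move_keeps_heap_buffer(v: Vec<String>) -> bool {
    let before = v.as_ptr(); // address of the first element on the heap
    let moved = v; // copies only the pointer, capacity, and length
    std::ptr::eq(before, moved.as_ptr())
}
```

On a 64-bit target `vec_header_size()` should come out as 3 * 8 bytes, whatever the number of strings stored in the Vec.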

Note that this is not just a compiler optimization: the stdlib specializes how collect works to reuse the memory of the source iterator (the Vec in this case), and this turns the computation into effectively a no-op that the compiler can then optimize out. Without the stdlib specialization, however, a new allocation would have to be performed.
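
One rough way to observe that reuse is to compare the buffer address before and after the round trip. This is only a sketch with a hypothetical helper name, and the reuse is an implementation detail of the standard library rather than a guarantee, so the returned flag could be false on other versions or for other iterator chains.

```rust
/// Hypothetical helper: round-trips a Vec through `into_iter().collect()`
/// and reports whether the result still points at the original heap buffer.
pub fn roundtrip_reuses_buffer(v: Vec<String>) -> (Vec<String>, bool) {
    let before = v.as_ptr(); // remember where the elements live
    let out: Vec<String> = v.into_iter().collect();
    // Only the addresses are compared; `before` is never dereferenced.
    let reused = std::ptr::eq(before, out.as_ptr());
    (out, reused)
}
```

Whatever the flag says, the contents and their order are the same as in the input. Only the allocation behavior is unspecified.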

No, that means that you didn't have any code actually using that function, so the compiler decided not to generate any code for it: since the function is small, the code will be generated wherever it's actually used. You can make the compiler generate code for it by adding an attribute:

// exports this function so other code can call it through an `extern` block,
// the name can conflict with C library functions and stuff, which can be UB,
// so the attribute is unsafe
#[unsafe(no_mangle)]
pub fn my_function(....) {....}

What you said is not entirely accurate when the function is declared as public. In this case, it's easy to see when comparing to the two other options.

Now, if you complicate the code a little, like adding a simple selector:

pub fn main(sel: bool, v1: Vec<String>, v2: Vec<String>) -> Vec<String> {
    non_generic_fn(if sel { v1 } else { v2 })
    //generic_fn::<Vec<String>, String>(if sel { v1 } else { v2 })
    //even_more_generic_fn(if sel { v1 } else { v2 })
}

then you can see it has the same code for all alternatives, but that's because it's forced to do a semantic move. As I said, the original test was a trivial, degenerate case, and in reality, you're more likely to get what's above (thankfully!).

Without that, the function is simplified entirely. Perhaps this example is more convincing: you'll see that call is removed in the non-generic case but kept in the two other cases, leading to 2 moves instead of one.

pub fn call(v: Vec<String>) -> Vec<String> {
    non_generic_fn(v)
    //generic_fn::<Vec<String>, String>(v)
    //even_more_generic_fn(v)
}

pub fn main(sel: bool, v1: Vec<String>, v2: Vec<String>) -> Vec<String> {
    call(if sel { v1 } else { v2 })
}

Are there any sources where I can find more information on that topic?
