Pearl: Extending a `Vec` via `append` or `extend`?

fosskers · March 25, 2022, 9:16pm

When combining Vecs, I find myself flipping between Vec::append and Extend::extend without any clear motivated preference for one or the other.

If we look at Vec's implementation of Extend, we see:

fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
    <Self as SpecExtend<T, I::IntoIter>>::spec_extend(self, iter.into_iter())
}

where this leads is a bit tricky to follow, but it's somewhere in this file in std. Either way, because this operates over an Iterator, is it adding the new elements one-by-one, memory layout be damned?

append on the other hand is straight-forward:

pub fn append(&mut self, other: &mut Self) {
    unsafe {
        self.append_elements(other.as_slice() as _);
        other.set_len(0);
    }
}

/// Appends elements to `Self` from other buffer.
unsafe fn append_elements(&mut self, other: *const [T]) {
    let count = unsafe { (*other).len() };
    self.reserve(count);
    let len = self.len();
    unsafe { ptr::copy_nonoverlapping(other as *const T, self.as_mut_ptr().add(len), count) };
    self.len += count;
}

This looks to me like it's physically moving the memory from the other to the self. And if the self had been preallocated with extra room via with_capacity, then this would be more efficient, no?

Thanks for any wisdom you can offer.

scottmcm · March 25, 2022, 9:29pm

We can ask godbolt about it!

pub fn append_via_extend(dest: &mut Vec<String>, src: Vec<String>) {
    // To clean up the output a bit, assume we already reserved enough space
    assert!(dest.capacity() - dest.len() >= src.len());

    dest.extend(src.into_iter());
}

https://rust.godbolt.org/z/EhnK5z1q7

example::append_via_extend:
        push    r15
        push    r14
        push    r13
        push    r12
        push    rbx
        mov     rax, qword ptr [rdi + 8]
        mov     r12, qword ptr [rdi + 16]
        mov     rbx, qword ptr [rsi + 16]
        mov     r15, rsi
        sub     rax, r12
        cmp     rax, rbx
        jb      .LBB1_1
        mov     r13, rdi
        lea     rdi, [r12 + 2*r12]
        mov     r14, qword ptr [r15]
        lea     rax, [8*rbx]
        mov     r15, qword ptr [r15 + 8]
        shl     rdi, 3
        add     rdi, qword ptr [r13]
        lea     rdx, [rax + 2*rax]
        mov     rsi, r14
        call    qword ptr [rip + memcpy@GOTPCREL]  // <-- Oh look!
        add     rbx, r12
        mov     qword ptr [r13 + 16], rbx
        test    r15, r15
        je      .LBB1_5
        mov     ecx, 24
        mov     rax, r15
        mul     rcx
        test    rax, rax
        je      .LBB1_5
        mov     edx, 8
        mov     rdi, r14
        mov     rsi, rax
        pop     rbx
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        jmp     qword ptr [rip + __rust_dealloc@GOTPCREL]
.LBB1_5:
        pop     rbx
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        ret
.LBB1_1:
        lea     rdi, [rip + .L__unnamed_1]
        lea     rdx, [rip + .L__unnamed_2]
        mov     esi, 59
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2
        mov     rbx, rax
        mov     rdi, r15
        call    core::ptr::drop_in_place<alloc::vec::Vec<alloc::string::String>>
        mov     rdi, rbx
        call    _Unwind_Resume@PLT
        ud2
        call    qword ptr [rip + core::panicking::panic_no_unwind@GOTPCREL]
        ud2

You'll notice those jumps are forward -- so there's no loop. But there is a call to memcpy!

So between the specializations and optimizations, it noticed that it's getting data from contiguous memory, and is copying it the best way it knows how.

fosskers · March 25, 2022, 9:36pm

I didn't know about godbolt, thanks for that. Does that mean that append is still better?

LegionMammal978 · March 25, 2022, 9:41pm

They should be equivalent; if you look at the very bottom of spec_extend.rs, you'll see that extending from a slice::Iter just calls append_elements.

geeklint · March 25, 2022, 9:44pm

What you're seeing in the extend impl is "specialization" (an unstable feature) the standard library is using to make extend essentially an alias for append; the performance should be the same, thus the difference you should analyze is which is easier to understand in your codebase? Note that append works explicitly with Vecs, whereas extend works with any Iterator.

cuviper · March 25, 2022, 9:48pm

I prefer append just because it's more obvious that it will be simple, without requiring any heavy lifting by the compiler. It's like I'm doing rustc a favor, even though it's plenty smart enough anyway.

fosskers · March 25, 2022, 9:52pm

Thanks folks.

scottmcm · March 25, 2022, 10:25pm

I think the most important part is to use the one where the object ownership does what you needed.

If append works and you want to re-use the original Vec's buffer, then use it -- I think append is better than .extend(vec.drain()).

But if you just want to give away ownership of the whole thing and don't care, use .extend(vec) because that communicates it well. (extend takes IntoIterator, so that's the same as .extend(vec.into_iter()).) And trust that std/llvm will probably do something reasonable -- which we saw here it does -- and if not it'll come up in your perf tests if it really matters.

SkiFire13 · March 26, 2022, 9:09am

Btw that's not the one used when extending a Vec with another Vec. The right one is a couple of lines before, though that also uses append_elements.

Topic		Replies	Views
Why doesn't Rust have Append trait?	7	1248	September 8, 2018
Consuming Vec::append help	9	4738	July 22, 2018
Extend_from_slice efficiency	4	1407	September 29, 2024
Append potential logic error observation help	14	758	August 23, 2023
Append Vec<T> by extend()? help	4	553	April 11, 2021

Pearl: Extending a `Vec` via `append` or `extend`?

Related topics