Unnecessary stack copy generated since Rust 1.71.0

The following code

use std::mem::MaybeUninit;
extern {
    fn init(a: &mut [f64; 8]);
}

struct M {
    rows: i32,
    cols: i32,
    data: [f64; 8],
}

impl M {
    fn new(rows: i32, cols: i32) -> M {
        let mut m = M {
            rows,
            cols,
            data: unsafe { MaybeUninit::uninit().assume_init() },
        };
        unsafe { init(&mut m.data); }
        m
    }
}

#[no_mangle]
fn push(v: &mut Vec<M>, rows: i32, cols: i32) {
    v.push(M::new(rows, cols));
}

#[no_mangle]
fn push2(v: &mut Vec<M>, rows: i32, cols: i32) {
    let mut m = M {
        rows,
        cols,
        data: unsafe { MaybeUninit::uninit().assume_init() },
    };
    unsafe { init(&mut m.data); }
    v.push(m);
}

produces an unnecessary stack copy:

push:
        pushq   %r14
        pushq   %rbx
        subq    $152, %rsp
        movq    %rdi, %rbx
        # part 1: init m
        movl    %esi, 144(%rsp)
        movl    %edx, 148(%rsp)
        leaq    80(%rsp), %rdi
        callq   *init@GOTPCREL(%rip)
        # part 2: copy m on stack??
        movq    144(%rsp), %rax
        movq    %rax, 64(%rsp)
        movups  80(%rsp), %xmm0
        movups  96(%rsp), %xmm1
        movups  112(%rsp), %xmm2
        movups  128(%rsp), %xmm3
        movaps  %xmm3, 48(%rsp)
        movaps  %xmm2, 32(%rsp)
        movaps  %xmm1, 16(%rsp)
        movaps  %xmm0, (%rsp)
        movq    16(%rbx), %r14
        cmpq    (%rbx), %r14
        jne     .LBB2_2
        movq    %rbx, %rdi
        callq   alloc::raw_vec::RawVec<T,A>::grow_one::hbe3227e8b97c2791
.LBB2_2:
        # part 3: copy m from stack to heap
        movq    8(%rbx), %rax
        leaq    (%r14,%r14,8), %rcx
        movq    64(%rsp), %rdx
        movq    %rdx, 64(%rax,%rcx,8)
        movaps  (%rsp), %xmm0
        movaps  16(%rsp), %xmm1
        movaps  32(%rsp), %xmm2
        movaps  48(%rsp), %xmm3
        movups  %xmm3, 48(%rax,%rcx,8)
        movups  %xmm2, 32(%rax,%rcx,8)
        movups  %xmm1, 16(%rax,%rcx,8)
        movups  %xmm0, (%rax,%rcx,8)
        incq    %r14
        movq    %r14, 16(%rbx)
        addq    $152, %rsp
        popq    %rbx
        popq    %r14
        retq

As the assembly shows, part 2 is redundant: m could be copied directly from its first stack location to the heap. rustc 1.70.0 and earlier versions don't create this stack copy. Interestingly, our manually inlined push2 always has it. Has there been a (semantic) change since Rust 1.70.0, or is this a bug? Any comment or explanation would be welcome and appreciated ;D

Godbolt link: Compiler Explorer.

Your code has UB:

unsafe { MaybeUninit::uninit().assume_init() }

Good point! When I change it to avoid UB (I believe) by initializing data before creating M, the generated code still differs between 1.80 and 1.70, but the difference looks minor. The extra copy is present in all cases: with 1.70 and 1.80, in both push and push2.
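For readers following along, here is a minimal sketch of what "initializing data before creating M" could look like. It is an assumption, not the exact code behind the reply: the `init` below is a hypothetical pure-Rust stand-in for the thread's extern function, and the array is zero-filled first so that no uninitialized `f64` is ever moved or read.

```rust
// Hypothetical stand-in for the thread's extern `init`; it overwrites every element.
fn init(a: &mut [f64; 8]) {
    for (i, x) in a.iter_mut().enumerate() {
        *x = i as f64;
    }
}

pub struct M {
    pub rows: i32,
    pub cols: i32,
    pub data: [f64; 8],
}

impl M {
    pub fn new(rows: i32, cols: i32) -> M {
        // Start from a fully initialized array instead of
        // `MaybeUninit::uninit().assume_init()`, which is UB for `[f64; 8]`.
        let mut data = [0.0_f64; 8];
        init(&mut data);
        // Build the struct only once `data` is ready.
        M { rows, cols, data }
    }
}

pub fn push(v: &mut Vec<M>, rows: i32, cols: i32) {
    v.push(M::new(rows, cols));
}
```

The zero fill costs a few extra stores compared with the original, which is the price of staying in safe code while `init` takes a `&mut [f64; 8]`.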

I also tried fixing the UB differently by defining data as type MaybeUninit:

    data: MaybeUninit<[f64; 8]>,

and initializing data as the OP did originally, after creating M.

The interesting thing in this version is that push has the extra copy the OP complained about with 1.80 but not with 1.70. So it does seem that something changed in the compiler or LLVM.

(In this version push2 still has the extra copy with 1.70 and 1.80.)
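A rough sketch of this MaybeUninit variant follows. The details are assumptions: the reply's actual code is not shown in the thread, so the `init` here is a hypothetical stand-in that takes `&mut MaybeUninit<[f64; 8]>` and writes the whole array, and the `data()` accessor is added only for illustration.

```rust
use std::mem::MaybeUninit;

// Hypothetical stand-in for the extern `init`. Taking `&mut MaybeUninit<..>`
// means no reference to an uninitialized `[f64; 8]` is ever created.
fn init(a: &mut MaybeUninit<[f64; 8]>) {
    a.write([1.0; 8]);
}

pub struct M {
    pub rows: i32,
    pub cols: i32,
    // The field itself is allowed to hold uninitialized bytes.
    pub data: MaybeUninit<[f64; 8]>,
}

impl M {
    pub fn new(rows: i32, cols: i32) -> M {
        // Same shape as the original post: create `m` first, initialize `data` afterwards.
        let mut m = M {
            rows,
            cols,
            data: MaybeUninit::uninit(),
        };
        init(&mut m.data);
        m
    }

    // Illustrative accessor, not from the thread.
    pub fn data(&self) -> &[f64; 8] {
        // SAFETY: `new` is the only constructor used here and it always calls `init`,
        // which writes the full array.
        unsafe { self.data.assume_init_ref() }
    }
}

pub fn push(v: &mut Vec<M>, rows: i32, cols: i32) {
    v.push(M::new(rows, cols));
}
```

Note that `new` keeps the "named local, mutate, then return" shape of the original, which is the shape where the 1.70 versus 1.80 difference was observed.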


The strangest piece of code I ever saw.


Thanks for your reply and the time you took on this. I simplified the problem a bit, since the extra stack copy is made before the value is pushed into the vector. It boils down to NRVO (named return value optimization). I've checked the Rust MIR: Rust 1.70.0 has a clean return in the final bb1 block, which differs from what Rust 1.80.0 produces. This sounds like a plausible explanation.

Your first link avoids this copy, which I believe is because it turns NRVO into RVO, and rustc is better at optimizing the latter.
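To make the NRVO versus RVO distinction concrete, here are the two source shapes side by side. This is only an illustration: `fill`, `new_named` and `new_direct` are hypothetical names, and nothing here guarantees which copies a given rustc or LLVM version will actually elide.

```rust
#[derive(Debug, PartialEq)]
pub struct M {
    pub rows: i32,
    pub cols: i32,
    pub data: [f64; 8],
}

// Hypothetical initializer standing in for the thread's extern `init`.
fn fill(a: &mut [f64; 8]) {
    for (i, x) in a.iter_mut().enumerate() {
        *x = i as f64;
    }
}

// NRVO shape: a named local is created, mutated in place, then returned.
// Avoiding a copy requires the compiler to build `m` in the return slot.
pub fn new_named(rows: i32, cols: i32) -> M {
    let mut m = M { rows, cols, data: [0.0; 8] };
    fill(&mut m.data);
    m
}

// RVO shape: the returned value is the struct expression itself,
// so there is no named local of type `M` to relocate.
pub fn new_direct(rows: i32, cols: i32) -> M {
    let mut data = [0.0; 8];
    fill(&mut data);
    M { rows, cols, data }
}
```

Both functions return the same value; comparing their assembly on different compiler versions is one way to check whether the return-slot optimization applied.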

