Box x syntax in Box::new()?

I was trying to find out if Box::new(MyStruct { field1: val, field2: val, ...} ) will first create the struct on the stack and then copy it into the heap allocation. During that research, I saw that Box::new() is implemented as

pub fn new(x: T) -> Box<T> {
    box x
}

Is that special syntax internal to the compiler? What does it mean? I couldn't get an answer to my question from that. I also couldn't find where the compiler handles that code. I don't have a clone of the repo locally to ripgrep through, so my search consisted of cursory browsing through the repo on GitHub.

Can somebody point me to the right place or, if you can, answer my question?

Thanks!

Yes, it's special nightly syntax to create a Box. Box is really special, but that's mostly a historical accident. Box::new can now be rewritten using the std::alloc module, but there doesn't seem to be a need to make that change.

The box syntax semantically does the following:

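fn new<T>(x: T) -> Box<T> {
    use std::alloc::{alloc, handle_alloc_error, Layout};

    // Sketch only: assumes T is not zero-sized, because `alloc` must not be
    // called with a zero-size layout.
    let layout = Layout::new::<T>();

    unsafe {
        let ptr = alloc(layout).cast::<T>();

        // If the allocation failed, report it; handle_alloc_error never returns.
        if ptr.is_null() {
            handle_alloc_error(layout)
        }

        // Move `x` into the heap memory, then take ownership of it as a Box.
        ptr.write(x);
        Box::from_raw(ptr)
    }
}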

Thanks! So am I correct to conclude that Box::new(MyStruct { fields... }) will copy the struct directly from the text section of the binary to the heap, without creating a copy on the stack first?

No, it will copy from the stack to the heap. If you don't want that, I recommend using nocopy, the nightly Box::new_uninit APIs, or the nightly box syntax directly. Function arguments are evaluated before the function call, no exceptions, so the value must be constructed on the stack before it is passed to Box::new. Then, because of the null-check branch above, the compiler can't write the value directly into the heap memory. Allocating first and then creating the value, as nocopy does, alleviates this problem in release mode.
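To make the "allocate first, then create the value" idea concrete, here is a minimal sketch on stable Rust that uses only std::alloc. The helper name boxed_filled and the constant LEN are made up for illustration, and this is not a claim about how nocopy is implemented internally:

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};

// Hypothetical size, chosen only for illustration.
const LEN: usize = 100_000;

/// Hypothetical helper: allocates the array on the heap first and then
/// fills it in place, so the source never builds a `[u8; LEN]` temporary.
pub fn boxed_filled(byte: u8) -> Box<[u8; LEN]> {
    let layout = Layout::new::<[u8; LEN]>();
    unsafe {
        // Allocate uninitialized heap memory for the whole array.
        // The layout has a non-zero size, as `alloc` requires.
        let ptr = alloc(layout).cast::<[u8; LEN]>();
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        // Initialize every byte directly inside the heap allocation.
        ptr.cast::<u8>().write_bytes(byte, LEN);
        // The memory is now fully initialized and came from the global
        // allocator with the layout of `[u8; LEN]`, so a Box may own it.
        Box::from_raw(ptr)
    }
}
```

Calling boxed_filled(0) gives the same result as Box::new([0; LEN]), but the value is never a function argument, so nothing has to exist on the stack before the allocation.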

That's true semantically, but not always in practice. The optimizer is smart enough to elide the "copy from the stack" part in many cases.

Oh, no it can't. That one branch kills all optimizations that could delay initialization, even for the simplest values. Especially if the value is large, the copy almost always happens.

For example, let's allocate a bunch of zeros of type [u8; 100_000]:

playground, godbolt

pub fn check_move() -> Box<[u8; 100_000]> {
    // 100 kB (100,000 bytes)
    Box::new([0; 100_000])
}

playground::check_move:
	push	rbx
	mov	eax, 100000
	call	__rust_probestack
	sub	rsp, rax
	mov	rdi, rsp
	mov	edx, 100000
	xor	esi, esi
	call	qword ptr [rip + memset@GOTPCREL]
	mov	edi, 100000
	mov	esi, 1
	call	qword ptr [rip + __rust_alloc@GOTPCREL]
	test	rax, rax
	je	.LBB0_1
	mov	rbx, rax
	mov	rsi, rsp
	mov	edx, 100000
	mov	rdi, rax
	call	qword ptr [rip + memcpy@GOTPCREL]
	mov	rax, rbx
	add	rsp, 100000
	pop	rbx
	ret

.LBB0_1:
	mov	edi, 100000
	mov	esi, 1
	call	qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
	ud2

Note the memcpy and the stack probe (__rust_probestack). This is the absolute simplest case, and the compiler still fails to optimize the memcpy away.

Using the box syntax directly doesn't have this problem, because the allocation happens before the value is even created (godbolt).
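If you can't use nightly, one stable way to get a similar allocate-first ordering for this particular zeroed-array case is to go through Vec. This is only a sketch: the function name boxed_zeros and the constant LEN are hypothetical, and I haven't compared the generated assembly against the box version.

```rust
use std::convert::TryFrom; // already in the prelude from the 2021 edition on

// Hypothetical size, matching the example above.
const LEN: usize = 100_000;

/// Hypothetical stable alternative to `box [0; LEN]`.
pub fn boxed_zeros() -> Box<[u8; LEN]> {
    // `vec!` allocates the buffer on the heap and fills it there.
    let slice: Box<[u8]> = vec![0u8; LEN].into_boxed_slice();
    // Converting a boxed slice into a boxed array keeps the same allocation;
    // it only fails if the length is not exactly LEN.
    Box::<[u8; LEN]>::try_from(slice).expect("slice length equals LEN")
}
```

The source never names a [u8; LEN] value, so there is no argument that has to be built before the allocation happens.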

No, for simple and short values the optimizer does its job well.


pub fn boxed_int() -> Box<i32> {
    Box::new(77)
}

pub struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

pub fn node_empty() -> Box<Node> {
    Box::new(Node {
        value: 42,
        next: None,
    })
}

pub fn node_append(prev: Box<Node>) -> Box<Node> {
    Box::new(Node {
        value: 443,
        next: Some(prev),
    })
}

example::boxed_int:
        push    rax
        mov     edi, 4
        mov     esi, 4
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB0_1
        mov     dword ptr [rax], 77
        pop     rcx
        ret
.LBB0_1:
        mov     edi, 4
        mov     esi, 4
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

example::node_empty:
        push    rax
        mov     edi, 16
        mov     esi, 8
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB1_1
        mov     qword ptr [rax], 0
        mov     dword ptr [rax + 8], 42
        pop     rcx
        ret
.LBB1_1:
        mov     edi, 16
        mov     esi, 8
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

example::node_append:
        push    rbx
        mov     rbx, rdi
        mov     edi, 16
        mov     esi, 8
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB2_1
        mov     qword ptr [rax], rbx
        mov     dword ptr [rax + 8], 443
        pop     rbx
        ret
.LBB2_1:
        mov     edi, 16
        mov     esi, 8
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

From ::uninit's documentation:


Is there a time frame for removing the box x syntax then?

I don't know. Here's the tracking issue.
