How to boxed struct with large size without stack-overflow?

xmh0511 · June 7, 2023, 2:40am

struct 100MbType{ /*xxx*/ }
struct Wrapper{
   v1: 100MbType,
   v2: [f64;1000000]
}
fn main(){
   let b = Box::new(Wrapper{ /*init*/ });
}

No exception, the code will make the stack space exhausted, which means, the code can cause a stack overflow, at least in Debug mode. In release mode, LLVM may eliminate the unnecessary copy, that is, first create the temporary object in the stack then copies the temporary object to the heap space allocated by Box, instead, it directly constructs the object in the heap space.

This concept is called guaranteed copy elision in C++. However, in Rust, we do not have this guarantee at the language level. How to solve this case in Rust?

Moreover, even though in unsafe code, that is, using std::ptr::write, which also cannot avoid first constructing the second parameter of std::ptr::write, then write/copy the data from that parameter to the destination address specified by the first parameter, which means, we cannot find an equivalent way in Rust that equals to placement new in C++.

quinedot · June 7, 2023, 2:50am

There's some discussion including workarounds in the below issue. You're correct that there is no placement new.

github.com/rust-lang/rust

Stack overflow with Boxed array

opened 03:32PM - 30 Aug 18 UTC

opus111

A-LLVM A-codegen T-compiler

This is possibly the same bug as https://github.com/rust-lang/rust/issues/408…62 Using the latest version of Rust rustc 1.27.2 (58cc626de 2018-07-18) The following code causes a stack overflow ```rust #[test] fn test_boxed() { let a = Box::new([-1; 3000000]); } ``` # Workarounds ## Using `Vec<T>` This does not have overhead. ```rust let boxed_slice: Box<[u8]> = vec![0; 100_000_000].into_boxed_slice(); let boxed_array: Box<[u8; 100_000_000]> = vec![0; 100_000_000].into_boxed_slice().try_into().unwrap(); ``` ## Unstably using `new_uninit` This requires an unstable API and unsafe, but is more flexible. ```rust let mut b = Box::new_uninit(); unsafe { std::ptr::write_bytes(b.as_mut_ptr(), 0, 100_000_000) }; let boxed_array: Box<[u8; 100_000_000]> = unsafe { b.assume_init() }; ```

xmh0511 · June 7, 2023, 2:59am

After a glance at the issue, I think the workaround in the issue is not enough generalization, it's only for arrays, however, there can be other data structures with large sizes. The guaranteed copy elision is crucial here.

quinedot · June 7, 2023, 3:04am

There's more than one workaround, some more general than arrays. But perhaps not general enough for your case.

There's no guaranteed copy elision. If you can't preallocate and write piece by piece or the like, you'll be relying on optimization to kick in.

vague · June 7, 2023, 3:10am

github.com/rust-lang/rfcs

RFC: Placement by return

rust-lang:master ← PoignardAzur:placement-by-return

opened 12:01PM - 17 Mar 20 UTC

PoignardAzur

+1115 -0

[Rendered](https://github.com/PoignardAzur/placement-by-return/blob/placement-by…-return/text/0000-placement-by-return.md). --- **Glossary:** - **GCE:** [Guaranteed Copy Elision](https://stackoverflow.com/questions/38043319/how-does-guaranteed-copy-elision-work). - **NRVO:** [Named Return Value Optimization](https://shaharmike.com/cpp/rvo/). - **DST:** [Dynamically-Sized Type](https://doc.rust-lang.org/reference/dynamically-sized-types.html).

H2CO3 · June 7, 2023, 3:25am

Why don't you copy the data piecewise, then? You don't need to write everything at once.

xmh0511 · June 7, 2023, 3:36am

The RFC seems only to guarantee the return value in the function call, however, a generalized guaranteed copy elision will include the following case:

let v = A{field1:xxx, field2:yyy};

That is, directly initialize v instead of creating a temporary object and copy/memcpy the temporary object to v.

xmh0511 · June 7, 2023, 3:38am

This is the question, which means the program may be problematic in the Debug mode.

quinedot · June 7, 2023, 3:42am

You can enable optimizations for the debug profile. (Sorry for not looking it up.)

xmh0511 · June 7, 2023, 3:46am

However, relying on the optimization behavior sometimes is equivalent to UB. Consider in the foreseeable future, GCC can compile Rust but it does not eliminate the unnecessary copy even if in the release mode because copy elision is not guaranteed by the core language.

quinedot · June 7, 2023, 3:52am

That wouldn't be UB. If it is, the compiler has a bug around not detecting stack exhaustion. (If UB results with no dependencies or unsafe, it's a compiler or std bug.)

xmh0511 · June 7, 2023, 3:56am

Sorry, saying the behavior UB may make misleading. I want to say relying on optimization is not reliable, especially, in this case, it does not only affect performance but also whether the program can run.

quinedot · June 7, 2023, 3:57am

Yeah, that's true. I don't think anyone disagrees that some way to get the guarantees is desirable.

khimru · June 7, 2023, 12:02pm

Is this practical question or an attempt to find an excuse to not use Rust?

Guaranteed copy elision is an interesting property, but it's C++17 property which means most C++ codebases don't use it.

Similarly the ability to create object on heap without unsafe code is something that is nice in theory but quite problematic in practice. But we may hope that in year 2034, when Rust would be as old as C++17, that would be a solved problem. We are not there yet, though.

kpreid · June 7, 2023, 4:11pm

Don't let the future distract you from the present. Rust may not currently have placement new or guaranteed copy elision, but that doesn't mean you can't construct data with guaranteed placement on stable Rust; you just have to do it manually.

use std::alloc;
use std::ptr::addr_of_mut;

struct Big {
    d: [u8; 10_000_000],
}

const V2_COUNT: usize = 1_000_000;
struct Wrapper {
    v1: Big,
    v2: [f64; V2_COUNT],
}

impl Wrapper {
    fn new(val1: u8, val2: f64) -> Box<Self> {
        unsafe {
            // Allocate uninitialized memory
            let ptr = alloc::alloc(alloc::Layout::new::<Wrapper>()) as *mut Wrapper;
            
            // Initialize it field-by-field
            // SAFETY: `v1.d` and `v2` constitute all of the fields in `Wrapper`,
            // and we are initializing all of the bytes of each array.
            addr_of_mut!((*ptr).v1.d).write_bytes(val1, 1);
            for i in 0..V2_COUNT {
                addr_of_mut!((*ptr).v2[i]).write(val2);
            }
            
            // Convert to safe `Box`.
            // SAFETY:
            // * `ptr` was allocated with proper size and alignment using the global allocator.
            // * All non-padding bytes of `*ptr` have been initialized above.
            Box::from_raw(ptr)
        }
    }
}

fn main() {
    let w = Wrapper::new(123, 456.7);
    assert_eq!(w.v1.d[100], 123);
    assert_eq!(w.v2[100], 456.7);
}

(To help make sure I haven't made a mistake, I've tested this code with Miri, with the sizes set a bit smaller so it completes in reasonable time.)

This is, of course, not ideal since it contains unsafe code, but it means you can still use safe Rust for the rest of your program, rather than using a language without a safe subset.

khimru · June 7, 2023, 4:21pm

That's why I asked if that's an attempt to show that Rust is not yet “done” (is there any language which is actually 100% done?) or some kind of practical issue.

In practice using unsafe to allocate memory directly on heap is cumbersome and error-prone, but doable, and you couldn't have bazillion places in your code where you allocate 100Mb types (1000 times by 100Mb means you use 100Gb and most apps couldn't afford that). And that's nice practical answer.

In theory you may use that an excuse to avoid Rust and using C or C++ — by explaining how you, personally, can't use language which doesn't perform that rare-yet-sometimes-important trick perfectly. And that couldn't be refuted if you are interested in theoretical debate, not in practical work.

kpreid · June 7, 2023, 4:56pm

I find it very rarely constructive to speculate on someone's ulterior motives for asking a question. Whether one's guess is right or wrong, It frequently causes heated arguments and rarely changes any minds. This is the users forum, help category; I made my post because it felt like there wasn't enough providing the help that was actually asked for in the original post.

Fancyflame · June 7, 2023, 5:37pm

There is no general way to solve this with existing features. The only workaround, maybe is general, is to try using Box::new_uninit to allocate a space, initialize it, and use Box::assume_init to finish, just like people do in C. And please note that they both require nightly and assume_init is unsafe. Or you could try allocating memory manually via something in module std::alloc, that doesn't require nightly but unsafe is unavoidable.

But be realistic, a structure would actually cost as much as 100MB memory has nearly 100% possibility contains an array. Why not just simply use a Vec?

SebastianJL · June 7, 2023, 6:05pm

Shouldn't that be

addr_of_mut!((*ptr).v1.d).write_bytes(val1, 10_000_000);

instead?

kpreid · June 7, 2023, 6:59pm

Good question, but the answer is no; as the documentation for write_bytes() says:

Invokes memset on the specified pointer, setting count * size_of::<T>() bytes of memory starting at self to val.

Since d is a pointer to [u8; 10_000_000], it writes 1 * size_of::<[u8; 10_000_000]>() = 10,000,000 bytes.

Also, that's part of why I put w.v1.d[100] in the assertions: if the write_bytes was writing too few bytes, then Miri would have detected a read of uninitialized memory at that point, and non-Miri runs would most likely return garbage instead of the specific expected value.

Topic		Replies	Views
How to create large objects directly in heap help	15	10422	March 23, 2019
Allocating array on heap using Box::new([;]) results in stack overflow	20	1307	May 8, 2025
How to think about baseline performance around big structs	9	492	February 12, 2025
Construct a Box[[[i32; 32]; 32]; 32]> without hitting stack	7	1131	February 3, 2022
Stack overflow instantiating struct with array members help	4	9715	June 14, 2015

How to boxed struct with large size without stack-overflow?

Related topics