How to boxed struct with large size without stack-overflow?

struct 100MbType{ /*xxx*/ }
struct Wrapper{
   v1: 100MbType,
   v2: [f64;1000000]
}
fn main(){
   let b = Box::new(Wrapper{ /*init*/ });
}

No exception, the code will make the stack space exhausted, which means, the code can cause a stack overflow, at least in Debug mode. In release mode, LLVM may eliminate the unnecessary copy, that is, first create the temporary object in the stack then copies the temporary object to the heap space allocated by Box, instead, it directly constructs the object in the heap space.

This concept is called guaranteed copy elision in C++. However, in Rust, we do not have this guarantee at the language level. How to solve this case in Rust?

Moreover, even though in unsafe code, that is, using std::ptr::write, which also cannot avoid first constructing the second parameter of std::ptr::write, then write/copy the data from that parameter to the destination address specified by the first parameter, which means, we cannot find an equivalent way in Rust that equals to placement new in C++.

There's some discussion including workarounds in the below issue. You're correct that there is no placement new.

1 Like

After a glance at the issue, I think the workaround in the issue is not enough generalization, it's only for arrays, however, there can be other data structures with large sizes. The guaranteed copy elision is crucial here.

There's more than one workaround, some more general than arrays. But perhaps not general enough for your case.

There's no guaranteed copy elision. If you can't preallocate and write piece by piece or the like, you'll be relying on optimization to kick in.

1 Like
1 Like

Why don't you copy the data piecewise, then? You don't need to write everything at once.

The RFC seems only to guarantee the return value in the function call, however, a generalized guaranteed copy elision will include the following case:

let v = A{field1:xxx, field2:yyy};

That is, directly initialize v instead of creating a temporary object and copy/memcpy the temporary object to v.

This is the question, which means the program may be problematic in the Debug mode.

You can enable optimizations for the debug profile. (Sorry for not looking it up.)

1 Like

However, relying on the optimization behavior sometimes is equivalent to UB. Consider in the foreseeable future, GCC can compile Rust but it does not eliminate the unnecessary copy even if in the release mode because copy elision is not guaranteed by the core language.

That wouldn't be UB. If it is, the compiler has a bug around not detecting stack exhaustion. (If UB results with no dependencies or unsafe, it's a compiler or std bug.)

1 Like

Sorry, saying the behavior UB may make misleading. I want to say relying on optimization is not reliable, especially, in this case, it does not only affect performance but also whether the program can run.

Yeah, that's true. I don't think anyone disagrees that some way to get the guarantees is desirable.

Is this practical question or an attempt to find an excuse to not use Rust?

Guaranteed copy elision is an interesting property, but it's C++17 property which means most C++ codebases don't use it.

Similarly the ability to create object on heap without unsafe code is something that is nice in theory but quite problematic in practice. But we may hope that in year 2034, when Rust would be as old as C++17, that would be a solved problem. We are not there yet, though.

3 Likes

Don't let the future distract you from the present. Rust may not currently have placement new or guaranteed copy elision, but that doesn't mean you can't construct data with guaranteed placement on stable Rust; you just have to do it manually.

use std::alloc;
use std::ptr::addr_of_mut;

struct Big {
    d: [u8; 10_000_000],
}

const V2_COUNT: usize = 1_000_000;
struct Wrapper {
    v1: Big,
    v2: [f64; V2_COUNT],
}

impl Wrapper {
    fn new(val1: u8, val2: f64) -> Box<Self> {
        unsafe {
            // Allocate uninitialized memory
            let ptr = alloc::alloc(alloc::Layout::new::<Wrapper>()) as *mut Wrapper;
            
            // Initialize it field-by-field
            // SAFETY: `v1.d` and `v2` constitute all of the fields in `Wrapper`,
            // and we are initializing all of the bytes of each array.
            addr_of_mut!((*ptr).v1.d).write_bytes(val1, 1);
            for i in 0..V2_COUNT {
                addr_of_mut!((*ptr).v2[i]).write(val2);
            }
            
            // Convert to safe `Box`.
            // SAFETY:
            // * `ptr` was allocated with proper size and alignment using the global allocator.
            // * All non-padding bytes of `*ptr` have been initialized above.
            Box::from_raw(ptr)
        }
    }
}

fn main() {
    let w = Wrapper::new(123, 456.7);
    assert_eq!(w.v1.d[100], 123);
    assert_eq!(w.v2[100], 456.7);
}

(To help make sure I haven't made a mistake, I've tested this code with Miri, with the sizes set a bit smaller so it completes in reasonable time.)

This is, of course, not ideal since it contains unsafe code, but it means you can still use safe Rust for the rest of your program, rather than using a language without a safe subset.

9 Likes

That's why I asked if that's an attempt to show that Rust is not yet β€œdone” (is there any language which is actually 100% done?) or some kind of practical issue.

In practice using unsafe to allocate memory directly on heap is cumbersome and error-prone, but doable, and you couldn't have bazillion places in your code where you allocate 100Mb types (1000 times by 100Mb means you use 100Gb and most apps couldn't afford that). And that's nice practical answer.

In theory you may use that an excuse to avoid Rust and using C or C++ β€” by explaining how you, personally, can't use language which doesn't perform that rare-yet-sometimes-important trick perfectly. And that couldn't be refuted if you are interested in theoretical debate, not in practical work.

2 Likes

I find it very rarely constructive to speculate on someone's ulterior motives for asking a question. Whether one's guess is right or wrong, It frequently causes heated arguments and rarely changes any minds. This is the users forum, help category; I made my post because it felt like there wasn't enough providing the help that was actually asked for in the original post.

10 Likes

There is no general way to solve this with existing features. The only workaround, maybe is general, is to try using Box::new_uninit to allocate a space, initialize it, and use Box::assume_init to finish, just like people do in C. And please note that they both require nightly and assume_init is unsafe. Or you could try allocating memory manually via something in module std::alloc, that doesn't require nightly but unsafe is unavoidable.

But be realistic, a structure would actually cost as much as 100MB memory has nearly 100% possibility contains an array. Why not just simply use a Vec?

Shouldn't that be

addr_of_mut!((*ptr).v1.d).write_bytes(val1, 10_000_000);

instead?

Good question, but the answer is no; as the documentation for write_bytes() says:

Invokes memset on the specified pointer, setting count * size_of::<T>() bytes of memory starting at self to val.

Since d is a pointer to [u8; 10_000_000], it writes 1 * size_of::<[u8; 10_000_000]>() = 10,000,000 bytes.

Also, that's part of why I put w.v1.d[100] in the assertions: if the write_bytes was writing too few bytes, then Miri would have detected a read of uninitialized memory at that point, and non-Miri runs would most likely return garbage instead of the specific expected value.

2 Likes