Why does putting an array in a Box cause stack overflow?

Thanks to their using #[inline(always)], although I admit I don't really know why that does not use a stack temporary when Box::new() does. I just know that last time I checked, using ::copyless::BoxAllocation was indeed a way to prevent temporaries existing in the stack.

This is because there is no branch between creating the value and putting it in the Box. With Box::new there is a branch to check for allocation failure. This prevents the optimization because LLVM can't put the array directly in the allocation, because the allocation may not exist. By splitting the allocation and initialization, LLVM can see there is a valid location to put the array, so it inlines it.
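To make the "split allocation from initialization" idea concrete, here is a minimal sketch using only the standard library. It is not how copyless is implemented; the function name boxed_zeroed_array and the size N are made up for illustration. The allocation failure check happens first, and only then is the array written directly into the heap memory, so no array-sized temporary is ever needed on the stack.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};

// Hypothetical size for illustration: 4 MiB, large enough to matter on the stack.
const N: usize = 4 * 1024 * 1024;

// Hypothetical helper: allocate first, then initialize in place.
fn boxed_zeroed_array() -> Box<[u8; N]> {
    let layout = Layout::new::<[u8; N]>();
    // SAFETY: the layout has a non-zero size.
    let ptr = unsafe { alloc(layout) } as *mut [u8; N];
    if ptr.is_null() {
        // The allocation-failure branch is taken before any value exists.
        handle_alloc_error(layout);
    }
    unsafe {
        // Initialize the whole array directly inside the allocation.
        ptr.write_bytes(0u8, 1);
        // SAFETY: ptr came from the global allocator with the layout of
        // [u8; N] and is now fully initialized.
        Box::from_raw(ptr)
    }
}

fn main() {
    let b = boxed_zeroed_array();
    println!("allocated {} bytes on the heap", b.len());
}
```

Whether Box::new itself avoids the stack temporary depends on the optimizer, so a sketch like this only shows the shape of the workaround, not a guarantee about any particular compiler version.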


Thanks. That is indeed a great example. Reduced to perhaps the minimum code possible. It exactly demonstrates my confusion!

  1. It says we have a u32 somewhere that will be known as 'a':
    "let a: u32"

  2. It says that 'a' is in memory somewhere and maybe uninitialized:
    "std::mem::MaybeUninit::uninit()"

  3. It says never mind that, it is initialized:
    ".assume_init()"

Even if whatever junk that memory contains is unknown, I am saying "assume it is initialized".

  4. But then it seems the compiler ignores my instruction to assume it is initialized and does not perform either of the subsequent tests on 'a'.

At my current understanding of things I would say this is a compiler bug. It is ignoring what I tell it.

There is much talk of LLVM and what it does here. I don't care. I want to know what the Rust language has to say about it. Not the implementation details.

Even if whatever junk that memory contains is unknown,

This contains an implicit assumption about implementation details: you believe that uninitialized means "in some valid but unknown state". This is an implementation detail. And, in fact, this assumption is incorrect: uninitialized memory may actually be in an invalid state, and this is the model used by LLVM.

There is much talk of LLVM and what it does here. I don't care. I want to know what the Rust language has to say about it. Not the implementation details.

The Rust language says, if you have not actually initialized a MaybeUninit, you must not call assume_init on it. Full stop, end of story. If you want to go beyond that, you are straying into implementation details, and since the implementation is built on the semantics of the LLVM abstract machine, that's where you're going.


It's not a bug, it's a feature. What you tell it is explicitly defined as a contract violation, and the compiler has the right to ignore it for the sake of faster code. Obviously, not checking the condition and not running the branch is faster than checking and running it. Accessing uninitialized memory is UB.

This is what UB means. The compiler is responsible for generating correct machine code only if the code never triggers UB during execution. The compiler optimizes code based on this contract. For example, in C with some int x, x < x + 1 will be replaced with TRUE. Of course it's not true if x is INT_MAX and the addition wraps around, but signed integer overflow in C is UB, so the compiler doesn't care.
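For contrast, here is a small Rust sketch of the same comparison. Rust does not make signed overflow UB: wrapping_add and checked_add give it defined results, so the compiler is not allowed to fold the comparison to true. The function name less_than_successor is made up for illustration.

```rust
// Hypothetical helper: x < x + 1 with explicitly wrapping arithmetic.
fn less_than_successor(x: i32) -> bool {
    x < x.wrapping_add(1)
}

fn main() {
    // True for ordinary values.
    assert!(less_than_successor(0));
    // False at the top of the range, because i32::MAX + 1 wraps to i32::MIN.
    assert!(!less_than_successor(i32::MAX));
    // checked_add reports the overflow instead of wrapping.
    assert_eq!(i32::MAX.checked_add(1), None);
    println!("all checks passed");
}
```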


There’s a lot of confusion here over what UB actually means and how it works in practice for optimized languages like Rust and C++, so I’d strongly recommend reading this:

https://www.ralfj.de/blog/2019/07/14/uninit.html

and probably half the other posts on that blog for good measure.


I can totally agree with that.

After all, if something is not initialized it effectively does not exist. I could well imagine the compiler can do what it likes if I subsequently reference such a non-existent thing.

BUT I have specifically written ".assume_init()"

To my naive mind the language has no right to ignore that command. It is not respecting my wish to assume it is initialized and continue as surely as if I had written "let a: u32 = 4;"

Now, I have no dispute with whatever Rust the language does. I do feel that 'assume_init()' does not have the effect its name suggests.

I don't know why you thought this method does something special, but it doesn't magically make the underlying value initialized. Please read the documentation carefully before writing unsafe code.

It is up to the caller to guarantee that the MaybeUninit<T> really is in an initialized state. Calling this when the content is not yet fully initialized causes immediate undefined behavior.

The naming of the method should be interpreted like this:

I promise to have initialized this value, so you can now assume it is initialized.

Of course, breaking your promises is undefined behaviour.
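A minimal sketch of keeping that promise: the value is written into the MaybeUninit first, and only then is assume_init called. The function name make_value is made up for illustration; MaybeUninit::uninit, write and assume_init are the standard library methods.

```rust
use std::mem::MaybeUninit;

// Hypothetical helper showing the intended order of operations.
fn make_value() -> u32 {
    // Reserve space for a u32 without initializing it.
    let mut slot = MaybeUninit::<u32>::uninit();
    // Actually initialize it.
    slot.write(4);
    // SAFETY: slot was fully initialized by the write above,
    // so the promise made by assume_init is true.
    unsafe { slot.assume_init() }
}

fn main() {
    let a = make_value();
    if a == 0 {
        println!("Hello 1");
    }
    if a > 0 {
        println!("Hello 2");
    }
}
```

With the write in place, the program has defined behaviour and exactly one of the two branches runs.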

But, but, I did not break my promise. I created it. It's in memory somewhere. It is initialized with whatever that memory contains at the time. That is how I want it initialized. Please compiler, when I say 'assume_init' then assume it is initialized and proceed as usual. It does not.

Or, if that is not what happens change the name of 'assume_init' so that readers are not confused.

What actually does 'assume_init' do?

The doc says " Extracts the value from the MaybeUninit<T> container."

But the following safety note says " It is up to the caller to guarantee that the MaybeUninit<T> really is in an initialized state."

OK, so I conclude that 'assume_init' should be called something like 'extract_value'.

With the safety note that the value may not be initialized and hence UB.

No, no it is not. It is not initialized with some unknown value. It is uninitialized. You must actually initialize it before you can instruct the compiler to assume it is initialized.

C works in fundamentally the same way and this program also prints neither message (when compiled with optimizations under Clang):

#include <stdio.h>

int main() {
    unsigned a; /* never initialized */
    
    if (a == 0) {
        puts("Hello 1");
    }
    if (a > 0) {
        puts("Hello 2");
    }
}

The first C standard was written to allow this kind of optimization around uninitialized data, all the way back in 1989. So at least for the last 30 years, "uninitialized" has never meant "initialized to an arbitrary value" or anything like that.


I mean, it's a bit like if you're telling the compiler driving a car to assume there's no tree in front of it. It can assume there is no tree all it wants; it'll still crash into the tree if there actually is a tree in the way.


I think that is actually my point.

If I actually have to initialize the thing before the language and/or compiler agrees that it is initialized then the method called "assume_init" is misnamed. It does not do what its name suggests it does.

I don't want to get diverted into the can of worms that C is except to say your C example is clear enough. There is an 'a'. It's not initialized. Therefore the conditionals following are undecidable. Very obviously UB.

Clearly the same applies to Rust. Which is all well and good.

Except that "assume_init" confuses things because it does no such thing.

The problem is that you are conflating your assumption about how the underlying hardware implementation behaves (which may be incorrect) with the specification of the non-hardware abstract machine to which optimizations are applied.

For example, with your logic you would expect that every register on an Intel Itanium processor contains a defined value. Surprise – the integer registers each have an associated NaT (Not a Thing) status bit that tracks whether or not they do contain a defined value.


Actually, I think I'm doing exactly the opposite.

I have worked on enough architectures to have realized that the real hardware is not the high level language I am using. Starting with simple issues like the size of 'int', the endianness of memory layout and so on. And so on and so on...

In this case I am a new kid on the block. I read the Rust language docs. I find 'assume_init()' does not do what its name suggests it does.

Is this some kind of English English vs American English misunderstanding or what?

I think the confusion here arises due to the limited scope of the extra powers granted by the unsafe keyword. unsafe is the means by which a programmer tells the compiler that the programmer is committing to provide part of the proof of correctness.

unsafe is properly used in situations where the complexity of what the programmer is attempting is beyond the logical-proposition-proving capabilities of the compiler. assume_init() is supposed to be used within that context to tell the compiler that the object has been initialized even though the compiler is unable to prove that claim on its own. It is not supposed to be used to lie to the compiler; that is almost always instant UB.

Addendum: In the above, by "compiler" I meant the compiler front-end, whether for Rust or C or C++. The common compiler backend, LLVM, does not care about claims of assume_init(); it tracks every assignment and deduces independently whether initialization actually occurred or not. UB occurs when the programmer feeds garbage to LLVM; the result is usually GIGO.


Regarding the C example, it can be quite interesting to notice where the undefined behaviour is triggered. In C it is triggered here:

#include <stdio.h>

int main() {
    unsigned a;
    
    if (a == 0) { // UB! Read of uninitialized data
        puts("Hello 1");
    }
    if (a > 0) {
        puts("Hello 2");
    }
}

While in Rust it is triggered here:

fn main() {
    // UB! The value is not initialized
    let a: u32 = unsafe { std::mem::MaybeUninit::uninit().assume_init() };
    
    if a == 0 {
        println!("Hello 1");
    }
    if a > 0 {
        println!("Hello 2");
    }
}

This distinction is quite interesting. Fundamentally there are two approaches to avoiding unsoundness:

  1. Don't do things on bad data.
  2. Don't let data get into a bad state in the first place.

C prefers the former, while Rust prefers the latter. For example, you could use Vec::from_raw_parts to create a vector, and you could give it e.g. a pointer to the middle of an allocation. Naturally this will do bad stuff if you push an item and it reallocates, but technically you would not trigger unsoundness if you just index into it and then throw the vector away without running the destructor.

Still, we say that the point at which you triggered undefined behavior is when you called from_raw_parts, and not when you pushed an item to it.

Why do things this way? Basically it's because it moves the cause of the UB into the code that uses unsafe.
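As a sketch of what upholding the invariants at the from_raw_parts call looks like, here is the usual round trip: take a Vec apart and rebuild it from exactly the pointer, length and capacity it already had. The function name roundtrip is made up for illustration.

```rust
use std::mem::ManuallyDrop;

// Hypothetical helper: decompose a Vec and rebuild it from its own parts.
fn roundtrip(v: Vec<u32>) -> Vec<u32> {
    // Prevent the original Vec from freeing the buffer.
    let mut v = ManuallyDrop::new(v);
    let ptr = v.as_mut_ptr();
    let len = v.len();
    let cap = v.capacity();
    // SAFETY: ptr, len and cap come from a live Vec<u32> whose destructor
    // will not run, so every invariant from_raw_parts requires holds here.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}

fn main() {
    let mut v = roundtrip(vec![1, 2, 3]);
    // Pushing may reallocate; that is fine because ptr is the start of the allocation.
    v.push(4);
    println!("{:?}", v);
}
```

All the reasoning about validity sits next to the unsafe block; code that later pushes to the vector does not need to know where it came from.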

Regarding your loop: yeah, the compiler probably ends up doing what you think it did. But it's still undefined behaviour, and that means all backwards compatibility guarantees are off. The compiler could initialize them all to 10, and you wouldn't have the right to complain about that. When there's a very similar version of the code that actually has defined behaviour, you should just use that instead if you care about correctness at all.

The loop is actually an especially dangerous case, because unlike with from_raw_parts, the unsoundness comes from invariants enforced by the compiler, not just invariants humans were thinking about when they wrote the implementation of push.
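The loop being discussed is not shown here, so the following is only a guess at its shape: a sketch that assumes it filled an array element by element. The defined-behaviour version keeps every element wrapped in MaybeUninit until the loop has written all of them. The function name fill_squares and the length 8 are made up for illustration.

```rust
use std::mem::{self, MaybeUninit};

// Hypothetical helper: fill an array element by element without UB.
fn fill_squares() -> [u32; 8] {
    // An array of uninitialized slots. This is fine because each element
    // is itself a MaybeUninit, which is allowed to be uninitialized.
    let mut slots = [MaybeUninit::<u32>::uninit(); 8];
    for (i, slot) in slots.iter_mut().enumerate() {
        // Initialize each slot in turn.
        slot.write((i * i) as u32);
    }
    // SAFETY: every slot was written by the loop above, and
    // [MaybeUninit<u32>; 8] has the same layout as [u32; 8].
    unsafe { mem::transmute::<[MaybeUninit<u32>; 8], [u32; 8]>(slots) }
}

fn main() {
    println!("{:?}", fill_squares());
}
```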


I confess I am a little confused here, too, as assume_init seems to me to be a perfectly descriptive name for what it does. It does not make previously uninitialized memory become initialized. It does tell the compiler to assume the memory is initialized. If the assumption is wrong, the compiler may miscompile it, of course.


That is exactly how I expect it to be used.

I don't want to lie to the compiler here.

I assume it has created my thing somewhere in memory. There are no UB warnings on that.

I then say "assume it is initialized" with 'assume_init()'

I then find the language does not do what I say.

In all of my comments I don't care about what hardware architecture we have or the implementation details of the compiler, of which I hope LLVM is only one in the long run.

I only consider what the source text actually says and what the language thinks it means.

Acknowledged. Reading the source of alice's example would not suggest otherwise.

No. It does not.

The compiler ignores it. As demonstrated by alice's playground example above.