UnsafeCell<[u32; 42069]> does not initialize on the heap in smart pointers with opt-level = 0

Suppose I have structs S1 and S2.

use std::cell::UnsafeCell;

#[derive(Default)]
struct S1<const N: usize> {
  data: [u32; N],
}

#[derive(Default)]
struct S2<const N: usize> {
  data: UnsafeCell<[u32; N]>,
}

For a sufficiently large N, both structs will overflow the stack when instantiated. One should therefore allocate them on the heap, for example in a smart pointer: let s1: Box<S1> = Default::default(). The compiler is smart and will do that on the heap right away for S1.
But with let s2: Box<S2> = Default::default() and opt-level = 0 (which is the default for non --release builds) the compiler tries to create the struct on the stack and copies it over to the heap, which overflows the stack midway.
With opt-level >= 1 the compiler regains the ability to instantiate directly on the heap.

Soooo this basically breaks non-optimized builds. I know the docs of UnsafeCell say that it inhibits niche optimizations, but is there any way to help the compiler understand that, even at opt-level = 0?


I can probably create a Box<S1> and std::mem::transmute it into a Box<S2>, but that feels wrong and requires me to keep S1 around and keep both in sync.

Yes, Rust has no guaranteed emplacement. The boxed-array-on-stack problem and its equivalents are well known. You could instead do some dance with Vec::with_capacity, Vec::resize_with, and try_from.
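
A minimal sketch of that dance, assuming a plain Box<[u32; N]> is the goal. The helper name boxed_array is made up for illustration. The elements are written into the Vec's heap buffer, and only pointers travel over the stack:

```rust
use std::convert::TryFrom;

// Hypothetical helper: builds a zeroed Box<[u32; N]> without ever
// materializing the whole array as a stack value.
fn boxed_array<const N: usize>() -> Box<[u32; N]> {
    // Reserve the full heap buffer up front.
    let mut v: Vec<u32> = Vec::with_capacity(N);
    // Fill it element by element, in place on the heap.
    v.resize_with(N, Default::default);
    // Vec<u32> -> Box<[u32]> keeps the same heap allocation here,
    // because length and capacity already match.
    let slice: Box<[u32]> = v.into_boxed_slice();
    // Box<[u32]> -> Box<[u32; N]> only checks the length.
    match Box::<[u32; N]>::try_from(slice) {
        Ok(array) => array,
        Err(_) => unreachable!("the slice has exactly N elements"),
    }
}

fn main() {
    let big = boxed_array::<42069>();
    println!("len = {}", big.len());
}
```

The conversion fails only if the slice length differs from N, which cannot happen in this helper.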

Another thing you could consider is changing your debug config to opt-level = 1. Its compile-time impact is low, but it makes a massive difference in codegen quality.
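
For reference, that setting would look roughly like this in Cargo.toml (profile.dev is the profile used by builds without --release):

```toml
[profile.dev]
opt-level = 1
```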

My default answer is that you probably want [UnsafeCell<u32>] instead of UnsafeCell<[u32; 42069]>, though.

Thanks!

opt-level = 1 was my current workaround. But moving UnsafeCell one layer in is a nice solution.

Edit: You sure your second suggestion works? It failed on me just now :(

You'll still need to go through Vec to construct it -- perhaps indirectly by using Box::from_iter.

Similar problem: Allocate a boxed array of MaybeUninit - #4 by scottmcm

Then you can .try_into().unwrap() the Box<[T]> to get the Box<[T; N]>.
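
Put together, a sketch of this route could look like the following. The function name boxed_cells is an assumption for illustration, and collect() is used here because it goes through the same FromIterator impl that Box::from_iter relies on:

```rust
use std::cell::UnsafeCell;
use std::convert::TryInto;

// Hypothetical helper: a boxed, fixed-size array of individually
// wrapped cells, built element by element on the heap.
fn boxed_cells<const N: usize>() -> Box<[UnsafeCell<u32>; N]> {
    // Collect into a boxed slice first; this goes through Vec internally.
    let slice: Box<[UnsafeCell<u32>]> = std::iter::repeat_with(|| UnsafeCell::new(0u32))
        .take(N)
        .collect();
    // The length is exactly N, so the conversion cannot fail.
    slice.try_into().unwrap()
}

fn main() {
    let cells = boxed_cells::<42069>();
    // UnsafeCell permits writing through a shared reference; this is
    // sound here because no other reference to the element exists.
    unsafe { *cells[7].get() = 99 };
    println!("cells[7] = {}", unsafe { *cells[7].get() });
}
```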

The way you use "niche optimizations" makes me feel you interpret it as "rarely used optimizations applicable only in niche use cases". That's not what this phrase means. It's a term of art (for which I, surprisingly, can't find any reference) that means optimizing data layout based on the existence of niche values. A niche value is a bit pattern which doesn't correspond to any valid value of the type. Think 0 for NonZeroU32, or the null pointer for &T. Since these values don't exist at runtime (under threat of UB), the compiler can use them to represent some other state of the data. For example, it's guaranteed that Option<NonZeroU32> is bitwise the same as u32, with 0: u32 corresponding to the None variant. Similarly, Option<&T> is the same as a *mut T, with None corresponding to the null pointer.

UnsafeCell<T> doesn't expose the niches in T, because that would lead to unsoundness. The reason is basically that the discriminant of the wrapping Option is, conceptually, outside of the contained type and obeys the usual aliasing rules. In particular, it (like all other data) cannot be mutated while the Option<T> is immutably borrowed. However, data inside of UnsafeCell<T> can be mutated under an immutable reference, and the niche optimization would overlay the discriminant of Option<UnsafeCell<T>> onto the contents of UnsafeCell<T>, leading to a contradiction and a safety violation.
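
A small sketch that makes the layout effect visible with size_of. The first three sizes follow from the guarantees described above and from ordinary alignment rules. The last one is only printed, since the exact size of an Option around an UnsafeCell is an assumption about current compiler behaviour:

```rust
use std::cell::UnsafeCell;
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // The niche (bit pattern 0) encodes None, so no extra tag is needed.
    assert_eq!(size_of::<Option<NonZeroU32>>(), 4);
    // u32 has no niche: a separate tag plus padding is required.
    assert_eq!(size_of::<Option<u32>>(), 8);
    // References are never null, so None can reuse the null pointer.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<*const u8>());
    // With UnsafeCell in between, the niche of NonZeroU32 is hidden,
    // so this is expected to be larger than 4 bytes.
    println!(
        "Option<UnsafeCell<NonZeroU32>>: {} bytes",
        size_of::<Option<UnsafeCell<NonZeroU32>>>()
    );
}
```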

I know the compiler gets smart on memory representation for enums.
So your Option<NonZeroU32> only requires 4 bytes. Without niche optimization this would be more (probably 8 bytes).

Anyway, neither u32 nor [u32; N] has any niches to optimize.
I acknowledge that this fact might be hidden from the compiler by UnsafeCell<T>.

It is not clear to me why the compiler refuses to initialize the array directly on the heap, especially since size and alignment are known.

It does, though? Playground, works fine if you compile with --release.

But of course, it's just an optimization, so you shouldn't rely on it for correctness, and shouldn't directly instantiate such huge structs in your code. If you expect huge buffers, use Vec, or an ArrayVec if you want to keep the fixed-size guarantee explicit.

Exactly, that's the point! :smiley:
--release raises the opt-level to >= 1 and therefore allows the compiler to figure out that it can initialize the struct on the heap.

I wondered if it's a bug or something to be expected. And I am not brave enough to open an issue on GitHub, so I thought I should ask here first. :sweat_smile:

It's not refusal or a bug. It's the lack of a feature (Rust has no guaranteed emplacement). Even in release mode, it's not guaranteed. An issue already exists (which I linked to earlier).

Well, it depends how you write it. Optimizations don't fix things in unoptimized builds.

But you can write it in a way that does write the values directly on the heap:

Where you can look at all the allocas (stack allocations) and see that there's no big ones, despite no optimizations.
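
One possible way to get that behaviour, sketched here as an assumption rather than as the code the reply linked to, is to ask the allocator for zeroed memory and wrap the pointer in a Box. The helper boxed_zeroed_s2 and the accessor get are hypothetical names. The approach relies on all-zero bytes being a valid value for [u32; N]:

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};
use std::cell::UnsafeCell;

struct S2<const N: usize> {
    data: UnsafeCell<[u32; N]>,
}

impl<const N: usize> S2<N> {
    // Hypothetical accessor: reads one element through the cell.
    fn get(&self, i: usize) -> u32 {
        // Sound as long as nobody is writing concurrently.
        unsafe { (*self.data.get())[i] }
    }
}

// Hypothetical helper: allocates S2<N> directly on the heap, zeroed.
fn boxed_zeroed_s2<const N: usize>() -> Box<S2<N>> {
    let layout = Layout::new::<S2<N>>();
    // The global allocator must not be called with a zero-sized layout.
    assert!(layout.size() > 0, "N must be non-zero");
    unsafe {
        let ptr = alloc_zeroed(layout) as *mut S2<N>;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        // All-zero bytes are a valid [u32; N], and the pointer came from
        // the global allocator with the layout of S2<N>.
        Box::from_raw(ptr)
    }
}

fn main() {
    let s2 = boxed_zeroed_s2::<42069>();
    println!("last element = {}", s2.get(42068));
}
```

Only a pointer and a Layout live on the stack in this version, regardless of opt-level.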
