Why doesn't the niche optimization work for this enum?

Code:

#![allow(dead_code)]

enum Item<T> {
    Allocated(T),
    Free { next: *mut T },
}

struct Important {
    data: u64,
    foo: i32,
    bar: u64,
    other: f64,
    stuff: std::num::NonZeroU64,
    other_stuff: u8,
}

fn main() {
    println!("{}", std::mem::size_of::<Important>());
    println!("{}", std::mem::size_of::<Item<Important>>());
}

I would naively expect the code above to print the same number twice, but it appears that wrapping my struct in an enum makes it 8 bytes larger (40 vs 48).

Shouldn't Rust take advantage of the fact that I'm storing a NonZeroU64 inside Important, so that it can store the enum discriminant there?

Edit: interestingly, making next a NonNull doesn't help either! Why?
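For reference, the NonNull variant from the edit might look like the sketch below. ItemNonNull is a hypothetical name used only here. No expected sizes are shown, because the layout of a default-repr enum is not guaranteed and may differ between compiler versions and targets.

```rust
use std::mem::size_of;
use std::num::NonZeroU64;
use std::ptr::NonNull;

// Hypothetical variant of `Item` where `next` can never be null.
#[allow(dead_code)]
enum ItemNonNull<T> {
    Allocated(T),
    Free { next: NonNull<T> },
}

// Same fields as `Important` in the question.
#[allow(dead_code)]
struct Important {
    data: u64,
    foo: i32,
    bar: u64,
    other: f64,
    stuff: NonZeroU64,
    other_stuff: u8,
}

fn main() {
    // Both numbers are whatever the current compiler chooses.
    println!("Important:              {}", size_of::<Important>());
    println!("ItemNonNull<Important>: {}", size_of::<ItemNonNull<Important>>());
}
```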

A raw pointer can have any value, so no matter what is in Allocated, the raw pointer field always needs a separate tag.


enum Item {
    Nothing,
    Free { next: *mut () },
}

fn main() {
    println!("{}", std::mem::size_of::<Item>());
}

This prints 16 (2x usize) on a 64-bit target.

In this case however, it could be laid out like so:

union {
    [u64, u64, f64, NonZeroU64, i32, u8], // Allocated
    [usize, ...] // Free
}

and the NonZeroU64 could be set to 0 in the Free variant. However, this doesn't fall under the guaranteed niche optimization (which is exactly what @kornel described), so there is no guarantee that this optimization will be done (and in fact it isn't done).

That doesn't explain it though. Allocated is much larger than Free, so Free should be able to overlap with almost any part of Important, except wherever we stick the discriminant.

For example, next could be made to overlap with data, and then the discriminant could be stored inside stuff. All within 40 bytes.

@RustyYato I guess I was assuming that if the optimization isn't done, it's because either it isn't allowed for some reason or it isn't implemented yet. If the latter, I was hoping there is some way for me to know at a glance whether I'm going to get it or not. You said it's not the "guaranteed" kind. Is there documentation somewhere of when it is guaranteed? I googled "rust guaranteed niche optimization" and didn't immediately find anything except some specific tickets about situations where it doesn't work yet.

The only guaranteed niche filling optimization is if you have an enum that fits the template below exactly, and the wrapped type has a niche. There are more enums that will have niche filling optimizations applied, but none of those are guaranteed.

enum SomeEnum {
    NoDataField, // not zero-sized field, no data at all
    Something(SomeTypeWithNiche)
}
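A quick way to check the template is to compare sizes. In this sketch NonZeroU64 stands in for SomeTypeWithNiche, and NoNiche is a hypothetical counter-example whose payload has no spare bit pattern. Only the Option case is a documented guarantee; the check on SomeEnum reflects what rustc does in practice for enums of this shape.

```rust
use std::mem::size_of;
use std::num::NonZeroU64;

// Matches the template: one unit variant plus one variant wrapping a niche type.
#[allow(dead_code)]
enum SomeEnum {
    NoDataField,
    Something(NonZeroU64),
}

// Same shape, but `u64` has no invalid bit pattern to reuse (hypothetical example).
#[allow(dead_code)]
enum NoNiche {
    NoDataField,
    Something(u64),
}

fn main() {
    // Documented guarantee for Option.
    assert_eq!(size_of::<Option<NonZeroU64>>(), size_of::<NonZeroU64>());
    // Option-shaped enum: the zero value encodes `NoDataField`.
    assert_eq!(size_of::<SomeEnum>(), size_of::<NonZeroU64>());
    // Every u64 bit pattern is taken, so a separate tag is unavoidable.
    assert!(size_of::<NoNiche>() > size_of::<u64>());
    println!("SomeEnum: {}, NoNiche: {}", size_of::<SomeEnum>(), size_of::<NoNiche>());
}
```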

Can a raw pointer have any value? See:

Nikita Popov: "Pointer magic for efficient dynamic value representations".

Yes, in Rust a raw pointer can have any value a usize can have (because you can cast a usize to raw pointer). This doesn't mean you can dereference any raw pointer.
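A small sketch of that point: any usize can be cast to a raw pointer, so there is no spare bit pattern for Option to use, whereas NonNull gives up the null value and gets the niche. The function name is made up for this example, and none of the pointers is ever dereferenced.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical helper: cast an integer to a raw pointer and back again.
fn roundtrip(addr: usize) -> usize {
    let p = addr as *mut u8; // creating the pointer is safe; dereferencing it would not be
    p as usize
}

fn main() {
    // Every usize value survives the trip, including 0 and usize::MAX.
    assert_eq!(roundtrip(0), 0);
    assert_eq!(roundtrip(usize::MAX), usize::MAX);
    assert!((0usize as *mut u8).is_null());

    // No niche in a raw pointer, so Option needs extra space for its tag.
    assert!(size_of::<Option<*mut u8>>() > size_of::<*mut u8>());
    // NonNull excludes null, so None can be stored as the null pattern.
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<*mut u8>());
}
```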

That's unfortunate. It means that in memory- or cache-constrained situations it's still worth doing a lot of ugly union hacking. So basically it is only guaranteed to work for Option or anything that exactly resembles Option.
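To give an idea of what such union hacking could look like, here is a rough sketch, not a recommendation. It assumes you are free to make Important #[repr(C)] and move stuff to the front so its offset is fixed at 0. Slot, FreeRepr, and all method names are hypothetical. A zero in the first 8 bytes marks a free slot, which works because stuff can never be zero.

```rust
use std::mem::size_of;
use std::num::NonZeroU64;

#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Important {
    stuff: NonZeroU64, // offset 0, never zero
    data: u64,
    bar: u64,
    other: f64,
    foo: i32,
    other_stuff: u8,
}

#[repr(C)]
#[derive(Clone, Copy)]
struct FreeRepr {
    tag: u64, // offset 0, overlaps `stuff`; always 0 for a free slot
    next: *mut Important,
}

#[repr(C)]
union Slot {
    allocated: Important,
    free: FreeRepr,
}

impl Slot {
    fn new_allocated(value: Important) -> Self {
        Slot { allocated: value }
    }

    fn new_free(next: *mut Important) -> Self {
        Slot { free: FreeRepr { tag: 0, next } }
    }

    fn is_free(&self) -> bool {
        // SAFETY: both constructors initialise the first 8 bytes,
        // and every bit pattern is a valid u64.
        unsafe { self.free.tag == 0 }
    }

    fn get(&self) -> Option<&Important> {
        // SAFETY: a non-zero tag means the slot was built by `new_allocated`.
        if self.is_free() { None } else { Some(unsafe { &self.allocated }) }
    }

    fn next_free(&self) -> Option<*mut Important> {
        // SAFETY: a zero tag means the slot was built by `new_free`.
        if self.is_free() { Some(unsafe { self.free.next }) } else { None }
    }
}

fn sample() -> Important {
    Important {
        stuff: NonZeroU64::new(7).unwrap(),
        data: 1,
        bar: 2,
        other: 3.0,
        foo: 4,
        other_stuff: 5,
    }
}

fn main() {
    // The hand-rolled slot is no larger than the payload itself.
    assert_eq!(size_of::<Slot>(), size_of::<Important>());
    let a = Slot::new_allocated(sample());
    let f = Slot::new_free(std::ptr::null_mut());
    println!("allocated: {:?}, free next: {:?}", a.get(), f.next_free());
}
```

The cost is that all the safety reasoning moves into your own code, and the type loses the compiler's freedom to reorder fields.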

What does it mean to have a guarantee here? As long as you aren't using an explicit #[repr(...)] there is no ABI, so isn't the only guarantee whatever rustc happens to implement?

Yeah, it would be sweet if Rust had a type-level understanding of aligned/unaligned pointers like Zig does; then you could use the low bits for discriminants. As I understand it, Rust does require references to point to aligned memory, so in theory it could use the low bits in that case, though there is probably less demand for that.
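Until the compiler does this for you, the low bits can be used by hand. The following is a minimal sketch, assuming a pointee type with a known alignment of 8; Node, TaggedPtr, and roundtrip are hypothetical names. It offsets the pointer with wrapping_add and wrapping_sub instead of going through an integer, so the pointer can still be dereferenced after the tag is removed.

```rust
use std::mem::align_of;

// Forcing 8-byte alignment leaves the low 3 bits of any valid address zero.
#[repr(align(8))]
struct Node {
    value: u32,
}

// A pointer to `Node` with a small tag hidden in the low bits.
struct TaggedPtr(*mut u8);

impl TaggedPtr {
    const TAG_MASK: usize = align_of::<Node>() - 1;

    fn new(ptr: *mut Node, tag: usize) -> Self {
        assert_eq!((ptr as usize) & Self::TAG_MASK, 0, "pointer must be aligned");
        assert!(tag <= Self::TAG_MASK, "tag does not fit in the spare bits");
        TaggedPtr(ptr.cast::<u8>().wrapping_add(tag))
    }

    fn tag(&self) -> usize {
        (self.0 as usize) & Self::TAG_MASK
    }

    fn ptr(&self) -> *mut Node {
        // Undo the offset to recover the original, aligned pointer.
        self.0.wrapping_sub(self.tag()).cast::<Node>()
    }
}

// Tags a freshly boxed node, then reads the value and the tag back.
fn roundtrip(value: u32, tag: usize) -> (u32, usize) {
    let raw = Box::into_raw(Box::new(Node { value }));
    let tagged = TaggedPtr::new(raw, tag);
    let seen_tag = tagged.tag();
    // SAFETY: `ptr()` returns exactly the pointer produced by `Box::into_raw`.
    let node = unsafe { Box::from_raw(tagged.ptr()) };
    (node.value, seen_tag)
}

fn main() {
    let (value, tag) = roundtrip(42, 5);
    println!("value = {value}, tag = {tag}");
}
```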

Exactly

I thought this was documented in the reference, but apparently not. The nomicon has a few words about it:
https://doc.rust-lang.org/nightly/nomicon/other-reprs.html

You may be interested in

  1. It's allowed to store unaligned pointers. Otherwise std::ptr::read_unaligned() would not make any sense.

  2. Note the article was written in 2012. At that point every x86_64 machine supported addresses of up to 48 bits, but that is no longer true. Intel has since added 5-level paging support, and the Linux kernel has supported it since 2016. Now we can use 57 bits, and I'm pretty sure it will be extended to the full 64 bits within a few years.
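To illustrate the first point, here is a small sketch of reading through a pointer that is not aligned for its pointee type. read_u32_at is a hypothetical helper written for this example; std::ptr::read_unaligned is the standard library function mentioned above.

```rust
// Reads a native-endian u32 starting at `offset`, whatever its alignment.
fn read_u32_at(bytes: &[u8], offset: usize) -> u32 {
    assert!(offset + 4 <= bytes.len(), "read would go out of bounds");
    // This pointer may be misaligned for u32; merely holding it is fine.
    let p = bytes.as_ptr().wrapping_add(offset).cast::<u32>();
    // SAFETY: the range was bounds-checked above, and `read_unaligned`
    // has no alignment requirement.
    unsafe { std::ptr::read_unaligned(p) }
}

fn main() {
    let bytes = [0xAAu8, 1, 2, 3, 4, 0xBB];
    // Offsets 0 and 1 cannot both be 4-byte aligned, so at least one of
    // these reads goes through an unaligned pointer.
    let first = read_u32_at(&bytes, 0);
    let second = read_u32_at(&bytes, 1);
    assert_eq!(first, u32::from_ne_bytes([0xAA, 1, 2, 3]));
    assert_eq!(second, u32::from_ne_bytes([1, 2, 3, 4]));
}
```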

Also, from the core::option module documentation:

Representation

Rust guarantees to optimize the following types T such that Option<T> has the same size as T:

It is further guaranteed that, for the cases above, one can mem::transmute from all valid values of T to Option<T> and from Some::<T>(_) to T (but transmuting None::<T> to T is undefined behaviour).
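A minimal sketch of what that transmute guarantee permits, using NonZeroU32 as the type T (the NonZero integer types are among those the option docs list). The two helper functions are hypothetical and exist only to show the direction of each transmute.

```rust
use std::mem::transmute;
use std::num::NonZeroU32;

// T -> Option<T>: every valid NonZeroU32 is a valid Some(_).
fn wrap(n: NonZeroU32) -> Option<NonZeroU32> {
    // SAFETY: covered by the documented guarantee quoted above.
    unsafe { transmute::<NonZeroU32, Option<NonZeroU32>>(n) }
}

// Some(T) -> T: only sound when the value really is Some.
fn unwrap_some(o: Option<NonZeroU32>) -> NonZeroU32 {
    assert!(o.is_some(), "transmuting None to T would be undefined behaviour");
    // SAFETY: checked above that `o` is Some.
    unsafe { transmute::<Option<NonZeroU32>, NonZeroU32>(o) }
}

fn main() {
    let five = NonZeroU32::new(5).unwrap();
    assert_eq!(wrap(five), Some(five));
    assert_eq!(unwrap_some(Some(five)), five);
}
```

In ordinary code `Some(n)` and `o.unwrap()` do the same job without unsafe; the guarantee matters mostly for FFI and for reinterpreting memory in place.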

The latest release notes (1.50) also mention a niche for Option<File> on Unix systems: Announcing Rust 1.50.0 | Rust Blog
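This one is easy to check on your own machine. The sketch below only asserts the equality on Unix, since that is the only platform the release notes mention; option_file_is_same_size is a hypothetical helper name.

```rust
use std::fs::File;
use std::mem::size_of;

// True when Option<File> needs no extra space for its discriminant.
fn option_file_is_same_size() -> bool {
    size_of::<Option<File>>() == size_of::<File>()
}

fn main() {
    println!(
        "File: {} bytes, Option<File>: {} bytes",
        size_of::<File>(),
        size_of::<Option<File>>()
    );
    // On Unix the file descriptor can never be -1, which provides the niche.
    if cfg!(unix) {
        assert!(option_file_is_same_size());
    }
}
```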
