Option<usize> in 8 bytes via custom None value

Hey,

I want to index elements in my datastructure starting from 0 using usize. But, I also would like to be able to store a lot of Option<usize> without wasting space, i.e. I would like the Option to have zero space overhead.

I know that there is NonZeroUsize in the standard library. But since I want to start from 0, that does not really work for me. Instead I would like to use the maximum value of usize as None value. Is there a crate/type for something like this?

I know that there is the optional crate, which has the disadvantage that its Optioned type is different from the stdlib Option type. I would be nice to replace the type inside the Option, and not the Option itself.

Using rustc_layout_scalar_valid_range_end doesn't seem to work.

#[rustc_layout_scalar_valid_range_end(18446744073709551614)]
pub struct LimitedRange(usize);
error[E0658]: the `#[rustc_layout_scalar_valid_range_end]` attribute is just used to enable niche optimizations in libcore and will never be stable
  --> traitgraph/src/index.rs:13:1
   |
13 | #[rustc_layout_scalar_valid_range_end(18446744073709551614)]
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

What you're asking for isn't possible using Option. You need a type with the #[rustc_nonnull_optimization_guaranteed] annotation to get the "niche" optimisation that would make your Option<usize> fit into size_of::<usize>() bytes, and even then the non-null optimisation works by using the all-zero value as the tag.

I'd create my own custom Option<usize>(maybe called Index?) which wraps a usize and does the check, only allowing access to the index via some sort of fn get(self) -> Option<usize> method that returns None on usize::MAX_VALUE.

1 Like

Why not just store the one's-complement of the value, which will convert usize::MAX to 0 before storage, then recomplement the stored value on retrieval? All real hardware has fast tests for zero; it doesn't have equally fast tests for N bits of ones.

Alternatively, just wrapping_add(1usize) to the value before storing it, then wrapping_sub(1usize) on retrieval, or vice versa.

5 Likes

This isn't really true.

You need that (rustc internal) attribute to guarantee that the niche is applied in a deterministic way that is guaranteed for FFI, but other niches can be used and are exploited by the current compiler.

1 Like

That's interesting. Do you know how we tell the compiler about a niche?

For example a 64-bit pointer only uses 48 bits or so, and several high-performance libraries reuse the unused bits to stash away extra information. This pointer manipulation is all done with unsafe code and bitwise operations, but it'd be nice if we had a safer/more ergonomic way to get these sorts of layout optimisations.

2 Likes

Thank you for your answers! They both sound efficient.

I was pondering both a little, and realised that I probably want a zero-cost method for extracting the index where I know that it is valid. Meaning without computing the one's complement or an increment/decrement. So Michael-F-Bryan's solution seems to be the most appropriate.

Thanky you again for the quick replies!

First of all: don't pack data in the high bits of a pointer. Problem zero is that this is highly architecture dependent and immediately makes your code nonportable. Problem one is that (on x86_64) the space isn't unused, it's reserved. Problem two is that the bits have to be preserved, they aren't just all zeroes (they're a sign extension of the highest data-carrying bit). Problem three is that five-level pages are already specced and some (experimental) chips have been made following the spec for five-level tables, which mean they can address a larger address space and have fewer "unused" bits in the pointer.

Other than that:

Packing data into the unused bits of a pointer is actually a completely separate thing from niche usage. The key difference is that you can always create &mut T, so if the T exists, it can be written to by code that doesn't know its context (and thus can overwrite any padding). Niches only define invalid values which can be used when a T isn't present, such as when a different variant of an enum is the live variant.

You also don't "tell" the compiler about niches in a type. The type either has niches or it doesn't, based on whether its component parts have niches.

#[rustc_layout_scalar_valid_range_[start|end]] is the rustc-internal way of restricting what values are valid for a scalar newtype, but as the error message you posted earlier says, this is only to be used by the standard library to define privileged types.

If you want a user-defined type with invalid values, you use an enum. If you define an enum, any space left over that isn't a valid variant is a niche which rustc will attempt to use for niche optimizations (such as reducing the size of Option<YourType> to just the size of YourType).

A super hacky thing that I haven't tried but would work in theory is adding a #[repr(u8)] enum ZeroedPaddingU8 { Zero = 0 } member to a structure to add a known-zero byte to the structure, which rustc can then use for niche optimizations. (Again, unlike padding, which may be overwritten with whatever values whenever the space is written to, so cannot be used to store data, ever.)

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.