An Unusually Large Enum Union, or, Packing Pointers on 64-bit Systems

I have an enum, it looks something like this:

pub enum Large {
    This(Box<...>),
    That(Box<...>),
}

The documentation for std::mem::size_of::<...>() states that Box has the same size as usize. Union-style enums take on the size of the largest variant. In this case, both variants are of type Box, so we'd expect Large to be the same size as usize + some information (a byte?) to keep track of which variant of the enum it is.

On a 64 bit system, usize, and therefore Box, have a size of 8 bytes, so we'd expect Large to be 8-9 bytes in size. However, if you inspect the size of Large using size_of, you'd see that it's actually 16 bytes.

This makes sense - the compiler is padding the enum to be a multiple of 8 bytes. However, I still have a few questions. If I recall correctly, even 64 bit systems only use 48 bits (6 bytes) for addresses. Because of this, is there a way efficiently pack an enum of this sort into 64 bits? Such as:

Byte: 0       1       2       3       4 ... 8
Divs: |Variant--------|48-bit Pointer-- ... |

And if something like this isn't possible using an enum, how else could I do this? Would I have to resort to using a byte-array with some unsafe code or the like?

If there's a flaw in my logic, please point it out so I may revise the question to receive a more useful answer.

TL;DR: Is it possible to pack an enum around Box into usize bytes on a 64-bit system?

Not all 64 bit systems use only 48 bits for addressing. Those upper bits could be used for virtual addressing or pointer tagging, etc.

1 Like

64 bit x86 does (typically) use only 48 bits, but in a way that's designed to prevent people from accidentally using the other bits for anything else. A 64 bit pointer is required to be a sign-extended version of its 48-bit value, and CPUs actually enforce this. This "holds the door open" for CPUs to expand from 48-bit pointers to larger sizes, for scenarios that need a lot of address space.

So if you're willing to exclude your program from ever running in those scenarios, you can write code yourself to manually pack a tag into the extra bits, without using an enum. Javascript engines in browsers often do this, for example. You just have to be careful to re-set the bits to all 0s or all 1s to match bit 48 before you actually dereference the pointer.

I've done this myself here, if you'd like an example: dejavu/value.rs at main · rpjohnst/dejavu · GitHub

3 Likes

Thank you for the answers. I'm attempting to implement values for a VM stack, and I would like them all to be 64 bits in size. Do you all know of any other potential solutions?

It seems like f64 nan-tagging is the way to go. Again, thanks for the help.

Aside from putting the tag in the upper bits of a pointer, you can also put the tag in the lower bits of the pointer, by ensuring that all pointers are aligned to some minimum value. For example, if everything is aligned to 8 bytes (a common default for the global allocator), you know the bottom 3 bits will always be 0.

A third option is simply to force every value to be allocated somewhere other than the VM stack itself, and put the tag in the value. You can do this without any unsafe code or pointer tricks just by using Box<SomeEnum>.

There's also a possibility of encoding the enum variant in the low-order bits if you know it contains only pointers to data with an alignment higher than 1 byte.

How might I use the lower bits to store the tag? Could you perhaps provide an example?

Basically the same way you would use the upper bits- cast the pointer to an integer, overwrite the relevant bits with the tag, and be sure to mask them out before converting back to a pointer.

If you do want to ensure the type has same size of the ptr, you can make it a box-of-enum not enum-of-boxes.

There’s a crate for hiding tag in the alignment bits: pointer-utils/crates/ptr-union at master · CAD97/pointer-utils · GitHub.

We use this trick in rust-analyzer, where the element of a syntax tree is a pointer-sized enum of either interior node or a token.

2 Likes

I know we already have the concept of "niches" for things like the null pointer optimisation, but would it be possible for Rust to automatically do this sort of layout optimisation using a pointer's unused bits?

Assuming we have an enum below:

enum Foo {
    Bar(Box<u32>),
    Quux(Box<f32>),
}

If the discriminant is packed within the alignment of the box ptr, how can we get &Box<f32> from Foo with Quux variant?

5 Likes

Also, see the tagged-box crate.

PSA: ptr-union has a bug that means things packed into a Union2/Union4 are not being dropped. This cannot be fixed backwards-compatibly (the structures themselves do not have the required bounds for the Drop impl and a Copy impl was provided for unions of Copy pointers), so a version 2.0.0-pre has been released. (-pre is just a safety precaution, I'll release a non-prerelease version this/next week if I don't find another hole.)

The 1.0.0 line has been yanked to avoid people accidentally using it.

There is no unsoundness issue (this cannot cause undefined behavior), it just is very leak prone. To avoid leaks, you must unpack the union on 1.0.0.

This does not impact rust-analyzer, which does not (yet) use ptr-union.

Well, it's been a while, but I finally got around to implementing NaN-tagging. Here's a simplified version of what I did for those who may want to know later.

We'll store all the datatypes we want to pack in a Data enum:

pub enum Data {
    Real(f64),
    String(String),
    True,
    False,
    // snip ...
}

The IEEE standard for floating pointing numbers describes a 64-bit long data layout:

// S is sign, Q is quiet NaN flag, I is an intel flag
SExponent---QIMantissa------------------------------------------ 

When a f64 is a quiet Nan, only the following bits are set:

-1111111111111--------------------------------------------------

Look at all that free space! We can pack a pointer and some data in here with room to breathe.

// the first bit is a pointer flag, T is a tag 
// more tag bits can be used for more variants - we only have true and false, so only 1 bit is needed
11111111111111---Pointer----------------------------------------
01111111111111-------------------------------------------------T

Let's define some constants. We'll use these to mask out specific data:

const QNAN:   u64 = 0x7ffc_0000_0000_0000;
const P_FLAG: u64 = 0x8000_0000_0000_0000;
const P_MASK: u64 = 0x0000_FFFF_FFFF_FFFF;
const F_FLAG: u64 = 0x0000_0000_0000_0000;
const T_FLAG: u64 = 0x0000_0000_0000_0001;

Now, we'll implement the tag. A tag is a u64:

pub struct Tagged(u64);

We just need to write some methods to wrap the data into a tag, and unwrap a tag. We perform some bit-twiddling to put the right bits in the right places.

impl Tagged {
    pub fn from(data: Data) -> Self {
        match data {
            Data::Real(f) => Tagged(f.to_bits()),
            Data::False   => Tagged(QNAN | F_FLAG),
            Data::True    => Tagged(QNAN | T_FLAG),
            // box anything else, pack the pointer
            other         => Tagged(QNAN | P_FLAG | (
                P_MASK & (Box::into_raw(Box::new(other))) as u64
            )),
        }
    }
    
    // cont ...

If we want to access the data packed into a tag, we need to check we have the right value, then reconstruct the data. Note that this isn't actually dereferencing as per the Deref trait, because I can't figure out how to make this function return &Data without the compiler complaining.

    // ... cont
    pub fn deref(&self) -> Data {
        let Tagged(bits) = self;

        match *bits {
            n if (n & QNAN) != QNAN   => Data::Real(f64::from_bits(n)),
            f if f == (QNAN | F_FLAG) => Data::False,
            t if t == (QNAN | T_FLAG) => Data::True,
            // reconstruct box, follow pointer
            p if (p & P_FLAG) == P_FLAG => {
                let pointer = (bits & P_MASK) as *mut Data;
                unsafe { *Box::from_raw(pointer) }
            },
            _ => unreachable!("Corrupt tag data"),
        }
    }
}

If we want, we can implement some methods on Tagged to prevent extra data conversion for simple operations, like and-ing two tagged pieces of data that we know to be booleans.

Of course, this method comes with a few caveats that I haven't addressed in the above implementation here out of brevity, such as:

  • Assuring the pointer only uses the bottom 48 bits. Ideally, you'd specify a layout to guarantee this.
  • Implementing simple operations on tagged data.
  • Dropping memory a tag points to when it is dropped.
  • etc.

By the way, here's how you'd use it:

let string_a = Data::String("I lost the game".to_string());
let tagged   = Tagged::from(a_string);

if let Data::String(string_b) = tagged.deref() {
    assert_eq!(string_a, string_b);
}

That's all. If you have a question, or something's terribly wrong with my implementation*, please reach out to me.

*From the unit tests I've ran, it seems to be working as intended.

Maybe I'm missing something, but it looks like you could pass in a Data::Real(f64::NAN) and get a non-Real data back out?

Here are the some of the tests I used: Rust playground.

I'm not quite sure why, but rust has no problem unpacking tagged NaNs to get Real data. I think this has to do with the underlying way rust represents NaNs vs the quiet NaNs that Tagged checks for. I'll post my findings soon.

Ah, so Rust does doesn't flag their NaNs with Intel’s “QNan Floating-Point Indefinite”, so:

-1111111111110-------------------------------------------------- // Rust NaN
-1111111111111-------------------------------------------------- // Quiet NaN

Are distinct. As Rust might choose to include that bit in the future, A better qNaN layout to check for tagged data might be:

SExponent---QIMantissa------------------------------------------ // IEEE double
P1111111111111D--Pointer---------------------------------------T // Tagged NaN

Where P is a pointer flag, D is a Data flag, and T is tag data. Thanks for drawing that to my attention, skysch!

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.