Mapping between constant values

Hi everyone! Can someone point me in the general direction of an idiomatic way to build a static, constant map between some trivial enums and a set of constant structs?

Specifically, suppose I have this:

pub enum OldArch {
    Z80 = 0,
    M6809 = 1,
    M6502 = 2,
    M68K = 3,
}

and I'd like to map constant values of that to (references to) constant values of this:

pub struct OldArchParam {
    year: u32,
    addr_bus_width: u32,
    floats_are_also_involved: f64,
}

(All that is just for illustration, I'm not actually writing a program that handles old architectures, but tl;dr I want to map a bunch of enum variants to a bunch of structs whose members are constant integers and floats.)

Since both the keys (the enum OldArch) and the values (struct OldArchParam) of the map are constant and known at compile time, I'd like to have a constant, static map of some sort. I'm not specifically looking for a hash map: anything that gives me random access in constant-ish time, with safety guarantees, is okay.

Based on stuff that was previously discussed here and elsewhere on the Interwebs, I know about enum_map, but as far as I can tell it can't build the map at compile time. phf can't quite do that either, since enum OldArch is not a type that it can hash at compile time. I guess I could write my own macro which would "translate" my enum IDs into type-suffixed integers (which phf can hash), but it seems a little... involved?

The closest I've come is something like this:

impl OldArch {
    const fn data(self) -> &'static OldArchParam {
        match self {
            OldArch::Z80 => &OldArchParam { year: 1976, addr_bus_width: 16, floats_are_also_involved: 1.0 },
            OldArch::M6809 => &OldArchParam { year: 1978, addr_bus_width: 16, floats_are_also_involved: 2.4 },
            // Some cases omitted for brevity
        }
    }
}

...

let p = OldArch::data(OldArch::Z80);

If I got the output right, that compiles down to calling a function (OldArch::data) that makes room for a pointer on the stack, then uses an inline dispatch table to jump to a sequence which moves a pointer to the correct constant OldArchParam instance into the slot it made room for on the stack, and returns that pointer.

I would expect that, since I'm mapping between values that are constant and known at compile time, I should be able to say something like this:

let p = <Some magical incantation that gets me a reference to a static constant `struct OldArchParam`>(OldArch::Z80);

which rustc would translate to just mov-ing the right value into p, without any intermediary dispatch tables being involved anywhere, and without the stack being involved for anything other than allocating p.

Am I wrong in this assumption, or am I missing a cool feature? Alternatively, am I actually misreading the compiler output (here: Compiler Explorer -- I'm reasonably fluent in assembly language for a bunch of architectures but amd64 is actually not one of them, and by the way do you know a bar where I can pay with hipster points?)

(I'm obviously open to thinking outside the C box here, which I am very likely doing -- perhaps I'm starting with the wrong abstraction here by using an enum like that?)

Thank you very much for your help!

I think you could use the lazy_static crate for this:

https://crates.io/crates/lazy_static

That mapping can be done by declaring a scope-local const in each match arm, which turns that &OldArchParam into a static reference. Here I demonstrate it with a macro:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=242b1da741642878d8531abab46630d6

macro_rules! const_enum_map {
    (
        $(#[$attr:meta])*
        $vis:vis fn $method:ident($self:ident : $enum_key:ty) -> $value_ty:ty {
            $($variant:pat => $value:expr),*
            $(,)?
        }
    ) => {
        impl $enum_key {
            $(#[$attr])*
            $vis const fn $method($self: Self) -> $value_ty {
                match $self {
                    $($variant => {
                        const __VAL: $value_ty = $value;
                        __VAL
                    })*
                }
            }
        }
    }
}

pub enum OldArch {
    Z80 = 0,
    M6809 = 1,
}

#[derive(Debug)]
pub struct OldArchParam {
    pub year: u32,
    pub addr_bus_width: u32,
    pub floats_are_also_involved: f64,
}

const_enum_map! {
    /// Converts this enum to `&'static OldArchParam`
    pub fn to_value(self: OldArch) -> &'static OldArchParam {
        Self::Z80 => &OldArchParam { year: 1976, addr_bus_width: 16, floats_are_also_involved: 1.0 },
        Self::M6809 => &OldArchParam { year: 1978, addr_bus_width: 16, floats_are_also_involved: 2.4 },
    }
}

fn main() {
    dbg!(OldArch::Z80.to_value());
    dbg!(OldArch::M6809.to_value());
}

It optimizes to just a load with opt-level=1: Compiler Explorer

example::OldArch::to_value:
        movsx   rax, dil
        lea     rcx, [rip + .Lreltable.example::OldArch::to_value]
        movsxd  rax, dword ptr [rcx + 4*rax]
        add     rax, rcx
        ret

Otherwise, it's still a sequence of conditional jumps.


Are your enum values guaranteed to be successive integers starting from 0? If so, you could just use a static array of payload values and index into them using the enum's integer value.
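A minimal sketch of that idea, assuming the discriminants really are 0, 1, 2, ... with no gaps. The names PARAMS and param are made up for illustration, and only two variants are shown:

```rust
pub enum OldArch {
    Z80 = 0,
    M6809 = 1,
}

#[derive(Debug)]
pub struct OldArchParam {
    pub year: u32,
    pub addr_bus_width: u32,
    pub floats_are_also_involved: f64,
}

// Hypothetical table name. One entry per variant, in discriminant order.
// The array length (2) has to be kept in sync with the enum by hand.
static PARAMS: [OldArchParam; 2] = [
    OldArchParam { year: 1976, addr_bus_width: 16, floats_are_also_involved: 1.0 },
    OldArchParam { year: 1978, addr_bus_width: 16, floats_are_also_involved: 2.4 },
];

impl OldArch {
    // Hypothetical accessor. It is a plain fn rather than a const fn here,
    // since referring to a static inside a const fn needs a recent compiler.
    #[inline]
    pub fn param(self) -> &'static OldArchParam {
        // `as usize` yields the discriminant, which doubles as the index.
        &PARAMS[self as usize]
    }
}
```

With this, OldArch::Z80.param() is a bounds-checked index into a table that lives in read-only data. The catch is that nothing stops the array order from drifting out of step with the enum, whereas a match is checked for exhaustiveness by the compiler.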


Hey, thanks! I've looked into lazy_static but I'm wondering if I could avoid it. I haven't actually tried it because its description says:

Using this macro, it is possible to have statics that require code to be executed at runtime in order to be initialized. This includes anything requiring heap allocations, like vectors or hash maps, as well as anything that requires non-const function calls to be computed.

Thing is, I don't see why any of the stuff above would require code to be executed at runtime. Granted, lazily building a hash map from data I supply at compile time would have to be done at runtime, I understand that. But since both the data and the mappings are known at compile time, I would expect I could do the mapping at compile time, too.

An example of equivalent C code (but, of course, C enums are not full sum types, so the compiler's job is a lot easier) would consist of declaring a const array of OldArch::NUM_OLD_ARCHES items initialised to the right OldArchParam values, and having p = &that_array[OldArch::Z80]. That really translates into mov-ing the right pointer value into p, and can be guaranteed not to yield a program that segfaults. That's the kind of result I'm aiming for.

Hey, thanks! That's exactly what I'm looking for!

In the example above, they are, and this would probably work. However, in the more general case, it shouldn't matter -- if the mapping is constant and known at compile time, it should be possible to do it in constant time. The other, more complex answer covers this more general case -- but I think this simpler solution may be adequate in some scenarios, too! Thanks!
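For that general case, here is a rough sketch, assuming a const fn with a match like the one in the original question. When the key is itself a constant, calling the function in a const context makes the compiler do the lookup during compilation, whatever the discriminants are. The discriminant values 10 and 20 and the name Z80_PARAM are made up for illustration:

```rust
pub enum OldArch {
    // Deliberately non-consecutive, to show that the values don't matter here.
    Z80 = 10,
    M6809 = 20,
}

#[derive(Debug)]
pub struct OldArchParam {
    pub year: u32,
    pub addr_bus_width: u32,
    pub floats_are_also_involved: f64,
}

impl OldArch {
    pub const fn data(self) -> &'static OldArchParam {
        // Each arm borrows a constant struct literal, which the compiler
        // promotes to a 'static value.
        match self {
            OldArch::Z80 => &OldArchParam { year: 1976, addr_bus_width: 16, floats_are_also_involved: 1.0 },
            OldArch::M6809 => &OldArchParam { year: 1978, addr_bus_width: 16, floats_are_also_involved: 2.4 },
        }
    }
}

// Hypothetical constant: the initializer of a const item is evaluated at
// compile time, so the match is resolved before the program ever runs.
pub const Z80_PARAM: &OldArchParam = OldArch::Z80.data();
```

Using Z80_PARAM then amounts to using the address of the Z80 entry directly. When the key is only known at runtime, OldArch::data is still an ordinary function, and the optimizer decides whether the match becomes a table or a chain of jumps.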