Pre-compute / pre-hash HashMap / HashSet keys at compile-time?

I have a HashMap<&'static str, T> whose values are inserted at run-time, but whose keys are known at compile time. For instance, I have sections of code that look like map.get("first_name"). In the interest of efficiency, is there an API by which I can pre-compute the hash of "first_name" at compile-time and .get_prehashed(hash_of_first_name)?

If you can replace the strings with an enum then you can avoid hashing entirely with

So a struct with field names?
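To make the enum suggestion concrete, here is a minimal sketch. The `Field` enum, its variants, and the `FieldMap` type are hypothetical names invented for illustration, not part of any crate. Each variant's discriminant indexes a fixed-size array of `Option`s, so a lookup is an array access with no hashing, which is close to a struct with one optional field per key.

```rust
/// Hypothetical key set: one variant per compile-time-known key.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Field {
    FirstName,
    LastName,
    Email,
}

impl Field {
    /// Number of variants; must be kept in sync with the enum by hand.
    pub const COUNT: usize = 3;
}

/// Fixed-size table indexed by `Field`. Lookups are plain array indexing.
pub struct FieldMap<T> {
    slots: [Option<T>; Field::COUNT],
}

impl<T> FieldMap<T> {
    /// Create a table with every slot empty.
    pub fn new() -> Self {
        FieldMap {
            slots: std::array::from_fn(|_| None),
        }
    }

    /// Store a value under `key`, returning the previous value if any.
    pub fn insert(&mut self, key: Field, value: T) -> Option<T> {
        self.slots[key as usize].replace(value)
    }

    /// Borrow the value stored under `key`, if one was inserted.
    pub fn get(&self, key: Field) -> Option<&T> {
        self.slots[key as usize].as_ref()
    }
}
```

Usage would look like `map.insert(Field::FirstName, value)` followed by `map.get(Field::FirstName)` in place of `map.get("first_name")`.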

Use phf to map each key to an index, plus an array of Options indexed by that value. I expect you will get inactive code looking for a better match. (I can be proven wrong.)

There is a good chance that the hash calculation is already being inlined and constant-folded. Have you checked whether this is the case?

Well, since you went there, I guess I have to be more concrete about my use case. I am parsing ASN.1 data that is serialized according to the Basic Encoding Rules. Values encoded using BER are represented as tag-length-value tuples in a binary format. Structs defined using ASN.1 ("SEQUENCE") can have components that are optional, and therefore, omitted from the encoding.

Hence I have something that looks like this:

pub fn _parse_component_type_list<'a>(
    ctl: &'a [ComponentSpec],
    elements: &'a [&'a X690Element],
    is_extensions: bool,
) -> ASN1Result<(usize, IndexedComponents<'a>)> {

In the above, X690Element would be a tag-length-value encoding, and IndexedComponents contains a HashMap<&str, X690Element> that indexes these elements by name if they match one of the ComponentSpecs. Hence the encoding of a value of:

AlgorithmIdentifier ::= SEQUENCE {
   identifier   OBJECT IDENTIFIER,
   parameters   ANY DEFINED BY identifier
}

would produce a HashMap containing the keys "identifier" and "parameters" (if the encoding was correct, of course). Then, in generated code, I extract the tag-length-value encodings from this index by their component names, decode them into their respective values, and use those values to construct an AlgorithmIdentifier struct.

I actually want to get rid of this HashMap altogether, since it seems like a poor-performance approach. I am trying to implement this to return an Iterator<&'static str> that returns the component names in the order that they are evaluated, then my code generation can just use a match statement against each string to determine what to do with the tag-length-value encoding. I think that would entail less allocation.
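A rough sketch of what that match-based version could look like for AlgorithmIdentifier. Everything here is a simplified assumption for illustration: `Element` is a hypothetical stand-in for X690Element, the fields hold raw bytes instead of decoded values, and the error type is a plain String rather than the real ASN1Result.

```rust
/// Hypothetical stand-in for a tag-length-value element.
#[derive(Debug, Clone, PartialEq)]
pub struct Element {
    pub value: Vec<u8>,
}

/// Simplified target struct; a real one would hold decoded values.
#[derive(Debug, PartialEq)]
pub struct AlgorithmIdentifier {
    pub identifier: Vec<u8>,
    pub parameters: Option<Vec<u8>>,
}

/// Build the struct from (component name, element) pairs in encoding order.
/// No map is built; each name is matched directly against string literals.
pub fn decode_algorithm_identifier<'a, I>(components: I) -> Result<AlgorithmIdentifier, String>
where
    I: IntoIterator<Item = (&'static str, &'a Element)>,
{
    let mut identifier = None;
    let mut parameters = None;

    for (name, element) in components {
        match name {
            "identifier" => identifier = Some(element.value.clone()),
            "parameters" => parameters = Some(element.value.clone()),
            other => return Err(format!("unexpected component: {other}")),
        }
    }

    Ok(AlgorithmIdentifier {
        // `identifier` is treated as required, `parameters` as omittable.
        identifier: identifier
            .ok_or_else(|| "missing required component: identifier".to_string())?,
        parameters,
    })
}
```

The match compares the name against each literal in turn, so there is no hashing and no per-SEQUENCE map allocation; whether that is actually faster for large component lists would need measuring.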

Despite the above, I figured I'd ask the question anyway, since it seems like an easy performance win.

How would I check this?

You will need to look through the generated assembly. Godbolt is good for testing short snippets. (Make sure you are compiling in release mode/-Copt-level=3!)
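Something like the following is small enough to paste into Godbolt. The function name `lookup_first_name` and the `u32` value type are placeholders chosen for this sketch. One caveat worth keeping in mind while reading the output: the default hasher for std's HashMap (RandomState) is seeded at run-time, so the final hash value itself cannot be a compile-time constant with the default hasher, although the compiler may still inline and simplify the hashing of the constant bytes.

```rust
use std::collections::HashMap;

// `pub` plus `#[inline(never)]` keeps this function as a separate symbol,
// so it is easy to locate in the generated assembly.
#[inline(never)]
pub fn lookup_first_name<'a>(map: &'a HashMap<&'static str, u32>) -> Option<&'a u32> {
    // The key is a string literal, so its bytes and length are known
    // at compile time; check how much of the hashing gets folded.
    map.get("first_name")
}
```

In the assembly, look at whether the body still calls out to a generic hashing routine or whether the hashing has been inlined with the literal's bytes and length baked in.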