How is Rust's "match" implemented?

I just came to realize that I've never stopped to think about the implementation (and the resulting machine code) of the switch / match statements. Some articles suggest that it might compile down to fancy "if-else" conditions in some cases, and "goto" jumps - in other ones. How does Rust do it?

And, perhaps more importantly, if you have many possible scenarios to match on, does it make more sense to use a HashMap, mapping from the matchable input to the desired output or a function?

You'd have to use a website like godbolt to analyze any specific case.

My personal experience in the past, analyzing, for instance, matching on a large number of strings is that it seems to mostly do a binary search strategy, first on the length of the strings and then the contents against the predicate strings with the same length.

Doing dynamic dispatch yourself with a hashmap has the potential to be faster, but only given a very large number of "arms" and very cheap to hash values (note the default hash algorithm provided by std prioritizes DOS protection over raw speed).

I wouldn't be surprised if matching on a large number of consecutive integers caused llvm to generate a jump table, although I haven't tested that.

3 Likes

It also depends on the kind of patterns that are used. One way to check is to look at the MIR output in the Rust playground. For example, matching enum discriminants will turn into a SwitchInt, which gets lowered to an LLVM switch, and the specific target codegen could implement that in many ways.

4 Likes

I think it's more about the static/dynamic distinction.

If it's a non-huge number of statically-known things, might as well use the match -- if the codegen for that turns out egregiously bad, then it's probably a rust or llvm bug that should just be fixed.

If it's a set of things only really known at runtime, then it can't be a match anyway, so the question is more whether a HashMap is good enough, or if you want something with different tradeoffs for string matching like a patricia trie.

But don't forget some middle-ground options. For something statically-known but with so many arms that the compiler might start getting unhappy with it, you can also do things like binary-searching a pre-sorted static LUT

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.