How should I make regex compile only once?

minimal (mostly) example: https://play.rust-lang.org/?gist=d5a5d6328923271b1ac4a324d6249813&version=stable&mode=debug&edition=2015

I have a regex I need to use in a loop (three lists of 200+ entries each), and I may also want to use it in other functions, all collected in a single mod.rs file. According to the regex crate docs, it’s bad to recompile a regex on every iteration of a loop. How should I resolve this? Should I create it in a parent function and pass it as a parameter? Is there some equivalent of const for these sorts of “static” types? Should I retool my code to do the entire loop within a single function?

It is an anti-pattern to compile the same regular expression in a loop since compilation is typically expensive. (It takes anywhere from a few microseconds to a few milliseconds depending on the size of the regex.) Not only is compilation itself expensive, but this also prevents optimizations that reuse allocations internally to the matching engines.

Put the regex into a lazy_static.

The lazy_static crate is the usual answer.

I’m likely to use that regex in other places too. Do you know if lazy_static will also carry between functions?

lazy_static will only share uses from the same definition. You’d have to put it in a shared namespace to access from those multiple places.

Would the top of main.rs work as a “shared namespace” for modules that have been broken out into separate xyz.rs files?

And thanks for all the help, I’m still learning how all these pieces fit together

Yes, your submodules can access items from the top of main.rs, which would be the crate root. You can either import it with use MY_REGEX; (or I think use crate::MY_REGEX; in the upcoming 2018 edition), or refer to it with a full path like ::MY_REGEX

Awesome, thanks for the help!