Creating a regular expression engine for fun is an awesome thing to do. I'd encourage anyone with interest to do it. Rust's regex crate started out as an insatiable itch of mine, and has grown quite a bit since then. I'll do my best to answer your questions, but the lack of details provided on your part basically makes it impossible to give you anything concrete.
I think this question is too broad to answer, and in particular, it's impossible to give even a reasonably accurate answer if we can't see your code.
If your code uses parametric polymorphism to be generic, then that, in and of itself, won't cause performance to suffer. However, if you're using a generic algorithm that isn't specialized to the specific properties of the thing you're searching, then you will miss out on potential optimizations. For example, if you're searching a
&[u8] or a
&str, then you can apply
memchr, which typically uses vector instructions. But if you limit yourself to searching an
Iterator<Item=char>, then you can't apply
memchr to that directly, as one example.
Since you didn't share any details of your performance comparison, this question is impossible to answer. The regex crate uses two different types of SIMD optimizations today:
- In some cases, it applies
memchr on a byte that the regex crate knows must appear in a match. This tends to be vectorized, but ultimately depends on how your platform's
libc is implemented and whether the
memchr crate uses it or not. Again, since you didn't provide any details about your comparison, it's impossible to even know which implementation your comparison uses.
- In some other cases, specifically when there is a small number of small literals where at least one must match, then it's possible for the Teddy algorithm to be applied, which also uses a vectorized algorithm. But this only works on nightly Rust and when you provide the requisite compile time flags to the regex crate. Once again, since you didn't share the details of your comparison, it's impossible to say whether this optimization is being applied in your case. One thing worth noting is that since the benchmark game exclusively uses stable Rust (justifiably so), it is impossible for this particular optimization to have any effect.
What the regex crate does today is just barely scratching the surface of what SIMD can bring to regex engines.
With all of that said, focusing on SIMD here is a classic example of missing the forest for the trees. For these optimizations to even apply in the first place, your regex library needs to be capable of literal extraction. And even then, it's still missing details about the regex engines themselves. For example, Rust's regex engine actually has several different engines, where the split is mostly motivated by performance. There are more details here and here.
It requires nightly Rust at the moment. There is an RFC recently submitted that is the result of more than a year of work and appears close to merging. The code powering this is in the
stdsimd repo and is on its way to getting into
std as an unstable feature.
Previously, most SIMD code in Rust (including the regex crate, currently) used a combination of the
simd crate (which only works on nightly Rust) and Rust's unstable platform intrinsics feature.
I don't see how anyone could answer this without being able to see it. In practice, building a production grade regex engine is a lot of work, not just in optimizations, but providing all the features that most come to expect from regex engines. I think there might be some demand for a regex crate that is faster to compile, even if it means runtime performance drops. This is something that's on my mind to do, but the regex crate internals must first be refactored (which I've already been working on for a year).
Note that the regex-redux benchmark only covers a very small slice of what a regex engine can do. To get a clearer picture (but still, perhaps, not completely accurate), you might hook into the regex crate's benchmark harness, which already handles PCRE1, PCRE2, D's regex library, RE2, Tcl and Oniguruma. You can see an example of how the benchmarks are defined here: https://github.com/rust-lang/regex/blob/9ee9943ec81c560978a51159d08ef670bb6d5d37/bench/src/sherlock.rs