Regex syntax in matching &str

michaelhugi · May 29, 2021, 10:16am

Hello

I'm writing a parser for the GDTF format to start an open source lighting control project. GDTF is a new standard in the entertainment industry for describing moving heads and other fixtures in every possible way. Because a fixture is described not only by an XML file but also by PNG and 3DS files and so on, and because for practical use I will need some elements to be stored in hash maps etc., I've decided not to use serde but to implement the XML parsing manually for more flexibility. I've achieved a lot so far with quick-xml.

Now I have a problem regarding an enum. The GDTF spec lists many possible control attributes for lamps (like intensity, shutter, color red...). These attributes come in as &str. Because they need to be accessed quickly when controlling a light, I've decided to set up an enum with all possible values.

I can match the &str and return the correct enum value when parsing, but a lot of the names contain wildcard placeholders that number them when a fixture has several control attributes of the same kind. I've decided to store these placeholders as u8 fields on the enum variants that use them.

Example

"Dimmer" -> AttributeName::Dimmer,
"Pan" -> AttributeName::Pan,
"Gobo1WheelShake" -> AttributeName::GoboNWheelShake(1),
"Gobo2WheelShake" -> AttributeName::GoboNWheelShake(2),
"VideoEffect1Parameter1" -> AttributeName::VideoEffectNParameterM(1,1),
"VideoEffect2Parameter1" -> AttributeName::VideoEffectNParameterM(2,1),
"VideoEffect1Parameter2" -> AttributeName::VideoEffectNParameterM(1,2)

I hope you get the idea out of these examples.

Now I could just have a long list of regex matches, but the enum has over 250 items and I'm afraid that this would immensely slow down the parsing.

I'm looking for something like this:

match my_str {
    "Dimmer" => AttributeName::Dimmer,
    "Pan" => AttributeName::Pan,
    "Gobo(\d{1})WheelShake"(n) => AttributeName::GoboNWheelShake(n),
    "VideoEffect(\d{1})Parameter(\d{1})"(n, m) => AttributeName::VideoEffectNParameterM(n,m),
}

Is there a way to regex match a string in a match expression, or any other way to solve this problem efficiently?

Any help and suggestions appreciated.

H2CO3 · May 29, 2021, 10:25am

Why do you think that the regex matching implemented using one kind of syntax (match) would be faster than regex matching implemented by another kind of syntax (e.g. using a library)? It likely wouldn't. The exact choice of syntax doesn't influence the workings of whatever underlying algorithm realizes the search itself. (This is why people build libraries out of functions, instead of hard-coding every solution to every problem directly into the language.)

The regex crate is maintained by a Rust team member who specializes in string searching. It's fast. It even has RegexSet which you can use for matching many regular expressions at once, in one single pass of scanning the target string. You should probably start with this library and then profile your code to see if it represents a significant bottleneck that should be optimized.
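For name patterns as simple as the ones in the question, a regex-free baseline can also be useful to profile against. The following is only a minimal sketch using the standard library (`strip_prefix`, `strip_suffix`, `split_once`), not the regex crate recommended above. The variant names come from the question; the helper names `parse_index` and `parse_attribute`, the `Option` return type, and accepting more than one digit (the question used `\d{1}`) are assumptions made for illustration.

```rust
/// Small subset of the attribute names from the question
/// (the real enum is said to have over 250 variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttributeName {
    Dimmer,
    Pan,
    GoboNWheelShake(u8),
    VideoEffectNParameterM(u8, u8),
}

/// Hypothetical helper: parses a non-empty run of ASCII digits into a u8.
/// Assumption: more than one digit is allowed as long as the value fits in u8.
fn parse_index(s: &str) -> Option<u8> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    s.parse().ok()
}

/// Hypothetical parser: returns None for names it does not recognise.
pub fn parse_attribute(s: &str) -> Option<AttributeName> {
    // Names without placeholders: an ordinary exact string match.
    match s {
        "Dimmer" => return Some(AttributeName::Dimmer),
        "Pan" => return Some(AttributeName::Pan),
        _ => {}
    }

    // "Gobo<n>WheelShake": strip the fixed prefix and suffix, parse what is left.
    if let Some(n) = s
        .strip_prefix("Gobo")
        .and_then(|rest| rest.strip_suffix("WheelShake"))
        .and_then(parse_index)
    {
        return Some(AttributeName::GoboNWheelShake(n));
    }

    // "VideoEffect<n>Parameter<m>": strip the prefix, split at the fixed middle part.
    if let Some((n, m)) = s
        .strip_prefix("VideoEffect")
        .and_then(|rest| rest.split_once("Parameter"))
    {
        if let (Some(n), Some(m)) = (parse_index(n), parse_index(m)) {
            return Some(AttributeName::VideoEffectNParameterM(n, m));
        }
    }

    None
}
```

With this sketch, `parse_attribute("Gobo2WheelShake")` yields `Some(AttributeName::GoboNWheelShake(2))`, and an unknown name yields `None`. Whether this or a regex-based version is faster for the full set of names is something only profiling can tell.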

michaelhugi · May 29, 2021, 11:18am

Thanks for your help. I'll try that.

