use regex::Regex;
println!("command {}", command);
let re = Regex::new(r#"(?i)(What is|What's|Calculate|How much is)\s(\d)\s(\+|and|plus|\-|less|minus|x|by|multiplied by|/|over|divided by)\s(\d)"#).unwrap();
let matching = re.is_match(command);
println!("re: {:#?}\n Command: {}\n matching: {}", re, command, matching);
// `captures` returns an Option, so `if let` states the intent more clearly than `for`.
if let Some(caps) = re.captures(command) {
    println!("groups: {} {} {} {}",
        caps.get(1).unwrap().as_str(),
        caps.get(2).unwrap().as_str(),
        caps.get(3).unwrap().as_str(),
        caps.get(4).unwrap().as_str()
    );
}
And I got the required output as:
command What is 1 / 2
re: (?i)(What is|What's|Calculate|How much is)\s(\d)\s(\+|and|plus|\-|less|minus|x|by|multiplied by|/|over|divided by)\s(\d)
Command: What is 1 / 2
matching: true
groups: What is 1 / 2
When I tried to split the pattern over several lines inside the raw string literal (r#"…"#):
let re = Regex::new(r#"
(?i)(What is|What's|Calculate|How much is)
\s(\d)
\s(\+|and|plus|\-|less|minus|x|by|multiplied by|/|over|divided by)
\s(\d)
"#).unwrap();
It did not work, and gave me:
Command: What is 1 / 2
matching: false
I changed the end-of-line sequence, but that made no difference.
I even tried retain and filter, but apparently neither of them works with a regex.
The whole point of a raw string literal (r#"" as opposed to just "") is to preserve all characters in the literal, including whitespace, so this is doing exactly what it should be doing.
I think you're asking for a string literal that lacks escape sequences so you don't need to escape backslashes, but also ignores linebreaks, and AFAIK that's just not a thing that exists (unless someone wrote a crate for it). You're probably better off breaking your regex into multiple string literals and concatenating them.
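The concatenation idea above can be sketched with the standard `concat!` macro, which accepts raw string literals and joins them at compile time. The constant `PATTERN` and the helper `pattern()` are names made up for this illustration; they are not from the question.

```rust
// Each fragment is its own raw string literal, so backslashes need no escaping.
// `concat!` joins the fragments at compile time; the line breaks between them
// are Rust source formatting and never become part of the pattern.
const PATTERN: &str = concat!(
    r"(?i)(What is|What's|Calculate|How much is)",
    r"\s(\d)",
    r"\s(\+|and|plus|\-|less|minus|x|by|multiplied by|/|over|divided by)",
    r"\s(\d)",
);

/// Hypothetical accessor for the joined pattern.
pub fn pattern() -> &'static str {
    PATTERN
}
```

The result is the same single-line pattern as in the working version of the question, so it can be handed to `Regex::new(pattern())` unchanged.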
Your first regex is already using raw string literal (r#"…"#) and as you say, it works.
Perhaps what you want is to enable "insignificant whitespace mode" in the regex library (r#"(?x)…"#). Note that in this mode literal spaces in the pattern are ignored, so the space in e.g. What is will no longer match, and you would need to write it as \x20.
Alternatively, you can call Regex::new(&r#"…"#.replace('\n', "")) to quickly get rid of just the newlines. This is a rather hacky solution, though.
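For completeness, the newline-stripping workaround could look roughly like this. `MULTILINE` and `join_lines` are illustrative names, not part of any library. The sketch only works because the pattern lines are not indented; leading spaces would survive the replacement and become part of the regex.

```rust
/// The multi-line raw string from the question, unindented on purpose.
pub const MULTILINE: &str = r"
(?i)(What is|What's|Calculate|How much is)
\s(\d)
\s(\+|and|plus|\-|less|minus|x|by|multiplied by|/|over|divided by)
\s(\d)
";

/// Hypothetical helper: removes every line break so the pattern collapses
/// back into a single line. '\r' is removed too, in case the text was read
/// from a file with CRLF line endings.
pub fn join_lines(multiline: &str) -> String {
    multiline.replace(&['\r', '\n'][..], "")
}
```

The joined string can then be passed on as `Regex::new(&join_lines(MULTILINE))`.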
'\ ' matches only the space character (code point 0x20), while '\s' matches any character of the whitespace class, including tabs, form feeds, non-ASCII spaces used in other languages, etc.