I am trying to parse Rust source code to identify String literal tokens. I am relying on the Rust Reference for String literals but I can't find a complete RegExp that would satisfy the full definition of a Rust string literal:
A string literal is a sequence of any Unicode characters enclosed within two
U+0022(double-quote) characters, with the exception of
U+0022itself, which must be escaped by a preceding
Line-breaks are allowed in string literals. A line-break is either a newline (
U+000A) or a pair of carriage return and newline (
U+000A). Both byte sequences are normally translated to
U+000A, but as a special exception, when an unescaped
\) occurs immediately before the line-break, then the
U+005Ccharacter, the line-break, and all whitespace at the beginning of the next line are ignored.
Is there any regexp master here that could build up such pattern ?