Rust Grammar as PEG

Hello,

Would you know about a PEG representation of Rust grammar ?

My goal is to express the Rust Programming langage as SSLR (which is a kind of ANTLR alternative where Left recursion is not allowed).

Most of the language is pretty easy to represent but I am curious about how to avoid Left recursion for constructs like call expressions which are expressions based on...expressions

As mentioned in another post

Parsing Expression Grammars, much like LL(*) parsers, can’t perform left recursion because it can and will lead to infinite recursion.

Please note I am note looking for a solution like Pest that would be written in Rust but for a formal representation of the Rust Grammar based on PEG

Thanks

Eric

It’s not easy to explain in a short reply, but the term to search for is “Left Factoring”. Wikipedia has a good, but extremly technical description of the algorithm:

Can PEG correctly parse the following?

if A { b } { c }

The A { b } part could be mistaken for a struct literal, but there is a rule that within an if condition struct literals are not allowed, unless they are wrapped inside something to make this case unambiguous. This means that is is parsed as an if A { b } block followed by a { c } block, not as an if with A { b } as conditional and b as body.

Yes, PEG allows both positive and negative arbitrary-length lookahead. The struct literal rule could expressly forbid being followed by {. It also uses an explicit precedence for alternations, which you might be able to leverage here.

I don't know if there's an existing PEG parser for the Rust language, but there are several existing parsers you can use for inspiration if you're okay with writing your own.

I'm guessing the best official resource is the Rust Grammar Work Group. In particular, their repo's grammar/ directory contains a representation of the language's grammar although you may need to do a bit of massaging to get it in a form that works for PEG parsers.

That repo also links to other resources like an old antlr4 grammar or rustypop. You may also want to check out the source code for syn, which is a Rust parser used by most procedural macros.

2 Likes

RustParser.bnf from IntelliJ Rust is also an interesting point on the curve of "looks like grammar -- is production ready".

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.