Crates for parsing toy programming languages

#1
  1. I am familiar with regex / cfg / flex / bison / parsec.

  2. I want to parse toy programming languages. (Not structured data like JSON / XML. Not free-form text. Just toy programming languages.)

  3. What are the top 3-4 crates I should play with to get a sense of the current libraries available for parsing in Rust?

#2

The big parsing libraries in Rust (at least that I know of) are:

All of these can be (and have been) used to parse programming languages, but I personally feel that LALRPOP is the best fit for that use case; here’s the grammar for my own toy programming language. That said, it has the slight downside of making you write your own lexer in some cases, so one of the others might be more appropriate for prototyping.

#3

I made a parsing crate some months ago which could be useful to you: gramatica. The documentation has a few examples. I have used it to parse things like expressions for a calculator, the configuration files for a network simulator, and a small extension of Rust itself.

However, as far as I know, nobody besides me uses it.

#4

Unorthodox opinion: pick logos (https://docs.rs/logos/0.9.4/logos/) for generating a lexer and write a recursive descent + Pratt parser by hand.
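
To give a rough idea of what the hand-written part looks like, here is a minimal sketch of a Pratt parser for arithmetic expressions. It assumes a hand-rolled lexer instead of logos so that it compiles with the standard library alone; a lexer generator could take over the `lex` step. Every name in it (`Token`, `lex`, `Parser`, `infix_bp`, `eval`) is made up for illustration, and the binding powers are just one reasonable choice.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token { Num(f64), Plus, Minus, Star, Slash, LParen, RParen }

#[derive(Debug, PartialEq)]
enum Expr {
    Num(f64),
    Neg(Box<Expr>),
    Binary(Box<Expr>, Token, Box<Expr>),
}

/// Hand-written lexer (hypothetical stand-in for a generated one).
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        match c {
            c if c.is_whitespace() => i += 1,
            '0'..='9' | '.' => {
                let start = i;
                while i < chars.len() && (chars[i].is_ascii_digit() || chars[i] == '.') {
                    i += 1;
                }
                let text: String = chars[start..i].iter().collect();
                let n = text.parse::<f64>().map_err(|e| format!("bad number {:?}: {}", text, e))?;
                tokens.push(Token::Num(n));
            }
            '+' | '-' | '*' | '/' | '(' | ')' => {
                tokens.push(match c {
                    '+' => Token::Plus,
                    '-' => Token::Minus,
                    '*' => Token::Star,
                    '/' => Token::Slash,
                    '(' => Token::LParen,
                    _ => Token::RParen,
                });
                i += 1;
            }
            other => return Err(format!("unexpected character {:?}", other)),
        }
    }
    Ok(tokens)
}

/// Left and right binding powers; a higher number binds tighter.
/// Left < right makes the operator left-associative.
fn infix_bp(op: Token) -> Option<(u8, u8)> {
    match op {
        Token::Plus | Token::Minus => Some((1, 2)),
        Token::Star | Token::Slash => Some((3, 4)),
        _ => None,
    }
}

struct Parser { tokens: Vec<Token>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<Token> { self.tokens.get(self.pos).copied() }

    fn bump(&mut self) -> Option<Token> {
        let t = self.peek();
        if t.is_some() { self.pos += 1; }
        t
    }

    /// Pratt loop: parse one prefix expression, then keep absorbing infix
    /// operators whose left binding power is at least `min_bp`.
    fn expr(&mut self, min_bp: u8) -> Result<Expr, String> {
        let mut lhs = match self.bump() {
            Some(Token::Num(n)) => Expr::Num(n),
            // Unary minus binds tighter than any infix operator here.
            Some(Token::Minus) => Expr::Neg(Box::new(self.expr(5)?)),
            Some(Token::LParen) => {
                let inner = self.expr(0)?;
                match self.bump() {
                    Some(Token::RParen) => inner,
                    other => return Err(format!("expected ')', found {:?}", other)),
                }
            }
            other => return Err(format!("unexpected token {:?}", other)),
        };
        while let Some(op) = self.peek() {
            let (l_bp, r_bp) = match infix_bp(op) { Some(bp) => bp, None => break };
            if l_bp < min_bp { break; }
            self.bump();
            let rhs = self.expr(r_bp)?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }
}

fn eval_expr(e: &Expr) -> f64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Neg(inner) => -eval_expr(inner),
        Expr::Binary(l, op, r) => match op {
            Token::Plus => eval_expr(l) + eval_expr(r),
            Token::Minus => eval_expr(l) - eval_expr(r),
            Token::Star => eval_expr(l) * eval_expr(r),
            Token::Slash => eval_expr(l) / eval_expr(r),
            _ => unreachable!("parser only builds Binary with arithmetic operators"),
        },
    }
}

/// Lex, parse, reject trailing tokens, then evaluate.
fn eval(src: &str) -> Result<f64, String> {
    let mut p = Parser { tokens: lex(src)?, pos: 0 };
    let ast = p.expr(0)?;
    if p.pos != p.tokens.len() { return Err(format!("trailing input at token {}", p.pos)); }
    Ok(eval_expr(&ast))
}
```

Calling `eval("1 + 2 * 3")` goes through all three stages. Adding an operator is mostly a matter of adding a token and a row in `infix_bp`, which is a large part of why this approach is pleasant for toy languages.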

#5

I may be biased because I have not tried the others, but I really liked working with pest. Its website has an online editor where you can try it out: https://pest.rs/
(There was some controversy about the benchmarks there in the past; I’m not sure whether they are accurate.)

#6

I built a prototype of a document loader for the Azul framework. It’s basically a ‘new’ XML-style parser. It was rejected because it’s probably better to use an actual XML parser, but it was otherwise pretty fun to build and a great experience with Rust’s enums.
