ANTLR-Rust: anybody tried it out?

I haven't tried it yet, but looking over the documentation, I couldn't find any step-by-step instructions for ANTLR newbies. I'm somewhat experienced in Rust, and plan to try to get it working, but figured I'd ask first in case I missed some crucial documentation.

I realize it's "brand new, hot off the presses, likely buggy" and all. Just want to get a feel for how it's coming along since someday I would want to use it.

I'm keeping an eye on it because I would like to use ANTLR with Rust in the near future, but I haven't tried it out yet. I have the impression this crate is not being developed anymore though... It's a big undertaking and I would understand if a single developer didn't have the time for that.

Have you used ANTLR in another language? That should help you understand how it works. Terence Parr's book is very helpful too; I recommend it even though it's old compared to the current version of the tool. The crate's tests also give a good idea of what is generated and how the visitor and listener patterns work.

I expect it will be a little more awkward in Rust since it doesn't have inheritance, but from what I saw of the listener pattern, implementing the generated traits seems straightforward.
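
To make the "traits instead of inheritance" point concrete, here is a minimal sketch of the listener idea in plain Rust. It is not the antlr-rust API: `Node`, `ExprListener`, `walk`, `PostfixPrinter` and `postfix` are all hypothetical names made up for illustration, and the tree is built by hand rather than by a generated parser. The only point is that default trait methods can stand in for the empty base-listener class that the Java target generates.

```rust
// Hypothetical stand-in for a generated parse tree (NOT antlr-rust's real types).
#[derive(Debug)]
enum Node {
    Int(i64),
    BinOp { op: char, lhs: Box<Node>, rhs: Box<Node> },
}

// Hypothetical listener trait. Every callback has an empty default body,
// so an implementor only overrides the events it cares about.
trait ExprListener {
    fn enter_bin_op(&mut self, _op: char) {}
    fn exit_bin_op(&mut self, _op: char) {}
    fn visit_int(&mut self, _value: i64) {}
}

// Depth-first walk over the tree that fires the listener callbacks.
fn walk<L: ExprListener>(listener: &mut L, node: &Node) {
    match node {
        Node::Int(v) => listener.visit_int(*v),
        Node::BinOp { op, lhs, rhs } => {
            listener.enter_bin_op(*op);
            walk(listener, lhs);
            walk(listener, rhs);
            listener.exit_bin_op(*op);
        }
    }
}

// A listener that collects the expression in postfix order.
// It overrides two callbacks and inherits the default for `enter_bin_op`.
#[derive(Default)]
struct PostfixPrinter {
    out: Vec<String>,
}

impl ExprListener for PostfixPrinter {
    fn visit_int(&mut self, value: i64) {
        self.out.push(value.to_string());
    }
    fn exit_bin_op(&mut self, op: char) {
        self.out.push(op.to_string());
    }
}

// Convenience wrapper: walk the tree and join the collected tokens.
fn postfix(node: &Node) -> String {
    let mut listener = PostfixPrinter::default();
    walk(&mut listener, node);
    listener.out.join(" ")
}
```

Calling `postfix` on a hand-built tree for `1 + 2 * 3` yields `1 2 3 * +`. A real generated listener would have one enter/exit pair per grammar rule, but the shape of the user code should be similar.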

Indeed, that's my current plan: work thru a bunch of other ANTLR language targets, then use those to try to understand how to use the ANTLR-Rust runtime. What would have been nice is even a single worked-thru example as a guide. Just one, not a bunch -- just one.

Just a (non-rhetorical) question: why would you want to use ANTLR, which uses LL(k) parsing¹, over one of the LR solutions available in Rust, e.g. LALRPOP?

¹ One of the main limitations of LL(k) parsing is that it doesn't support left-recursion, which often leads to the need to rewrite grammar productions, and thus to artifacts and incidental complexity in parse trees.

I have always been a fan of LL parsing. I wrote an LL(1) grammar interpreter for Coq V5.10 30 years ago, and currently maintain the Camlp5 system, again based on LL(1) grammars. I have always found that debugging LL(1) grammars is simpler than debugging LR grammars: to debug a conflict in an LR grammar has always involved (for me) executing the automaton-construction algorithm in my head, to deduce the error. And that's a PITA: I know how to do it, and sure, it's straightforward, but by comparison debugging LL(1) ambiguities is child's play.

I'm aware that for every language I've ever worked with, there are LR parser-generators, and I've used 'em in every language I've ever worked with. But I continue to believe that LL is more comprehensible and transparent to the grammar-author.

Obviously, de gustibus, etc, etc.

I should have added: typically in LL grammar systems, there is special provision made for left-recursion: that's true in ANTLR and in the grammar interpreter system of Camlp5. Typically these systems also have nice support for operator-precedence parsing, too, just as LALR systems do.

I've never had any problem with the adaptive LL(*) algorithm, and it's very efficient. Another appealing argument for me is that ANTLR separates the grammar from the source code, which makes for maintainable code. The listener approach is even cleaner in this regard.

For example, writing this is not an issue, and automatically manages the operator precedence:

expr:   expr ('*'|'/') expr
    |   expr ('+'|'-') expr
    |   INT
    |   '(' expr ')'
    ;
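
For readers wondering how an LL-style parser can accept that rule at all: as far as I understand it, ANTLR 4 rewrites a directly left-recursive rule into a loop guarded by precedence checks, with earlier alternatives binding tighter and binary operators left-associative by default. The hand-written sketch below shows the same precedence-climbing idea for this grammar. It is not ANTLR's generated code; `Parser`, `precedence` and `eval` are hypothetical names, and it evaluates the expression directly instead of building a parse tree.

```rust
// Binding power of each binary operator; higher binds tighter.
// '*' and '/' appear first in the grammar, so they get the higher level.
fn precedence(op: char) -> Option<u8> {
    match op {
        '*' | '/' => Some(2),
        '+' | '-' => Some(1),
        _ => None,
    }
}

struct Parser<'a> {
    chars: std::iter::Peekable<std::str::Chars<'a>>,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { chars: src.chars().peekable() }
    }

    fn skip_ws(&mut self) {
        while self.chars.peek().map_or(false, |c| c.is_whitespace()) {
            self.chars.next();
        }
    }

    // primary: INT | '(' expr ')'
    fn primary(&mut self) -> Result<i64, String> {
        self.skip_ws();
        match self.chars.peek().copied() {
            Some('(') => {
                self.chars.next();
                let value = self.expr(1)?;
                self.skip_ws();
                match self.chars.next() {
                    Some(')') => Ok(value),
                    other => Err(format!("expected ')', found {:?}", other)),
                }
            }
            Some(c) if c.is_ascii_digit() => {
                let mut n: i64 = 0;
                while let Some(d) = self.chars.peek().and_then(|c| c.to_digit(10)) {
                    n = n * 10 + i64::from(d);
                    self.chars.next();
                }
                Ok(n)
            }
            other => Err(format!("unexpected {:?}", other)),
        }
    }

    // expr(min): one primary, then a loop over operators whose level is >= min.
    // The loop replaces the left-recursive call in the grammar.
    fn expr(&mut self, min: u8) -> Result<i64, String> {
        let mut lhs = self.primary()?;
        loop {
            self.skip_ws();
            let op = match self.chars.peek().copied() {
                Some(c) => c,
                None => break,
            };
            let level = match precedence(op) {
                Some(l) if l >= min => l,
                _ => break,
            };
            self.chars.next();
            // Parsing the right operand at level + 1 makes operators left-associative.
            let rhs = self.expr(level + 1)?;
            lhs = match op {
                '*' => lhs * rhs,
                '/' if rhs == 0 => return Err("division by zero".to_string()),
                '/' => lhs / rhs,
                '+' => lhs + rhs,
                _ => lhs - rhs,
            };
        }
        Ok(lhs)
    }
}

// Parse and evaluate a whole input string, rejecting trailing garbage.
fn eval(src: &str) -> Result<i64, String> {
    let mut parser = Parser::new(src);
    let value = parser.expr(1)?;
    parser.skip_ws();
    match parser.chars.next() {
        None => Ok(value),
        Some(c) => Err(format!("trailing input starting at {:?}", c)),
    }
}
```

With this sketch, `eval("1+2*3")` gives `Ok(7)` and `eval("8-3-2")` gives `Ok(3)`, so both precedence and left associativity come out of the loop without any left-recursive call.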

I've had a look at several alternatives, including LALRPOP, but I just can't easily develop large applications with this mix of grammar and code.

EDIT: is LALRPOP still alive anyway?

Once a parser generator has been written, any work on it will be incremental at best. Parsing theory doesn't advance quite that fast I'd say.
So it's possible that even without commits the project is quite healthy.

Let's hope it's true.

It's not so much about the advance of parser theory as about the number of open issues and pull requests. My question came from that and from this exchange I saw on their own Gitter lobby:

FWIW, while you seem fairly set on LL(1), I've been working on an LSP implementation for the grmtools parser generator. I've at times felt that the edit/generate cycle is half the problem, since it requires you to context-switch between editing and reading conflict reports. By reporting conflicts as you edit, the LSP can sometimes make them less of a PITA. It parses input files as well as checking the grammar itself for ambiguities.

Anyhow, it has still not seen a complete release that includes most of the work on diagnostics, but it's here if you or anyone else would like to give it a try:
nimbleparse_lsp
