Simple language peg/pest can parse that lalrpop can not?

zeroexcuses · October 3, 2021, 11:43pm

This is a followup to Simple grammar peg/pest can parse but lalrpop can not?

Here, I am looking for a Language peg/pest can recognize but lalrpop can not.

Note, this is different from a grammar that lalrpop can not handle.

In particular, a correct solution needs to define a language L where:

there is a peg/pest grammar that recognizes L
all grammars for lalrpop fails to recognize L

(i.e. providing a grammar that recognizes L but lalrpop can't handle is not sufficient; as there might be a different grammar that recognizes L that lalrpop does handle).

Lionelf329 · October 4, 2021, 12:22am

It is also possible to build LL parsers and LR parsers from parsing expression grammars, with better worst-case performance than a recursive descent parser, but the unlimited lookahead capability of the grammar formalism is then lost. Therefore, not all languages that can be expressed using parsing expression grammars can be parsed by LL or LR parsers.

~ wikipedia

The way I interpret that is that the only way for a language to satisfy your criteria is if its PEG grammar uses predicates, so let's play with that and see where it gets us. Let L1 and L2 be CF languages. A PEG grammar that has the rule S -> &(L1 eof) L2 eof is equivalent to (L1 ∩ L2) eof, which is not necessarily CF because the set of CF languages is not closed under intersection. LALR⊆LR⊆CF, so that means lalrpop wouldn't be able to parse it.

The language you're looking for is {a^n b^n c^n eof | n∈N}. The PEG grammar probably looks something like this:

S -> &(X c* eof) a* Y eof
X -> a X b | epsilon
Y -> b Y c | epsilon

The big long explanation is as much there for me as for you, I had to learn a fair bit about PEGs to figure it out. Your previous question was easier

zeroexcuses · October 4, 2021, 12:35am

Is there a typo?

Quoting: Pumping lemma for context-free languages - Wikipedia

For example, the language {\displaystyle L={a^{n}b^{n}c^{n}|n>0}} ${isplaystyle L=a^{n}b^{n}c^{n}|n>0}$ can be shown to be non-context-free by using the pumping lemma in a proof by contradiction.

Lionelf329 · October 4, 2021, 12:36am

No typo, that's actually the point. PEG grammars can accept non-context-free languages, because context-free lookaheads that don't consume input can be used to implement language intersections.

zeroexcuses · October 4, 2021, 12:46am

You're right. I did not realize (until now) that PEG can recognize non-context-free languages.

quinedot · October 4, 2021, 1:32am

This paper lists some other examples, such as

Unary languages consisting of all words whose length is a power of 2
Unary languages consisting of all words whose length is a power of any fixed c > 0
Languages consisting of palindromes whose length is a power of 2

However it's also conjectured that

Languages consisting of palindromes of even length are not in PEG

(That's a context-free language, but it's not recognizable by LALR.)

system · January 2, 2022, 1:32am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Simple grammar peg/pest can parse but lalrpop can not?	6	1578	January 2, 2022
Rust Grammar as PEG help	6	1343	October 20, 2020
Grammar rules in pest (a PEG parser generator) help	3	1757	January 12, 2023
Lalrpop regex precedence problem help	3	781	January 12, 2023
Question for pest parser example help	4	548	December 29, 2022

Simple language peg/pest can parse that lalrpop can not?

Related Topics