Rust Parser reduce/reduce conflict

I am currently writing a parser for rust with a small subset of its features as an assignment for uni. I am getting the following reduce/reduce conflict. It makes sense to me that it is a conflict but was wondering how I could fix it, as far as I know a expressionWithBlock is never callable? As for the grammar I wrote, it basically follows the rust reference.

The conflict:

definitions/rust.y: warning: reduce/reduce conflict on tokens LPAREN, MINUS [-Wcounterexamples]
  Example: expressionWithBlock • LPAREN expression RPAREN
  First reduce derivation
    statements
    ↳ 31: statement_plus                          expressionWithoutBlock
          ↳ 33: statement                         ↳ 51: groupedExpression
                ↳ 37: expressionStatement               ↳ 59: LPAREN expression RPAREN
                      ↳ 44: expressionWithBlock •
  Second reduce derivation
    statements
    ↳ 32: expressionWithoutBlock
          ↳ 52: callExpression
                ↳ 61: expression                  LPAREN callParams       RPAREN
                      ↳ 47: expressionWithBlock •        ↳ 63: expression

The grammar in the reference doesn't exactly match the compiler's behavior in this case. According to the grammar in the reference, any expression followed by parentheses represents a call, so you could in theory write code like this:

fn f(x: usize) {
    println!("f({x})");
}

fn g(x: usize) {
    println!("g({x})");
}

fn main() {
    if true {
        f as fn(usize)
    } else {
        g as fn(usize)
    }
    (42);
}

But, if you put this in the playground, you'll see that it doesn't actually compile. Instead, you need to add parentheses around the if statement, which is nowhere in the reference's grammar. I suspect that this was added to resolve the ambiguity you're running into here.

How would you recommend I change the grammar so that I don't have that ambiguity?

Well, the compiler's behavior implies that instead of

CallExpression :
   Expression LPAREN CallParams? RPAREN

You need something like:

CallExpression :
   GroupedExpression LPAREN CallParams? RPAREN
 | PathExpression LPAREN CallParams? RPAREN

I don't know if this is sufficient to fix the ambiguity or not in practice. AFAIK, the Rust parser is hand-written, so it may have some manual ambiguity resolution or be outright non-context-free in places.

2 Likes

Thanks this did fix the problem for the LPAREN in my case. Now I only have the MINUS problem which it thought was the same but definitely is not. For this one I am sure it should be the second one, but can't figure out how to fix it.

Example: expressionWithBlock • MINUS expression
  First reduce derivation
    statements
    ↳ 30: statement_plus                          expressionWithoutBlock
          ↳ 32: statement                         ↳ 49: operatorExpression
                ↳ 36: expressionStatement               ↳ 67: negationExpression
                      ↳ 43: expressionWithBlock •             ↳ 73: MINUS expression
  Second reduce derivation
    statements
    ↳ 31: expressionWithoutBlock
          ↳ 49: operatorExpression
                ↳ 68: arithmeticOrLogicalExpression
                      ↳ 76: expression                  MINUS expression
                            ↳ 46: expressionWithBlock •

I don't really have an idea how to fix that one, but rustc seems to choke on it as well. It corresponds to something like this:

fn f(x: bool)->i32 {
    if x { 5 } else { 10 } - 3
}

Which fails with E0308: mismatched types of all things. Since type analysis happens well after parsing, I can only assume that there's some special case in the parser to "handle" this by kicking the problem farther down the line.


From a user's perspective, that code could be intended to behave in either of two different ways:

  1. Discard the result value of the if and return a constant -3:

    fn f(x: bool)->i32 {
        if x { 5 } else { 10 };
        - 3
    }
    
  2. Subtract 3 from the result value of the if:

    fn f(x: bool)->i32 {
        (if x { 5 } else { 10 }) - 3
    }
    

Edit: I filed a GitHub issue to hopefully improve the diagnostic output for this case.

2 Likes

This ambiguity is documented in the reference:

An expression that consists of only a block expression or control flow expression, if used in a context where a statement is permitted, can omit the trailing semicolon. This can cause an ambiguity between it being parsed as a standalone statement and as a part of another expression; in this case, it is parsed as a statement.

It's somewhat weird that this is allowed:

fn g(x: i32) {}

fn f(x: bool) {
    g(if x { 5 } else { 10 } - 3)
}

So fixing the grammar to remove the ambiguity is not as simple as I initially thought because whether this is allowed depends on the surrounding context of an expression.

1 Like

It's just a type error because it parses correctly, for instance this is valid code:

fn f(x: bool)->i32 {
    if x { () } else { () } - 3
}

The rule is that an expression statement that doesn't end with a semicolon must have type ().

In this case, the statement is not permitted, since function argument must be a single expression. So there's no ambiguity - if is resolved to be the part of that expression.

Yes indeed. What I meant when I said it's "somewhat weird" is that it makes the grammar context-dependent.

But it is still possible to resolve the ambiguity in a context-free way in the grammar. One would have to split Expression into two things: ExpressionWhereStatementIsPermitted and ExpressionWhereStatementIsNotPermitted.

1 Like

This technically is permitted since an if expression boils down to an expression. Which then is expression - 3 which is a possible operator expression.