Status of Rust grammar?


#1

Hello! Where can I find the most complete reference Rust grammar?

The document at https://doc.rust-lang.org/grammar.html seems both obsolete (it still has the proc type) and a work in progress (a ton of FIXMEs).


#2

The nightly version might be a bit newer: https://doc.rust-lang.org/nightly/grammar.html

There is also a reference grammar defined here: https://github.com/rust-lang/rust/tree/master/src/grammar


#3

Thanks!

I am a bit confused about which one is “the reference grammar”. This issue suggests that grammar.md is the reference, but src/grammar/* seems to be more recently updated.


#4

In general, this whole area needs work. I would love for someone to contribute a full grammar. src/doc/grammar is an attempt at a new grammar, src/grammar is older.


#5

My understanding is that the grammar in src/grammar is basically complete. It is testable and has good coverage, but it’s not tested automatically.


#6

You can run make check-grammar to test the grammar in src/grammar.

It is tested against all .rs files in src, except for those in src/test/parse-fail. Grammar testing was the reason why some tests were moved to parse-fail.


#7

I started work a little while ago writing a new grammar reference straight from the libsyntax source code (even found an ICE that way). I figured it would be easier to start again from scratch than attempt to work out which parts of the existing reference were and were not up-to-date.

Haven’t touched that in a while, though. If anyone else wants to pick it up, feel free to use the work in progress.


#8

Yeeahhh I worked on this for a bit but got… distracted :slight_smile: It’s an ok place to start contributing, though, if you’re up for doing a bit of research! You can even totally break everything about src/grammar and it’s ok :blush:


#9

Is this still the case? It’s still missing the grammar for no_struct_literal_expr. Any hints?


#10

My understanding is that nothing has fundamentally changed. Here’s a list of Rust grammars I know about: https://internals.rust-lang.org/t/grammar-of-rust/4094/2?u=matklad.


#11

Why is grammar not deemed fundamental? How is the language developing with an incomplete grammar? I’m not criticizing, I’m just curious. I’m trying to gather an honest view of Rust. I just hit a roadblock while learning about match and I just don’t seem to find a path towards a solid source of information. The latest missing piece is this: no_struct_literal_expr. Any hints?


#12

Yep, that’s tricky!

Consider this piece of Rust:

if foo {
    println!("foo is true");
}

clearly, this is an if expression. Now look at this:

let x = Foo { some_field: 92 }

clearly, this is a struct literal.

Now, the tricky bit:

if Foo {

Is this a complete if condition followed by the beginning of the “then” branch, or is it the beginning of a struct literal that is part of the condition? The Rust parser mandates the first parse. In other words, struct literals are forbidden in conditions (if, while, match), and no_struct_literal_expr means any expression except a struct literal.
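
To see the rule from the user's side, here is a small sketch (the Foo type and both function names are made up for illustration). A struct literal can still be used in a condition if it is wrapped in parentheses:

```rust
#[derive(PartialEq)]
struct Foo {
    some_field: i32,
}

// Compares `x` against a struct literal inside an `if` condition.
// Without the parentheses, `if *x == Foo { some_field: 92 } { ... }`
// is rejected, because the `{` after `Foo` is taken as the start of the block.
fn describe(x: &Foo) -> &'static str {
    if *x == (Foo { some_field: 92 }) {
        "ninety-two"
    } else {
        "something else"
    }
}

// The same restriction applies to the scrutinee of `match`.
fn field_via_match() -> i32 {
    match (Foo { some_field: 92 }) {
        Foo { some_field } => some_field,
    }
}
```

Calling describe(&Foo { some_field: 92 }) is fine without extra parentheses, since a function argument is not a condition position.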

Why is grammar not deemed fundamental? How is the language developing with an incomplete grammar?

There’s a certain disconnect between the theory and the practice of programming languages. In theory, you write a grammar for the language and then generate the parser using your favorite technique. In practice, almost all popular languages are implemented using a hand-written recursive descent parser, and are not easily expressed precisely with a formal grammar.

I would very much like to have a proper Rust grammar (I maintain an alternative parser :slight_smile: ), but Rust as a language can exist perfectly well without one.


#13

Thanks for the time… but I was looking for a grammar construct.

I agree, I am excited about Rust, but I have had trouble finding legitimate and complete docs so far. What is acceptable as no_struct_literal_expr? This needs to be properly defined somewhere. Where is it?

I’m trying to build a railroad syntax diagram and the no_struct_literal_expr is the missing link for me to gather a complete, technically sound, understanding of match.


#14

I’m trying to build a railroad syntax diagram and the no_struct_literal_expr is the missing link for me to gather a complete understanding of match.

You need to “duplicate” the grammar for expressions, and have separate rules for two cases: struct literals allowed, struct literal forbidden.

That is, at first glance, the Rust expression grammar should look like this:

expr ::= path | expr OP expr | expr.identifier | path { fields } 
if ::= if expr block [ else block ]

But this would be ambiguous (for an LL parser) when the if keyword is followed by a struct literal.

So it looks more like this

any_expr ::= path | any_expr OP any_expr | any_expr.identifier | path { fields } 
no_struct_literal_expr ::= path | no_struct_literal_expr OP any_expr | no_struct_literal_expr.identifier 
if ::= if no_struct_literal_expr block [ else block ]

I agree, I am excited about Rust, but I have trouble finding legitimate and complete docs so far.

Yeah, Rust has great user-oriented documentation, but the reference docs are not there yet, including the grammar. There is a recent effort to make the reference complete, though: https://github.com/rust-lang/rust/issues/38643

this needs to be properly defined somewhere. Where is it?

The only authoritative source of information about the grammar is the parser: https://github.com/rust-lang/rust/blob/master/src/libsyntax/parse/parser.rs (RESTRICTION_NO_STRUCT_LITERAL in particular).


#15

I was hoping the parser would be the second tier. I’m going to use what you gave me and see if I can come up with a reasonable grammar. Thanks.


#16

I know you’re not trying to be precise here, but for posterity, this definition is incorrect. A struct literal expression may also not appear on the RHS of an operator in a no_struct_literal_expr. (any_expr) is allowed. https://is.gd/seEH7Q


#17

Yep, thanks! I’ve mixed that up with RESTRICTION_STMT_EXPR, which controls this:

fn main() {
  // valid
  () == { () }; 

  // syntax error   
  { () } == ();

  // valid
  let _ =   () == { () }; 
}

#18

Though it is indeed the case that you can switch from no_struct_literal_expr to unrestricted expr, the better example would be parens:

any_expr ::= ... | ( any_expr ) | ...
no_struct_literal_expr ::= ... | ( any_expr ) | ...

So, something like if (S { field: 0 } == S { field: 1 }) {} will parse OK because of the parentheses.
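
For anyone who wants to see how a hand-written parser can carry such a restriction around, here is a toy recursive descent sketch. It is not taken from libsyntax: the Tok, Expr and Parser types, the parse_condition helper, and the tiny grammar (paths, empty struct literals, ==, parentheses) are all made up for illustration. The no_struct flag plays the role that RESTRICTION_NO_STRUCT_LITERAL plays in the real parser: it is passed on to the right-hand side of an operator, and it is dropped inside parentheses.

```rust
// Toy tokens: identifiers, `{`, `}`, `(`, `)`, and `==`.
#[derive(Debug, Clone, PartialEq)]
enum Tok { Ident(String), LBrace, RBrace, LParen, RParen, EqEq }

#[derive(Debug, PartialEq)]
enum Expr {
    Path(String),
    StructLit(String), // `Name { }`, fields omitted for brevity
    Eq(Box<Expr>, Box<Expr>),
    Paren(Box<Expr>),
}

struct Parser { toks: Vec<Tok>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<&Tok> { self.toks.get(self.pos) }

    fn bump(&mut self) -> Option<Tok> {
        let t = self.toks.get(self.pos).cloned();
        self.pos += 1;
        t
    }

    // expr ::= primary ( `==` primary )*
    // When `no_struct` is true, struct literals are not allowed here.
    fn expr(&mut self, no_struct: bool) -> Option<Expr> {
        let mut lhs = self.primary(no_struct)?;
        while self.peek() == Some(&Tok::EqEq) {
            self.bump();
            // The restriction also applies to the right-hand side.
            let rhs = self.primary(no_struct)?;
            lhs = Expr::Eq(Box::new(lhs), Box::new(rhs));
        }
        Some(lhs)
    }

    // primary ::= `(` expr `)` | ident | ident `{` `}`
    fn primary(&mut self, no_struct: bool) -> Option<Expr> {
        match self.bump()? {
            Tok::LParen => {
                // Parentheses lift the restriction for the inner expression.
                let inner = self.expr(false)?;
                if self.bump()? != Tok::RParen { return None; }
                Some(Expr::Paren(Box::new(inner)))
            }
            Tok::Ident(name) => {
                if !no_struct && self.peek() == Some(&Tok::LBrace) {
                    self.bump(); // consume `{`
                    if self.bump()? != Tok::RBrace { return None; }
                    Some(Expr::StructLit(name))
                } else {
                    // Under the restriction, a following `{` is left alone
                    // so that the caller can parse it as a block.
                    Some(Expr::Path(name))
                }
            }
            _ => None,
        }
    }
}

// Parses a condition (restriction on) and returns the expression together
// with the number of tokens consumed.
fn parse_condition(toks: Vec<Tok>) -> Option<(Expr, usize)> {
    let mut p = Parser { toks, pos: 0 };
    let e = p.expr(true)?;
    Some((e, p.pos))
}
```

Fed the tokens of Foo { }, parse_condition stops after Foo and leaves the braces for the block. Fed the tokens of ( S { } == S { } ), it consumes everything up to the closing parenthesis and produces two struct literals inside a Paren node.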