Why are parentheses required when immediately calling a closure?

fn main(){
    let f = ||{}(); // the compiler rejects this line
}

The compiler suggests surrounding the closure with parentheses. However, according to the syntax documented in Call expressions - The Rust Reference:

CallExpression :
Expression ( CallParams ? )

CallParams :
Expression ( , Expression )* , ?

The Expression preceding (...) can be

Expression :

 ExpressionWithoutBlock | ExpressionWithBlock

Where the ExpressionWithoutBlock can be a ClosureExpression.

That is, ||{}() should be syntactically valid as a function call. However, the compiler reports the code as ill-formed and says that surrounding parentheses are required.

I wonder why this code is ill-formed when it should be valid according to the Rust Reference.


Incidentally, if the closure is used as the Expression in a MethodCallExpression, it is permitted by the compiler.


The Reference isn't normative and the grammar within is informal and incomplete.

That calls a method on the value of the expression or block following ||, rather than on the closure itself.
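A small sketch of that point, using only the standard library. The function names `unparenthesized` and `parenthesized` are made up for illustration; the idea is just that without parentheses the method call ends up inside the closure body, so the binding is still a closure.

```rust
// Hypothetical helper names, for illustration only.

/// `|| 5.to_string()` is parsed as `|| (5.to_string())`:
/// the method call belongs to the closure body.
fn unparenthesized() -> String {
    let f = || 5.to_string(); // `f` is a closure that returns a String
    f()
}

/// With parentheses, the method is called on the closure itself.
fn parenthesized() -> i32 {
    // A closure that captures nothing implements Clone.
    let g = (|| 5).clone(); // `g` is a copy of the closure
    g()
}

fn main() {
    println!("{}", unparenthesized());
    println!("{}", parenthesized());
}
```

In the first function `f` has to be called before a String is available, which shows that `.to_string()` did not apply to the closure.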


Come to think of it, the second part of my reply also explains why the parentheses are needed: without them, the operand of the call expression[1] is the expression/block following ||.

fn main() {
    let foo: Box<dyn FnOnce() -> i32> = Box::new(|| 0);
    let f = || { foo }();
    // fails
    // let _: Box<_> = f;
}

  [1] The thing that is called.


In this particular case it's not that.

Why do you start with the call expression? The closure comes first, and then you have the expression {}(), which is interpreted as an empty block that is then called.

It's not clear why the opening ( would have to end the closure expression, and in fact it doesn't.

Syntax parsing in Rust follows the “old school” approach in which form alone determines the structure of the AST.

That is why, in particular, it has the turbofish operator: that way one doesn't need to know whether i32 is a type or the name of a variable to know how to parse foo<i32>(1).
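For readers who haven't met the turbofish, here is a minimal sketch using two standard-library methods, `str::parse` and `Iterator::collect`. The wrapper function names are invented for the example.

```rust
// Hypothetical wrapper names; `parse` and `collect` are standard library methods.

/// Parses a decimal string, naming the target type with a turbofish.
fn parse_with_turbofish(s: &str) -> i32 {
    // `::<i32>` tells the parser that a generic argument list follows,
    // so `<` cannot be mistaken for a comparison operator.
    s.parse::<i32>().expect("input should be a valid i32")
}

/// Collects an iterator, naming the container type with a turbofish.
fn collect_doubled() -> Vec<i32> {
    (1..=3).map(|x| x * 2).collect::<Vec<i32>>()
}

fn main() {
    println!("{}", parse_with_turbofish("42"));
    println!("{:?}", collect_doubled());
}
```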

Indeed. And the error message generator is allowed to jump around and parse code in an arbitrary fashion, because error messages can be wrong and ambiguous.

Parsing happens from left to right, and if the AST turns out not to make sense at a later stage of compilation, it is not restructured so that the compiler can accept the code.

Experience shows that this is the best compromise. If you add the smarts to the compiler instead of to the error reporting, and teach the compiler to do “what the developer meant, not what the developer wrote”, you end up not with a nice, easy-to-use language but with a fractal of bad design. If you don't add smarts to the error messages, people start complaining that your compiler messages are cryptic and hard to understand.


The grammar is ambiguous: it could be (||{})() or ||({}()). The precedence of parsing here doesn't seem to be specified by the Reference.


Without the parens it means something else:

fn not_a_closure() {}

fn main() {
    let x = { not_a_closure }();
}
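A variant of the same idea that returns a value, so the effect can be checked. The names `answer`, `call_through_block` and `closure_without_parens` are placeholders for this sketch.

```rust
// Placeholder names, for illustration only.

fn answer() -> i32 {
    42
}

/// In expression position, `{ answer }` is a block whose value is the
/// function `answer`; the trailing `()` then calls that value.
fn call_through_block() -> i32 {
    let x = { answer }();
    x
}

/// After `||`, the same `{ answer }()` becomes the closure body,
/// so `f` is a closure and nothing is called until `f()` runs.
fn closure_without_parens() -> i32 {
    let f = || { answer }();
    f()
}

fn main() {
    println!("{}", call_through_block());
    println!("{}", closure_without_parens());
}
```

The compiler may warn about unnecessary braces here; they are kept on purpose to mirror the snippet above.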

I spent way too long looking at this, but I think I see what you mean now.

Closures don't require braces, e.g. we can have a closure that returns ()

let f = || ();

So then the following is ambiguous:

let f = || { foo(); } ();

It could realistically be interpreted as either of these two:

let f1 = (|| { foo(); }) ();
let f2 = || ({ foo(); } ());

Yes essentially, but it's not ambiguous. It's always the second one.


What do you mean by that? It's not about precedence, it's about greediness.

The Rust parser is greedy (exceptio probat regulam in casibus non exceptis: there are non-greedy stars in the grammar, which means all other stars are greedy).

This means that after you see || there is no ambiguity: {}() can only be parsed one way, and because the parser is greedy there is no reason to stop after {}.
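The greedy behaviour can be observed with a binary operator as well as with a call. This is only a sketch with made-up function names: the closure body keeps going past the closing brace unless parentheses end the closure expression first.

```rust
// Made-up function names, for illustration only.

/// Parsed as `|| ({ 1 } + 1)`: the body does not stop at `}`.
fn greedy_body() -> i32 {
    let f = || { 1 } + 1; // `f` is a closure; calling it yields 2
    f()
}

/// Parentheses end the closure expression, so `()` calls the closure
/// right away and the `+ 1` happens outside of it.
fn parenthesized_call() -> i32 {
    let n = (|| { 1 })() + 1; // `n` is already an i32
    n
}

fn main() {
    println!("{}", greedy_body());
    println!("{}", parenthesized_call());
}
```

Both functions produce the same number, but in the first one `f` is a closure that still has to be called, while in the second one `n` is a plain integer.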

The grammar is, formally speaking, still ambiguous. Greedy is one way of resolving the ambiguity, but that only resolves the ambiguity for the parser, not the grammar.

The difference rarely matters in practice, but a grammar does not care about ambiguity or precedence; both potential parse trees are valid productions of the grammar.

Specifying a grammar is a convenient way to specify a parser, but in the case of an ambiguous grammar you need additional information to disambiguate the parse. Specifying greediness and/or parse precedence is such side information. Oftentimes such side information can be reified into the grammar, sometimes it can't, oftentimes doing so makes the grammar practically useless for anything other than generating a parser.

I also disagree with the idea that the presence of nongreedy repetition implies all other repetition should be considered greedy. Another valid interpretation is that all other repetition should be considered (intended to be) non-ambiguous, where both greedy and nongreedy (should) produce the same parse tree.

Barring formal generation or verification of a parser against a grammar, though, recursive descent parsers absolutely will tend to grow cases where formally ambiguous cases are resolved greedily. It's the natural result of the technique.


By the way / fun fact: if your closure adds a (return) type signature, then the braces become part of, and thus delimit, the closure expression, instead of merely being an expression that happens to be a block expression as with || {}.

Thus

let f = || -> _ {} ();

compiles fine, meaning the same as

let f = ( || -> _ {} ) ();

and parenthesizing (with normal parens) would be syntactically wrong

// This does not compile: “error: expected `{`, found `(`”

let f = || -> _ ( {} () );
//              ^ expected `{`
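A variant of the compiling form with a concrete return type, so the result of the immediate call can be checked. The function name is invented for this sketch.

```rust
// Invented function name, for illustration only.

/// With an explicit `-> i32`, the closure body must be a block, so the
/// closure expression ends at `}` and the trailing `()` calls the closure.
fn immediate_call_with_return_type() -> i32 {
    let v = || -> i32 { 7 }(); // `v` is an i32, not a closure
    v
}

fn main() {
    println!("{}", immediate_call_with_return_type());
}
```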
