Macros and operator precedence

I'm trying to understand how declarative macros deal with operator precedence. The following example shows a macro invocation and the expression you would get if you naively unfolded the macro by inlining the argument tokens. These behave differently, which shows that the macro is doing something smarter.

macro_rules! assert_ {
    ($e:expr) => {
        if !$e {
            println!("this doesn't print")
        }
    }
}

pub fn main() {
    assert_!(0usize != 1);
    if !0usize != 1 {
        println!("this prints")
    }
}

One may have their own guesses as to what is going on, but is this behavior actually documented somewhere so I can read more about it?

The reason I was surprised by this is that I had a variant of such a macro where the body is wrapped in a proc macro, which then parses the expansion of !$e naively again. So I'm looking for a semi-formal description of macro expansion to see if there are more surprises of that kind to be aware of.

Rust declarative macros operate on the AST (abstract syntax tree) rather than the source text. If the grammar is something like

not-expr ::= ’!’ expr

and there’s a macro in the position of expr, it’s simply impossible for it to "spread" or edit the tree outside its own node.

However, proc macros are more primitive and operate on linear token streams instead, and it’s up to the macro writer to work things out and possibly build an AST themselves – which is why the syn crate is so popular and one reason why proc macros are considered somewhat black magic and difficult to write.

In general, decl macros also have variable hygiene – they can’t touch bindings outside them unless the name is explicitly given, and cannot leak bindings outside, again unless the name is passed as a parameter. My understanding is that proc macros are not hygienic in this way either.
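To make the hygiene point concrete, here is a minimal sketch. The macro and function names (`shadow_x`, `set_named`, `hygiene_demo`) are made up for illustration and do not come from the thread.

```rust
// Hypothetical macro: introduces its own `x`, which lives in the macro's
// hygiene context and is invisible to the caller.
macro_rules! shadow_x {
    () => {
        let x = 99;
        let _ = x;
    };
}

// Hypothetical macro: the binding name is passed in by the caller,
// so the new binding is visible at the call site.
macro_rules! set_named {
    ($name:ident) => {
        let $name = 99;
    };
}

fn hygiene_demo() -> (i32, i32) {
    let x = 1;
    shadow_x!();
    // The caller's `x` is untouched by the macro's internal `x`.
    let after_hidden = x;
    set_named!(x);
    // Here the macro shadowed `x`, because the name was given explicitly.
    let after_named = x;
    (after_hidden, after_named)
}

fn main() {
    let (after_hidden, after_named) = hygiene_demo();
    println!("after shadow_x!: {after_hidden}, after set_named!(x): {after_named}");
}
```

The only difference between the two macros is where the identifier comes from: written inside the macro body, or supplied by the caller.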


When you match an entire expression with $e:expr, the substitution of $e into the output is “wrapped in invisible parentheses” so that it always acts like a single token (as all paired delimiters[1] in Rust do) and it is not subject to precedence confusions. All macro fragment specifiers that match more than one token do this. You can bypass this protection by matching a sequence of single tokens instead:

macro_rules! assert_ {
    ($( $e:tt )*) => {
        if !$( $e )* {
            println!("this doesn't print")
        }
    }
}

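With this version, the call assert_!(0usize != 1) does make the “naive inlining” mistake: it expands to if !0usize != 1 { ... }, which parses as (!0usize) != 1.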
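The difference between the two matchers can be checked side by side. This is only a sketch: `negate_expr`, `negate_tts`, `grouped` and `naive` are names invented for the comparison, not anything from the original macros.

```rust
// Hypothetical helper: captures the argument as one expression node,
// so `!` applies to the whole expression.
macro_rules! negate_expr {
    ($e:expr) => {
        !$e
    };
}

// Hypothetical helper: captures raw token trees, so after substitution
// `!` binds only to the first operand.
macro_rules! negate_tts {
    ($($t:tt)*) => {
        !$($t)*
    };
}

fn grouped() -> bool {
    // Behaves like !(0usize != 1).
    negate_expr!(0usize != 1)
}

fn naive() -> bool {
    // Behaves like (!0usize) != 1, i.e. usize::MAX != 1.
    negate_tts!(0usize != 1)
}

fn main() {
    println!("expr fragment: {}", grouped());
    println!("tt fragment:   {}", naive());
}
```

With the `expr` matcher the result is `false`, with the `tt` matcher it is `true`, which mirrors the "doesn't print" / "prints" split in the question.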

Since these parentheses are invisible (they are a token-tree node but correspond to no characters), they are lost when macro expansions are printed.


  1. except for <..>


Yeah, it's not super well documented. The Rust Reference section on "Macros By Example" has this to say:

When forwarding a matched fragment to another macro-by-example, matchers in the second macro will see an opaque AST of the fragment type. The second macro can’t use literal tokens to match the fragments in the matcher, only a fragment specifier of the same type. The ident, lifetime, and tt fragment types are an exception, and can be matched by literal tokens.

What it means is, once a macro_rules! pattern successfully matches some tokens as an expr, those tokens stay bundled up as an expression node permanently. Wherever they get pasted, they won't be re-parsed from scratch again. So, as you noticed, operator precedence doesn't get a chance to regroup them.
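A small sketch of what that paragraph from the Reference describes, assuming a made-up `classify` macro whose first rule matches the literal token `3`. All names here are hypothetical.

```rust
// Hypothetical macro: the first rule matches the literal token `3`,
// the second accepts any expression.
macro_rules! classify {
    (3) => {
        "literal token"
    };
    ($other:expr) => {
        "opaque expression"
    };
}

// Forwards the argument as an `expr` fragment: `classify!` then sees an
// opaque expression node and cannot match it against the literal `3`.
macro_rules! forward_as_expr {
    ($e:expr) => {
        classify!($e)
    };
}

// Forwards the argument as a `tt` fragment: the token stays transparent.
macro_rules! forward_as_tt {
    ($t:tt) => {
        classify!($t)
    };
}

fn direct() -> &'static str {
    classify!(3)
}

fn via_expr() -> &'static str {
    forward_as_expr!(3)
}

fn via_tt() -> &'static str {
    forward_as_tt!(3)
}

fn main() {
    println!("direct:   {}", direct());
    println!("via expr: {}", via_expr());
    println!("via tt:   {}", via_tt());
}
```

The same `3` takes a different rule depending on whether it was forwarded as `expr` or as `tt`, which is the opacity the Reference is talking about.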

Also, it'll be an error to use them in a context where expressions aren't allowed. For example:

macro_rules! zero {
    ($e:expr) => { 0 as $e };  // error: expected type, found expr
}
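For contrast, a version that should compile is one where the fragment specifier matches the position it is pasted into. This is a sketch with a hypothetical macro name, `zero_of`, using `ty` instead of `expr`:

```rust
// Hypothetical variant: `$t` is matched as a type, so it is allowed
// on the right-hand side of `as`.
macro_rules! zero_of {
    ($t:ty) => {
        0 as $t
    };
}

fn zero_u8() -> u8 {
    zero_of!(u8)
}

fn zero_f64() -> f64 {
    zero_of!(f64)
}

fn main() {
    println!("{} {}", zero_u8(), zero_f64());
}
```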

Thanks for the clarifying answers. I guess if the tt fragment type and proc macros are considered "for experts" and you can observe that kind of difference only through them, then it makes sense that this subtlety is not documented in more excruciating detail in the reference book.

The playground's "Expand macros" tool does print the parentheses, though.
