Procedural macros on stable Rust today (kind of)

TL;DR

Use a build script with syn and quote to parse the source file for a module, find macro invocations in the AST, expand them, write generated code to $OUT_DIR, and use it with include!. See example.

Intro

html5ever is the conforming HTML parser written in Rust for Servo. It heavily uses macros to make the code easier to read and maintain, closer to how the spec is written. For example, part of the tree builder looks like:

match mode {
    InBody => match_token!(token { 
        tag @ </a> </b> </big> </code> </em> </font>
              </i> </nobr> </s> </small> </strike>
              </strong> </tt> </u> => {
            self.adoption_agency(tag.name);
            Done
        }

        tag @ <h1> <h2> <h3> <h4> <h5> <h6> => {
            self.close_p_element_in_button_scope();
            if self.current_node_in(heading_tag) {
                // ...

2014: #[plugin_registrar]

"Rust macros in html5ever" is the slide deck of a presentation that @kmc (who originally wrote html5ever) gave about this at the time.

Some of the macros were implemented with macro_rules! and are mostly unchanged today. Others were procedural macros, written as compiler plugins with #[plugin_registrar] and libsyntax. We didn’t worry too much about this stuff being unstable, since the entire language was unstable at the time and any code had to be regularly ported to newer compiler versions.

2015: build.rs + libsyntax + shipping generated code

Rust 1.0 shipped with the mechanism you know for marking features #[unstable] and preventing their use in compilers from the stable release channel. Of course compiler plugins with #[plugin_registrar] are unstable, but we’d like to make html5ever available to users on stable.

One of the plugins (creating a static phf::Map from a JSON file) I could replace with a build script generating a file that is then used with include!(). This scenario is well supported by rust-phf.
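As a rough sketch of that pattern (not the actual html5ever or rust-phf code), a build script can write a Rust source file into the directory Cargo provides and the crate can include! it. The function name, the file name generated_table.rs, the NAMED_ENTITIES table and its entries are all hypothetical, and a plain slice stands in for phf::Map to keep the example dependency-free.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a generated Rust source file into `out_dir` and returns its path.
/// The table name and the file name are hypothetical, for illustration only.
fn write_generated_table(out_dir: &Path, entries: &[(&str, u32)]) -> io::Result<PathBuf> {
    let mut code = String::from("pub static NAMED_ENTITIES: &[(&str, u32)] = &[\n");
    for (name, value) in entries {
        // `{:?}` prints the name as a quoted, escaped Rust string literal.
        code.push_str(&format!("    ({:?}, {}),\n", name, value));
    }
    code.push_str("];\n");
    let path = out_dir.join("generated_table.rs");
    fs::write(&path, code)?;
    Ok(path)
}

// In build.rs, `main` would call this with the directory Cargo provides:
//     let out_dir = std::env::var_os("OUT_DIR").unwrap();
//     write_generated_table(Path::new(&out_dir), &[("amp", 38)]).unwrap();
//
// The crate then pulls the generated file in with:
//     include!(concat!(env!("OUT_DIR"), "/generated_table.rs"));
```

Because the file lives in $OUT_DIR, nothing generated needs to be committed to the repository.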

The other one, match_token!, is more complex and relied on libsyntax to parse its "arguments" and generate code (with quasi-quoting). I managed to make it not a plugin by instead having a build script:

  • Read the source file for a module (this macro is only used in one file)
  • Parse the source code into a token tree
  • Find invocations of the match_token! macro
  • Expand them
  • Serialize back to Rust syntax
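To give an idea of the "token tree" and "find invocations" steps, here is a toy sketch that does not use libsyntax at all. The TokenTree type, parse_trees and count_invocations are hypothetical names, and the lexing is deliberately naive: tokens are split on whitespace and delimiters only, and string literals and comments are not handled.

```rust
use std::str::Chars;

#[derive(Debug, PartialEq)]
enum TokenTree {
    /// A single token (naively: a run of non-whitespace, non-delimiter chars).
    Token(String),
    /// A delimited group: the opening delimiter and the trees inside it.
    Group(char, Vec<TokenTree>),
}

/// Parses trees until `close` is found (or until end of input if `close` is None).
fn parse_trees(chars: &mut Chars<'_>, close: Option<char>) -> Result<Vec<TokenTree>, String> {
    let mut trees = Vec::new();
    let mut word = String::new();
    while let Some(c) = chars.next() {
        let is_open = matches!(c, '(' | '[' | '{');
        let is_close = matches!(c, ')' | ']' | '}');
        // Whitespace and delimiters both end the token being accumulated.
        if (c.is_whitespace() || is_open || is_close) && !word.is_empty() {
            trees.push(TokenTree::Token(std::mem::take(&mut word)));
        }
        if is_open {
            let expected = match c { '(' => ')', '[' => ']', _ => '}' };
            trees.push(TokenTree::Group(c, parse_trees(chars, Some(expected))?));
        } else if is_close {
            return if close == Some(c) {
                Ok(trees)
            } else {
                Err(format!("unexpected {:?}", c))
            };
        } else if !c.is_whitespace() {
            word.push(c);
        }
    }
    if !word.is_empty() {
        trees.push(TokenTree::Token(word));
    }
    match close {
        None => Ok(trees),
        Some(c) => Err(format!("missing {:?}", c)),
    }
}

/// Counts `name!` tokens that are immediately followed by a group, at any depth.
fn count_invocations(trees: &[TokenTree], name: &str) -> usize {
    let bang = format!("{}!", name);
    let mut count = 0;
    for (i, tree) in trees.iter().enumerate() {
        match tree {
            TokenTree::Token(t)
                if *t == bang && matches!(trees.get(i + 1), Some(TokenTree::Group(..))) =>
            {
                count += 1
            }
            TokenTree::Group(_, inner) => count += count_invocations(inner, name),
            TokenTree::Token(_) => {}
        }
    }
    count
}
```

A real expander would replace each matched token-plus-group pair with generated trees and then print the whole thing back out as Rust source.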

But all of this still used libsyntax (as a library in a standalone program rather than as a compiler plugin, but still unstable), so stable users could not run it on every build. So we had the generated file go to the source directory rather than the temporary $OUT_DIR, committed it to the repository, and shipped it in crates.io releases. By default the build script would only check that the generated file was up-to-date (with a hash of the source file); only with an optional Cargo feature would it regenerate the file (and depend on libsyntax).
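The up-to-date check can be sketched roughly like this. This is an illustration under assumptions, not the code html5ever used: the "// source-hash:" header line, the function names, and the choice of FNV-1a as the hash are all made up for the example. (A fixed, hand-written hash has the advantage that its output does not change between compiler versions, which matters when the hash is committed alongside the generated file.)

```rust
/// 64-bit FNV-1a: a small, stable, non-cryptographic hash.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical header line stored at the top of the generated file.
fn hash_header(source: &str) -> String {
    format!("// source-hash: {:016x}", fnv1a_64(source.as_bytes()))
}

/// True if the first line of `generated` records the hash of `source`.
fn is_up_to_date(source: &str, generated: &str) -> bool {
    let expected = hash_header(source);
    generated
        .lines()
        .next()
        .map_or(false, |first| first == expected.as_str())
}
```

The default build would only call something like is_up_to_date and fail with a helpful message if it returns false; regenerating the file is left to the opt-in path.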

This worked surprisingly well, but we still had code to update every time libsyntax broke some API (which happens more often than we actually need to run this code, since that file of html5ever doesn’t change a lot these days). It turns out that at some point we accidentally made CI stop building that code, so when we recently noticed, we had a fair amount of catching up on libsyntax to do.

So I tried something else.

Today: syn and quote crates

I rewrote this build script to do pretty much the same as before, but without libsyntax. At first I did it with only string manipulation. It worked, but was very ugly and fragile.
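For a sense of what the string-manipulation approach involves (and why it is fragile), here is a minimal sketch, not the code I actually wrote. The function name find_macro_bodies is hypothetical. It looks for `name!(` and counts parentheses to find the matching close, which breaks as soon as a parenthesis appears inside a string literal or a comment.

```rust
/// Returns the text between the parentheses of every `name!( ... )`
/// invocation in `source`. Nested invocations of the same macro inside
/// a body are not reported separately.
fn find_macro_bodies<'a>(source: &'a str, name: &str) -> Vec<&'a str> {
    let needle = format!("{}!(", name);
    let mut bodies = Vec::new();
    let mut search_from = 0;
    while let Some(offset) = source[search_from..].find(needle.as_str()) {
        let body_start = search_from + offset + needle.len();
        let mut depth = 1usize;
        let mut body_end = None;
        // Walk forward, tracking parenthesis nesting until it returns to zero.
        for (i, c) in source[body_start..].char_indices() {
            match c {
                '(' => depth += 1,
                ')' => {
                    depth -= 1;
                    if depth == 0 {
                        body_end = Some(body_start + i);
                        break;
                    }
                }
                _ => {}
            }
        }
        match body_end {
            Some(end) => {
                bodies.push(&source[body_start..end]);
                search_from = end + 1;
            }
            // Unbalanced parentheses: give up rather than guess.
            None => break,
        }
    }
    bodies
}
```

Parsing into a real AST removes this whole class of problems, which is the main reason to move away from this approach.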

I then rewrote it again using the syn and quote crates (both by @dtolnay). The former can parse Rust code into an AST, and the latter provides very nice quasi-quoting. This build script uses no unstable language features, so we can run it unconditionally and stop shipping generated code. There are a couple of hacks to work around bugs, but overall I’m pretty pleased with the result.

Tomorrow?

At the moment Macros 1.1 only supports custom derive attributes. Hopefully it will soon be updated to support procedural macros in expressions (and items, etc.). At that point (and when Macros 1.1 itself reaches the stable channel) we’ll be able to get rid of half of this giant hack (the part that parses an entire module, walks the AST to find macro invocations, etc.).

In the meantime, though, it’s possible to hack something together and get procedural macros on stable Rust today.
