How to parse enum macro?

dakom · December 26, 2019, 8:32am

This is kinda what prompted me to post "View macro output live?", but they are separate topics, so I figured it's best to post this separately...

I'd like to write something like:

bridge_events![
    InitialLoad,
    ChangeTodo(EntityId, String),
]

which will turn into both of these:

#[cfg_attr(feature = "ts_test", derive(EnumIter, AsRefStr))]
#[derive(FromPrimitive, Copy, Clone, Debug)]
#[repr(u32)]
pub enum BridgeEvent {
    InitialLoad,
    ChangeTodo,
}

pub enum Event {
    InitialLoad,
    ChangeTodo(EntityId, String),
}

So it separates the enum variant name from the params to create the BridgeEvent variants and then reconstructs it (or uses the original) to just write Event variants directly.

I'd imagine the attributes above BridgeEvent don't really matter - but figured I'd include it just in case that changes things...

Yandros · December 26, 2019, 10:27am

You "just" have to manually parse what represents the syntax of an enum definition: a comma-separated sequence of variant names, where each may be followed by a parenthesized, comma-separated sequence of types, and both sequences accept trailing commas (technically, braces with named fields are a possibility too, but handling both within the same macro would make it much more complex):

macro_rules! bridge_events {(
    $(
        $VariantName:ident $(
            ( $($T:ty),+ $(,)? )
        )?
    ),+ $(,)?
) => (
    #[cfg_attr(feature = "ts_test",
        derive(EnumIter, AsRefStr),
    )]
    #[derive(FromPrimitive, Copy, Clone, Debug)]
    #[repr(u32)]
    pub
    enum BridgeEvent {
        $(
            $VariantName ,
        )+
    }

    pub
    enum Event {
        $(
            $VariantName $(
                ( $($T ,)+ )
            )? ,
        )+
    }
)}

Playground

See The Little Book of Rust macros for a guide regarding them.
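
For a quick sanity check, the macro above can be exercised with only standard-library derives. This is a minimal sketch rather than the exact playground code: the `EntityId` alias is a hypothetical stand-in for the real type, the external derives (FromPrimitive, EnumIter, AsRefStr) are dropped, and `Debug`/`PartialEq` are added to `Event` purely so that values can be compared.

```rust
// Hypothetical stand-in for the real `EntityId` type.
type EntityId = u32;

macro_rules! bridge_events {(
    $(
        $VariantName:ident $(
            ( $($T:ty),+ $(,)? )
        )?
    ),+ $(,)?
) => (
    // Assumption: std derives only, so the sketch has no dependencies.
    #[derive(Copy, Clone, Debug, PartialEq)]
    #[repr(u32)]
    pub enum BridgeEvent {
        // Variant names only, payloads stripped.
        $( $VariantName , )+
    }

    // Assumption: these derives are not in the original, they only
    // make the generated enum easy to inspect.
    #[derive(Debug, PartialEq)]
    pub enum Event {
        // Variant names with their (optional) tuple payloads.
        $( $VariantName $( ( $($T ,)+ ) )? , )+
    }
)}

bridge_events![
    InitialLoad,
    ChangeTodo(EntityId, String),
];
```

With this, `BridgeEvent::ChangeTodo as u32` is `1`, while `Event::ChangeTodo(42, "buy milk".to_string())` still carries its payload. Note the trailing `;`, which a `[...]`-delimited invocation needs when it is used as an item.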

dakom · December 27, 2019, 6:10am

Thanks!

Right - it should support braces too... is there not a way to capture "everything after the variant name" to cover both cases (parens and braces)?

Yandros · December 27, 2019, 8:25pm

Yes, both ( ... ) and { ... } can be captured in the same rule (with the :tt "category"), but , is also a :tt (everything kind of is), meaning that using

    $VariantName:ident $( $assoc:tt )? ,

is ambiguous, so we need to parse this manually using different rules and recursion (the "recursive muncher" pattern; cf. the aforementioned Little Book of Rust Macros):

You start by setting up the muncher with this kind of "entry point" rule:

// == ENTRY POINT ==
(
    $($input:tt)*
) => (bridge_events! {
    // a sequence of brace-enclosed variants
    @variants []
    // remaining tokens to parse
    @parsing
        $($input)*
});

Hoping to end with something like:

// Done parsing, time to generate code:
(
    @variants [
        $(
            {
                $VariantName:ident $( $variant_assoc:tt )?
            }  
        )*
    ]
    @parsing
        // Nothing left to parse
) => (
    #[cfg_attr(feature = "ts_test",
        derive(EnumIter, AsRefStr),
    )]
    #[derive(FromPrimitive, Copy, Clone, Debug)]
    #[repr(u32)]
    pub
    enum BridgeEvent {
        $(
            $VariantName ,
        )*
    }

    pub
    enum Event {
        $(
            $VariantName $( $variant_assoc )? ,
        )*
    }
);

(In @variants, wrapping each variant in its own outer braces is what makes it possible to parse the sequence unambiguously, even though each entry ends with optional tokens.)
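
To see the brace trick in isolation, here is a small sketch (the `variant_names!` macro and the `names` function are hypothetical, made up for this illustration). Because each entry sits in its own `{ ... }`, the optional `:tt` after the name can never swallow the start of the next entry:

```rust
// Each entry is `{ Name }`, `{ Name(...) }` or `{ Name { ... } }`.
// The outer braces delimit entries, so `$( $assoc:tt )?` is unambiguous.
macro_rules! variant_names {
    ( $( { $name:ident $( $assoc:tt )? } )* ) => {
        // Only the names are used here; the payload is ignored.
        [ $( stringify!($name) ),* ]
    };
}

fn names() -> [&'static str; 3] {
    variant_names!(
        { InitialLoad }
        { ChangeTodo(u32, String) }
        { Rename { id: u32 } }
    )
}
```

Here `names()` returns `["InitialLoad", "ChangeTodo", "Rename"]`, whether the entry had no payload, a parenthesized one, or a braced one.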

And now, the recursion / stepping logic:

// VariantName
(
    @variants [
        $($variants:tt)*
    ]
    @parsing
        $VariantName:ident
        $(, $($input:tt)*)?
) => (bridge_events! {
    @variants [
        $($variants)*
        {
            $VariantName
        }
    ]
    @parsing
        $( $($input)* )?
});

// VariantName(...)
(
    @variants [
        $($variants:tt)*
    ]
    @parsing
        $VariantName:ident ( $($tt:tt)* )
        $(, $($input:tt)*)?
) => (bridge_events! {
    @variants [
        $($variants)*
        {
            $VariantName ($($tt)*)
        }
    ]
    @parsing
        $( $($input)* )?
});

// VariantName { ... }
(
    @variants [
        $($variants:tt)*
    ]
    @parsing
        $VariantName:ident { $($tt:tt)* }
        $(, $($input:tt)*)?
) => (bridge_events! {
    @variants [
        $($variants)*
        {
            $VariantName { $($tt)* }
        }
    ]
    @parsing
        $( $($input)* )?
});

If the accept-it-all entry-point rule came first, it would also match every recursive call, and the macro would recurse within it indefinitely; that's why it has to be the last rule.

Playground
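
Put together, the rules above could be assembled roughly as follows. This is a sketch under a few assumptions: only standard-library derives are used (the external FromPrimitive, EnumIter and AsRefStr derives are left out), `Debug`/`PartialEq` are added to `Event` just for inspection, and `EntityId` and the `Rename` variant are hypothetical.

```rust
macro_rules! bridge_events {
    // Done parsing: every variant is wrapped in its own `{ ... }`.
    (
        @variants [ $( { $VariantName:ident $( $variant_assoc:tt )? } )* ]
        @parsing
    ) => (
        #[derive(Copy, Clone, Debug, PartialEq)]
        #[repr(u32)]
        pub enum BridgeEvent { $( $VariantName , )* }

        #[derive(Debug, PartialEq)]
        pub enum Event { $( $VariantName $( $variant_assoc )? , )* }
    );

    // Step: `VariantName`
    (
        @variants [ $($variants:tt)* ]
        @parsing $VariantName:ident $(, $($input:tt)*)?
    ) => (bridge_events! {
        @variants [ $($variants)* { $VariantName } ]
        @parsing $( $($input)* )?
    });

    // Step: `VariantName(...)`
    (
        @variants [ $($variants:tt)* ]
        @parsing $VariantName:ident ( $($tt:tt)* ) $(, $($input:tt)*)?
    ) => (bridge_events! {
        @variants [ $($variants)* { $VariantName ( $($tt)* ) } ]
        @parsing $( $($input)* )?
    });

    // Step: `VariantName { ... }`
    (
        @variants [ $($variants:tt)* ]
        @parsing $VariantName:ident { $($tt:tt)* } $(, $($input:tt)*)?
    ) => (bridge_events! {
        @variants [ $($variants)* { $VariantName { $($tt)* } } ]
        @parsing $( $($input)* )?
    });

    // Entry point, deliberately last since it matches any input.
    ( $($input:tt)* ) => (bridge_events! {
        @variants []
        @parsing $($input)*
    });
}

// Hypothetical stand-in for the real `EntityId` type.
type EntityId = u32;

bridge_events! {
    InitialLoad,
    ChangeTodo(EntityId, String),
    Rename { id: EntityId, name: String },
}
```

Each recursive call moves one variant from the `@parsing` side into the `@variants` accumulator, so the final rule sees a fully normalized list and can emit both enums in one go.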

Yandros · December 27, 2019, 8:44pm

Also, regarding features, such as accepting attributes (e.g. docstrings) on the enum, as well as ergonomics ("decorating" an enum definition rather than using your own syntax of a sequence), you can make your macro be used like this:

bridge_events! {
    /// Some docstring on the `Event` enum
    pub
    enum Event {
        InitialLoad,
        ChangeTodo(EntityId, String),
    }
}

by having the following macro (I have dropped the support for variants with braces for the sake of simplicity):

macro_rules! bridge_events {(
    $( #[$meta:meta] )* // captures attributes and docstring
    $pub:vis // (optional) pub, pub(crate), etc.
    enum $EnumName:ident {
        $(
            $VariantName:ident $(
                ( $($T:ty),+ $(,)? )
            )?
        ),+ $(,)?
    }
) => (
    paste::item! {
        #[cfg_attr(feature = "ts_test",
            derive(EnumIter, AsRefStr),
        )]
        #[derive(FromPrimitive, Copy, Clone, Debug)]
        #[repr(u32)]
        $pub
        enum [< Bridge $EnumName >] {
            $(
                $VariantName ,
            )+
        }
    }

    $(#[$meta])*
    $pub
    enum $EnumName {
        $(
            $VariantName $(
                ( $($T ,)+ )
            )? ,
        )+
    }
)}

Here, paste::item! { /* item definition here */ } is a macro that lets you use the [< Stuff To Concatenate >] syntax inside the item definition to concatenate identifiers. See the paste crate's documentation.
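
If pulling in paste is not an option, one way to sidestep identifier concatenation is to pass the bridge enum's name explicitly. The sketch below is a variation, not the macro above: it assumes a hypothetical `bridge = Name;` prefix in the call syntax, a hypothetical `EntityId` alias, and standard-library derives only.

```rust
macro_rules! bridge_events {(
    bridge = $BridgeName:ident;     // hypothetical: explicit bridge enum name
    $( #[$meta:meta] )*             // attributes and docstrings
    $vis:vis enum $EnumName:ident { // (optional) pub, pub(crate), etc.
        $(
            $VariantName:ident $( ( $($T:ty),+ $(,)? ) )?
        ),+ $(,)?
    }
) => (
    // Assumption: std derives only, instead of the external ones.
    #[derive(Copy, Clone, Debug, PartialEq)]
    #[repr(u32)]
    $vis enum $BridgeName { $( $VariantName , )+ }

    // The captured attributes are re-applied to the full enum only.
    $( #[$meta] )*
    $vis enum $EnumName {
        $( $VariantName $( ( $($T ,)+ ) )? , )+
    }
)}

// Hypothetical stand-in for the real `EntityId` type.
type EntityId = u32;

bridge_events! {
    bridge = BridgeEvent;
    /// Some docstring on the `Event` enum
    #[derive(Debug, PartialEq)]
    pub enum Event {
        InitialLoad,
        ChangeTodo(EntityId, String),
    }
}
```

The trade-off is a slightly noisier call site in exchange for having no dependency; with paste, the `Bridge` prefix is derived from the enum name automatically.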

The syntax

bridge_events! {
    /// Some docstring on the `Event` enum
    pub
    enum Event {
        InitialLoad,
        ChangeTodo(EntityId, String),
    }
}

can be improved even further with the macro_rules_attribute crate, which would let you write:

#[macro_rules_derive(bridge_events!)]
/// Some docstring on the `Event` enum
pub
enum Event {
    InitialLoad,
    ChangeTodo(EntityId, String),
}

Yandros · December 27, 2019, 9:10pm

To merge both the nicer call-site syntax and the support for variants with braces, without making the macro become a mess, you can decide to use a proc_macro_derive macro rather than a macro_rules! macro.

This requires using a whole (helper) crate just for the definition of the #[derive(...)] macro, but lets you write:

#[derive(Bridge)]
/// Some docstring on the `Event` enum
pub
enum Event {
    InitialLoad,
    ChangeTodo(EntityId, String),
}

at the call site.

For that, you can:

Create your helper crate (named, for instance, <your_crate>-proc_macro):

cargo new --lib --name <your_crate>-proc_macro ./proc_macro/

Add this to your main crate's Cargo.toml:

[dependencies.proc_macro]
package = "<your_crate>-proc_macro"
version = "<version of your main crate>"
path = "./proc_macro/"

and this to your src/lib.rs:
```
#[macro_use] extern crate proc_macro;
```

Add the following to ./proc_macro/Cargo.toml (and set the version to match your main crate's):

[lib]
proc-macro = true

[dependencies]
# proc-macro2 = "1.0.*"  # needed to use TokenStream2 and/or Span 
quote = "1.0.*"
syn = { version = "1.0.*", features = [ <required syn features> ] }

And then, you can have your #[derive(...)] definition in ./proc_macro/src/lib.rs:


extern crate proc_macro;

use ::proc_macro::TokenStream;
use ::quote::quote;
use ::syn::{
    Data,
    DeriveInput,
    Error,
    Ident,
    parse_macro_input,
    spanned::Spanned,
};

#[proc_macro_derive(Bridge)] pub
fn bridge_events (input: TokenStream) -> TokenStream
{
    let input = parse_macro_input!(input as DeriveInput);
    let span = input.span();
    let enum_data = if let Data::Enum(it) = input.data { it } else {
        return Error::new(
            span, "Expected an `enum`",
        ).to_compile_error().into();
    };
    let enum_name = input.ident;
    let bridge_enum_name = Ident::new(
        &format!("Bridge{}", enum_name),
        span,
    );
    let variant_names =
        enum_data
            .variants
            .into_iter()
            .map(|variant| variant.ident)
    ;
    TokenStream::from(quote! {
        #[cfg_attr(feature = "ts_test",
            derive(EnumIter, AsRefStr),
        )]
        #[derive(FromPrimitive, Copy, Clone, Debug)]
        #[repr(u32)]
        pub
        enum #bridge_enum_name {
            #( #variant_names , )*
        }
    })
}

See ::syn's documentation for more info.

dakom · December 29, 2019, 3:26pm

A lot to chew over... not sure how deep I'm going to dive into properly understanding this at the moment - but will surely reference it (and hopefully revisit each time I come across that comment!). Thanks for the help!

In the meantime... from a very rough glance, it looks like the proc_macro version is both simpler to read (just normal Rust) and has nicer ergonomics for the consumer...

dakom · December 30, 2019, 8:55am

On further thought, I think what I really need is a CLI tool... so there will be one "source of truth" with regular Rust enums, and the tool will then output TypeScript and Rust code based on that. (What I need is more customization than what comes out of the box with wasm-bindgen.)

Is there a standard way to get from a blob of text (e.g. reading in enums.rs) into the tokens that syn needs to work with? Any other tips for going this route are appreciated too!

Yandros · December 30, 2019, 2:56pm

Yes, syn can work with ::proc_macro2::TokenStream, which itself implements FromStr, meaning that you can ::std::fs::read_to_string() a file and then .parse::<::proc_macro2::TokenStream>() it.

Playground

dakom · December 30, 2019, 3:31pm

Is the playground example reading itself and displaying the TokenTree?! That's got to be some award-winning meta/inception post!

system · March 29, 2020, 3:31pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
