Filter identifiers in macro

Hello !

First, the actual problem:

Summary

I want to define a macro that needs to define some items in order to work. Specifically, it looks like this:

macro_rules! my_macro {
    ($input_type:ty) => {
        mod visible_module {
            use super::*; // to be able to manipulate $input_type

            struct InnerType {  // internal to the macro
                inner: $input_type,
            }
            
            static INNER_STATIC: InnerType = /* */; // internal to the macro

            pub fn manipulate_static(/* */) -> /* */ { // function that the user will see and use
                /* */
            }
        }
    };
}

So I want users to only be able to use visible_module and manipulate_static.
But this breaks if, for example, the user inputs a type named InnerType.

So I want to give InnerType a name that isn't used in $typ. To do this I thought I might use the paste crate like so:

macro_rules! define_inner_type {
    ($typ:ty) => {
        define_inner_type! {
            @filter_idents ($typ) ($typ) ()
        }
    };
    // put `$id` in the `$filtered` list
    (@filter_idents
        ($original_type:ty) ($id:ident $($to_filter:tt)*) ($($filtered:ident)*)
    ) => {
        define_inner_type! {
            @filter_idents ($original_type) ($($to_filter)*) ($id $($filtered)*)
        }
    };
    // discard `$other`
    (@filter_idents
        ($original_type:ty) ($other:tt $($to_filter:tt)*) ($($filtered:ident)*)
    ) => {
        define_inner_type! {
            @filter_idents ($original_type) ($($to_filter)*) ($($filtered)*)
        }
    };
    (@filter_idents
        ($original_type:ty) () ($($filtered:ident)*)
    ) => {
        paste! {
            struct [<__ $($filtered)*>] { // non-conflicting name for the struct !
                inner: $original_type,
            }
        }
    };
}

However, this does not work: it always matches on $other:tt rather than $id:ident.

Now my question is: is there a way to define a macro that filters identifiers from its input, like

macro_rules! filter_idents { /* what to put here ? */ }
filter_idents! { Option<i32> } // becomes 'Option i32'

?

Or, is there another solution to my problem ?

It's usually fine for macros to just fail if they have bad input. The user will see a message like error[E0428]: the name 'InnerType' is defined multiple times, and should be able to figure out what to do.

If this is in a public-facing library crate, you might want to consider using a proc macro, so that you have better control over the errors.

If you just want to avoid the name clash, you can mangle your names a bit to make a conflict less likely. Prefix them with __, in the style of C++ standard library implementations, and you'll basically never (though this is not guaranteed) clash with an outside name.

You can avoid the infinite type problem happening, though, by using paste! to create a structure name derived from the wrapped type (such that its name is always distinct from the wrapped type).

macro_rules! my_macro {
    ($input_type:ty) => {
        mod visible_module {
            use super::*; // to be able to manipulate $input_type

            paste! {
                pub fn manipulate_static() {
                    #![allow(nonstandard_style)]

                    struct [< __my_macro__internal__ $input_type >] {
                        inner: $input_type,
                    }

                    static [< __my_macro__internal__ $input_type >]:
                        [< __my_macro__internal__ $input_type >] = todo!();

                    /* implement the function */
                }
            }
        }
    };
}

With "full macro hygiene," (i.e. "macros 2.0"), this wouldn't be necessary, but who knows if or when that's coming.

I agree that name clash is not that big an issue, but I would like to avoid it if possible... (and finding a solution for this might be useful in the future anyways :slight_smile:)
In particular, I am beginning to write proc macros that themselves generate macro calls, and conflicts in this case may be much more confusing, and maybe unavoidable, for example if two macro authors chose the same internal name... (although I have never encountered any such conflict, so I guess it is fine :sweat_smile:)

All in all this would seem... 'cleaner' to me if I could make name conflict entirely impossible :sweat_smile:

This was my idea, except that this does not work if $input_type is more than a single identifier (like Option<i32>).
So I was trying to extract the identifiers that make up the type, but I couldn't find a way to do it...

Why the nested macro call fails

This is due to "invisible grouping", i.e. wrapping in "invisible parentheses": this happens to every macro capture (metavariable) when it is transcribed, except for :tt, :ident, and :lifetime. So, in your case, once you capture, say, Option<i32> as a :ty, every time you emit that metavariable, you are actually emitting ( Option<i32> ), with invisible and type-tagged parentheses. Such a ( ... ) group thus won't match the $id:ident case you had, but the $other:tt fallback.

Quoting The Reference:

For the general case, you can use:

to work around that restriction. But using a proc-macro helper can be deemed a bit too heavy-weight, and is indeed not necessary in your case.
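To make the invisible grouping described above concrete, here is a minimal sketch. The macro names classify, forward_as_ty and forward_as_tt are hypothetical and exist only for this illustration; they are not part of the original macro.

```rust
// Reports which arm a single token tree matches.
macro_rules! classify {
    ($id:ident) => { "ident" };
    ($other:tt) => { "tt" };
}

// Captures the input as `:ty`, then forwards it: the capture is now an
// opaque, invisibly grouped fragment.
macro_rules! forward_as_ty {
    ($t:ty) => { classify!($t) };
}

// Captures the input as `:tt`, then forwards it: the raw token survives.
macro_rules! forward_as_tt {
    ($t:tt) => { classify!($t) };
}

fn main() {
    println!("direct:        {}", classify!(i32));
    println!("forwarded tt:  {}", forward_as_tt!(i32));
    println!("forwarded ty:  {}", forward_as_ty!(i32));
}
```

The direct call and the :tt-forwarded call both take the ident arm, while the :ty-forwarded call falls through to the tt arm, even though the input is the single identifier i32 in all three cases.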


Some workarounds to palliate your issue

Rather than capturing a :ty, you can (sometimes) capture a $($_:tt)* repetition:

macro_rules! define_inner_type {
    // put `$id` in the `$filtered` list
    (@filter_idents
        $original_type:tt
        ( $id:ident $($to_filter:tt)* )
        ( $($filtered:ident)* )
    ) => (
        define_inner_type! {
            @filter_idents
            $original_type
            ( $($to_filter)* )
            ( $id $($filtered)* )
        }
    );

    // discard `$other`
    (
        @filter_idents
        $original_type:tt
        ( $other:tt $($to_filter:tt)* )
        $filtered:tt
    ) => (
        define_inner_type! {
            @filter_idents
            $original_type
            ( $($to_filter)* )
            $filtered
        }
    );

    (@filter_idents
        ( $original_type:ty )
        ( /* Nothing left */ )
        ( $($filtered:ident)* )
    ) => (
        ::paste::item! {
            struct [<__ $($filtered)*>] { // non-conflicting name for the struct !
                inner: $original_type,
            }
        }
    );

    (
        $($typ:tt)*
    ) => (
        define_inner_type! {
            @filter_idents
            ( $($typ)* )
            ( $($typ)* )
            ()
        }
    );
} use define_inner_type;
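For experimenting without the paste crate, the same token-munching idea can be sketched as a standalone filter_idents! that only stringifies the identifiers it keeps. This is an illustrative variant rather than the macro above: the @acc marker is a hypothetical internal rule name, and identifiers are appended in source order instead of being prepended.

```rust
macro_rules! filter_idents {
    // Nothing left to filter: emit the kept identifiers as a string.
    (@acc ($($kept:ident)*)) => {
        stringify!($($kept)*)
    };
    // Next token is an identifier: keep it.
    (@acc ($($kept:ident)*) $id:ident $($rest:tt)*) => {
        filter_idents!(@acc ($($kept)* $id) $($rest)*)
    };
    // Next token tree is anything else (`<`, `>`, `::`, a lifetime,
    // or a whole delimited group): discard it.
    (@acc ($($kept:ident)*) $other:tt $($rest:tt)*) => {
        filter_idents!(@acc ($($kept)*) $($rest)*)
    };
    // Entry point: take raw token trees, never a `:ty` capture.
    ($($input:tt)*) => {
        filter_idents!(@acc () $($input)*)
    };
}

fn main() {
    println!("{}", filter_idents!(Option<i32>));
    println!("{}", filter_idents!(std::vec::Vec<Option<u8>>));
}
```

Note that a delimited group such as the tuple type (i32, u8) is a single token tree, so this sketch drops it together with the identifiers inside it; handling those would need extra arms that recurse into groups.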

You could also (ab)use the fact that the caller cannot have a type named visible_module either, since that would already break your macro. Thus, you could use visible_module instead of InnerType as the name of the struct.
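A minimal sketch of that idea follows. The body of manipulate_static (a Mutex-guarded slot that returns the previously stored value) is an assumption made purely so the example runs; the original macro leaves it unspecified.

```rust
macro_rules! my_macro {
    ($input_type:ty) => {
        mod visible_module {
            #[allow(unused_imports)]
            use super::*; // to be able to name $input_type

            // The struct reuses the module's name. A caller type named
            // `visible_module` would already clash with the module itself,
            // so this adds no new way for the macro to break.
            #[allow(non_camel_case_types)]
            struct visible_module {
                inner: $input_type,
            }

            // Statics live in the value namespace, so this name cannot
            // collide with a type passed as $input_type.
            static INNER_STATIC: ::std::sync::Mutex<::std::option::Option<visible_module>> =
                ::std::sync::Mutex::new(::std::option::Option::None);

            /// Stores `value` and returns the previously stored one, if any.
            /// (Hypothetical behaviour, chosen only for this illustration.)
            pub fn manipulate_static(value: $input_type) -> ::std::option::Option<$input_type> {
                let mut slot = INNER_STATIC.lock().unwrap();
                let previous = slot.take().map(|old| old.inner);
                *slot = ::std::option::Option::Some(visible_module { inner: value });
                previous
            }
        }
    };
}

// A multi-token type works, since no identifier is derived from it.
my_macro!(Option<i32>);

fn main() {
    let first = visible_module::manipulate_static(Some(1));
    let second = visible_module::manipulate_static(None);
    println!("{:?} then {:?}", first, second);
}
```

Inside the module, the locally defined struct shadows the glob-imported module of the same name, so visible_module there refers to the struct, while callers outside still see the module.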


Thank you for the explanation! Using visible_module again seems like a good solution here.
