Proc macros full path best practices

Procedural macros or code generation build scripts typically should produce context-independent outputs. In practice, this means that, as a rule of thumb, all name paths should be explicit and unambiguous, and they shouldn't rely on the "call site" context:

quote! {
    let foo = ::std::option::Option::Some(bar);
};

This isn't an issue for small programs, but in complex code generation systems, manually managing full paths becomes challenging, at least in terms of maintenance. I've been working on such tasks for quite a long time, but I still haven't come up with a truly convenient solution.

What I usually do is maintain an independent module inside my codegen crate (proc macro or build script) where I put all external paths my program ever uses. Then I refer to this registry module throughout the crate.

// registry module:

pub(crate) struct Facade;

impl Facade {
    pub(crate) fn option() -> TokenStream { quote!(::std::option::Option) }
    pub(crate) fn deserialize() -> TokenStream { quote!(serde::de::Deserialize) }
}

// usage site:

fn codegen_smthng() -> TokenStream {
    let face_option = Facade::option();
    let face_de = Facade::deserialize();

    quote!( let x = #face_option::Some(<y as #face_de>::deserialize()) )
}

This approach partially solves the maintenance problem: every time I need to change or refactor an external dependency, I check the registry module first. Plus, it makes my quote! expressions more readable and less prone to typos. However, this approach isn't very convenient in other aspects. Namely, I have to access this "Facade" registry explicitly from almost every codegen function.
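For the build-script case, where output is often assembled as plain source text rather than through quote!, the same registry idea can be sketched with string constants. This is only an illustration using std: the module name `paths` and the function `codegen_let` are hypothetical and not part of any crate mentioned here.

```rust
/// Single place that lists every external path the generator emits.
/// (Hypothetical registry; the names are for illustration only.)
mod paths {
    pub const OPTION: &str = "::std::option::Option";
}

/// Emits one `let` statement as source text, the way a build script might.
/// The external path comes from the registry, never from a literal here.
pub fn codegen_let(name: &str, value_expr: &str) -> String {
    format!("let {} = {}::Some({});", name, paths::OPTION, value_expr)
}
```

Calling `codegen_let("foo", "bar")` yields the same statement as the first quote! example in this post. Swapping the path later means touching only the constant in `paths`.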

Another alternative is to use isolated scopes with imports (quote!({ use ::std::option::Option; })), but this approach also has drawbacks, especially when the codegen system consists of dozens of small, mostly independent procedures.
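A minimal sketch of what the isolated-scope variant could look like after expansion, written out by hand. The module `call_site` and the colliding `struct Option` are hypothetical and only simulate a hostile call site.

```rust
mod call_site {
    // A user-defined type that collides with a prelude name.
    #[allow(dead_code)]
    pub struct Option;

    pub fn expanded() -> i32 {
        // Hand-written stand-in for macro output: the block brings its own
        // imports, so short names inside it no longer depend on the call site.
        let value = {
            use ::std::option::Option::{self, Some};
            let x: Option<i32> = Some(41);
            x.map(|n| n + 1)
        };
        value.unwrap_or(0)
    }
}
```

Inside the block, `Option` refers to the std enum even though the surrounding module defines its own `Option`, because the block-level import shadows the module-level item.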

I'd like to ask the community: how do you usually deal with this problem?

Thanks in advance,
Ilya

I always use the fully qualified path approach where applicable and never had any issues with it.
For well-known stuff from the standard library's prelude, such as Option or Some, I also use those directly at times.
Also note that in your example, the serde path is missing the leading :: that would make it fully qualified. If the user's library has a serde submodule, this may result in surprising behavior or compilation errors.
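To illustrate the shadowing issue without pulling in serde, here is a small sketch where a local module named `std` plays the role of the user's `serde` submodule. The module `user_code` and its functions are hypothetical.

```rust
mod user_code {
    // A local module that happens to share its name with an external crate.
    pub mod std {
        pub mod cmp {
            // Deliberately returns a marker value so the two calls differ.
            pub fn max(_a: i32, _b: i32) -> i32 {
                -1
            }
        }
    }

    // Without the leading `::`, the path resolves to the local module.
    pub fn relative() -> i32 {
        std::cmp::max(1, 2)
    }

    // With the leading `::`, the path always starts at the external crate.
    pub fn absolute() -> i32 {
        ::std::cmp::max(1, 2)
    }
}
```

Here `user_code::relative()` silently calls the local function, while `user_code::absolute()` reaches the real `std::cmp::max`. Generated code that omits the leading `::` is exposed to the same kind of surprise.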

It's a viable but rather verbose approach. Quoted code tends to bloat into long lines, which hurts readability for both the programmer and code formatters. More importantly, you lose a clear picture of what you actually use and where. For example, your solution relies on std's HashMap everywhere, but one day you might decide to replace it with AHashMap (from the ahash crate). It's easy to miss refactoring points, especially in less frequently executed codegen paths. That said, for small to moderately sized codegen crates this isn't really an issue: explicit fully qualified paths work just fine. The problem arises when the codebase grows.

A proc macro library generally should not generate paths like ::serde::de::Deserialize because that requires the user of the macro to have declared a serde dependency — unless the macros are literally only for that purpose.

Instead, the proc-macro should generate paths into its own non-macro library which re-exports all needed items. This can then, if you choose, also include re-exporting std (or core) items.

If you follow this pattern, then your code generation can be simpler because there is only one thing to declare up front:

...
fn codegen_smthng() -> TokenStream {
    // define this *single* value in some common place like your Facade
    let lib = quote! { ::my_library::_macro_helpers };

    quote! {  let x = #lib::Some(<y as #lib::Deserialize>::deserialize()); }
}
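A rough sketch of the library side of this pattern, under the assumption that `my_library` would be a separate crate in practice. Here it is modeled as a local module so the example compiles on its own, and `expanded_example` is a hypothetical hand-written stand-in for macro output.

```rust
// Stand-in for the non-macro companion crate.
pub mod my_library {
    // Hidden from rustdoc, but public so that generated code can reach it.
    #[doc(hidden)]
    pub mod _macro_helpers {
        pub use ::std::option::Option::{self, None, Some};
    }
}

// What generated code might look like after expansion. Real output would
// start the path with `::my_library`; the relative form is used here only
// because everything lives in one crate.
pub fn expanded_example(input: i32) -> i32 {
    let x = my_library::_macro_helpers::Some(input);
    match x {
        my_library::_macro_helpers::Some(v) => v + 1,
        my_library::_macro_helpers::None => 0,
    }
}
```

With this layout the macro user only needs a dependency on `my_library`, and any third-party items can be added to the helper module in the same way.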

Moving all of the path routing into the associated crate is a good idea in many respects. One advantage it gives the author is the ability to specialize the generated code when the crate itself is compiled. For a proc macro's third-party dependencies this is unavoidable anyway, so re-exporting the Std/Core paths through a hidden crate module makes sense too.

Some drawbacks (which sometimes could even be considered features) of this approach are:

  1. It creates an extra layer of indirection at compile time, which does not make the compiler's life easier. This factor is almost negligible in many cases, but I'm not entirely sure about large code-generation outputs. For example, for build scripts we probably don't even need a hidden re-exporting module in the first place.

  2. A more important factor is that some codegen parts could intentionally be public. One example is a function-like proc macro that generates a public function with a signature. With this approach, the end user would see a path to the hidden (#[doc(hidden)]) crate module in the function's public interface, e.g. pub fn foo(x: my_crate::__hidden::Option<f32>). The same is true for build-script outputs. [1]
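The second point can be sketched as follows, reusing the `my_crate::__hidden` name from the example above. The module is modeled locally so the snippet stands alone, and the body of `foo` is made up for illustration.

```rust
// Stand-in for the companion crate and its hidden re-export module.
pub mod my_crate {
    #[doc(hidden)]
    pub mod __hidden {
        pub use ::std::option::Option;
    }
}

// A generated public function: its signature spells out the hidden path,
// which is what the end user would see in the public interface.
pub fn foo(x: my_crate::__hidden::Option<f32>) -> f32 {
    x.unwrap_or(0.0)
}
```

The re-export is the same type as std's Option, so callers can still pass `Some(1.5)` or `None` as usual. Only the spelling in the signature looks unusual.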

To sum up, I think using the crate's re-export module is perfectly fine and aligns with common practice across many Rust crates. It just comes with certain limitations in specific scenarios.


[1] In both cases, my_crate::__hidden does not necessarily need to be hidden. It could intentionally be a public module of the crate, with well-defined documentation explaining why it exists and what it does. Still, the end user might be confused about why the public API refers to a non-conventional wrapper for Std/Core items.
