There is no clean way to handle this.
- Using static memory (or worse, static mut! Don't use that: use an AtomicUsize or something like that, but never use static mut) may break at any point, since you can't know how and when your proc-macro is called: it could have its global memory reset in between invocations, for instance.
- Accessing the fs may be a tiny bit more robust than static memory, but:
  - Sandboxing: some compilation environments could end up sandboxing proc-macros. That would break crates such as sqlx, so I no longer expect sandboxing to be imposed globally, but rather offered as an opt-in for dependents. Still, your crate would be unusable for those users;
  - Caching: the very invocation of a proc-macro may be cached, so working off the fs could yield unexpected results if, for instance, some proc-macros are called multiple times and others are not (on top of concurrency issues). That being said, in nightly, and thus hopefully in future stable Rust, there is / will be the ::proc_macro::tracked_path::path API (yes, it's a mouthful, I hope they change it to tracked::path or something; we don't see ptr::ptr_addr_of! or ptr::NonNullPtr but ptr::addr_of! and ptr::NonNull, for instance). With it, you'll be able to register your interest in a file's contents, for hopefully a more cache-friendly behavior. That being said, while that works for an externally-loaded file (like sqlx does), I'm still skeptical of it working properly for a "cached state" approach.
- Env vars. Mutating those is currently observable in between compiler invocations. While this is an interesting theoretical quirk of the current implementation, it is horrible, and I hope nobody relies on it.
- Loosen the "value pre-fetched during Rust's compile-time" to "value pre-fetched during linking time or during life-before-main". For this, crates such as inventory or linkme can be quite useful. The caveat is their reduced portability (e.g., inventory does not work off the shelf / well on Wasm).
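On the static mut warning in the list above, here is a minimal sketch of what the AtomicUsize alternative could look like. The names NEXT_ID and next_id are hypothetical, chosen only for illustration. Note that this only removes the unsoundness of static mut: inside a proc-macro the counter would still be subject to the problem described above, i.e., it may be reset between invocations.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Process-wide counter. Unlike `static mut`, it can be touched from any
// thread without `unsafe`.
// (`NEXT_ID` and `next_id` are hypothetical names for this sketch.)
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

/// Hands out a fresh id on every call made within the same process.
fn next_id() -> usize {
    // `fetch_add` atomically increments and returns the previous value.
    NEXT_ID.fetch_add(1, Ordering::Relaxed)
}
```

Nothing here guarantees that two macro expansions share the same process, which is exactly why the ids may not be unique or stable across a whole compilation.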
The more robust solutions
The key idea is: handle all your "annotated items" within a single sweep. Thanks to that, there is no need for global state: a local state within that sweep serves us just fine.
1. A single proc-macro invocation sees all the annotated items.
This is the approach taken by cxx, for instance. Choose an inline module, an impl block, or something along those lines, and expect the macro to be called on it (or, similarly, just take a function-like macro to define your own scope); then have all the potentially annotated items (e.g., your type definitions) be syntactically / lexically present inside that scope:
    my_preprocessor! {
        #[derive(Foo)] // <- Foo is not a real macro, it's a syntactical marker for `my_preprocessor` to handle `Bar`.
        struct Bar ...

        struct Ignored...

        #[derive(Clone, Foo, Debug)] // <- another one detected!
        struct Quux
    }

    // or

    #[my_preprocessor]
    mod ... {
        #[derive(Foo)]
        struct Bar ...
        ...
    }
This works surprisingly well, but does come with the caveat of requiring that all the annotated stuff be located within a single file.
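To make the "local state within a single sweep" idea concrete, here is a toy model with no macro machinery at all. Item and sweep are hypothetical names, and Item is a simplified stand-in for a parsed item; a real macro would work on the token stream it receives (probably through a parsing crate such as syn) rather than on a hand-built list. The point is only that the counter lives in a local variable, since the single invocation sees everything.

```rust
/// Simplified, hypothetical stand-in for an item found inside the macro's scope.
struct Item {
    name: &'static str,
    derives: &'static [&'static str],
}

/// One sweep over every item in the scope: each item carrying the `Foo`
/// marker gets the next id. The counter is a plain local variable.
fn sweep(items: &[Item]) -> Vec<(usize, &'static str)> {
    let mut next_id = 0; // local state, no `static` needed
    let mut registry = Vec::new();
    for item in items {
        if item.derives.contains(&"Foo") {
            registry.push((next_id, item.name));
            next_id += 1;
        }
    }
    registry
}
```

Because the ids depend only on the order of items inside the scope, the result is the same no matter how many times, or in which process, the macro gets expanded.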
If you want to support definitions scattered across multiple files, then you'll need to use:
2. A build.rs script which scans your code
With a build.rs script, you are in control of a single sweep of your codebase, should you implement one. The caveats are having to implement it and, mainly, not being able to handle macro-generated modules, cfg-gated modules, or other advanced shenanigans. So it's not super versatile either, but it does support definitions scattered across multiple files.
With it, the build.rs script can generate the necessary stuff in helper files, to be emitted in the OUT_DIR. If such files alone are not able to do all the work, they can, at the very least, provide the results of the global state. A proc-macro could then just expect those files to be present, and emit an include! of them.
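A rough sketch of such a script follows, using only the standard library. Everything specific in it is an assumption made for illustration: the Foo marker, the naive line-based scan, the generated file name foo_types.rs, and the FOO_TYPES constant. The scan is deliberately simplistic and shows exactly the brittleness mentioned above (it sees neither macro-generated nor cfg-gated items).

```rust
use std::{env, fs, io, path::Path};

/// Extracts `Name` from a line starting with `struct Name` or `pub struct Name`.
fn struct_name(line: &str) -> Option<String> {
    let rest = line
        .strip_prefix("pub struct ")
        .or_else(|| line.strip_prefix("struct "))?;
    let name: String = rest
        .chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_')
        .collect();
    if name.is_empty() { None } else { Some(name) }
}

/// Naive scan: names of structs preceded by a `#[derive(...)]` line listing `Foo`.
fn find_marked_structs(source: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut marked = false;
    for line in source.lines() {
        let line = line.trim();
        if let Some(args) = line.strip_prefix("#[derive(") {
            // Keep only the argument list, dropping `)]` and trailing comments.
            let args = args.split(')').next().unwrap_or("");
            marked |= args.split(',').any(|arg| arg.trim() == "Foo");
        } else if line.starts_with("#[") || line.starts_with("//") || line.is_empty() {
            // Other attributes, comments, blank lines: keep any pending marker.
        } else {
            if marked {
                names.extend(struct_name(line));
            }
            marked = false;
        }
    }
    names
}

/// Renders the helper file contents (hypothetical `FOO_TYPES` constant).
fn render(names: &[String]) -> String {
    format!("pub const FOO_TYPES: &[&str] = &{:?};\n", names)
}

/// Recursively scans every `.rs` file under `dir`.
fn scan_dir(dir: &Path, names: &mut Vec<String>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            scan_dir(&path, names)?;
        } else if path.extension().map_or(false, |ext| ext == "rs") {
            names.extend(find_marked_structs(&fs::read_to_string(&path)?));
        }
    }
    Ok(())
}

/// Entry point of the sweep; call it from `fn main()` in build.rs.
fn generate() -> io::Result<()> {
    let mut names = Vec::new();
    scan_dir(Path::new("src"), &mut names)?;
    names.sort(); // directory iteration order is not guaranteed
    let out_dir = env::var("OUT_DIR").expect("Cargo sets OUT_DIR for build scripts");
    fs::write(Path::new(&out_dir).join("foo_types.rs"), render(&names))?;
    println!("cargo:rerun-if-changed=src");
    Ok(())
}
```

On the consuming side, the crate (or the code emitted by the proc-macro) could then pull the file in with include!(concat!(env!("OUT_DIR"), "/foo_types.rs")).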
These solutions are still a bit brittle if the user code does stuff too confusing for the basic code scanning logic, but with some cooperation from the caller / user of your framework, that's something that can be dealt with / managed. Compare that to proc-macros which may have, at some random upgrade of the compiler, inconsistent global state or inconsistent access to the fs, and you end up with a source of problems that neither you nor the caller can do much about.