In my proc macro crate I recently added a feature that can potentially generate a lot of code, and I added a test for it. On my system with 32G of RAM this test currently uses more than 30G of memory in `cargo check` and in `cargo test --release`, so I'm unable to run it.

To see what the issue is, I created another branch where I made these changes:
- The proc macro function with type `fn(TokenStream) -> TokenStream` now becomes `fn(MyInputType) -> TokenStream`, where `MyInputType` is the type that I parse the input `TokenStream` into (link to code). I also remove the `proc_macro` attribute and export the function.
- I add an executable that takes a file path as an argument, parses that file with `syn`, then calls the function I updated in the previous step. The difference in input parsing method should not matter, as the input is tiny.
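The driver in the second step looks roughly like this (a sketch, not the actual code: the real version parses the input with `syn` and returns a `proc_macro2::TokenStream`; here code generation is stubbed out with `String` so the sketch is self-contained):

```rust
use std::time::Instant;

// Stand-in for the real `fn(MyInputType) -> TokenStream` library function;
// the actual version generates lexer code from the parsed input.
fn generate_code(input: &str) -> String {
    input.lines().map(|l| format!("// {}\n", l)).collect()
}

fn main() {
    let path = std::env::args().nth(1).expect("usage: driver <file>");
    let input = std::fs::read_to_string(&path).expect("cannot read input file");

    let start = Instant::now();
    let out = generate_code(&input);
    eprintln!("generated {} bytes in {:?}", out.len(), start.elapsed());
}
```

This runs only the code *generation*, not the compilation of the generated code, which is exactly the split being tested here.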
So the only difference between the proc_macro version and the version described above is that I turn the proc macro function into a normal library function that returns a `TokenStream`. This version is then called with the same input that makes the proc macro build use more than 30G of RAM.
Interestingly, the benchmark version runs in 3-4 seconds and uses very little RAM, even though it's run with the same input.
I think this means that all that RAM and time goes into generating the AST or IR in the compiler from the `TokenStream` returned by the proc macro.
Now my questions are:
- Does my hypothesis make sense?
- How do I profile/debug this?
Full repro instructions:
- Clone the repo: `git clone https://github.com/osa1/lexgen`
- Switch to the branch with the feature: `git checkout char_predicates`
- Run the tests (this will use 30G+ RAM): `cargo test`. The time and RAM used here go into compiling the proc macro usage in the test described above.
- Now switch to the profiling/debugging branch: `git checkout char_predicates_profiling`
- Run the test:
cargo run --release --bin lexgen_test bench_data_2. If you remove
--releaseit takes a lot of time but it uses very little memory.
Edit: Right after posting this I realized that this could also be caused by optimizations in the compiler, e.g. maybe LLVM is using all the RAM trying to optimize the generated code. The question of how to profile/debug this is still open.
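One way to check where the compile time and memory go (a sketch; the `-Z` flags need a nightly toolchain, and `cargo-llvm-lines` is a separate install):

```shell
# Time each compiler pass; a large "LLVM passes" share would point at
# optimization cost rather than macro expansion or AST construction.
RUSTFLAGS="-Ztime-passes" cargo +nightly build --release 2> time-passes.log

# More detailed per-query profile, analyzable with the measureme tools.
RUSTFLAGS="-Zself-profile" cargo +nightly build --release

# Count lines of LLVM IR per monomorphized function to spot code-size blowup.
cargo install cargo-llvm-lines
cargo llvm-lines --release | head -n 20
```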
Edit: This is the generated code: gist:7d981da22dbfb4ae4b0c21c1f49d58cb · GitHub. Given that `cargo expand` doesn't use much memory when generating it, I'm guessing this is an issue with optimizations.
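To isolate the optimizer, one could also compile the expanded code directly (a sketch; `<test-name>` is a placeholder, and this assumes the expanded code compiles on its own, outside the crate):

```shell
# Expand the macro without compiling the result.
cargo expand --test <test-name> > expanded.rs

# Compile the expanded source with and without optimizations and compare;
# GNU time's -v flag reports "Maximum resident set size" (peak memory).
/usr/bin/time -v rustc --edition 2018 expanded.rs
/usr/bin/time -v rustc --edition 2018 -O expanded.rs
```

If only the `-O` run blows up, the cost is in LLVM optimizing the generated code rather than in parsing or lowering it.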