How to get the start/end line numbers of functions (with an inner attribute) in a non-inline module from an attribute proc macro

Hi everyone!

I am a newbie to Rust programming. As an exercise, I wanted to write a small lint-like program that checks the length of functions and emits a warning if a function is longer than the default maximum length.

So I wrote an attribute proc macro and used it as an inner attribute (I just did not want to write the attribute on every single function).

Here is the Cargo.toml file for my attribute proc macro lib named lint_fn_len:

[package]
name = "lint_fn_len"
version = "0.0.0"
authors = ["David"]
description = "exercise for attribute proc macro"
edition = "2018"

[lib]
proc-macro = true

[dependencies]
quote = "1"
proc-macro2 = "1.0"
syn = { version = "1.0.72", features = ["full"] }

Here is the lib.rs file for lint_fn_len:

#![feature(proc_macro_span)]

use proc_macro::TokenStream;
use quote::ToTokens;
use syn::{parse_macro_input, ItemFn, ItemMod, spanned::Spanned};

#[proc_macro_attribute]
pub fn my_attribute(_args: TokenStream, input: TokenStream) -> TokenStream {
    let input1 = input.clone();
    let mod1 = parse_macro_input!(input1 as ItemMod);

    for item1 in mod1.content.unwrap().1 {
        if let Ok(func1) = syn::parse::<ItemFn>(item1.to_token_stream().into()) {
            let sp1 = func1.block.span().unwrap();

            println!("end line number: {} , start line number: {}", sp1.end().line, sp1.start().line);
            let numbers = sp1.end().line - sp1.start().line;

            if numbers > 5 {
                println!(
                    "Function {} is too long, count of line number is {} > 5 which is default max",
                    func1.sig.ident.to_string(),
                    numbers
                );
            };
        }
    }
    input
}

Here is the Cargo.toml file for the long_fn project, which uses the attribute proc macro:

[package]
name = "long_fn"
version = "0.1.0"
edition = "2018"

[dependencies]
lint_fn_len = {path = "../lint_fn_len"}

And here is the main.rs for long_fn:

#![feature(custom_inner_attributes)]
#![feature(proc_macro_hygiene)]

use lint_fn_len::my_attribute;

mod mymod;

fn main() {
    mymod::kk();
}

And here is the module (mymod.rs) that actually uses the attribute proc macro in the long_fn project:

#![my_attribute]
pub fn kk() {
    println!("kk 1");

    println!("kk 2");
}

The compiler output is as follows:

cargo.exe build --color=always --message-format=json-diagnostic-rendered-ansi --package long_fn --bin long_fn
   Compiling proc-macro2 v1.0.26
   Compiling unicode-xid v0.2.2
   Compiling syn v1.0.72
   Compiling quote v1.0.9
   Compiling lint_fn_len v0.0.0 (C:\Users\libin\IdeaProjects\lint_fn_len)
   Compiling long_fn v0.1.0 (C:\Users\libin\IdeaProjects\long_fn)
end line number: 6 , start line number: 6
    Finished dev [unoptimized + debuginfo] target(s) in 42.01s
Process finished with exit code 0

Surprisingly, the start/end line numbers (6/6) were clearly wrong for the mymod.rs file. They were the line number of the line "mod mymod;" in main.rs, which was not what I wanted. I wanted the start/end line numbers of the function in mymod.rs.

As another try, when I used an inline module with the same code instead (without the extra file mymod.rs, of course), the resulting start/end line numbers were correct: they were the start/end line numbers of the function in main.rs.

Why does it not work with an external (non-inline) module? Did I make a mistake somewhere?

How can I get the start/end line numbers of functions in the external module in this case?

Thanks in advance for any help!

Is it an unimplemented feature in Rust?

Indeed, there are two "not really implemented" features of proc-macro attributes:

  • attributes on non-inline modules;

  • inner attributes.

As you can see, your use case hits both :sweat_smile:.

Since you are willing to use nightly / unstable Rust, I'd suggest that you grab the actual source code of the non-inline module yourself [1] to work around these limitations. (Proc macros that perform file-system access are not great, but in your case it is kind of justified, since you are polyfilling what the compiler would end up doing anyway.)

[1] By that I mean that you work off #[attrs] mod modname; and emulate Rust's logic yourself (as much as possible; there will be cases you won't be able to handle, such as #[cfg_attr(…, path = "overridden.rs")]):

  1. You start from the span of this very input, to find out which file you were called from: path/to/origin_file.rs;

  2. You parse the #[attrs] to see if you spot any #[path = "path/to/overridden.rs"].

    • If there is any such attribute, you know the path of the resulting file; this will very often be a relative path, which you must resolve against path/to/origin_file.rs.

    • If there are no such attributes, you must check whether it would work with #[path = "modname.rs"] or #[path = "modname/mod.rs"]; and then you are back to the previous point.

Once you have the path to the module's file, you just have to read it and call ::syn::parse_file() on its contents to obtain the true input of the non-inline module, and work off that.
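
The lookup steps above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the compiler's actual algorithm: candidate_module_files is a hypothetical helper name, the mod declaration is assumed not to sit inside an inline mod { ... } block, and only files named main.rs, lib.rs or mod.rs are treated as files that look for submodules next to themselves.

```rust
use std::path::{Path, PathBuf};

/// Candidate files for `mod modname;` declared in `origin_file`.
/// `path_attr` is the value of a `#[path = "..."]` attribute, if one was found.
/// Hypothetical helper; assumes the declaration is not nested in an inline module.
fn candidate_module_files(
    origin_file: &Path,
    modname: &str,
    path_attr: Option<&str>,
) -> Vec<PathBuf> {
    // Directory containing the file that holds the `mod` declaration.
    let origin_dir = origin_file.parent().unwrap_or_else(|| Path::new(""));

    // An explicit #[path] is resolved relative to that directory.
    if let Some(overridden) = path_attr {
        return vec![origin_dir.join(overridden)];
    }

    // Assumption: main.rs, lib.rs and mod.rs look for submodules next to
    // themselves; any other file `foo.rs` looks inside a sibling directory `foo/`.
    let stem = origin_file.file_stem().and_then(|s| s.to_str()).unwrap_or("");
    let base = match stem {
        "main" | "lib" | "mod" => origin_dir.to_path_buf(),
        other => origin_dir.join(other),
    };

    // The two conventional locations, in the order they would be tried.
    vec![
        base.join(format!("{}.rs", modname)),
        base.join(modname).join("mod.rs"),
    ]
}
```

For the project in this thread, candidate_module_files(Path::new("src/main.rs"), "mymod", None) yields src/mymod.rs and src/mymod/mod.rs; the macro would then use whichever of the two exists on disk.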

In order to solve some hygiene issues, you may be required to override the mod modname; statement with mod modname { <actual contents you already identified> }. In doing so, you can even then recurse with calls to yourself on submodules.


Speaking of that, about the syn::parse::<ItemFn>(item1.to_token_stream().into()) call in your loop:

Note that you are defeating the very point of an AST when you do this: you are only supposed to parse a token stream once, when the macro is called; that's the "beauty" of Abstract Syntax Trees. Thus, when working with ::syn, be ready to spend a significant amount of time on its documentation, especially looking at the big enums (green colored) that appear in some of the fields of an AST entry. Indeed, such an enum will cover all the possible variants for the sub-entries :slightly_smiling_face:.

More concretely, once you have access to a high-level AST entry, such as ItemMod, you can simply pattern match on its entries:

for item1 in mod1.content.unwrap().1 {
    match item1 {
        Item::Fn(func1) => {
            let sp1 = …;
            …
        },
        Item::Mod(submod) => { /* add a call to yourself there */ },
        _ => { /* Else nothing to do */ },
    }
}
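
To make the recursion concrete without depending on syn, here is a minimal sketch of the same shape. The Item enum below is a stand-in invented for illustration (it is not syn's Item, ItemFn or ItemMod), and too_long is a hypothetical helper; the point is only that nested modules are handled by recursing on already-parsed children rather than by re-parsing tokens.

```rust
// Stand-in types for illustration only; these are NOT syn's types.
enum Item {
    Fn { name: String, start_line: usize, end_line: usize },
    Mod { items: Vec<Item> },
    Other,
}

/// Collects the names of functions whose line span (end - start) exceeds
/// `max_lines`, descending into nested modules.
fn too_long(items: &[Item], max_lines: usize, out: &mut Vec<String>) {
    for item in items {
        match item {
            Item::Fn { name, start_line, end_line } => {
                // saturating_sub avoids a panic if the numbers are inconsistent.
                if end_line.saturating_sub(*start_line) > max_lines {
                    out.push(name.clone());
                }
            }
            // Recurse on the children that are already parsed.
            Item::Mod { items } => too_long(items, max_lines, out),
            // Anything else is ignored.
            Item::Other => {}
        }
    }
}
```

With real syn types the arms would be Item::Fn(func1) and Item::Mod(submod) as in the snippet above, and the line numbers would come from the spans.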

In that regard, you surely haven't chosen the easiest exercise out there :sweat_smile:, quite the opposite! Maybe, for starters, limit yourself to an attribute that you manually slap on each function; that way the logic you already wrote will Just Work™ :slightly_smiling_face:


You are so kind and so helpful! :slightly_smiling_face:

I will try my best to fully understand what you have given me.
It may take some time, though!

FYI:
I had tried some normal attribute macro exercises and got them right (at least that is what I thought at the time) before proceeding to this one. That is why I started trying inner attributes and so on: to learn more and to feel more comfortable with attribute macro programming.

Thank you so much for your advice!


Hi,
Over the last two days, I finally had time to correct the exercise mentioned above. Following the advice Yandros provided, there were mainly two things to work on:

  1. Using a recursive call to handle nested modules neatly;
  2. Using syn::parse_file() to handle the non-inline module.

This time I used an outer attribute instead, and as it is just an exercise, I did not spend time making the program robust or more general.

For point 1, I changed the code (shown below), and I began to enjoy the beauty of the AST. It all worked nicely. Thank you, Yandros!

For point 2, everything worked well except the line numbers of functions in the non-inline module.

Here is the modified code for the lib.rs file of lint_fn_len project:

#![feature(proc_macro_span)]

use proc_macro::TokenStream;
use quote::ToTokens;
use syn::{parse_macro_input, ItemMod, Item};
use std::fs::File;
use std::io::Read;
use proc_macro::Span;
use quote::spanned::Spanned;

#[proc_macro_attribute]
pub fn my_attribute(args: TokenStream, input: TokenStream) -> TokenStream {
    let input1 = input.clone();
    let mod1 = parse_macro_input!(input1 as ItemMod);

    let input_content;
    let file_parsed;
    let items;

    if Option::is_none(&mod1.content) {
        let mod_filename = format!(
            "{}\\{}.rs",
            Span::call_site().source_file().path().parent().unwrap().to_str().unwrap(),
            mod1.ident.to_string()
        );
        let mut file = File::open(&mod_filename).unwrap();
        let mut file_content = String::new();
        file.read_to_string(&mut file_content).unwrap();
        file_parsed = syn::parse_file(&file_content).unwrap();
        //println!("{:#?}", file_parsed);

        items = &file_parsed.items;
    } else {
        input_content = mod1.content.unwrap();
        items = &input_content.1;
    }

    for item1 in items {
        match item1 {
            Item::Fn(func1) => {
                let sp1 = func1.__span();
                println!("End line number: {} , start line number: {}", sp1.end().line, sp1.start().line);
                let numbers = sp1.end().line - sp1.start().line;
                if numbers > 5 {
                    println!(
                        "Function {} is too long, count of line number is {} > 5 which is default max",
                        func1.sig.ident.to_string(),
                        numbers
                    );
                };
            },
            Item::Mod(mod2) => {
                my_attribute(args.clone(), mod2.into_token_stream().into());
            },
            _ => {},
        }
    }
    input
}

And here is the compiler output for long_fn project:

cargo.exe build --color=always --message-format=json-diagnostic-rendered-ansi --package long_fn --bin long_fn
   Compiling lint_fn_len v0.0.1 (C:\Users\libin\IdeaProjects\lint_fn_len)
   Compiling long_fn v0.1.0 (C:\Users\libin\IdeaProjects\long_fn)
End line number: 5 , start line number: 5
    Finished dev [unoptimized + debuginfo] target(s) in 8.17s
Process finished with exit code 0

Clearly, the start and end line numbers of the function were not correct (for the mymod.rs file). In fact, the two "5"s we got were exactly the line number of the line where the attribute (my_attribute) was placed in the main.rs file of the long_fn project, as shown below:

#![feature(proc_macro_hygiene)]

use lint_fn_len::my_attribute;

#[my_attribute]
mod mymod;

fn main() {
    mymod::kk();
}

But why? The syn::parse_file() call was not related to the "src\main.rs" file at all; it was only used for the "src\mymod.rs" file, and "src\mymod.rs" did not reference anything in any other file except the println!() macro.

I also tried printing all the details of the AST (println!("{:#?}", file_parsed);), but I could not find anything helpful there.

I also checked the syn crate documentation for the Spanned trait, which is implemented for syn::ItemFn, and noticed the following:

pub trait Spanned {
    /// Returns a Span covering the complete contents of this syntax tree
    /// node, or [Span::call_site()] if this node is empty.
    ///
    /// [Span::call_site()]: proc_macro2::Span::call_site
    fn span(&self) -> Span;
}

Now I think it may be because the node is empty, even though I called syn::parse_file() separately on the module file.

Is there still something missing, or have I made a mistake somewhere?
Does anyone know why both line numbers came out as "5"?

Many thanks for any help!
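
A note on the two "5"s, offered as a likely explanation rather than a confirmed diagnosis: as far as I know, tokens that a proc macro obtains by parsing a string carry Span::call_site() instead of positions inside that string, which would match the output above. One way to sidestep spans entirely is to measure lines directly in the file text that was already read. The following is a minimal std-only sketch of that idea; fn_line_ranges is a hypothetical helper, and it is a crude text scan rather than a parser, so braces inside string literals or comments, several functions on one line, and bodiless trait methods will confuse it.

```rust
/// Returns `(name, start_line, end_line)` for each `fn` found in `source`,
/// using 1-based line numbers. Crude text scan, not a real parser.
fn fn_line_ranges(source: &str) -> Vec<(String, usize, usize)> {
    let mut found = Vec::new();
    // Functions whose body is open: (name, start line, brace depth before the body).
    let mut open: Vec<(String, usize, usize)> = Vec::new();
    // A `fn name` that has been seen but whose `{` has not been reached yet.
    let mut pending: Option<(String, usize)> = None;
    let mut depth = 0usize;

    for (idx, line) in source.lines().enumerate() {
        let line_no = idx + 1;

        // Look for the keyword `fn` followed by an identifier on this line.
        let mut words = line.split_whitespace();
        while let Some(word) = words.next() {
            if word == "fn" {
                if let Some(next) = words.next() {
                    let name: String = next
                        .chars()
                        .take_while(|c| c.is_alphanumeric() || *c == '_')
                        .collect();
                    if !name.is_empty() {
                        pending = Some((name, line_no));
                    }
                }
            }
        }

        // Track braces to find where each body opens and closes.
        for ch in line.chars() {
            match ch {
                '{' => {
                    if let Some((name, start)) = pending.take() {
                        open.push((name, start, depth));
                    }
                    depth += 1;
                }
                '}' => {
                    depth = depth.saturating_sub(1);
                    if open.last().map_or(false, |f| f.2 == depth) {
                        let (name, start, _) = open.pop().unwrap();
                        found.push((name, start, line_no));
                    }
                }
                _ => {}
            }
        }
    }
    found
}
```

On a file containing only the kk function shown earlier, this reports kk as spanning lines 1 to 5. It is only meant for an exercise like this one; anything more serious would need real position information.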