Pattern match line-by-line in macro_rules?

I am working on a command-line library for Rust: https://github.com/rust-shell-script/rust_cmd_lib . Right now it can run commands like this: run_cmd!(du -ah . | sort -hr | head -n 10); by using a pattern-matching trick in macro_rules to filter "-" and "--" from the token stream. However, I cannot use the same trick to implement something like this:

group_cmds! { // return Err(...) if any command fails
    do_A
    do_B
    do_C
}

Is there a way to pattern match line by line in macro_rules?

AFAIK, a line break between tokens in a macro is treated the same as any other whitespace. So perhaps you should group the tokens of each line somehow?
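As a small illustration of that point, here is a minimal sketch with a hypothetical count_tts! helper (not part of rust_cmd_lib). It counts the token trees it receives, and the count is the same whether the input is written on one line or on three:

```rust
// Hypothetical helper: counts the token trees passed to it.
// Line breaks never reach the macro as tokens, so they cannot be counted.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    let one_line = count_tts!(do_A do_B do_C);
    let three_lines = count_tts! {
        do_A
        do_B
        do_C
    };
    // Both invocations see the same three identifier tokens.
    assert_eq!(one_line, three_lines);
    println!("{} tokens either way", one_line);
}
```

Since the macro cannot tell the two invocations apart, any per-command grouping has to come from an explicit separator token.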

I would use semicolons to separate the different commands. That sidesteps the problem of macros treating newlines as just another whitespace character, plus it's valid bash syntax.

group_cmds! { 
    do_A;
    do_B;
    do_C;
}

Thanks for the replies. Yeah, I finally made it work, in a way similar to my previous solution. The problem is that ';' is itself just another token in the stream, so I still need to use a pattern-matching trick to make macro_rules happy.
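One possible shape for such a trick (a sketch only, not necessarily what rust_cmd_lib does) is a tt muncher that moves tokens into a pending command until it meets a ';'. The split_cmds! name and the @munch marker are made up for this illustration. A plain repetition such as $($($tok:tt)+);* does not work here, because tt also matches ';' and the matcher becomes ambiguous.

```rust
// Hypothetical macro: splits its input on `;` and returns each command as a string.
// State layout: @munch [finished commands] [pending command] remaining tokens...
macro_rules! split_cmds {
    // Nothing left and nothing pending: emit every finished command.
    (@munch [$([$($done:tt)*])*] []) => {
        vec![$(stringify!($($done)*)),*]
    };
    // Input ended without a trailing `;`: close the pending command.
    (@munch [$($done:tt)*] [$($cur:tt)+]) => {
        split_cmds!(@munch [$($done)* [$($cur)+]] [])
    };
    // A `;` ends the pending command.
    (@munch [$($done:tt)*] [$($cur:tt)+] ; $($rest:tt)*) => {
        split_cmds!(@munch [$($done)* [$($cur)+]] [] $($rest)*)
    };
    // A `;` with nothing pending is skipped.
    (@munch [$($done:tt)*] [] ; $($rest:tt)*) => {
        split_cmds!(@munch [$($done)*] [] $($rest)*)
    };
    // Any other token is appended to the pending command.
    (@munch [$($done:tt)*] [$($cur:tt)*] $next:tt $($rest:tt)*) => {
        split_cmds!(@munch [$($done)*] [$($cur)* $next] $($rest)*)
    };
    // Entry point: start with empty state.
    ($($input:tt)*) => {
        split_cmds!(@munch [] [] $($input)*)
    };
}

fn main() {
    let cmds: Vec<&str> = split_cmds! {
        du -ah . | sort -hr | head -n 10;
        ls -l;
        echo done
    };
    assert_eq!(cmds.len(), 3);
    for cmd in &cmds {
        // The exact spacing produced by stringify! is not guaranteed.
        println!("{}", cmd);
    }
}
```

Each token costs one level of macro recursion, so long command groups may need a higher #![recursion_limit].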

This link https://danielkeep.github.io/tlborm/book/mbe-macro-rules.html has some tricks for pattern repetitions; however, I can only print the commands, and I am not able to run them with another macro.
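For the "run with another macro" part, one approach that may work is to collect the commands first and only invoke the per-command macro in the final rule. The sketch below assumes a hypothetical run_one! macro and a fake_run stub standing in for run_cmd! and real process spawning; the stub treats any command starting with "false" as a failure. The group expands to a closure so that `?` can stop at the first failing command.

```rust
// Hypothetical stand-in for spawning a process: commands starting with
// "false" fail, everything else succeeds.
fn fake_run(cmd: &str) -> Result<(), String> {
    if cmd.trim_start().starts_with("false") {
        Err(format!("command failed: {}", cmd))
    } else {
        Ok(())
    }
}

// Hypothetical per-command macro, playing the role of run_cmd!.
macro_rules! run_one {
    ($($cmd:tt)+) => { fake_run(stringify!($($cmd)+)) };
}

// State layout: @munch [finished commands] [pending command] remaining tokens...
macro_rules! group_cmds {
    // All input consumed: run each command in order, returning the first error.
    (@munch [$([$($done:tt)+])*] []) => {
        (|| -> Result<(), String> {
            $( run_one!($($done)+)?; )*
            Ok(())
        })()
    };
    // Input ended without a trailing `;`: close the pending command.
    (@munch [$($done:tt)*] [$($cur:tt)+]) => {
        group_cmds!(@munch [$($done)* [$($cur)+]] [])
    };
    // A `;` ends the pending command.
    (@munch [$($done:tt)*] [$($cur:tt)+] ; $($rest:tt)*) => {
        group_cmds!(@munch [$($done)* [$($cur)+]] [] $($rest)*)
    };
    // A `;` with nothing pending is skipped.
    (@munch [$($done:tt)*] [] ; $($rest:tt)*) => {
        group_cmds!(@munch [$($done)*] [] $($rest)*)
    };
    // Any other token is appended to the pending command.
    (@munch [$($done:tt)*] [$($cur:tt)*] $next:tt $($rest:tt)*) => {
        group_cmds!(@munch [$($done)*] [$($cur)* $next] $($rest)*)
    };
    // Entry point: start with empty state.
    ($($input:tt)*) => {
        group_cmds!(@munch [] [] $($input)*)
    };
}

fn main() {
    let result = group_cmds! {
        echo start;
        false;
        echo never_reached
    };
    // The second command fails, so the third one is never passed to fake_run.
    println!("{:?}", result);
}
```

Swapping run_one! for a real command-running macro would keep the same structure, as long as that macro evaluates to a Result whose error type matches the closure's.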

If you really intend on offering a nice syntax that is hard to parse with macro_rules! macros, I suggest you transition to using procedural macros.

Setting up a procedural macro crate

First of all, a proc_macro crate can only export procedural macros, nothing else.
Moreover, function-like procedural macros (i.e., those with the same calling interface as a macro_rules! macro) cannot be used in expression or statement position on stable Rust before 1.45, unless you use ::proc-macro-hack.

Both ::proc-macro-hack and the need to sometimes export custom types or traits lead to the two-crate pattern.

The two-crate pattern

The idea is that although a proc_macro crate can only export procedural macros, a regular crate depending on it can reexport these procedural macros while also exporting ordinary items (types, traits, functions). Hence the idea of defining the proc_macros in an internal helper crate and adding a "front crate" that depends on it and reexports the macros: that's the crate users of the library will depend on.

Directory structure

For the my_super_crate repository, you can have the following directory structure:

.
├── Cargo.toml
├── my_super_crate
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
└── my_super_crate-proc_macro
    ├── Cargo.toml
    └── src
        └── lib.rs

./Cargo.toml

[workspace]
members = ["my_super_crate", "my_super_crate-proc_macro"]

./my_super_crate-proc_macro/

./Cargo.toml

[lib]
proc-macro = true

# ...

[dependencies]
proc-macro2 = "1.0.0"
proc-macro-hack = "0.5.9"
quote = "1.0.0"
syn = { version = "1.0.0", features = [ ... ] }  # syn usually requires extra features

./src/lib.rs

extern crate proc_macro;
use ::proc_macro::TokenStream;
use ::proc_macro2::{
    Span,
    TokenStream as TokenStream2,
};
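use ::quote::quote;  // `quote!` is used in `group_cmds` below
use ::syn::{*,
    parse::{Parse, ParseStream, Parser},  // `ParseStream` is needed by `Parse::parse`
    punctuated::Punctuated,
    Result,  // explicitly shadow std's Result
};
use ::std::*;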

macro_rules! unwrap {(
    $res:expr
) => (
    match $res {
        | ::std::result::Result::Ok(inner) => inner,
        | ::std::result::Result::Err(err) => return err.to_compile_error().into(),
    }
)}

struct Cmd {
    foo: Ident, // data extracted from parsing a single command
    // etc.
}

impl Parse for Cmd {
    fn parse (input: ParseStream<'_>) -> Result<Self>
    {Ok({
        let foo: Ident = input.parse()?;
        // etc.
        Self {
            foo,
            // etc.
        }
    })}
}

#[::proc_macro_hack::proc_macro_hack] pub
fn group_cmds (input: TokenStream) -> TokenStream
{
    type Cmds = Punctuated<Cmd, Token![;]>; // sequence of `Cmd`s separated by `;`
    let parser = Cmds::parse_terminated; // syn parser for that
    let cmds: Vec<Cmd> = unwrap!(parser.parse(input)).into_iter().collect();
    let resulting_code = unwrap!(my_helper_function(cmds));
    TokenStream::from(quote! {
        #resulting_code
    })
}

fn my_helper_function (cmds: Vec<Cmd>) -> Result<Expr>
{Ok({
    let thing = stuff(cmds);
    if some_bad_condition(thing) {
        return Err(Error::new(
            Span::call_site(), // area of code to be highlighted when erroring
            format!("Expected something else"),
        ));
    }
    let forty: Expr = parse_quote! {
        40
    };
    let two: Expr = parse_quote! {
        2
    };
    let forty_two: Expr = parse_quote! {
        #forty + #two
    };
    forty_two
})}

./my_super_crate/

./Cargo.toml

[package]
version = "x.y.z"  # keep in sync with the proc-macro dependency's version

# ...

[dependencies]
proc-macro-hack = "0.5.9"

[dependencies.proc_macro]
package = "my_super_crate-proc_macro"
path = "../my_super_crate-proc_macro"
version = "x.y.z"  # keep in sync with `../my_super_crate-proc_macro/Cargo.toml`

./src/lib.rs

#[::proc_macro_hack::proc_macro_hack] // attribute to reexport a function-like proc_macro
pub use ::proc_macro::group_cmds;

// stuff

Publishing the crate

Since my_super_crate depends on my_super_crate-proc_macro, the latter needs to be published before the former:

$ (cd my_super_crate-proc_macro && cargo publish)
$ (cd my_super_crate && cargo publish)

TL;DR

Instead of going through complicated recursive loops and tt munchers with a macro_rules! macro, you can go procedural and program a real parser (cf. the Parser trait) instead of "hacking" one.
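To give a feel for what "programming a parser" means without pulling in syn, here is a std-only sketch. It is only an analogy: the tokens are plain strings rather than a proc_macro::TokenStream, and the Cmd struct and parse_cmds function are hypothetical. In a real procedural macro, Punctuated::parse_terminated would do this splitting, and errors would carry a Span.

```rust
// Hypothetical data extracted from parsing a single command.
#[derive(Debug, PartialEq)]
struct Cmd {
    program: String,
    args: Vec<String>,
}

// Splits a flat token list on ";" with ordinary control flow.
// A single trailing ";" is accepted; an empty command anywhere else is an error.
fn parse_cmds(tokens: &[&str]) -> Result<Vec<Cmd>, String> {
    let groups: Vec<&[&str]> = tokens.split(|tok| *tok == ";").collect();
    let last = groups.len().saturating_sub(1);
    let mut cmds = Vec::new();
    for (index, group) in groups.iter().enumerate() {
        match group.split_first() {
            Some((program, args)) => cmds.push(Cmd {
                program: program.to_string(),
                args: args.iter().map(|arg| arg.to_string()).collect(),
            }),
            // A trailing ";" (or empty input) leaves one empty group at the end.
            None if index == last => {}
            None => {
                return Err(format!("expected a command before `;` number {}", index + 1));
            }
        }
    }
    Ok(cmds)
}

fn main() {
    let tokens = ["do_A", ";", "do_B", "-x", ";", "do_C", ";"];
    match parse_cmds(&tokens) {
        Ok(cmds) => {
            for cmd in &cmds {
                println!("{} {:?}", cmd.program, cmd.args);
            }
        }
        Err(message) => eprintln!("parse error: {}", message),
    }
}
```

The point of the comparison is that validation and error reporting are ordinary branches here, whereas a macro_rules! muncher has to encode each case as a separate rule.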


If you do decide to go with procedural macros, I found this proc-macro workshop video super helpful.

Thanks for the detailed reply, although it doesn't show many advantages for the simple task above. I will probably consider using a proc macro for defining new shell functions, which could simply execute a group of commands and return either CmdResult or FunResult. Something like this:

def_fun! {
     let foo = $1;
     let bar = $2;

     cat ${foo} | grep ${bar}
     ...
}

However, I am not quite familiar with proc macros, so if someone could contribute by implementing some of these ideas, it would be much appreciated.