Hello!
TL;DR
What's the best way to convert "r#\"foo\\\"#" to "foo\\"?
I'm quite new to procedural macros (I started coding one today) and I have a question which is probably silly and obvious but I can't seem to find a way to solve it easily.
What I'd like to do is to "unquote" string literals, i.e. obtain their contents and not their string representation. Let me explain.
Let's say that I want to code a function-like procedural macro that takes a string literal and substitutes it with a new string literal obtained from it, for instance by adding * both at the beginning and at the end. I want it to accept both string literals and raw string literals.
Here are a few examples of desired expansions:
my_macro!("foo") // => "*foo*"
my_macro!(r"foo") // => r"*foo*"
my_macro!(r#"foo"#) // => r#"*foo*"#
my_macro!("foo\\") // => "*foo\\*"
my_macro!(r"foo\") // => r"*foo\*"
my_macro!(r#"foo\"#) // => r#"*foo\*"#
Let me add as a further constraint that I'd prefer to use only proc_macro and not other dependencies such as proc_macro2, syn, quote...
Disregarding any error management, the naive solution seems to be the following code:
use proc_macro::{TokenStream, TokenTree};

#[proc_macro]
pub fn my_macro(tokens: TokenStream) -> TokenStream {
    // Take the first token and require it to be a literal.
    let literal = match tokens.into_iter().next() {
        Some(TokenTree::Literal(l)) => l,
        _ => panic!(),
    };
    // `{}` uses `impl Display for Literal`.
    let new_string = format!("*{}*", literal);
    let new_literal = proc_macro::Literal::string(&new_string);
    TokenTree::Literal(new_literal).into()
}
From the documentation of proc_macro::Literal, the only way to obtain a representation of the specific literal seems to be to use its impl Display for Literal, either via formatting (as I did in the example) or via impl ToString for Literal.
However, the string that we get with these methods is not the contents of the literal that was passed in, but rather a string representation of the literal itself as it appears in the source code. That is, if "foo" was present in the source code, then the string that we get is "\"foo\"". More explicitly, running the expansions above leads to:
my_macro!("foo\\") // => "*\"foo\\\\\"*"
my_macro!(r"foo\") // => "*r\"foo\\\"*"
my_macro!(r#"foo\"#) // => "*r#\"foo\\\"#*"
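The double quoting above can be reproduced outside a procedural macro with plain strings. This is only a sketch: the function names are made up for illustration, and std's `{:?}` formatting of a `str` is used as a stand-in for how the literal built by `Literal::string` is printed, which is an assumption and not something taken from the `proc_macro` documentation.

```rust
/// Hypothetical stand-in for `format!("*{}*", literal)`: `repr` plays the
/// role of `literal.to_string()`, i.e. the literal exactly as it is written
/// in the source code, quotes and prefix included.
fn wrap_source_repr(repr: &str) -> String {
    format!("*{}*", repr)
}

/// Hypothetical stand-in for printing the new string literal: `{:?}` on a
/// `str` adds the surrounding quotes and escapes `"` and `\`.
fn as_source(contents: &str) -> String {
    format!("{:?}", contents)
}
```

Feeding the source text of the original literal through `wrap_source_repr` and then `as_source` yields the same doubly escaped text as in the expansions listed above, because the quotes and backslashes of the original literal are treated as ordinary characters of the new string.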
If I instrument the procedural macro with some debug printing,
#[proc_macro]
pub fn my_macro(tokens: TokenStream) -> TokenStream {
    let literal = match tokens.into_iter().next() {
        Some(TokenTree::Literal(l)) => l,
        _ => panic!(),
    };
    eprintln!("{:#?}", literal);
    eprintln!("{s:?} --- which represents --> {s}", s = literal.to_string());
    let new_string = format!("*{}*", literal);
    let new_literal = proc_macro::Literal::string(&new_string);
    TokenTree::Literal(new_literal).into()
}
during the macro expansion I can see that the proc_macro::Literal has some interesting private fields:
kind: it is Str for a simple string literal ("..."), or StrRaw(n) for a raw string literal (r#"..."#) with n number signs (#) on each side;
symbol: this seems to represent the actual characters in the original source that are contained between the quotation marks (if the source is "foo\\", the symbol is "foo\\\\");
however, I cannot find a way to access either of the two, which presumably could be helpful.
I might be missing something obvious, but it seems to me that proc_macro's public API lacks some sort of functionality to access the contents of the literals that it parses.
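For reference, here is a minimal sketch of how the same two pieces of information could be recovered from the output of `literal.to_string()` alone. The `Kind` enum and the `split_literal` function are hypothetical names that only mirror the private fields described above; they are not part of `proc_macro`. The sketch does not process escape sequences, so for a `Str` literal the returned text is still in its escaped source form, while for a `StrRaw` literal it is already the actual contents.

```rust
/// Hypothetical mirror of the private `kind` field described above.
#[derive(Debug, PartialEq)]
enum Kind {
    /// A plain string literal: "..."
    Str,
    /// A raw string literal with this many `#` on each side.
    StrRaw(usize),
}

/// Splits the source representation of a string literal (the text that
/// `Literal::to_string()` is expected to return) into its kind and the
/// text between the quotation marks.
/// Returns `None` for anything that is not a plain or raw string literal.
fn split_literal(repr: &str) -> Option<(Kind, &str)> {
    if let Some(rest) = repr.strip_prefix('r') {
        // Raw string: count the leading `#`s, then expect an opening quote
        // and a closing quote followed by the same number of `#`s.
        let hashes = rest.bytes().take_while(|&b| b == b'#').count();
        let body = rest[hashes..].strip_prefix('"')?;
        let closing = format!("\"{}", "#".repeat(hashes));
        let inner = body.strip_suffix(closing.as_str())?;
        Some((Kind::StrRaw(hashes), inner))
    } else {
        // Plain string: strip one quote on each side, leave escapes alone.
        let inner = repr.strip_prefix('"')?.strip_suffix('"')?;
        Some((Kind::Str, inner))
    }
}
```

With this, the source text r#"foo\"# splits into `StrRaw(1)` and foo\, which covers the raw-string half of the problem. The plain-string half still needs the escape sequences to be interpreted, which is exactly the parsing I would like to avoid re-implementing.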
Finally...
Question
What is the best way to obtain the contents of a string literal that is passed to a procedural macro, without re-implementing from scratch the parsing of string literals?