NOTE: sorry to crosspost issue #279 here but, since the repo has received very little attention from its owner in the past months / years, I hope to receive more help here
Hello! For the context, I am writing a TeX file parser, and I use Logos==0.12.1
to create a lexer.
You can find a (reduced) version of my Token
implementation:
use logos::Logos;
#[derive(Logos, Debug)]
pub enum Token {
#[token(r"\")]
Backslash,
#[token(r"\\")]
DoubleBackslash,
#[token(r"\begin")]
EnvironmentBegin,
#[token(r"\end")]
EnvironmentEnd,
//#[token(r"\begin{document}")] // <- the part that creates problems
DocumentBegin, // <-
#[regex(r"\\[a-zA-Z]+")]
MacroName,
#[error]
Error,
}
When I run my test suite, see reduced version below, it works fine. However, when I add the enum variant DocumentBegin
, which should not be matches by anything in within the test suite, I get an error saying that \begin
now matches token Backslash
. This does not make sense to me, since Blackslash
should only match \\
, and nothing else.
If I comment out the Backslash
token, then \begin
matches MacroName
, but not EnvironmentBegin
.
After reading the documentation, testing on different versions of Logos
, I still cannot understand why this does not work as expected.
Can someone help me about this?
Test suite
#[cfg(test)]
mod tests {
use super::*;
use logos::Logos;
macro_rules! assert_token_positions {
($source:expr, $token:pat, $($pos:expr),+ $(,)?) => {
let source = $source;
let positions: Vec<std::ops::Range<usize>> = vec![$($pos),*];
let spanned_token: Vec<_> = Token::lexer(source)
.spanned()
.filter(|(token, _)| matches!(token, $token))
.collect();
let strs: Vec<_> = Token::lexer(source)
.spanned()
.map(|(token, span)| (token, source[span].to_string()))
.collect();
assert_eq!(
spanned_token.len(), positions.len(),
"The number of tokens found did not match the expected number of positions {strs:?}"
);
for (pos, (token, span)) in positions.into_iter().zip(spanned_token) {
assert_eq!(
pos,
span,
"Token {token:#?} was found, but expected at {pos:?}"
);
}
};
}
#[test]
fn token_backslash() {
assert_token_positions!(r"Should match \+, but not \\+", Token::Backslash, 13..14,);
}
#[test]
fn token_double_backslash() {
assert_token_positions!(
r"Should match \\, but not \",
Token::DoubleBackslash,
13..15,
);
}
#[test]
fn token_environment_begin() {
assert_token_positions!(r"\begin{equation}", Token::EnvironmentBegin, 0..6,);
}
#[test]
fn token_environment_end() {
assert_token_positions!(r"\end{equation}", Token::EnvironmentEnd, 0..4,);
}
#[test]
fn token_macro_name() {
assert_token_positions!(
r"\sin\cos\text{some text}\alpha1234",
Token::MacroName,
0..4,
4..8,
8..13,
24..30,
);
}
}
Test outputs
Without DocumentBegin
variant
running 5 tests
test tests::token_environment_begin ... ok
test tests::token_backslash ... ok
test tests::token_environment_end ... ok
test tests::token_double_backslash ... ok
test tests::token_macro_name ... ok
test result: ok. 5 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
With DocumentBegin
variant
running 5 tests
test tests::token_double_backslash ... ok
test tests::token_backslash ... ok
test tests::token_environment_end ... ok
test tests::token_macro_name ... ok
test tests::token_environment_begin ... FAILED
failures:
---- tests::token_environment_begin stdout ----
thread 'tests::token_environment_begin' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `1`: The number of tokens found did not match the expected number of positions [(Backslash, "\\begin"), (Error, "{"), (Error, "e"), (Error, "q"), (Error, "u"), (Error, "a"), (Error, "t"), (Error, "i"), (Error, "o"), (Error, "n"), (Error, "}")]', src/main.rs:80:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
tests::token_environment_begin
test result: FAILED. 4 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
With DocumentBegin
variant, without Backslash
variant
NOTE:
token_backslash
test is now ignored (since we removed the variant)
running 5 tests
test tests::token_backslash ... ignored
test tests::token_environment_end ... ok
test tests::token_macro_name ... ok
test tests::token_environment_begin ... FAILED
test tests::token_double_backslash ... ok
failures:
---- tests::token_environment_begin stdout ----
thread 'tests::token_environment_begin' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `1`: The number of tokens found did not match the expected number of positions [(MacroName, "\\begin"), (Error, "{"), (Error, "e"), (Error, "q"), (Error, "u"), (Error, "a"), (Error, "t"), (Error, "i"), (Error, "o"), (Error, "n")]', src/main.rs:81:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
tests::token_environment_begin
test result: FAILED. 3 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.00s