I am writing a parser for a toy assembly language (SIC/XE) and am using LALRPOP for it, but I've run into an error I can't explain. Here is a LALRPOP rule that matches hex literals such as x'23':
pub NumLit: i32 = <matched:r"x'([-+]?\d[0-9a-fA-F]*)'"> => {
    println!("{}", matched);
    let mut a = matched[2..].to_owned();
    println!("{}", a);
    a.retain(|c| c != '\'');
    println!("{}", a);
    println!("{:#?}", a.into_bytes());
    let dfd = i32::from_str_radix(matched, 16);
    println!("{:#?}", dfd);
    dfd.unwrap()
};
This rule is written in LALRPOP's own notation, as you may have noticed. It runs a regex on the input to tokenise it; the regex is the part bound to matched: in the rule. LALRPOP copies whatever code is in the braces into the generated parser module.
So, if we parse x'23', this rule is invoked: it first prints x'23', then 23', and finally 23, after which it is supposed to parse the string as a hex number. Instead, parsing x'23' with this rule fails with an invalid digit error.
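To check the string handling independently of LALRPOP, the steps described above can be sketched as plain Rust functions. This is only a minimal sketch, not the generated parser: the function names strip_hex_quotes, parse_raw_token and parse_stripped are hypothetical, and the code assumes the token has already matched the regex in the rule (so it starts with x' and ends with '). It keeps the two candidate inputs to i32::from_str_radix apart, the token as matched and the stripped digits, so their results can be compared.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: strips the leading `x'` and every remaining `'`
/// from a token such as `x'23'`, leaving only the sign and hex digits.
/// Assumes the token already matched the regex, so slicing at byte 2 is safe.
fn strip_hex_quotes(token: &str) -> String {
    let mut digits = token[2..].to_owned();
    digits.retain(|c| c != '\'');
    digits
}

/// Parses the token exactly as matched, prefix and quotes included.
fn parse_raw_token(token: &str) -> Result<i32, ParseIntError> {
    i32::from_str_radix(token, 16)
}

/// Parses only the stripped digits.
fn parse_stripped(token: &str) -> Result<i32, ParseIntError> {
    i32::from_str_radix(&strip_hex_quotes(token), 16)
}
```

Calling parse_raw_token("x'23'") and parse_stripped("x'23'") side by side shows which of the two strings from_str_radix accepts.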
Here's the output when this rule is invoked in a test:
x'23'
23'
23
[
    50,
    51
]
Err(
    ParseIntError {
        kind: InvalidDigit
    }
)
thread 'tst' panicked at 'called `Result::unwrap()` on an `Err` value: ParseIntError { kind: InvalidDigit }', libcore/result.rs:1009:5
As you can see, the string contains only two valid ASCII digits (bytes 50 and 51, i.e. '2' and '3'), which should parse properly...