Greetings!
The previous version of the lexer for my assembler project failed so badly once I was able to test it that I switched to a library called logos to build the lexer. With it, I only really need to write an enum that logos uses to generate the lexer, plus some callback functions. There are only three of these callbacks left to implement, and all of them deal with a variable-length substring, unlike every other callback, which deals with a fixed-length substring.
As an example, here's the code I have for a fixed-length token: an absolute hexadecimal address.
    fn addr_abs_hex(lex: &mut Lexer<Token>) -> Result<AddressInfo, ()> {
        let slice: &str = lex.slice();
        // The address is always the last six characters of the matched slice.
        let poss_addr: Result<i32, ParseIntError> =
            i32::from_str_radix(slice.substring(slice.len() - 6, slice.len()), 16);
        match poss_addr {
            Ok(t) => {
                // Reject anything outside the 24-bit range 0x000000..=0xFFFFFF.
                if t > 0xFF_FFFF || t < 0 {
                    Err(())
                } else {
                    Ok(AddressInfo { addr_type: AddressType::Absolute, base: NumBase::Hexadecimal, val: t })
                }
            },
            Err(_) => Err(())
        }
    }
If you couldn't tell, the pertinent information is at the end of the string slice. With the fixed-length substrings, that's fine because I know how long each one is. With something that can be any length, it's a problem. I can't just scan forward from the start of the slice, because I could run into an identifier or string that the lexer has already tokenized. So I need to scan the slice back to front. On top of that, when looking for a string literal within the slice, I need to check for an escaped ", i.e. \", which, as you may guess, complicates things.
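To make the problem concrete, here is a rough sketch of the kind of back-to-front scan I have in mind. It is not tied to logos at all and works on a plain &str. The function names (trailing_string_start, is_escaped, trailing_hex_digits) are made up for illustration. It also rests on some assumptions: the slice ends with the closing quote of the string, backslash is the only escape character, no backslash sits directly in front of the opening quote, and a hex number is preceded by some non-hex character such as $.

```rust
/// Hypothetical helper: returns true if the byte at `idx` is preceded by an
/// odd number of consecutive backslashes, i.e. it is escaped.
fn is_escaped(bytes: &[u8], idx: usize) -> bool {
    let backslashes = bytes[..idx]
        .iter()
        .rev()
        .take_while(|&&b| b == b'\\')
        .count();
    backslashes % 2 == 1
}

/// Hypothetical helper: assuming `slice` ends with a string literal, scan
/// backwards and return the byte index of that literal's opening quote.
/// Returns None if the slice does not end with an unescaped closing quote
/// or if no opening quote is found.
fn trailing_string_start(slice: &str) -> Option<usize> {
    let bytes = slice.as_bytes();
    let last = bytes.len().checked_sub(1)?;
    // The final byte must be an unescaped quote, otherwise the literal
    // is not terminated.
    if bytes[last] != b'"' || is_escaped(bytes, last) {
        return None;
    }
    // Walk backwards from just before the closing quote and stop at the
    // first quote that is not escaped. Working on bytes is safe here because
    // '"' and '\\' are ASCII and never appear inside a multi-byte character.
    let mut i = last;
    while i > 0 {
        i -= 1;
        if bytes[i] == b'"' && !is_escaped(bytes, i) {
            return Some(i);
        }
    }
    None
}

/// Hypothetical helper: returns the longest run of ASCII hex digits at the
/// end of `slice` (possibly empty). Assumes a non-hex delimiter precedes
/// the number.
fn trailing_hex_digits(slice: &str) -> &str {
    let start = slice
        .bytes()
        .rposition(|b| !b.is_ascii_hexdigit())
        .map_or(0, |pos| pos + 1);
    &slice[start..]
}
```

With something like this, a callback could take lex.slice(), call trailing_string_start on it, and then slice from the returned index to get the whole literal including both quotes. Whether that actually holds up against everything the lexer can match is exactly what I'm unsure about.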
I would like some help with this, as it has been stumping me. Thank you for your time, and have an awesome day!