Sorry for the title, I could not figure out what to put in it.
I am building a toy language in Rust, and I am just starting with the lexer (scanner). I have the source code as a list of characters, and I am trying to loop over them and return a Token struct for each token. For brevity, I have removed the code that creates and returns tokens, and reduced it to this minimal reproducible script that shows the problem I am facing.
If I have the source script 1+2-3,
I should get 5 tokens: "1", "+", "2", "-", "3".
But my code is returning only three: "1", "2" and "3".
Here is the code:
#[derive(Debug, Default)]
pub struct Scanner {
    input: Vec<char>,
    pos: usize,
    read_pos: usize,
    ch: char,
}

impl Scanner {
    pub fn new(input: &str) -> Self {
        let mut sc = Self {
            input: input.chars().collect::<Vec<char>>(),
            pos: 0,
            read_pos: 0,
            ..Default::default()
        };
        sc.read_char();
        sc
    }

    // is at the end of input
    pub fn eof(&self) -> bool {
        self.pos >= self.input.len()
    }

    // advance and set next char to self.ch
    fn read_char(&mut self) {
        if self.read_pos >= self.input.len() {
            self.ch = '\0';
        } else {
            self.ch = self.input[self.read_pos]
        }
        self.pos = self.read_pos;
        self.read_pos += 1;
    }

    fn get_token(&mut self) {
        match self.ch {
            '+' => println!("TOK->PLUS->+"),
            '-' => println!("TOK-MINUS->-"),
            // digits
            '0'..='9' => {
                let curpos = self.pos; // starting position of the number token
                while !self.eof() && self.ch.is_ascii_digit() {
                    self.read_char()
                }
                let result: String = self.input[curpos..self.pos].iter().collect();
                println!("TOK-NUMBER->{}", result);
            }
            _ => println!("TOK->ILLEGAL->{}", self.ch),
        }
        self.read_char()
    }
}

fn main() {
    let mut s = Scanner::new("10+200-100");
    while !s.eof() {
        s.get_token()
    }
}
Can anyone suggest what I am doing wrong? Why are the symbols getting skipped?
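A likely cause, going only by the code above: the digit branch of `get_token` keeps calling `read_char` until `self.ch` is the first character after the number, and then the unconditional `self.read_char()` at the end of `get_token` advances once more, stepping over the operator. Below is a minimal sketch of one possible fix, in which the number branch returns early. The `String` return type of `get_token` and the `tokens` helper are hypothetical additions made only so the result can be checked; they are not part of the original scanner.

```rust
#[derive(Debug, Default)]
pub struct Scanner {
    input: Vec<char>,
    pos: usize,      // index of the character currently held in `ch`
    read_pos: usize, // index of the next character to read
    ch: char,
}

impl Scanner {
    pub fn new(input: &str) -> Self {
        let mut sc = Self {
            input: input.chars().collect(),
            ..Default::default()
        };
        sc.read_char();
        sc
    }

    pub fn eof(&self) -> bool {
        self.pos >= self.input.len()
    }

    // Advance one character; `ch` becomes '\0' past the end of input.
    fn read_char(&mut self) {
        self.ch = self.input.get(self.read_pos).copied().unwrap_or('\0');
        self.pos = self.read_pos;
        self.read_pos += 1;
    }

    // Assumption: returns the token text instead of printing it.
    fn get_token(&mut self) -> String {
        if self.ch.is_ascii_digit() {
            let start = self.pos;
            while !self.eof() && self.ch.is_ascii_digit() {
                self.read_char();
            }
            // `self.ch` already holds the character after the number,
            // so return here and do NOT advance a second time.
            return self.input[start..self.pos].iter().collect();
        }
        // Single-character token: take it, then advance exactly once.
        let tok = self.ch.to_string();
        self.read_char();
        tok
    }
}

// Hypothetical helper: scan the whole input into a list of token strings.
pub fn tokens(src: &str) -> Vec<String> {
    let mut s = Scanner::new(src);
    let mut out = Vec::new();
    while !s.eof() {
        out.push(s.get_token());
    }
    out
}
```

With this change, `tokens("1+2-3")` yields the five tokens "1", "+", "2", "-", "3". An alternative with the same effect would be to keep the `match` and move the trailing `read_char()` call into the single-character arms only.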