I'm currently following a tutorial on writing a tiny compiler from scratch, Super Tiny Compiler, which I found via an old blog post by Johnathan Turner, "Programming Language and Compilers Reading List".
Right now I'm near the beginning, at the lexer portion. Could someone point out a way to replace all the if statements with a single match statement? I believe it will require the regex crate, because I have to match on things more complex than a single character (for instance a multi-digit number, or whitespace). The problem is that this would need several different regexes, and I don't know how to use those in a match statement.
This is what I have so far, using only if statements:
fn main() {
    println!("Hello, world!");
    tokenizer("This is a test()");
}

const PAREN: &str = "paren";
const NUMBER: &str = "number";

struct Token<'a> {
    ttype: &'a str,
    value: String, // a String, because a number token can span several characters
}

/// Take a string of code and tokenize it
/// (add 2 (subtract 4 2)) => [ { type: 'paren', value: '('}, ...]
fn tokenizer(input: &str) {
    let mut current = 0; // current location of cursor
    let mut tokens: Vec<Token> = Vec::new(); // array for tokens
    let input_vec: Vec<char> = input.chars().collect();
    // Compare against the number of chars, not input.len(), which counts bytes
    while current < input_vec.len() {
        let cur_char: char = input_vec[current];
        if cur_char == '(' {
            tokens.push(Token { ttype: PAREN, value: cur_char.to_string() });
            current += 1;
            continue;
        }
        if cur_char == ')' {
            tokens.push(Token { ttype: PAREN, value: cur_char.to_string() });
            current += 1;
            continue;
        }
        if cur_char.is_whitespace() {
            current += 1;
            continue;
        }
        if cur_char.is_numeric() {
            let mut num_value = String::new();
            // Check the bound first so a number at the very end of the input
            // does not index past the end of the vector
            while current < input_vec.len() && input_vec[current].is_numeric() {
                num_value.push(input_vec[current]);
                current += 1;
            }
            tokens.push(Token { ttype: NUMBER, value: num_value });
            continue;
        }
        // Not handled yet (letters, for example): skip the character so the
        // loop always advances
        current += 1;
    }
}
What I'd like to do is get rid of all the:
if cur_char == whatever
and replace them with:
match cur_char {
    '(' => do something,
    ')' => do something,
    whitespace => do something,
    Number => do something
}
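
One way to get that shape without the regex crate may be match guards: an arm such as `c if c.is_whitespace()` tests a predicate instead of a literal character. The following is only a minimal sketch under some assumptions. The `Token` struct with an owned `String` value, the `tokenize` function, and the `read_number` helper are hypothetical names chosen for illustration, not taken from the tutorial, and characters that are not parens, whitespace, or digits are simply skipped.

```rust
use std::iter::Peekable;
use std::str::Chars;

// Hypothetical token type for this sketch; `value` is a String so that a
// multi-digit number fits in a single token.
#[derive(Debug, PartialEq)]
struct Token {
    ttype: &'static str,
    value: String,
}

const PAREN: &str = "paren";
const NUMBER: &str = "number";

// Consume consecutive numeric characters and return them as one String.
fn read_number(chars: &mut Peekable<Chars<'_>>) -> String {
    let mut num = String::new();
    while let Some(&c) = chars.peek() {
        if !c.is_numeric() {
            break;
        }
        num.push(c);
        chars.next();
    }
    num
}

fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = input.chars().peekable();
    // Peek first, so each arm decides how many characters it consumes.
    while let Some(&c) = chars.peek() {
        match c {
            '(' | ')' => {
                tokens.push(Token { ttype: PAREN, value: c.to_string() });
                chars.next();
            }
            // A guard lets an arm test a predicate instead of a literal.
            c if c.is_whitespace() => {
                chars.next();
            }
            c if c.is_numeric() => {
                let value = read_number(&mut chars);
                tokens.push(Token { ttype: NUMBER, value });
            }
            // Anything else (names, for example) is skipped in this sketch.
            _ => {
                chars.next();
            }
        }
    }
    tokens
}
```

Calling `tokenize("(add 2 (subtract 4 2))")` from `main` and printing the result with `{:?}` should show the paren and number tokens in order. Using a peekable iterator instead of an index avoids bounds checks when a number sits at the end of the input, though an index-based loop with the same match arms would work as well.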