Best way to get number after and before operator here?

I have

fn getnum(seg: &str, isfirst: bool) -> isize {
    if isfirst {
        // Segment before the operator: the operand is the last piece.
        match seg.split(&['+', '-', '*', '/'][..]).last() {
            Some(x) => parse(x.to_string()),
            None => panic!("Not happening"),
        }
    } else {
        // Segment after the operator: the operand is the first piece.
        match seg.split(&['+', '-', '*', '/'][..]).next() {
            Some(x) => parse(x.to_string()),
            None => panic!("Not happening"),
        }
    }
}

fn parse(toparse: String) -> isize {
    toparse.parse::<isize>().unwrap()
}

I am creating a calculator in Rust (the code above is a small portion of the overall code).
getnum takes a segment of an equation (stripped of whitespace) that has already been split before or after the operator chosen for evaluation next (based on PEMDAS). My code only works for positive numbers, though. Because it splits on hyphens (as well as pluses, asterisks, and slashes), things go wrong when the expression contains negative numbers.

We can input an equation such as 1*-1+2.
It determines that the next portion to evaluate is the asterisk at index 1. It splits the vector on the asterisk, turns each half into a str, and runs getnum on both to get the numbers the asterisk operates on. isfirst indicates whether the segment is the upper or the lower portion of the equation (for the upper portion it takes the first value of the array of numbers; for the lower, the last). For example, the lower portion passed to the function would be 1 and the upper would be -1+2.
It would split both and get the first and last respective elements of those arrays. (-1 and 2)
I tried writing a regex (see here). The pattern itself works, but as far as I know there aren't any Rust crates that support PCRE regex, as I learned from trial and error with 2 regex crates:

// Error: Invalid regex
let re = Regex::new(r"(?<!(\+|\-|\*|^))[+\-*/]").expect("Invalid regex");

What is the best course of action here, and/or could you show a code implementation for it?
Thank you.

There are two general ways to approach this problem, depending on whether you consider it a lexing or parsing problem:

  1. While tokenizing, look ahead each time you find a - to see if it's followed by a digit. If it is, you consume the whole thing and emit a Number token. Otherwise, you emit a Minus token.
  2. While parsing, when you encounter -, first decide whether it's a "binary minus" (as in a - b) or "unary minus" (as in -a). Look at the context: if the last thing parsed was an operator, or this is the beginning of an expression or parenthesized subexpression, it has to be the unary operator; if the last thing parsed was an operand, the - has to be the binary operator.

Either one will work fine. Rust itself took option 2, but other languages such as Ruby use option 1.
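
As a rough illustration of option 1, here is a minimal tokenizer sketch. The names Token and tokenize are made up for this example and are not from the original code. One caveat: with lookahead alone, 3-1 would lex as the two numbers 3 and -1, so this sketch also checks that the previous token is not a number before treating - as a sign. It assumes whitespace-free input containing only digits and the four operators.

```rust
// Hypothetical token type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Number(isize),
    Plus,
    Minus,
    Star,
    Slash,
}

/// Tokenizes a whitespace-free expression such as "1*-1+2".
fn tokenize(input: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        // A '-' starts a negative literal only if a digit follows AND the
        // previous token is not a number (start of input or after an operator).
        let prev_is_number = matches!(tokens.last(), Some(Token::Number(_)));
        let next_is_digit = chars.get(i + 1).map_or(false, |d| d.is_ascii_digit());
        if c.is_ascii_digit() || (c == '-' && next_is_digit && !prev_is_number) {
            // Consume the optional sign and all following digits.
            let start = i;
            i += 1;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let value = text.parse::<isize>().map_err(|e| e.to_string())?;
            tokens.push(Token::Number(value));
        } else {
            tokens.push(match c {
                '+' => Token::Plus,
                '-' => Token::Minus,
                '*' => Token::Star,
                '/' => Token::Slash,
                other => return Err(format!("unexpected character {other:?}")),
            });
            i += 1;
        }
    }
    Ok(tokens)
}
```

Under these assumptions, 1*-1+2 comes out as Number(1), Star, Number(-1), Plus, Number(2), while 3-1 still yields a Minus token between the two numbers.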

There is a well-known algorithm for parsing infix expressions, the shunting-yard algorithm. I recommend implementing that, rather than trying to make an ad-hoc parser based on splitting strings. You can still solve the problem of - either before or during parsing, but if you base your implementation on that wiki page it might be easier to do option 2 (I think you can just look at the last thing you put in the output queue).
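
To make that concrete, here is a small sketch in the spirit of shunting-yard combined with option 2. It is a simplified variant, not the textbook version: instead of building an output queue it evaluates directly with a value stack and an operator stack, and an expect_operand flag stands in for "look at the last thing you put in the output queue". The names Op, apply, and evaluate are invented for this example. Parentheses, whitespace, and overflow handling are left out for brevity.

```rust
/// Operators kept on the operator stack; `Neg` is unary minus.
#[derive(Clone, Copy, PartialEq)]
enum Op { Add, Sub, Mul, Div, Neg }

impl Op {
    fn precedence(self) -> u8 {
        match self {
            Op::Add | Op::Sub => 1,
            Op::Mul | Op::Div => 2,
            Op::Neg => 3, // unary minus binds tighter than any binary operator
        }
    }
}

/// Applies one operator to the operand(s) on top of the value stack.
fn apply(op: Op, values: &mut Vec<isize>) -> Result<(), String> {
    let rhs = values.pop().ok_or("missing operand")?;
    if op == Op::Neg {
        values.push(-rhs);
        return Ok(());
    }
    let lhs = values.pop().ok_or("missing operand")?;
    values.push(match op {
        Op::Add => lhs + rhs,
        Op::Sub => lhs - rhs,
        Op::Mul => lhs * rhs,
        Op::Div => {
            if rhs == 0 {
                return Err("division by zero".to_string());
            }
            lhs / rhs
        }
        Op::Neg => unreachable!(),
    });
    Ok(())
}

/// Evaluates a whitespace-free expression such as "1*-1+2".
fn evaluate(input: &str) -> Result<isize, String> {
    let mut values: Vec<isize> = Vec::new();
    let mut ops: Vec<Op> = Vec::new();
    // True at the start and right after an operator: a '-' here is unary.
    let mut expect_operand = true;
    let mut chars = input.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_ascii_digit() {
            // Read a run of digits (overflow is not handled in this sketch).
            let mut n: isize = 0;
            while let Some(d) = chars.peek().and_then(|ch| ch.to_digit(10)) {
                n = n * 10 + d as isize;
                chars.next();
            }
            values.push(n);
            expect_operand = false;
            continue;
        }
        chars.next();
        let op = match (c, expect_operand) {
            ('-', true) => Op::Neg,
            ('+', false) => Op::Add,
            ('-', false) => Op::Sub,
            ('*', false) => Op::Mul,
            ('/', false) => Op::Div,
            _ => return Err(format!("unexpected character {c:?}")),
        };
        if op != Op::Neg {
            // Binary operators are left-associative: first apply everything
            // on the stack with equal or higher precedence.
            while let Some(&top) = ops.last() {
                if top.precedence() < op.precedence() {
                    break;
                }
                apply(top, &mut values)?;
                ops.pop();
            }
        }
        ops.push(op);
        expect_operand = true;
    }
    if expect_operand {
        return Err("expected an operand".to_string());
    }
    while let Some(op) = ops.pop() {
        apply(op, &mut values)?;
    }
    values.pop().ok_or_else(|| "empty expression".to_string())
}
```

With this approach there is no need to split the string around an operator at all: evaluate("1*-1+2") treats the - as unary because it directly follows *, and the precedence check takes care of the order of operations.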

Thanks for the suggestion, I should be thinking of this calculator as a programming language a bit more :slight_smile: