Rust syntax: matches!()

  1. This code determines whether a lexical character is a special character. Is this the correct way to write it, or is there a more efficient or more correct way?

  2. For '%', the Rust error index does not make clear why this does not run correctly when testing. Any thoughts as to why this may be the case?

Thank you kindly.

pub fn is_special_character() -> bool {
    matches!(*self,
        '"',
        '%',
        '&',
        "'",
        '(',
        ')',
        '*',
        '+',
        ',',
        '-',
        '.',
        '/',
        r"\ ",
        ':',
        ';',
        '<',
        '=',
        '?',
        '[',
        "??(",
        ']',
        "??)",
        '^',
        '_',
        '|',
        '{',
        '}',
        '$',
        "`",
    )
}

The syntax for matches!() looks like matches!(some_character, '"' | '%' | '&').

It's essentially a shorthand for a match statement:

match some_character {
  '"' | '%' | '&' => true,
  _ => false,
}

You're also going to run into compile errors because you are mixing character literals (e.g. '%') with string literals (e.g. "??(").

A character literal has the type char and holds exactly one Unicode scalar value; it must be written with single quotes ('). A string literal, on the other hand, has the type &'static str and holds zero or more characters, encoded as UTF-8 and enclosed in double quotes (").

That means even if you rewrite your matches!() to use | for separating patterns, you're still going to get a type error because you are comparing a char against a string literal.

error[E0308]: mismatched types
 --> src/lib.rs:6:15
  |
3 |         c,
  |         - this expression has type `char`
...
6 |             | "'"
  |               ^^^ expected `char`, found `&str`
  |
help: if you meant to write a `char` literal, use single quotes
  |
6 |             | '\''
  |               ~~~~

error[E0308]: mismatched types
  --> src/lib.rs:15:15
   |
3  |         c,
   |         - this expression has type `char`
...
15 |             | r"\ "
   |               ^^^^^ expected `char`, found `&str`

My advice is to ask yourself what it means to be a "special character".

If a special character is a single Unicode scalar value (i.e. a "single character"), then you should switch everything over to compare a char against character literals and remove multi-character patterns like "??)" and friends.

Otherwise, if a special character is actually a sequence of one or more characters, you should be comparing a &str against string literals.
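As a minimal sketch of that second option, the same matches!() shape works when both sides are strings. The function name is_special_token and the shortened token list are assumptions made for illustration, not the full set from the question:

```rust
/// Hypothetical helper: returns true if `token` is one of the special tokens.
/// The list is deliberately incomplete; it only illustrates the pattern shape.
pub fn is_special_token(token: &str) -> bool {
    matches!(
        token,
        // Every pattern is a string literal, so single-character and
        // multi-character tokens can sit side by side without a type error.
        "\"" | "%" | "&" | "'" | "(" | ")" | "\\" | "??(" | "??)"
    )
}
```

Note that this compares the whole of token against each literal, so "??" on its own does not match "??(".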

Specifically how does it "not run correctly"? As pointed out above, this shouldn't even compile, so if you are running some code and it's behaving unexpectedly, then you are definitely running something else.

Thank you kindly for your reply. I changed it to the following:

pub fn is_special_character(c: char) -> bool {
    matches!(c, '\u{0022}' // '"'
        | '\u{0025}'    // '%'
        | '\u{0026}'    // '&'
        | '\u{0027}'    // '\''
        | '\u{0028}'    // '('
        | '\u{0029}'    // ')'
        | '\u{002A}'    // '*'
        | '\u{002B}'    // '+'
        | '\u{002C}'    // ','
        | '\u{002D}'    // '-'
        | '\u{002E}'    // '.'
        | '\u{002F}'    // '/'
        | '\u{005C}'    // '\\'
        | '\u{003A}'    // ':'
        | '\u{003B}'    // ';'
        | '\u{003C}'    // '<'
        | '\u{003D}'    // '='
        | '\u{003F}'    // '?'
        | '\u{005B}'    // '['
        // | // "??("
        | '\u{005D}'    // ']'
        // | // "??)"
        | '\u{005E}'    // '^'
        | '\u{005F}'    // '_'
        | '\u{007C}'    // '|'
        | '\u{007B}'    // '{'
        | '\u{007D}'    // '}'
        | '\u{0024}'    // '$'
        | '\u{0060}'    // '`'
    )
}

Do you recommend that the multi-character matches be written in a separate function?

Thank you kindly

You are correct. I had to open a new .rs file and compile it. That picked up the errors, and I referenced them against the Rust error codes, which did help.

Thank you kindly

It's a higher level question than that: you're comparing a character, so it will never be equal to multiple characters. You need to figure out what you're actually trying to do here: probably trying to find the longest matching prefix.
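A rough sketch of what "longest matching prefix" could look like, assuming a small illustrative token table. SPECIAL_TOKENS and longest_special_prefix are names made up for this example:

```rust
/// Illustrative token table; a real lexer would list every token it supports.
const SPECIAL_TOKENS: &[&str] = &["??(", "??)", "?", "(", ")", "%", "&"];

/// Returns the longest entry of `SPECIAL_TOKENS` that `input` starts with,
/// or `None` if no entry matches.
pub fn longest_special_prefix(input: &str) -> Option<&'static str> {
    SPECIAL_TOKENS
        .iter()
        .copied()
        // Keep only the tokens that appear at the very start of the input.
        .filter(|token| input.starts_with(*token))
        // Among those, prefer the one that consumes the most input.
        .max_by_key(|token| token.len())
}
```

With this table, input beginning with "??(" yields the three-character token rather than the single "?", which is the behaviour a char-by-char comparison cannot give you.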

If I'm to play Clippy (the Microsoft one) for a second: it looks like you're trying to write a parser. There's a lot of help out there for this: full courses, books, blogs, libraries and so on. But at its core, parsing looks something like this:

fn parse_something(input: &str) -> Result<(Something, &str), SomeError>;

If you see what you expect at the start of input, you create a value representing it in whatever way you want, then return it along with the input that follows. Otherwise, you create an error value representing why you couldn't parse it. This is really boiling it down, but it should at least get you on the right track. If you find yourself writing a lot of code, take a look at the libraries for building parsers listed under "Parser tooling" on Lib.rs.
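To make that shape concrete, here is a small sketch that parses a run of ASCII digits. The names parse_number and ParseError are hypothetical stand-ins for Something and SomeError in the signature above:

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// The input did not start with an ASCII digit.
    ExpectedDigit,
    /// The digits did not fit in a `u32`.
    Overflow,
}

/// Parses a run of ASCII digits at the start of `input` and returns the
/// number together with the unconsumed remainder of the input.
pub fn parse_number(input: &str) -> Result<(u32, &str), ParseError> {
    // Byte index of the first non-digit. ASCII digits are one byte each,
    // so this index is always a valid place to split the string.
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err(ParseError::ExpectedDigit);
    }
    let (digits, rest) = input.split_at(end);
    // `digits` is non-empty and all ASCII digits, so the only way parsing
    // can fail here is by overflowing `u32`.
    let value = digits.parse::<u32>().map_err(|_| ParseError::Overflow)?;
    Ok((value, rest))
}
```

The caller feeds the returned remainder into the next parsing function, which is how small parsers like this one chain together.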

Thank you kindly Simon. I am writing a lexer, still in the scanning phase. I do have some examples to work with. It's a lot of fun. The core reason I ask these questions is to get up to speed with Rust syntax. I have the theory in my mind's eye, but I'm still getting to grips with the syntax. I'm building the lexer in stages. I will be moving on to the input buffer and lookahead operations soon. I will also be building the error functionality for the lexer. Once I've done that, I will be building a command-line tool to display the errors, successes, and tokens in the terminal/cmd. Hopefully I should be finished by the end of this week, and then able to move on to the parser (as a separate standalone crate). Thank you kindly for your insight.

A char in Rust has a lot of associated functions to test whether it belongs to certain classes of characters (e.g. is_digit, is_control, is_alphanumeric). I would recommend relying on these established categories.
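For example, a scanner could classify characters along these lines. CharClass and classify are hypothetical names for this sketch; the char methods themselves are from the standard library. One caveat: is_ascii_punctuation covers all ASCII punctuation, including '!', '#', '>', '@' and '~', so it is broader than the list in the original question.

```rust
/// Hypothetical character classes for a simple scanner.
#[derive(Debug, PartialEq)]
pub enum CharClass {
    Digit,
    Letter,
    Punctuation,
    Whitespace,
    Other,
}

/// Classifies a character using the standard library's `char` methods.
pub fn classify(c: char) -> CharClass {
    if c.is_ascii_digit() {
        CharClass::Digit
    } else if c.is_alphabetic() {
        CharClass::Letter
    } else if c.is_ascii_punctuation() {
        // Broader than a hand-written list: includes every ASCII
        // punctuation character, not only the ones a given lexer needs.
        CharClass::Punctuation
    } else if c.is_whitespace() {
        CharClass::Whitespace
    } else {
        CharClass::Other
    }
}
```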

Thank you kindly!