How to improve the parsing code for JSON

Hi, I'm writing a (non-compliant) JSON parser with a top-down parsing method to practice my Rust programming.
Here is the full repo: GitHub - Celthi/parsing-rs: Experiment writing JSON parser with Rust. There are several tests for the parser and the lexer. It seems to be working, but the code is not as idiomatic as I think it could be. For example, the code for parsing an object is:

fn parse_object<'a, 'b>(
    lexemes: &'a [Lexeme<'b>],
) -> Result<(Value, &'a [Lexeme<'b>]), &'static str> {
    if lexemes.len() < 2 || lexemes[0]._type != b'{' {
        return Err("Not a object.");
    }
    // empty object
    if lexemes[1]._type == b'}' {
        return Ok((Value::Object(HashMap::new()), &lexemes[2..]));
    }
    let mut m = HashMap::new();
    let mut lexemes = &lexemes[1..];
    loop {
        if let (Value::String(s), lexeme) = parse_string(lexemes)? {
            if lexeme[0]._type != b':' {
                return Err("colon expected.");
            }
            lexemes = &lexeme[1..];
            let (value, lexeme) = parse_value(lexemes)?;
            m.insert(s, value);
            if lexeme[0]._type != b',' {
                lexemes = lexeme;
                break;
            }
            lexemes = &lexeme[1..];
        }
    }
    if lexemes.len() < 1 || lexemes[0]._type != b'}' {
        return Err("right bracket expected.");
    }
    return Ok((Value::Object(m), &lexemes[1..]));
}

I was thinking of improving this snippet by extracting the following code into a function:

        if let (Value::String(s), lexeme) = parse_string(lexemes)? {
            if lexeme[0]._type != b':' {
                return Err("colon expected.");
            }
            lexemes = &lexeme[1..];
            let (value, lexeme) = parse_value(lexemes)?;
            m.insert(s, value);
            if lexeme[0]._type != b',' {
                lexemes = lexeme;
                break;
            }
            lexemes = &lexeme[1..];

But it seems I would need to return three things: the lexemes that have not been parsed yet, the key and value that have been parsed, and an error if there is one.
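One way to return those three things is to nest them: the key/value pair and the remaining lexemes go into a tuple inside Ok, and the error goes into Err, which is the same shape parse_string and parse_value already have. Below is a minimal sketch under assumptions, not the repo's code: the Lexeme and Value types are simplified stand-ins, the `text` field and the b'"' / b'0' type tags are made up for illustration, and `expect` and `parse_member` are hypothetical helper names.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the repo's types (assumptions, the real ones may differ).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    String(String),
    Number(f64),
    Object(HashMap<String, Value>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Lexeme<'b> {
    pub _type: u8,      // assumed tags: b'"' = string, b'0' = number, else the punctuation byte
    pub text: &'b str,  // hypothetical field holding the token text
}

// Every parse function returns (parsed thing, unparsed lexemes) or an error.
type ParseResult<'a, 'b, T> = Result<(T, &'a [Lexeme<'b>]), &'static str>;

// Consumes one lexeme of type `ty`, or fails with `msg` (also at end of input).
fn expect<'a, 'b>(
    lexemes: &'a [Lexeme<'b>],
    ty: u8,
    msg: &'static str,
) -> Result<&'a [Lexeme<'b>], &'static str> {
    match lexemes.split_first() {
        Some((lx, rest)) if lx._type == ty => Ok(rest),
        _ => Err(msg),
    }
}

fn parse_string<'a, 'b>(lexemes: &'a [Lexeme<'b>]) -> ParseResult<'a, 'b, String> {
    match lexemes.split_first() {
        Some((lx, rest)) if lx._type == b'"' => Ok((lx.text.to_string(), rest)),
        _ => Err("string expected."),
    }
}

fn parse_value<'a, 'b>(lexemes: &'a [Lexeme<'b>]) -> ParseResult<'a, 'b, Value> {
    match lexemes.split_first() {
        Some((lx, rest)) if lx._type == b'"' => Ok((Value::String(lx.text.to_string()), rest)),
        Some((lx, rest)) if lx._type == b'0' => {
            let n = lx.text.parse::<f64>().map_err(|_| "invalid number.")?;
            Ok((Value::Number(n), rest))
        }
        Some((lx, _)) if lx._type == b'{' => parse_object(lexemes),
        _ => Err("value expected."),
    }
}

// The extracted function: one `key : value` pair plus the unparsed lexemes.
fn parse_member<'a, 'b>(lexemes: &'a [Lexeme<'b>]) -> ParseResult<'a, 'b, (String, Value)> {
    let (key, rest) = parse_string(lexemes)?;
    let rest = expect(rest, b':', "colon expected.")?;
    let (value, rest) = parse_value(rest)?;
    Ok(((key, value), rest))
}

fn parse_object<'a, 'b>(lexemes: &'a [Lexeme<'b>]) -> ParseResult<'a, 'b, Value> {
    let mut rest = expect(lexemes, b'{', "Not an object.")?;
    let mut m = HashMap::new();
    // Empty object: '}' directly after '{'.
    if let Ok(after) = expect(rest, b'}', "") {
        return Ok((Value::Object(m), after));
    }
    loop {
        let ((key, value), after) = parse_member(rest)?;
        m.insert(key, value);
        // A comma means another member follows; anything else ends the list.
        match expect(after, b',', "") {
            Ok(next) => rest = next,
            Err(_) => {
                rest = after;
                break;
            }
        }
    }
    let rest = expect(rest, b'}', "right bracket expected.")?;
    Ok((Value::Object(m), rest))
}
```

Because `expect` goes through split_first, input that ends early turns into an Err instead of an out-of-bounds panic, and a key that is not a string is reported as an error instead of leaving the loop spinning.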

Could you give me some ideas (general or specific)? I will try them out to improve the code and practice my skills. I'd appreciate any comments! Recommendations of other well-written JSON parsers that use the same technique (top-down with tokenizing) are also welcome.
I have read the serde_json library, but I find it does not use a top-down parser, so I cannot make an apples-to-apples comparison.

The (non-compliant) JSON parser first tokenizes the input string and then parses the tokens with a top-down parsing method.

I use the name Lexeme for a token because the first version of the parser used the name Token, and this is my second iteration of the parser.

The top-down parser uses one token of lookahead.
The parser also treats every number as an f64 because I don't want to deal with number handling at this stage.
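Treating every number as an f64 can be delegated almost entirely to the standard library. As a rough sketch (the function name `lex_number` and its signature are assumptions, not taken from the repo), the lexer only has to find where the number-like run of characters ends and let str::parse decide whether it is valid:

```rust
/// Scans a number at the start of `input`.
/// Returns the parsed value and the unconsumed remainder, or None if
/// `input` does not start with something `f64` parsing accepts.
fn lex_number(input: &str) -> Option<(f64, &str)> {
    let bytes = input.as_bytes();
    let mut end = 0;
    // Accept a loose superset of JSON number characters; `parse` validates.
    while end < bytes.len()
        && (bytes[end].is_ascii_digit()
            || matches!(bytes[end], b'-' | b'+' | b'.' | b'e' | b'E'))
    {
        end += 1;
    }
    // Only ASCII bytes were consumed, so `end` is a valid char boundary.
    let (text, rest) = input.split_at(end);
    text.parse::<f64>().ok().map(|n| (n, rest))
}
```

This is deliberately not strict: f64 parsing accepts some spellings that JSON forbids (a leading `+`, for example), which seems acceptable for a parser that is not aiming for compliance yet.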

Some things to start with...

  • Use an enum for Lexeme::_type and replace e.g. your chain of ifs in parse_value with a match
  • Return Results from get_string and get_string_in_quote instead of just returning out-of-order indices
  • There's a lot of indexing and manual looping that could be replaced with iterator and slice methods
  • Make use of is_ascii_digit and friends
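To illustrate the first and third points, here is a small sketch of what an enum-tagged lexeme and a match-based parse_value could look like. The names `Kind`, `kind`, `text` and `parse_array` are made up for this example, and the Value type is a reduced stand-in, so treat it as one possible shape and not as the repo's actual design:

```rust
// Hypothetical token kinds replacing the `u8` in `Lexeme::_type`.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    LBrace, RBrace, LBracket, RBracket, Colon, Comma,
    String, Number, True, False, Null,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Lexeme<'b> {
    kind: Kind,
    text: &'b str, // assumed field holding the token text
}

// Reduced stand-in for the parser's value type (objects omitted for brevity).
#[derive(Debug, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Value>),
}

type ParseResult<'a, 'b> = Result<(Value, &'a [Lexeme<'b>]), &'static str>;

fn parse_value<'a, 'b>(lexemes: &'a [Lexeme<'b>]) -> ParseResult<'a, 'b> {
    // `split_first` gives the lookahead token and the rest without indexing.
    let (first, rest) = lexemes.split_first().ok_or("unexpected end of input.")?;
    match first.kind {
        Kind::Null => Ok((Value::Null, rest)),
        Kind::True => Ok((Value::Bool(true), rest)),
        Kind::False => Ok((Value::Bool(false), rest)),
        Kind::String => Ok((Value::String(first.text.to_owned()), rest)),
        Kind::Number => {
            let n = first.text.parse::<f64>().map_err(|_| "invalid number.")?;
            Ok((Value::Number(n), rest))
        }
        Kind::LBracket => parse_array(rest),
        _ => Err("value expected."),
    }
}

// `rest` starts just after the opening '['.
fn parse_array<'a, 'b>(mut rest: &'a [Lexeme<'b>]) -> ParseResult<'a, 'b> {
    let mut items = Vec::new();
    if let Some((Lexeme { kind: Kind::RBracket, .. }, after)) = rest.split_first() {
        return Ok((Value::Array(items), after));
    }
    loop {
        let (item, after) = parse_value(rest)?;
        items.push(item);
        match after.split_first() {
            Some((lx, next)) if lx.kind == Kind::Comma => rest = next,
            Some((lx, next)) if lx.kind == Kind::RBracket => {
                return Ok((Value::Array(items), next));
            }
            _ => return Err("',' or ']' expected."),
        }
    }
}
```

With an enum the compiler can also check the match for exhaustiveness if the catch-all arm is dropped, which is something a chain of ifs over byte values cannot offer.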

That sounds great. I will do this over the weekend.

Thanks for your kind and enlightening suggestions. I have made several commits following your advice, except that I return Option instead of Result from get_string.
The new code is in GitHub - Celthi/parsing-rs: Experiment writing JSON parser with Rust
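For what it's worth, returning Option from the low-level helper and Result from its caller can coexist, since Option::ok_or converts one into the other at the point where an error message makes sense. A small sketch, assuming a hypothetical signature for get_string_in_quote (the real one in the repo may look different) and a made-up caller `lex_string`:

```rust
/// Hypothetical helper: expects `input` to start with '"'.
/// Returns the text between the quotes and the remainder after the
/// closing quote. Escape sequences are not handled in this sketch.
fn get_string_in_quote(input: &str) -> Option<(&str, &str)> {
    let body = input.strip_prefix('"')?;
    let end = body.find('"')?;
    Some((&body[..end], &body[end + 1..]))
}

/// Made-up caller that attaches an error message to the `None` case.
fn lex_string(input: &str) -> Result<(&str, &str), &'static str> {
    get_string_in_quote(input).ok_or("unterminated or missing string.")
}
```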
