[Solved] Nom 5: parse a string containing escaped quotes and delimited by quotes

Hi all,

I'm a newbie to Nom and to parsing in general. I'm trying to parse a string delimited by quotes that might contain escaped quotes.

I already took a look at the JSON example provided in the nom repository, but it doesn't suit my needs: if a string contains any whitespace or punctuation character, parsing fails.

So far I've tried this, but it doesn't work either:

use nom::bytes::complete::tag;
use nom::bytes::complete::take_while1;
use nom::sequence::delimited;
use nom::IResult;
use nom::bytes::complete::escaped;
use nom::character::complete::one_of;

fn parse_str(i: &str) -> IResult<&str, &str> {
    escaped(
        take_while1(|c: char| c.is_alphanumeric() || c.is_ascii_punctuation() || c.is_whitespace()),
        '\\',
        one_of(r#""n\"#)
    )(i)
}

fn delimited_str(i: &str) -> IResult<&str, &str> {
    delimited(
        tag("\""),
        parse_str,
        tag("\""),
    )(i)
}

#[cfg(test)]
mod parser_test {
    use super::*;

    #[test]
    fn test_delimited_str() {
        let input = r#""Lorem, ipsum dolor \"sit\" amet!?""#;
        assert_eq!(delimited_str(input), Ok(("", ("Lorem, ipsum dolor \"sit\" amet!?"))));
    }
}

How do I parse a string containing escaped quotes and delimited by quotes with Nom?

Thanks

One thing to decide is whether you want the escape backslashes kept in the result. If you want them removed, the return type must change to String: the unescaped text is no longer a contiguous slice of the original buffer, so it is not possible to return a reference to it:

use nom::bytes::complete::tag;
use nom::sequence::{preceded, terminated};
use nom::IResult;

fn quoted_string(buf: &str) -> IResult<&str, String> {
    let qs = preceded(tag("\""), in_quotes);
    terminated(qs, tag("\""))(buf)
}

This function says that a quoted string starts with a quote and ends with a quote. All the real work is left to the in_quotes function:

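fn in_quotes(buf: &str) -> IResult<&str, String> {
    let mut ret = String::new();
    // True when the previous character was an escaping '\'.
    let mut skip_delimiter = false;
    for (i, ch) in buf.char_indices() {
        if ch == '\\' && !skip_delimiter {
            // Drop the backslash and treat the next character literally.
            skip_delimiter = true;
        } else if ch == '"' && !skip_delimiter {
            // Unescaped quote: stop here and leave it in the remaining input.
            return Ok((&buf[i..], ret));
        } else {
            ret.push(ch);
            skip_delimiter = false;
        }
    }
    // The input ended before a closing quote was found.
    Err(nom::Err::Incomplete(nom::Needed::Unknown))
}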


This second function loops over the input characters and drops each escaping '\'. The skip_delimiter flag marks the character after a '\' as literal, so an escaped '"' does not end the string, and if two '\' appear together the second one is kept in the result.
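
To try the escaping logic on its own, without pulling in nom, here is a minimal dependency-free sketch of the same loop. The names unescape_until_quote and quoted are hypothetical and not part of nom, and Option stands in for IResult, so treat it as an illustration of the technique rather than a drop-in replacement:

```rust
/// Hypothetical helper mirroring in_quotes: reads up to the first unescaped '"'.
/// Returns the remaining input (starting at that quote) and the unescaped text,
/// or None if the input ends before a closing quote is found.
fn unescape_until_quote(buf: &str) -> Option<(&str, String)> {
    let mut out = String::new();
    // True when the previous character was an escaping backslash.
    let mut escaped = false;
    for (i, ch) in buf.char_indices() {
        if escaped {
            // Whatever follows a backslash is taken literally.
            out.push(ch);
            escaped = false;
        } else if ch == '\\' {
            // Drop the backslash itself and remember it.
            escaped = true;
        } else if ch == '"' {
            // Unescaped quote: stop, leaving the quote in the remaining input.
            return Some((&buf[i..], out));
        } else {
            out.push(ch);
        }
    }
    None
}

/// Hypothetical wrapper mirroring quoted_string: strips the surrounding quotes.
fn quoted(buf: &str) -> Option<(&str, String)> {
    let inner = buf.strip_prefix('"')?;
    let (rest, text) = unescape_until_quote(inner)?;
    let rest = rest.strip_prefix('"')?;
    Some((rest, text))
}
```

With the input from the question, quoted(r#""Lorem, ipsum dolor \"sit\" amet!?""#) yields an empty remainder and the text Lorem, ipsum dolor "sit" amet!? with the backslashes removed. Note that, like in_quotes, this treats \n as a literal n and not as a newline.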


It works! Thanks a lot :+1: