Parsing space-separated hex values in nom

I am trying to parse space-separated hex values into a Vec<u8> with nom, but I can't figure out how to do it.

Hex
0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff

Result
[0, 1, 3, 3, 10, 13, 27, 93, 24, 24, 24, 26, 4, 19, 127, 128, 254]

Other format
Hex
<B[18] 0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff>

I have tried this for hex_to_u8:

fn hex_to_u8(input: &str) -> IResult<&str, u8> {
    let (input, _) = tag("0x")(input)?;
    let (input, hex) = take_until(" ")(input)?;
    let (input, _) = tag(" ")(input)?;

    Ok((input, u8::from_str_radix(hex, 16).unwrap()))
}

Rust Playground link

// parse hex value 0x00 to 0xff into u8 value, which are separated by space, using nom
use nom::{
    bytes::complete::{tag, take_until},
    character::complete::multispace0,
    combinator::map_res,
    multi::many1,
    IResult,
};

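fn hex_to_u8(input: &str) -> IResult<&str, u8> {
    // Attempt from above: "0x", the digits up to the next space, then the space.
    let (input, _) = tag("0x")(input)?;
    let (input, hex) = take_until(" ")(input)?;
    let (input, _) = tag(" ")(input)?;

    Ok((input, u8::from_str_radix(hex, 16).unwrap()))
}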

fn parser(input: &str) -> IResult<&str, Vec<u8>> {
    let (input, result) = many1(hex_to_u8)(input)?;
    Ok((input, result))
}

fn bin_parser(input: &str) -> IResult<&str, Vec<u8>> {
    let (input, _) = multispace0(input)?;
    let (input, _) = tag("<B")(input)?;
    let (input, _) = multispace0(input)?;

    let (input, _) = tag("[")(input)?;
    let (input, _) = take_until("]")(input)?;
    let (input, _) = tag("]")(input)?;

    let (input, _) = multispace0(input)?;

    let (input, result) = many1(hex_to_u8)(input)?;

    let (input, _) = multispace0(input)?;

    let (input, _) = tag(">")(input)?;

    Ok((input, result))
}

fn main() {
    let input =
        "0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff";
    let (_, result) = parser(input).unwrap();
    println!("{:?}", result);

    let bin = "<B[18] 0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff>";
    let (_, result) = bin_parser(bin).unwrap();
    println!("{:?}", result);

    let bin = "<B 0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff>";
    let (_, result) = bin_parser(bin).unwrap();
    println!("{:?}", result);

    let bin = "<B[0]>";
    let (_, result) = bin_parser(bin).unwrap();
    println!("{:?}", result);

    let bin = "<B>";
    let (_, result) = bin_parser(bin).unwrap();
    println!("{:?}", result);
}

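You didn't say what problem you're having, so I can only guess. I think the problem is in hex_to_u8, which requires a space after every value. The final 0xff has no trailing space, so take_until(" ") fails there and many1 stops one value short. I made the space optional.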

fn hex_to_u8(input: &str) -> IResult<&str, u8> {
    let (input, _) = tag("0x")(input)?;
    let (input, hex) = alphanumeric1(input)?;
    let (input, _) = opt(tag(" "))(input)?;

    Ok((input, u8::from_str_radix(hex, 16).unwrap()))
}
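To make the trailing-space issue concrete, here is a rough std-only sketch (no nom involved). The names hex_strict, hex_lenient and this many1 are hypothetical stand-ins made up for illustration, not nom's API. They only mimic, on this particular input, what take_until(" ") followed by tag(" "), alphanumeric1 followed by opt(tag(" ")), and many1 do.

```rust
/// Stand-in for the original hex_to_u8: "0x", everything up to the next
/// space, then the space itself. Returns (rest, value), or None on failure.
fn hex_strict(input: &str) -> Option<(&str, u8)> {
    let rest = input.strip_prefix("0x")?;
    let end = rest.find(' ')?; // like take_until(" "): fails if no space follows
    let value = u8::from_str_radix(&rest[..end], 16).ok()?;
    Some((&rest[end + 1..], value))
}

/// Stand-in for the revised hex_to_u8: the trailing space is optional.
fn hex_lenient(input: &str) -> Option<(&str, u8)> {
    let rest = input.strip_prefix("0x")?;
    let end = rest
        .find(|c: char| !c.is_ascii_alphanumeric())
        .unwrap_or(rest.len());
    let value = u8::from_str_radix(&rest[..end], 16).ok()?;
    let rest = &rest[end..];
    Some((rest.strip_prefix(' ').unwrap_or(rest), value))
}

/// Stand-in for many1: apply `item` until it fails; need at least one success.
fn many1<'a>(
    mut input: &'a str,
    item: fn(&'a str) -> Option<(&'a str, u8)>,
) -> Option<(&'a str, Vec<u8>)> {
    let mut out = Vec::new();
    while let Some((rest, value)) = item(input) {
        out.push(value);
        input = rest;
    }
    if out.is_empty() { None } else { Some((input, out)) }
}

fn main() {
    let input =
        "0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff";

    // Strict version: 17 values, and "0xff" is left unparsed.
    let (rest, strict) = many1(input, hex_strict).unwrap();
    println!("strict:  {} values, leftover {:?}", strict.len(), rest);

    // Lenient version: all 18 values, nothing left over.
    let (rest, lenient) = many1(input, hex_lenient).unwrap();
    println!("lenient: {} values, leftover {:?}", lenient.len(), rest);
}
```

The strict variant stops after 254 because the last token is not followed by a space, which matches the 17-element result in the question.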

The third test case, starting with "<B 0x00", still fails, and since you didn't give the expected output, I don't know what to do here.

Rust playground link with test code

// parse hex value 0x00 to 0xff into u8 value, which are separated by space, using nom
use nom::{
    bytes::complete::{tag, take_until},
    character::complete::multispace0,
    combinator::map_res,
    multi::many1,
    IResult,
};

enum Type {
    BIN,
}

pub struct Item {
    pub ty: Type,
    pub data: Option<Vec<u8>>,
}

fn hex_to_u8(input: &str) -> IResult<&str, Vec<u8>> {
    let (input, _) = multispace0(input)?;
    let (input, hex) = take_until(" ")(input)?;
    let (input, _) = multispace0(input)?;

    let result = hex
        .trim()
        .split(" ")
        .into_iter()
        .map(|x| u8::from_str_radix(&x[2..], 16).unwrap())
        .into_iter();

    Ok((input, result.collect()))
}

fn parser(input: &str) -> IResult<&str, Option<Vec<u8>>> {
    let (input, result) = many1(hex_to_u8)(input)?;
    Ok((input, Some(result.into_iter().flatten().collect())))
}

fn bin_parser(input: &str) -> IResult<&str, Vec<u8>> {
    let (input, _) = multispace0(input)?;
    let (input, _) = tag("<B")(input)?;
    let (input, _) = multispace0(input)?;

    let (input, _) = tag("[")(input)?;
    let (input, _) = take_until("]")(input)?;
    let (input, _) = tag("]")(input)?;

    let (input, _) = multispace0(input)?;

    let (input, result) = many1(hex_to_u8)(input)?;

    let (input, _) = multispace0(input)?;

    let (input, _) = tag(">")(input)?;

    Ok((input, result.into_iter().flatten().collect()))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test() {
        let input =
            "0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff";
        let (_, result) = parser(input).unwrap();

        let bin = Item {
            ty: Type::BIN,
            data: Some(vec![
                0x00, 0x01, 0x03, 0x03, 0x0a, 0x0d, 0x1b, 0x5d, 0x18, 0x18, 0x18, 0x1a, 0x04, 0x13,
                0x7f, 0x80, 0xfe, 0xff,
            ]),
        };

        assert_eq!(result, bin.data);

        let bin = "<B[18] 0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff>";
        let (_, result) = bin_parser(bin).unwrap();

        let bin = Item {
            ty: Type::BIN,
            data: Some(vec![
                0x00, 0x01, 0x03, 0x03, 0x0a, 0x0d, 0x1b, 0x5d, 0x18, 0x18, 0x18, 0x1a, 0x04, 0x13,
                0x7f, 0x80, 0xfe, 0xff,
            ]),
        };

        assert_eq!(result, bin.data.unwrap());

        let bin = "<B 0x00 0x01 0x03 0x03 0x0a 0x0d 0x1b 0x5d 0x18 0x18 0x18 0x1a 0x04 0x13 0x7f 0x80 0xfe 0xff>";
        let (_, result) = bin_parser(bin).unwrap();

        let bin = Item {
            ty: Type::BIN,
            data: Some(vec![
                0x00, 0x01, 0x03, 0x03, 0x0a, 0x0d, 0x1b, 0x5d, 0x18, 0x18, 0x18, 0x1a, 0x04, 0x13,
                0x7f, 0x80, 0xfe, 0xff,
            ]),
        };

        assert_eq!(result, bin.data.unwrap());

        let bin = "<B[0]>";
        let (_, result) = bin_parser(bin).unwrap();

        let bin = Item {
            ty: Type::BIN,
            data: Some(vec![]),
        };

        assert_eq!(result, bin.data.unwrap());

        let bin = "<B>";
        let (_, result) = bin_parser(bin).unwrap();

        let bin = Item {
            ty: Type::BIN,
            data: None,
        };

        assert_eq!(result, bin.data.unwrap());
    }
}

One can't write a parser from examples, and unit tests are not a specification. Please describe your data format in words and explain the meaning of each part, for example the [18]. I don't feel comfortable writing more code because I'm unsure what you actually require, and I don't want to deliver code that only works by accident.

Here’s a simple version that basically works (playground):

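// Imports assume nom 7 (function-style combinators).
use nom::{
    bytes::complete::tag,
    character::complete::{digit1, hex_digit1, multispace0},
    combinator::{map, opt},
    multi::many0,
    sequence::{delimited, preceded, tuple},
    IResult,
};

pub fn bin_parser(input: &str) -> IResult<&str, Vec<u8>> {
    // Optional length field such as "[18]"; it is parsed but never checked.
    let length = delimited(tag("["), digit1, tag("]"));

    // One byte: "0x" followed by hex digits. The unwrap panics if the value
    // does not fit in a u8 (for example "0x100").
    let byte = map(
        preceded(tag("0x"), hex_digit1),
        |s| u8::from_str_radix(s, 16).unwrap()
    );

    // "<B", optional length, zero or more bytes, ">", with optional
    // whitespace before each part.
    let mut bin_item = delimited(
        preceded(multispace0, tag("<B")),
        tuple((
            opt(preceded(multispace0, length)),
            many0(preceded(multispace0, byte)),
        )),
        preceded(multispace0, tag(">"))
    );

    let (input, (_length, bytes)) = bin_item(input)?;
    Ok((input, bytes))
}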

But you should really consider what sort of validation you want. This ignores the “length” field, and it doesn’t place any restrictions on the number of hex digits per hex number.
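As a rough idea of what such validation could look like, here is a std-only sketch, separate from the nom parser. It rests on two assumptions that the thread does not confirm: that every value must be exactly two hex digits, and that the number in brackets is the byte count. The helper names parse_byte and check_length are made up for illustration.

```rust
/// Parse one token such as "0x7f", insisting on exactly two hex digits.
/// ASSUMPTION: the format requires two digits per byte.
fn parse_byte(token: &str) -> Result<u8, String> {
    let digits = token
        .strip_prefix("0x")
        .ok_or_else(|| format!("missing 0x prefix: {token:?}"))?;
    if digits.len() != 2 || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("expected exactly two hex digits: {token:?}"));
    }
    // Cannot overflow after the check above, but propagate errors anyway
    // instead of unwrapping.
    u8::from_str_radix(digits, 16).map_err(|e| e.to_string())
}

/// Compare the parsed bytes with the declared length, if one was given.
/// ASSUMPTION: the bracketed number is the number of bytes that follow.
fn check_length(declared: Option<usize>, bytes: &[u8]) -> Result<(), String> {
    match declared {
        Some(n) if n != bytes.len() => {
            Err(format!("declared {n} bytes, found {}", bytes.len()))
        }
        _ => Ok(()),
    }
}

fn main() {
    // Collecting into Result stops at the first invalid token.
    let bytes: Result<Vec<u8>, String> = "0x00 0x7f 0xff"
        .split_whitespace()
        .map(parse_byte)
        .collect();
    println!("{:?}", bytes);

    println!("{:?}", parse_byte("0x100")); // rejected: three digits
    println!("{:?}", check_length(Some(18), &[0u8; 17])); // rejected: count mismatch
}
```

Without a digit-count check, a token like 0x100 gets as far as u8::from_str_radix, which returns an error for it, so an unwrap there would panic. In a nom parser the same checks could go where the parsed digits are converted to a u8, returning a parse error instead of unwrapping.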

If you don’t understand this code, please let me know and I can explain it more. I think that if you learn this way of using nom, your parsers will be easier to write and understand and modify.
