Help with combine example?

Hi all,

I am doing advent of code to learn Rust basics. I'm using each problem set as a way to learn as much about Rust and popular crates as I can. Most of the problem sets require some basic parsing, and can usually be solved with .split() and .parse() from the stdlib but I'm interested in learning how to use the various parsing crates out there.

For this input

Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green

I'm wanting to parse (u64, Vec<Vec<(String, u64)>>) where the components I am extracting are highlighted in the following:

Game [1]: [[[3] [blue]], [[4] [red]]; [[1] [red]], [[2] [green]], [[6] [blue]]; [[2] [green]]]

So I wrote a little parser using combine because it's got a similar API to Haskell and Scala parsers I've used in the past. This is what I came up with

use combine::easy::ParseError;
use combine::parser::char::{char, digit, letter, space, spaces, string};
use combine::{many1, sep_by, sep_by1, EasyParser, Parser};
...
fn parse(input: &str) -> Result<(u64, Vec<Vec<(String, u64)>>), ParseError<&str>> {
    // Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green

    let space1 = space().skip(spaces());
    let num = many1(digit()).and_then(|s: String| s.parse::<u64>());
    let draw = num.skip(space1).and(many1(letter()));

    let parser = string("Game")
        .and(space1)
        .with(num)
        .skip(char(':'))
        .skip(space1)
        .and(sep_by(
            sep_by1(draw, char(',').and(space1)),
            char(';').and(space1),
        ));

    return Ok(parser.easy_parse(input)?.0);
}

but now I'm getting compiler errors that I don't know how to fix, the error being very long but starts:

   |
66 |     return Ok(parser.easy_parse(input)?.0);
   |                      ^^^^^^^^^^ the trait `Extend<(u64, _)>` is not implemented for `Vec<(String, u64)>`
   |

Normally in other languages I would define sub-parsers and enforce types on them, to help me isolate where I may have introduced a problem, and to help out with inference. How can I do that here to help debug this? e.g. I'd like to type

    let num: Parser<u64> = many1(digit()).and_then(|s: String| s.parse::<u64>());

but it wants me to provide lots of other type parameters that make this quite unwieldy to use

   |
53 |     let num: Parser<u64> = many1(digit()).and_then(|s: String| s.parse::<u64>());
   |              ^^^^^^^^^^^ help: specify the associated types: `Parser<u64, Output = Type, PartialState = Type>`

Your approach is sound here, you're actually very close! :slight_smile:

let num: Parser<u64> = ...

Needs to be (something like)

let num: Parser<u64, Output = Type, PartialState = Type> = ...

But filling in the Types properly. u64 looks like the Output to me, the rest I'd have to check the combine docs to know what they meant... but I expect we can let the compiler figure them out for us (Pro tip!)

let num: Parser<_, Output = u64, PartialState = _> = ...

Putting _ is basically telling rust "something goes here, you figure it out".

I'm not able to test right now what that outputs, but does it help you make progress?

1 Like

thanks for the help!

That gives me

53 | ...m: Parser<_, Output=u64, PartialState = _> = many1(digit()).and_then(|s: String| s.parse::<u64>(...
   |       ---------------------------------------   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `dyn Parser`, found `AndThen<Many1<..., ...>, ...>`
   |       |
   |       expected due to this
   |

which I think is because the Parser is the trait but I actually have an AndThen struct in hand. The types are definitely getting very complex at this point, what's really needed to do the kind of debugging I'm used to is a nice type alias where I can go

    let num: P<u64> = many1(digit()).and_then(|s: String| s.parse::<u64>());

but I'm not sure if that's possible because of the nesting and how that would collapse things with known sizes to be dynamic.

Ah of course, that makes sense. It is building up a complex nested set of types which will implement Parser.

I think the problem is probably around sep_by(), but I don't have a good way to debug that I'm afraid. The struct type is too complex but the trait object is too simple.

I expect that this Extend trait is used to build up the output into a Vec, and it is struggling to do that with all the nesting...

Might be worth looking into the docs for extend and/or sep_by and it's like in combine

1 Like

ah ok, well that is unfortunate. This sounds like a great example of an idea that works well in Haskell not translating to Rust, so I'll move onto the more popular nom and pest parsers as they sound like they might be more idiomatic.

I'm now retrying with nom and it has much the same API

use nom::{
    bytes::complete::*,
    character::complete::*,
    multi::separated_list1,
    //number::complete::u64,
    //    combinator::map_res,
    //    sequence::tuple,
    IResult,
    Parser,
};
...

fn parse(input: &str) -> IResult<&str, (u64, Vec<Vec<(String, u64)>>)> {
    // Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green

    // there doesn't seem to be a way to ignore the output of any sub-parser,
    // there is only "and". We could create helpers ~ <~ ~> to workaround this.

    let draw = u64.and(space1).and(alpha1);
    let game = separated_list1(tag(",").and(space0), draw);
    let games = separated_list1(tag(";").and(space0), game);
    let all = tag("Game").and(space1).and(u64).and(space0).and(games);

    Ok(all.parse(input)?)
}

Before I go ahead and implement things that ignore the left/right side... is there something I missed from the API docs? I thought I read them quite thoroughly.

fn parse(input: &str) -> IResult<&str, (u64, Vec<Vec<(String, u64)>>)> {
    // Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green

    // there doesn't seem to be a way to ignore the side of an "and"
    let left = |(a, _)| a;
    let right = |(_, b)| b;
    let swap = |(a, b)| (b, a);

    let draw = u64
        .and(space1)
        .map(left)
        .and(alpha1.map(|s: &str| s.to_string()))
        .map(swap);
    let game = separated_list1(tag(",").and(space0), draw);
    let games = separated_list1(tag(";").and(space0), game);

    let mut all = tag("Game")
        .and(space1)
        .and(u64)
        .map(right)
        .and(tag(":")).map(left)
        .and(space0).map(left)
        .and(games);

    Ok(all.parse(input)?)
}

but that left / right stuff is nasty!

Found it, that's preceded in nom::sequence - Rust and terminated in nom::sequence - Rust

It would be nice if they were available as methods on Parser but this is fine.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.