How should I write a GTP parser generator?


#1

The Goal

I want to write a library that generates a parser for a Go Text Protocol (GTP) grammar, with these requirements in mind:

  1. Different programs support different subsets of the protocol, so it would be nice if you could generate a different parser for each one automatically.
  2. The protocol requires being able to print out all of the commands the current program supports. Command names can contain symbols like - and they’re not required to be snake case.
  3. I would like the parser to spit out a parse of each command as an enum, so I don’t have to match on &strs manually.

Current solutions for GTP in particular don’t really fit my use case because they already do IO. I would like to just parse the commands.

Each command looks like

(number space)? command-name (space argument)*

and commands are separated by LF.
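
A minimal sketch of splitting one such line, assuming the id is a leading unsigned integer and that fields are separated by whitespace. The function name split_line is made up for illustration and is not taken from any existing GTP library.

```rust
/// Splits one GTP line into (optional id, command name, arguments).
/// Returns None for an empty line or a line that contains only an id.
fn split_line(line: &str) -> Option<(Option<u32>, &str, Vec<&str>)> {
    let mut words = line.split_whitespace().peekable();
    // A leading word that parses as a number is treated as the id.
    let id = match words.peek().and_then(|w| w.parse::<u32>().ok()) {
        Some(n) => {
            words.next();
            Some(n)
        }
        None => None,
    };
    // The next word is the command name; everything after it is an argument.
    let name = words.next()?;
    Some((id, name, words.collect()))
}
```

Borrowing from the input keeps this stage allocation-light; turning the arguments into typed values is left to a later stage.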

Possible Design

Have the user of the library write a macro like:

gtp!(
    KnownCommand = "known_command"
    ListCommands = "list_commands"
    MyCommand = "my-command" int string
    Quit = "quit"
)

Then the macro should generate an enum like

enum Command {
    KnownCommand,
    ListCommands,
    MyCommand(i32, String),
    Quit,
}

Then, when passed a String like “1 quit”, the parser should return Ok(Command({id: Some(1), parse: Command::Quit})), and when passed a String like “my-command” it should return Err(ParseError::NotEnoughArguments) or something like that.

I should also be able to query the parser for the list of all commands. This means printing terminals: if I ask a parser for its list, it should print out “known_command\nlist_commands\nmy-command\nquit”. Similarly, when queried whether it knows the command “my-command”, it should return true.
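
To make the target concrete, here is a hand-written sketch of roughly what the macro would have to expand to for the four commands above. The Parsed struct, the extra ParseError variants, and the function names are assumptions made for illustration; only the Command enum and NotEnoughArguments come from the description.

```rust
#[derive(Debug, PartialEq)]
enum Command {
    KnownCommand,
    ListCommands,
    MyCommand(i32, String),
    Quit,
}

#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    UnknownCommand,
    NotEnoughArguments,
    BadArgument,
}

// Hypothetical wrapper pairing the optional id with the parsed command.
#[derive(Debug, PartialEq)]
struct Parsed {
    id: Option<u32>,
    command: Command,
}

// Protocol-level names (the terminals), kept separate from variant names.
const NAMES: [&str; 4] = ["known_command", "list_commands", "my-command", "quit"];

fn list_commands() -> String {
    NAMES.join("\n")
}

fn is_known(name: &str) -> bool {
    NAMES.contains(&name)
}

fn parse(line: &str) -> Result<Parsed, ParseError> {
    let mut words = line.split_whitespace().peekable();
    // A leading numeric word is the optional id.
    let id = words.peek().and_then(|w| w.parse::<u32>().ok());
    if id.is_some() {
        words.next();
    }
    let name = words.next().ok_or(ParseError::Empty)?;
    let command = match name {
        "known_command" => Command::KnownCommand,
        "list_commands" => Command::ListCommands,
        "my-command" => {
            // Declared as `int string`, so both arguments are required.
            let n = words.next().ok_or(ParseError::NotEnoughArguments)?;
            let s = words.next().ok_or(ParseError::NotEnoughArguments)?;
            let n = n.parse::<i32>().map_err(|_| ParseError::BadArgument)?;
            Command::MyCommand(n, s.to_string())
        }
        "quit" => Command::Quit,
        _ => return Err(ParseError::UnknownCommand),
    };
    Ok(Parsed { id, command })
}
```

Everything here is mechanical given the list of (variant, name, argument types), which is what makes it a candidate for generation.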

Any libraries to help with this?

I’ve looked at libraries that deal with parsers, but they seem so complicated, and I don’t know if any would actually help me or if they would just make the code more complicated. Are there any libraries that would help, and how would I use them?


#2

One option is to use nom to create your parser, and if you want users to be able to generate their own commands then you’d end up defining macros which create macros.

If it were me I’d be tempted to use code generation and a build script. You could do it in two stages where the first stage just transforms strings into some basic Command struct using a normal parser. Then the second stage uses a generated match statement (or equivalent) to turn the stringly-typed commands into your generated enum.
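
As a rough sketch of the code generation half, a build script could render the enum and the match statement as source text from a list of (variant, protocol name) pairs. The generate function and the from_name method it emits are hypothetical names, and argument handling is left out to keep the example short.

```rust
use std::fmt::Write;

/// Renders Rust source for an enum plus a name-to-variant lookup.
/// Each spec is (variant name, protocol name); arguments are not handled.
fn generate(enum_name: &str, specs: &[(&str, &str)]) -> String {
    let mut out = String::new();
    writeln!(out, "#[derive(Debug, PartialEq)]").unwrap();
    writeln!(out, "pub enum {} {{", enum_name).unwrap();
    for (variant, _) in specs {
        writeln!(out, "    {},", variant).unwrap();
    }
    writeln!(out, "}}\n").unwrap();
    writeln!(out, "impl {} {{", enum_name).unwrap();
    writeln!(out, "    pub fn from_name(name: &str) -> Option<Self> {{").unwrap();
    writeln!(out, "        match name {{").unwrap();
    for (variant, name) in specs {
        // {:?} prints the protocol name as a quoted string literal.
        writeln!(out, "            {:?} => Some(Self::{}),", name, variant).unwrap();
    }
    writeln!(out, "            _ => None,").unwrap();
    writeln!(out, "        }}\n    }}\n}}").unwrap();
    out
}
```

In a build script the returned string would typically be written to a file under the directory named by the OUT_DIR environment variable and pulled into the crate with include!.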

If you want some help with this let me know, I’ve played with parsers before and this sounds like quite an interesting challenge. Plus it might be a nice alternative to the existing Rust parser libraries if it can be easily generalised.


#3

I actually wrote a library that converts strings to enum variants already.

However, I changed my mind: I want to print out the terminal symbols in each rule instead of the enum variant names (because a rule’s string representation can contain a hyphen).

Can this be done with similar macro (ab)use, or is my more complicated design going to run into limitations? What would I use a build script for?

Every time I look at nom, I don’t feel I understand how to use it.

edit: before I lose this https://is.gd/SSrkOD

I would also have to pass in the name of the enum inside the macro, which isn’t a problem.


#4

I just tried implementing this and it was surprisingly easy!

First off I wrote a thing which breaks the provided line into an optional number followed by a vector of strings (i.e. the command name and its arguments, if any). This gets stored into a RawCommand.

pub struct RawCommand {
  pub count: Option<u32>,
  pub name: String,
  pub args: Vec<String>,
}

My parser implementation definitely isn’t the most elegant or performant, but considering this is more of a fun experiment for me I’m not super concerned.

Then I wrote a macro which takes in something that looks like an enum definition and will then create an enum that can be created using some_raw_command.into().

custom_command!(enum MyCommand {
  Play,
  ShowBoard,
});

fn main() {
  let line = "10 play black D5";
  let got: MyCommand = go_text_protocol::parse(line).unwrap();
  assert_eq!(got, MyCommand::Play);
}
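
For comparison, here is one way a macro_rules macro could pair each variant with an explicit protocol name, so that hyphenated names like the ones asked about earlier in the thread keep working. This is only a sketch and not the code from the repository: the macro name gtp_commands, the `Variant = "name"` syntax, and the generated NAMES, from_name and name items are all assumptions.

```rust
// Hypothetical macro: each entry pairs a variant with its protocol name.
macro_rules! gtp_commands {
    (enum $name:ident { $($variant:ident = $text:literal),* $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq)]
        enum $name {
            $($variant),*
        }

        impl $name {
            /// Protocol names in declaration order.
            const NAMES: &'static [&'static str] = &[$($text),*];

            /// Looks up a variant by its protocol name.
            fn from_name(name: &str) -> Option<Self> {
                match name {
                    $($text => Some($name::$variant),)*
                    _ => None,
                }
            }

            /// Returns the protocol name of this variant.
            fn name(&self) -> &'static str {
                match self {
                    $($name::$variant => $text),*
                }
            }
        }
    };
}

gtp_commands!(enum MyCommand {
    Play = "play",
    ShowBoard = "showboard",
    FinalScore = "final-score",
});
```

The commas between entries are deliberate: without a separator, macro_rules has trouble telling where one entry's trailing tokens end and the next variant begins.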

Next up I want to create a buffered reader which is generic over anything that implements Read and over C: From<RawCommand>, so you can then treat it as a stream of commands read from some source.
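
A possible shape for such a reader, sketched as an Iterator. The names CommandReader and parse_raw are hypothetical, the RawCommand here is a local copy of the struct above, and skipping blank lines is an assumption about the desired behaviour.

```rust
use std::io::{BufRead, BufReader, Lines, Read};
use std::marker::PhantomData;

#[derive(Debug, PartialEq)]
pub struct RawCommand {
    pub count: Option<u32>,
    pub name: String,
    pub args: Vec<String>,
}

// Hypothetical first-stage parser; returns None for a blank line.
fn parse_raw(line: &str) -> Option<RawCommand> {
    let mut words = line.split_whitespace().map(String::from).peekable();
    let count = words.peek().and_then(|w| w.parse::<u32>().ok());
    if count.is_some() {
        words.next();
    }
    let name = words.next()?;
    Some(RawCommand { count, name, args: words.collect() })
}

/// Yields one command of type C per non-blank line read from R.
pub struct CommandReader<R, C> {
    lines: Lines<BufReader<R>>,
    _marker: PhantomData<C>,
}

impl<R: Read, C: From<RawCommand>> CommandReader<R, C> {
    pub fn new(source: R) -> Self {
        CommandReader {
            lines: BufReader::new(source).lines(),
            _marker: PhantomData,
        }
    }
}

impl<R: Read, C: From<RawCommand>> Iterator for CommandReader<R, C> {
    type Item = std::io::Result<C>;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // Stop at end of input; pass IO errors through to the caller.
            let line = match self.lines.next()? {
                Ok(line) => line,
                Err(e) => return Some(Err(e)),
            };
            // Blank lines carry no command, so keep reading.
            if let Some(raw) = parse_raw(&line) {
                return Some(Ok(C::from(raw)));
            }
        }
    }
}
```

Because every type implements From for itself, CommandReader::<_, RawCommand>::new(source) also works when only the untyped commands are wanted.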

The thing which makes all of this work really nicely is how the parse() function is generic over anything which implements From<RawCommand>.
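
A minimal sketch of that idea, written from the description above rather than copied from the repository. The ParseError type and the Unknown variant are assumptions; the latter is there because From cannot fail, so unrecognised names need somewhere to go.

```rust
#[derive(Debug, PartialEq)]
pub struct RawCommand {
    pub count: Option<u32>,
    pub name: String,
    pub args: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
}

/// Parses one line into any type that can be built from a RawCommand.
pub fn parse<C: From<RawCommand>>(line: &str) -> Result<C, ParseError> {
    let mut words = line.split_whitespace().map(String::from).peekable();
    // A leading numeric word is the optional id.
    let count = words.peek().and_then(|w| w.parse::<u32>().ok());
    if count.is_some() {
        words.next();
    }
    let name = words.next().ok_or(ParseError::Empty)?;
    let raw = RawCommand { count, name, args: words.collect() };
    Ok(C::from(raw))
}

// Example target type; the Unknown variant is an assumption that keeps
// the conversion infallible, as From requires.
#[derive(Debug, PartialEq)]
enum MyCommand {
    Play,
    ShowBoard,
    Unknown(String),
}

impl From<RawCommand> for MyCommand {
    fn from(raw: RawCommand) -> Self {
        match raw.name.as_str() {
            "play" => MyCommand::Play,
            "showboard" => MyCommand::ShowBoard,
            other => MyCommand::Unknown(other.to_string()),
        }
    }
}
```

The caller picks the output type with an annotation such as `let got: MyCommand = parse(line)?;`, so each program can bring its own command enum without the parser knowing about it.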

All the code is on GitHub. The macro isn’t as nice as I’d like it to be, but it’s very much a work in progress.


#5

Not that easy. I started my own implementation of the macro:

I had to do it in a tuple to macro it up correctly.