I want to parse text into a struct with a lot of optional fields. To make things easier, I would like to use an existing parser combinator crate.
Problem:
All crates I looked at, including nom, combine, glue, pom, etc., only offer "output-only" parsers, i.e. each parser is a function Fn*(Input) -> Result<(Input, Output), Error>. With this, I would first need to parse the fields into an intermediate type before I can create the final struct from them. For example:
// The struct we want to parse into.
#[derive(Default)]
struct Struct {
    field1: Option<String>,
    field2: Option<String>,
    // ...
}
// This enum is just boilerplate.
enum StructField {
    Field1(String),
    Field2(String),
    // ...
}
// And this implementation of From is unnecessary as well.
impl From<Vec<StructField>> for Struct {
    fn from(struct_fields: Vec<StructField>) -> Self {
        let mut result = Self::default();
        for struct_field in struct_fields {
            match struct_field {
                StructField::Field1(value) => result.field1 = Some(value),
                StructField::Field2(value) => result.field2 = Some(value),
                // ...
            }
        }
        result
    }
}
// Here the actually meaningful code starts.
// For simplicity I ignore all error handling in this example.
fn parse_struct(input: Input) -> (Input, Struct) {
    let (input, fields) = nom::multi::many0(parse_field)(input);
    (input, fields.into())
}
fn parse_field(input: Input) -> (Input, StructField) {
    alt((parse_field1, parse_field2 /*, ...*/))(input)
}
fn parse_field1(input: Input) -> (Input, StructField) {
    let value = /* parse field1 ... */;
    (input, StructField::Field1(value))
}
fn parse_field2(input: Input) -> (Input, StructField) {
    let value = /* parse field2 ... */;
    (input, StructField::Field2(value))
}
Things that don't work:
I tried to circumvent this by using (RealInput, &mut Output) as the Input type in nom, but many combinators (rightfully) require Input: Clone, and a tuple containing a &mut reference cannot implement Clone, which makes this approach impossible. I expect that all the other listed parser combinator crates have the same limitation, as they all offer methods similar to many0. Such methods allow an inner parser to fail, and thus need to back up (clone) the input before passing it to the fallible parser.
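To illustrate why a many0-style combinator needs Input: Clone, here is a minimal sketch. It is a simplified, hypothetical stand-in written for this question, not nom's actual implementation; the names many0 and letter_a and the () error type are assumptions made for the example.

```rust
// Hypothetical, simplified `many0`-style combinator (not nom's real code).
// A parser takes the input by value and returns the rest plus one output.
fn many0<I, O, P>(mut parser: P) -> impl FnMut(I) -> (I, Vec<O>)
where
    I: Clone,
    P: FnMut(I) -> Result<(I, O), ()>,
{
    move |mut input: I| {
        let mut outputs = Vec::new();
        loop {
            // The inner parser consumes `input` by value. If it fails, the
            // untouched input must still be returned, so a backup is needed.
            let backup = input.clone();
            match parser(input) {
                Ok((rest, output)) => {
                    outputs.push(output);
                    input = rest;
                }
                Err(()) => return (backup, outputs),
            }
        }
        // Note: this sketch has no guard against parsers that succeed
        // without consuming any input.
    }
}

// Example inner parser: accepts a single leading 'a'.
fn letter_a(input: &str) -> Result<(&str, char), ()> {
    match input.strip_prefix('a') {
        Some(rest) => Ok((rest, 'a')),
        None => Err(()),
    }
}
```

Calling many0(letter_a)("aab") collects the two leading letters and hands back the rest of the input. With an input type such as (RealInput, &mut Output), the input.clone() call cannot compile, because &mut references are not Clone.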
Ideal solution:
If a crate additionally offered parsers structured as Fn*(Input, &mut Output) -> Result<Input, Error> or Fn*(Input, Output) -> Result<(Input, Output), Error>, it would be much easier to parse the struct with optional fields: I could first create the struct with all fields set to None, and then populate the fields as they appear in the input. I would not need the enum boilerplate, and the code would be much shorter. I could also handle errors (illegally formatted struct field values) directly where they appear, as opposed to handling them only when combining the fields into a struct.
Example of the ideal solution:
#[derive(Default)]
struct Struct {
    field1: Option<String>,
    field2: Option<String>,
    // ...
}
// For simplicity I ignore all error handling in this example.
// This method has the structure of a "classic" parser.
fn parse_struct(input: Input) -> (Input, Struct) {
    let mut result = Struct::default();
    // `nom::ref_mut_many0` does not exist; it stands for the proposed combinator.
    let input = nom::ref_mut_many0(parse_fields)(input, &mut result);
    (input, result)
}
// This and the following methods have the structure of the proposed parsers.
fn parse_fields(input: Input, output: &mut Struct) -> Input {
    alt((parse_field1, parse_field2 /*, ...*/))(input, output)
}
fn parse_field1(input: Input, output: &mut Struct) -> Input {
    output.field1 = /* parse field1 ... */;
    input
}
fn parse_field2(input: Input, output: &mut Struct) -> Input {
    output.field2 = /* parse field2 ... */;
    input
}
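For reference, the interface above can be sketched without any crate. Everything in this sketch is an assumption made for illustration: the helpers many0_mut, alt_mut and key_value are hypothetical, the input is a plain &str, the error type is (), and the field syntax is assumed to be `name=value;`.

```rust
#[derive(Debug, Default, PartialEq)]
struct Struct {
    field1: Option<String>,
    field2: Option<String>,
}

// Shape of the proposed parsers: consume input, write into the output.
type MutParser = for<'a> fn(&'a str, &mut Struct) -> Result<&'a str, ()>;

// Hypothetical helper: parses `<key>=<value>;` and returns (rest, value).
fn key_value<'a>(input: &'a str, key: &str) -> Result<(&'a str, &'a str), ()> {
    let after_key = input.strip_prefix(key).ok_or(())?;
    let after_eq = after_key.strip_prefix('=').ok_or(())?;
    let (value, rest) = after_eq.split_once(';').ok_or(())?;
    Ok((rest, value))
}

// Each field parser writes to `output` only after it has fully succeeded,
// so a failed alternative leaves `output` untouched.
fn parse_field1<'a>(input: &'a str, output: &mut Struct) -> Result<&'a str, ()> {
    let (rest, value) = key_value(input, "field1")?;
    output.field1 = Some(value.to_owned());
    Ok(rest)
}

fn parse_field2<'a>(input: &'a str, output: &mut Struct) -> Result<&'a str, ()> {
    let (rest, value) = key_value(input, "field2")?;
    output.field2 = Some(value.to_owned());
    Ok(rest)
}

const FIELD_PARSERS: [MutParser; 2] = [parse_field1, parse_field2];

// Hypothetical counterpart of `alt`: returns the first successful result.
fn alt_mut<'a>(
    parsers: &[MutParser],
    input: &'a str,
    output: &mut Struct,
) -> Result<&'a str, ()> {
    for &parser in parsers {
        if let Ok(rest) = parser(input, &mut *output) {
            return Ok(rest);
        }
    }
    Err(())
}

// Hypothetical counterpart of `many0`: applies `parser` until it fails.
// `&str` is `Copy`, so no explicit backup of the input is needed here.
fn many0_mut<'a>(parser: MutParser, mut input: &'a str, output: &mut Struct) -> &'a str {
    while let Ok(rest) = parser(input, &mut *output) {
        if rest.len() == input.len() {
            break; // no progress; avoid looping forever
        }
        input = rest;
    }
    input
}

fn parse_fields<'a>(input: &'a str, output: &mut Struct) -> Result<&'a str, ()> {
    alt_mut(&FIELD_PARSERS, input, output)
}

fn parse_struct(input: &str) -> (&str, Struct) {
    let mut result = Struct::default();
    let rest = many0_mut(parse_fields, input, &mut result);
    (rest, result)
}
```

With this sketch, parse_struct("field2=b;field1=a;tail") fills both fields regardless of their order and leaves "tail" as the remaining input.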
Does anyone know a parser combinator crate that offers an interface like this? If not, is there a reason why not, e.g. my described "ideal solution" being flawed in some way I do not see?