Hello everyone,
I have a very nice piece of code using serde that works great. However, it has an issue that is causing more and more trouble as time goes on: terrible error messages.
The root of the problem is that the specification I'm trying to parse uses several alternative formats. Here is the type hierarchy I implemented to do this, from deepest to shallowest:
#[derive(Copy, Clone, Debug, PartialEq, PartialOrd, Deserialize)]
#[serde(try_from = "String")] // calls to a custom TryFrom<String> impl
pub struct TimeRange(f32, f32);
#[derive(Debug, Deserialize)]
pub(crate) struct FullInfo {
/* list of fields, including "key" and "time range" */
}
#[derive(Debug, Deserialize)]
#[serde(untagged)]
pub(crate) enum ClipInfo {
/// The full format: a bunch of specified options, all in the struct fields.
Full(FullInfo),
/// The compact format: (key, timerange) pairs with all other options defaulted.
Compact(HashMap<String, TimeRange>),
}
/// The top-level file format. A single map of (filename, info) pairs.
#[derive(Debug)]
/* manual Deserialize impl as well, to e.g. verify path existence */
pub(crate) struct EditDecisionFile(HashMap<String, Vec<ClipInfo>>);
As you can see, the format of the "compact" and "full" is not the same. Seems perfect for an untagged enum, right?
It is, except error messages are ruined. That union means if anything anywhere inside has a parsing error -- a bad time range, a non-string, an accidental sequence instead of a map, a formatting mistake -- I only ever get one error message: data did not match any variant of untagged enum ClipInfo
, with the pointer for line and column at the beginning of the item.
I have tried many things, but I always hit a wall.
- I do not know how to "fold together" the different formats of items into a single struct when the compact format has arbitrary keys.
- I used to manually parse out all the fields in the "full" format, with an
extra
field to detect if it was really compact format instead. That was a lot of work and error prone. The untagged union seems literally designed to avoid that. - I could add an
Error
variant to theenum
, but that would get messy. In particular, I would need more than one, as different "successfully parsed incorrect value" types would be needed for different errors (e.g. "not a map" requiresString
, while "invalid timerange value" requiresHashMap<String, String>
). - I thought
#[serde(other)]
could solve this problem, but it doesn't work on untagged enums.
Finally, I even tried resorting to implementing my own untagged union, where I try to parse the full version and then fall back to the Compact version -- passing any error messages trying to parse that back up. But I can't do that, because serde doesn't expose the fallible construct inside the untagged enum logic that I need. Something like:
let unp: UnparsedValue = map.try_next_value()?; // error on EOF, syntax error, etc.
if let Ok(value) = unp.try_parse_as::<FullInfo>() {
ClipInfo::Full(value)
} else {
ClipInfo::Compact(unp.try_parse_as::<HashMap<String, TimeRange>>()?)
}
Any guidance on how I approach this? Clearly a redesign is needed, but what?