granit-parser parses both YAML and JSON and provides span information for every event. The crate handles surrogate pairs and other edge cases where YAML and JSON have traditionally differed, and it reports positions in both char-based and byte-based coordinates. No hacks needed.
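To illustrate the surrogate-pair point: JSON writes a character outside the Basic Multilingual Plane as two escapes, for example "\uD83D\uDE00", while YAML can write the same character as a single "\U0001F600" escape. A parser that accepts both has to join the two halves into one character. The sketch below shows only that joining step, using the Rust standard library. `combine_surrogates` is a hypothetical helper for illustration, not part of granit-parser's API, and it is not a claim about how the crate does this internally.

```rust
/// Combine a UTF-16 surrogate pair (as written in JSON `\uXXXX\uXXXX` escapes)
/// into one Unicode scalar value. Returns None if the two units do not form
/// exactly one valid character.
/// Hypothetical helper for illustration; not granit-parser's API.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    let mut decoded = char::decode_utf16([high, low]);
    match (decoded.next(), decoded.next()) {
        // A valid pair decodes to a single char and then the iterator ends.
        (Some(Ok(c)), None) => Some(c),
        // Two separate chars, or an unpaired surrogate: not a pair.
        _ => None,
    }
}
```

For example, `combine_surrogates(0xD83D, 0xDE00)` gives the same character as the YAML escape `\U0001F600`, while a lone high surrogate followed by an ordinary character is rejected.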
If your goal is an exact error location while doing plain serde deserialization, you can also use serde-saphyr, which builds on top of granit-parser and reports the precise error location with a snippet, the same way the Rust compiler does. It does not report locations for tokens that have no error, but you can combine it with garde or validator, which may cover your domain-specific cases. If you use very aggressive Serde renames or flattening, Spanned may be required for domain-specific diagnostics.
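To make "location with a snippet" concrete, here is a minimal sketch of how a line/column position can be turned into a compiler-style message. It uses only the standard library and assumes a 1-based line and a 0-based column counted in characters. `render_snippet` and its output layout are hypothetical; this is not serde-saphyr's actual rendering, just the general idea.

```rust
/// Render a compiler-style snippet for a position given as a 1-based line
/// and a 0-based column counted in characters.
/// Hypothetical helper for illustration; not serde-saphyr's API or format.
fn render_snippet(source: &str, line: usize, col: usize, message: &str) -> String {
    // Fall back to an empty line if the position is past the end of input.
    let text = source.lines().nth(line.saturating_sub(1)).unwrap_or("");
    let gutter = line.to_string();
    // Blank gutter of the same width, used on lines without a line number.
    let pad = " ".repeat(gutter.len());
    format!(
        "error: {message}\n{pad} --> {line}:{col}\n{pad} |\n{gutter} | {text}\n{pad} | {caret}^\n",
        caret = " ".repeat(col),
    )
}
```

Calling `render_snippet("name: Alice\nage: x\n", 2, 5, "invalid number")` places the caret under the `x` on the second line. The caret is aligned by character count, so tabs and double-width characters would need extra handling in a real renderer.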
Here is how to get spans from granit-parser:
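use granit_parser::Parser; // assumes Parser is exported at the crate root

// Note the two-space indentation under `items:`; the offsets in the output depend on it.
let yaml = "name: Alice\nitems:\n  - book\n  - pen\n";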
for next in Parser::new_from_str(yaml) {
    // `?` requires an enclosing function that returns a compatible Result.
    let (event, span) = next?;
    // Slice the original text by byte range; empty when no range is available.
    let source = span
        .byte_range()
        .map(|range| &yaml[range])
        .unwrap_or_default();
    println!(
        "{event:?}: chars={}..{}, bytes={:?}, start={}:{}, end={}:{}, indent={:?}, source={source:?}",
        span.start.index(),
        span.end.index(),
        span.byte_range(),
        span.start.line(),
        span.start.col(),
        span.end.line(),
        span.end.col(),
        span.indent,
    );
}
The output:
StreamStart: chars=0..0, bytes=Some(0..0), start=1:0, end=1:0, indent=None, source=""
DocumentStart(false): chars=0..0, bytes=Some(0..0), start=1:0, end=1:0, indent=None, source=""
MappingStart(0, None): chars=0..0, bytes=Some(0..0), start=1:0, end=1:0, indent=None, source=""
Scalar("name", Plain, 0, None): chars=0..4, bytes=Some(0..4), start=1:0, end=1:4, indent=Some(0), source="name"
Scalar("Alice", Plain, 0, None): chars=6..11, bytes=Some(6..11), start=1:6, end=1:11, indent=None, source="Alice"
Scalar("items", Plain, 0, None): chars=12..17, bytes=Some(12..17), start=2:0, end=2:5, indent=Some(0), source="items"
SequenceStart(0, None): chars=21..21, bytes=Some(21..21), start=3:2, end=3:2, indent=None, source=""
Scalar("book", Plain, 0, None): chars=23..27, bytes=Some(23..27), start=3:4, end=3:8, indent=None, source="book"
Scalar("pen", Plain, 0, None): chars=32..35, bytes=Some(32..35), start=4:4, end=4:7, indent=None, source="pen"
SequenceEnd: chars=36..36, bytes=Some(36..36), start=5:0, end=5:0, indent=None, source=""
MappingEnd: chars=36..36, bytes=Some(36..36), start=5:0, end=5:0, indent=None, source=""
DocumentEnd: chars=36..36, bytes=Some(36..36), start=5:0, end=5:0, indent=None, source=""
StreamEnd: chars=36..36, bytes=Some(36..36), start=5:0, end=5:0, indent=None, source=""
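In this output the chars and bytes values are identical only because the sample is pure ASCII. As soon as the input contains multi-byte UTF-8 characters the two diverge, and slicing a Rust `&str` needs byte offsets, which is why the snippet slices with the byte range. The sketch below shows the relationship using only the standard library. `char_range_to_byte_range` is a hypothetical helper, not part of granit-parser, which already gives you both coordinate systems.

```rust
use std::ops::Range;

/// Convert a half-open range of character indices into the matching byte
/// range of `text`. Returns None if the range is reversed or out of bounds.
/// Hypothetical helper for illustration; not granit-parser's API.
fn char_range_to_byte_range(text: &str, chars: Range<usize>) -> Option<Range<usize>> {
    if chars.start > chars.end {
        return None;
    }
    // Byte offset of every char boundary, including the end of the string.
    let mut boundaries = text
        .char_indices()
        .map(|(byte_index, _)| byte_index)
        .chain(std::iter::once(text.len()));
    let start = boundaries.nth(chars.start)?;
    let end = if chars.end == chars.start {
        start
    } else {
        // `nth` already consumed boundaries up to and including `start`.
        boundaries.nth(chars.end - chars.start - 1)?
    };
    Some(start..end)
}
```

With the input `"name: Zoë\nok: 1\n"`, the key `ok` sits at chars 10..12 but at bytes 11..13, because `ë` takes two bytes in UTF-8. Indexing the string with the char range would return the wrong text, or panic if an offset landed inside a multi-byte character.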