Hey folks! I am looking for some advice on how to handle a situation.
I am writing some tests that take some TOML input. One of the fields in the TOML is a path. The issue is that TOML treats a backslash inside a basic (double-quoted) string as the start of an escape sequence (\uXXXX, for example, is a Unicode escape), so Windows paths don't work. I am currently doing something like this:
let config = format!(
    r#"
path = {:?}
"#,
    path
);
which abuses the fact that Debug formatting:
- includes the surrounding ""s
- turns \ into \\
This works, but feels kind of gross. Does anyone have any advice for doing this in a better way? Thanks!
(There are more keys and values than just path, hence the raw string; I've elided those parts as they're not that important.)
This is testing some code that uses a proper deserializer; I think the test works a bit better by having the raw values, because if we used a serializer here, then we're just testing a round trip, rather than the behavior.
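For anyone following along, here is a minimal sketch of what the `{:?}` trick produces. It assumes the path is available as a `&str`, and `config_for` is a made-up helper name, not something from the original test.

```rust
/// Builds a one-key TOML snippet, relying on Debug formatting to add the
/// surrounding quotes and double the backslashes.
/// `config_for` is a hypothetical helper used only for illustration.
fn config_for(path: &str) -> String {
    format!("path = {:?}\n", path)
}
```

For an ordinary Windows path such as `C:\Users\me\cfg` this yields `path = "C:\\Users\\me\\cfg"`, which is a valid TOML basic string. Debug formatting uses Rust's escape syntax, though, so unusual characters (most control characters, for instance) can come out as `\u{...}`, which TOML does not accept.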
If you don't want to write test data by hand but don't trust your serializer, you can use snapshot tests (e.g. insta). Serialize the data using your serializer, confirm that the result is valid, and tests will alert you if the serializer's behavior changes in the future.
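You could use str::escape_default, whose escaping is almost compatible with TOML's escape sequences. You'll need to do a little more work if you want to support paths containing non-ASCII characters, single quotes, or control characters other than tab, newline, and carriage return: those come out as \u{...} or \', which TOML does not accept.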
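A small sketch of that approach, to show where it lines up with TOML and where it doesn't. `quote_with_escape_default` is a hypothetical helper name; `str::escape_default` itself is in the standard library.

```rust
/// Wraps `s` in double quotes after escaping it with `str::escape_default`.
/// `quote_with_escape_default` is a hypothetical helper for illustration.
/// The result is valid TOML only for inputs where Rust's and TOML's escape
/// syntax agree (printable ASCII without single quotes, plus \t, \n, \r).
fn quote_with_escape_default(s: &str) -> String {
    format!("\"{}\"", s.escape_default())
}
```

With this, `C:\temp\x` becomes `"C:\\temp\\x"`, which TOML accepts. On the other hand `é` becomes `"\u{e9}"` and `it's` becomes `"it\'s"`, neither of which is a valid TOML escape, so those cases need separate handling.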
You should also be able to just use forward slashes. All of the Win32 file handling functions accept forward slashes. IIRC it's cmd.exe that doesn't like forward slashes, so this won't work if you're passing those paths to a system()-like call that invokes cmd, but otherwise they should work fine.
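If you go the forward-slash route, the conversion can be as small as the sketch below. It assumes the path is a `&str`, and `to_forward_slashes` is a hypothetical helper name.

```rust
/// Rewrites backslash separators as forward slashes so the path can be placed
/// in a TOML basic string without any escaping.
/// `to_forward_slashes` is a hypothetical helper for illustration.
fn to_forward_slashes(path: &str) -> String {
    path.replace('\\', "/")
}
```

One caveat: verbatim paths that start with `\\?\` are passed through without normalization on Windows, so they should not be converted this way.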
Come to think of it, if you want to escape arbitrary strings for TOML (e.g. both Windows paths and Linux ones) it would be simple enough to write your own escape code based on the TOML table above. Something like:
/// Escapes `path_str` for use inside a TOML basic (double-quoted) string.
fn toml_escape(path_str: &str) -> String {
    let mut escaped = String::with_capacity(path_str.len());
    for ch in path_str.chars() {
        match ch {
            // Characters that have a short escape form in TOML.
            '\u{08}' => escaped.push_str(r"\b"),
            '\t' => escaped.push_str(r"\t"),
            '\n' => escaped.push_str(r"\n"),
            '\u{0C}' => escaped.push_str(r"\f"),
            '\r' => escaped.push_str(r"\r"),
            '"' => escaped.push_str(r#"\""#),
            '\\' => escaped.push_str(r"\\"),
            // Remaining control characters (including DEL, U+007F) become \u00XX.
            c if c <= '\u{1f}' || c == '\u{7f}' => {
                let hex = to_hex(c as u8);
                escaped.push_str(r"\u00");
                escaped.push(hex[0] as char);
                escaped.push(hex[1] as char);
            }
            // Everything else can appear unescaped.
            c => escaped.push(c),
        }
    }
    escaped
}

/// Returns the two uppercase hex digits of `n` as ASCII bytes.
fn to_hex(n: u8) -> [u8; 2] {
    let digits = b"0123456789ABCDEF";
    [digits[(n >> 4) as usize], digits[(n & 0xF) as usize]]
}
That's pretty off-the-cuff, so doubtless it could be written more efficiently, especially if you were to write directly to the formatter.
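A rough sketch of the "write directly to the formatter" variant, applying the same escaping rules through a `Display` impl instead of building an intermediate `String`. The `TomlStr` wrapper is a hypothetical name, and this version also emits the surrounding quotes and escapes DEL (U+007F), which is my reading of the TOML spec rather than something tested against a parser here.

```rust
use std::fmt::{self, Write};

/// Hypothetical wrapper: its `Display` impl writes `self.0` as a quoted
/// TOML basic string, escaping straight into the formatter.
struct TomlStr<'a>(&'a str);

impl fmt::Display for TomlStr<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_char('"')?;
        for c in self.0.chars() {
            match c {
                // Characters that have a short escape form in TOML.
                '\u{08}' => f.write_str(r"\b")?,
                '\t' => f.write_str(r"\t")?,
                '\n' => f.write_str(r"\n")?,
                '\u{0C}' => f.write_str(r"\f")?,
                '\r' => f.write_str(r"\r")?,
                '"' => f.write_str(r#"\""#)?,
                '\\' => f.write_str(r"\\")?,
                // Other control characters (and DEL) as \uXXXX.
                c if c <= '\u{1f}' || c == '\u{7f}' => write!(f, "\\u{:04X}", c as u32)?,
                // Everything else is written through unchanged.
                c => f.write_char(c)?,
            }
        }
        f.write_char('"')
    }
}
```

With that in place, the test config could be built with something like `format!("path = {}", TomlStr(path))`, which keeps the raw-string template and drops the reliance on Debug output.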