How to output C style escaped unicode in JSON

amy.k · March 2, 2022, 2:26am

I'm working on writing a program that processes the JSON for the reMarkable tablet. There is a field in the JSON (iconCode) that outputs a Unicode character in escaped form. I want to use serde / serde_json to deserialize this file, add custom templates, and serialize it back out. Here is a trimmed example of the template.json file:

{
    "templates": [
        {
            "name": "Burndown",
            "filename": "burndown",
            "iconCode": "\ue9fe",
            "categories": [
                "Life/organize"
            ]
        }
    ]
}

When I've attempted to round-trip this, I have only been able to get it to output either the raw, unescaped character (that is, "iconCode": "" with the literal U+E9FE character between the quotes) or Rust's escape form, \u{e9fe}. With the first, the reMarkable does not display the template correctly in the selection UI; with the second, it fails to display any template correctly.

I have a full example with tests in this branch if that is helpful. I feel like this should be possible, but none of the combinations I've tried so far have ended up successful. I've tried String, Vec<u8>, newtype wrapper around String with a custom Serialize instance, and other approaches.

My understanding so far is that this RFC removed the ability to represent \u#### style escapes in Rust strings and that's causing the straightforward approach to not be successful.

RedDocMD · March 2, 2022, 2:46am

Why not output "\\ue9fe" from the wrapper type while implementing Deserialize on it?

krdln · March 2, 2022, 11:45am

That means that reMarkable's JSON handling is simply broken, as "\ue9fe" and the same string containing the raw U+E9FE character are equal as far as the spec is concerned.

Perhaps its parser is attempting some other encoding? Crazy idea – push '\u{feff}' char at the front of the file? Another crazy idea – something in LOCALE settings?

What is this straightforward approach you've tried? Is it escape_unicode? If so, that escapes in Rust's style (\u{e9fe}), which doesn't match JSON's (\ue9fe). Perhaps just try writing the escaping manually, as a post-processing step after the JSON is serialized. Playground.

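    use std::fmt::Write; // needed for write! on a String

    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if c as u32 <= 127 {
            out.push(c);
        } else {
            // Zero-pad to four hex digits ({:4x} would pad with spaces).
            write!(out, r"\u{:04x}", c as u32).unwrap();
        }
    }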

(This still relies on the JSON serializer to handle all the other escapes, and just escapes non-ASCII characters.)

Edit: this loop won't cover full unicode range! See @quinedot's response below

amy.k · March 2, 2022, 11:40pm

A post-processing step worked, thanks!

RedDocMD wrote: Why not output "\\ue9fe" from the wrapper type while implementing Deserialize on it?

This results in "\\ue9fe" in the JSON, unfortunately.

krdln wrote: That means that reMarkable's JSON handling is simply broken, as "\ue9fe" and the same string containing the raw U+E9FE character are equal as far as the spec is concerned.

Can't really disagree with you there. I wish they would open source their software. It would let me troubleshoot some of these issues more directly.

Thank you both for your help.

quinedot · March 3, 2022, 2:11am

Be careful -- since the escaping uses 16-bit code units, some characters are going to require surrogate pairs.

Edit: encode_utf16 looks useful here.
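As a rough illustration of the encode_utf16 idea, here is a minimal sketch of the post-processing pass that emits one \uXXXX escape per UTF-16 code unit, so characters above U+FFFF come out as surrogate pairs. The function name escape_non_ascii is made up for this example, and the sketch assumes its input is already valid serialized JSON, so that non-ASCII characters only occur inside strings.

```rust
/// Hypothetical helper: escape every non-ASCII character in already-serialized
/// JSON text as JSON-style `\uXXXX` sequences.
fn escape_non_ascii(json: &str) -> String {
    let mut out = String::with_capacity(json.len());
    // Scratch buffer: a char needs at most two UTF-16 code units.
    let mut units = [0u16; 2];
    for c in json.chars() {
        if c.is_ascii() {
            // ASCII passes through untouched, including escapes the
            // serializer already produced.
            out.push(c);
        } else {
            // One unit for characters up to U+FFFF, a surrogate pair above that.
            for unit in c.encode_utf16(&mut units).iter() {
                out.push_str(&format!("\\u{:04x}", unit));
            }
        }
    }
    out
}
```

Calling escape_non_ascii on the serialized template file should turn the raw U+E9FE character back into \ue9fe, while something like U+1F600 becomes \ud83d\ude00.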

quinedot · March 3, 2022, 10:19pm

I thought it might be tricky/tedious to target JSON strings so as to be robustly correct, but as it turns out JSON whitespace (and thus all non-string data) is supposed to be ASCII.

Here's an updated playground to handle the surrogate pair cases. You could alternatively make a newtype that implements Display.
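The newtype variant might look roughly like the sketch below. The wrapper name AsciiJson is hypothetical, and it makes the same assumption as the post-processing approach: the wrapped text is already valid serialized JSON, so escaping every non-ASCII character is safe.

```rust
use std::fmt;

/// Hypothetical wrapper: displays the wrapped JSON text with every
/// non-ASCII character written as JSON-style `\uXXXX` escapes.
struct AsciiJson<'a>(&'a str);

impl fmt::Display for AsciiJson<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Scratch buffer: a char needs at most two UTF-16 code units.
        let mut units = [0u16; 2];
        for c in self.0.chars() {
            if c.is_ascii() {
                write!(f, "{}", c)?;
            } else {
                // Surrogate pairs fall out of encode_utf16 automatically.
                for unit in c.encode_utf16(&mut units).iter() {
                    write!(f, "\\u{:04x}", unit)?;
                }
            }
        }
        Ok(())
    }
}
```

Compared with building a new String, this writes straight into whatever the formatter targets, so the serialized text can be passed to write! or println! without an intermediate copy of the escaped output.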
