Serialize bytestrings in `serde` differently depending on the output format

fjarri · August 30, 2021, 11:44pm

I have a struct some of whose fields are relatively long bytestrings. When serialized to JSON they result in integer arrays, which is not very efficient space-wise.

I know it is possible to use a custom function (either by writing a custom Serialize impl, or with #[serde(serialize_with)]), but it will apply to all output formats. So if I use, say, a base64-encoded string to represent the bytes, it will be stored that way in binary formats like BSON too. I wonder if it is possible to use one representation (base64 in this case) for text-based formats (like JSON or YAML), and just store the bytes directly in BSON or other binary formats. Or does this kind of discrimination not fit into the serde data model?

There's a crate serde_bytes which claims to "enable optimized handling of &[u8] and Vec<u8>", but it calls serialize_bytes internally, which results in integer arrays in JSON.
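To put a rough number on the space cost, here is a minimal std-only sketch that compares the length of the integer-array form with a base64 string for the same bytes. The helper names (base64_encode, json_int_array, json_base64_string) are made up for this illustration, and the hand-rolled encoder only stands in for a real base64 crate.

```rust
/// Minimal standard-alphabet base64 encoder with `=` padding.
/// Illustrative only; a real project would use a base64 crate.
fn base64_encode(input: &[u8]) -> String {
    const ALPHABET: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        // Emit four 6-bit groups, padding with '=' when the chunk is short.
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Compact JSON array form of a byte slice, e.g. `[1,2,255]`.
fn json_int_array(input: &[u8]) -> String {
    let items: Vec<String> = input.iter().map(|b| b.to_string()).collect();
    format!("[{}]", items.join(","))
}

/// JSON string form using base64, including the surrounding quotes.
fn json_base64_string(input: &[u8]) -> String {
    format!("\"{}\"", base64_encode(input))
}

fn main() {
    // One of each possible byte value, so all digit counts are represented.
    let bytes: Vec<u8> = (0..=255).collect();
    println!("integer array: {} chars", json_int_array(&bytes).len());
    println!("base64 string: {} chars", json_base64_string(&bytes).len());
}
```

For this input the array form comes out more than twice as long as the base64 string, and pretty-printed JSON would widen the gap further.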

cole-miller · August 31, 2021, 1:12am

I think you could write your own wrapper newtype like this:

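use serde::{Serialize, Serializer};

struct Bytestring(Vec<u8>);

impl Serialize for Bytestring {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        if serializer.is_human_readable() {
            // Text formats (JSON, YAML, ...): emit a base64 string.
            // Assumes the `base64` crate's free `encode` function (pre-0.21 API).
            serializer.serialize_str(&base64::encode(&self.0))
        } else {
            // Binary formats: hand the raw bytes to the format.
            serializer.serialize_bytes(&self.0)
        }
    }
}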

This assumes that Serializer::is_human_readable describes the distinction between formats that you care about. Of course, you will probably want a matching Deserialize impl as well, branching on Deserializer::is_human_readable in the same way.

fjarri · August 31, 2021, 4:23am

Thanks, I wasn't aware of is_human_readable(). Although it seems that it has a default implementation returning true, and some package writers don't feel like overriding it - for example, BSON (from bson-rust) is still considered "human-readable".
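To illustrate why that default matters, here is a small toy model in plain Rust. It does not use serde itself: ToySerializer, the three format structs, Encoded, and encode_bytes are all hypothetical names invented for this sketch, and hex stands in for base64 to keep it short. The only thing borrowed from serde is the idea of an is_human_readable method whose default returns true.

```rust
/// Toy stand-in for a serializer. This is NOT serde's real trait; it only
/// mirrors the idea of a provided method that defaults to `true`.
trait ToySerializer {
    fn is_human_readable(&self) -> bool {
        true
    }
}

/// A text format that is happy with the default.
struct TextFormat;
impl ToySerializer for TextFormat {}

/// A binary format that overrides the default.
struct BinaryFormat;
impl ToySerializer for BinaryFormat {
    fn is_human_readable(&self) -> bool {
        false
    }
}

/// A binary format whose author did not override the default.
struct ForgetfulBinaryFormat;
impl ToySerializer for ForgetfulBinaryFormat {}

/// What a format-aware wrapper would hand to the format.
#[derive(Debug, PartialEq)]
enum Encoded {
    Text(String),
    Raw(Vec<u8>),
}

/// Pick a representation based on what the format reports about itself.
fn encode_bytes<S: ToySerializer>(serializer: &S, bytes: &[u8]) -> Encoded {
    if serializer.is_human_readable() {
        // Hex is used here as a short stand-in for base64.
        Encoded::Text(bytes.iter().map(|b| format!("{:02x}", b)).collect())
    } else {
        Encoded::Raw(bytes.to_vec())
    }
}

fn main() {
    let bytes: [u8; 4] = [0xde, 0xad, 0xbe, 0xef];
    println!("{:?}", encode_bytes(&TextFormat, &bytes));
    println!("{:?}", encode_bytes(&BinaryFormat, &bytes));
    // Falls into the text branch even though the format is binary.
    println!("{:?}", encode_bytes(&ForgetfulBinaryFormat, &bytes));
}
```

In this model the wrapper can only be as accurate as the format's own answer, so a binary format that keeps the default still receives the text encoding.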

cole-miller · August 31, 2021, 5:07am

Submit a PR!

fjarri · August 31, 2021, 6:06am

Actually, I checked, and it is set to false in bson 2.0 (currently in beta). I'm using the stable 1.2, where it's still true. So I take my complaint back.

