Serialize bytestrings in `serde` differently depending on the output format

I have a struct some of whose fields are relatively long bytestrings. When serialized to JSON they result in integer arrays, which is not very efficient space-wise.

I know it is possible to use a custom function (either by writing a custom serializer, or with `#[serde(serialize_with)]`), but it will apply to all output formats. So if I use, say, a base64-encoded string to represent the bytes, it will be stored that way in binary formats like BSON too. I wonder if it is possible to use one serializer (base64 in this case) for text-based formats (like JSON or YAML), and just store the bytes directly in BSON or other binary formats. Or does this kind of discrimination not fit into the serde data model?

There's a crate serde_bytes which claims to "enable optimized handling of &[u8] and Vec<u8>", but it calls serialize_bytes internally, which results in integer arrays in JSON.
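To put a rough number on the space cost of the integer-array form, here is a small std-only sketch that compares the two encodings by length. It assumes compact JSON output (no whitespace between array elements) and standard padded base64; `json_array_len` and `base64_json_len` are hypothetical helper names made up for this illustration, not part of serde or serde_json.

```rust
/// Length in bytes of the compact JSON integer array for `bytes`,
/// e.g. `[0,17,255]` (assumes no whitespace is emitted).
fn json_array_len(bytes: &[u8]) -> usize {
    if bytes.is_empty() {
        return 2; // just "[]"
    }
    // Decimal digits of every value, one comma between neighbours,
    // and the two surrounding brackets.
    let digits: usize = bytes.iter().map(|b| b.to_string().len()).sum();
    digits + (bytes.len() - 1) + 2
}

/// Length in bytes of a padded base64 JSON string holding `n` input bytes,
/// including the two surrounding quotes.
fn base64_json_len(n: usize) -> usize {
    // Every group of up to 3 input bytes becomes 4 output characters.
    4 * ((n + 2) / 3) + 2
}
```

For 32 bytes that are all `0xFF`, the array form takes 129 bytes against 46 for the base64 string, so the array is close to three times larger in the worst case.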

I think you could write your own wrapper newtype like this:

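use serde::{Serialize, Serializer};

struct Bytestring(Vec<u8>);

impl Serialize for Bytestring {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        if serializer.is_human_readable() {
            // serialize as base64: encode self.0 and pass the text to serializer.serialize_str
        } else {
            // binary format: pass the raw bytes through unchanged
            serializer.serialize_bytes(&self.0)
        }
    }
}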

Assuming that Serializer::is_human_readable describes the distinction in formats that you care about. Of course you might want a matching Deserialize impl as well.
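To make the branching concrete without pulling in any crates, here is a std-only sketch of the same idea. Note the assumptions: `ToySerializer`, `TextFormat` and `BinaryFormat` are hypothetical stand-ins invented for this example, not serde types. The toy trait takes `&mut self` and returns nothing, whereas the real `Serializer` is consumed by value and returns a `Result`. The base64 encoder is written by hand (standard alphabet, with padding) only so the block compiles on its own; in real code you would probably use an existing crate.

```rust
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard padded base64, hand-rolled so the sketch has no dependencies.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity(4 * ((input.len() + 2) / 3));
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(B64[(n >> 18) as usize & 63] as char);
        out.push(B64[(n >> 12) as usize & 63] as char);
        // Missing input bytes are represented by '=' padding.
        out.push(if chunk.len() > 1 { B64[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Hypothetical stand-in for the small part of serde's `Serializer` used here.
trait ToySerializer {
    /// Defaults to `true`, mirroring the default of serde's method.
    fn is_human_readable(&self) -> bool {
        true
    }
    fn serialize_str(&mut self, v: &str);
    fn serialize_bytes(&mut self, v: &[u8]);
}

/// Toy text format: strings are quoted, raw bytes become an integer array.
struct TextFormat(String);

impl ToySerializer for TextFormat {
    fn serialize_str(&mut self, v: &str) {
        self.0 = format!("\"{}\"", v);
    }
    fn serialize_bytes(&mut self, v: &[u8]) {
        let items: Vec<String> = v.iter().map(|b| b.to_string()).collect();
        self.0 = format!("[{}]", items.join(","));
    }
}

/// Toy binary format: overrides the default and stores bytes as they are.
struct BinaryFormat(Vec<u8>);

impl ToySerializer for BinaryFormat {
    fn is_human_readable(&self) -> bool {
        false
    }
    fn serialize_str(&mut self, v: &str) {
        self.0 = v.as_bytes().to_vec();
    }
    fn serialize_bytes(&mut self, v: &[u8]) {
        self.0 = v.to_vec();
    }
}

struct Bytestring(Vec<u8>);

impl Bytestring {
    /// Base64 text for human-readable formats, raw bytes otherwise.
    fn serialize<S: ToySerializer>(&self, serializer: &mut S) {
        if serializer.is_human_readable() {
            serializer.serialize_str(&base64_encode(&self.0));
        } else {
            serializer.serialize_bytes(&self.0);
        }
    }
}
```

Serializing `Bytestring(b"hello".to_vec())` through `TextFormat` yields the quoted string `"aGVsbG8="`, while `BinaryFormat` ends up holding the five original bytes. A format that forgets to override `is_human_readable` would take the base64 branch, because the default is `true`.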

Thanks, I wasn't aware of is_human_readable(). It seems, though, that it has a default implementation returning true, and some crate authors don't override it: for example, the BSON serializer (from bson-rust) is still considered "human-readable".

Submit a PR!

Actually, I checked, and it is set to false in 2.0 (currently in beta). I'm using the stable 1.2 where it's still true. So I take my complaint back 🙂
