I have a struct, some of whose fields are relatively long bytestrings. When serialized to JSON, they come out as integer arrays, which is not very space-efficient.
I know it is possible to use a custom function (either by writing a custom serializer, or with #[serde(serialize_with)]), but that applies to all output formats. So if I use, say, a base64-encoded string to represent the bytes, they will be stored that way in binary formats like BSON too. I wonder if it is possible to use one representation (base64 in this case) for text-based formats (like JSON or YAML), and store the bytes directly in BSON and other binary formats. Or does this kind of format-dependent behavior not fit into the serde data model?
There's a crate, serde_bytes, which claims to "enable optimized handling of &[u8] and Vec<u8>", but it calls serialize_bytes internally, which still produces integer arrays in JSON.
I think you could write your own wrapper newtype like this:
use base64::Engine;
use serde::{Serialize, Serializer};

struct Bytestring(Vec<u8>);

impl Serialize for Bytestring {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        if serializer.is_human_readable() {
            // Text formats: serialize as a base64 string
            // (here via the base64 crate's Engine API).
            let encoded = base64::engine::general_purpose::STANDARD.encode(&self.0);
            serializer.serialize_str(&encoded)
        } else {
            // Binary formats: pass the raw bytes through.
            serializer.serialize_bytes(&self.0)
        }
    }
}
Assuming that Serializer::is_human_readable captures the distinction between formats that you care about. Of course you might want a matching Deserialize impl as well.
Thanks, I wasn't aware of is_human_readable(). It seems, though, that it has a default implementation returning true, and some crate authors don't bother overriding it - for example, BSON (from bson-rust) is still reported as "human-readable".