I am pretty new to Rust, but I have been developing software for nearly 30 years in other languages. My current task is to support (de-)serialization of a model for a given JSON structure.
The approach I am thinking about is that each struct with such a situation requires a custom (de-)serializer. Does anyone have a hint for me about where I should be looking? Does serde or another serde-related crate already provide some support for this?
One trick I've used with great success is to manually implement deserialize/serialize via a temporary struct.
When serializing, we can use map() and std::slice::from_ref() to go from an Option<&T> to an Option<&[T]> (a reference to one item is always a valid reference to a slice of 1 item).
The Deserialize implementation is a bit longer because you need to handle the possibility of multiple headers/payloads, but it's the same general idea.
use serde::{Deserialize, Deserializer};

impl<'de> Deserialize<'de> for Datagram {
    fn deserialize<D: Deserializer<'de>>(de: D) -> Result<Self, D::Error> {
        // Temporary struct that mirrors the JSON layout.
        #[derive(Deserialize)]
        struct Repr {
            header: Option<Vec<Header>>,
            payload: Option<Vec<Payload>>,
        }

        let Repr { header, payload } = Repr::deserialize(de)?;

        // Unwrap the single-element arrays.
        let header = match header {
            Some(mut headers) if headers.len() == 1 => Some(headers.remove(0)),
            Some(_) => todo!("Figure out how you want to handle zero or multiple headers"),
            None => None,
        };
        let payload = match payload {
            Some(mut payloads) if payloads.len() == 1 => Some(payloads.remove(0)),
            Some(_) => todo!("Figure out how you want to handle zero or multiple payloads"),
            None => None,
        };

        Ok(Datagram { header, payload })
    }
}
I should have mentioned that there are literally dozens of structs that would need this, so this approach sadly feels quite cumbersome, as each serialize and deserialize implementation requires struct-specific code.
Would there maybe be a way to generalize this? Maybe by using key-values and checking whether the struct contains a property identical to a key? Sadly I am way too new to the language.
Is the problem just that certain fields are serialized as an array with a single object instead of using the object?
If so, you can use the #[serde(with = "some::module")] syntax to tell serde, "when serializing or deserializing this field, use some::module::serialize() and some::module::deserialize()".
use serde::{Deserialize, Serialize};

// Note: with a `with` module, add #[serde(default)] as well if the key may be absent.
#[derive(Serialize, Deserialize)]
pub struct Datagram {
    #[serde(with = "single_element_array")]
    pub header: Option<Header>,
    #[serde(with = "single_element_array")]
    pub payload: Option<Payload>,
}

// Common module that can (de)serialize any Option<T> via a single-element array.
mod single_element_array {
    use serde::{Deserialize, Deserializer, Serialize, Serializer};

    pub fn serialize<T, S>(value: &Option<T>, ser: S) -> Result<S::Ok, S::Error>
    where
        T: Serialize,
        S: Serializer,
    {
        // A reference to one item is a valid slice of length 1.
        let repr: Option<&[T]> = value.as_ref().map(std::slice::from_ref);
        repr.serialize(ser)
    }

    pub fn deserialize<'de, T, D>(de: D) -> Result<Option<T>, D::Error>
    where
        T: Deserialize<'de>,
        D: Deserializer<'de>,
    {
        // [T; 1] rejects arrays that don't have exactly one element.
        let repr = <Option<[T; 1]>>::deserialize(de)?;
        Ok(repr.map(|[value]| value))
    }
}
You still need to add the #[serde(with = "single_element_array")] attribute to each of these funny fields, but because single_element_array::serialize() and single_element_array::deserialize() are generic you won't need to duplicate the code.
It is not just single fields, it is the whole struct. The struct should be serialized as an array, with every field being used as an element in the array and the field name as the key.
The serialization would result in datagram actually being an array with 2 items: the first one contains datagram.header and the second one contains datagram.payload.
Ah okay, so instead of using an object with key-value pairs (e.g. {"header": ..., "payload": ...}) they store it as an array where each element is a {"key": "value"} pair?
I feel like you should be able to define your own serde::Deserializer where the deserialize_struct() method will know how to handle the arrays... But that sounds like a lot of work.
If you are okay with some runtime overhead, you could always load the document into a loosely-typed serde_json::Value then transform it to be more like a normal object. From there, serde_json::from_value() lets you deserialize the Value to your type.
Ah okay, so instead of using an object with key-value pairs (e.g. {"header": ..., "payload": ...} ) they store it as an array where each element is a {"key": "value"} pair?
Correct.
Runtime overhead shouldn't be an issue. And if it would be, that can be dealt with later. Right now it would be good to have something working and then go from there.
I am not sure I understand what you mean by "then transform it to be more like a normal object."
One thought I had was to deserialize the array, then iterate over each item, check whether its key is part of the struct, and call deserialization for that element. When serializing, I would take every public field in the struct, put it into an array, and then serialize the array.
I probably have to dig deeper into how serde works and do that manually.
I avoided a custom deserializer and used post-processing instead: serialize the model with serde, deserialize the result into a serde_json::Value, walk through it to rearrange the objects into arrays, and serialize it again. The performance hit shouldn't be a problem for now. If it becomes one, this can still be changed.