Understanding the concept of deserialization across programming languages

I'm new to programming, and I am trying to learn about serialization and deserialization using Serde. It seems to me that when I deserialize an object, I have to give some information about the return type. But in a dynamic language, such as Python with pickle or Julia with Serialization, I just read the serialized object, which contains the type information itself, so I don't have to (or I can't) give any information about the return type.

So I'm curious: is it a matter of the kind of programming language? Is it true that in every static programming language I have to give return type information when I deserialize a serialized object, and in dynamic programming languages I don't?

I don't have time to give you a detailed explanation. But you need to stop treating programming languages as black-box magic and think through the reasons logically.

Anyway, you can deserialize JSON into an abstract Value (serde_json::Value) in Rust.

Serde has been designed to work with both "self-describing" formats like JSON and CBOR and "non-self-describing" formats. A self-describing format includes enough information that you can decode something from the format without knowing what "shape" it has. JSON is a simple example: just by looking at it, you can tell whether a JSON file has an array or an object in it.

In contrast, a non-self-describing format aims to save space by not requiring the format to keep track of details the program already knows about. The decoder for a non-self-describing format needs to know whether it's decoding an array or an object in order to succeed, and it's also possible for it to successfully decode something that turns out to be nonsensical. Since serde is designed to work with both kinds of formats, it has to have all of the type information available to function, even if it doesn't need it for a particular format.
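
To make the difference concrete, here is a minimal std-only sketch. Both byte layouts are toy formats I made up for illustration (the names encode_pair, decode_tagged, Decoded and the tag values are all hypothetical); this is not how serde or any real format is implemented. The untagged decoder only works if the caller names the type, and a wrong guess still "succeeds" with a meaningless value. The tagged decoder can figure out the shape on its own.

```rust
// Toy formats for illustration only; every name here is hypothetical.

/// Non-self-describing: two u16 values as 4 raw little-endian bytes, no tags.
fn encode_pair(a: u16, b: u16) -> Vec<u8> {
    let mut out = Vec::with_capacity(4);
    out.extend_from_slice(&a.to_le_bytes());
    out.extend_from_slice(&b.to_le_bytes());
    out
}

/// The caller has to know that the bytes hold a pair of u16 values.
fn decode_pair(bytes: &[u8]) -> Option<(u16, u16)> {
    if bytes.len() != 4 {
        return None;
    }
    Some((
        u16::from_le_bytes([bytes[0], bytes[1]]),
        u16::from_le_bytes([bytes[2], bytes[3]]),
    ))
}

/// Decoding the same 4 bytes as a u32 also succeeds, but the value is nonsense.
fn decode_u32(bytes: &[u8]) -> Option<u32> {
    if bytes.len() != 4 {
        return None;
    }
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Self-describing variant: one leading tag byte says what follows.
const TAG_U32: u8 = 0;
const TAG_PAIR: u8 = 1;

#[derive(Debug, PartialEq)]
enum Decoded {
    U32(u32),
    Pair(u16, u16),
}

fn encode_pair_tagged(a: u16, b: u16) -> Vec<u8> {
    let mut out = vec![TAG_PAIR];
    out.extend(encode_pair(a, b));
    out
}

/// No type hint from the caller: the tag alone selects the decoder.
fn decode_tagged(bytes: &[u8]) -> Option<Decoded> {
    let (tag, payload) = bytes.split_first()?;
    match *tag {
        TAG_U32 => decode_u32(payload).map(Decoded::U32),
        TAG_PAIR => decode_pair(payload).map(|(a, b)| Decoded::Pair(a, b)),
        _ => None,
    }
}

fn main() {
    let raw = encode_pair(1, 2);
    println!("{:?}", decode_pair(&raw)); // the type the writer intended
    println!("{:?}", decode_u32(&raw)); // "works", but the number is meaningless

    let tagged = encode_pair_tagged(1, 2);
    println!("{:?}", decode_tagged(&tagged)); // shape recovered from the tag
}
```

The tagged version costs one extra byte per value, which is exactly the space a non-self-describing format is trying to save.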

Deserialization libraries in languages which have good support for dynamic typing generally prefer to throw a decoded object at you and let you use the language's support for runtime type checks to get values of the correct types. They don't need a middleman like serde to help with type conversions, because the language already has good support for doing those kinds of things. Since self-describing formats are much more common, you just don't see the APIs that would let those languages work with non-self-describing formats all that often.

As @zirconium-n notes, you can get a "dynamically typed" JSON value from serde_json and manually decode it to create your type. It's just kind of annoying in a language like Rust, so it's easier to go through serde, which takes care of the boilerplate.
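
Here is roughly what that manual decoding looks like, as a std-only sketch. The Value enum below is a hypothetical, cut-down stand-in for a dynamically typed value such as serde_json::Value, and Point, object and point_from_value are names I invented for the example. Every field lookup and every type check has to be spelled out by hand, which is the boilerplate a derive(Deserialize) would otherwise generate for you.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a dynamically typed value (think serde_json::Value).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
    Object(HashMap<String, Value>),
}

// The concrete Rust type we actually want to end up with.
#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Hand-written conversion: check the shape, then check each field's type.
fn point_from_value(v: &Value) -> Result<Point, String> {
    let map = match v {
        Value::Object(map) => map,
        other => return Err(format!("expected an object, got {:?}", other)),
    };
    // Small helper closure: look up a field and insist that it is a number.
    let field = |name: &str| -> Result<f64, String> {
        match map.get(name) {
            Some(Value::Number(n)) => Ok(*n),
            Some(other) => Err(format!("field `{}` is not a number: {:?}", name, other)),
            None => Err(format!("missing field `{}`", name)),
        }
    };
    Ok(Point {
        x: field("x")?,
        y: field("y")?,
    })
}

// Convenience constructor for the example data (hypothetical helper).
fn object(fields: &[(&str, Value)]) -> Value {
    Value::Object(
        fields
            .iter()
            .map(|(k, v)| (k.to_string(), v.clone()))
            .collect(),
    )
}

fn main() {
    let good = object(&[("x", Value::Number(1.0)), ("y", Value::Number(2.0))]);
    println!("{:?}", point_from_value(&good));

    // Wrong field type and a missing field: the runtime checks report an error.
    let bad = object(&[("x", Value::Text("oops".to_string()))]);
    println!("{:?}", point_from_value(&bad));
}
```

In a dynamic language these checks happen implicitly (or fail later at the point of use); in Rust they have to happen somewhere before you can hold a Point.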

Sometimes, dynamically typed languages are also described as still technically being strongly and statically typed, yet only having a single type that is used for all values. With this interpretation, one could argue that their deserialization story is simpler than Rust's mainly because, for that single universal type, the designers of the language have already sufficiently defined how serialization and deserialization should work, and when deserializing, it's always implicitly clear that you want to deserialize into a value of that single all-encompassing type. So, if you only used a single type in Rust for all the data you want to serialize and deserialize, then your de-/serialization story could be just as straightforward. Of course, it wouldn't necessarily be clear which type to choose...

If you do need to deserialize a bunch of different data into a single, extensible "dynamic" type, then using a trait object type and defining the de-/serialization with the typetag crate could be an option. If you don't ever want to provide any information on how de-/serialization is defined for your custom structs (beyond adding the basic derive(Serialize, Deserialize) onto the struct), then you merely need to restrict yourself to struct definitions where such a derive works without problems. Maybe such restrictions aren't all that different from the restrictions on what kinds of data "types" you can define in certain dynamic programming languages, though perhaps I'm mistaken on that; after all, I have very little experience in using Python, Julia, or the like.

If I can get the TypePath of the needed annotation as a String, can I make use of it?
The following is the code I have in mind (it does not compile):

use serde_json;
use serde::{Serialize, Deserialize, de::DeserializeOwned};

pub fn get_type_str<T>(_: &T) -> String {
    let res = std::any::type_name::<T>();
    String::from(res)
}

pub fn serd_json_type<T>(x: &str) -> T 
where 
    for<'a> T: Deserialize<'a>,
{
    let res: T = serde_json::from_str(x).unwrap();
    res
}

#[derive(Serialize, Deserialize)]
pub struct MyType(pub i32, pub i32);

use syn;
use syn::TypePath;

macro_rules! get_data {
    ($type_name: expr, $x: expr) => {
        serd_json_type::<syn::parse_str::<TypePath>($type_name)?>($x).unwrap()
    }
}

fn main() {
    let my_type = MyType(1, 1);
    let my_type_type = get_type_str(&my_type);
    let my_type_str = serde_json::to_string(&my_type).unwrap();
    let res = get_data!(my_type_type, my_type_str);
}

So, when I serialize my_type, I also get its TypePath my_type_type, but as a String. When I deserialize the object my_type_str, I convert my_type_type into a real TypePath. Can it work?

Thanks for your help. Could you take a look at this kind of solution?

No. You can't use runtime information like the value of my_type_type for type annotations, which must be known at compile time.
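
What you can do instead is match the runtime string against a closed set of types the compiler already knows about. Below is a minimal std-only sketch of that idea. Known, Label, decode_by_name and the "a, b" text layout are all hypothetical, and parse_my_type is just a toy stand-in for a real deserializer call such as serde_json::from_str::<MyType>. I also use short hand-picked tags rather than the output of std::any::type_name, since its documentation says the exact string is not guaranteed and must not be treated as a unique identifier.

```rust
#[derive(Debug, PartialEq)]
struct MyType(i32, i32);

#[derive(Debug, PartialEq)]
struct Label(String);

// Every type we are willing to decode must be listed here at compile time.
#[derive(Debug, PartialEq)]
enum Known {
    MyType(MyType),
    Label(Label),
}

// Toy parser for text like "1, 1" (stand-in for a real deserializer).
fn parse_my_type(s: &str) -> Option<MyType> {
    let (a, b) = s.split_once(',')?;
    Some(MyType(a.trim().parse().ok()?, b.trim().parse().ok()?))
}

// The tag is a runtime string, but each match arm names a concrete type,
// so the compiler knows exactly which decoder every branch calls.
fn decode_by_name(type_name: &str, data: &str) -> Option<Known> {
    match type_name {
        "MyType" => parse_my_type(data).map(Known::MyType),
        "Label" => Some(Known::Label(Label(data.to_string()))),
        _ => None, // unknown tag: nothing we can construct
    }
}

fn main() {
    // Both the tag and the payload only exist at runtime.
    for (tag, data) in [("MyType", "1, 1"), ("Label", "hello"), ("Other", "?")] {
        match decode_by_name(tag, data) {
            Some(Known::MyType(MyType(a, b))) => println!("MyType with sum {}", a + b),
            Some(Known::Label(Label(text))) => println!("Label with text {}", text),
            None => println!("cannot decode tag {:?}", tag),
        }
    }
}
```

The result still has one static type (Known), and the caller has to match on it to get at the concrete value. The typetag crate mentioned earlier in the thread automates a similar tag-to-type lookup for trait objects.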
