Recursive #[serde(deserialize_with)] definition

Pzixel · January 25, 2022, 10:28pm

Hi everyone. My actual problem is too big to post here, so here is a minimal reproducible example. It may not show WHY I need this, but I assure you I do.

Consider the following snippet:

#[derive(Deserialize)]
struct Foo {
    #[serde(deserialize_with = "deserialize_my_i32")]
    bar: i32,
    #[serde(deserialize_with = "?????")]
    other_bars: Vec<i32>,
    also_bars: Vec<i32>
}

fn deserialize_my_i32<'de, D>(deserializer: D) -> Result<i32, D::Error> where D: Deserializer<'de> {
    Ok(10)
}

Here I have a custom deserializer for i32 that performs additional logic (e.g. parsing from different bases), and I want some fields to use it (recursively!) and others not. The problem is that this code fails, because the function only deserializes an i32 but I apply the attribute to a Vec. The desired result for other_bars is a Vec full of 10s, while also_bars should use the default deserializer and be unaffected.

Newtypes have the desired behavior: they work recursively. But they would duplicate every model in my codebase, which is a pain, so I'd like to avoid them if possible.

I'm sorry if I didn't formulate this clearly; it's late at night here and I've lost some clarity of mind.
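
As an illustration of the kind of "additional logic" mentioned above, here is a minimal std-only sketch of base-aware integer parsing. The helper name parse_i32_any_base and the 0x/0b prefix conventions are assumptions made for this example, not part of the original code; a real deserialize_with function could deserialize a string and delegate to a helper like this.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parses an i32 from decimal text, or from hex/binary
/// text when it carries a `0x` or `0b` prefix.
/// Negative numbers are only handled on the decimal path in this sketch.
fn parse_i32_any_base(text: &str) -> Result<i32, ParseIntError> {
    let text = text.trim();
    // Pick the radix from the prefix and strip the prefix off the digits.
    let (digits, radix) = if let Some(rest) = text.strip_prefix("0x") {
        (rest, 16)
    } else if let Some(rest) = text.strip_prefix("0b") {
        (rest, 2)
    } else {
        (text, 10)
    };
    i32::from_str_radix(digits, radix)
}
```

With this sketch, "10" parses to 10 and "0x55" parses to 85, while text that is not a number yields a ParseIntError.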

In a nutshell, I'd like this code to compile:

use serde::Deserializer;
use serde::Deserialize;

fn my_cool_des_u8<'de, D>(deserializer: D) -> Result<u8, D::Error> where D: Deserializer<'de> {
    todo!()
}

fn my_cool_des_i16<'de, D>(deserializer: D) -> Result<i16, D::Error> where D: Deserializer<'de> {
    todo!()
}

#[derive(Deserialize)]
struct Foo {
    #[serde(deserialize_with = "my_cool_des_u8")]
    small: u8,
    #[serde(deserialize_with = "my_cool_des_i16")]
    avg: i16,
    #[serde(deserialize_with = "my_cool_des_u8")]
    big: Vec<u8>,
    #[serde(deserialize_with = "my_cool_des_i16")]
    huge: Vec<i16>,
    unrelated: Vec<i16> // note no attribute here - should use default deserializer
}

jonasbb · January 25, 2022, 11:39pm

You can try the serde_with crate. The syntax is slightly different, but you can use it with most std collection types.

You can write your example like this:

use serde::Deserializer;
use serde::Deserialize;

struct MyCoolDesi16;
impl<'de> serde_with::DeserializeAs<'de, i16> for MyCoolDesi16 {
    fn deserialize_as<D>(deserializer: D) -> Result<i16, D::Error>
    where
        D: Deserializer<'de>,
    {
        todo!()
    }
}

#[serde_with::serde_as]
#[derive(Deserialize)]
struct Foo {
    #[serde_as(deserialize_as = "MyCoolDesi16")]
    avg: i16,
    #[serde_as(deserialize_as = "Vec<MyCoolDesi16>")]
    huge: Vec<i16>,
    unrelated: Vec<i16> // note no attribute here - should use default deserializer
}
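
To see why an adapter such as Vec<MyCoolDesi16> can apply custom logic to every element, here is a toy std-only model of the pattern. The ParseAs trait and the HexOrDec marker are hypothetical names invented for this sketch, and they work on plain strings rather than on a serde Deserializer; serde_with's real DeserializeAs trait is more involved. The point is only the shape: a local marker type describes how to produce a value, and a blanket impl lifts it to collections.

```rust
/// Toy stand-in for the adapter idea: `Self` is a marker describing HOW to
/// produce a `T`. This is NOT serde_with's trait, only a simplified model.
trait ParseAs<T> {
    fn parse_as(input: &str) -> Result<T, String>;
}

/// Local marker type: parses an i16 from decimal or `0x`-prefixed hex text.
struct HexOrDec;

impl ParseAs<i16> for HexOrDec {
    fn parse_as(input: &str) -> Result<i16, String> {
        let input = input.trim();
        let parsed = match input.strip_prefix("0x") {
            Some(hex) => i16::from_str_radix(hex, 16),
            None => input.parse::<i16>(),
        };
        parsed.map_err(|e| format!("bad i16 {:?}: {}", input, e))
    }
}

/// Blanket impl: if `U` knows how to produce a `T`, then `Vec<U>` knows how
/// to produce a `Vec<T>` by applying `U` to each comma-separated element.
impl<T, U: ParseAs<T>> ParseAs<Vec<T>> for Vec<U> {
    fn parse_as(input: &str) -> Result<Vec<T>, String> {
        input.split(',').map(|item| U::parse_as(item)).collect()
    }
}
```

In this model, <Vec<HexOrDec> as ParseAs<Vec<i16>>>::parse_as("1, 0x10, -3") yields vec![1, 16, -3]. Because the marker type is local to your crate, implementing the trait for it does not run into the orphan rules, even when the target type is foreign.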

Pzixel · January 26, 2022, 8:57am

Thank you so much, stranger. This is really awesome.

Unfortunately, it doesn't work for custom types. For example, I'd like to deserialize into an fnv::FnvHashSet, but it doesn't implement the required traits, and I can't implement them myself because of the orphan rules. Still, it will work for 99% of cases, I suppose.

jonasbb · January 26, 2022, 12:42pm

The DeserializeAs trait is always implementable for a local type and allows you to "extend" any type without worrying about orphan rules. Concretely, this means you can write this:

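// PhantomData "uses" the type parameter so that the marker struct compiles.
struct MyCustomHashSetDeserialize<U>(std::marker::PhantomData<U>);
impl<'de, T, U> serde_with::DeserializeAs<'de, FnvHashSet<T>> for MyCustomHashSetDeserialize<U>
where U: serde_with::DeserializeAs<'de, T> { /* fn deserialize_as(...): body elided in this sketch */ }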

// usage
#[serde_as(deserialize_as = "MyCustomHashSetDeserialize<MyCoolDesi16>")]
huge: FnvHashSet<i16>,

In the case of fnv::FnvHashSet, which is only an alias for std::collections::HashSet, this is hopefully just an oversight: the existing implementation does not account for the hasher type parameter. Opening an issue would be a good first step towards getting that fixed.

Pzixel · January 26, 2022, 2:08pm

> The DeserializeAs trait is always implementable for a local type

Well, that is exactly the point. How does one implement a custom deserializer for, say, SmallVec<[i32; 4]>?

I wonder whether it's even possible to implement this at the serde_with level without GATs...

jonasbb · January 26, 2022, 5:33pm

use serde::*;
use std::marker::PhantomData;

struct Arr<T, const N: usize>(PhantomData<[T; N]>);
unsafe impl<T, const N: usize> smallvec::Array for Arr<T, N> {
    type Item = T;
    fn size() -> usize {
        N
    }
}

struct SV<T>(PhantomData<T>);
impl<'de, TAs, T, const N: usize> serde_with::DeserializeAs<'de, smallvec::SmallVec<Arr<T, N>>> for SV<TAs>
where
    TAs: serde_with::DeserializeAs<'de, T>,
{
    fn deserialize_as<D>(deserializer: D) -> Result<smallvec::SmallVec<Arr<T, N>>, D::Error>
    where
        D: Deserializer<'de>,
    {
        let v: Vec<T> = Vec::<TAs>::deserialize_as(deserializer)?;
        Ok(smallvec::SmallVec::from_vec(v))
    }
}

#[serde_with::serde_as]
#[derive(Debug, serde::Deserialize)]
struct D {
    #[serde_as(deserialize_as = "SV<_>")]
    sv: smallvec::SmallVec<Arr<i16, 4>>,
}

let v = serde_json::json!({
    "sv": [1, 2, 3, 4, 5]
});
serde_json::from_value::<D>(v)?

I used Arr<T, N> here instead of [T; N], since SmallVec does not seem to support const generics yet. I don't know if the unsafe impl is even correct here. Of course you can simply duplicate the impl DeserializeAs for each desired array size.

The DeserializeAs implementation is mostly the same as Deserialize. So if you want to avoid the intermediate Vec you can of course use a Visitor which creates the SmallVec directly.

Pzixel · March 6, 2022, 5:56am

Hi. I finally got around to trying this, and for some reason it doesn't work for me. Could you please explain why this doesn't work as expected?

[src/model/api.rs:104] serde_json::from_value::<TestA>(serde_json::json!({
            "a" : "10", "b" : "0x55"
        })) = Err(
    Error("invalid type: string \"10\", expected u32", line: 0, column: 0),
)
[src/model/api.rs:105] serde_json::from_value::<TestB>(serde_json::json!({
            "a" : "10", "b" : "0x55"
        })) = Ok(
    TestB {
        a: NiceSerializer(
            10,
        ),
        b: NiceSerializer(
            85,
        ),
    },
)

Pzixel · March 6, 2022, 6:19am

Oh, apparently attribute order does matter: if serde_as comes first, it works as expected...

kekronbekron · March 6, 2022, 1:23pm

Hi @Pzixel, check if this can do what you want:
GitHub - jam1garner/binrw: A Rust crate for helping parse and rebuild binary data using ✨macro magic✨.
Docs: binrw - Rust

jonasbb · March 6, 2022, 2:21pm

Yes, attribute order matters, since proc macros are expanded in order. serde_as rewrites some serde attributes, so it must always run before the derive.

<Cow<'de, str> as Deserialize<'de>>::deserialize probably does not do what you want. You will always get a Cow::Owned, so it is simpler to deserialize into a String directly. To avoid allocations, you need a Visitor.

You can directly implement serde_with::DeserializeAs for NiceSerializer to simplify your code. You can use the existing Deserialize as the basis. That way you don't need the Into implementations and you can simplify the attribute to #[serde_as(deserialize_as = "NiceSerializer<u32>")].

I hope those comments help you out a bit.

system · June 4, 2022, 2:22pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
