Recursive #[serde(deserialize_with)] definition

Hi everyone. My actual problem is too big to post here, so I have made the following MRE. It may not show WHY I need this, but I assure you I do.

Consider the following snippet:

#[derive(Deserialize)]
struct Foo {
    #[serde(deserialize_with = "deserialize_my_i32")]
    bar: i32,
    #[serde(deserialize_with = "?????")]
    other_bars: Vec<i32>,
    also_bars: Vec<i32>
}

fn deserialize_my_i32<'de, D>(_deserializer: D) -> Result<i32, D::Error> where D: Deserializer<'de> {
    // Placeholder: the real implementation would read from the deserializer.
    Ok(10)
}

So: I have a custom deserializer for i32 that performs additional logic (e.g. parsing from different bases), and I want some fields to use it (recursively!) and some not. The problem is that this code fails to compile, because the function only produces an i32 but I apply the attribute to a Vec. The desired result here is a Vec full of 10s, while the other field should use the default deserializer and be unaffected.

Newtypes have the desired behavior: they work recursively. But they tend to duplicate every model in my codebase, which is a pain, so I'd like to avoid them if possible.

I'm sorry if I didn't formulate this clearly; it's late at night here and my mind is a bit foggy.
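For readers wondering what the "additional logic" might look like, here is a minimal std-only sketch of base-aware integer parsing. The name parse_i32_any_base and the accepted prefixes (0x, 0o, 0b) are assumptions made for illustration; they are not taken from the original code, and a real deserialize_with function would still have to obtain the string from the deserializer first.

```rust
use std::num::ParseIntError;

/// Hypothetical helper (not from the original post): parses an i32 written
/// in decimal, or with a `0x`, `0o` or `0b` prefix.
fn parse_i32_any_base(text: &str) -> Result<i32, ParseIntError> {
    let text = text.trim();
    if let Some(hex) = text.strip_prefix("0x") {
        i32::from_str_radix(hex, 16)
    } else if let Some(oct) = text.strip_prefix("0o") {
        i32::from_str_radix(oct, 8)
    } else if let Some(bin) = text.strip_prefix("0b") {
        i32::from_str_radix(bin, 2)
    } else {
        // No recognised prefix: fall back to plain decimal parsing.
        text.parse::<i32>()
    }
}
```

A helper like this could be called from inside the custom deserializer once the raw string has been read.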


In a nutshell, I'd like this code to compile:

use serde::Deserializer;
use serde::Deserialize;

fn my_cool_des_u8<'de, D>(deserializer: D) -> Result<u8, D::Error> where D: Deserializer<'de> {
    todo!()
}

fn my_cool_des_i16<'de, D>(deserializer: D) -> Result<i16, D::Error> where D: Deserializer<'de> {
    todo!()
}

#[derive(Deserialize)]
struct Foo {
    #[serde(deserialize_with = "my_cool_des_u8")]
    small: u8,
    #[serde(deserialize_with = "my_cool_des_i16")]
    avg: i16,
    #[serde(deserialize_with = "my_cool_des_u8")]
    big: Vec<u8>,
    #[serde(deserialize_with = "my_cool_des_i16")]
    huge: Vec<i16>,
    unrelated: Vec<i16> // note no attribute here - should use default deserializer
}

You can try the serde_with crate. The syntax is slightly different, but you can use it with most std collection types.

You can write your example like this:

use serde::Deserializer;
use serde::Deserialize;

struct MyCoolDesi16;
impl<'de> serde_with::DeserializeAs<'de, i16> for MyCoolDesi16 {
    fn deserialize_as<D>(deserializer: D) -> Result<i16, D::Error>
    where
        D: Deserializer<'de>,
    {
        todo!()
    }
}

#[serde_with::serde_as]
#[derive(Deserialize)]
struct Foo {
    #[serde_as(deserialize_as = "MyCoolDesi16")]
    avg: i16,
    #[serde_as(deserialize_as = "Vec<MyCoolDesi16>")]
    huge: Vec<i16>,
    unrelated: Vec<i16> // note no attribute here - should use default deserializer
}

Thank you so much, stranger. This is really awesome.

Unfortunately it doesn't work for custom types: for example, I'd like to deserialize into an fnv::FnvHashSet, but it doesn't implement the required traits, and I can't implement them myself because of the orphan rules. It will work for 99% of cases though, I suppose.

The DeserializeAs trait is always implementable for a local type and allows you to "extend" any type without worrying about orphan rules. Concretely, this means you can write this:

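// PhantomData is needed because the type parameter is otherwise unused.
struct MyCustomHashSetDeserialize<U>(std::marker::PhantomData<U>);
impl<'de, T, U> serde_with::DeserializeAs<'de, FnvHashSet<T>> for MyCustomHashSetDeserialize<U>
where
    U: serde_with::DeserializeAs<'de, T>,
{
    // fn deserialize_as(...) goes here
}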

// usage
#[serde_as(deserialize_as = "MyCustomHashSetDeserialize<MyCoolDesi16>")]
huge: FnvHashSet<i16>,

In the case of fnv::FnvHashSet, which is only an alias for std::collections::HashSet with a different hasher, the missing support is hopefully just an oversight: the existing impl does not account for the hasher type parameter. Opening an issue would be a good first step towards getting that fixed.
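The local-marker-type idea can be modelled without serde at all. The sketch below is a toy analogy, not serde_with's actual code: Value, ConvertAs, HexOrDec, CustomSet and strs are all hypothetical names invented for illustration. It shows why an adapter trait sidesteps the orphan rules (the marker type is local) and how an impl that is generic over the hasher parameter S covers aliases such as FnvHashSet.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{BuildHasher, BuildHasherDefault, Hash};

/// Toy input format standing in for a real deserializer (hypothetical).
enum Value {
    Str(String),
    List(Vec<Value>),
}

/// Toy adapter trait: `Self` describes HOW to build a `T` from a `Value`.
trait ConvertAs<T> {
    fn convert_as(value: &Value) -> Result<T, String>;
}

/// Local marker type: accepts decimal or 0x-prefixed strings.
struct HexOrDec;

impl ConvertAs<i16> for HexOrDec {
    fn convert_as(value: &Value) -> Result<i16, String> {
        match value {
            Value::Str(s) => {
                let parsed = match s.strip_prefix("0x") {
                    Some(hex) => i16::from_str_radix(hex, 16),
                    None => s.parse::<i16>(),
                };
                parsed.map_err(|e| e.to_string())
            }
            Value::List(_) => Err("expected a string".to_string()),
        }
    }
}

/// `Vec<U>` adapts `Vec<T>` whenever `U` adapts `T`: this is the recursion.
impl<T, U> ConvertAs<Vec<T>> for Vec<U>
where
    U: ConvertAs<T>,
{
    fn convert_as(value: &Value) -> Result<Vec<T>, String> {
        match value {
            Value::List(items) => items
                .iter()
                .map(|item| <U as ConvertAs<T>>::convert_as(item))
                .collect(),
            Value::Str(_) => Err("expected a list".to_string()),
        }
    }
}

/// Generic over the hasher `S`, so any `HashSet` alias is covered too.
impl<T, U, S> ConvertAs<HashSet<T, S>> for HashSet<U, S>
where
    T: Eq + Hash,
    U: ConvertAs<T>,
    S: BuildHasher + Default,
{
    fn convert_as(value: &Value) -> Result<HashSet<T, S>, String> {
        match value {
            Value::List(items) => items
                .iter()
                .map(|item| <U as ConvertAs<T>>::convert_as(item))
                .collect(),
            Value::Str(_) => Err("expected a list".to_string()),
        }
    }
}

/// Stand-in for an alias like FnvHashSet: a HashSet with a non-default hasher.
type CustomSet<T> = HashSet<T, BuildHasherDefault<DefaultHasher>>;

/// Small helper that builds a `Value::List` of strings.
fn strs(items: &[&str]) -> Value {
    Value::List(items.iter().map(|s| Value::Str((*s).to_string())).collect())
}
```

With these definitions, `<Vec<HexOrDec> as ConvertAs<Vec<i16>>>::convert_as(&strs(&["10", "0x55"]))` applies the custom parsing to every element, while fields that never mention HexOrDec are unaffected.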


The DeserializeAs trait is always implementable for a local type

Well, this is exactly the point. How does one implement a custom deserializer for, say, SmallVec<[i32; 4]>?

I wonder if it's even possible to implement this at the serde_with level without GATs...

use serde::*;
use std::marker::PhantomData;

struct Arr<T, const N: usize>(PhantomData<[T; N]>);
unsafe impl<T, const N: usize> smallvec::Array for Arr<T, N> {
    type Item = T;
    fn size() -> usize {
        N
    }
}

struct SV<T>(PhantomData<T>);
impl<'de, TAs, T, const N: usize> serde_with::DeserializeAs<'de, smallvec::SmallVec<Arr<T, N>>> for SV<TAs>
where
    TAs: serde_with::DeserializeAs<'de, T>,
{
    fn deserialize_as<D>(deserializer: D) -> Result<smallvec::SmallVec<Arr<T, N>>, D::Error>
    where
        D: Deserializer<'de>,
    {
        let v: Vec<T> = Vec::<TAs>::deserialize_as(deserializer)?;
        Ok(smallvec::SmallVec::from_vec(v))
    }
}

#[serde_with::serde_as]
#[derive(Debug, serde::Deserialize)]
struct D {
    #[serde_as(deserialize_as = "SV<_>")]
    sv: smallvec::SmallVec<Arr<i16, 4>>,
}

let v = serde_json::json!({
    "sv": [1, 2, 3, 4, 5]
});
serde_json::from_value::<D>(v)?

I used Arr<T, N> here instead of [T; N], since SmallVec does not seem to support const generics yet. I don't know whether the unsafe impl is even correct here. Alternatively, you can simply duplicate the DeserializeAs impl for each array size you need.

The DeserializeAs implementation is mostly the same as Deserialize. So if you want to avoid the intermediate Vec you can of course use a Visitor which creates the SmallVec directly.


Hi. I finally got around to trying it, and for some reason it doesn't work for me. Could you please explain why this doesn't work as expected?

[src/model/api.rs:104] serde_json::from_value::<TestA>(serde_json::json!({
            "a" : "10", "b" : "0x55"
        })) = Err(
    Error("invalid type: string \"10\", expected u32", line: 0, column: 0),
)
[src/model/api.rs:105] serde_json::from_value::<TestB>(serde_json::json!({
            "a" : "10", "b" : "0x55"
        })) = Ok(
    TestB {
        a: NiceSerializer(
            10,
        ),
        b: NiceSerializer(
            85,
        ),
    },
)

Oh, apparently attribute order does matter: if serde_as comes first, it works as expected...

Hi @Pzixel, check if this can do what you want:
GitHub - jam1garner/binrw: A Rust crate for helping parse and rebuild binary data using ✨macro magic✨.
Docs: binrw - Rust

Yes, attribute order matters, since proc_macros are expanded in order. serde_as rewrites some serde attributes, so it must always run before the derive.

<Cow<'de, str> as Deserialize<'de>>::deserialize probably does not do what you want. You will always get a Cow::Owned, so it is simpler to deserialize into a String directly. To avoid allocations you need a Visitor.

You can directly implement serde_with::DeserializeAs for NiceSerializer to simplify your code. You can use the existing Deserialize as the basis. That way you don't need the Into implementations and you can simplify the attribute to #[serde_as(deserialize_as = "NiceSerializer<u32>")].

I hope those comments help you out a bit.