Can I make a shortcut for a Serde attribute?

Hi,

I have a super hairy #[serde(bound(deserialize = "very-very-long-text-here"))] clause that I would have to copy and paste in many places in my code.

I am looking for a way to make a shortcut out of it. Is there a way to define a macro or an attribute that would expand to that attribute? Are such multiple passes of macro expansion possible?

Replacing just a string would be enough too, as the rest is short enough.

Thank you!

You should probably define a newtype and implement Deserialize for it manually using the appropriate bounds.

I agree with @h2co3 that this may well be an XY problem: what is your long bound? There is often some trait trickery that lets one shorten trait bounds 🙂


Regarding the OP per se, you cannot alias a derive helper attribute by itself, because of the way these are implemented: they are not semantic attributes but syntactic markers. They do look like attributes, so we could call them "phony attributes"; officially they are dubbed inert. They exist only for the actual macro (here, the {De,}Serialize derive(s)) to interpret.

One thing you could do, however, is alias the

#[derive(Deserialize)]
#[serde(bound(deserialize = "..."))]

altogether, using, for instance (disclaimer: crate of mine), attribute_alias!:

#[macro_use]
extern crate macro_rules_attribute;

use serde::Serialize; // needed for the extra derive on Foo below

attribute_alias! {
    #[apply(Deserialize_bounded)] =
        #[derive(::serde::Deserialize)]
        #[serde(bound(deserialize = "..."))]
    ;
}

#[apply(Deserialize_bounded)]
#[derive(Serialize)] // <- you can still have other derives
struct Foo<T> {
    #[serde(skip)] // <- and other attrs
    phantom: ::core::marker::PhantomData<T>,

    // ...
}
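
If pulling in a crate is not an option, a plain `macro_rules!` wrapper can stamp the same attributes onto an item. The following is only a sketch: `with_common_attrs!`, `Point` and `demo_point` are hypothetical names, and std derives stand in for the Serde ones so that it compiles without dependencies. The comment marks where the `#[derive(::serde::Deserialize)]` / `#[serde(bound(...))]` pair would presumably go.

```rust
// Hypothetical helper: wraps one item and prepends a fixed set of attributes.
macro_rules! with_common_attrs {
    ($($item:tt)*) => {
        // With Serde, this spot could instead hold
        //   #[derive(::serde::Deserialize)]
        //   #[serde(bound(deserialize = "..."))]
        // (assumption: not exercised here; std derives are a stand-in).
        #[derive(Debug, Clone, PartialEq, Default)]
        $($item)*
    };
}

with_common_attrs! {
    struct Point {
        x: i32,
        y: i32,
    }
}

// Small demonstration that the derives were really applied.
fn demo_point() -> String {
    let p = Point { x: 1, y: 2 };
    assert_eq!(p.clone(), p);
    format!("{:?}", p)
}
```

The `macro_rules!` invocation is expanded first and the derive then runs on its output, which is why this kind of two-step expansion works. Attributes written inside the wrapped item are passed through untouched.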

Thank you for the answer, I will try that.

Re: the XY problem, since you want to know more: I am still struggling with the problem I described in my other question. I am trying to develop a two-phase config parsing system for TOML using Serde.

The key thing is placeholder support. Any key, instead of a=2, could be a="${foo}"; at the same time it can be just a=2 too. The application that uses this system is also split into two phases: one where these variables are not known yet, and one where they are known and the Config is "processed". I want to separate those two phases by type at compile time, so that once the Config has been processed, it has no placeholders and all values are fully resolved as plain types.

In the following code I use Input and Final as "tags" for the two forms of Config. The key thing is the Field, which is deserializable only in its Field<Input, T> form, but not in the other. The process function represents variable replacement, but here it takes no params. Just imagine a HashMap of values passed there.

AFAIK this is more difficult to do without specialization as a Rust feature, so I settled on supporting just 3 field types for now. It does not look great yet, but at least it works. The bound requirement I need to put on every struct is a bit much, though:

#![allow(dead_code)]

use anyhow::Context;
use serde::{de, de::Visitor, Deserialize};
use std::ops::{Deref, DerefMut};
use std::str::FromStr;
use std::{
    fmt::{Debug, Formatter},
    marker::PhantomData,
};

#[derive(Debug, Clone)]
struct Final;

#[derive(Debug, Clone, Deserialize)]
struct Input;

#[derive(Debug, Clone)]
enum FieldData<T> {
    Final(T),
    Input(String),
}

#[derive(Clone)]
struct Field<S, T: Clone> {
    data: FieldData<T>,
    state: PhantomData<S>,
}

trait FieldConstructable {
    fn from_string(s: &str) -> anyhow::Result<Field<Input, Self>>
    where
        Self: Clone;
    fn from_i64(v: i64) -> anyhow::Result<Field<Input, Self>>
    where
        Self: Clone;
    fn from_f64(v: f64) -> anyhow::Result<Field<Input, Self>>
    where
        Self: Clone;
}

impl FieldConstructable for String {
    fn from_string(s: &str) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_input(s))
    }
    fn from_i64(v: i64) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_input(v.to_string()))
    }
    fn from_f64(v: f64) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_input(v.to_string()))
    }
}

impl FieldConstructable for f64 {
    fn from_string(s: &str) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_input(s))
    }
    fn from_i64(v: i64) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_final(v as f64))
    }
    fn from_f64(v: f64) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_final(v))
    }
}

impl FieldConstructable for i64 {
    fn from_string(s: &str) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_input(s))
    }
    fn from_i64(v: i64) -> anyhow::Result<Field<Input, Self>> {
        Ok(Field::new_final(v))
    }
    fn from_f64(_: f64) -> anyhow::Result<Field<Input, Self>> {
        Err(anyhow::anyhow!("Expected integer, got float"))
    }
}

struct FieldVisitor<T: Clone> {
    pd: PhantomData<fn() -> T>,
}

impl<'de, T: Clone + FieldConstructable> Visitor<'de> for FieldVisitor<T> {
    type Value = Field<Input, T>;

    fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
        formatter.write_str(&format!(
            "`{}` or a placeholder expression \"${{...}}\"",
            std::any::type_name::<T>()
        ))
    }

    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        FieldConstructable::from_string(&v).map_err(de::Error::custom)
    }

    fn visit_string<E>(self, v: String) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        FieldConstructable::from_string(&v).map_err(de::Error::custom)
    }

    fn visit_i64<E>(self, v: i64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        FieldConstructable::from_i64(v).map_err(de::Error::custom)
    }

    fn visit_f64<E>(self, v: f64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        FieldConstructable::from_f64(v).map_err(de::Error::custom)
    }
}

impl<'de, T: Clone + FieldConstructable> Deserialize<'de> for Field<Input, T> {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        deserializer.deserialize_any(FieldVisitor { pd: PhantomData })
    }
}

#[derive(thiserror::Error, Debug)]
enum ProcessingError {
    #[error("unable to process {field:?}, value={value:?}")]
    CannotProcessField {
        field: String,
        value: String,
        #[source]
        source: Box<dyn std::error::Error + 'static + Sync + Send>,
    },
}

impl<S, T: Default + Clone> Default for Field<S, T> {
    fn default() -> Self {
        Field {
            data: FieldData::Final(Default::default()),
            state: PhantomData,
        }
    }
}

impl<S, T: Debug + Clone> Debug for Field<S, T> {
    fn fmt(&self, f: &mut Formatter) -> Result<(), std::fmt::Error> {
        match &self.data {
            FieldData::Final(p) => write!(f, "{:?}", p),
            FieldData::Input(e) => write!(f, "expr({:?})", e),
        }
    }
}
impl<T: Clone> Field<Final, T> {
    fn new(t: T) -> Self {
        Self {
            data: FieldData::Final(t),
            state: PhantomData,
        }
    }
}

impl<T: Clone + FromStr> Field<Input, T> {
    fn new_input<S: Into<String>>(expr: S) -> Self {
        Self {
            data: FieldData::Input(expr.into()),
            state: PhantomData,
        }
    }
    fn new_final<S: Into<T>>(t: S) -> Self {
        Self {
            data: FieldData::Final(t.into()),
            state: PhantomData,
        }
    }

    fn process(&self, name: &str) -> Result<Field<Final, T>, ProcessingError>
    where
        <T as FromStr>::Err: 'static + std::error::Error + Sync + Send,
    {
        match &self.data {
            FieldData::Input(s) => match s.parse::<T>() {
                Ok(t) => Ok(Field::new(t)),
                Err(e) => Err(ProcessingError::CannotProcessField {
                    field: name.into(),
                    value: s.clone(),
                    source: Box::new(e),
                }),
            },
            FieldData::Final(f) => Ok(Field::new(f.clone())),
        }
    }
}

impl<T: Clone> Deref for Field<Final, T> {
    type Target = T;
    fn deref(&self) -> &T {
        match &self.data {
            FieldData::Final(p) => p,
            FieldData::Input(e) => {
                panic!("Impossible input field {:?} is marked final", e);
            }
        }
    }
}

impl<T: Clone> DerefMut for Field<Final, T> {
    fn deref_mut(&mut self) -> &mut T {
        match &mut self.data {
            FieldData::Final(p) => p,
            FieldData::Input(e) => {
                panic!("Impossible input field {:?} is marked final", e);
            }
        }
    }
}

// Configuration

#[derive(Debug, Clone, Default, Deserialize)]
#[serde(bound(
    deserialize = "Field<S, String>: Deserialize<'de>, Field<S, i64>: Deserialize<'de>, Field<S, f64>: Deserialize<'de>"
))]
pub struct Child<S: Clone> {
    c: Field<S, f64>,
}

impl Child<Input> {
    fn process(&self) -> anyhow::Result<Child<Final>> {
        Ok(Child {
            c: self.c.process("c")?,
        })
    }
}

impl From<String> for Field<Input, String> {
    fn from(s: String) -> Self {
        Field::new_input(s)
    }
}

#[derive(Debug, Clone, Default, Deserialize)]
#[serde(bound(
    deserialize = "Field<S, String>: Deserialize<'de>, Field<S, i64>: Deserialize<'de>, Field<S, f64>: Deserialize<'de>"
))]
pub struct Config<S: Clone> {
    a: Field<S, i64>,
    #[serde(default = "x", bound = "Field<S, String>:From<String>")]
    b: Field<S, String>,
    c: Option<Field<S, String>>,
    child: Child<S>,
}

fn x<S>() -> Field<S, String>
where
    Field<S, String>: From<String>,
{
    "default_b".to_owned().into()
}

impl Config<Input> {
    fn process(&self) -> anyhow::Result<Config<Final>> {
        Ok(Config {
            a: self.a.process("a")?,
            b: self.b.process("b")?,
            c: self
                .c
                .as_ref()
                .map(|c| c.process("c"))
                .map_or(Ok(None), |r| r.map(Some))?,
            child: self.child.process().context("Unable to process child")?,
        })
    }
}

fn main() -> anyhow::Result<()> {
    let s = r#"
a = 2233
b = "hello"
c = "ccc"
[child]
c=2333
"#;
    let c: Config<Input> = toml::from_str(&s)?;
    println!("{:#?}", c);
    let mut c: Config<Final> = c.process()?;
    println!("{:#?}", c);
    *c.a = 555;
    println!("{:#?}", c);
    Ok(())
}

Not the point of this question, but any tips on how to approach this are welcome. Once most of it is moved to a separate module it is clean enough, except for those bounds everywhere. Thanks!
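
To make the "imagine a HashMap of values" part concrete, here is a minimal sketch of placeholder expansion using only std. The function name `expand`, the `String` error type and the exact handling of the `${name}` syntax are assumptions for illustration; none of them come from the code above.

```rust
use std::collections::HashMap;

/// Replace every `${name}` in `input` with the matching entry of `vars`.
/// Hypothetical helper: fails on unknown names and unterminated placeholders.
fn expand(input: &str, vars: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy the literal text in front of the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated placeholder in {input:?}"))?;
        let name = &after[..end];
        let value = vars
            .get(name)
            .ok_or_else(|| format!("unknown variable {name:?}"))?;
        out.push_str(value);
        // Continue scanning after the closing brace.
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

A `process` that receives the variables could call something like this first and only then hand the expanded text to `str::parse::<T>()`, much as the parameterless version does today.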

It doesn't look like you should need either specialization or such long-winded bounds for this. I'd do this with one more level of indirection, and then simply implement Deserialize differently for the concrete possibilities of the typestate "tag" type.
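
One possible reading of the "level of indirection" idea, offered as a sketch rather than as what was necessarily meant: let the tag type choose the stored representation through a generic associated type, so that a `Final` field is a plain `T` and the panicking `Deref` branch disappears. All names here (`State`, `Repr`, `Raw`, `resolve`) are hypothetical, errors are plain `String`s, and Serde is left out so the block has no dependencies; a `Deserialize` impl would presumably attach to `Raw<T>`.

```rust
use std::collections::HashMap;
use std::str::FromStr;

// A value as written in the file: a literal, or the name inside `${...}`.
#[derive(Debug, Clone, PartialEq)]
enum Raw<T> {
    Value(T),
    Placeholder(String),
}

// The tag type picks how a field of logical type `T` is stored.
trait State {
    type Repr<T>;
}

struct Input;
struct Final;

impl State for Input {
    type Repr<T> = Raw<T>;
}
impl State for Final {
    type Repr<T> = T;
}

struct Config<S: State> {
    a: S::Repr<i64>,
    b: S::Repr<String>,
}

// Turn one raw field into its final value, looking placeholders up in `vars`.
fn resolve<T>(raw: &Raw<T>, vars: &HashMap<String, String>) -> Result<T, String>
where
    T: Clone + FromStr,
{
    match raw {
        Raw::Value(v) => Ok(v.clone()),
        Raw::Placeholder(name) => {
            let text = vars
                .get(name)
                .ok_or_else(|| format!("unknown variable {name:?}"))?;
            text.parse::<T>()
                .map_err(|_| format!("cannot parse {text:?} for {name:?}"))
        }
    }
}

impl Config<Input> {
    fn process(&self, vars: &HashMap<String, String>) -> Result<Config<Final>, String> {
        Ok(Config::<Final> {
            a: resolve(&self.a, vars)?,
            b: resolve(&self.b, vars)?,
        })
    }
}
```

This separates the two phases at compile time, but a derived `Deserialize` on `Config<S>` would likely still need some bound on `S::Repr<...>`, so on its own it does not answer the attribute question.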

A level of indirection inside the Field? Usually the issue is deriving Deserialize on something generic, where the generic parameter is actually a finite "enum" of types.

I've been poking at a very similar idea, though with YAML and technically three phases (raw parse, expansion of "extends" and rules, then variable substitution). At least in my case the structure has been sufficiently different between each phase that I've not needed to try and share anything: I just use serde_yaml::Value to avoid parsing anything that I don't want to or can't deal with.

Giving up on parsing straight to a struct is definitely something I've considered. But it lacks any validation beyond the basic checks done at the time of initial parsing to toml::Table. Even the simplest errors, like an obvious type mismatch, will only be found at the first processing phase, which may run some time after the app has started.

Maybe it is not so bad. An error in a placeholder is probably more likely AND unavoidable anyway. This is my ultimate Plan B if everything else fails.

In my case I don't really have a choice, as until it's expanded it's not valid, and there's a pretty decent split between information needed to expand and the final expanded result, so it works out pretty well.