Serde FromStr on a field

I use serde/toml to read my Config struct from a file.
However, one of the fields is a regular expression, which I want to be of type Regex from the regex crate. The problem is that this crate has no serde support, so the following does not work because of the regex field:

use regex::Regex;

#[derive(serde::Deserialize)]
struct Config {
    some_configuartion: bool,
    regex: Regex,                   // this field is the problem
    more_configuration: String,
}

However, Regex implements TryFrom<String> and FromStr. So what I would like is for serde to deserialize to a String and then use one of those traits to convert it to a Regex. I expected this to work:

use regex::Regex;

#[derive(serde::Deserialize)]
struct Config {
    some_configuartion: bool,
    #[serde(try_from = "String")]   // doesn't work
    regex: Regex,
    more_configuration: String,
}

Unfortunately, it doesn't work, because try_from is not a field attribute but only a container attribute. There is a field attribute called deserialize_with, which lets you specify a custom deserialization function, but then I would need to write that function myself. (That would be boilerplate, and I expect that somebody has done it already.)

What is the best solution to this?

I appreciate any help!

You can instruct Serde to deserialize a field with a function that you provide, using the deserialize_with field attribute.

The crate serde_with is quite useful for such things:

/*
[dependencies]
regex = "1"
serde = { version = "1", features = ["derive"] }
serde_with = "3"
*/

use regex::Regex;
use serde_with::{serde_as, DisplayFromStr};

#[serde_as]
#[derive(serde::Deserialize)]
struct Config {
    some_configuartion: bool,
    #[serde_as(as = "DisplayFromStr")]
    regex: Regex,
    more_configuration: String,
}

(run on rustexplorer.com)

Yes, thanks. I also took a look at serde_with, but it seems to be overkill for my use case. It introduces new traits. I just want to use serde's with or deserialize_with functionality, but I'm looking for a clean implementation for From<...> and FromStr.

Well… let’s see if we can “steal” just the relevant code from serde_with for minimal effort:

The relevant trait impl to look into should be serde_with’s DeserializeAs implementation for DisplayFromStr, and its source can be found in the serde_with repository. Let’s just copy that and tweak it so it compiles:

use serde::de::Error as DeError;
use serde::de::Visitor;
use serde::Deserializer;
use std::fmt;
use std::fmt::Display;
use std::marker::PhantomData;
use std::str::FromStr;

fn deserialize_as<'de, T, D>(deserializer: D) -> Result<T, D::Error>
where
    T: FromStr,
    T::Err: Display,
    D: Deserializer<'de>,
{
    struct Helper<S>(PhantomData<S>);
    impl<'de, S> Visitor<'de> for Helper<S>
    where
        S: FromStr,
        <S as FromStr>::Err: Display,
    {
        type Value = S;

        fn expecting(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(formatter, "a string")
        }

        fn visit_str<E>(self, value: &str) -> Result<Self::Value, E>
        where
            E: DeError,
        {
            value.parse::<Self::Value>().map_err(DeError::custom)
        }
    }

    deserializer.deserialize_str(Helper(PhantomData))
}

Alright, this was easy: I just needed to move the generics over from the surrounding trait and add all the imports.

Now let’s give it a better name, and then try to use it!

use regex::Regex;
use serde::de::Error as DeError;
use serde::de::Visitor;
use serde::Deserializer;
use std::fmt;
use std::fmt::Display;
use std::marker::PhantomData;
use std::str::FromStr;

fn deserialize_from_str<'de, T, D>(deserializer: D) -> Result<T, D::Error>
where
    T: FromStr,
    T::Err: Display,
    D: Deserializer<'de>,
{
    struct Helper<S>(PhantomData<S>);
    impl<'de, S> Visitor<'de> for Helper<S>
    where
        S: FromStr,
        <S as FromStr>::Err: Display,
    {
        type Value = S;

        fn expecting(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(formatter, "a string")
        }

        fn visit_str<E>(self, value: &str) -> Result<Self::Value, E>
        where
            E: DeError,
        {
            value.parse::<Self::Value>().map_err(DeError::custom)
        }
    }

    deserializer.deserialize_str(Helper(PhantomData))
}

#[derive(serde::Deserialize)]
struct Config {
    some_configuartion: bool,
    #[serde(deserialize_with = "deserialize_from_str")]
    regex: Regex,
    more_configuration: String,
}

(playground)

perfect, it all compiles ^^

Wow, thank you!!

Here's a much shorter way. I haven't compared the two or played with it much at all really.

    let buf = <&str>::deserialize(deserializer)?;

Deserializing into &str fails for any input that doesn’t contain the string data literally, in a form that can be borrowed with zero copies. Whether that is the case depends on the input format. For human-readable formats like JSON, it should (I didn’t test this) depend on whether escape sequences are present, and in regexes they commonly are. For binary formats, it depends on whether the format encodes strings as UTF-8.

Deserializing into String would be more robust, but adds an extra allocating + data-copying step.

It’s unfortunate that there is so much boilerplate. E.g. with the language features I’m imagining here, this could all be a lot more compact.

Ah, I see, makes sense.[1] One could make it marginally more robust at some cost by attempting &str then String I suppose.

Thanks for the breakdown.


  1. And yeah, JSON is not zero-copy when escapes are present. ↩︎

Of course even without any of this, some of the boilerplate (specifically around the generic arguments) can be avoided by using this just for Regex:

use regex::Regex;
use serde::de;
use serde::Deserializer;
use std::fmt;

fn deserialize_regex_from_str<'de, D>(deserializer: D) -> Result<Regex, D::Error>
where
    D: Deserializer<'de>,
{
    struct V;
    impl<'de> de::Visitor<'de> for V {
        type Value = Regex;

        fn expecting(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(formatter, "a string")
        }

        fn visit_str<E>(self, value: &str) -> Result<Regex, E>
        where
            E: de::Error,
        {
            value.parse().map_err(de::Error::custom)
        }
    }

    deserializer.deserialize_str(V)
}

#[derive(serde::Deserialize)]
struct Config {
    some_configuartion: bool,
    #[serde(deserialize_with = "deserialize_regex_from_str")]
    regex: Regex,
    more_configuration: String,
}

Also, I’m noticing that "a string" could probably be changed to "a regex" to improve the error message on deserialization failure.


Another thing I’ve noticed is that serde itself internally uses a similar visitor for a few types’ Deserialize implementations: the visitor is defined here and used in places such as this.