Extracting data with saphyr

G'day!

I spent the afternoon hacking on this. Given the following YAML:

settings:
    endpoint: "http://localhost/api/search/instant"
    limit: 50
    site-language: "en"
    restrict: "all"
    selected-languages: [ "lzh", "en", "pgd", "kho", "pli", "pra", "san", "xct", "xto", "uig" ]
    match-partial: false

I've managed to get the endpoint out and into my Settings struct:

use std::borrow::Cow;

use anyhow::{anyhow, Context, Result};
use saphyr::{Scalar, Yaml};

impl TryFrom<&Yaml<'_>> for Settings {
    type Error = anyhow::Error;

    fn try_from(yaml: &Yaml) -> Result<Self> {
        if let Yaml::Mapping(settings) = &yaml["settings"] {
            let endpoint_key = &Yaml::Value(Scalar::String(Cow::from("endpoint")));

            let endpoint = settings
                .get(endpoint_key)
                .context("Missing endpoint")?
                .as_str()
                .context("Endpoint not a string")?
                .to_string();

            Ok(Settings {
                endpoint,
                limit: 50,
                site_language: "en".to_string(),
                restrict: "all".to_string(),
                selected_languages: vec!["en".to_string()],
                match_partial: false,
            })
        } else {
            Err(anyhow!("Settings is not a mapping"))
        }
    }
}

That endpoint_key looks a bit funky. Am I on the right track?

Answering my own question: yes, the endpoint_key is a bit verbose. An issue has been raised:

Again, I'm a Rust newbie, but I'm thinking of creating my own StringKey type that can be converted to Yaml::Value(Scalar::String(Cow::from("some sort of static string here"))), so that a reference to the result can be passed to get().

Not entirely sure how to do that, but it's an interesting challenge.
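
Here is a rough sketch of how the StringKey idea might look. It is only an illustration: Scalar and Yaml below are local stand-ins that mimic the shape of the saphyr enums used above so the snippet compiles without the crate, and StringKey, ENDPOINT, LIMIT and get_str are hypothetical names, not part of saphyr. With the real crate, an impl From<StringKey> for Yaml should be permitted because StringKey is a local type, but I have not verified that against saphyr itself.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Stand-ins for the saphyr enums (assumption: same shape, far fewer variants).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Scalar<'a> {
    String(Cow<'a, str>),
    Integer(i64),
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Yaml<'a> {
    Value(Scalar<'a>),
}

// Hypothetical newtype wrapping a key that is known at compile time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct StringKey(&'static str);

impl From<StringKey> for Yaml<'static> {
    fn from(key: StringKey) -> Self {
        // Cow::Borrowed means no allocation is needed for a static string.
        Yaml::Value(Scalar::String(Cow::Borrowed(key.0)))
    }
}

const ENDPOINT: StringKey = StringKey("endpoint");
const LIMIT: StringKey = StringKey("limit");

// Looks up `key` and returns the value only if it is a string scalar.
fn get_str<'m>(
    map: &'m HashMap<Yaml<'static>, Yaml<'static>>,
    key: StringKey,
) -> Option<&'m str> {
    match map.get(&Yaml::from(key))? {
        // `&**s` turns the &Cow<str> into a plain &str.
        Yaml::Value(Scalar::String(s)) => Some(&**s),
        _ => None,
    }
}

fn main() {
    let mut settings = HashMap::new();
    settings.insert(
        Yaml::from(ENDPOINT),
        Yaml::Value(Scalar::String(Cow::from("http://localhost/api/search/instant"))),
    );
    settings.insert(Yaml::from(LIMIT), Yaml::Value(Scalar::Integer(50)));

    println!("{:?}", get_str(&settings, ENDPOINT));
    // `limit` is present but not a string, so this lookup yields None.
    println!("{:?}", get_str(&settings, LIMIT));
}
```

The conversion hides the Yaml::Value(Scalar::String(...)) wrapping behind a single .into() or Yaml::from call, so each key only has to be spelled out once as a constant.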

Happy hacking!

The saphyr parser is also available through the serde-saphyr crate, which might be easier. I wrote a quick test that worked fine on the first try:

use anyhow::Context;
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct Settings {
    pub endpoint: String,
    pub limit: usize,
    #[serde(rename = "site-language")]
    pub site_language: String,
    pub restrict: String,
    #[serde(rename = "selected-languages")]
    pub selected_languages: Vec<String>,
    #[serde(rename = "match-partial")]
    pub match_partial: bool,
}

#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
struct Root {
    pub settings: Settings,
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parse_and_assert_settings() -> anyhow::Result<()> {
        let yaml = r#"
settings:
    endpoint: "http://localhost/api/search/instant"
    limit: 50
    site-language: "en"
    restrict: "all"
    selected-languages: [ "lzh", "en", "pgd", "kho", "pli", "pra", "san", "xct", "xto", "uig" ]
    match-partial: false
"#;

        let root: Root = serde_saphyr::from_str(yaml)
            .with_context(|| "Failed to deserialize YAML into Root")?;
        let settings = root.settings;

        // Exact assertions
        assert_eq!(settings.endpoint, "http://localhost/api/search/instant");
        assert_eq!(settings.limit, 50);
        assert_eq!(settings.site_language, "en");
        assert_eq!(settings.restrict, "all");
        assert_eq!(settings.match_partial, false);

        // Languages list equality (order is preserved from YAML)
        let expected = vec![
            "lzh","en","pgd","kho","pli","pra","san","xct","xto","uig"
        ].into_iter().map(String::from).collect::<Vec<_>>();
        assert_eq!(settings.selected_languages, expected);

        Ok(())
    }
}

Since some of your keys are not valid Rust identifiers, you need #[serde(rename = "...")] on those fields, but that's it. Keys do not need to look like strings at all; they can also be numbers. This also works fine:

settings:
    endpoint: "http://localhost/api/search/instant"
    1000: 1001

#[derive(Serialize, Deserialize)]
pub struct Settings {
    endpoint: String,
    #[serde(rename = "1000")]
    pub the_thousand: usize
}

Thanks for answering my question.

In the end I went with TOML instead. It is quite nifty!