Struct Referencing / Foreign Key Relationships

Hi,

I am trying to find a way to have something similar to SQL foreign key relationships with Rust structs.

I have a top-level Config struct which holds all configuration, but some of the configuration items need to reference/point to one element of a vector somewhere else in the configuration.

For multiple reasons, using an actual SQL database is not something I want to do in this situation.
Vector elements are identified by a name String field (I have not used a map, for other reasons, but it would work similarly).

Here is what I currently have, which works:

use serde::{Deserialize, Serialize};

// Configuration Struct definitions
#[derive(Serialize, Deserialize, Clone, Default, Debug)]
pub struct Config {
    pub system_a: SystemA,
    pub system_b: SystemB,
    pub thing_c: ThingC,
}

#[derive(Serialize, Deserialize, Clone, Default, Debug)]
pub struct SystemA {
    pub thing_a_list: ThingAList,
}

#[derive(Serialize, Deserialize, Clone, Default, Debug)]
pub struct SystemB {
    pub thing_b_list: Vec<ThingB>,
}

#[derive(Serialize, Deserialize, Clone, Default, Debug)]
pub struct ThingA {
    pub name: String,
    pub data: String,
}

#[derive(Serialize, Deserialize, Clone, Default, Debug)]
pub struct ThingB {
    pub name: String,
    pub thing_a: ThingAReference,
}

#[derive(Serialize, Deserialize, Clone, Default, Debug)]
pub struct ThingC {
    pub name: String,
    pub thing_a: ThingAReference,
}

// Referencing Definitions
pub trait Referenceable<T> {
    fn named_get(&self, name: String) -> T;
    fn named_exists(&self, name: String) -> bool;
}

pub type ThingAList = Vec<ThingA>;

impl Referenceable<ThingA> for ThingAList {
    fn named_get(&self, name: String) -> ThingA {
        let index = self.iter().position(|e| *e.name == name);

        match index {
            Some(i) => self[i].clone(),
            // This is fine since the config always has to be validated before committing
            None => panic!("Referenced Thing: '{:?}' does not exist ", name),
        }
    }

    fn named_exists(&self, name: String) -> bool {
        let index = self.iter().position(|e| *e.name == name);
        index.is_some()
    }
}

pub trait References<T> {
    fn get_ref(&self, config: Config) -> T;
    fn ref_exists(&self, config: Config) -> bool;
}

#[derive(Serialize, Deserialize, Clone, Default, Debug)]
#[serde(from = "String")]
#[serde(into = "String")]
pub struct ThingAReference {
    pub name: String,
}

impl Into<String> for ThingAReference {
    fn into(self) -> String {
        self.name
    }
}

impl From<String> for ThingAReference {
    fn from(value: String) -> Self {
        ThingAReference { name: value }
    }
}

impl References<ThingA> for ThingAReference {
    fn get_ref(&self, config: Config) -> ThingA {
        config.system_a.thing_a_list.named_get(self.clone().into())
    }

    fn ref_exists(&self, config: Config) -> bool {
        config
            .system_a
            .thing_a_list
            .named_exists(self.clone().into())
    }
}

fn main() {
    println!("Hello, world!");

    // Example data
    let mut config = Config {
        system_a: SystemA {
            thing_a_list: Vec::new(),
        },
        system_b: SystemB {
            thing_b_list: Vec::new(),
        },
        thing_c: ThingC {
            name: "name c".to_string(),
            thing_a: "thing_a_1".to_string().into(),
        },
    };

    config.system_a.thing_a_list.push(ThingA {
        name: "thing_a_1".to_string(),
        data: "important data 1".to_string(),
    });
    config.system_a.thing_a_list.push(ThingA {
        name: "thing_a_2".to_string(),
        data: "other important data 2".to_string(),
    });

    config.system_b.thing_b_list.push(ThingB {
        name: "thing_b_1".to_string(),
        thing_a: "thing_a_1".to_string().into(),
    });

    // get what thing_b references
    println!(
        "thing b 0 data from Thing a {:?}",
        config.system_b.thing_b_list[0]
            .thing_a
            .get_ref(config.clone())
            .data
    );

    // get what thing_c references
    println!(
        "thing c data from Thing a {:?}",
        config.thing_c.thing_a.get_ref(config.clone()).data
    );
}
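
A possible variation on the lookup above, as a sketch rather than the original code: the lookup can borrow the config instead of taking it by value, so call sites no longer need config.clone(), and a missing target becomes None instead of a panic. The structs are trimmed down here (serde derives omitted, Config flattened to a single list), so the layout is an assumption made for illustration.

```rust
// Trimmed-down stand-ins for the structs above (serde derives omitted).
#[derive(Clone, Debug, Default)]
pub struct ThingA {
    pub name: String,
    pub data: String,
}

// Assumption: flattened for brevity; the real Config nests this in SystemA.
#[derive(Clone, Debug, Default)]
pub struct Config {
    pub thing_a_list: Vec<ThingA>,
}

#[derive(Clone, Debug, Default)]
pub struct ThingAReference {
    pub name: String,
}

impl ThingAReference {
    /// Borrowing lookup: no clone of the config, None when the target is missing.
    pub fn get_ref<'a>(&self, config: &'a Config) -> Option<&'a ThingA> {
        config.thing_a_list.iter().find(|e| e.name == self.name)
    }

    /// True when the referenced ThingA exists in the given config.
    pub fn ref_exists(&self, config: &Config) -> bool {
        self.get_ref(config).is_some()
    }
}
```

With this shape, a call such as reference.get_ref(&config) returns a borrowed ThingA, and the caller decides whether a missing target is an error.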

In the actual code I have these macros so that I don't need to rewrite this for everything that can be referenced or that references something:

pub trait Referenceable<T> {
    fn named_get(&self, name: String) -> T;
    fn named_exists(&self, name: String) -> bool;
}

#[macro_export]
macro_rules! impl_referenceable_trait {
    ($typ:ident, $ele:ty) => {
        pub type $typ = Vec<$ele>;

        impl Referenceable<$ele> for $typ {
            fn named_get(&self, name: String) -> $ele {
                let index = self.iter().position(|e| *e.name == name);

                match index {
                    Some(i) => self[i].clone(),
                    // This is fine since the config always has to be validated before committing
                    None => panic!("Referenced Thing: '{:?}' does not exist ", name),
                }
            }

            fn named_exists(&self, name: String) -> bool {
                let index = self.iter().position(|e| *e.name == name);
                index.is_some()
            }
        }
    };
}

pub trait References<T> {
    fn get_ref(&self, config: Config) -> T;
    fn ref_exists(&self, config: Config) -> bool;
}

#[macro_export]
macro_rules! impl_references_trait {
    ($thing:ident, $referenced:ty, $( $path:ident ).+) => {

        #[derive(Serialize, Deserialize, Clone, Default, Debug)]
        #[serde(from = "String")]
        #[serde(into = "String")]
        pub struct $thing {
            pub name: String,
        }

        impl Into<String> for $thing {
            fn into(self) -> String {
                self.name
            }
        }

        impl From<String> for $thing {
            fn from(value: String) -> Self {
                $thing { name: value }
            }
        }

        impl References<$referenced> for $thing {
            fn get_ref(&self, config: Config) -> $referenced {
                config.$($path).+.named_get(self.clone().into())
            }

            fn ref_exists(&self, config: Config) -> bool {
                config.$($path).+.named_exists(self.clone().into())
            }
        }
    };
}

This works and I am quite happy with it, but there are a few things which I also need but can't quite figure out how to do:

  • I need to write validation that checks whether all references are valid, which can be done by calling ref_exists on every reference. Since many things in my code can reference something else, I would like to generate this validation function for the entire config when I use the references macro (or a separate one), so that I don't forget to validate a reference.
  • Something that is Referenceable also needs a function to get all the things that reference it. I don't need the struct/element itself, just its path in the config and its name, if it has one.
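
Both points could be covered by one mechanism: every struct that contains references reports them together with its path, and validation and reverse lookup are then just filters over that list. The sketch below is hand-written and uses simplified structs; the CollectRefs trait, RefEntry, all_refs, dangling_refs and referenced_by are hypothetical names, and in the real code the impls would presumably be generated by the references macro.

```rust
// Simplified stand-ins for the config structs (references stored by name).
pub struct ThingA {
    pub name: String,
}
pub struct ThingB {
    pub name: String,
    pub thing_a: String,
}
pub struct ThingC {
    pub name: String,
    pub thing_a: String,
}
pub struct Config {
    pub thing_a_list: Vec<ThingA>,
    pub thing_b_list: Vec<ThingB>,
    pub thing_c: ThingC,
}

/// One reference found in the config: where it lives and what it points at.
#[derive(Debug, PartialEq)]
pub struct RefEntry {
    pub path: String,
    pub target: String,
}

/// Hypothetical trait: a struct appends all references it contains to `out`.
pub trait CollectRefs {
    fn collect_refs(&self, path: &str, out: &mut Vec<RefEntry>);
}

impl CollectRefs for ThingB {
    fn collect_refs(&self, path: &str, out: &mut Vec<RefEntry>) {
        out.push(RefEntry { path: format!("{path}.thing_a"), target: self.thing_a.clone() });
    }
}

impl CollectRefs for ThingC {
    fn collect_refs(&self, path: &str, out: &mut Vec<RefEntry>) {
        out.push(RefEntry { path: format!("{path}.thing_a"), target: self.thing_a.clone() });
    }
}

impl Config {
    /// Walks the whole config once and gathers every reference with its path.
    pub fn all_refs(&self) -> Vec<RefEntry> {
        let mut out = Vec::new();
        for b in &self.thing_b_list {
            b.collect_refs(&format!("thing_b_list[{}]", b.name), &mut out);
        }
        self.thing_c.collect_refs("thing_c", &mut out);
        out
    }

    /// Validation: every reference whose target ThingA does not exist.
    pub fn dangling_refs(&self) -> Vec<RefEntry> {
        self.all_refs()
            .into_iter()
            .filter(|r| !self.thing_a_list.iter().any(|a| a.name == r.target))
            .collect()
    }

    /// Reverse lookup: paths of everything that references the ThingA `name`.
    pub fn referenced_by(&self, name: &str) -> Vec<String> {
        self.all_refs()
            .into_iter()
            .filter(|r| r.target == name)
            .map(|r| r.path)
            .collect()
    }
}
```

Because all_refs is the only place that enumerates fields, a reference that is registered there cannot be forgotten during validation; an empty dangling_refs result means the config is consistent.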

In most cases, the principle "make illegal states unrepresentable" works well for things that require validation. In essence, you validate everything on construction, so every other part of the code can assume it has a valid state. For example, String doesn't have a verify_utf8 method, because it is guaranteed to hold valid UTF-8 whenever it is created or modified through its safe API.
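
As a rough illustration of validating on construction, a wrapper type can be made obtainable only through a checking conversion. This is a minimal sketch under assumptions: ValidConfig, DanglingReference and the thing_c_ref field are hypothetical names, and the Config here is reduced to a single list and a single reference.

```rust
use std::convert::TryFrom;

#[derive(Clone, Debug, Default)]
pub struct ThingA {
    pub name: String,
}

// Assumption: reduced config with one list and one by-name reference.
#[derive(Clone, Debug, Default)]
pub struct Config {
    pub thing_a_list: Vec<ThingA>,
    pub thing_c_ref: String,
}

/// Wrapper with a private field: outside this module it can only be
/// obtained through `try_from`, so holding one implies validation ran.
#[derive(Debug)]
pub struct ValidConfig(Config);

/// Error carrying the name that could not be resolved.
#[derive(Debug, PartialEq)]
pub struct DanglingReference(pub String);

impl TryFrom<Config> for ValidConfig {
    type Error = DanglingReference;

    fn try_from(config: Config) -> Result<Self, Self::Error> {
        if config.thing_a_list.iter().any(|a| a.name == config.thing_c_ref) {
            Ok(ValidConfig(config))
        } else {
            Err(DanglingReference(config.thing_c_ref))
        }
    }
}

impl ValidConfig {
    /// Read-only access to the validated data.
    pub fn get(&self) -> &Config {
        &self.0
    }

    /// Resolves the reference; the target was checked in `try_from`.
    pub fn thing_c_target(&self) -> &ThingA {
        self.0
            .thing_a_list
            .iter()
            .find(|a| a.name == self.0.thing_c_ref)
            .expect("reference was validated on construction")
    }
}
```

Only read access is exposed, so the wrapped Config cannot be changed into an invalid state afterwards; any modification would go through a new Config and a new try_from call.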

To answer question two, you need to know how often you want to query the predecessors. From what I understand, a graph data structure, implemented as an adjacency matrix (see Wikipedia), could solve many of the problems you face.
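
To make the adjacency matrix idea concrete, here is a small sketch using only the standard library. RefGraph and its methods are hypothetical names; a cell edges[from][to] records that `from` references `to`, so the predecessors of a node are found by scanning its column.

```rust
/// Dense adjacency matrix: edges[from][to] == true means `from` references `to`.
pub struct RefGraph {
    names: Vec<String>,
    edges: Vec<Vec<bool>>,
}

impl RefGraph {
    /// Creates a graph with one node per name and no edges.
    pub fn new(names: &[&str]) -> Self {
        let n = names.len();
        RefGraph {
            names: names.iter().map(|s| s.to_string()).collect(),
            edges: vec![vec![false; n]; n],
        }
    }

    fn index_of(&self, name: &str) -> Option<usize> {
        self.names.iter().position(|n| n == name)
    }

    /// Records that `from` references `to`; returns false if either name is unknown.
    pub fn add_reference(&mut self, from: &str, to: &str) -> bool {
        match (self.index_of(from), self.index_of(to)) {
            (Some(f), Some(t)) => {
                self.edges[f][t] = true;
                true
            }
            _ => false,
        }
    }

    /// Predecessor query: every node that references `name` (one column scan).
    pub fn referenced_by(&self, name: &str) -> Vec<&str> {
        match self.index_of(name) {
            Some(col) => self
                .names
                .iter()
                .enumerate()
                .filter(|(row, _)| self.edges[*row][col])
                .map(|(_, n)| n.as_str())
                .collect(),
            None => Vec::new(),
        }
    }
}
```

The matrix costs memory quadratic in the number of nodes, which is only worth it if predecessor queries are frequent; for occasional queries a plain scan over the config may be enough.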

Hi, thank you for the reply.

I do like the "make illegal states unrepresentable" principle; many things in my configuration are enums with fields for that reason. Additionally, I have a system similar to SQL transactions which validates all "Changes" when trying to commit the data to the Config struct. The Config struct also allows me to easily serialize and save the config to a JSON file. Unfortunately, I cannot validate on creation, since an element is constructed and used standalone before being added to the Config struct. Also, in some cases there are multiple instances of the Config struct (current vs. pending config), where the data might be valid with one but not the other.

I have now looked at a few libraries for building a graph (petgraph/petgraph on GitHub, a graph data structure library for Rust). It does not seem to support serde for MatrixGraph, which is a bummer. Additionally, judging by the API, working with the data is going to be far more painful with a graph than with simple structs.

I will investigate using an adjacency matrix some more, but I am not sure it is worth using one in my case.

You haven't made it clear why an SQL database is not an option. It for sure would be simpler than trying to implement ACID yourself. And you'd get a query language for the predecessor problem as well.

I do not want to use an SQL database for multiple reasons:

My main reason is storage. I want all configuration to be stored as JSON on the drive so that I can copy, back up and modify the config file. Using a database would mean that on every configuration change I would have to dump the entire database and write it to disk as JSON.

Second is complexity. Having to run a database server adds a lot of extra points of failure when I don't need most of its features. As a compromise I have thought about running an in-memory database instead, but most in-memory databases are just key-value stores. The only viable options I can think of are SQLite and SurrealDB, which are relational and can run in memory.

Third is validation. I need additional validation which I can't do in SQL, or only through complicated triggers. For example: thing a can only reference thing b if field f on thing b has a certain value.
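
A rule like the one in the example is straightforward to express in plain Rust. The sketch below is only an illustration: the Mode enum standing in for field f, the RefError type and check_reference are hypothetical names, and "a certain value" is assumed to be Mode::Enabled.

```rust
/// Hypothetical stand-in for "field f" on thing b.
#[derive(Debug, PartialEq)]
pub enum Mode {
    Enabled,
    Disabled,
}

pub struct ThingB {
    pub name: String,
    pub f: Mode,
}

pub struct ThingA {
    pub name: String,
    pub thing_b: String, // reference to a ThingB by name
}

#[derive(Debug, PartialEq)]
pub enum RefError {
    /// The referenced ThingB does not exist.
    Missing { from: String, to: String },
    /// The target exists, but its field f does not allow being referenced.
    RuleViolated { from: String, to: String },
}

/// Checks existence first, then the extra rule on the target's field.
pub fn check_reference(a: &ThingA, bs: &[ThingB]) -> Result<(), RefError> {
    let target = bs
        .iter()
        .find(|b| b.name == a.thing_b)
        .ok_or_else(|| RefError::Missing {
            from: a.name.clone(),
            to: a.thing_b.clone(),
        })?;

    // Assumed rule: only targets with f == Enabled may be referenced.
    if target.f != Mode::Enabled {
        return Err(RefError::RuleViolated {
            from: a.name.clone(),
            to: a.thing_b.clone(),
        });
    }
    Ok(())
}
```

Returning a distinct error variant per failure lets the commit step report whether a reference is dangling or merely disallowed.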

So in conclusion, for SQL I would need to run something like SQLite in memory, with a wrapper which validates everything and somehow dumps the database to JSON every time it changes.