Deeply nested generic structs

Hello,

I'm relatively new to Rust and I'm trying to implement a nested data structure to represent relational data I fetch from a database.
For example:

  • there are two tables, "tenants" and "users"
  • a tenant has many users
  • a user belongs to a tenant

The data structure should allow relations to be optional because, for example, I don't always fetch a user together with its tenant relation.

My attempt to implement this data structure looks like this:

use std::fmt;
use uuid::Uuid; // assuming the `uuid` crate

// Tenant traits
pub trait TenantProps: fmt::Debug {
    fn id(&self) -> &Uuid;
    fn name(&self) -> &String;
}

pub trait RelToUsers: fmt::Debug {
    fn users(&self) -> &Vec<User>;
}

// Tenant struct
#[derive(Debug)]
pub struct Tenant {
    id          : Uuid,
    name        : String,
    users       : Option<Vec<User>>
}

// Tenant implementations
impl TenantProps for Tenant {
    fn id(&self) -> &Uuid { &self.id }
    fn name(&self) -> &String { &self.name }
}

impl RelToUsers for Tenant {
    fn users(&self) -> &Vec<User> { &self.users.as_ref().unwrap() }
}

// User traits
pub trait UserProps: fmt::Debug {
    fn id(&self) -> &Uuid;
    fn name(&self) -> &String;
}

pub trait RelToTenant: fmt::Debug {
    fn tenant(&self) -> &Tenant;
}

// User struct
#[derive(Debug)]
pub struct User {
    id          : Uuid,
    name        : String,
    tenant      : Option<Tenant>
}

// User implementations
impl UserProps for User {
    fn id(&self) -> &Uuid { &self.id }
    fn name(&self) -> &String { &self.name }
}

impl RelToTenant for User {
    fn tenant(&self) -> &Tenant { &self.tenant.as_ref().unwrap() }
}

Usage then looks like this:

fn find_tenant_with_users(id: &String) -> impl TenantProps + RelToUsers { ... }

let tenant = find_tenant_with_users(&tenant_id);
// "tenant" has the "users()" method

The problem is that this only works one level deep. The data structure should also make it possible to specify "how much" of the user is in the tenant's users array. To represent a tenant with its "users" relation where only the users' properties were fetched, the tenant's "users" field should be of type Option<Vec<impl UserProps>>. But if a tenant's users were fetched with their "tenant" relation, the tenant's "users" field should be of type Option<Vec<impl UserProps + RelToTenant>>.

I don't even know if Rust allows for such a data structure I'm trying to implement because I already tried some things (like passing traits as generic parameters) but nothing really worked out.
Therefore I appreciate any help on this topic. Thanks in advance!

It's unclear to me what you want to achieve. What does ""how much" of the user is in the tenants users array" mean?

You have only one concrete implementation, and yet your design is very abstract. It seems like it's trying to emulate duck typing or multiple inheritance?
Rust doesn't make abstraction cheap: there's a high complexity tax when you hide implementation details behind traits. My recommendation is not to abstract things away unless you actually have to. Hiding things just in case is not a best practice.

Use real structs whenever possible. If there can be multiple types, consider using an enum that explicitly lists these types.


If RelToTenant may not always be possible to implement, you will need two methods:

trait TenantPropsWithRelToTenant: TenantProps + RelToTenant {}

fn with_rel_to_tenant() -> impl TenantPropsWithRelToTenant {…}
fn without_rel_to_tenant() -> impl TenantProps {…}

If this varies dynamically at run time, you can make the methods return Option<impl Trait> or enum EitherOfTheseTraits. If the types that implement these traits vary dynamically or are mixed in the same collection (e.g. two different structs for users), then you'll need Box<dyn Trait> to hold them.

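In dyn Trait objects, the + syntax is only for Rust's built-in auto traits (Send, Sync, etc.) that don't add any methods. For your own traits a trait object is strictly one trait at a time, which is why a combined trait like the one above is needed there. In impl Trait and generic bounds, + can combine any traits. Note that the trait T: U syntax is not inheritance. They're still separate traits; it just means "before implementing T, please also implement U".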
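A minimal sketch of the combined-trait idea, applied to the user side. The struct names PlainUser and UserWithTenant, the tenant_name method, and the blanket impl are assumptions made for illustration; they are not from the code above.

```rust
// Hypothetical, simplified traits (plain strings instead of Uuid, no external crates).
trait UserProps {
    fn name(&self) -> &str;
}

trait RelToTenant {
    fn tenant_name(&self) -> &str;
}

// Combined trait: anything implementing it must implement both supertraits.
trait UserPropsWithTenant: UserProps + RelToTenant {}

// Blanket impl: every type with both traits gets the combined trait for free.
impl<T: UserProps + RelToTenant> UserPropsWithTenant for T {}

struct PlainUser {
    name: String,
}

struct UserWithTenant {
    name: String,
    tenant_name: String,
}

impl UserProps for PlainUser {
    fn name(&self) -> &str {
        &self.name
    }
}

impl UserProps for UserWithTenant {
    fn name(&self) -> &str {
        &self.name
    }
}

impl RelToTenant for UserWithTenant {
    fn tenant_name(&self) -> &str {
        &self.tenant_name
    }
}

// The return type states statically whether the relation is available.
fn without_rel_to_tenant() -> impl UserProps {
    PlainUser { name: "User 1".to_string() }
}

fn with_rel_to_tenant() -> impl UserPropsWithTenant {
    UserWithTenant {
        name: "User 2".to_string(),
        tenant_name: "Tenant 1".to_string(),
    }
}

// Different concrete types in one collection need trait objects.
fn mixed_users() -> Vec<Box<dyn UserProps>> {
    vec![
        Box::new(PlainUser { name: "User 1".to_string() }),
        Box::new(UserWithTenant {
            name: "User 2".to_string(),
            tenant_name: "Tenant 1".to_string(),
        }),
    ]
}
```

The mixed collection only exposes UserProps; the tenant relation is not reachable through it without a second trait object type or an enum.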


BTW: Instead of &String or &Vec<T> use &str and &[T]. The &String type makes no sense: it requires a growable String, but then & forbids it from growing. It requires the String to be heap allocated and owned, but then & borrows it temporarily. It just adds two layers of indirection. Any String can be borrowed as &str.
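For illustration, a small sketch of accessors that return &str and &[T]. The Tenant here is a cut-down stand-in with users stored as plain strings, and shout is a made-up helper.

```rust
// Cut-down stand-in for the Tenant above; users are plain strings here.
struct Tenant {
    name: String,
    users: Vec<String>,
}

impl Tenant {
    // &str instead of &String: callers only need to read the text.
    fn name(&self) -> &str {
        &self.name
    }

    // &[String] instead of &Vec<String>: callers only need to read the elements.
    fn users(&self) -> &[String] {
        &self.users
    }
}

// A function taking &str accepts string literals and borrowed Strings alike.
fn shout(text: &str) -> String {
    text.to_uppercase()
}
```

Because &String coerces to &str automatically, shout(&tenant.name) and shout("literal") both work.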

First of all thank you very much for your quick reply @kornel.

It's unclear to me what you want to achieve. What does ""how much" of the user is in the tenants users array" mean?

What I mean by "how much" of the user is in the tenant's "users" array is the following:
I can fetch a tenant with its users, which would give a result like this:

Tenant {
  id: "...",
  name: "Tenant 1",
  users: [
    User { id: "...", name: "User 1" },
    User { id: "...", name: "User 2" },
    ...
  ]
}

But I could also fetch a tenant with its "users" relation where every user is fetched with its "tenant" relation. The result would look like this:

Tenant {
  id: "...",
  name: "Tenant 1",
  users: [
    User { id: "...", name: "User 1", tenant: Tenant { id: "...", name: "Tenant 1" }},
    User { id: "...", name: "User 2", tenant: Tenant { id: "...", name: "Tenant 1" }},
    ...
  ]
}

In the first example the users in the tenant's "users" relation are of type impl UserProps. In the second example they are of type impl UserProps + RelToTenant. So to represent this difference I'd need to specify this type difference in a generic parameter of the Tenant struct.
The reason I tried to solve it in a very abstract way is that I don't want to end up doing something like this:

pub struct Tenant {
    id          : Uuid,
    name        : String,
}

pub struct TenantWithUsers {
    id          : Uuid,
    name        : String,
    users       : Vec<User>
}

pub struct User {
    id          : Uuid,
    name        : String,
}

pub struct UserWithTenant {
    id          : Uuid,
    name        : String,
    tenant      : Tenant
}

This will get very ugly very quickly when dealing with combinations of multiple relations like UserWithTenant, UserWithProjects, UserWithTenantAndProjects etc.

Your idea with enums sounds very good. I'll definitely try this because I don't need the mixed traits in the same collection.

Thank you also very much for the tip about &String and &Vec. I really appreciate that :smiley:

Do you now understand what I'm trying to achieve? Do you think this is doable with enums?

You could model this as:

struct User {
   id: Uuid, name: String,
   tenant: Option<Tenant>,
}

with no traits involved. If you used traits, it'd probably replicate the exact same structure:

trait AbstractUser { fn tenant(&self) -> Option<&Tenant>; }

The other approach is to omit the tenant field in User, and have:

struct UserWithTenant {
   user: User,
   tenant: Tenant,
}

This still lets you use methods that take &User.

And you could implement AbstractUser for User that returns None and AbstractUser for UserWithTenant that returns Some from the tenant method.
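A minimal sketch of that wrapper approach. The u32 ids, the extra user() accessor, and the describe function are assumptions added to keep the example self-contained.

```rust
// Simplified entities; u32 ids stand in for Uuid to avoid external crates.
struct Tenant {
    id: u32,
    name: String,
}

struct User {
    id: u32,
    name: String,
}

// Wrapper that adds the relation instead of duplicating the user's fields.
struct UserWithTenant {
    user: User,
    tenant: Tenant,
}

trait AbstractUser {
    // Hypothetical accessor so generic code can reach the plain user.
    fn user(&self) -> &User;
    fn tenant(&self) -> Option<&Tenant>;
}

impl AbstractUser for User {
    fn user(&self) -> &User {
        self
    }
    fn tenant(&self) -> Option<&Tenant> {
        None
    }
}

impl AbstractUser for UserWithTenant {
    fn user(&self) -> &User {
        &self.user
    }
    fn tenant(&self) -> Option<&Tenant> {
        Some(&self.tenant)
    }
}

// Works with either type; the relation is checked at run time via Option.
fn describe(u: &impl AbstractUser) -> String {
    match u.tenant() {
        Some(t) => format!("{} (tenant: {})", u.user().name, t.name),
        None => u.user().name.clone(),
    }
}
```

Code that requires the tenant can take &UserWithTenant directly and skip the Option entirely.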

Consider also parallel arrays:

struct UsersWithTenants {
   users: Vec<User>,
   tenants: Option<Vec<Tenant>>,
}

where users[x] has corresponding entry in tenants[x] if it's set (or non-empty).
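A minimal sketch of a lookup over such parallel arrays. The tenant_of helper and the single-field structs are assumptions for illustration.

```rust
struct User {
    name: String,
}

struct Tenant {
    name: String,
}

struct UsersWithTenants {
    users: Vec<User>,
    // None when the tenants were not fetched at all.
    tenants: Option<Vec<Tenant>>,
}

impl UsersWithTenants {
    // Tenant belonging to users[index], if tenants were fetched and the index exists.
    fn tenant_of(&self, index: usize) -> Option<&Tenant> {
        self.tenants.as_ref()?.get(index)
    }
}
```

Nothing in the types keeps the two vectors the same length, so that invariant has to be upheld by whatever code builds the struct.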

Ok, either way it seems like I'll have to define a lot of structs for all sorts of combinations of relations. While trying to avoid writing all this boilerplate by experimenting further with generics, traits, Box, dyn etc., something else came to my mind: macros.
So over the past few hours I learned how to write my own macro, and it was quite the journey.
But now I have a working solution that creates real structs, which according to you is preferable to all the abstractions I tried before.

This is the macro I came up with:

macro_rules! entity_structs {
    ////////////////////////////////////////////////////////////////////////////
    // Matches (relation | props) and creates a struct with all props and the
    // given relation
    (@inner
        // One relation
        relation $rel_n:ident { $( $rel_p:ident : $rel_t:ty )+ }

        // Props
        | $($prop:ident: $prop_t:ty)*

    ) => {
        #[derive(Debug)]
        pub struct $rel_n {
            $( pub $prop: $prop_t ,)*
            $( pub $rel_p: $rel_t ,)+
        }
    };

    ////////////////////////////////////////////////////////////////////////////
    // Matches (relation, rest_of_relations | props)
    // -> creates a struct with all props and the first relation
    // -> calls itself recursively with the rest of the relations
    (@inner
        // First relation
        relation $rel_n:ident { $( $rel_p:ident : $rel_t:ty )+ },

        // Rest of the relations
        $(  relation $xrel_n:ident { $( $xrel_p:ident : $xrel_t:ty )+ }  ),+

        // Props
        | $($prop:ident: $prop_t:ty)*

    ) => {
        // Create struct from first relation
        #[derive(Debug)]
        pub struct $rel_n {
            $( pub $prop: $prop_t ,)*
            $( pub $rel_p: $rel_t ,)+
        }

        // Recurse down the rest of the relations
        entity_structs!(@inner
            // Rest of the relations
            $(  relation $xrel_n { $($xrel_p: $xrel_t)+ }  ),*

            // Props
            | $($prop: $prop_t)*
        );
    };

    ////////////////////////////////////////////////////////////////////////////
    // Entry point of the macro without relations
    (
        pub struct $name:ident {
            $($prop:ident: $prop_t:ty,)*
        }
    ) => {
        // Create base struct
        #[derive(Debug)]
        pub struct $name {
            $( pub $prop: $prop_t ),*
        }
    };

    ////////////////////////////////////////////////////////////////////////////
    // Entry point of the macro with at least one relation
    (
        pub struct $name:ident {
            $($prop:ident: $prop_t:ty,)*
        }

        generate relation structs {
            $(
                $xrel_n:ident with additional fields {
                    $( $xrel_p:ident : $xrel_t:ty )+
                }
            ,)+
        }
    ) => {
        // Create base struct
        #[derive(Debug)]
        pub struct $name {
            $( pub $prop: $prop_t ),*
        }

        entity_structs!(@inner
            // List of relations
            $(  relation $xrel_n { $( $xrel_p : $xrel_t )+ }  ),+

            // Props
            | $($prop: $prop_t)*
        );
    };
    ////////////////////////////////////////////////////////////////////////////
}

This is how it's used:

entity_structs!(
    pub struct Tenant {
        id              : String,
        name            : String,
    }

    generate relation structs {
        Tenant_w_Users with additional fields {
            users       : Vec<User>
        },
    }
);

entity_structs!(
    pub struct User {
        id              : String,
        name            : String,
    }

    generate relation structs {
        User_w_Tenant with additional fields {
            tenant      : Tenant
        },

        User_w_Tenant_w_Users with additional fields {
            tenant      : Tenant_w_Users
        },
    }
);

And this is what the macro generates:

pub struct Tenant {
    pub id      : String,
    pub name    : String,
}

pub struct Tenant_w_Users {
    pub id      : String,
    pub name    : String,
    pub users   : Vec<User>,
}

pub struct User {
    pub id      : String,
    pub name    : String,
}

pub struct User_w_Tenant {
    pub id      : String,
    pub name    : String,
    pub tenant  : Tenant,
}

pub struct User_w_Tenant_w_Users {
    pub id      : String,
    pub name    : String,
    pub tenant  : Tenant_w_Users,
}

What do you think about this solution?

You could have a

// A field in Tenant
pub enum TenantData {
    None,
    Partial { users: Vec<User> },
    Full { users: Vec<User>, plants: Vec<Plants> },
}

Which can be less combinatorial-explosiony than an Option for every data field.

Or alternatively you could have

pub struct Tenant<UserData> {
    id: Uuid,
    name: String,
    users: UserData,
}
// And a struct for each actual combination of available user data

With the former you only have one type everywhere that can dynamically adjust how much data is available, and with the latter it would be enforced at the type level and you'd need program logic to convert between data levels.

I prefer either of those to the flat model.
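A minimal sketch of how the enum variant might be consumed. The Plant type, the data field name, and the users() accessor are made up for illustration.

```rust
struct User {
    name: String,
}

// Hypothetical second relation, only here to give `Full` something extra.
struct Plant {
    name: String,
}

// How much related data was fetched for a tenant.
enum TenantData {
    None,
    Partial { users: Vec<User> },
    Full { users: Vec<User>, plants: Vec<Plant> },
}

struct Tenant {
    name: String,
    data: TenantData,
}

impl Tenant {
    // Users are available in both `Partial` and `Full`; one match covers all levels.
    fn users(&self) -> Option<&[User]> {
        match &self.data {
            TenantData::None => None,
            TenantData::Partial { users } | TenantData::Full { users, .. } => {
                Some(users.as_slice())
            }
        }
    }
}
```

There is a single Tenant type everywhere, at the cost of checking the variant at run time.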

Thank you very much for your reply @quinedot
Your first suggestion is very nice; I hadn't thought of using enums that way. But I'd like to have the relations enforced at the type level. That's why I'm looking into your second suggestion, but I'm not quite sure I understand it correctly.
Would that look something like this?

pub struct Tenant<UserData> {
    id      : String,
    name    : String,
    users   : Vec<UserData>,
}

pub struct User {
    id      : String,
    name    : String,
}

pub struct UserWithTenant {
    id      : String,
    name    : String,
    tenant  : Tenant<User>
}

// Then use tenant with Tenant<User> or Tenant<UserWithTenant>

If that's what you had in mind I'd still end up defining a lot of structs which wouldn't be that big of a problem because I could probably generate that with a macro.
A bigger issue I'd say is how to model multiple optional relations?
For example a user might belong to a tenant but also might have many projects and sometimes only one of these relations is present and should be represented with a field in the struct.

I prefer either of those to the flat model.

What do you mean by "flat model"?

If you have lots of optional properties that can be attached, there's a typemap pattern, which is basically a hashmap indexed by the type you want, so you can have user.get::<Tenant>() and user.get::<FavoriteColor>() for whatever arbitrary type you come up with.
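A rough sketch of that pattern using only the standard library (std::any). The TypeMap type and the extras field are invented names, and this is not the API of any particular typemap crate.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Hypothetical minimal typemap: at most one value per concrete type.
#[derive(Default)]
struct TypeMap {
    items: HashMap<TypeId, Box<dyn Any>>,
}

impl TypeMap {
    fn insert<T: Any>(&mut self, value: T) {
        self.items.insert(TypeId::of::<T>(), Box::new(value));
    }

    // Look up by type; None if nothing of that type was stored.
    fn get<T: Any>(&self) -> Option<&T> {
        self.items.get(&TypeId::of::<T>())?.downcast_ref::<T>()
    }
}

struct Tenant {
    name: String,
}

struct FavoriteColor(&'static str);

struct User {
    name: String,
    extras: TypeMap,
}

impl User {
    fn get<T: Any>(&self) -> Option<&T> {
        self.extras.get::<T>()
    }
}
```

Every lookup returns an Option, so whether a relation is present is only known at run time.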

Something like that, yeah. It's not a way to avoid lots-of-structs, it's an alternate way to structure lots-of-structs, e.g.

  • Tenant → Tenant<()>
  • Tenant_w_Users → Tenant<User>

Then if you have methods that make sense regardless of the "user knowledge", you can

// Define methods independent of how much user data one has
impl<T> Tenant<T> { /* ... */ }
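
A slightly fuller sketch, assuming the users: Vec<UserData> layout from the question. The method names are made up for illustration.

```rust
pub struct User {
    pub name: String,
}

// `UserData` says how much is known about each user: () or User here.
pub struct Tenant<UserData> {
    pub id: u32,
    pub name: String,
    pub users: Vec<UserData>,
}

// Methods independent of how much user data one has.
impl<T> Tenant<T> {
    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn user_count(&self) -> usize {
        self.users.len()
    }

    // Moving between data levels is an explicit conversion.
    pub fn with_users<U>(self, users: Vec<U>) -> Tenant<U> {
        Tenant { id: self.id, name: self.name, users }
    }
}

// Methods that only exist when full users were fetched.
impl Tenant<User> {
    pub fn user_names(&self) -> Vec<&str> {
        self.users.iter().map(|u| u.name.as_str()).collect()
    }
}
```

Calling user_names() on a Tenant<()> is a compile error, which is the type-level enforcement this layout buys.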

This won't enforce the relations at the type level and I'll end up unwrapping and/or checking the results I get back from the type map. That's not what I want to do.

This is a very interesting alternative to the flat structure I had before. Thank you very much for the clarification and for the suggestion not to use the flat structure. I tried it out and came up with this data structure:

////////////////////////////////////////////////////////////////////////////////
// Tenant
pub struct Tenant<TR: TenantRelation> {
    pub id          : i32,
    pub name        : i32,
    pub relations   : TR,
}

// Trait
pub trait TenantRelation {}

// No relation
impl TenantRelation for () {}

// With "users"
struct TenantRelUsers<UR: UserRelation> {
    users: Vec<User<UR>>
}
impl<UR: UserRelation> TenantRelation for TenantRelUsers<UR> {}


////////////////////////////////////////////////////////////////////////////////
// User
pub struct User<Rel: UserRelation> {
    pub id          : i32,
    pub name        : i32,
    pub relations   : Rel,
}

// Trait
pub trait UserRelation {}

// No relation
impl UserRelation for () {}

// With "tenant"
struct UserRelTenant<TR: TenantRelation> {
    tenant: Tenant<TR>
}
impl<TR: TenantRelation> UserRelation for UserRelTenant<TR> {}

// With "tenant" and "projects"
struct UserRelTenantAndProjects<TR: TenantRelation, PR: ProjectRelation> {
    tenant  : Tenant<TR>,
    projects: Vec<Project<PR>>,
}
impl<TR: TenantRelation, PR: ProjectRelation> UserRelation
for UserRelTenantAndProjects<TR, PR> {}


////////////////////////////////////////////////////////////////////////////////
// Project
struct Project<Rel: ProjectRelation> {
    pub id          : i32,
    pub name        : i32,
    pub relations   : Rel,
}
pub trait ProjectRelation {}
impl ProjectRelation for () {}

struct ProjectRelOwner<UR: UserRelation> {
    owner: User<UR>
}
impl<UR: UserRelation> ProjectRelation for ProjectRelOwner<UR> {}
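
With this layout, a function signature can state which relations it needs. A minimal sketch on a reduced copy of the definitions above: name is a String here purely for readability, and user_label and tenant_name_of are hypothetical helpers.

```rust
pub trait TenantRelation {}
pub trait UserRelation {}

// No relation
impl TenantRelation for () {}
impl UserRelation for () {}

pub struct Tenant<TR: TenantRelation> {
    pub id: i32,
    pub name: String,
    pub relations: TR,
}

pub struct User<UR: UserRelation> {
    pub id: i32,
    pub name: String,
    pub relations: UR,
}

// With "tenant"
pub struct UserRelTenant<TR: TenantRelation> {
    pub tenant: Tenant<TR>,
}
impl<TR: TenantRelation> UserRelation for UserRelTenant<TR> {}

// Accepts any user, whatever relations were fetched.
pub fn user_label<UR: UserRelation>(user: &User<UR>) -> String {
    format!("#{} {}", user.id, user.name)
}

// Only accepts users fetched with their "tenant" relation; no unwrapping needed.
pub fn tenant_name_of<TR: TenantRelation>(user: &User<UserRelTenant<TR>>) -> &str {
    &user.relations.tenant.name
}
```

Passing a User<()> to tenant_name_of is rejected at compile time, so the missing relation never has to be handled at run time.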

As this is a lot of boilerplate code, I wrote a macro to generate all of it for me:

Macro Definition
use paste::paste;
macro_rules! gen_entity_structs {
    ////////////////////////////////////////////////////////////////////////////
    // Generate Relation
    (@inner
        >>> relation generator <<<
        struct_name: $struct_n:ident;
        struct_props: $($prop:ident: $prop_t:ty),*;
        trait_name: $trait:ident;

        struct $r_name:ident <
            $ ($g:ident: $gt:ident),*
        > {
            $( $r_prop:ident: $r_prop_t:ty ),*
        }
    ) => {

        #[derive(Debug, Clone)]
        pub struct $r_name < $($g: $gt),* > {
            $( $r_prop: $r_prop_t ),*
        }

        impl <$($g: $gt),*> $trait for $r_name<$($g),*> {}


        // Implement From trait to convert for example:
        // Tenant<()> into UserRelTenant<()> or
        // Vec<User<()>> into TenantRelUsers<()>
        paste! {
        #[allow(unused_parens)]
        impl< $($g: $gt),* > From<($($r_prop_t),*)>
        for $r_name < $($g),* > {
            fn from (( $($r_prop),* ): ($($r_prop_t),*)) -> Self {
                $r_name { $($r_prop),* }
            }
        }
        }


        // Implement From trait to convert for example:
        // (User<()>, UserRelTenant<TR>) into User<UserRelTenant<TR>>
        impl< $($g: $gt),* > From<($struct_n<()>, $r_name<$($g),*>)>
        for $struct_n<$r_name<$($g),*>> {
            fn from((a, b): ($struct_n<()>, $r_name<$($g),*>)) -> Self {
                let $struct_n { $($prop),* ,.. } = a;
                $struct_n { $($prop),* , relations: b }
            }
        }

        // Implement From trait to convert for example:
        // User<UserRelTenant<TR>> into User<()>
        impl< $($g: $gt),* > From<$struct_n<$r_name<$($g),*>>>
        for $struct_n<()> {
            fn from(a: $struct_n<$r_name<$($g),*>>) -> Self {
                let $struct_n { $($prop),* ,.. } = a;
                $struct_n { $($prop),* , relations: () }
            }
        }
    };

    // Iteration: Split into head and tail, call with head, recurse with tail
    (@inner
        >>> iterate <<<
        struct_name     : $struct_n:ident;
        struct_props    : $($prop:ident: $prop_t:ty),*;
        trait_name      : $trait:ident;


        struct $r1_name:ident < $ ($g1:ident: $gt1:ident ),* > {
            $( $r1_prop:ident: $r1_prop_t:ty ),*
        },

        $(
            struct $r_name:ident < $ ($g:ident: $gt:ident ),* > {
                $( $r_prop:ident: $r_prop_t:ty ),*
            }
        ),+

    ) => {

        // Call relation generator
        gen_entity_structs!(@inner
            >>> relation generator <<<
            struct_name : $struct_n;
            struct_props: $($prop: $prop_t),*;
            trait_name  : $trait;

            struct $r1_name < $ ($g1: $gt1 ),* > {
                $( $r1_prop: $r1_prop_t ),*
            }
        );

        // Continue iteration
        gen_entity_structs!(@inner
            >>> iterate <<<
            struct_name : $struct_n;
            struct_props: $($prop: $prop_t),*;
            trait_name  : $trait;

            $(
                struct $r_name < $ ($g: $gt ),* > {
                    $( $r_prop: $r_prop_t ),*
                }
            ),*
        );

    };

    // Iteration: End of iteration
    (@inner
        >>> iterate <<<
        struct_name     : $struct_n:ident;
        struct_props    : $($prop:ident: $prop_t:ty),*;
        trait_name      : $trait:ident;


        struct $r1_name:ident < $ ($g1:ident: $gt1:ident ),* > {
            $( $r1_prop:ident: $r1_prop_t:ty ),*
        }

    ) => {
        // Call relation generator
        gen_entity_structs!(@inner
            >>> relation generator <<<
            struct_name : $struct_n;
            struct_props: $($prop: $prop_t),*;
            trait_name  : $trait;

            struct $r1_name < $ ($g1: $gt1 ),* > {
                $( $r1_prop: $r1_prop_t ),*
            }
        );
    };


    ////////////////////////////////////////////////////////////////////////////
    // Entry point (without relations)
    (
        struct $name:ident {
            $(  $prop:ident: $prop_t:ty  ),*
        }

    ) => {
        #[derive(Debug, Clone)]
        pub struct $name {
            $(  pub $prop: $prop_t  ),*
        }
    };

    // Entry point (with relations)
    (
        struct $name:ident {
            $(  $prop:ident: $prop_t:ty  ),*
        }

        relations [
            $(
                struct $r_name:ident < $ ($g:ident: $gt:ident ),* > {
                    $( $r_prop:ident: $r_prop_t:ty ),*
                }
            ),+
        ]

        // relations $relations:tt

    ) => {
        paste!{

        #[derive(Debug, Clone)]
        pub struct $name<T: [<$name Relation>]> {
            // Properties
            $(  pub $prop: $prop_t  ),*,

            // Relations field
            pub relations: T
        }

        // Define trait "$nameRelation"
        pub trait [<$name Relation>] {}

        // Implement trait for ()
        impl [<$name Relation>] for () {}

        // Relations
        gen_entity_structs!(@inner
            >>> iterate <<<
            struct_name     : $name;
            struct_props    : $(  $prop: $prop_t  ),*;
            trait_name      : [<$name Relation>];

            $(
                struct $r_name < $ ($g: $gt ),* > {
                    $( $r_prop: $r_prop_t ),*
                }
            ),+
        );

        } // end paste
    };

    ////////////////////////////////////////////////////////////////////////////
}

Using the macro looks like this:

gen_entity_structs!(
    struct Tenant {
        id: i32,
        name: i32
    }

    relations [
        struct TenantRelUsers<UR: UserRelation> { users: Vec<User<UR>> }
    ]
);

gen_entity_structs!(
    struct User {
        id: i32,
        name: i32
    }

    relations [
        struct UserRelTenant<TR: TenantRelation> { tenant: Tenant<TR> },

        struct UserRelTenantAndProjects<TR: TenantRelation, PR: ProjectRelation> {
            tenant: Tenant<TR>,
            projects: Vec<Project<PR>>
        }
    ]
);

gen_entity_structs!(
    struct Project {
        id: i32,
        name: i32
    }

    relations [
        struct ProjectRelOwner<UR: UserRelation> {
            owner: User<UR>
        }
    ]
);

With this I got rid of the flat structure as you suggested. In addition, the macro generates implementations of the From trait to make conversions available. With these From implementations I can, for example, do something like this:

// Create plain user
let plain_user: User<()> = User {
    id: 1, name: 2, relations: ()
};

// Create UserRelTenant from Tenant
let user_tenant: UserRelTenant::<()> = Tenant {
    id: 1, name: 2, relations: ()
}.into();

// Combine both to a User with "tenant" relation
let user_full: User<UserRelTenant<()>> =
    (plain_user, user_tenant).into();

// Convert the User with "tenant" relation to a plain user again
let plain_user_2: User<()> = user_full.into();


// Create a tenant's "users" relation from a vector of plain users
let tenant_users: TenantRelUsers<()> = vec![
    User { id: 1, name: 1, relations: () },
    User { id: 2, name: 2, relations: () },
].into();

What do you think about this solution now? For me this is usable and pretty close to what I wanted to achieve. But does it have any downsides? I'm very excited to hear what you say :slight_smile: