Advice on creating a model-based system

Good evening all.

Apologies, this is a big post.

I have an issue I can't find a solution to.

Note : the question may look like this : Function that can return any struct that implements a Trait
But I would prefer having the expert’s eye on the whole problem rather than just a small part of it — that may even be a code smell.
This project is intended to last, so I need it to be well built from the start.

The problem I’m trying to solve

My program reads commands (from file, stream, stdin.. anything) in JSON format.
It processes them into ModelInstances, which are put in cache for future (fast) computations.

Models

Models don't have much in common except for the ModelInstance trait.
Each model is defined as a struct like so :

#[derive(Serialize, Deserialize)]
pub struct Cat {
    id: u32,
    // ... model data
}

Here’s the ModelInstance trait :

pub trait ModelInstance {
    /// Get Instance ID
    fn get_id(&self) -> u32;

    /// Get the name of the model, in **lowercase**.
    /// This is used for serialization or representation
    fn get_model_name() -> &'static str;
}

Instances of a model live in a ModelManager.
It's a struct with a Vec of ModelInstances inside (the cache), plus a bunch of methods to work with the data, and some common behavior from the ModelManager trait.

pub trait ModelManager<T>
where
    T: ModelInstance,
{
    fn get_cache_mut(&mut self) -> &mut ModelInstanceListCache<T>;
    fn get_cache(&self) -> &ModelInstanceListCache<T>;

    // Other methods to work with the cache (getById, getAll, merge, replace, insert, etc...)
    // These are all implemented in the ModelManager trait, because they only rely on get_cache_mut and get_cache.
}

/// The manager for the Cat model.
/// It implements the ModelManager trait + model-specific methods.
pub struct CatManager {
    cache: ModelInstanceListCache<Cat>,
}

Finally, all ModelManagers are grouped in a single RootDataStore.

struct RootDataStore {
    cats: models::CatManager,
    houses: models::HouseManager,
}

Commands

When a command comes in, it's parsed into a Command enum :

struct ModelCommand {
    pub model_name: String,
    pub action: ModelCommandAction, // An enum with CRUD-like actions
    pub data: serde_json::Value,
}

pub enum Command {
    ModelCommand(ModelCommand), // CRUD-like operations on the cache of instances
    SpecialCommand, // Calculation orders, config, etc...
}

Problems

.. And this is where I'm having problems. Ideally, the Command enum would have a run method that would take &self and run itself.
However, ModelCommands can target any model. In order to run such a command, I expect to need a get_module_with_model_name method on the RootDataStore.
That method would take the model name in and return :

  • a reference to the "cats" member of the RootDataStore instance, which is a CatManager.
  • a reference to the "houses" member of the RootDataStore instance, which is a HouseManager.

And with these both implementing the ModelManager<T> trait, I can't get things to work. I've tried :

  • Putting store modules (the "cats" and "houses") in Boxes
  • Making a StoreModule Enum with the different types

I'm open to any constructive feedback on this...

Thanks a lot for reading this far.
Cheers

So the core problem here is that you really want to use a trait object, so that RootDataStore can return a reference to a single type based on the name of the model. Right now, though, ModelManager has T as a type parameter, and T is going to be different for every manager. That makes it not really suitable for what you want to do here.

Box<dyn ModelManager<Cat>> doesn't let you return the house manager, and vice versa.

Assuming all of the default methods you didn't include in your post can be adapted to not include T in their declaration, we can make some minor changes to ModelManager and add a new trait to work around this. The new trait will handle all of the dynamic functionality that RootDataStore and Command need to deal with without actually exposing the model type to them.

First we make the T type parameter on ModelManager into an associated type. Then we move all of the default methods you had implemented on ModelManager into our new trait "DynModelManager".
(We can't use a type parameter because that gets in the way of implementing DynModelManager for all of the types implementing ModelManager)

Now DynModelManager doesn't need to worry about those two methods on ModelManager that needed T. We can then do a blanket impl so all ModelManagers get that default behavior from DynModelManager.

Then RootDataStore only needs to worry about DynModelManager, which is much more trait-object friendly.

Playground

use serde_json::json;

pub trait ModelInstance {
    fn get_id(&self) -> u32;
    fn get_model_name() -> &'static str;
}

pub struct ModelInstanceListCache<T>(Vec<T>);

pub trait ModelManager {
    type Model: ModelInstance;

    fn get_cache_mut(&mut self) -> &mut ModelInstanceListCache<Self::Model>;
    fn get_cache(&self) -> &ModelInstanceListCache<Self::Model>;
}

pub trait DynModelManager {
    // Put methods that used to be default methods on ModelManager here.

    // create is just here for illustrative purposes.
    fn create(&mut self, data: serde_json::Value);
}

impl<M> DynModelManager for M
where
    M: ModelManager,
{
    // implement methods which were previously default methods in ModelManager
    fn create(&mut self, data: serde_json::Value) {
        println!("{data}");
    }
}

/// The manager for the Cat model.
/// It implements the ModelManager trait + model-specific methods.
pub struct CatManager;

pub struct Cat(u32);

impl ModelInstance for Cat {
    fn get_id(&self) -> u32 {
        self.0
    }

    fn get_model_name() -> &'static str {
        "cat"
    }
}

impl ModelManager for CatManager {
    type Model = Cat;

    fn get_cache_mut(&mut self) -> &mut ModelInstanceListCache<Self::Model> {
        todo!()
    }

    fn get_cache(&self) -> &ModelInstanceListCache<Self::Model> {
        todo!()
    }
}

struct RootDataStore {
    cats: CatManager,
}

impl RootDataStore {
    pub fn manager(&mut self, name: &str) -> Option<&mut dyn DynModelManager> {
        Some(if name == Cat::get_model_name() {
            &mut self.cats
        } else {
            return None;
        })
    }
}

fn main() {
    let mut root = RootDataStore { cats: CatManager };

    root.manager("cat").unwrap().create(json!({"cat": "data"}))
}

@semicoleon thank you for the reply.

I should have included these methods then !
I need the manager to implement methods that accept and return model instances, like so :

pub trait DynModelManager {
    fn get(&self, id: u32) -> Option<&ModelInstance>;
    fn get_list(&self) -> &Vec<ModelInstance>;
    fn merge(&ModelInstance);
}

// Later..
root.manager("cat").unwrap().get(999).unwrap()

Obviously DynModelManager can't do that as it is since it cannot have an associated type.

Do you have any idea ?

What exactly do you need to do with the models? If you can use a trait to cover all the actions you can have those methods work with &mut dyn ModelInstance and keep things basically the same.
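
For what it's worth, here is a rough sketch of what that could look like. It assumes get_model_name is changed to take &self (an associated function without a receiver would stop the trait from being used as dyn ModelInstance unless it is bounded by Self: Sized), and the Cat fields and the manager's get method are simplified, hypothetical stand-ins for the real ones.

```rust
// Hypothetical, simplified versions of the traits discussed in this thread.
pub trait ModelInstance {
    fn get_id(&self) -> u32;
    // Assumption: takes &self so the trait can be used as a trait object.
    fn get_model_name(&self) -> &'static str;
}

pub trait DynModelManager {
    // Returns a trait object, so callers never see the concrete model type.
    fn get(&self, id: u32) -> Option<&dyn ModelInstance>;
}

pub struct Cat {
    id: u32,
}

impl ModelInstance for Cat {
    fn get_id(&self) -> u32 {
        self.id
    }

    fn get_model_name(&self) -> &'static str {
        "cat"
    }
}

pub struct CatManager {
    cache: Vec<Cat>,
}

impl DynModelManager for CatManager {
    fn get(&self, id: u32) -> Option<&dyn ModelInstance> {
        self.cache
            .iter()
            .find(|cat| cat.id == id)
            .map(|cat| cat as &dyn ModelInstance)
    }
}

/// Looks up an instance through the trait object and reports its model name.
pub fn model_name_of(manager: &dyn DynModelManager, id: u32) -> Option<&'static str> {
    manager.get(id).map(|instance| instance.get_model_name())
}
```

The catch is that anything model-specific has to be reachable through methods on ModelInstance, since the caller only ever holds a dyn ModelInstance.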

To clarify what this is all about :

The Rust program I am writing is a lib crate to manage the state of a document. I have a working implementation in Python, but it's memory-hungry and cannot run as WASM, so I'm re-implementing it in Rust. (I did read the Book and play around with it, but I'm still learning as of today.)

The program is designed to be a long lived process that runs as long as the document is open. It knows about the whole state of the document.
Documents are made of objects of various types (the models, Cat and House). There are typically a lot of instances of each Model, and they have rather complex relationships. Mutating a Cat, for instance, usually mutates many others, because of the document logic.

All this logic is model-specific, so the current working implementation (Python) encapsulates it in the models. I'm trying to replicate this in the Rust implementation :

  • Fat models that know a lot,
  • Managers that hold the instances for a specific model, and expose an API to query and mutate instances,
  • The "Root Datastore" which is just a collection of all the managers so I have a single source of truth.

What I've been trying to do is factor out some of the manager code. Indeed, all managers can do get(), create_or_update(), etc. The rest is specific.

I have tried different ways to update the playground and get closer to what I want but I can't find a solution that can run this :

fn main() {
    let mut root = RootDataStore {
        cats: CatManager { cache: vec![] },
    };
    let the_cat = Cat::from_value(json!({ "id": 999 })).unwrap();
    root.manager("cat").unwrap().create_or_update(&the_cat);
    // Retrieve the cat
    let the_cat_from_store = root.cats.get(999).unwrap();
    assert!(the_cat_from_store.get_id() == 999);
}

I get that my needs are somewhat specific. For the prototyping phase I think I might just accept some slight code duplication and continue learning until I figure out how to factor this out properly.

Best,

Based on that description it kind of sounds like you don't really need traits here, you could just access the model managers directly on the root store if the commands need to do different things for different types. Is there a reason just passing the command to the different managers wouldn't work?
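
A minimal sketch of that direct-dispatch idea, under some assumptions: the command carries a plain id instead of the serde_json::Value from the original post (to keep the example dependency-free), and the create and len methods on the managers are hypothetical.

```rust
// Hypothetical, simplified stand-ins for the types in this thread.
pub struct Cat {
    pub id: u32,
}

pub struct House {
    pub id: u32,
}

#[derive(Default)]
pub struct CatManager {
    cache: Vec<Cat>,
}

#[derive(Default)]
pub struct HouseManager {
    cache: Vec<House>,
}

impl CatManager {
    pub fn create(&mut self, id: u32) {
        self.cache.push(Cat { id });
    }

    pub fn len(&self) -> usize {
        self.cache.len()
    }
}

impl HouseManager {
    pub fn create(&mut self, id: u32) {
        self.cache.push(House { id });
    }

    pub fn len(&self) -> usize {
        self.cache.len()
    }
}

// Assumption: a plain id replaces the JSON payload of the real command.
pub struct ModelCommand {
    pub model_name: String,
    pub id: u32,
}

#[derive(Default)]
pub struct RootDataStore {
    pub cats: CatManager,
    pub houses: HouseManager,
}

impl RootDataStore {
    /// Routes the command to the concrete manager; no trait object involved.
    pub fn run(&mut self, command: &ModelCommand) -> Result<(), String> {
        match command.model_name.as_str() {
            "cat" => self.cats.create(command.id),
            "house" => self.houses.create(command.id),
            other => return Err(format!("unknown model: {other}")),
        }
        Ok(())
    }
}
```

Each match arm works with a concrete manager type, so model-specific methods stay available at the cost of one arm per model.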

If you are really just trying to mirror what you were doing in python, you could adapt the managers to take a Box<dyn Any>. Then downcast in the create method to make sure it was the right type, but there are some restrictions around what you can do with Any so you'd want to make sure you understood the documentation for Any before you did that.
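
As a rough illustration of the Any route (a sketch only: the create signature and the choice to hand the rejected value back as the error are assumptions, not something from the original code):

```rust
use std::any::Any;

pub struct Cat {
    pub id: u32,
}

pub struct House {
    pub id: u32,
}

pub trait DynModelManager {
    /// Hypothetical signature: on a type mismatch the value is handed back.
    fn create(&mut self, instance: Box<dyn Any>) -> Result<(), Box<dyn Any>>;
    fn len(&self) -> usize;
}

#[derive(Default)]
pub struct CatManager {
    cache: Vec<Cat>,
}

impl DynModelManager for CatManager {
    fn create(&mut self, instance: Box<dyn Any>) -> Result<(), Box<dyn Any>> {
        // Box<dyn Any>::downcast checks the concrete type at runtime and
        // returns the original box as the error when it is not a Cat.
        let cat = instance.downcast::<Cat>()?;
        self.cache.push(*cat);
        Ok(())
    }

    fn len(&self) -> usize {
        self.cache.len()
    }
}
```

The type check moves from compile time to runtime here, so passing a House to the cat manager only fails when the code actually runs.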

The reason I was trying to use traits is that there is a small part of common behavior between all managers: get(), get_list(), etc.
But I can live with that being duplicated for now.

Anyway, thank you very much for the time taken ! I will have a look at Any, but it sounds like runtime typing, which I'm not a huge fan of.

Is there a reason you wouldn't go with an enum to represent the various ModelInstance types? (There certainly are good reasons, sorry if I missed it).

pub enum ModelInstance {
    Cat(Cat),
    Dog(Dog),
    Capybara(Capybara),
}

One nice bit with that is you can implement Into<ModelInstance> for all sub model instance types. Then your functions can take impl Into<ModelInstance> and the function can be called for all bare model types (Cat, etc.) as well as the ModelInstance enum itself, because T is Into<T>. You just need to call .into() on it first thing in each function.
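
A small sketch of how this could fit together, assuming a hypothetical Store with a create_or_update method and models reduced to a single id field. The conversions are written as From impls, which provide the matching Into impls through the standard library's blanket implementation.

```rust
use std::mem::discriminant;

// Hypothetical, simplified models.
pub struct Cat {
    pub id: u32,
}

pub struct Dog {
    pub id: u32,
}

pub enum ModelInstance {
    Cat(Cat),
    Dog(Dog),
}

impl ModelInstance {
    pub fn get_id(&self) -> u32 {
        match self {
            ModelInstance::Cat(cat) => cat.id,
            ModelInstance::Dog(dog) => dog.id,
        }
    }
}

impl From<Cat> for ModelInstance {
    fn from(cat: Cat) -> Self {
        ModelInstance::Cat(cat)
    }
}

impl From<Dog> for ModelInstance {
    fn from(dog: Dog) -> Self {
        ModelInstance::Dog(dog)
    }
}

#[derive(Default)]
pub struct Store {
    cache: Vec<ModelInstance>,
}

impl Store {
    /// Accepts a bare Cat or Dog as well as a ModelInstance.
    pub fn create_or_update(&mut self, instance: impl Into<ModelInstance>) {
        let instance = instance.into();
        // Same variant and same id means "update"; anything else is "create".
        let position = self.cache.iter().position(|existing| {
            discriminant(existing) == discriminant(&instance)
                && existing.get_id() == instance.get_id()
        });
        match position {
            Some(index) => self.cache[index] = instance,
            None => self.cache.push(instance),
        }
    }

    pub fn len(&self) -> usize {
        self.cache.len()
    }
}
```

With this in place, store.create_or_update(Cat { id: 1 }) and store.create_or_update(ModelInstance::Cat(Cat { id: 1 })) both compile and target the same slot.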


whoa, i need to start to think Enum... That seems promising indeed. I will definitely try this !

So I went ahead and reimplemented everything the Enum way.
It works ! The Into<ModelInstance> trick is really neat :wink:

I'm marking this as the answer. :+1:
Thank you again to everyone who helped ! :slight_smile:

Note that you should avoid implementing Into<T> directly. Instead of having impl Into<SomeType> for YourType, you can have impl From<YourType> for SomeType, which gives you the Into impl for free.

I'll also mention this sounds very close to an ECS, though perhaps existing libraries are too complicated or overkill, given they're generally optimized for at least millions of operations per second, while still keeping flexibility.

I think the current best option there is still bevy's core ECS

At least there might be tricks you can see, even if it's not a great match?

Indeed, cargo warned me. Corrected, thanks.

I have played with Bevy ECS a bit before indeed, and I didn't feel it was a very good fit. As I said, there's not much common behavior between the objects of a document, so I couldn't split the behavior of my objects (components) into many reusable systems, and I ended up with 1 Entity > 1 Component > 1 System.. Plus the event loop was an issue (I can't offer a constantly running event loop, so I had to identify when to run it)...

However, I might very well use it for another project that seems like a great fit for this :yum:

Ah, sounds like there's a reason it sounded similar!

Well, the bargain-basement version of an ECS is that you just have:

struct World {
  foos: BTreeMap<u32, Foo>,
  bars: BTreeMap<u32, Bar>,
  // ...
}

But it's a bit of annoying busywork.

"Real" ECSs do terrible things to nicely handle arbitrary components with performance, but you might be able to do something similar with a crate like erased_set (no endorsement, I haven't used it):

struct World {
  maps: ErasedSet,
}

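impl World {
  // Untested sketch: assumes get_or_insert_with needs &mut self (it may
  // insert an empty map) and that stored types must be 'static.
  fn get_map<T: 'static>(&mut self) -> &BTreeMap<u32, T> {
    self.maps.get_or_insert_with::<BTreeMap<u32, T>>(Default::default)
  }

  // get_map_mut...

  // Returns a reference into the map; BTreeMap::get takes the key by reference.
  fn get<T: 'static>(&mut self, id: u32) -> Option<&T> {
    self.get_map::<T>().get(&id)
  }

  // ...
}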
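
To make the idea concrete without depending on any crate, here is a std-only sketch of the same shape. It is not how erased_set is implemented; it just keeps one BTreeMap<u32, T> per type behind Box<dyn Any>, keyed by TypeId. The insert and get method names are illustrative.

```rust
use std::any::{Any, TypeId};
use std::collections::{BTreeMap, HashMap};

/// One BTreeMap<u32, T> per model type T, stored type-erased.
#[derive(Default)]
pub struct World {
    maps: HashMap<TypeId, Box<dyn Any>>,
}

impl World {
    /// Returns the map for T, creating an empty one on first use.
    fn map_mut<T: 'static>(&mut self) -> &mut BTreeMap<u32, T> {
        self.maps
            .entry(TypeId::of::<T>())
            .or_insert_with(|| Box::new(BTreeMap::<u32, T>::new()) as Box<dyn Any>)
            .downcast_mut::<BTreeMap<u32, T>>()
            .expect("entries are keyed by TypeId, so the downcast cannot fail")
    }

    /// Inserts a value, returning the previous one stored under the same id.
    pub fn insert<T: 'static>(&mut self, id: u32, value: T) -> Option<T> {
        self.map_mut::<T>().insert(id, value)
    }

    /// Read-only lookup; returns None if no map for T exists yet.
    pub fn get<T: 'static>(&self, id: u32) -> Option<&T> {
        self.maps
            .get(&TypeId::of::<T>())?
            .downcast_ref::<BTreeMap<u32, T>>()?
            .get(&id)
    }
}
```

Ids are scoped per type here, so world.get::<Cat>(1) and world.get::<House>(1) would refer to unrelated entries.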
