Conventions for named objects

I have a few types of structs in my code, each of which needs to have a name field so that I can store them and retrieve them from a HashMap:

struct struct1 {
    name: String,
    // snipped more stuff
}
struct struct2 {
    name: String,
    // snipped more stuff
}

In my repo (wrapper for a HashMap) I would like to have an add() method that would retrieve the name from the struct object and use it as the key.

impl MyRepo {
    // ...
    pub fn add<T>(self: &mut Self, obj: T) {
        self.my_map.insert(obj.name, obj); 
    }
}

Obviously the compiler does not know anything about the T object and its members:
error[E0609]: no field `name` on type `T`

What’s the convention here? The best I can come up with is to create a NamedObject trait and implement it the same for each type of struct, i.e. return the name member. Doesn’t sound ideal and has lots of extra typing.

Is there any other convention I am missing?

Could you perhaps put struct1 and struct2 into an enum?

enum NamedStructs {
    Struct1 {
        name: String,
        //
    },
    Struct2 {
        name: String
        //
    }
}

This would let you pass them around as a single type, but if the structs involve lifetimes or type parameters, it becomes a bit unwieldy to have it like this.
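
With this layout the name can be read through a single match arm, since both variants bind a field with the same name and type. A minimal sketch, assuming hypothetical fields `count` and `enabled` in place of the snipped ones:

```rust
// `count` and `enabled` are hypothetical stand-ins for the snipped fields.
enum NamedStructs {
    Struct1 { name: String, count: u32 },
    Struct2 { name: String, enabled: bool },
}

impl NamedStructs {
    // Both alternatives bind `name` with the same type, so one arm is enough.
    fn name(&self) -> &str {
        match self {
            NamedStructs::Struct1 { name, .. } | NamedStructs::Struct2 { name, .. } => {
                name.as_str()
            }
        }
    }
}

fn main() {
    let a = NamedStructs::Struct1 { name: String::from("first"), count: 3 };
    let b = NamedStructs::Struct2 { name: String::from("second"), enabled: true };
    println!("{} {}", a.name(), b.name());
}
```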


I probably do need to mess with lifetimes, e.g., there is an associated method that returns references to owned strings.

Traits are the way to express that a certain type has a certain behavior, so that’s what I would do. To avoid the extra typing you can create a derive for it. That’s not too bad.


1 - Extracting the name with a generic

Whenever you need a generic with a property, you need to abstract the desired common behavior (e.g., getting a struct’s name) with a trait, and then bound your generic with that trait:

use ::std::rc::Rc; // or ::std::sync::Arc

struct Struct1 {
    name: Rc<str>,
    // ...
}

struct Struct2 {
    name: Rc<str>,
}

pub
trait Named {
    fn name (self: &'_ Self) -> &'_ Rc<str>;
}

macro_rules! impl_Named_for {($(
    $Struct:ty;
)*) => ($(
    impl Named for $Struct {
        #[inline]
        fn name (self: &'_ Self) -> &'_ Rc<str>
        {
            &self.name
        }
    } 
)*)}

impl_Named_for! {
    Struct1;
    Struct2;
}

impl MyRepo {
    // ...
    pub
    fn add<T> (self: &'_ mut Self, obj: T)
    where
         T : Named, // this bound allows us to use the `.name()` method
    {
        self.my_map.insert(obj.name().clone(), obj); 
    }
}

Since you “need” owned types to be used as the HashMap keys, you have no choice but to .clone() the field. So, instead of storing Strings in your structures, you can convert them .into() an Rc<str>, which gives you extremely cheap Clone-ing. The cost is that you can no longer mutate the name of a struct afterwards, but given your use case that seems very unlikely to matter (a HashMap is never too fond of keys that change anyway).
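
As a rough illustration of that conversion, here is a small sketch. The helper `build_map` and the `u32` value type are made up for this example. A `String` becomes an `Rc<str>` with `.into()`, and the resulting key can still be looked up with a plain `&str`:

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical helper: builds a map keyed by a shared name.
fn build_map(owned: String) -> (Rc<str>, HashMap<Rc<str>, u32>) {
    // `From<String> for Rc<str>` copies the bytes once into a shared allocation.
    let name: Rc<str> = owned.into();
    let mut map = HashMap::new();
    // Cloning the `Rc` only bumps a reference count; the text is not copied.
    map.insert(Rc::clone(&name), 1);
    (name, map)
}

fn main() {
    let (name, map) = build_map(String::from("alpha"));
    // `Rc<str>: Borrow<str>`, so a `&str` works as the lookup key.
    assert_eq!(map.get("alpha"), Some(&1));
    // The key stored in the map shares its allocation with `name`.
    let (key, _) = map.get_key_value("alpha").unwrap();
    assert!(Rc::ptr_eq(key, &name));
}
```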

This way we solve the issue with extracting the name of a struct in a generic manner.

Problem

But there is an issue: what exactly are Key and Value in the HashMap<Key, Value> type of MyRepo's my_map field?

  • Key = Rc<str> as shown above for cheap Clone-ing,

  • Value = ... ? The problem is that there needs to be a single type that dynamically dispatches to each type Struct1, Struct2, etc.

Unifying Struct1, Struct2, etc. into a single type

In Rust there are two non-unsafe ways this can be achieved:

  1. using an enum, as @OptimisticPeach suggested (the dynamic dispatch takes place from the required pattern matching on the value):

    pub
    enum MapValue {
        // enum variants cannot carry their own `pub`: they share the enum's visibility
        Struct1(Struct1),
        Struct2(Struct2),
    }
    
    /// we will abstract over the ability to create an enum with Into<MapValue>
    impl From<Struct1> for MapValue {
        #[inline]
        fn from (struct1: Struct1) -> Self
        {
            MapValue::Struct1(struct1)
        }
    }
    impl From<Struct2> for MapValue {
        #[inline]
        fn from (struct2: Struct2) -> Self
        {
            MapValue::Struct2(struct2)
        }
    }
    

    then Value = MapValue and we can write

    impl MyRepo {
        pub
        fn add<T> (self: &'_ mut Self, obj: T)
        where
            T : Named, // .name() method
            T : Into<MapValue>, // .into() conversion
        {
            self.my_map.insert(obj.name().clone(), obj.into()); 
        }
    }
    

    Then, after accessing a value stored in the map, for example with if let Some(map_value) = my_map.get("some name") {,
    you need to perform the dynamic dispatch with a match:

    match *map_value {
        | MapValue::Struct1(ref struct1) => {
            let _: &Struct1 = struct1; // you can use struct1 with the correct type here
        },
        | MapValue::Struct2(ref struct2) => {
            let _: &Struct2 = struct2; // you can use struct2 with the correct type here
        },
    }
    
  2. using a trait object: the specific types are then definitively lost, and you just have an opaque handle to the common behavior specified by the trait. Since trait objects require a level of indirection (their dynamic dispatch goes through a vtable and an opaque pointer to the data), getting ownership requires an owning pointer such as Box or Rc (the former grants easy mutation, the latter cheap Clone-ing). This requires:

    • an object safe trait to abstract over the common behavior:

      trait CommonBehavior : Named { // can also include other object-safe traits
          // object safety requires that the method not have type parameters (no generic)
          fn some_common_method (
              // object safety requires that the method take `self` with indirection:
              self: &'_ Self, /* or:
                    &'_ mut Self,
                    Box<Self>*/
              some_arg: ArgType, // object safety requires that ArgType not be Self
          ) -> RetType; // object safety requires that RetType not be Self either
      }
      
    • choosing the level of indirection for our trait object: &_, &mut _, Box<_>, Rc<_>. Since the most flexible one to use is Box<_> (no borrowing, allows mutation), let’s use that here.

    Now we are able to have a unified Value type for our map: Box<dyn CommonBehavior>.

    Conversion from obj: T (where T : CommonBehavior) into a Box<dyn CommonBehavior> now just requires boxing it: Box::new(obj) as Box<dyn CommonBehavior>:

    fn add (self: &'_ mut Self, obj: Box<dyn CommonBehavior>)
    {
        // since CommonBehavior : Named, we can call .name()
        self.my_map.insert(obj.name().clone(), obj);
    }
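
Putting the pieces of the trait-object approach together, a minimal self-contained sketch might look as follows. The `describe` method, the fields `width` and `label`, and the exact shape of `MyRepo` are assumptions made for illustration, since the thread snips them. Here `add` is generic and does the boxing itself, which is one possible design rather than the only one.

```rust
use std::collections::HashMap;
use std::rc::Rc;

trait Named {
    fn name(&self) -> &Rc<str>;
}

// Hypothetical common behavior; `describe` stands in for `some_common_method`.
trait CommonBehavior: Named {
    fn describe(&self) -> String;
}

// `width` and `label` are made-up fields for this sketch.
struct Struct1 { name: Rc<str>, width: u32 }
struct Struct2 { name: Rc<str>, label: String }

impl Named for Struct1 { fn name(&self) -> &Rc<str> { &self.name } }
impl Named for Struct2 { fn name(&self) -> &Rc<str> { &self.name } }

impl CommonBehavior for Struct1 {
    fn describe(&self) -> String { format!("Struct1 of width {}", self.width) }
}
impl CommonBehavior for Struct2 {
    fn describe(&self) -> String { format!("Struct2 labelled {}", self.label) }
}

#[derive(Default)]
struct MyRepo {
    my_map: HashMap<Rc<str>, Box<dyn CommonBehavior>>,
}

impl MyRepo {
    // `'static` is needed because `Box<dyn CommonBehavior>` defaults to `+ 'static`.
    fn add<T: CommonBehavior + 'static>(&mut self, obj: T) {
        let key = Rc::clone(obj.name());
        self.my_map.insert(key, Box::new(obj));
    }

    // Hands out a borrowed trait object; the concrete type is no longer visible.
    fn get(&self, name: &str) -> Option<&dyn CommonBehavior> {
        self.my_map.get(name).map(|boxed| &**boxed)
    }
}

fn main() {
    let mut repo = MyRepo::default();
    repo.add(Struct1 { name: "a".into(), width: 4 });
    repo.add(Struct2 { name: "b".into(), label: "hello".to_string() });
    for key in ["a", "b"] {
        if let Some(obj) = repo.get(key) {
            println!("{}: {}", obj.name(), obj.describe());
        }
    }
}
```

Callers of `add` never write `Box::new` themselves in this variant; the version above that takes `Box<dyn CommonBehavior>` directly leaves the boxing to the caller instead.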
    

I didn’t know about the Rc<str> trick. Has anyone run benchmarks to show this makes a real difference with just taking a String?

Rc is supposed to have very little performance overhead. According to this article, the memory overhead is just an extra word to hold the reference count, and the time overhead is the time to increment the reference count when you clone it. Other than cloning, I don’t know if it has any time overhead at all (except perhaps an effect on cache performance if you have a million of them).
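
This is not a benchmark, but a small sketch can at least show what each kind of clone does: cloning an `Rc<str>` hands back a pointer to the same allocation, whereas cloning a `String` allocates a fresh buffer. The function names below are hypothetical.

```rust
use std::rc::Rc;

// Returns true if cloning the `Rc<str>` reused the original allocation.
fn rc_clone_shares(name: &Rc<str>) -> bool {
    let copy = Rc::clone(name);
    Rc::ptr_eq(name, &copy)
}

// Returns true if cloning the `String` reused the original buffer.
fn string_clone_shares(name: String) -> bool {
    let copy = name.clone();
    name.as_ptr() == copy.as_ptr()
}

fn main() {
    let shared: Rc<str> = Rc::from("some name");
    assert!(rc_clone_shares(&shared));
    assert!(!string_clone_shares(String::from("some name")));
    // The temporary clone was dropped, so only one strong reference remains.
    assert_eq!(Rc::strong_count(&shared), 1);
}
```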

Essentially, for almost all use cases, Rc is just as performant as a simple reference, but with more flexibility.

Awesome answers, thank you!

