Rust data structure that is indexed twice; ECS?

I am a long time Java programmer taking an interest to study Rust. My understanding is that in Rust a container owns its stored objects. Suppose I have an enrollment relation that is an association between a student and a course. I want to find all courses a student may have; at the same time I want to find all students a course may have. I can use a hashmap for indexing, but I can't put this enrollment thing in two different hashmaps from my understanding.

Am I looking at a situation where I am cloning the enrollment struct for a secondary index? Can an existing Rust ECS crate more easily help solve this problem?

If you need multiple owners, a reference-counter is the way to do it. The standard library provides one.

More broadly, you seem to be conflating references with a relational model. You can refer to a student by a unique identifier, which is probably small and no big deal to bit-wise copy. Storing references in a collection is almost always not what you want, unless it is a transitory collection in some imperative code block.

3 Likes

The standard library types mentioned are Rc and Arc. If you need to be able to mutate, you'll probably want an Rc<RefCell<T>> or Arc<Mutex<T>>.

What about introducing another level of indirection via a third HashMap?

struct Enrollments {
    next_id: usize,
    courses: HashMap<String, Vec<usize>>,
    students: HashMap<String, Vec<usize>>,
    enrollments: HashMap<usize, Enrollment>,
}

struct Enrollment {
    student: String,
    course: String,
    ...
}

Then when you insert a new enrollment, you create an index and update the courses and students tables accordingly.

Here is the rest of my code:

impl Enrollment {
    fn new(course: impl Into<String>, student: impl Into<String>) -> Self {
        Enrollment {
            course: course.into(),
            student: student.into(),
        }
    }
}


impl Enrollments {
    pub fn register(&mut self, enrollment: Enrollment) {
        let id = self.next_id;
        self.next_id += 1;
        self.courses
            .entry(enrollment.course.clone())
            .or_default()
            .push(id);
        self.students
            .entry(enrollment.student.clone())
            .or_default()
            .push(id);
        self.enrollments.insert(id, enrollment);
    }

    pub fn by_course_name<'a>(&'a self, name: &str) -> impl Iterator<Item = &'a Enrollment> + 'a {
        self.courses
            .get(name)
            .map(|ids| ids.as_slice())
            .unwrap_or_default()
            .iter()
            .map(|enrollment_id| &self.enrollments[enrollment_id])
    }

    pub fn by_student_name<'a>(&'a self, name: &str) -> impl Iterator<Item = &'a Enrollment> + 'a {
        self.students
            .get(name)
            .map(|ids| ids.as_slice())
            .unwrap_or_default()
            .iter()
            .map(|enrollment_id| &self.enrollments[enrollment_id])
    }
}

(playground)

Then when you have a main() like this:

fn main() {
    let mut enrollments = Enrollments::default();

    enrollments.register(Enrollment::new("Maths", "Michael"));
    enrollments.register(Enrollment::new("Maths", "Julia"));
    enrollments.register(Enrollment::new("Programming", "Michael"));

    println!("Math enrollments:");
    for enrollment in enrollments.by_course_name("Maths") {
        println!("  {:?}", enrollment);
    }

    println!("Michael's courses:");
    for enrollment in enrollments.by_student_name("Michael") {
        println!("  {:?}", enrollment);
    }
}

it will generate the expected output

Math enrollments:
  Enrollment { student: "Michael", course: "Maths" }
  Enrollment { student: "Julia", course: "Maths" }
Michael's courses:
  Enrollment { student: "Michael", course: "Maths" }
  Enrollment { student: "Michael", course: "Programming" }

The benefit of this method is you can have get_xxx_mut() methods which do the appropriate lookup and give you a &mut Enrollment without worrying about smart pointers or mutexes.

2 Likes

I think I agree with Rustacious. I have written one test application using Arc of Mutex of S where S is a struct. It strikes me as somewhat fragile to support. I really like u32 sized identifiers, and the concept of ECS. Now giving it some more thought I think what I want are two B-tree indexed immutable structs. They are sorted by (student,course) to enrollment, (course,student) to enrollment. The first one allows me to search for all courses tied to a student. The second one allows me to search for all students tied to a course. The enrollment component can contain a student and course id, and no index is needed by itself.

I looked at a number of Rust ECS libraries, including Bevy. Bevy has a enhancement request #3742 to add relationship support. It seems Flecs (C++) was designed to support relationships out of the box.

Flecs entity ID design

There are some graph implementations, petgraph is well known but graph looks interesting too. You could use a graph internally to create an API for tables with related columns.

I tried out crossbeam_utils:: thread::scope(). This allows me to have plain reference borrow of Mutex. No more Arc.