Need advice on data layout for an algorithm

I have an algorithm that was originally implemented in another language than rust.
This algorithm finds the right order of rendering for isometric objects. I have a goal of implementing this algorithm in rust.

I'll try to be short and not go into too much detail, but basically, in the original implementation, it works like this:

  • Every isometric object is a dynamically allocated.
  • Every isometric object has some isometric points, which are also dynamically allocated. Object have references to these points, and points have references to their object.
  • There is a special container that holds points grouped by their depth. Actually it is a hash map of vectors of points, using depth as key. Every point holds a reference to the vector it is in, and every vector holds a reference to the hash map.
  • When an object is added, all of its points are added to the container.
  • When an object moves, its points are moved inside a container so that they are in the right place. This is done through the point object and uses references to the vector and to the hash map. Basically removes point from the original vector and adds to the right vector according to the new depth.
  • When an object gets removed, all of its points are removed from the container. And here again, references from the point to the vector are very handy.
  • To iterate objects in the right order we iterate over all vectors and over all of the points in these vectors. During this iteration, we get objects from points and do some things with them, like storing references to each other if one object is blocked from drawing by another object, but this is mostly outside of the scope of this question.

My previous rust experience is mostly solving Advent of Code puzzles, and this mostly did not involve any complex data structures. At least I've never used Rc<RefCell<_>>, or may be used it once and do not remember it. I've read the documentation and I think I understand the concepts.

First of all, I've decided to implement a point container because it is a self-contained thing.

Some pseudo-code, so that I can speak in more detail:

struct Point {depth: Depth, owner: PointSubContainer};
struct PointContainer {map: HashMap<Depth, PointSubContainer>}
struct PointSubContainer{depth: Depth, owner: PointContainer}

impl Point {
 fn new(depth, container) Point{depth, owner: container.get_or_create_sub_for_depth(depth)}
 fn change_depth(depth) { 
  self.owner.remove(self);
  self.depth = depth;
  self.owner = self.owner.owner.get_or_create_sub_for_depth(depth);
  self.owner.add(self);
 }
 fn dispose() {/*remove from owner*/}
}

Because points are referenced by both objects and containers I had to store them in Rc, and because I have to modify them, that became Rc<RefCell>. The same thing is with point containers and their sub-containers: they also became Rc<RefCell>. I created most of the data structure and a code for adding points. I decided that container will store weak references to points and auto-remove them when point gets dropped. That was not very hard, but I felt like I lose the power of rust because now any reference to anything is actually mutable. But the main problems were when I needed to create an iterator over the container: Return an iterator from struct in RefCell - #7 by romamik.

Here is what I've got at the moment: Rust Playground

It all feels like I'm doing something wrong from the beginning. May I had not to store references to containers in points, but have the change_depth method accept the root container as an argument and find actual sub-containers dynamically, that's more expensive than having references to them, but not too much. I will still have to have references to the points from both containers and objects, and that's a problem: I can imagine points being stored and owned by a container, like vector stores it's items for example, and having references to them in objects, but that will make container immutable because there always will be references to it's contents. I can eliminate references from points to objects, and find points dynamically when I need to modify this, but that can be expensive at runtime.

So, what is the rust approach to such algorithms?

I’ve looked at this code, but before I can suggest possible alternative approaches I need to understand the deal with these “objects” you mention in the previous text / description. In particular, in the code I only see one type, “DepthPoint”, which I assume to be about points, not objects… is the code simply incomplete? You’re talking about references between points and objects though here:

Assuming the “objects” are not part of the “what I've got at the moment” at all yet, could you elaborate slightly on what references you imagine between points and objects, and what you had in mind when discussing the alternative to “eliminate” some of them with an expensive runtime overhead?

Yes, "Objects" are not part of what I've got at the moment, because I decided to start with PointContainer and it is not ready still.

"Objects" are the main owners of "Points". And they should be easy to find given the object.
The bigger structure will be something like this:

struct ObjectContainer {
  objects: HashMap<ObjectId, Object>,
  points: PointContainer
}
Point {depth,sub_container,object}
Object {points: Vec<Point>}

impl ObjectContainer {
fn add_object(obj_descr) {
   let object = create_object_from_desct(obj_descr); // this will also create points
   self.objects[object.id] = object;
   self.points.add(object.points);
}

fn update_object(obj_descr) {
   let object = self.object[obj_descr.id];
   object.update(obj_descr); // that will modify object points, making them change their position in PointContainer
}

fn iterate() -> Iterator<Object> {
  for sub_container in self.points {
     for point in sub_container {
         yield point.object; // that's an oversimplification but gives an idea
     }
  }
}

As you can see object has reference to its points and vice versa. I can have only references from objects to points and have points to hold the Ids of the objects instead of the references. That will make me look up objects by Id, when iterating over points. And that will make me find points somehow in the containers when I modify an object.

I would try to avoid Rc and RefCell.

You don't need points to store pointers into "depth vector" in the hash map -- you can simply look it up in the hash map by the hash map key (depth). Storing pointers seems like a micro-optimization but the benefits aren't clear: look-ups in a hash map are O(1) anyway, and storing pointers requires additional storage and maintenance to keep them in sync. If you really need to optimize the hash map, there are other things you can do, such as changing the hash function etc.

You don't need the "depth vectors" in the "depth hash map" to store pointers back to the hash map. It's unnecessary for hash map values to store pointers to the hash map.

Since depth is an ordered key, you may want to use BTreeMap instead of HashMap, so that it will keep track of the depth order for you.

Similarly, points shouldn't need to store pointers back to the objects they are in or any Ids of those objects. This relationship is implicit in the object. Instead, I would have your HashMap / BTreeMap contain some sort of mapping "depth -> (object id, point id)".

I think it is not right stylistically to have methods on Point that perform operations on things that the Point doesn't own (the object, the hash map, etc). It seems like this is what leads you to a complicated design where a Point stores pointers to things that own it. Instead, they should be functions outside of Point.

3 Likes

I've tried an approach without Rc and RefCell and it worked very well, the code has reduced in size by a factor of 3 and it was super easy to write. Thanks.

I still have to check the performance, but probably you are right: maintaining all these backlinks plus reference counting in Rc and borrow checking in RefCell will also take time so more map access will not be a problem.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.