DAG implementation with indices - is it bad?

So, I'm building a small autodiff library - GitHub - botev/meta_diff in Rust. In there I need to create a DAG of the computation, thus in this case I use similar approach to this Baby Steps - e.g. representing nodes as being indexed inside a vector. Now unfortunately I will need to delete nodes, thus I made my graph to be Vec<Option<Node>>, sort of having the notion of deleted node. Now my graph is not intended to grow to big (not more than 1000 nodes) and so I was wondering if in this case having an Option would make it much worse? Also what is the opinion of this instead of RefCell and Arc usage? Would that be potentially better?
Any other suggestions would be appreciated.

Here's some good news. Your node type is ComputeNode, is that correct? That at least means that Option<ComputeNode> uses no extra space, because ComputeNode contains a non-nullable pointer that the rust can use for enum layout optimization. You shuold be able to verify this with size_of.

I guess handling the None case everywhere could be a drag, maybe separate the graph logic and the other logic more to help with that?

1 Like

Oh, that was something I did not know, and is nice to hear! The only problem one can see is that it is sort of manual memory management essentially. But at least knowing it does not introduce any extra space is good.

As for the managing of None it is a bit a drag(that's why I have and this methods get_node and get_node_mut, which return Result, and I usually just call try!(get_node(...)) to make it a bit more cleaner.

Could you elaborate a bit more on how to separate the graph logic and the other or like an example of how to do it. One of the key aspects at the moment is that the child -> parent connections are in fact there trough the Operator variable inside a node. Do you mean to split those to the graph logic instead?

Ok that seems ok, that way you treat the None node almost as if they are removed from the graph, that's good.

I notice you use format! directly in get_node. You should change that to only be called in the error case, otherwise you are allocating that error message all the time, even when it's not needed.

1 Like

Thanks very much for that! Really appreciated for taking a look at the code! And yeah I guess you are right since the ok_or is not a macro, so probably should change it. Again thanks for the suggestions!

There's been ok_or_else which accepts closure. This makes error message allocation lazy.