Writing a self-referential graph

chung · March 12, 2024, 2:20am

I am trying to rewrite a self-referential(edges and nodes) graph(C++ code) into Rust Playground
There are 3 base structs

struct HalfEdge<'a, NodeData, EdgeData, DerivedNode, DerivedEdge> {
    node_data: PhantomData<NodeData>,
    data: EdgeData,
    twin: Option<&'a mut DerivedEdge>,
    next: Option<&'a mut DerivedEdge>,
    prev: Option<&'a mut DerivedEdge>,
    from: Option<&'a mut DerivedNode>,
    to: Option<&'a mut DerivedNode>,
}

 struct HalfEdgeNode<NodeData, EdgeData, DerivedNode, DerivedEdge> {
    data: NodeData,
    edge_data: PhantomData<EdgeData>,
    derive_node: Option<DerivedNode>,
    incident_edge: Option<DerivedEdge>,
}

 struct HalfEdgeGraph<NodeData, EdgeData, DerivedNode, DerivedEdge> {
    _node_data: PhantomData<NodeData>,
    _edge_data: PhantomData<EdgeData>,
    nodes: Vec<DerivedNode>,
    edges: Vec<DerivedEdge>,
}

And the corresponding concrete structs

struct STHalfEdge<'a>(
    HalfEdge<
        'a,
        SkeletalTrapezoidationJoint,
        SkeletalTrapezoidationEdge,
        STHalfEdgeNode<'a>,
        Box<STHalfEdge<'a>>,
    >,
);

struct STHalfEdgeNode<'a>(
    HalfEdgeNode<
        SkeletalTrapezoidationJoint,
        SkeletalTrapezoidationEdge,
        Box<STHalfEdgeNode<'a>>,
        Box<STHalfEdge<'a>>,
    >,
);

struct SkeletalTrapezoidationGraph<'a>(
    HalfEdgeGraph<
        SkeletalTrapezoidationJoint,
        SkeletalTrapezoidationEdge,
        STHalfEdgeNode<'a>,
        STHalfEdge<'a>,
    >,
);

Although the translation from the OOP approach to Rust is not ideal, I managed to proceed with the rewrite, until I hit this for loop, particularly twin = twin->prev_->twin_->prev_

for (edge_t* twin = source_twin;; twin = twin->prev_->twin_->prev_)
{
   edge_t* edge = &graph_.edges.front();
   edge->from_ = twin->to_;
   edge->to_ = twin->from_;
   edge->twin_ = twin;
   twin->twin_ = edge;
   edge->from_->incident_edge_ = edge;
}

I deducted it is equivalent as below in Rust

let mut twin: Option<&mut STHalfEdge> = source_twin;
loop {
    // forgive me for the as_mut() and unwrap()
    let t: &mut Box<STHalfEdge> = twin
        .as_mut()
        .unwrap()
        .0
        .prev
        .as_mut()
        .unwrap()
        .0
        .twin
        .as_mut()
        .unwrap()
        .0
        .prev
        .as_mut()
        .unwrap();
        
    twin = Some(&mut *t);
}

Because it is a self-referential struct, things are really ugly now.

error[E0506]: cannot assign to `twin` because it is borrowed
   --> src/lib.rs:179:17
    |
164 |                 let t: &mut Box<STHalfEdge> = twin
    |                                               ---- `twin` is borrowed here
...
179 |                 twin = Some(&mut *t);
    |                 ^^^^^^^^^^^^^^^^^^^^
    |                 |
    |                 `twin` is assigned to here but it was already borrowed
    |                 borrow later used here

How should I tackle this problem?

quinedot · March 12, 2024, 2:35am

Don't use references; there's no safe and usable way to have self-referencial structs based on references. It's UB to have two active &mut to the same memory, which would include any (undirected) cycles in your graph.

Consider using Arc and sync::Weak with Mutex, or Rc and rc::Weak with RefCell.

chung · March 12, 2024, 4:43am

Do you mean converting twin: Option<&'a mut DerivedEdge> to twin: Option<Rc<RefCell<DerivedEdge>>> like this?

    pub(super) struct HalfEdge<NodeData, EdgeData, DerivedNode, DerivedEdge> {
        pub(super) node_data: PhantomData<NodeData>,
        pub(super) data: EdgeData,
        pub(super) twin: Option<Rc<RefCell<DerivedEdge>>>,
        pub(super) next: Option<Rc<RefCell<DerivedEdge>>>,
        pub(super) prev: Option<Rc<RefCell<DerivedEdge>>>,
        pub(super) from: Option<Rc<RefCell<DerivedNode>>>,
        pub(super) to: Option<Rc<RefCell<DerivedNode>>>,
    }

Hyeonu · March 12, 2024, 4:45am

Since the graph already holds vectors of nodes and edges, you can store indices instead of references. No interior mutability is needed.

ratmice · March 12, 2024, 5:24pm

It is also worth mentioning Lukas Kalbertodt did his masters thesis^[1] on a rust implementation of this data structure for his lox crate which uses this indices instead of references approach.

GitHub - LukasKalbertodt/masters-thesis: My master's thesis about writing the polygon mesh library `lox` in Rust ↩︎

quinedot · March 12, 2024, 6:08pm

I must admit I didn't look at the data-structure in depth; I brought up Arc/Rc/Weak as they are common ways of having "back pointers" in cyclic structures. If you can use indices or similar as others have suggested, it might be a better approach, depending on the particulars of the problem.

programmerjake · March 13, 2024, 6:47pm

(edit: added edges as demo of how to use multiple types pointing to each other)
if you're ok with all graph nodes being allocated in an Arena (you can allocate nodes with Arena::alloc which only needs a shared reference to the Arena; this has the drawback that you can't deallocate any nodes in your graph until you deallocate the whole Arena), you can use shared references:

pub struct Node<'a> {
    pub edges: RefCell<Vec<&'a Edge<'a>>>,
    pub data: RefCell<String>,
}

pub struct Edge<'a> {
    pub nodes: [Cell<&'a Node<'a>>; 2],
    pub data: RefCell<String>,
}

// needed so all Arenas have the same lifetime
#[derive(Default)]
pub struct Arenas<'a> {
    pub nodes: Arena<Node<'a>>,
    pub edges: Arena<Edge<'a>>,
}

fn main() {
    let arenas = Arenas::default();
    let node1 = &*arenas.nodes.alloc(Node {
        neighbors: RefCell::default(),
        data: RefCell::new("node1".to_string()),
    });
    let node2 = &*arenas.nodes.alloc(Node {
        neighbors: RefCell::default(),
        data: RefCell::new("node2".to_string()),
    });
    let edge1 = &*arenas.edges.alloc(Edge {
        nodes: [Cell::new(node1), Cell::new(node2)],
        data: RefCell::new("edge1".to_string()),
    });
    let edge2 = &*arenas.edges.alloc(Edge {
        nodes: [Cell::new(node1), Cell::new(node1)], // self-loop
        data: RefCell::new("edge2".to_string()),
    });
    node1.neighbors.borrow_mut().extend([edge1, edge2]);
    node2.neighbors.borrow_mut().push(edge1);
}

quinedot · March 13, 2024, 6:55pm

The stack can be your arena if you don't have a non-trivial destructor.

ratmice · March 13, 2024, 7:01pm

In general this is also likely going to be true for any Vector based approach too,
for instance that lox crate I linked above uses a wrapper stable-vec, that doesn't really delete indices.

Depending on what the OP is doing there perhaps cases where the actual reference based approach of the doubly connected edge list may be better than a stable index addressing approach.

for instance if implementing decimation or subdivision, when splitting/joining every node results in many additions or removals. I'm actually fairly curious to see a comparison of these problems running on different implementations of this data structure.

programmerjake · March 13, 2024, 7:02pm

yes, but using the stack like that makes it hard to have a dynamically sized graph, hence why I suggested Arena.

jumpnbrownweasel · March 13, 2024, 9:37pm

I've found that even with an arena (I tried it with Bumpalo) you can't put the arena and something allocated from the arena in the same struct, because then the struct needs a lifetime annotation for the allocated item, and it becomes self-referential. This is why I always recommend using indexes. Your example doesn't try to package up things in a struct, so you may not have run into that problem yet.

Ddystopia · March 13, 2024, 10:20pm

You can try using stuff like docs.rs/static_rc for aliasing and docs.rs/ghost_cell for interior mutability. Instead of static cell you might use some arena as well

jumpnbrownweasel · March 13, 2024, 11:18pm

Be sure to read its Maturity section.

And its Safety section.

programmerjake · March 14, 2024, 3:11am

yes, what I usually do is have a separate Context<'a> type that holds all the arenas, and then all the other data structures borrow from it. then at the top-level of my code have let context = Context::new(); and then pass &context around to wherever I need it.

Note that Arenas are more suitable for my use cases since I often am working on compiler-style things where you build a bunch of data-structures, and then tear everything down at the end.

Contrast that with a long-lived data-structure where you're constantly adding/removing items and need to not run out of memory, Arenas are poorly suited for that because you can't free the memory for removed items without destroying the entire data-structure -- e.g. an in-memory database for a web server.

jumpnbrownweasel · March 14, 2024, 3:12am

For using the index approach there are several good crates that reuse space for deleted entries in a Vec: slab, sharded-slab, slotmap.

programmerjake · March 14, 2024, 3:15am

If you want index-based graphs, strongly consider using petgraph - Rust, since they already did all the hard work. It works well in my experience.

If you need common graph algorithms but still want your own datastructures, you can use petgraph's traits, they make that relatively easy.

chung · March 14, 2024, 9:33am

lox is good and very informative for learning with his thesis. I was trying to directly use lox instead of rolling my own implementation.

However, there is a major issue with lox, there is no way to get the twin edge as the method is not public. Of course I could copy his implementation unsafe { Self::new(HalfEdgeHandle::new(self.idx() ^ 1)) } but it is less ideal.

So I might have to look into petgraph instead.

ratmice · March 14, 2024, 10:57am

This is good to know, I have only recently begun looking at the lox implementation myself, and hadn't noticed that yet. I think I see the problem with just making that public.

In that if an edges twin is deleted I'm guessing you need to iterate through the list of elements in the direction of the bitflip until finding a non-empty edge?
And in the case where the twin was a boundary edge could be a boundary edge but i'm uncertain how he stores boundary edges.

It does seem like a missing hole.

I personally wouldn't really want to go with a general purpose graph library for this data structure.
As the implementation of lox here shows by using bit-flips for finding the twin edge, the edges of
this graph are ordered. Which is not really a normal property for graphs, and so you will have to also store something like a [EdgeId; 3] on top of the graph. While a structure like lox leaves these implicit
and by storing everything in the proper order, can use arithmetic to compute indices.

fwiw my mental-model of the half-edge data structure could be considered non-traditional,
As I tend to view it as a 3-uniform ordered hypergraph (using insertion order), with uber-edges.
To visualize that instead of drawing edges between vertices, draw a loop around three vertices clock-wise or counter-clockwise for your order. For the uber-edge draw an edge between the loops to form the twin.

I just say this because I find that while you can explain this using graph terminology it is also has these properties which aren't well reflected by the typical graphs everyone is already familiar with.
Anyhow my tendency would be to try and get a public impl of twin into lox.

Overall the typical half-edge structure is optimized based on these properties, in a way that a general graph library unaware of these properties will be limited to adjacency-list/adjacency matrix representation. Despite that, I do see the appeal of just using a graph library.

Ddystopia · March 18, 2024, 10:07am

They are quite good imo. Ghost cell has been formally proven while even if static_rc is not so well reviewed, code and idea are so simple you can really do it yourself

system · June 16, 2024, 10:07am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Graph data structure and lifetimes help	10	1787	July 14, 2020
[Resolved] Self-referential struct case help	7	732	April 22, 2020
Graph data structure (fighting the borrow checker) help	7	2853	January 12, 2023
Self-referential struct - request for a review help	4	466	October 23, 2021
Problem with understanding lifetimes and references help	16	945	May 21, 2020

Writing a self-referential graph

Related topics