Graph data structure (fighting the borrow checker)


#1

Hi, guys.
I’m doing course on algorithms, and to make it more interesting, I’m implementing algorithms in rust.

Currently I’m trying to create Graph.

pub struct Graph<'l, T: 'l> {
    pub verticies: Vec<Vertex<'l, T>>,
    pub edges: Vec<Edge<'l, T>>
}

pub struct Vertex<'l, T: 'l> {
    pub payload: &'l T
}

pub struct Edge<'l, T: 'l> {
    pub tail: &'l Vertex<'l, T>,
    pub head: &'l Vertex<'l, T>

}

In my view, Graph should own it’s verticies and edges. In contrast, edges should only have reference to it’s edges.

The problem arrives when I try to create Graph:

let lb = Vertex{payload: &"left bottom".to_string()};
let rb = Vertex{payload: &"right bottom".to_string()};
let lt = Vertex{payload: &"left top".to_string()};
let rt = Vertex{payload: &"right top".to_string()};


let top = Edge {tail: &lt, head: &rt};
let bottom = Edge {tail: &lb, head: &rb};
let left = Edge {tail: &lt, head: &lb};
let right = Edge {tail: &rt, head: &rb};
let diagonal = Edge {tail: &lb, head: &rt};

let graph = Graph {
    verticies: vec![lb, rb, lt, rt],
    edges: vec![top, bottom, left, right, diagonal]
}; 

When I obtain reference to vertex by doing let top = Edge {tail: &lt, head: &rt};, both verticies freeze, so it becomes impossible to use them later when I create Graph.

How can I solve this? Maybe I use the wrong approach to create data structure in rust?


#2

Hi,

It seems to me you had some experience with managed language like C# or Java. In that language, a reference points to some object on heap, and prolongs its lifetime.
But that’s not the case in Rust. Here, a reference is much closer to its C++ counterpart - in fact a pointer, which is enforced by compiler to point to some valid data.

So:

will not own payload as you might expect, and it expects payload exists somewhere else. The more proper decl I see here:

pub struct Vertex<T> {
    pub payload: T
}

Next, by

you effectively pin down both edges and vertices into freeze. Why? Basically because of borrow checker.
I won’t try explain lifetimes and borrow checker in detail here, because there’s enough good info on both of them. I just point out that you can think on borrowing rules as kind of read-write lock. You can safely read from memory location from multiple places and even multiple threads. But if you want to change it, you must wait till all read operations complete - because if someone reads data while you write it, the reader might observe inconsistent state of structure.
TL;DR: you have multiple read references here. So you won’t be able to mutate a portion of structure as long as even single one of them lives.

Unfortunately i’m a bit lazy to write full graph implementation here :smile:
Check this link:


It shows simplest graph built over reference-counted smart pointers.

UPD
http://smallcultfollowing.com/babysteps/blog/2015/04/06/modeling-graphs-in-rust-using-vector-indices/
Toy impl of graph based on vector indices


#3

It describes a graph structure actually used in rustc – see librustc_data_structures/graph/mod.rs; so it’s not a toy. See also crate petgraph which provides the same data structure; it’s pretty bare bones (raw), given that graph use cases differ wildly.


#4

My apologies, no pun was intended.


#5

Thank for the reply.
Indeed, I have experience with Java and JavaScript, never worked with C.

Btw, I intentionally put reference in Vertex so it doesn’t own the value. My intention was to have payload data (such as users or any other domain object) separately from graph that may represent thing like friendship, following, etc.

Thank for the analogy. Frankly speaking, I’ve read docs and book section about ownership/borrowing/lifetimes few times, but these concepts are still quite difficult for me.

And finally - thanks for links.


#6

Using T instead of &T allows you to do both, by simply creating a Graph<&SomePayload> instead of a Graph<SomePayload>.


#7

You don’t need to use references at all, here. You have vectors of nodes and edges. Each element of a vector has a unique index. Store the indices.

There are more complicated schemes a purist might like better, where the nodes and edges exist independently of any graph, but when you have something you actually want to get done, simplicity is best.