Exploring Rust / Developing a Custom Graph

Good question.

The first choice would indeed be C or C++ -- mainly because I am familiar with those languages. Rust is challenging given its learning curve, but the significant benefit of compile-time checks may outweigh that, so I am exploring it.

Another language I am considering is Ada -- I programmed in Ada a while back and like its design premise as well, and it even has formal tools to check code now. But given the limited community around it, Rust seems to be the better choice longer term.

Edit (reached the reply limit, so answering ZiCog's response from further below here):

True.

But compared to continuing with C/C++, getting started with Rust means a learning curve that adds time, cost, and commitment -- e.g. the need to deal with non-C calling conventions when interfacing with C-based ecosystems.

This is the decision i am currently deliberating.

True, and I would not use it as full-featured C++ but as a type-checked C ... which, I think, is the way to go with performance in mind.

Dan

Good point :slight_smile:

We have travelled similar roads...

Until 18 months ago, when I felt the need for speed or other constraints were pressing, I would have reached for C or C++. Basically because, after using a bunch of other compiled languages -- PL/M 86, Coral, Pascal, Ada... -- they were the only ones left standing in the modern world. Go would have been on the list if it did not require a runtime and a garbage collector, and, importantly, if it had not suffered from worse jitter than JavaScript/Node.js when I tried it for an application a few years ago.

A few years ago, when I found myself at a loose end, I thought I would use some time to reacquaint myself with Ada. It did not go well. All in all it was hard to use, and it seemed to have sprouted some unfathomable features since I last used it. The last straw came while following an Ada tutorial on the net by some professor in the US. At one point he had written some very obscure code that took an age to understand, with a comment in it to the effect of "we have to do this here because Ada is absurd". Eventually I realised that the prof did not understand how to use tagged types and had written that code rather than doing it the simple way. Except the "simple way" was not easy to discern either!

I don't see C or C++ exiting my life anytime soon, but Rust offers a lot of compelling attractions besides the fussy type and lifetime checking: the Cargo build system/package manager, the meaningful and useful error messages, the cross-platform support, the extensive community, and more.

To my mind, the problems of learning Rust are no greater than those of C++. Arguably less. C++ is a huge and complex language; I firmly believe no single person understands all its features and how they work together. Sit through ten hours of CppCon presentations (YouTube) and you will find nearly all the talks are about the myriad pitfalls one has to remember to avoid when using almost every feature of the language. This is way too much for beginners.

It's worth mentioning that you can modify a vector to have true constant-time push by "amortizing" the copy-to-new-vector operation. To do this, you keep both the old and the new allocation around, moving two items over to the new allocation every time you push.

I am actually coming from working on a Prolog-based implementation -- and in Prolog, all access to such structures (facts, in Prolog lingo) is via a hash with logarithmic access time.

I'm working on a Prolog implementation in Rust right now and can say that in my implementation, most predicate lookups and term accesses are O(1), not O(log n).

Rust hasn't posed too many obstacles to implementing the Warren abstract machine, which allows you to build trees and cyclic graphs as Prolog terms using tagged integers in a flattened heap of fixed-size cells. The integers are relative offsets that point mostly back into the heap itself. Several other Prolog implementations are based on the WAM, and they're known to be quite fast.

In Scryer's heap representation, these integers are plain usize integers, not pointers, so they never trip the borrow checker. That approach is probably a little under-typed for the taste of most Rustaceans, but it is a simple and effective way of representing graph structures.
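
Roughly, a minimal sketch of the idea (with made-up names here; Scryer's actual cell types differ) looks like this -- graph edges are plain usize offsets into a flat Vec of cells, so the borrow checker never sees them:

// Hypothetical sketch -- names and layout are illustrative only.
#[derive(Clone, Copy)]
enum Cell {
    // A functor cell; its `arity` arguments occupy the following cells.
    Functor { name: u32, arity: u32 },
    // An offset to another heap cell; a Ref pointing at itself stands
    // for an unbound variable, and cycles are perfectly representable.
    Ref(usize),
    // An immediate integer constant.
    Int(i64),
}

struct Heap {
    cells: Vec<Cell>,
}

impl Heap {
    // Follow Ref chains until we hit a non-Ref cell or an unbound
    // variable. Since offsets are plain usizes, no borrows are held
    // while walking the (possibly cyclic) structure.
    fn deref(&self, mut addr: usize) -> usize {
        loop {
            match self.cells[addr] {
                Cell::Ref(next) if next != addr => addr = next,
                _ => return addr,
            }
        }
    }
}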

That's about to change somewhat, though. I'm moving to a much more compact heap representation, which will be managed by a tracing garbage collector. Scryer should become much faster once heap cells are able to fit in registers and cache lines. This will involve some delving into Unsafe Rust.

Hi Alice,

Can you elaborate a bit / illustrate? I don't understand the design you have in mind.

thanks,

Dan

Thank you for mentioning your experience.

Let me indicate how I see things -- perhaps I am misunderstanding something here:

In Prolog, one approach to representing a graph is to assert a node predicate with arity 1 and a link predicate with arity 3. For example:

node(a).
node(b).
node(c).
link(l1, a, b).
link(l2, a, c).

An outgoing link could then be retrieved by the following rule:

outgoing_link(Node, Link_ID, Target_Node) :- 
   link(Link_ID, Node, Target_Node).

Assuming that the second argument of the link/3 predicate is indexed, there would be a hash lookup on the given Node, with the index hashing all links -- wouldn't that be logarithmic time? Retrieving k outgoing links would then be k logarithmic lookups.

Now, if I continue to the next nodes and their outgoing links, each step would again be a logarithmic lookup. Hence graph traversal seems rather expensive with Prolog's hash-based (indexed) lookup of facts, versus traversing outgoing nodes linked via pointers, or some kind of vector variant -- both of which could do better.

Do I understand this correctly, or am I missing something crucial here?

thanks,

Dan

There's some indexing involved if the argument is instantiated, yes, but the lookup in collections like IndexMap is constant on average, not logarithmic. Even then, there's only one hash lookup per argument value per predicate, even if that same value appears in multiple clauses. So, no, it wouldn't be k logarithmic lookups; it would be an O(1) lookup, performed a bounded number of times per predicate call. That bound is uniform across all predicates.
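
To illustrate with a minimal sketch (hypothetical names -- this is not how any particular Prolog engine lays it out): first-argument indexing amounts to one hash map per indexed argument position, and a lookup is a single average-O(1) hash probe no matter how many clauses share that value:

use std::collections::HashMap;

type ClauseIdx = usize;

// Hypothetical index for one (predicate, argument position) pair,
// e.g. link/3 indexed on its second argument: maps an atom to the
// clauses whose argument matches it.
struct ArgIndex {
    by_value: HashMap<String, Vec<ClauseIdx>>,
}

impl ArgIndex {
    // A single hash probe, average O(1), returning every matching
    // clause at once -- no per-clause lookup is needed.
    fn matching(&self, value: &str) -> &[ClauseIdx] {
        self.by_value
            .get(value)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}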

So, how would an algorithm that traverses all outgoing links recursively, starting from a node, perform in Prolog vs. in a pointer-based graph implementation vs. in a Vec-based implementation -- would these be equivalent up to some epsilon?

Or, stated differently, would a Prolog-based implementation have about the same performance characteristics as a Rust implementation?

Edit:

I assume you mean that the index for the functor link and its second argument is based on an IndexMap -- but I don't understand why it's constant on average rather than logarithmic.

Suppose you have a graph with 1M nodes and links combined -- which probably has many more links than nodes, say at a ratio of 1:4, so perhaps around 800K links. Would there be one IndexMap for each functor + argument concatenation? That would generate a lot of small IndexMaps -- perhaps an IndexMap of IndexMaps, with a main one holding about 800K entries (one for each node + link pair)... so the first lookup would take log time while the second is close to constant?

I am trying to visualize how this is typically implemented.

Dan

I don't have much time to write, but you might find my fac code interesting. It uses Vec and indices to handle a large graph. I haven't touched the code in years, so it's not my most beautiful code, as I was much newer to Rust when I wrote it. It was also written before the 2018 edition, although it's now updated. I don't handle freeing of elements, as it's not needed. I believe it uses my tiny set type, which saves vast amounts of memory for big builds. (Fac is a build tool.)

Edit: tinyset::Set64 is what you want for sets of indices. Small sets will be 64 bits in size, and larger ones scale very well.
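
For what it's worth, the Vec-plus-indices pattern looks roughly like this (a minimal sketch with made-up names, not fac's actual code): edges are plain usize indices into the node Vec, so the graph structure involves no pointers or lifetimes at all:

// Hypothetical sketch of an index-based graph.
struct Node {
    name: String,
    outgoing: Vec<usize>, // indices of target nodes in Graph::nodes
}

#[derive(Default)]
struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    // Appending never invalidates earlier indices, which is why a
    // build tool that only ever grows its graph can skip freeing.
    fn add_node(&mut self, name: &str) -> usize {
        self.nodes.push(Node { name: name.to_string(), outgoing: Vec::new() });
        self.nodes.len() - 1
    }

    fn add_edge(&mut self, from: usize, to: usize) {
        self.nodes[from].outgoing.push(to);
    }
}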

Just noticed this master's thesis:

"Evaluating Memory Models for Graph-Like Data Structures in the Rust Programming Language: Performance and Usability" by Rasmus Viitanen.

When a vector reallocates, this has a linear cost because every element must be copied to a new allocation. However, what if you just keep the smaller and the larger allocation around together, and only move items to the larger allocation as new items are pushed? Since you typically double the size of the allocation each time, a buffer of capacity n is replaced by one of capacity 2n, so there are n calls to push before the new buffer fills up -- enough to migrate one old item per push and finish moving everything over by the time you need an even larger allocation.

Thank you.

I still can't visualize this -- is there perhaps a paper or reference discussing this approach that I can look up?

How can two vectors reduce the need to reallocate and move a large number of elements? Eventually one vector will fill up, and if you want a contiguous memory allocation for the Vec, you have to create a new one and move things over.

Your suggestion hasn't clicked yet in my mind :slight_smile:

Dan

Pushing a bunch of items would look something like this; each push moves one old item (and possibly allocates an uninitialized array):

Old                New
[0,1,2,3]
[0,1,2,_]          [_,_,_,3,4,_,_,_]
[0,1,_,_]          [_,_,2,3,4,5,_,_]
[0,_,_,_]          [_,1,2,3,4,5,6,_]
[0,1,2,3,4,5,6,7]
[0,1,2,3,4,5,6,_]  [_,_,_,_,_,_,_,7,8,_,_,_,_,_,_,_]
...

The downside is that you no longer have a single contiguous memory region that contains all of the elements: they're split into two parts, so each lookup needs an extra bounds check to determine which allocation the element is currently in. Element 2, for example, will be either at Old[2] or New[2], depending on how far the backfill has gone at the time of access.

If you need a contiguous slice, you can always run the backfill copying ahead to cover the region you need, though this obviously might take time linear in the length of the returned slice.

You still need to move all the elements, it's just that you don't move all of them at once. This lets you put an upper limit on how long a call to push can take, as each call to push has an upper limit on the number of elements you move at once. In other words, instead of having a few of the push calls take a large amount of time, you spread that time out by making every push call a bit slower.

You can't use this data structure if you want a single contiguous memory allocation. Most of the time, the elements would be split between two allocations. That said, you could have a method like VecDeque::as_slices.
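
In code, the scheme might look something like this -- a minimal, unoptimized sketch of the idea above, not a production data structure (a real implementation would use MaybeUninit rather than Option to avoid per-slot overhead):

use std::mem;

// Hypothetical sketch of a vector with worst-case constant-time push.
struct GradualVec<T> {
    old: Vec<Option<T>>, // smaller buffer, drained from the back
    new: Vec<Option<T>>, // larger buffer, twice the old capacity
    old_live: usize,     // leading slots of `old` still occupied
    len: usize,          // total number of elements
}

impl<T> GradualVec<T> {
    fn new() -> Self {
        GradualVec { old: Vec::new(), new: Vec::new(), old_live: 0, len: 0 }
    }

    // Worst-case constant time: place the new item, then backfill at
    // most one old item into the larger buffer.
    fn push(&mut self, value: T) {
        if self.new.is_empty() {
            // Lazily allocate a buffer of twice the current size.
            let cap = self.len.max(1) * 2;
            self.new = (0..cap).map(|_| None).collect();
        }
        self.new[self.len] = Some(value);
        self.len += 1;
        if self.old_live > 0 {
            // Migrate one element from the end of the old buffer.
            self.old_live -= 1;
            self.new[self.old_live] = self.old[self.old_live].take();
        }
        if self.len == self.new.len() {
            // Everything has migrated: the large buffer becomes the
            // next "old" one, in O(1), without copying any elements.
            self.old = mem::take(&mut self.new);
            self.old_live = self.len;
        }
    }

    // The extra bounds check mentioned above: an element lives in
    // `old` until the backfill passes it, and in `new` afterwards.
    fn get(&self, i: usize) -> Option<&T> {
        if i >= self.len {
            None
        } else if i < self.old_live {
            self.old[i].as_ref()
        } else {
            self.new[i].as_ref()
        }
    }
}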

> Suppose you have a graph with 1M nodes and links combined ... would there be one IndexMap for each functor + argument concatenation ... so the first lookup would take log time while the second is close to constant?

I'd recommend you have a look at Hassan Aït-Kaci's book on the Warren abstract machine, particularly the section on indexing. The lookup time is constant on average according to IndexMap's documentation; I think the reason is that IndexMap uses hashing for lookup. Your idea is fairly close to what the Warren abstract machine does -- indices are distributed as small hash tables throughout the WAM bytecode.

I think that, if you account for the overheads of interpreting WAM bytecode and memory management, then yes, the performance should be comparable to that of a pointer-based or Vec-based graph implementation, up to a constant multiplier.

I think this is very promising (amazing, actually) for Prolog -- it means that if the relational data structures are chosen well, then, up to garbage collection, performance for graph-based algorithms can be close to what can be achieved in languages like C and C++ with pointer-based graph structures.

And I assume that in Rust, a vector-based implementation would approximate that as well.

And garbage collection in Prolog seems simpler than in other languages, given how the Prolog virtual machine stack works -- I guess not unlike Rust, where scope rules, and hence deallocations, are stack-like...

Dan
