Rust, distributed hash table

  1. Both Rust crates & Algorithms / Theory papers welcome.

  2. Imagine we have a cluster computers; where computers are regularly being added and removed (crash) from the cluster.

  3. Suppose we care about two ops:

Set(u64, V) => ()
Get(u64) -> Option

  1. When we have n machines, we want the space of u64 to be split over the n machines.

  2. When a new machine is added, we want it to "steal portion of the keyspace" from existing machines.

  3. When a machine crashes, we want to "redistributed key space (of crashed machine)" to the running machines. During the crash, we lose the V's, so we end up temporarily returning None's until a Set happens.

Question: what algorithms solve this problem? what Rust crates exist for this ?

If I understand your requirements correctly you want to "shard" data over many machines. Because of its shear size or for performance. You also want it to be fault tolerant.

For the fault tolerant part you probably want to look into Lamport's "paxos" algorithm or new "raft" algorithms:

All this thinking stems from Lamport's paper: https://www.microsoft.com/en-us/research/uploads/prod/2016/12/The-Byzantine-Generals-Problem.pdf

If I understand 'etcd' uses the Raft algorithm: etcd/raft at main · etcd-io/etcd · GitHub as does the Cockroach distributed SQL database: https://www.cockroachlabs.com/

There is a Rust crate for Raft: https://crates.io/crates/raft

1 Like

It is not obvious to me how to solve this problem with Raft / Paxos.

Although the nodes in the cluster do need to "achieve consensus" on how the key space is split, I am not sure this is the right approach.

It's not obvious to me either.

As you say, as far as I can tell the nodes in the cluster need to achieve consensus.

Having kept half an eye on this kind of thing since discovering the Byzantine Generals paper many years ago I'm not aware of any better solutions to arriving at consensus than Raft/Paxos.

If they are not the right approach I'd love to know what might be.

Still, I'm sure there are many smart people who have been thinking about the consensus problem over the years and I'm not aware of their work.

This is the same boat I am in. This sounds like a sufficiently general problem that I am confident the 'right solution' exists out there. I am just not sure what it is called. :slight_smile:

Have you tried searching Google Scholar for "distributed hash table"? I see plenty of results there.

No, I have not read any of those papers. I have read Distributed hash table - Wikipedia a few times and tried to find a textbook on Amazon (to no avail).

Is there a consensus yet on which algorithm is best, or is there a number of different algorithms, depending on specific scenarios ?

I have no idea, my knowledge of distributed systems ends about where @ZiCog's does :‌) But you can almost certainly read and understand many of the academic papers about this problem without reading a whole textbook first, CS papers tend to be pretty friendly that way (much more so than e.g. pure math papers).

Two papers I would start with are Chord, which introduced the use of consistent hashing for distributed key-value storage, and Dynamo, a practical implementation of similar architectural ideas, used for some of Amazon's early distributed systems.

3 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.