Hi @tigranbs. Treescale looks really cool and useful. From my very quick reading, the code is very clean and readable; I was able to grasp how it works and what it provides in about 5-10 minutes. It’s always nice seeing things like that.
It appears you have the networking and routing down pat, and the approach looks like a good one. It also appears you are getting started on making events persistent to provide a type of queue. This is a great idea! Note that if you plan on providing a truly lossless queue, you will have to handle network partitions as well as node failures. You don’t want to end up in a situation like RabbitMQ, where network partitions cause incorrect behavior (split brain). Kafka gets this right as far as I know. If the events are best-effort, the problem becomes much easier. Making a distributed, consistent queue is extremely challenging, so I encourage you to look at other existing systems and read the literature to see where you can avoid mistakes made in the past and improve on the state of the art. Having this type of system in Rust would be a great boon for the ecosystem, and I am rooting for you to pull it off!
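To make the split-brain point concrete, here is a minimal sketch (names and structure are illustrative, not from treescale) of the standard defense: only acknowledge a write as durable once a majority quorum of replicas confirms it, so a partitioned minority can never commit diverging writes.

```rust
// Hypothetical sketch of majority-quorum commit, not treescale's API.

#[derive(Debug, PartialEq)]
enum WriteResult {
    Committed,
    Unavailable, // not enough reachable replicas; caller must retry or fail
}

fn try_commit(acks: &[bool]) -> WriteResult {
    let cluster_size = acks.len();
    let ack_count = acks.iter().filter(|&&a| a).count();
    // Any two majorities intersect, so two partitioned halves can
    // never both commit: the minority side refuses writes instead.
    if ack_count > cluster_size / 2 {
        WriteResult::Committed
    } else {
        WriteResult::Unavailable
    }
}

fn main() {
    // 3-node cluster with one node partitioned away: still commits.
    assert_eq!(try_commit(&[true, true, false]), WriteResult::Committed);
    // Only the local node acked: refuse rather than risk split brain.
    assert_eq!(try_commit(&[true, false, false]), WriteResult::Unavailable);
}
```

The cost of this choice is availability on the minority side of a partition, which is exactly the CAP trade-off a lossless queue has to document.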
One thing that did concern me, however, was some of the claims in the documentation. As a long-time developer with a background in distributed systems, seeing statements like

> 0 cost unlimited scaling

immediately set off red flags that the marketing is either misguided or deceitful. Instead of statements like this, I encourage you to directly provide examples of your scalability. For instance, in some scenarios routing may take events across multiple nodes, while in others it may send them directly. This variance, and any persistence to disk, is definitely not zero cost, and any latency stemming from it should be explained to the reader in an up-front manner. Also, you should define what exactly provides the lossless guarantees mentioned in the docs and how these impact scalability.
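A back-of-the-envelope model (all numbers below are made up for illustration, not measurements of treescale) shows why the docs should state this variance: multi-hop routing plus an fsync for durability can easily cost an order of magnitude more than a direct, best-effort send.

```rust
// Illustrative latency model: each routing hop and each fsync adds cost.
// per_hop_us and fsync_us are assumed numbers, not benchmarks.
fn delivery_latency_us(hops: u32, per_hop_us: u32, fsync_us: u32) -> u32 {
    hops * per_hop_us + fsync_us
}

fn main() {
    // Direct delivery, best effort, no disk: 1 hop.
    let direct = delivery_latency_us(1, 100, 0);
    // Routed through 2 intermediate nodes with persistence to disk.
    let routed = delivery_latency_us(3, 100, 1000);
    // The durable, routed path is more than 10x slower in this model.
    assert!(routed > 10 * direct);
}
```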
Next, you argue against horizontal scalability, but that is exactly the goal you should be striving for. It simply means that as you add more nodes, the system scales almost linearly. You can’t do any better than that. Also, there is no reason that horizontal scalability is tied to a request/response system at all. Any number of messaging patterns can be used, as that is orthogonal to the underlying characteristics of the network and more dependent upon algorithm and CAP theorem tradeoffs.
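The reason "almost linearly" is the ceiling is captured by Amdahl's law: any serialized coordination work bounds the achievable speedup no matter how many nodes you add. A tiny sketch (the 5% serial fraction is an assumed example value):

```rust
// Amdahl's law: speedup(n) = 1 / (s + (1 - s) / n),
// where s is the fraction of work that is serialized (coordination).
fn speedup(nodes: f64, serial_fraction: f64) -> f64 {
    1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)
}

fn main() {
    // With just 5% of work serialized, 10 nodes give ~6.9x, not 10x.
    let s10 = speedup(10.0, 0.05);
    assert!(s10 > 6.8 && s10 < 7.0);
    // And the ceiling is 1/s = 20x, no matter how many nodes you add.
    let s1000 = speedup(1000.0, 0.05);
    assert!(s1000 < 20.0);
}
```

This is why near-linear horizontal scaling is a strong claim worth demonstrating with benchmarks rather than asserting.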
Lastly, I’d encourage you to reword the language talking about

> Slave

There really is no need to use those terms anymore; they are triggering and counterproductive to inclusivity in your project. Instead, use words like

> follower

and when the two are combined you can call them peers.
To summarize, great code so far, but be a bit more humble and descriptive in your documentation.