I'm working on a distributed application and I'm looking for a way to simplify communication between different processes (some of them services, some of them more like workers/clients). In the past I was using TCP connections and serde to serialize and deserialize messages, but it involved quite a lot of work to set it all up. So I'm wondering - do you know about any tools that could make it simpler?
The only libraries that come close to solving my problem are various RPC libraries, but all of them allow only client/server communication, not two-way messages. This can be overcome by opening two connections between each pair of services, but for many reasons that's not great. The main reason I don't want to do it is that I want some of the processes to connect to a central service. The central service will need to listen on a certain IP and port, but some of the processes might not be set up this way - they might be inaccessible from the outside.
Any pointers would be appreciated!
I am not aware of a Rust-based system that does this, but I think Apache Hadoop might give you some ideas - the Hadoop FS and the job scheduler basically need to communicate across CPUs on different machines to function.
I'm sure Hadoop has some interesting solutions, but for me that would probably be massive overkill. I don't really need anything very performant, just an abstraction for sending structured messages over TCP. Maybe it's finally time to write a library myself, I guess?
We are using the NATS messaging system https://nats.io/ to do this kind of thing.
NATS is dead easy to get started with, and there are NATS client libraries for all kinds of languages besides Rust.
The simplest setup is to have data producers "publish" messages to a NATS server and consumers "subscribe" to those messages. But you can also use NATS in a request/reply manner between an individual client and server.
If you need it you can use NATS with authentication and TLS support. As a bonus it can be used from both sync and async Rust programs (using Tokio).
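To make the pub/sub and request/reply bit concrete, here's roughly what it looks like from Rust with the async-nats crate (the subject names and server address are made up for illustration; this assumes Tokio, a reachable NATS server, and for the request part some other process subscribed to "jobs.status" that sends a reply):

```rust
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to a NATS server (address is an example).
    let client = async_nats::connect("nats://127.0.0.1:4222").await?;

    // Publish/subscribe: fire-and-forget messages on a subject.
    let mut sub = client.subscribe("jobs.updates".to_string()).await?;
    client
        .publish("jobs.updates".to_string(), "worker 1 done".into())
        .await?;
    if let Some(msg) = sub.next().await {
        println!("got: {}", String::from_utf8_lossy(&msg.payload));
    }

    // Request/reply: an RPC-style round trip over the same connection.
    // This awaits until some responder on "jobs.status" replies.
    let reply = client
        .request("jobs.status".to_string(), "job-42".into())
        .await?;
    println!("status: {}", String::from_utf8_lossy(&reply.payload));
    Ok(())
}
```

Note that the client never listens on anything itself - it only dials out to the server, which is what makes the "processes behind NAT" case work.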
If you ever need to scale up it's easy to spin up multiple NATS servers, and they will distribute messages among themselves to the right destinations. This also adds fault tolerance: a NATS client can be given a list of IP addresses of servers in a cluster, and it will switch to alternative routes if one fails.
In our application we simply use a single NATS server in the cloud to accept connections from many edge clients. Mostly the messages are JSON (via serde), although you can use whatever format you like. Then there are a bunch of processes on a few servers in the cloud working together through NATS.
By the way those edge NATS clients aren't accessible from our cloud servers as they are behind mobile networks. Which I think addresses your point about processes being inaccessible from the outside.
We went with NATS as it totally removes all that messing around with socket connections and all that low level client/server stuff. It decouples all our processes from each other, they only need to be able to get to an IP address of a NATS server.
Thanks for the reply! NATS certainly looks very interesting and I will surely check it out, as I might need something like it for a distributed project in the future. But for what I need right now it seems like overkill as well. In my scenario there will be a central service and clients (whether remote workers or a CLI to control it), so it will be one connection per client only, nothing fancy. That's why I want to keep it extremely simple: deploy a Docker image with the main service somewhere, spin up clients with the main service's IP, and that's it.