Design patterns for secure multi-machine UDP traffic

I have a Rust "server" that runs on multiple machines. These machines are not part of a VPC. They talk to each other only over the public internet.

If the traffic were over TCP, I would just set up SSH tunnels between the machines. However, the traffic is over UDP.

Is there a simple way to secure this? Ideally I include some crate, hard-code some public/private keys, and the machines can talk to each other using DTLS over UDP.

If you want to distribute messages between widely distributed machines in a secure manner over the public internet, I would strongly recommend using a ready-made solution that does exactly that.

We use the NATS messaging system for that: https://nats.io/

NATS messaging servers can cluster themselves over all kinds of machines at all kinds of locations. The cluster routes messages between its nodes, so clients sending and receiving messages can connect to any node in the cluster. The system provides fault tolerance: nodes can go down and the cluster still works, and clients just use another working node if they have to. NATS takes care of security, authentication, etc.

The simplest use of NATS is for senders to "publish" messages to the cluster and receivers to "subscribe" to interesting messages from the cluster. You can also do request/response and other things.

Sadly NATS is written in Go, not Rust (there is a project for someone), but it works very well and reliably.

Using such a solution saves you from having to mess around with point-to-point messaging between all your machines and taking care of security, etc.

Is there some reason it needs to be dealing directly in raw UDP? The obvious way to secure it would be to use a solution that already includes TLS and works over UDP -- perhaps QUIC, which is widely supported already as part of (the draft of) HTTP/3. Or you can use something older, like the Stream Control Transmission Protocol (SCTP), which supports tunnelling over UDP.

Great question. All I need is for it to be unordered and unreliable. I absolutely need to avoid head-of-line blocking, and the application layer has logic to handle unreliability.

Well that's why HTTP/3 moved to QUIC (from TCP):

The switch to QUIC aims to fix a major problem of HTTP/2 called "head-of-line blocking"
~ https://en.wikipedia.org/wiki/HTTP/3#Comparison_with_HTTP/1.1_and_HTTP/2

So maybe you just want to send over multiple QUIC streams (it multiplexes them) rather than packets. The Wikipedia page helpfully even mentions 3 different Rust crates for it.

(And consider, depending on the details, leaving some of the dealing-with-unreliable to the protocol layer.)

Are you sure you need asymmetric cryptography? You could distribute a shared secret across hosts manually and encrypt packets using an AEAD algorithm. In addition to the encrypted payload, you would simply append a randomly generated nonce and the verification tag. It should be relatively simple to implement, and it would provide confidentiality, integrity, and authenticity.
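To make the "append the nonce and tag" part concrete, here is a minimal sketch of the sender-side framing, using only the standard library. It performs no encryption: the ciphertext and tag are assumed to come from whatever AEAD implementation you pick. The function name build_packet, the field order (ciphertext, then nonce, then tag), and the 12-byte nonce / 16-byte tag sizes (those of ChaCha20Poly1305) are assumptions made for illustration, not something a particular crate prescribes.

```rust
/// Sizes for ChaCha20Poly1305 (assumed AEAD for this sketch).
const NONCE_LEN: usize = 12;
const TAG_LEN: usize = 16;

/// Hypothetical wire layout: ciphertext || nonce || tag.
/// `ciphertext` and `tag` are the outputs of the AEAD encryption step,
/// which is not performed here.
fn build_packet(
    ciphertext: &[u8],
    nonce: &[u8; NONCE_LEN],
    tag: &[u8; TAG_LEN],
) -> Vec<u8> {
    let mut packet = Vec::with_capacity(ciphertext.len() + NONCE_LEN + TAG_LEN);
    packet.extend_from_slice(ciphertext);
    packet.extend_from_slice(nonce);
    packet.extend_from_slice(tag);
    packet
}
```

Many AEAD APIs already return the ciphertext with the tag appended, in which case only the nonce needs to be added by hand.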

If you do need asymmetric cryptography after all, then one potentially simple solution is crypto_box.

You are absolutely right, this problem can be solved using symmetric crypto only.

The reason I mentioned public-key crypto is that the original solution I had in mind looked something like:

  1. use openssl to generate a root certificate authority (CA)
  2. use the root CA to generate an intermediate CA
  3. for every machine, generate a certificate; sign the cert with the intermediate CA
  4. any time two machines want to talk, they use the certs to authenticate each other, then use the certs to establish a symmetric key

I was going to just use https://crates.io/crates/openssl

What are the advantages of crypto_box?

  1. I have looked at, but never used, HTTP/3 or QUIC.

  2. I don't fully understand your solution, so I apologize if I am attacking a straw man.

Here is the problem I see: if I set up HTTP/3 servers on every machine, then:

  1. I need to give each one a CA-signed certificate (HTTP/3 requires a certificate)

  2. when machine B tries to talk to machine A, machine B still needs to authenticate itself

===

I guess what I see here is: this does little for key distribution, but after the keys are distributed, use QUIC instead of raw UDP? Not sure; again, I might be misunderstanding and attacking a straw man here.

@scottmcm: This might be a good time for me to familiarize myself with QUIC. Which crate would you recommend? Currently I am looking at quinn (quinn-rs/quinn on GitHub), an async-friendly QUIC implementation in Rust.

It's a pure-Rust crate, with all the advantages that entails: static linking, smaller footprint, ease of cross-compilation, etc. It also provides a smaller API surface that is harder to misuse (though you should still be careful about potential nonce reuse, you should be fine as long as you generate random nonces using ThreadRng or OsRng).

If you do not trust the linked RustCrypto crates, then you could use Rust wrappers around libsodium. It also should be possible to use OpenSSL's implementation of AEAD modes (e.g. AES-GCM) instead of relying on the significantly more complicated CA-based protocol.

@newpavlov

Interesting: does crypto_box support DTLS? If not, how does it work with UDP?

crypto_box simply encrypts a message using the public key of the recipient and the private key of the sender. You encrypt the packet payload and send it via UDP, nothing more. I think the crate introduction describes it well enough. crypto_box does not contain any notion of CAs or sessions. You would have to distribute the list of accepted public keys to all your nodes yourself.
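For the "send it via UDP, nothing more" part, a minimal sketch using only std::net might look like the following. It treats the packet as opaque, already-encrypted bytes and sends one datagram between two loopback sockets. The helper name loopback_roundtrip, the 1500-byte receive buffer, and the 2-second read timeout are assumptions for illustration.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Sends `packet` from one loopback socket to another and returns the
/// bytes that were received. `packet` is assumed to be already encrypted.
fn loopback_roundtrip(packet: &[u8]) -> io::Result<Vec<u8>> {
    // Port 0 asks the OS for any free port.
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    // UDP is unreliable, so do not block forever on a lost datagram.
    receiver.set_read_timeout(Some(Duration::from_secs(2)))?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;

    // One send_to call produces exactly one datagram.
    sender.send_to(packet, receiver.local_addr()?)?;

    // One recv_from call returns exactly one datagram (truncated if it
    // does not fit in the buffer).
    let mut buf = [0u8; 1500];
    let (len, _from) = receiver.recv_from(&mut buf)?;
    Ok(buf[..len].to_vec())
}
```

Because each datagram arrives as a unit, a packet that carries its own nonce and tag can be handed to the decryption step on its own, regardless of what arrived before it.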

But, as I said earlier, it looks like you should be fine with a far simpler solution based on a secret shared across all your nodes. Of course, leaking this shared secret (key) would allow an attacker to decrypt all previous messages and forge new ones, so the applicability of such a solution depends on your threat model (e.g. whether you need forward secrecy or not).

Note that the crypto_box approach also does not provide forward secrecy. Leaking the private key of a node will allow an attacker to decrypt all messages addressed to it (including previously recorded ones) and forge messages in its name. For forward secrecy you would have to manage ephemeral keys and sessions, thus making the communication protocol stateful.

@newpavlov : I re-read this thread, and I think I figured out what went wrong.

  1. You were suggesting crypto_box only for pub key crypto.

  2. I was asking about symmetric-key crypto for UDP, i.e. aes-128-gcm, after the symmetric key is established.

Please, read the crypto_box docs properly. It includes authenticated encryption in the form of XSalsa20Poly1305 or XChaCha20Poly1305.

I've not used it myself, so I can't give recommendations. It just seemed like an obvious fit, since it was designed for multi-stream secure communication over UDP with no head-of-line blocking, exactly like you were mentioning.

Note that you can use QUIC without using HTTP/3, so "if I setup http3 servers on every machine" may well be irrelevant for what you need to do.

And I have no idea if cert-auth is possible in QUIC the way it is in HTTP-with-TLS-on-TCP. It may well be, and if so might be sufficient to authenticate and authorize both directions -- you could just put the same cert on all the machines. (Obligatory asterisk for cert rolling making everything harder, if you need to be able to do it without shutting everything down at once.)

Can you please point me to a single example that uses crypto_box for encrypting UDP?

You are frustrated by the fact that I have not read the crypto_box documentation in full.

I am frustrated by the fact that you are suggesting crypto_box without showing evidence that DTLS / encrypting UDP is possible with crypto_box. In particular, we need something where

  • if enc_{N} arrives before enc_{N-1}, we can decrypt enc_{N} immediately, without waiting for enc_{N-1} to arrive.

I believe the formal term for this is 'DTLS'. I cannot find any evidence that crypto_box supports DTLS.

This is a real, not hypothetical, problem, as a different library, rustls, currently does not appear to support DTLS either: see the pull request "[WIP/DRAFT] Continuation on DTLS support" by TimonPost (rustls/rustls, Pull Request #326 on GitHub).

I will cite myself:

I propose a lower level and simpler solution than one you are seemingly fixated upon (but note the mentioned trade-offs).

You encrypt and decrypt each packet independently of the others. In other words, let's say you want to send a UDP packet with the payload hello. You generate a random nonce (it must be different for each packet) and encrypt the payload with an AEAD algorithm (using either a shared secret key or a key computed via ECDH, as implemented in crypto_box). Now you send 5 + N + M bytes in a single UDP packet, where N and M are the lengths of the nonce (12 bytes for ChaCha20Poly1305) and the verification tag (16 bytes for ChaCha20Poly1305) respectively. The receiving node accepts the packet and splits it into ciphertext, nonce, and tag. Next, it verifies the validity of the tag and decrypts the ciphertext, passing the resulting plaintext on to your program. Note that this protocol is completely stateless.
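As a rough sketch of the size arithmetic and the receiver-side split described above, using only the standard library: the names packet_len and split_packet and the field order (ciphertext, then nonce, then tag) are assumptions for illustration. Tag verification and decryption are left to the AEAD crate and are not shown.

```rust
const NONCE_LEN: usize = 12; // N for ChaCha20Poly1305
const TAG_LEN: usize = 16; // M for ChaCha20Poly1305

/// Datagram size for a payload of `payload_len` bytes: payload + N + M.
/// ChaCha20 is a stream cipher, so ciphertext and plaintext have the
/// same length.
fn packet_len(payload_len: usize) -> usize {
    payload_len + NONCE_LEN + TAG_LEN
}

/// Splits a received datagram into (ciphertext, nonce, tag), assuming the
/// hypothetical layout ciphertext || nonce || tag. Returns None if the
/// datagram is too short to contain a nonce and a tag.
fn split_packet(packet: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    let ciphertext_len = packet.len().checked_sub(NONCE_LEN + TAG_LEN)?;
    let (ciphertext, rest) = packet.split_at(ciphertext_len);
    let (nonce, tag) = rest.split_at(NONCE_LEN);
    Some((ciphertext, nonce, tag))
}
```

For the payload hello, packet_len(5) gives 5 + 12 + 16 = 33 bytes. Nothing in split_packet depends on any other packet, which is what allows packets to be processed in whatever order they arrive.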

I don't know how to explain it in simpler terms, so it probably will be my last comment in this thread.

An easy layered solution would be to set up a WireGuard VPN using either userspace or in-process networking, then use regular UDP connections inside the tunnel.

(In process means that the VPN stack is compiled in to your application binary, and the OS just sees encapsulated packets.)

This, to the best of my knowledge, is an algorithm you made up on the spot, not a standard algorithm.

The way to prove crypto secure is by providing a reduction of the form: given a blackbox that can efficiently break this protocol, we can build an efficient algorithm for breaking assumed difficult crypto primitive.

You are not willing to go through the effort to show that. That is fine. From my perspective, lacking such a proof, there is no reason to believe this protocol is secure.

I would like to congratulate you on creating a protocol that is "completely stateless" while the rest of the community tends to need counters of some form when encrypting more than one message.

Sigh... It's a very straightforward application of the standard AEAD primitive to UDP communication, under the assumption that you can share a secret key or a list of accepted public keys across nodes and that you do not need forward secrecy. It's not so different from how TLS works at the very base layer; the difference is that TLS works with ephemeral keys instead of fixed keys. But it looks like you don't wish to understand the basics. In that case, it would indeed be better for you to use a high-level "boxed" solution as proposed by others, so you can forget about my comments.

Use of counters is nothing more than an optimization. It's more efficient to increment a nonce than to generate a random nonce for each packet. Using completely random nonces is the safest way to avoid screwing up with nonce reuse.
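For comparison, a counter-based nonce could be sketched as below, using only the standard library. The layout (a 4-byte sender id followed by an 8-byte big-endian counter, 12 bytes in total) and the names NonceCounter and next_nonce are assumptions for illustration. The sender id is there because nodes sharing one key must never produce the same nonce.

```rust
/// Hypothetical per-sender nonce generator:
/// 4-byte sender id || 8-byte big-endian counter = 12 bytes.
struct NonceCounter {
    sender_id: [u8; 4],
    next: u64,
}

impl NonceCounter {
    fn new(sender_id: [u8; 4]) -> Self {
        Self { sender_id, next: 0 }
    }

    /// Returns a nonce that this generator has not returned before, or
    /// None once the counter cannot be advanced any further (at that
    /// point the key has to be replaced).
    fn next_nonce(&mut self) -> Option<[u8; 12]> {
        let counter = self.next;
        // Advance first; on overflow, refuse to hand out a nonce.
        self.next = self.next.checked_add(1)?;
        let mut nonce = [0u8; 12];
        nonce[..4].copy_from_slice(&self.sender_id);
        nonce[4..].copy_from_slice(&counter.to_be_bytes());
        Some(nonce)
    }
}
```

The cost is state: the counter has to survive restarts (or the key has to change on restart), otherwise nonces repeat. Random nonces avoid that bookkeeping.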