Theoretical Discussion on Peer Discovery

I am hoping to start, or should I say restart, the discussion on potential peer discovery solutions for a truly pure peer-to-peer network. I am attempting to create what I would call a "stupid" p2p network and this one concept seems either impossible, which I dislike even typing, or just hasn't been solved.
So let me create a theoretical environment (yes this is skewed towards the constraints in my project but makes the solution a little spicier).

  • All peers will become peers through the application and their existence is dynamic and highly volatile.
  • All peers must serve all functions. (So no dedicated relay, bootstrap, rendezvous nodes, etc.)
  • The application has no knowledge of other peers, servers, static internet resources, etc. only knowledge of itself and its network interface.
  • The network itself may be, and most likely will be, high latency, low bandwidth, serverless, no internet connection, and presentable to the application/gateway as a WAN.
  • The application must be network agnostic, so no solutions requiring special network configs (this may be the most "impossible" constraint).
  • The peers operate on low-end commodity computing resources

I have looked into many p2p implementations from Bitcoin, Ethereum, IPFS, BitTorrent, countless GitHub projects, and several discussion threads like this one. All seem to point to the absolute necessity for knowledge of some static resource, whether it be an online resource, a server, or a relay. Other solutions I have found require unique networking environments or VPN tunneling hardware packaged alongside the application. I am hoping to discover a solution that avoids all of these and allows a node to simply come online, discover other nodes running the same application, and then bootstrap off one another.

There's no way around having a way to introduce new nodes to the network. When you're working across multiple networks your only option is to have a "known peer" which is often a static resource of some sort (usually a DNS name). It doesn't have to be fully static though, as long as peer addresses are stable and don't have high churn[1]. When you first initialize a node you give it a currently running peer, and then have it store some number of active peer addresses to try connecting to next time it restarts.
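As a rough illustration of the "store some number of active peer addresses" idea, here is a minimal sketch using only the standard library. The `PeerCache` type, its method names, and the one-address-per-line file format are all made up for this example and are not taken from any particular library:

```rust
use std::collections::VecDeque;
use std::fs;
use std::io;
use std::net::SocketAddr;
use std::path::Path;

/// Bounded list of peers we have talked to, most recently seen first.
/// (Hypothetical type for illustration only.)
pub struct PeerCache {
    peers: VecDeque<SocketAddr>,
    capacity: usize,
}

impl PeerCache {
    pub fn new(capacity: usize) -> Self {
        Self { peers: VecDeque::new(), capacity }
    }

    /// Record a peer we just talked to, moving it to the front.
    /// The least recently seen peer is dropped once the cache is full.
    pub fn record(&mut self, addr: SocketAddr) {
        self.peers.retain(|p| *p != addr);
        self.peers.push_front(addr);
        self.peers.truncate(self.capacity);
    }

    /// Addresses to try on the next restart, most recently seen first.
    pub fn candidates(&self) -> impl Iterator<Item = &SocketAddr> {
        self.peers.iter()
    }

    /// Persist the cache as one address per line.
    pub fn save(&self, path: &Path) -> io::Result<()> {
        let text: String = self.peers.iter().map(|p| format!("{}\n", p)).collect();
        fs::write(path, text)
    }

    /// Load a cache from disk, skipping lines that do not parse as addresses.
    pub fn load(path: &Path, capacity: usize) -> io::Result<Self> {
        let text = fs::read_to_string(path)?;
        let peers: VecDeque<SocketAddr> = text
            .lines()
            .filter_map(|line| line.trim().parse().ok())
            .take(capacity)
            .collect();
        Ok(Self { peers, capacity })
    }
}
```

On restart a node would walk `candidates()` in order and fall back to some other bootstrap method only if every cached address fails.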

If your nodes won't have Internet access, will they all be on the same network? If so, you can use something like multicast DNS, which runs over multicast UDP, to introduce new peers to the network.
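For the same-network case, a bare-bones multicast announcement can be sketched with the standard library alone. This is not mDNS itself, just the underlying idea: the group address, port, magic string, and message layout below are arbitrary choices for the example, and the node ID is assumed to be some random number each node picks at startup.

```rust
use std::io;
use std::net::{Ipv4Addr, SocketAddr, UdpSocket};
use std::time::Duration;

// Hypothetical values: any administratively scoped group and free port would do.
const GROUP: Ipv4Addr = Ipv4Addr::new(239, 255, 42, 99);
const PORT: u16 = 6666;
const MAGIC: &str = "HELLO-P2P";

/// Build the announcement datagram: magic string, our node ID, our listening port.
pub fn encode_announce(node_id: u64, listen_port: u16) -> Vec<u8> {
    format!("{} {} {}", MAGIC, node_id, listen_port).into_bytes()
}

/// Parse an announcement. Returns the sender's node ID and the address it
/// listens on (sender IP plus announced port), or None if it is not ours.
pub fn decode_announce(data: &[u8], from: SocketAddr) -> Option<(u64, SocketAddr)> {
    let text = std::str::from_utf8(data).ok()?;
    let mut parts = text.strip_prefix(MAGIC)?.split_whitespace();
    let id = parts.next()?.parse::<u64>().ok()?;
    let port = parts.next()?.parse::<u16>().ok()?;
    Some((id, SocketAddr::new(from.ip(), port)))
}

/// Announce ourselves once, then collect announcements from other nodes
/// until no datagram arrives for `wait`.
/// Note: without SO_REUSEADDR only one node per host can bind PORT.
pub fn discover_once(node_id: u64, listen_port: u16, wait: Duration) -> io::Result<Vec<SocketAddr>> {
    let socket = UdpSocket::bind((Ipv4Addr::UNSPECIFIED, PORT))?;
    socket.join_multicast_v4(&GROUP, &Ipv4Addr::UNSPECIFIED)?;
    socket.set_read_timeout(Some(wait))?;
    socket.send_to(&encode_announce(node_id, listen_port), (GROUP, PORT))?;

    let mut found = Vec::new();
    let mut buf = [0u8; 64];
    loop {
        match socket.recv_from(&mut buf) {
            Ok((len, from)) => {
                if let Some((id, peer)) = decode_announce(&buf[..len], from) {
                    // Our own announcement is looped back to us; skip it.
                    if id != node_id && !found.contains(&peer) {
                        found.push(peer);
                    }
                }
            }
            // A read timeout surfaces as WouldBlock or TimedOut depending on the OS.
            Err(e) if matches!(e.kind(), io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut) => break,
            Err(e) => return Err(e),
        }
    }
    Ok(found)
}
```

Multicast like this normally stops at the first router, so it only helps when peers share a link or when the routers are configured to forward the group.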


  1. i.e. a node is never offline long enough for all of its known peers to have disappeared or changed addresses

I am beginner/intermediate in my knowledge of networking so I may/will be glossing over details here. The churn rate can be expected to be extremely high, as in the network will need to survive despite being potentially reduced to a single peer from several hundred or thousand, instantly.
They will all technically be on the same network, but will not know they are, the same way a normal WAN operates where local devices know nothing of the global network topology. So each node would have an inside global IP through their router in, let's say, 192.168.30.0/24, but every router could potentially be leasing from the same DHCP pool, 10.0.0.0/24, meaning every node could have the same inside local IP of 10.0.0.1. My program implements mDNS and discovers local peers fine (of which there will be none, as there is a single node for each WAN-facing router), but as soon as I separate them with the above schema they cannot discover each other.
The DNS option would still require some static server to host the DNS correct?
Adherence to the constraints is critical in my case. Imagine if you were creating a p2p hiking app, (not what I'm doing but a good commercial application) and all you had was a computer of some sort and a router that only connects to other routers of that type with ip addrs in the same range (no internet). How could you make that app discover other hikers using that app? Assume the churn comes from squirrels that hate routers and computers and actively seek out and destroy them. The best solution I've heard so far would be VPN tunneling, but that still requires some sort of hardware implementation, which I need to remain agnostic to, to the best of my abilities.
As a side note as well, I am testing with my own emulation of a WAN inside ESXi, (much harder to do than anticipated), so the networking issues I am specifically dealing with may not apply when dealing with general WAN implementations.
Do you know of a simple WAN emulator, that does not utilize internet, that would alleviate developer error?

The "stupid" p2p solution would be

  1. Every node opens port 6666
  2. Every node does a port scan of every possible IP address
  3. Upon detecting an open port 6666, connect to it
  4. The connected nodes share information about IP addresses
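Steps 2 and 3 can be sketched with nothing but the standard library. To keep it runnable, this assumes the scan is limited to the /24 around a given address rather than "every possible IP address", and the helper names are invented for the example:

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpStream};
use std::time::Duration;

/// The well-known port from step 1.
const PORT: u16 = 6666;

/// Every host address in the /24 that contains `ip` (skipping .0 and .255).
pub fn hosts_in_slash24(ip: Ipv4Addr) -> Vec<Ipv4Addr> {
    let [a, b, c, _] = ip.octets();
    (1..=254).map(|d| Ipv4Addr::new(a, b, c, d)).collect()
}

/// Try a TCP connect to PORT on each candidate, one after another,
/// and return the addresses that accepted the connection.
pub fn scan(candidates: &[Ipv4Addr], timeout: Duration) -> Vec<SocketAddr> {
    candidates
        .iter()
        .map(|ip| SocketAddr::from((*ip, PORT)))
        .filter(|addr| TcpStream::connect_timeout(addr, timeout).is_ok())
        .collect()
}
```

The arithmetic shows why this only works on small address ranges. With a 200 ms timeout, a sequential sweep of one /24 takes at most about 254 × 0.2 s ≈ 51 s, while the full IPv4 space of 2^32 ≈ 4.3 billion addresses would take roughly 8.6 × 10^8 s, or about 27 years, before any parallelism.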

A p2p solution without a centralized discovery source would have to rely on invites.

Yep, Scuttlebutt functions like this.

The Manyverse app (I think the most popular Android implementation of it) had an entirely local mode where you could connect to other peers over Bluetooth (which would fit the hiking use case) or the local LAN, both of which were removed because they didn't function that well in practice. All that's left is the invite-based method over the internet (using special always-on peers to handle replication).

Well, none of this seems to be particularly Rust-related...

There is often a subtle misunderstanding here; I have encountered it before in the context of BitTorrent's DHT:

It is not that you need to know any specific/central/static resource to join the network. The protocol design allows any and all nodes to be used as contact to initiate bootstrapping.
So as far as the p2p protocol itself is concerned there is no central component and using the "spray and pray" approach would work (at least on IPv4).

Now, for convenience, most implementations will choose to use some fixed bootstrapping mechanism, which could be one or more nodes operated by the maintainers or by the community. Or it could be a regularly updated list of "observed to be long-lived" nodes maintained somewhere. Or you could ask a friend for the IP and port displayed on their client for a one-time bootstrap (followed by local caching of known nodes). You could also consider more opportunistic approaches such as finding nearby nodes via network-local multicast, QR codes, NFC or whatever is appropriate for your situation.

But none of that is inherent in the p2p protocol design itself. The protocol just says "contact any node, once you do that the rest happens automatically"
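To make "contact any node, the rest happens automatically" concrete, here is a toy simulation rather than a real protocol. The `Network` map stands in for asking a live node "who do you know?", and the names are invented for this sketch:

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

pub type NodeId = u32;

/// Stand-in for the network: what each online node would answer when asked
/// for its known peers. Nodes missing from the map are treated as offline.
pub type Network = HashMap<NodeId, Vec<NodeId>>;

/// Starting from a single contact, keep asking every newly learned node for
/// its peers until nothing new turns up. Returns the nodes that answered.
pub fn bootstrap(network: &Network, first_contact: NodeId) -> BTreeSet<NodeId> {
    let mut known = BTreeSet::new();
    let mut to_ask = VecDeque::new();
    to_ask.push_back(first_contact);

    while let Some(node) = to_ask.pop_front() {
        // Offline or unknown nodes simply give no answer.
        if let Some(peers) = network.get(&node) {
            // Skip nodes we have already asked.
            if known.insert(node) {
                for &peer in peers {
                    if !known.contains(&peer) {
                        to_ask.push_back(peer);
                    }
                }
            }
        }
    }
    known
}
```

Which node is used as `first_contact` does not matter to the procedure. Any reachable node in the same connected component leads to the same set, which is the sense in which no node is central.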

The churn rate can be expected to be extremely high, as in the network will need to survive despite being potentially reduced to a single peer from several hundred or thousand, instantly.

That doesn't make sense from a global perspective. It would mean simultaneous failure of many machines. A local instance may lose connectivity to everything, sure. But that's only from its viewpoint. The rest of the network does not magically vanish. Even if babies do not believe in object permanence, it's still a thing.
So once a node regains access to some network it can reconnect to known contacts.

Imagine if you were creating a p2p hiking app, (not what I'm doing but a good commercial application) and all you had was a computer of some sort and a router that only connects to other routers of that type with ip addrs in the same range (no internet).

Well, if you're not planning for the internet but for some other network, then you need to consider what your lower network layers are first. Do you use Bluetooth, Zigbee, p2p Wi-Fi, ... and what capabilities do those provide? It sounds like a case for a mesh network and an associated routing protocol.

Apologies about the lack of Rust-related material. I was given developmental freedom on this project and found Rust and decided it was the best for creating an asynchronous p2p application. I am currently using libp2p, async-trait, futures, and tokio to help me in creating this project but am new to the language, and development, and the project is ambitious. There already are and will be more questions posted by me relating to challenges I am facing with the actual Rust implementation of this concept and more, (trying my best to make it 100% Rust).

I really like the suggestion about a "regularly updated list of 'observed to be long-lived' nodes maintained somewhere". It is a good fit for rebooting a node, but my challenge really lies in the initial bootup. How does a node discover other nodes participating in a network and, if none can be found, become the sole node of a new network and wait for other nodes to come online? Some operators of the nodes will have physical access to one another but others will be isolated, so "asking a friend" will not always work. I will have to look more into the other suggestions for their applicability to my use case, although I am sure they would work in others.

The state of the network will not be predefined, so it will cease to exist at times when no nodes are running this application. The application will work off a virtual network within the app, I guess? That may confuse things further, so let me explain a bit. You mentioned at the end that I would need to consider the lower network layers, and for the demo I can. The hardware serving as the gateway to my "WAN" runs a p2p mesh MANET that will handle all of the discovery, routing protocols, dynamic entry and exit, etc. of any other similar hardware. But the request for the project is that this fact be obfuscated and the network will operate like a WAN, not internet, and the hardware a normal router, (a network engineer is handling this side of things). Point being, I should be able to run my application from "most" networks in spite of internet and it work the same.

As unreasonable as it is, this is where I find the big gap in other discussions. In my use case, the simultaneous failure of many or all nodes participating in the network, whether from disconnection or physical destruction, is possible. There is also the case where the network may split into separate networks and then rejoin one another, but that's another discussion. In the case of all nodes failing, the next node that comes online just assumes it's the sole member of a new p2p network, and the network regrows or awaits contact with another network to merge with.

"Ask a friend" is an example of a more general concept. Which means using a side-channel such as talking to someone, reading a sign with instructions printed on it or anything.

To generalize further, there are some very broad, overlapping categories of how one can bootstrap:

  • fixed or occasionally updated lists.
    how those lists are populated is a sub-problem that in turn has many solutions
  • manual input by various means
  • side-channels / piggybacking on other communication
  • recovery based on previous bootstrap
  • automatic discovery assisted by lower network layers/hardware capabilities/other protocols
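One way to keep these categories interchangeable in code is to hide each behind a common trait and try them in order. This is only a sketch: the trait, the three example sources, and the "first non-empty source wins" policy are assumptions made for illustration, not how any existing library does it.

```rust
use std::net::SocketAddr;

/// Anything that can suggest addresses to try when joining the network.
pub trait BootstrapSource {
    fn name(&self) -> &str;
    fn candidates(&self) -> Vec<SocketAddr>;
}

/// A fixed or occasionally updated list shipped with the application.
pub struct FixedList(pub Vec<SocketAddr>);

/// An address typed in by the operator (or scanned from a QR code, etc.).
pub struct ManualInput(pub Option<SocketAddr>);

/// Peers remembered from a previous successful bootstrap.
pub struct PreviousSession(pub Vec<SocketAddr>);

impl BootstrapSource for FixedList {
    fn name(&self) -> &str { "fixed list" }
    fn candidates(&self) -> Vec<SocketAddr> { self.0.clone() }
}

impl BootstrapSource for ManualInput {
    fn name(&self) -> &str { "manual input" }
    fn candidates(&self) -> Vec<SocketAddr> { self.0.iter().copied().collect() }
}

impl BootstrapSource for PreviousSession {
    fn name(&self) -> &str { "previous session" }
    fn candidates(&self) -> Vec<SocketAddr> { self.0.clone() }
}

/// Walk the sources in priority order and return the first one that has
/// anything to offer, together with its name.
pub fn first_candidates(sources: &[Box<dyn BootstrapSource>]) -> Option<(String, Vec<SocketAddr>)> {
    sources.iter().find_map(|source| {
        let found = source.candidates();
        if found.is_empty() {
            None
        } else {
            Some((source.name().to_string(), found))
        }
    })
}
```

A source backed by lower-layer discovery (multicast, Bluetooth, a MANET neighbor table) would just be another implementation of the same trait.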

You also need to consider splitting your requirements based on abstraction levels. Designing an application around only having temporary network access is a higher-layer concern (e.g. having to queue up data for later delivery) compared to in-the-moment neighbor-discovery and routing.
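The higher-layer half of that split, queuing data for later delivery, might look something like the following. The `Outbox` type and its drop-oldest policy are assumptions made for this sketch, and the `send` closure stands in for whatever transport is actually available at the time:

```rust
use std::collections::VecDeque;

/// Bounded store-and-forward queue for messages that cannot be sent yet.
/// (Hypothetical type for illustration only.)
pub struct Outbox {
    pending: VecDeque<Vec<u8>>,
    capacity: usize,
}

impl Outbox {
    /// Create an outbox holding at most `capacity` messages (minimum 1).
    pub fn new(capacity: usize) -> Self {
        Self { pending: VecDeque::new(), capacity: capacity.max(1) }
    }

    /// Queue a message, dropping the oldest one if the outbox is full.
    pub fn enqueue(&mut self, message: Vec<u8>) {
        if self.pending.len() >= self.capacity {
            self.pending.pop_front();
        }
        self.pending.push_back(message);
    }

    /// Try to deliver queued messages in order. `send` returns true on
    /// success. Delivery stops at the first failure and the remaining
    /// messages stay queued. Returns how many messages were sent.
    pub fn flush<F: FnMut(&[u8]) -> bool>(&mut self, mut send: F) -> usize {
        let mut sent = 0;
        while let Some(message) = self.pending.front() {
            if !send(message.as_slice()) {
                break;
            }
            self.pending.pop_front();
            sent += 1;
        }
        sent
    }

    /// Number of messages still waiting.
    pub fn len(&self) -> usize {
        self.pending.len()
    }
}
```

Neighbor discovery and routing stay out of this type entirely; it only needs to be told when a link is worth trying.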

But the request for the project is that this fact be obfuscated and the network will operate like a WAN, not internet, and the hardware a normal router, (a network engineer is handling this side of things). Point being, I should be able to run my application from "most" networks in spite of internet and it work the same.

Ultimately higher layers can only provide features that can be built on top of what the lower layers offer. If you're using IP over avian carriers you can't build realtime video conferencing.
If the lower layers do not support unsolicited session initiation then establishing a p2p network is not possible without some side-channel providing invites / handshakes.

So you cannot ignore them. What you can do is have dedicated support for multiple different lower-layer networks.
