software-architecture – How do distributed applications (bitcoin, torrents…) find each other?

Question:

How is distributed/decentralized software able to connect to and find other machines running the same software?

In Bitcoin, for example, how do "Full Nodes" find each other? How does a node find "another wallet" to communicate with?


Some decentralized software, such as BitTorrent, uses an intermediary (called a tracker, if I'm not mistaken) that allows peers to find each other. Roughly speaking, trackers seem to act like DNS: they hand out the addresses of peers that already have the downloaded file and can send it to you.


But if a platform is distributed, with no central server to obtain information from, how on earth can a node find the others? As I see it, there will always have to be an "intermediary" that enables one node to find the others. Am I right? Or is there another way for "distributed" applications to connect?

Answer:

In the case of the Bitcoin protocol specifically, a network client needs no intermediary to find other clients to connect to. When a Bitcoin client starts up, it tries the following methods, in order, to discover other nodes on the network:

  • Every node maintains a list of the other nodes it knows about, and preferentially connects to them on startup. This list contains all the nodes it has ever connected to.

    The protocol also lets a node ask another node for the active nodes it knows about, via the getaddr message. Clients send this message to the nodes they connect to in order to expand their list of known nodes.

    If the client has never been online before, it does not yet have any IP addresses in its list of known nodes, so it needs to obtain some using the next method.

  • DNS queries: a list of domains is maintained for the sole purpose of returning the IPs of machines known to be running Bitcoin network nodes, and these domains are embedded in the Bitcoin client code.
    A DNS query on these domains is therefore all it takes to get a list of initial nodes to connect to, and from them the local list of known nodes grows.
    You can see the specific snippet of code where these domains are defined, and the snippet where the client makes the query.

  • There is also a list of Bitcoin network node IPs embedded (hardcoded) in the code, which is used if all other methods fail. This list can be seen in this part of the code, and is used here.
  • The user can also start the client with a list of IPs of nodes it should initially try to connect to, but in that case the user has to know the IP of at least one node that already exists.
    This is done via the -addnode=<ip> parameter. A list of available parameters can be seen here.
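
The fallback order described in the list above can be sketched roughly as follows. This is only an illustration in Python, not the client's actual code: the function names, the seed domain and the IP addresses are all hypothetical placeholders, and treating the -addnode entries as candidates that are always tried first is an assumption made here for simplicity.

```python
import socket
from typing import Callable, Iterable, List

DEFAULT_PORT = 8333  # default Bitcoin mainnet P2P port


def resolve_seed(domain: str, port: int = DEFAULT_PORT) -> List[str]:
    """Return the IP addresses a seed domain resolves to (empty list on failure)."""
    try:
        infos = socket.getaddrinfo(domain, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def discover_peers(
    known_peers: Iterable[str],
    dns_seeds: Iterable[str],
    hardcoded_peers: Iterable[str],
    user_peers: Iterable[str] = (),
    resolver: Callable[[str], List[str]] = resolve_seed,
) -> List[str]:
    """Pick initial peers using the fallback order described in the answer."""
    manual = list(user_peers)  # peers passed by the user (assumed to be tried first)

    # 1. Prefer the locally stored list of nodes seen in earlier sessions.
    known = list(known_peers)
    if known:
        return manual + known

    # 2. First run: ask the DNS seed domains for node addresses.
    resolved = [ip for domain in dns_seeds for ip in resolver(domain)]
    if resolved:
        return manual + resolved

    # 3. Last resort: the addresses hardcoded in the client.
    return manual + list(hardcoded_peers)
```

For example, `discover_peers([], ["seed.example.org"], ["203.0.113.10"])` would try a DNS lookup on the placeholder domain and fall back to the placeholder hardcoded address if the lookup returns nothing. The resolver is passed in as a parameter so the selection logic can be exercised without touching the network.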

Note: This answer and the links to the code on GitHub refer to version 0.14.2.
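
To make the getaddr message from the first method more concrete, here is a minimal Python sketch of how such a message could be framed on the wire. It assumes the standard Bitcoin P2P message layout (4-byte network magic, 12-byte null-padded command name, 4-byte little-endian payload length, 4-byte checksum taken from a double SHA-256 of the payload); the helper name `build_message` is made up for this example and is not taken from the client's code.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # network identifier for Bitcoin mainnet


def build_message(command: str, payload: bytes = b"") -> bytes:
    """Wrap a payload in the 24-byte Bitcoin P2P message header."""
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command name must fit in 12 bytes")
    # Checksum: first 4 bytes of SHA-256 applied twice to the payload.
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    # "<" = little-endian, no padding; "12s" null-pads the command name.
    header = struct.pack("<4s12sI4s", MAINNET_MAGIC, name, len(payload), checksum)
    return header + payload


# getaddr carries no payload, so the whole message is just the header.
getaddr = build_message("getaddr")
```

Note that a real node only answers getaddr after the usual version/verack handshake has completed, so sending these 24 bytes alone over a fresh TCP connection would not be enough to get a list of addresses back.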
