Peer to Peer Guide Revamp #195
base: master
Changes from all commits
87f2520
e922a0f
02a74e5
1b966b4
85d0fcb
3aebb22
477b22f
c261bdd
36b3d65
3074235
57365f9
3fbe07d
b3aa6c6
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,165 @@ | ||
| # Peer-to-peer networking | ||
|
|
||
| ## Overview | ||
|
|
||
| Peer-to-peer (P2P) networking lets devices, or peers, communicate directly without going through a central server. Every peer is equal: each one can both send and receive data. | ||
|
|
||
| DefraDB is a decentralized database built on this idea. Instead of the traditional client-server setup, DefraDB uses P2P networking so apps can sync data locally and share information without relying on a trusted middleman. This supports a decentralized, private, and user-focused approach. | ||
|
|
||
| :::tip[Key Points] | ||
|
|
||
| DefraDB leverages P2P networking via libp2p to synchronize data directly between distributed nodes, enabling **offline-first applications without a central server**. | ||
|
|
||
| **Key capabilities:** | ||
|
|
||
| - **Passive replication** – Automatic broadcasting of updates via PubSub (similar to UDP) | ||
| - **Active replication** – Direct, point-to-point synchronization between specific nodes (similar to TCP) | ||
| - **NAT traversal** – Circuit relays and hole punching to connect nodes behind firewalls | ||
| - **Resilient synchronization** – Updates queue offline and sync automatically when connectivity returns | ||
|
|
||
| DefraDB stores documents as update graphs (similar to Git) using IPLD blocks distributed across nodes. | ||
|
|
||
| ::: | ||
|
|
||
| ## Key concepts | ||
|
|
||
| ### Libp2p networking framework | ||
|
|
||
| Libp2p is a modular, decentralized networking framework created by Protocol Labs for IPFS (InterPlanetary File System). It handles transport, security, peer routing, and content discovery. | ||
|
|
||
| DefraDB uses libp2p to let peers talk to each other directly, replicate documents, and manage updates—similar to how Git tracks and merges changes. | ||
|
|
||
| **Note**: See the [libp2p documentation](https://docs.libp2p.io/concepts/introduction/overview/#why-libp2p) for more information. | ||
|
|
||
| ### Documents and collections | ||
|
|
||
| - **Document**: A single record with multiple fields, bound by a schema. Similar to a row in an SQL table. | ||
| - **Collection**: A group of documents that share the same schema. Similar to an SQL table. | ||
|
|
||
| ### Why DefraDB needs P2P networking | ||
|
|
||
| DefraDB stores documents and [InterPlanetary Linked Data](https://ipld.io/) (IPLD) blocks across multiple nodes, sometimes spread across the globe. P2P networking keeps them in sync whether they're on the same device, on different devices owned by the same user, or shared with collaborators—all without depending on a central server. | ||
|
|
||
| ### Replication modes | ||
|
|
||
| DefraDB supports two replication modes: | ||
|
|
||
| **Passive replication**: Automatically broadcasts updates over a global PubSub network (similar to UDP), great for quick sharing with minimal coordination. | ||
|
|
||
| **Active replication**: Creates a direct link to a chosen peer (similar to TCP), ensuring updates are delivered reliably to that node. | ||
|
|
||
| ### How DefraDB implements P2P | ||
|
|
||
| - **PubSub**: Nodes can publish and subscribe to topics. Each document gets its own topic for passive replication. | ||
| - **Granularity**: Passive replication focuses on individual documents. Active replication can handle whole collections or just selected items. | ||
|
|
||
| ## Benefits of P2P in DefraDB | ||
|
|
||
| DefraDB's P2P architecture provides several key advantages: | ||
|
|
||
| - **Resilience**: Keeps working during network outages. Changes are queued and synced later. | ||
| - **Trustless operation**: Works without needing to trust a central server. | ||
| - **Global collaboration**: Lets developers collaborate across the globe without built-in restrictions. | ||
| - **Advanced networking**: Leverages libp2p's features for discovery and NAT traversal. | ||
|
|
||
| ### NAT traversal solutions | ||
|
|
||
| Connecting to a server in a data center is straightforward—each server has its own IP address. However, home networks present a challenge: a single IP address for the modem and multiple devices protected by a NAT firewall make direct connections difficult. Libp2p offers two solutions: | ||
|
|
||
| **Circuit Relays**: A third-party node acts as an intermediary to work around the NAT firewall. Both peers connect to this publicly accessible relay node, which serves as a conduit. The relay must be trusted to forward traffic, but because connections run over encrypted transport layers, it cannot intercept the data it relays. The relay must remain online and accessible for this to work. | ||
|
|
||
| **NAT Hole Punching**: A technique that lets a node connect directly to a device behind a NAT firewall, so peers can communicate without a trusted intermediary. | ||
|
|
||
| ## Passive replication | ||
|
|
||
| Passive replication is your "set it and forget it" mode. Once it's on, it quietly keeps things in sync without extra effort. | ||
|
|
||
| ### How passive replication works | ||
|
|
||
| - **Automatic activation**: Starts automatically when P2P is enabled. | ||
| - **Document-level topics**: Each document has its own PubSub topic. | ||
| - **Targeted updates**: Only peers subscribed to that topic receive the changes. | ||
| - **Self-organizing**: Nodes find and connect to the right peers on their own. | ||
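The document-level topic model above can be pictured with a small lookup. This is only an illustrative sketch, not DefraDB code: the subscription table, the peer names, and the `recipients` function are all hypothetical stand-ins for what the PubSub layer does internally.

```shell
# Hypothetical subscription table: one "<peer> <docID>" pair per line.
# Peer and document names are made up for illustration.
SUBSCRIPTIONS='peerA doc1
peerB doc1
peerB doc2
peerC doc3'

# Print the peers subscribed to a document's topic, one per line.
recipients() {
  printf '%s\n' "$SUBSCRIPTIONS" | awk -v topic="$1" '$2 == topic { print $1 }'
}

# An update to doc1 reaches peerA and peerB only; peerC is not subscribed.
recipients doc1
```

The point of the sketch is the targeting: an update published on one document's topic is delivered only to peers that subscribed to that topic.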
|
|
||
| ### When nodes miss updates | ||
|
|
||
| In passive replication mode, only the most recent update, the newest node in the document's Merkle DAG (directed acyclic graph), is broadcast through the network. The broadcasting node doesn't verify that receiving nodes have all previous updates; that's the responsibility of the receiving node. | ||
|
|
||
| If a node misses updates and then receives a new one, it must synchronize all previous updates before considering the document current. This is necessary because DefraDB's internal data model is based on all changes over time, not just the most recent change. | ||
|
|
||
| The most recent update is sent over the PubSub network. However, if a node needs to retrieve previous updates by traversing back through the Merkle DAG, it uses the Distributed Hash Table (DHT) instead. | ||
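The catch-up step described here can be sketched as a backwards walk. This is a simplified illustration under stated assumptions, not DefraDB's implementation: the update names and the `LINKS` table are hypothetical, each update has a single parent (a real Merkle DAG can branch), and the DHT fetch is reduced to a local lookup.

```shell
# Hypothetical update history: "<update> <parent>" per line; "-" marks the first update.
LINKS='u1 -
u2 u1
u3 u2
u4 u3'

# Print the parent of an update, or nothing if the update is unknown.
parent_of() {
  printf '%s\n' "$LINKS" | awk -v id="$1" '$1 == id { print $2 }'
}

# Walk back from a newly received update until reaching one that is already
# stored locally, printing every update that must be fetched first.
missing_updates() (
  tip=$1
  have=" $2 "          # space-separated list of locally stored updates
  cur=$(parent_of "$tip")
  while [ -n "$cur" ] && [ "$cur" != "-" ]; do
    case "$have" in
      *" $cur "*) break ;;
    esac
    echo "$cur"
    cur=$(parent_of "$cur")
  done
)

# A node that only holds u1 receives u4 and must first fetch u3 and u2.
missing_updates u4 "u1"
```

The walk stops at the first update the node already has, which is why a node that missed only one broadcast has far less to fetch than one that was offline for a long time.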
|
|
||
| ### Use cases for passive replication | ||
|
|
||
| Choose passive replication when you: | ||
|
|
||
| - Want automatic syncing without managing connections | ||
| - Need updates sent to anyone subscribed to a document | ||
| - Prefer a low-maintenance option for collaborative environments with many peers | ||
|
|
||
| ## Active replication | ||
|
|
||
| Active replication is like having a dedicated delivery route between you and a specific peer, ensuring that every update reaches them directly. | ||
|
|
||
| ### How active replication works | ||
|
|
||
| - **Direct peer selection**: Choose exactly who you want to sync with by picking a peer and setting up a direct connection. | ||
| - **Real-time updates**: Updates are pushed instantly without waiting for network-wide broadcasts. | ||
| - **Reliable delivery**: Ideal for important data, making it a great choice when syncing with archival nodes or trusted partners. | ||
| - **Flexible granularity**: Allows you to replicate an entire collection or only specific parts you want. | ||
|
|
||
| ### Use cases for active replication | ||
|
|
||
| Choose active replication when you: | ||
|
|
||
| - Need a direct, reliable link to a specific peer | ||
| - Want real-time updates with no delays | ||
| - Need full control over which collections or documents are shared | ||
| - Are syncing with archival nodes or specific collaborators | ||
|
|
||
| ## Peer IDs and addressing | ||
|
|
||
| ### Peer ID | ||
|
|
||
| When DefraDB starts, it creates a Peer ID, a unique identifier based on a private key generated during the first startup. Other nodes use this Peer ID to address the node, for example as the final component of a multi-address. | ||
|
|
||
| ### Multi-address format | ||
|
|
||
| A node automatically listens on multiple addresses or ports when the P2P module is instantiated. These are expressed as multi-addresses—strings that represent network addresses and include information about transport protocols and multiple network stack layers. | ||
|
|
||
| Format: | ||
|
|
||
| ```bash | ||
| /ip4/<ip_address>/tcp/<port>/p2p/<peer_id> | ||
| ``` | ||
|
|
||
| Example: | ||
|
|
||
| ```bash | ||
| /ip4/0.0.0.0/tcp/9171/p2p/12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 | ||
| ``` | ||
|
|
||
| By default, DefraDB listens on P2P port `9171`. | ||
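To make the multi-address layout concrete, the following sketch splits an address of exactly the form shown above into its three values. The `parse_multiaddr` helper is hypothetical and handles only the `/ip4/.../tcp/.../p2p/...` shape from this guide; real multi-addresses support many more protocols.

```shell
# Split "/ip4/<ip_address>/tcp/<port>/p2p/<peer_id>" into its three values.
# Prints "<ip_address> <port> <peer_id>", or returns non-zero for any other layout.
parse_multiaddr() {
  printf '%s\n' "$1" | awk -F/ '
    NF == 7 && $1 == "" && $2 == "ip4" && $4 == "tcp" && $6 == "p2p" {
      print $3, $5, $7
      ok = 1
    }
    END { exit !ok }'
}

# Uses the example address from this section.
parse_multiaddr /ip4/0.0.0.0/tcp/9171/p2p/12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4
```

Reading the address left to right mirrors the network stack: the IP layer, then the transport and port, then the peer identity.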
|
|
||
| ## Current limitations and future development | ||
|
|
||
| ### Scalability considerations | ||
|
|
||
| **Document topic overhead**: Having every document with its own independent topic can create overhead with thousands or millions of documents. The team is exploring aggregate topics scoped to subnets (group-specific or application-specific). | ||
|
|
||
| **Multi-hop between subnets**: Currently, synchronizing between subnets means routing through the global network, which takes multiple hops. The team is exploring multi-hop mechanisms to address this. | ||
|
|
||
| **Bitswap and DHT scalability**: Current limitations are being addressed through: | ||
|
|
||
| - **PubSub-based query system**: Allows queries and updates through the global PubSub network using query topics independent of document topics. | ||
| - **GraphSync**: A Protocol Labs protocol that may resolve Bitswap algorithm and DHT issues. | ||
|
|
||
| ### Future improvements | ||
|
|
||
| **Head Exchange protocol**: A new protocol in development to address issues with syncing the Merkle DAG when updates have been missed or concurrent, diverged updates have been made. It aims to efficiently: | ||
|
|
||
| - Establish the most recent update seen by each node | ||
| - Determine if there are divergent updates | ||
| - Find the most efficient way to synchronize nodes with minimal communication | ||
|
|
||
| **Replicator persistence**: Currently, replicators don't persist through node updates or restarts—they must be re-added after each restart. This will be resolved in a future release. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| { | ||
| "label": "How-to Guides", | ||
| "position": 2 | ||
| } | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,226 @@ | ||
| --- | ||
| sidebar_label: Peer-to-Peer How-to Guide | ||
| sidebar_position: 10 | ||
| --- | ||
|
|
||
| # Peer-to-peer how-to guides | ||
|
|
||
| This guide provides step-by-step instructions for configuring and managing peer-to-peer networking in DefraDB. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| Before following these guides, ensure you have: | ||
|
|
||
| - DefraDB installed on your system | ||
| - Basic familiarity with command-line interfaces | ||
| - Understanding of [P2P networking concepts](/defradb/Concepts/peer-to-peer.md) | ||
|
|
||
| ## Start and configure DefraDB | ||
|
|
||
| ### Start DefraDB with P2P enabled (default) | ||
|
|
||
| P2P networking is enabled by default when you start DefraDB: | ||
|
|
||
| ```bash | ||
| defradb start | ||
| ``` | ||
|
|
||
| You'll see output similar to: | ||
|
|
||
| ```bash | ||
| Jan 2 10:15:49.124 INF cli Starting DefraDB | ||
| Jan 2 10:15:49.161 INF net Created LibP2P host PeerId=12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 Address=[/ip4/127.0.0.1/tcp/9171] | ||
| Jan 2 10:15:49.162 INF net Starting internal broadcaster for pubsub network | ||
| ``` | ||
|
|
||
| ### Start DefraDB without P2P | ||
|
|
||
| To disable P2P networking: | ||
|
|
||
| ```bash | ||
| defradb start --no-p2p | ||
| ``` | ||
|
|
||
| ### Change the P2P port | ||
|
|
||
| By default, DefraDB listens on port `9171`. To use a different port: | ||
|
|
||
| ```bash | ||
| defradb start --p2paddr /ip4/<ip_address>/tcp/<port> | ||
| ``` | ||
|
|
||
| Example: | ||
|
|
||
| ```bash | ||
| defradb start --p2paddr /ip4/0.0.0.0/tcp/9172 | ||
| ``` | ||
|
|
||
| **Parameters**: | ||
|
|
||
| - Replace `<ip_address>` with your actual IP address (use `0.0.0.0` to listen on all interfaces) | ||
| - Replace `<port>` with your desired port number | ||
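When the address is assembled in a script, it can help to validate the port before starting the node. The sketch below is a hypothetical wrapper, not part of DefraDB: `p2p_listen_addr` only builds the string passed to the `--p2paddr` flag shown above and rejects ports outside 1 to 65535.

```shell
# Build the value for --p2paddr, rejecting anything that is not a port in 1-65535.
p2p_listen_addr() (
  ip=$1
  port=$2
  case "$port" in
    ''|*[!0-9]*) echo "port must be a number: $port" >&2; exit 1 ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    exit 1
  fi
  printf '/ip4/%s/tcp/%s\n' "$ip" "$port"
)

# Print the start command from the example above without running it.
echo "defradb start --p2paddr $(p2p_listen_addr 0.0.0.0 9172)"
```

The function runs in a subshell, so a failed check returns a non-zero status to the caller instead of exiting the surrounding script.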
|
|
||
| ## Manage Peer IDs | ||
|
|
||
| ### Get your Peer ID | ||
|
|
||
| To retrieve your node's Peer ID using HTTP: | ||
|
|
||
| ```bash | ||
| curl -H "Accept: application/json" http://localhost:9181/api/p2p/info | ||
| ``` | ||
|
|
||
| The Peer ID is generated from a private key created during the first startup and remains consistent across restarts. | ||
|
|
||
| ## Connect to peers | ||
|
|
||
| ### Connect to a specific peer | ||
|
|
||
| To connect to a particular peer when starting DefraDB: | ||
|
|
||
| ```bash | ||
| defradb start --peers /ip4/<ip_address>/tcp/<port>/p2p/<peer_id> | ||
| ``` | ||
|
|
||
| Example: | ||
|
|
||
| ```bash | ||
| defradb start --peers /ip4/192.168.1.100/tcp/9171/p2p/12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 | ||
| ``` | ||
|
|
||
| **Parameters**: | ||
|
|
||
| - Replace `<ip_address>` with the peer's IP address | ||
| - Replace `<port>` with the peer's P2P port | ||
| - Replace `<peer_id>` with the peer's Peer ID | ||
|
|
||
| ## Manage document subscriptions (passive replication) | ||
|
|
||
| Passive replication works at the document level. Subscribe to specific documents to receive updates automatically. | ||
|
|
||
| ### Subscribe to document updates | ||
|
|
||
| ```bash | ||
| defradb client p2p document add <docID> | ||
| ``` | ||
|
|
||
| Example: | ||
|
|
||
| ```bash | ||
| defradb client p2p document add bafybeihz5k3c2jzx7m4x5v6p7q8r9s0t1u2v3w4x5y6z7a8b9c0d1e2f3 | ||
|
Member
The |
||
| ``` | ||
|
|
||
| ### Unsubscribe from document updates | ||
|
|
||
| ```bash | ||
| defradb client p2p document remove <docID> | ||
| ``` | ||
|
|
||
| ### View all active document subscriptions | ||
|
|
||
| ```bash | ||
| defradb client p2p document getall | ||
| ``` | ||
|
|
||
| ## Manage collection subscriptions (active replication) | ||
|
|
||
| Active replication can work at the collection level, allowing you to replicate entire collections to specific peers. | ||
|
|
||
| ### Subscribe to collection updates | ||
|
|
||
| ```bash | ||
| defradb client p2p collection add <collectionID> | ||
| ``` | ||
|
|
||
| ### Unsubscribe from collection updates | ||
|
|
||
| ```bash | ||
| defradb client p2p collection remove <collectionID> | ||
| ``` | ||
|
|
||
| ### View all active collection subscriptions | ||
|
|
||
| ```bash | ||
| defradb client p2p collection getall | ||
| ``` | ||
|
|
||
| ## Enable active replication | ||
|
|
||
| Active replication creates a direct, persistent connection to a specific peer for reliable data synchronization. | ||
|
|
||
| ### Add a replicator using HTTP | ||
|
|
||
| ```bash | ||
| curl -X POST http://localhost:9181/api/p2p/replicators \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "Info": { | ||
| "ID": "<peer_id>", | ||
| "Addrs": ["<peer_address>"] | ||
| }, | ||
| "Collections": ["<collection_name>"] | ||
| }' | ||
| ``` | ||
|
|
||
| Example: | ||
|
|
||
| ```bash | ||
| curl -X POST http://localhost:9181/api/p2p/replicators \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "Info": { | ||
| "ID": "12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4", | ||
| "Addrs": ["/ip4/192.168.1.100/tcp/9171"] | ||
| }, | ||
| "Collections": ["Books"] | ||
| }' | ||
| ``` | ||
|
Comment on lines +153 to +177
Member
We can omit these |
||
|
|
||
| **Parameters**: | ||
|
|
||
| - `ID`: The Peer ID of the node you want to replicate to | ||
| - `Addrs`: Array of multi-addresses for the peer | ||
| - `Collections`: Array of collection names to replicate (e.g., `["Books"]`) | ||
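If the request body is generated from variables, a small helper keeps the field names in one place. This is a minimal sketch assuming the `Info`, `ID`, `Addrs`, and `Collections` fields exactly as shown in the example above; the `replicator_payload` function itself is hypothetical and handles one address and one collection only.

```shell
# Build the replicator request body for one peer address and one collection.
# Values are inserted verbatim, so they must not contain quotes or backslashes.
replicator_payload() {
  printf '{"Info":{"ID":"%s","Addrs":["%s"]},"Collections":["%s"]}\n' "$1" "$2" "$3"
}

# Reproduces the body of the example request above in compact form.
replicator_payload \
  12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 \
  /ip4/192.168.1.100/tcp/9171 \
  Books
```

The output can be passed to the `-d` option of the `curl` command shown earlier in this section.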
|
|
||
| ### Add a replicator using CLI | ||
|
|
||
| ```bash | ||
| defradb client p2p replicator set -c <collection_name> <peer_id> | ||
| ``` | ||
|
|
||
| Example: | ||
|
|
||
| ```bash | ||
| defradb client p2p replicator set -c Books 12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 | ||
| ``` | ||
|
|
||
| **Note**: Currently, replicators don't persist through node restarts. You'll need to re-add them after each restart. This limitation will be addressed in a future release. | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### Verify P2P is running | ||
|
|
||
| Check the startup logs for confirmation that the LibP2P host was created and the P2P network is active: | ||
|
|
||
| ``` | ||
| INF net Created LibP2P host PeerId=... Address=[...] | ||
| INF net Starting internal broadcaster for pubsub network | ||
| ``` | ||
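For scripted checks, the Peer ID can be pulled out of saved startup output. The sketch below assumes the log line format shown in this guide (`Created LibP2P host PeerId=... Address=[...]`); the `peer_id_from_log` helper is hypothetical, and the log format may differ between DefraDB versions.

```shell
# Print the Peer ID from a "Created LibP2P host" log line read on stdin.
# Returns non-zero if no such line is present.
peer_id_from_log() {
  sed -n 's/.*Created LibP2P host.*PeerId=\([^ ]*\).*/\1/p' | grep .
}

# Sample line copied from the startup output shown earlier in this guide.
printf '%s\n' 'Jan 2 10:15:49.161 INF net Created LibP2P host PeerId=12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 Address=[/ip4/127.0.0.1/tcp/9171]' | peer_id_from_log
```

A non-zero exit status means the host-creation line was not found, which is a quick signal that P2P did not start.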
|
|
||
| ### Check current peer connections | ||
|
|
||
| Use the P2P info endpoint to see your current peer connections: | ||
|
|
||
| ```bash | ||
| curl http://localhost:9181/api/p2p/info | ||
|
Member
Same note above about this CLI example |
||
| ``` | ||
|
|
||
| ### Connection issues within home networks | ||
|
|
||
| If peers can't connect within the same home Wi-Fi network, this is typically due to NAT firewall restrictions. Consider: | ||
|
|
||
| 1. Using circuit relays (a publicly accessible third-party node as an intermediary) | ||
| 2. Configuring NAT hole punching | ||
| 3. Connecting peers through the internet rather than the local network | ||
|
|
||
| See the [P2P concepts](/defradb/Concepts/peer-to-peer.md) page for more information on NAT traversal. | ||
replace with
defradb client p2p info