From 87f25203bdf0956c655096a142b432c89d2aa41e Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Wed, 3 Dec 2025 10:41:43 -0800 Subject: [PATCH 01/13] Renamed folders --- docs/defradb/Conceptual/_category_.json | 5 +++++ .../{concepts => Conceptual}/content-addressable-storage.md | 0 docs/defradb/{concepts => Conceptual}/content-identifier.md | 0 docs/defradb/{concepts => Conceptual}/ipfs.md | 0 docs/defradb/{concepts => Conceptual}/libp2p.md | 0 docs/defradb/{concepts => Conceptual}/merkle-crdt.md | 0 docs/defradb/Procedural/_category_.json | 5 +++++ docs/defradb/{guides => Procedural}/akash-deployment.md | 0 docs/defradb/{guides => Procedural}/deployment.md | 0 docs/defradb/{guides => Procedural}/explain-queries.md | 0 docs/defradb/{guides => Procedural}/peer-to-peer.md | 0 docs/defradb/{guides => Procedural}/schema-migration.md | 0 docs/defradb/{guides => Procedural}/schema-relationship.md | 0 docs/defradb/{guides => Procedural}/secondary-index.md | 0 .../defradb/{guides => Procedural}/time-traveling-queries.md | 0 docs/defradb/concepts/_category_.json | 5 ----- docs/defradb/guides/_category_.json | 5 ----- 17 files changed, 10 insertions(+), 10 deletions(-) create mode 100644 docs/defradb/Conceptual/_category_.json rename docs/defradb/{concepts => Conceptual}/content-addressable-storage.md (100%) rename docs/defradb/{concepts => Conceptual}/content-identifier.md (100%) rename docs/defradb/{concepts => Conceptual}/ipfs.md (100%) rename docs/defradb/{concepts => Conceptual}/libp2p.md (100%) rename docs/defradb/{concepts => Conceptual}/merkle-crdt.md (100%) create mode 100644 docs/defradb/Procedural/_category_.json rename docs/defradb/{guides => Procedural}/akash-deployment.md (100%) rename docs/defradb/{guides => Procedural}/deployment.md (100%) rename docs/defradb/{guides => Procedural}/explain-queries.md (100%) rename docs/defradb/{guides => Procedural}/peer-to-peer.md (100%) rename docs/defradb/{guides => Procedural}/schema-migration.md (100%) rename 
docs/defradb/{guides => Procedural}/schema-relationship.md (100%) rename docs/defradb/{guides => Procedural}/secondary-index.md (100%) rename docs/defradb/{guides => Procedural}/time-traveling-queries.md (100%) delete mode 100644 docs/defradb/concepts/_category_.json delete mode 100644 docs/defradb/guides/_category_.json diff --git a/docs/defradb/Conceptual/_category_.json b/docs/defradb/Conceptual/_category_.json new file mode 100644 index 0000000..621e8ca --- /dev/null +++ b/docs/defradb/Conceptual/_category_.json @@ -0,0 +1,5 @@ +{ + "label": "Conceptual guides", + "position": 3 + } + \ No newline at end of file diff --git a/docs/defradb/concepts/content-addressable-storage.md b/docs/defradb/Conceptual/content-addressable-storage.md similarity index 100% rename from docs/defradb/concepts/content-addressable-storage.md rename to docs/defradb/Conceptual/content-addressable-storage.md diff --git a/docs/defradb/concepts/content-identifier.md b/docs/defradb/Conceptual/content-identifier.md similarity index 100% rename from docs/defradb/concepts/content-identifier.md rename to docs/defradb/Conceptual/content-identifier.md diff --git a/docs/defradb/concepts/ipfs.md b/docs/defradb/Conceptual/ipfs.md similarity index 100% rename from docs/defradb/concepts/ipfs.md rename to docs/defradb/Conceptual/ipfs.md diff --git a/docs/defradb/concepts/libp2p.md b/docs/defradb/Conceptual/libp2p.md similarity index 100% rename from docs/defradb/concepts/libp2p.md rename to docs/defradb/Conceptual/libp2p.md diff --git a/docs/defradb/concepts/merkle-crdt.md b/docs/defradb/Conceptual/merkle-crdt.md similarity index 100% rename from docs/defradb/concepts/merkle-crdt.md rename to docs/defradb/Conceptual/merkle-crdt.md diff --git a/docs/defradb/Procedural/_category_.json b/docs/defradb/Procedural/_category_.json new file mode 100644 index 0000000..545680e --- /dev/null +++ b/docs/defradb/Procedural/_category_.json @@ -0,0 +1,5 @@ +{ + "label": "Procedural guides", + "position": 2 + } + \ No 
newline at end of file diff --git a/docs/defradb/guides/akash-deployment.md b/docs/defradb/Procedural/akash-deployment.md similarity index 100% rename from docs/defradb/guides/akash-deployment.md rename to docs/defradb/Procedural/akash-deployment.md diff --git a/docs/defradb/guides/deployment.md b/docs/defradb/Procedural/deployment.md similarity index 100% rename from docs/defradb/guides/deployment.md rename to docs/defradb/Procedural/deployment.md diff --git a/docs/defradb/guides/explain-queries.md b/docs/defradb/Procedural/explain-queries.md similarity index 100% rename from docs/defradb/guides/explain-queries.md rename to docs/defradb/Procedural/explain-queries.md diff --git a/docs/defradb/guides/peer-to-peer.md b/docs/defradb/Procedural/peer-to-peer.md similarity index 100% rename from docs/defradb/guides/peer-to-peer.md rename to docs/defradb/Procedural/peer-to-peer.md diff --git a/docs/defradb/guides/schema-migration.md b/docs/defradb/Procedural/schema-migration.md similarity index 100% rename from docs/defradb/guides/schema-migration.md rename to docs/defradb/Procedural/schema-migration.md diff --git a/docs/defradb/guides/schema-relationship.md b/docs/defradb/Procedural/schema-relationship.md similarity index 100% rename from docs/defradb/guides/schema-relationship.md rename to docs/defradb/Procedural/schema-relationship.md diff --git a/docs/defradb/guides/secondary-index.md b/docs/defradb/Procedural/secondary-index.md similarity index 100% rename from docs/defradb/guides/secondary-index.md rename to docs/defradb/Procedural/secondary-index.md diff --git a/docs/defradb/guides/time-traveling-queries.md b/docs/defradb/Procedural/time-traveling-queries.md similarity index 100% rename from docs/defradb/guides/time-traveling-queries.md rename to docs/defradb/Procedural/time-traveling-queries.md diff --git a/docs/defradb/concepts/_category_.json b/docs/defradb/concepts/_category_.json deleted file mode 100644 index dbc4274..0000000 --- 
a/docs/defradb/concepts/_category_.json +++ /dev/null @@ -1,5 +0,0 @@ -{ - "label": "Concepts", - "position": 3 - } - \ No newline at end of file diff --git a/docs/defradb/guides/_category_.json b/docs/defradb/guides/_category_.json deleted file mode 100644 index 7494f24..0000000 --- a/docs/defradb/guides/_category_.json +++ /dev/null @@ -1,5 +0,0 @@ -{ - "label": "Guides", - "position": 2 - } - \ No newline at end of file From e922a0f36dc78812a58a450622d1c0eef3dc09d6 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Wed, 3 Dec 2025 11:12:54 -0800 Subject: [PATCH 02/13] added p2p files --- .../Conceptual/peer-to-peer-conceptual.md | 146 ++++++++++++ .../Procedural/peer-to-peer-procedural.md | 218 ++++++++++++++++++ 2 files changed, 364 insertions(+) create mode 100644 docs/defradb/Conceptual/peer-to-peer-conceptual.md create mode 100644 docs/defradb/Procedural/peer-to-peer-procedural.md diff --git a/docs/defradb/Conceptual/peer-to-peer-conceptual.md b/docs/defradb/Conceptual/peer-to-peer-conceptual.md new file mode 100644 index 0000000..1e69205 --- /dev/null +++ b/docs/defradb/Conceptual/peer-to-peer-conceptual.md @@ -0,0 +1,146 @@ +# Peer-to-peer networking + +## Overview + +Peer-to-peer (P2P) networking is a way for devices or peers to communicate directly without going through a central server. Every peer is equal—both can send and receive data. + +DefraDB is a decentralized database built on this idea. Instead of the traditional client-server setup, DefraDB uses P2P networking so apps can sync data locally and share information without relying on a trusted middleman. This supports a decentralized, private, and user-focused approach. + +## Key concepts + +### Libp2p networking framework + +Libp2p is a modular, decentralized networking framework created by Protocol Labs for IPFS (InterPlanetary File System). It handles transport, security, peer routing, and content discovery. 
+ +DefraDB uses libp2p to let peers talk to each other directly, replicate documents, and manage updates—similar to how Git tracks and merges changes. + +**Note**: See [LibP2P documentation](https://docs.libp2p.io/concepts/introduction/overview/#why-libp2p) for more information. + +### Documents and collections + +- **Document**: A single record with multiple fields, bound by a schema. Similar to a row in an SQL table. +- **Collection**: A group of documents that share the same schema. Similar to an SQL table. + +### Why DefraDB needs P2P networking + +DefraDB stores documents and [InterPlanetary Linked Data](https://ipld.io/) (IPLD) blocks across multiple nodes, sometimes spread across the globe. P2P networking keeps them in sync whether they're on the same device, on different devices owned by the same user, or shared with collaborators—all without depending on a central server. + +### Replication modes + +DefraDB supports two replication modes: + +**Passive replication**: Automatically broadcasts updates over a global PubSub network (similar to UDP), great for quick sharing with minimal coordination. + +**Active replication**: Creates a direct link to a chosen peer (similar to TCP), ensuring updates are delivered reliably to that node. + +### How DefraDB implements P2P + +- **PubSub**: Nodes can publish and subscribe to topics. Each document gets its own topic for passive replication. +- **Granularity**: Passive replication focuses on individual documents. Active replication can handle whole collections or just selected items. + +## Benefits of P2P in DefraDB + +DefraDB's P2P architecture provides several key advantages: + +- **Resilience**: Keeps working during network outages. Changes are queued and synced later. +- **Trustless operation**: Works without needing to trust a central server. +- **Global collaboration**: Lets developers collaborate across the globe without built-in restrictions. 
+- **Advanced networking**: Leverages libp2p's features for discovery and NAT traversal. + +### NAT traversal solutions + +Connecting to a server in a data center is straightforward—each server has its own IP address. However, home networks present a challenge: a single IP address for the modem and multiple devices protected by a NAT firewall make direct connections difficult. Libp2p offers two solutions: + +**Circuit Relays**: A third-party node acts as an intermediary to resolve the NAT firewall issue. Both peers connect to this publicly accessible relay node, which serves as a conduit. While this requires trust in the relay node to properly forward information, connections operate over encrypted transport layers, preventing the relay from intercepting data. The relay must remain online and accessible for this to work. + +**NAT Hole Punching**: A technique that allows nodes to connect directly to a device behind a NAT firewall, enabling direct peer connections without a trusted intermediary. + +## Passive replication + +Passive replication is your "set it and forget it" mode. Once it's on, it quietly keeps things in sync without extra effort. + +### How passive replication works + +- **Automatic activation**: Starts automatically when P2P is enabled. +- **Document-level topics**: Each document has its own PubSub topic. +- **Targeted updates**: Only peers subscribed to that topic receive the changes. +- **Self-organizing**: Nodes find and connect to the right peers on their own. + +### When nodes miss updates + +In passive replication mode, the most recent update is broadcast through the network using a Merkle DAG (directed acyclic graph). The broadcasting node doesn't verify that receiving nodes have all previous updates—that's the responsibility of the receiving node. + +If a node misses updates and then receives a new one, it must synchronize all previous updates before considering the document current. 
This is necessary because DefraDB's internal data model is based on all changes over time, not just the most recent change. + +When broadcasting the most recent update, it's sent over the PubSub network. However, if a node needs to retrieve previous updates by traversing back through the Merkle DAG, it uses the Distributed Hash Table (DHT) instead. + +### Use cases for passive replication + +Choose passive replication when you: + +- Want automatic syncing without managing connections +- Need updates sent to anyone subscribed to a document +- Prefer a low-maintenance option for collaborative environments with many peers + +## Active replication + +Active replication is like having a dedicated delivery route between you and a specific peer, ensuring that every update reaches them directly. + +### How active replication works + +- **Direct peer selection**: Choose exactly who you want to sync with by picking a peer and setting up a direct connection. +- **Real-time updates**: Updates are pushed instantly without waiting for network-wide broadcasts. +- **Reliable delivery**: Ideal for important data, making it a great choice when syncing with archival nodes or trusted partners. +- **Flexible granularity**: Allows you to replicate an entire collection or only specific parts you want. + +### Use cases for active replication + +Choose active replication when you: + +- Need a direct, reliable link to a specific peer +- Want real-time updates with no delays +- Need full control over which collections or documents are shared +- Are syncing with archival nodes or specific collaborators + +## Peer IDs and addressing + +### Peer ID + +When DefraDB starts, it creates a Peer ID—a unique identifier based on a private key generated during the first startup. This Peer ID is essential for various parts of the P2P networking system. + +### Multi-address format + +A node automatically listens on multiple addresses or ports when the P2P module is instantiated. 
These are expressed as multi-addresses—strings that represent network addresses and include information about transport protocols and multiple network stack layers. + +Format: +``` +/ip4//tcp//p2p/ +``` + +Example: +``` +/ip4/0.0.0.0/tcp/9171/p2p/12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 +``` + +By default, DefraDB listens on P2P port `9171`. + +## Current limitations and future development + +### Scalability considerations + +**Document topic overhead**: Having every document with its own independent topic can create overhead with thousands or millions of documents. The team is exploring aggregate topics scoped to subnets (group-specific or application-specific). + +**Multi-hop between subnets**: Currently, synchronizing between subnets requires going through the global network, requiring multiple hops. The team is exploring multi-hop mechanisms to address this. + +**Bitswap and DHT scalability**: Current limitations are being addressed through: +- **PubSub-based query system**: Allows queries and updates through the global PubSub network using query topics independent of document topics. +- **GraphSync**: A Protocol Labs protocol that may resolve Bitswap algorithm and DHT issues. + +### Future improvements + +**Head Exchange protocol**: A new protocol in development to address issues with syncing the Merkle DAG when updates have been missed or concurrent, diverged updates have been made. It aims to efficiently: +- Establish the most recent update seen by each node +- Determine if there are divergent updates +- Find the most efficient way to synchronize nodes with minimal communication + +**Replicator persistence**: Currently, replicators don't persist through node updates or restarts—they must be re-added after each restart. This will be resolved in a future release. 
diff --git a/docs/defradb/Procedural/peer-to-peer-procedural.md b/docs/defradb/Procedural/peer-to-peer-procedural.md new file mode 100644 index 0000000..2b79818 --- /dev/null +++ b/docs/defradb/Procedural/peer-to-peer-procedural.md @@ -0,0 +1,218 @@ +# Peer-to-peer how-to guides + +This guide provides step-by-step instructions for configuring and managing peer-to-peer networking in DefraDB. + +## Prerequisites + +Before following these guides, ensure you have: + +- DefraDB installed on your system +- Basic familiarity with command-line interfaces +- Understanding of [P2P networking concepts](link-to-concepts-page) + +## Start and configure DefraDB + +### Start DefraDB with P2P enabled (default) + +P2P networking is enabled by default when you start DefraDB: + +```bash +defradb start +``` + +You'll see output similar to: + +``` +Jan 2 10:15:49.124 INF cli Starting DefraDB +Jan 2 10:15:49.161 INF net Created LibP2P host PeerId=12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 Address=[/ip4/127.0.0.1/tcp/9171] +Jan 2 10:15:49.162 INF net Starting internal broadcaster for pubsub network +``` + +### Start DefraDB without P2P + +To disable P2P networking: + +```bash +defradb start --no-p2p +``` + +### Change the P2P port + +By default, DefraDB listens on port `9171`. To use a different port: + +```bash +defradb start --p2paddr /ip4//tcp/ +``` + +Example: + +```bash +defradb start --p2paddr /ip4/0.0.0.0/tcp/9172 +``` + +**Parameters**: +- Replace `` with your actual IP address (use `0.0.0.0` to listen on all interfaces) +- Replace `` with your desired port number + +## Manage Peer IDs + +### Get your Peer ID + +To retrieve your node's Peer ID using HTTP: + +```bash +curl -H "Accept: application/json" http://localhost:9181/api/p2p/info +``` + +The Peer ID is generated from a private key created during the first startup and remains consistent across restarts. 
+ +## Connect to peers + +### Connect to a specific peer + +To connect to a particular peer when starting DefraDB: + +```bash +defradb start --peers /ip4//tcp//p2p/ +``` + +Example: + +```bash +defradb start --peers /ip4/192.168.1.100/tcp/9171/p2p/12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 +``` + +**Parameters**: +- Replace `` with the peer's IP address +- Replace `` with the peer's P2P port +- Replace `` with the peer's Peer ID + +## Manage document subscriptions (passive replication) + +Passive replication works at the document level. Subscribe to specific documents to receive updates automatically. + +### Subscribe to document updates + +```bash +defradb client p2p document add +``` + +Example: + +```bash +defradb client p2p document add bafybeihz5k3c2jzx7m4x5v6p7q8r9s0t1u2v3w4x5y6z7a8b9c0d1e2f3 +``` + +### Unsubscribe from document updates + +```bash +defradb client p2p document remove +``` + +### View all active document subscriptions + +```bash +defradb client p2p document getall +``` + +## Manage collection subscriptions (active replication) + +Active replication can work at the collection level, allowing you to replicate entire collections to specific peers. + +### Subscribe to collection updates + +```bash +defradb client p2p collection add +``` + +### Unsubscribe from collection updates + +```bash +defradb client p2p collection remove +``` + +### View all active collection subscriptions + +```bash +defradb client p2p collection getall +``` + +## Enable active replication + +Active replication creates a direct, persistent connection to a specific peer for reliable data synchronization. 
+ +### Add a replicator using HTTP + +```bash +curl -X POST http://localhost:9181/api/p2p/replicators \ + -H "Content-Type: application/json" \ + -d '{ + "Info": { + "ID": "", + "Addrs": [""] + }, + "Collections": [""] + }' +``` + +Example: + +```bash +curl -X POST http://localhost:9181/api/p2p/replicators \ + -H "Content-Type: application/json" \ + -d '{ + "Info": { + "ID": "12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4", + "Addrs": ["/ip4/192.168.1.100/tcp/9171"] + }, + "Collections": ["Books"] + }' +``` + +**Parameters**: +- `ID`: The Peer ID of the node you want to replicate to +- `Addrs`: Array of multi-addresses for the peer +- `Collections`: Array of collection names to replicate (e.g., `["Books"]`) + +### Add a replicator using CLI + +```bash +defradb client p2p replicator set -c +``` + +Example: + +```bash +defradb client p2p replicator set -c Books 12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 +``` + +**Note**: Currently, replicators don't persist through node restarts. You'll need to re-add them after each restart. This limitation will be addressed in a future release. + +## Troubleshooting + +### Verify P2P is running + +Check the startup logs for confirmation that the LibP2P host was created and the P2P network is active: + +``` +INF net Created LibP2P host PeerId=... Address=[...] +INF net Starting internal broadcaster for pubsub network +``` + +### Check current peer connections + +Use the P2P info endpoint to see your current peer connections: + +```bash +curl http://localhost:9181/api/p2p/info +``` + +### Connection issues within home networks + +If peers can't connect within the same home Wi-Fi network, this is typically due to NAT firewall restrictions. Consider: + +1. Using circuit relays (a publicly accessible third-party node as an intermediary) +2. Configuring NAT hole punching +3. 
Connecting peers through the internet rather than the local network + +See the [P2P concepts](link-to-concepts-page) page for more information on NAT traversal. From 02a74e54916b176c7144eca77e9d3cf0f3f413be Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Wed, 3 Dec 2025 12:26:05 -0800 Subject: [PATCH 03/13] update --- docs/defradb/Conceptual/peer-to-peer-conceptual.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/docs/defradb/Conceptual/peer-to-peer-conceptual.md b/docs/defradb/Conceptual/peer-to-peer-conceptual.md index 1e69205..51686a3 100644 --- a/docs/defradb/Conceptual/peer-to-peer-conceptual.md +++ b/docs/defradb/Conceptual/peer-to-peer-conceptual.md @@ -6,6 +6,20 @@ Peer-to-peer (P2P) networking is a way for devices or peers to communicate direc DefraDB is a decentralized database built on this idea. Instead of the traditional client-server setup, DefraDB uses P2P networking so apps can sync data locally and share information without relying on a trusted middleman. This supports a decentralized, private, and user-focused approach. +:::tip[Key Points] + +DefraDB leverages P2P networking via libp2p to synchronize data directly between distributed nodes, enabling **offline-first applications without a central server**. + +**Key capabilities:** +- **Passive replication** – Automatic broadcasting of updates via PubSub (similar to UDP) +- **Active replication** – Direct, point-to-point synchronization between specific nodes (similar to TCP) +- **NAT traversal** – Circuit relays and hole punching to connect nodes behind firewalls +- **Resilient synchronization** – Updates queue offline and sync automatically when connectivity returns + +DefraDB stores documents as update graphs (similar to Git) using IPLD blocks distributed across nodes. 
+ +::: + ## Key concepts ### Libp2p networking framework From 1b966b4ef7d6d16dc8f7715e20d96e850d07a536 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Wed, 3 Dec 2025 12:49:32 -0800 Subject: [PATCH 04/13] update --- docs/defradb/Procedural/peer-to-peer-procedural.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/docs/defradb/Procedural/peer-to-peer-procedural.md b/docs/defradb/Procedural/peer-to-peer-procedural.md index 2b79818..4331963 100644 --- a/docs/defradb/Procedural/peer-to-peer-procedural.md +++ b/docs/defradb/Procedural/peer-to-peer-procedural.md @@ -8,7 +8,7 @@ Before following these guides, ensure you have: - DefraDB installed on your system - Basic familiarity with command-line interfaces -- Understanding of [P2P networking concepts](link-to-concepts-page) +- Understanding of [P2P networking concepts](/defradb/Conceptual/peer-to-peer-conceptual.md) ## Start and configure DefraDB @@ -22,7 +22,7 @@ defradb start You'll see output similar to: -``` +```bash Jan 2 10:15:49.124 INF cli Starting DefraDB Jan 2 10:15:49.161 INF net Created LibP2P host PeerId=12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 Address=[/ip4/127.0.0.1/tcp/9171] Jan 2 10:15:49.162 INF net Starting internal broadcaster for pubsub network @@ -51,6 +51,7 @@ defradb start --p2paddr /ip4/0.0.0.0/tcp/9172 ``` **Parameters**: + - Replace `` with your actual IP address (use `0.0.0.0` to listen on all interfaces) - Replace `` with your desired port number @@ -83,6 +84,7 @@ defradb start --peers /ip4/192.168.1.100/tcp/9171/p2p/12D3KooWEFCQ1iGMobsmNTPXb7 ``` **Parameters**: + - Replace `` with the peer's IP address - Replace `` with the peer's P2P port - Replace `` with the peer's Peer ID @@ -170,6 +172,7 @@ curl -X POST http://localhost:9181/api/p2p/replicators \ ``` **Parameters**: + - `ID`: The Peer ID of the node you want to replicate to - `Addrs`: Array of multi-addresses for the peer - `Collections`: Array of collection names to replicate (e.g., 
`["Books"]`) @@ -215,4 +218,4 @@ If peers can't connect within the same home Wi-Fi network, this is typically due 2. Configuring NAT hole punching 3. Connecting peers through the internet rather than the local network -See the [P2P concepts](link-to-concepts-page) page for more information on NAT traversal. +See the [P2P Conceptual](/defradb/Conceptual/peer-to-peer-conceptual.md) page for more information on NAT traversal. From 85d0fcb586967549bad6bb8817286296236c36b4 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Fri, 5 Dec 2025 10:28:06 -0800 Subject: [PATCH 05/13] update --- docs/defradb/{Conceptual => Concepts}/_category_.json | 0 .../{Conceptual => Concepts}/content-addressable-storage.md | 0 docs/defradb/{Conceptual => Concepts}/content-identifier.md | 0 docs/defradb/{Conceptual => Concepts}/ipfs.md | 0 docs/defradb/{Conceptual => Concepts}/libp2p.md | 0 docs/defradb/{Conceptual => Concepts}/merkle-crdt.md | 0 docs/defradb/{Conceptual => Concepts}/peer-to-peer-conceptual.md | 0 docs/defradb/{Procedural => Guides}/_category_.json | 0 docs/defradb/{Procedural => Guides}/akash-deployment.md | 0 docs/defradb/{Procedural => Guides}/deployment.md | 0 docs/defradb/{Procedural => Guides}/explain-queries.md | 0 docs/defradb/{Procedural => Guides}/peer-to-peer-procedural.md | 0 docs/defradb/{Procedural => Guides}/peer-to-peer.md | 0 docs/defradb/{Procedural => Guides}/schema-migration.md | 0 docs/defradb/{Procedural => Guides}/schema-relationship.md | 0 docs/defradb/{Procedural => Guides}/secondary-index.md | 0 docs/defradb/{Procedural => Guides}/time-traveling-queries.md | 0 17 files changed, 0 insertions(+), 0 deletions(-) rename docs/defradb/{Conceptual => Concepts}/_category_.json (100%) rename docs/defradb/{Conceptual => Concepts}/content-addressable-storage.md (100%) rename docs/defradb/{Conceptual => Concepts}/content-identifier.md (100%) rename docs/defradb/{Conceptual => Concepts}/ipfs.md (100%) rename docs/defradb/{Conceptual => Concepts}/libp2p.md 
(100%) rename docs/defradb/{Conceptual => Concepts}/merkle-crdt.md (100%) rename docs/defradb/{Conceptual => Concepts}/peer-to-peer-conceptual.md (100%) rename docs/defradb/{Procedural => Guides}/_category_.json (100%) rename docs/defradb/{Procedural => Guides}/akash-deployment.md (100%) rename docs/defradb/{Procedural => Guides}/deployment.md (100%) rename docs/defradb/{Procedural => Guides}/explain-queries.md (100%) rename docs/defradb/{Procedural => Guides}/peer-to-peer-procedural.md (100%) rename docs/defradb/{Procedural => Guides}/peer-to-peer.md (100%) rename docs/defradb/{Procedural => Guides}/schema-migration.md (100%) rename docs/defradb/{Procedural => Guides}/schema-relationship.md (100%) rename docs/defradb/{Procedural => Guides}/secondary-index.md (100%) rename docs/defradb/{Procedural => Guides}/time-traveling-queries.md (100%) diff --git a/docs/defradb/Conceptual/_category_.json b/docs/defradb/Concepts/_category_.json similarity index 100% rename from docs/defradb/Conceptual/_category_.json rename to docs/defradb/Concepts/_category_.json diff --git a/docs/defradb/Conceptual/content-addressable-storage.md b/docs/defradb/Concepts/content-addressable-storage.md similarity index 100% rename from docs/defradb/Conceptual/content-addressable-storage.md rename to docs/defradb/Concepts/content-addressable-storage.md diff --git a/docs/defradb/Conceptual/content-identifier.md b/docs/defradb/Concepts/content-identifier.md similarity index 100% rename from docs/defradb/Conceptual/content-identifier.md rename to docs/defradb/Concepts/content-identifier.md diff --git a/docs/defradb/Conceptual/ipfs.md b/docs/defradb/Concepts/ipfs.md similarity index 100% rename from docs/defradb/Conceptual/ipfs.md rename to docs/defradb/Concepts/ipfs.md diff --git a/docs/defradb/Conceptual/libp2p.md b/docs/defradb/Concepts/libp2p.md similarity index 100% rename from docs/defradb/Conceptual/libp2p.md rename to docs/defradb/Concepts/libp2p.md diff --git 
a/docs/defradb/Conceptual/merkle-crdt.md b/docs/defradb/Concepts/merkle-crdt.md similarity index 100% rename from docs/defradb/Conceptual/merkle-crdt.md rename to docs/defradb/Concepts/merkle-crdt.md diff --git a/docs/defradb/Conceptual/peer-to-peer-conceptual.md b/docs/defradb/Concepts/peer-to-peer-conceptual.md similarity index 100% rename from docs/defradb/Conceptual/peer-to-peer-conceptual.md rename to docs/defradb/Concepts/peer-to-peer-conceptual.md diff --git a/docs/defradb/Procedural/_category_.json b/docs/defradb/Guides/_category_.json similarity index 100% rename from docs/defradb/Procedural/_category_.json rename to docs/defradb/Guides/_category_.json diff --git a/docs/defradb/Procedural/akash-deployment.md b/docs/defradb/Guides/akash-deployment.md similarity index 100% rename from docs/defradb/Procedural/akash-deployment.md rename to docs/defradb/Guides/akash-deployment.md diff --git a/docs/defradb/Procedural/deployment.md b/docs/defradb/Guides/deployment.md similarity index 100% rename from docs/defradb/Procedural/deployment.md rename to docs/defradb/Guides/deployment.md diff --git a/docs/defradb/Procedural/explain-queries.md b/docs/defradb/Guides/explain-queries.md similarity index 100% rename from docs/defradb/Procedural/explain-queries.md rename to docs/defradb/Guides/explain-queries.md diff --git a/docs/defradb/Procedural/peer-to-peer-procedural.md b/docs/defradb/Guides/peer-to-peer-procedural.md similarity index 100% rename from docs/defradb/Procedural/peer-to-peer-procedural.md rename to docs/defradb/Guides/peer-to-peer-procedural.md diff --git a/docs/defradb/Procedural/peer-to-peer.md b/docs/defradb/Guides/peer-to-peer.md similarity index 100% rename from docs/defradb/Procedural/peer-to-peer.md rename to docs/defradb/Guides/peer-to-peer.md diff --git a/docs/defradb/Procedural/schema-migration.md b/docs/defradb/Guides/schema-migration.md similarity index 100% rename from docs/defradb/Procedural/schema-migration.md rename to 
docs/defradb/Guides/schema-migration.md diff --git a/docs/defradb/Procedural/schema-relationship.md b/docs/defradb/Guides/schema-relationship.md similarity index 100% rename from docs/defradb/Procedural/schema-relationship.md rename to docs/defradb/Guides/schema-relationship.md diff --git a/docs/defradb/Procedural/secondary-index.md b/docs/defradb/Guides/secondary-index.md similarity index 100% rename from docs/defradb/Procedural/secondary-index.md rename to docs/defradb/Guides/secondary-index.md diff --git a/docs/defradb/Procedural/time-traveling-queries.md b/docs/defradb/Guides/time-traveling-queries.md similarity index 100% rename from docs/defradb/Procedural/time-traveling-queries.md rename to docs/defradb/Guides/time-traveling-queries.md From 3aebb2284c7b7e78fdd68bba5b2ebd91ea2ad198 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Fri, 5 Dec 2025 10:28:14 -0800 Subject: [PATCH 06/13] update --- docs/defradb/Concepts/_category_.json | 2 +- docs/defradb/Guides/_category_.json | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/defradb/Concepts/_category_.json b/docs/defradb/Concepts/_category_.json index 621e8ca..dbc4274 100644 --- a/docs/defradb/Concepts/_category_.json +++ b/docs/defradb/Concepts/_category_.json @@ -1,5 +1,5 @@ { - "label": "Conceptual guides", + "label": "Concepts", "position": 3 } \ No newline at end of file diff --git a/docs/defradb/Guides/_category_.json b/docs/defradb/Guides/_category_.json index 545680e..7494f24 100644 --- a/docs/defradb/Guides/_category_.json +++ b/docs/defradb/Guides/_category_.json @@ -1,5 +1,5 @@ { - "label": "Procedural guides", + "label": "Guides", "position": 2 } \ No newline at end of file From 477b22fced34807e45611b73c624688bea933f75 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Fri, 5 Dec 2025 10:29:40 -0800 Subject: [PATCH 07/13] update --- .../Concepts/{peer-to-peer-conceptual.md => peer-to-peer.md} | 0 .../Guides/{peer-to-peer-procedural.md => peer-to-peer-how-to.md} | 0 2 
files changed, 0 insertions(+), 0 deletions(-) rename docs/defradb/Concepts/{peer-to-peer-conceptual.md => peer-to-peer.md} (100%) rename docs/defradb/Guides/{peer-to-peer-procedural.md => peer-to-peer-how-to.md} (100%) diff --git a/docs/defradb/Concepts/peer-to-peer-conceptual.md b/docs/defradb/Concepts/peer-to-peer.md similarity index 100% rename from docs/defradb/Concepts/peer-to-peer-conceptual.md rename to docs/defradb/Concepts/peer-to-peer.md diff --git a/docs/defradb/Guides/peer-to-peer-procedural.md b/docs/defradb/Guides/peer-to-peer-how-to.md similarity index 100% rename from docs/defradb/Guides/peer-to-peer-procedural.md rename to docs/defradb/Guides/peer-to-peer-how-to.md From c261bdd48ce5b6be5209ef019b99177e32896ee1 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Mon, 8 Dec 2025 13:31:20 -0800 Subject: [PATCH 08/13] update section name --- docs/defradb/Guides/_category_.json | 5 ----- docs/defradb/How-to Guides/_category_.json | 5 +++++ docs/defradb/{Guides => How-to Guides}/akash-deployment.md | 0 docs/defradb/{Guides => How-to Guides}/deployment.md | 0 docs/defradb/{Guides => How-to Guides}/explain-queries.md | 0 .../defradb/{Guides => How-to Guides}/peer-to-peer-how-to.md | 0 docs/defradb/{Guides => How-to Guides}/peer-to-peer.md | 0 docs/defradb/{Guides => How-to Guides}/schema-migration.md | 0 .../defradb/{Guides => How-to Guides}/schema-relationship.md | 0 docs/defradb/{Guides => How-to Guides}/secondary-index.md | 0 .../{Guides => How-to Guides}/time-traveling-queries.md | 0 11 files changed, 5 insertions(+), 5 deletions(-) delete mode 100644 docs/defradb/Guides/_category_.json create mode 100644 docs/defradb/How-to Guides/_category_.json rename docs/defradb/{Guides => How-to Guides}/akash-deployment.md (100%) rename docs/defradb/{Guides => How-to Guides}/deployment.md (100%) rename docs/defradb/{Guides => How-to Guides}/explain-queries.md (100%) rename docs/defradb/{Guides => How-to Guides}/peer-to-peer-how-to.md (100%) rename 
docs/defradb/{Guides => How-to Guides}/peer-to-peer.md (100%) rename docs/defradb/{Guides => How-to Guides}/schema-migration.md (100%) rename docs/defradb/{Guides => How-to Guides}/schema-relationship.md (100%) rename docs/defradb/{Guides => How-to Guides}/secondary-index.md (100%) rename docs/defradb/{Guides => How-to Guides}/time-traveling-queries.md (100%) diff --git a/docs/defradb/Guides/_category_.json b/docs/defradb/Guides/_category_.json deleted file mode 100644 index 7494f24..0000000 --- a/docs/defradb/Guides/_category_.json +++ /dev/null @@ -1,5 +0,0 @@ -{ - "label": "Guides", - "position": 2 - } - \ No newline at end of file diff --git a/docs/defradb/How-to Guides/_category_.json b/docs/defradb/How-to Guides/_category_.json new file mode 100644 index 0000000..7366eae --- /dev/null +++ b/docs/defradb/How-to Guides/_category_.json @@ -0,0 +1,5 @@ +{ + "label": "How-to Guides", + "position": 2 + } + \ No newline at end of file diff --git a/docs/defradb/Guides/akash-deployment.md b/docs/defradb/How-to Guides/akash-deployment.md similarity index 100% rename from docs/defradb/Guides/akash-deployment.md rename to docs/defradb/How-to Guides/akash-deployment.md diff --git a/docs/defradb/Guides/deployment.md b/docs/defradb/How-to Guides/deployment.md similarity index 100% rename from docs/defradb/Guides/deployment.md rename to docs/defradb/How-to Guides/deployment.md diff --git a/docs/defradb/Guides/explain-queries.md b/docs/defradb/How-to Guides/explain-queries.md similarity index 100% rename from docs/defradb/Guides/explain-queries.md rename to docs/defradb/How-to Guides/explain-queries.md diff --git a/docs/defradb/Guides/peer-to-peer-how-to.md b/docs/defradb/How-to Guides/peer-to-peer-how-to.md similarity index 100% rename from docs/defradb/Guides/peer-to-peer-how-to.md rename to docs/defradb/How-to Guides/peer-to-peer-how-to.md diff --git a/docs/defradb/Guides/peer-to-peer.md b/docs/defradb/How-to Guides/peer-to-peer.md similarity index 100% rename from 
docs/defradb/Guides/peer-to-peer.md rename to docs/defradb/How-to Guides/peer-to-peer.md diff --git a/docs/defradb/Guides/schema-migration.md b/docs/defradb/How-to Guides/schema-migration.md similarity index 100% rename from docs/defradb/Guides/schema-migration.md rename to docs/defradb/How-to Guides/schema-migration.md diff --git a/docs/defradb/Guides/schema-relationship.md b/docs/defradb/How-to Guides/schema-relationship.md similarity index 100% rename from docs/defradb/Guides/schema-relationship.md rename to docs/defradb/How-to Guides/schema-relationship.md diff --git a/docs/defradb/Guides/secondary-index.md b/docs/defradb/How-to Guides/secondary-index.md similarity index 100% rename from docs/defradb/Guides/secondary-index.md rename to docs/defradb/How-to Guides/secondary-index.md diff --git a/docs/defradb/Guides/time-traveling-queries.md b/docs/defradb/How-to Guides/time-traveling-queries.md similarity index 100% rename from docs/defradb/Guides/time-traveling-queries.md rename to docs/defradb/How-to Guides/time-traveling-queries.md From 36b3d6510c7ee186fac470c3cdc3722b71b5be72 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Mon, 8 Dec 2025 13:33:28 -0800 Subject: [PATCH 09/13] update --- docs/defradb/How-to Guides/peer-to-peer-how-to.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/defradb/How-to Guides/peer-to-peer-how-to.md b/docs/defradb/How-to Guides/peer-to-peer-how-to.md index 4331963..2da454f 100644 --- a/docs/defradb/How-to Guides/peer-to-peer-how-to.md +++ b/docs/defradb/How-to Guides/peer-to-peer-how-to.md @@ -1,3 +1,8 @@ +--- +sidebar_label: Peer-to-Peer How-to Guide +sidebar_position: 10 +--- + # Peer-to-peer how-to guides This guide provides step-by-step instructions for configuring and managing peer-to-peer networking in DefraDB. 
From 3074235e5c0c23fe1f7f680170a9813cec495fb6 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Mon, 8 Dec 2025 13:35:06 -0800 Subject: [PATCH 10/13] update --- docs/defradb/{How-to Guides => Archive}/peer-to-peer.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/defradb/{How-to Guides => Archive}/peer-to-peer.md (100%) diff --git a/docs/defradb/How-to Guides/peer-to-peer.md b/docs/defradb/Archive/peer-to-peer.md similarity index 100% rename from docs/defradb/How-to Guides/peer-to-peer.md rename to docs/defradb/Archive/peer-to-peer.md From 57365f9a282c3bcb936dbe33e8cb65028e4fd7e7 Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Tue, 9 Dec 2025 10:37:16 -0800 Subject: [PATCH 11/13] remove lint errors --- docs/defradb/Archive/peer-to-peer.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/docs/defradb/Archive/peer-to-peer.md b/docs/defradb/Archive/peer-to-peer.md index f504407..b81797e 100644 --- a/docs/defradb/Archive/peer-to-peer.md +++ b/docs/defradb/Archive/peer-to-peer.md @@ -9,6 +9,7 @@ sidebar_position: 10 DefraDB leverages P2P networking via libp2p to synchronize data directly between distributed nodes, enabling **offline-first applications without a central server**. **Key capabilities:** + - **Passive replication** – Automatic broadcasting of updates via PubSub (similar to UDP) - **Active replication** – Direct, point-to-point synchronization between specific nodes (similar to TCP) - **NAT traversal** – Circuit relays and hole punching to connect nodes behind firewalls @@ -34,17 +35,17 @@ Libp2p is modular, meaning it can be customized and integrated into different P2 The high-level distinction between a document is as follows: -* A document is a single record that contains multiple fields. These documents are bound by schema. For example, each row in an SQL table has multiple individual columns. These rows are analogous to documents with multiple individual fields. 
+- A document is a single record that contains multiple fields. These documents are bound by schema. For example, each row in an SQL table has multiple individual columns. These rows are analogous to documents with multiple individual fields. -* A collection refers to a collection of documents under a single schema. For example, a table from an SQL database comprising of rows and columns is analogous to collections. +- A collection refers to a collection of documents under a single schema. For example, a table from an SQL database comprising of rows and columns is analogous to collections. ## Need for P2P Networking in DefraDB The DefraDB database requires peer-to-peer (P2P) networking to facilitate data synchronization between nodes. This is necessary because DefraDB can store documents and individual IPLD blocks on various nodes around the world, which may be used by a single application or multiple applications. P2P networking allows local instances of DefraDB, whether on a single device or in a web browser, to replicate information with other devices owned by the user or with trusted third parties. These third parties may serve as historical archival nodes or may be other users with whom the user is collaborating. For example, if a collaborative document powered by DefraDB is being shared with others, it should be transmitted over a P2P network to avoid the need for a trusted intermediary node. DefraDB offers two types of replication over the P2P network: -* Passive replication +- Passive replication -* Active replication +- Active replication ## How it works @@ -52,7 +53,7 @@ There are two, concrete types of data replication within DefraDB, i.e., active, ### Passive Replication -In DefraDB, passive replication is a type of data replication in which updates are automatically broadcast to the network and its peers without explicit coordination. 
This occurs over a global publish-subscribe network (PubSub), which is a way to broadcast updates on a specific topic and receive updates on that topic. +In DefraDB, passive replication is a type of data replication in which updates are automatically broadcast to the network and its peers without explicit coordination. This occurs over a global publish-subscribe network (PubSub), which is a way to broadcast updates on a specific topic and receive updates on that topic. This is called passive replication because it is similar to a "fire and forget" scenario. Passive replication is enabled for all nodes by default and all nodes will always publish to the larger PubSub network. Passive replication can be compared to the connectionless protocol UDP, while active replication can be compared to the connection-oriented protocol TCP. @@ -70,7 +71,6 @@ In passive replication, updates are broadcasted on a per-document level over the One major difference between active and passive networks is that an active network can focus on both collections and individual documents, while a passive network is only focused on individual documents. Active networks operate over a direct, point-to-point connection and allow you to select an entire collection to replicate to another node. For example, if you have a collection of books and specify a target node for active replication, the entire collection will be replicated to that node, including any updates to individual books. However, it is also possible to replicate granularly by selecting specific books within the collection for replication. Passive networks, on the other hand, are only concerned with replicating individual documents. - ## Concrete Features of P2P in DefraDB ### Passive Replication Features @@ -90,27 +90,27 @@ Jan 2 10:15:49.163 INF node Providing GraphQL endpoint at http://127.0.0.1:9181 This host has a Peer ID, which is a function of a secret private key generated when the node is started for the first time. 
The Peer ID is important to know as it may be relevant for different parts of the peer-to-peer networking system. The libp2p networking stack can be enabled or disabled. ```bash -$ defradb start --no-p2p +defradb start --no-p2p ``` The passive networking system can also be enabled or disabled. By default, if the P2P network is online, the passive networking system is turned on. ```bash -$ defradb start --peers /ip4/0.0.0.0/tcp/9171/p2p/ +defradb start --peers /ip4/0.0.0.0/tcp/9171/p2p/ ``` A node automatically listens on multiple addresses or ports when the P2P module is instantiated. These are referred to as the peer-to-peer address, which is expressed as a multi-address. A multi-address is a string that represents a network address and includes information about the transport protocol and addresses for multiple layers of the network stack. - ```bash /ip4/0.0.0.0/tcp/9171/p2p/ scheme/ip_address/protocol/port/protocol/peer_id ``` + The peer listens in on the p2p port 9171​ by default, which can be customized through the CLI or the configuration file. ```bash -$ defradb start --p2paddr /ip4/0.0.0.0/tcp/9172 +defradb start --p2paddr /ip4/0.0.0.0/tcp/9172 ``` The peer-to-peer address is the first of the addresses that the peer listens in on. @@ -124,7 +124,7 @@ When a node is started, it specifies a list of peers that it wants to stay conne To use the active replication feature in DefraDB, you can submit an add replicator Remote Procedure Call (RPC) command through the client API. You will need to specify the multi-address and Peer ID of the peer that you want to include in the replicator set, as well as the name of the collection that you want to replicate to that peer. These steps handle the process of defining which peers you want to connect to, enabling or disabling the underlying subsystems, and sending additional RPC commands to add any necessary replicators. 
```bash -$ defradb client p2p replicator set -c Books +defradb client p2p replicator set -c Books ``` ## Benefits of the P2P System From 3fbe07de8c437180da7ef9e43041d2c6b9f8723e Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Tue, 9 Dec 2025 10:40:08 -0800 Subject: [PATCH 12/13] update --- docs/defradb/Concepts/peer-to-peer.md | 9 +++++++-- docs/defradb/How-to Guides/peer-to-peer-how-to.md | 4 ++-- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/docs/defradb/Concepts/peer-to-peer.md b/docs/defradb/Concepts/peer-to-peer.md index 51686a3..5ead4c9 100644 --- a/docs/defradb/Concepts/peer-to-peer.md +++ b/docs/defradb/Concepts/peer-to-peer.md @@ -11,6 +11,7 @@ DefraDB is a decentralized database built on this idea. Instead of the tradition DefraDB leverages P2P networking via libp2p to synchronize data directly between distributed nodes, enabling **offline-first applications without a central server**. **Key capabilities:** + - **Passive replication** – Automatic broadcasting of updates via PubSub (similar to UDP) - **Active replication** – Direct, point-to-point synchronization between specific nodes (similar to TCP) - **NAT traversal** – Circuit relays and hole punching to connect nodes behind firewalls @@ -127,12 +128,14 @@ When DefraDB starts, it creates a Peer ID—a unique identifier based on a priva A node automatically listens on multiple addresses or ports when the P2P module is instantiated. These are expressed as multi-addresses—strings that represent network addresses and include information about transport protocols and multiple network stack layers. Format: -``` + +```bash /ip4//tcp//p2p/ ``` Example: -``` + +```bash /ip4/0.0.0.0/tcp/9171/p2p/12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 ``` @@ -147,12 +150,14 @@ By default, DefraDB listens on P2P port `9171`. **Multi-hop between subnets**: Currently, synchronizing between subnets requires going through the global network, requiring multiple hops. 
The team is exploring multi-hop mechanisms to address this. **Bitswap and DHT scalability**: Current limitations are being addressed through: + - **PubSub-based query system**: Allows queries and updates through the global PubSub network using query topics independent of document topics. - **GraphSync**: A Protocol Labs protocol that may resolve Bitswap algorithm and DHT issues. ### Future improvements **Head Exchange protocol**: A new protocol in development to address issues with syncing the Merkle DAG when updates have been missed or concurrent, diverged updates have been made. It aims to efficiently: + - Establish the most recent update seen by each node - Determine if there are divergent updates - Find the most efficient way to synchronize nodes with minimal communication diff --git a/docs/defradb/How-to Guides/peer-to-peer-how-to.md b/docs/defradb/How-to Guides/peer-to-peer-how-to.md index 2da454f..73efb09 100644 --- a/docs/defradb/How-to Guides/peer-to-peer-how-to.md +++ b/docs/defradb/How-to Guides/peer-to-peer-how-to.md @@ -13,7 +13,7 @@ Before following these guides, ensure you have: - DefraDB installed on your system - Basic familiarity with command-line interfaces -- Understanding of [P2P networking concepts](/defradb/Conceptual/peer-to-peer-conceptual.md) +- Understanding of [P2P networking concepts](/defradb/Concepts/peer-to-peer.md) ## Start and configure DefraDB @@ -223,4 +223,4 @@ If peers can't connect within the same home Wi-Fi network, this is typically due 2. Configuring NAT hole punching 3. Connecting peers through the internet rather than the local network -See the [P2P Conceptual](/defradb/Conceptual/peer-to-peer-conceptual.md) page for more information on NAT traversal. +See the [P2P Conceptual](/defradb/Concepts/peer-to-peer.md) page for more information on NAT traversal. 
From b3aa6c62707f176890c64eb8ce6b2a2197bdfd6c Mon Sep 17 00:00:00 2001 From: pradhanashutosh Date: Tue, 9 Dec 2025 11:08:56 -0800 Subject: [PATCH 13/13] update --- docs/defradb/Archive/peer-to-peer.md | 158 --------------------------- 1 file changed, 158 deletions(-) delete mode 100644 docs/defradb/Archive/peer-to-peer.md diff --git a/docs/defradb/Archive/peer-to-peer.md b/docs/defradb/Archive/peer-to-peer.md deleted file mode 100644 index b81797e..0000000 --- a/docs/defradb/Archive/peer-to-peer.md +++ /dev/null @@ -1,158 +0,0 @@ ---- -sidebar_label: Peer-to-Peer Networking -sidebar_position: 10 ---- -# A Guide to Peer-to-Peer Networking in DefraDB - -:::tip[Key Points] - -DefraDB leverages P2P networking via libp2p to synchronize data directly between distributed nodes, enabling **offline-first applications without a central server**. - -**Key capabilities:** - -- **Passive replication** – Automatic broadcasting of updates via PubSub (similar to UDP) -- **Active replication** – Direct, point-to-point synchronization between specific nodes (similar to TCP) -- **NAT traversal** – Circuit relays and hole punching to connect nodes behind firewalls -- **Resilient synchronization** – Updates queue offline and sync automatically when connectivity returns - -DefraDB stores documents as update graphs (similar to Git) using IPLD blocks distributed across nodes. - -::: - -## Overview - -P2P networking is a way for devices to communicate and share data directly with each other without the need for a central server. In a P2P network, all devices, also known as peers, are equal and can both send and receive data. DefraDB is a database that uses P2P networking instead of the traditional client-server model. - -One advantage of this is that it allows for the development of offline-first or local-first applications. 
These are apps that can still work even when there is no internet connection and can sync data between multiple devices without the need for a central server to facilitate the synchronization. This makes it possible for a peer-to-peer network and database like DefraDB to function in a trustless environment, where no one device is more important or trustworthy than any other. This aligns with the goals of a decentralized, private, and user-centric database. - -P2P networking is the primary method of communication used in DefraDB, a decentralized database. The libp2p library was developed specifically for this purpose and forms the technological foundation of the database. In DefraDB, documents are replicated and combined into an update graph, similar to a version control client like Git or a hash chain or a hash graph. P2P networking allows nodes in DefraDB to communicate directly with each other, without the need for an intermediate node, making it easier to synchronize the updates within the update graph of a document. - -Libp2p is a decentralized network framework that enables the development of P2P applications. It consists of a set of protocols, specifications, and libraries created by Protocol Labs for the IPFS project. As the network layer for IPFS, libp2p provides various features for P2P communication such as transport, security, peer routing, and content discovery. - -Libp2p is modular, meaning it can be customized and integrated into different P2P projects and applications. It is designed to work with the IPLD (Inter Planetary Linked Data) data model, which is a suite of technologies for representing and navigating hash-linked data. IPLD allows for the unification of all data models that link data with hashes as instances of IPLD, making it a suitable choice for use with libp2p in P2P networking. - -## Documents and Collections - -The high-level distinction between a document is as follows: - -- A document is a single record that contains multiple fields. 
These documents are bound by schema. For example, each row in an SQL table has multiple individual columns. These rows are analogous to documents with multiple individual fields. - -- A collection refers to a collection of documents under a single schema. For example, a table from an SQL database comprising of rows and columns is analogous to collections. - -## Need for P2P Networking in DefraDB - -The DefraDB database requires peer-to-peer (P2P) networking to facilitate data synchronization between nodes. This is necessary because DefraDB can store documents and individual IPLD blocks on various nodes around the world, which may be used by a single application or multiple applications. P2P networking allows local instances of DefraDB, whether on a single device or in a web browser, to replicate information with other devices owned by the user or with trusted third parties. These third parties may serve as historical archival nodes or may be other users with whom the user is collaborating. For example, if a collaborative document powered by DefraDB is being shared with others, it should be transmitted over a P2P network to avoid the need for a trusted intermediary node. DefraDB offers two types of replication over the P2P network: - -- Passive replication - -- Active replication - -## How it works - -There are two, concrete types of data replication within DefraDB, i.e., active, and passive replication. Both these replication types serve different use cases and are implemented using different mechanics. - -### Passive Replication - -In DefraDB, passive replication is a type of data replication in which updates are automatically broadcast to the network and its peers without explicit coordination. This occurs over a global publish-subscribe network (PubSub), which is a way to broadcast updates on a specific topic and receive updates on that topic. - -This is called passive replication because it is similar to a "fire and forget" scenario. 
Passive replication is enabled for all nodes by default and all nodes will always publish to the larger PubSub network. Passive replication can be compared to the connectionless protocol UDP, while active replication can be compared to the connection-oriented protocol TCP. - -### Active Replication - -In active replication, data is replicated between nodes in a direct, point-to-point manner. This means that a specific node is chosen to constantly receive updates from the local node. In contrast, passive replication uses the Gossip protocol, which is a peer-to-peer communication mechanism in which nodes exchange state information about themselves and other nodes they know about. In the Gossip protocol, each node initiates a gossip round every second to exchange information with another random node, and the process is repeated until the whole system is synchronized. One difference between active and passive replication is that the Gossip protocol is a multi-hop protocol, meaning that there may be multiple connections between nodes in the network. Active replication, on the other hand, creates a direct connection between two nodes and ensures that updates are actively pushed to the other node, which then acknowledges receipt of the update to establish two-way communication. - -Passive replication is a good choice for situations where you want your peers to be able to follow your updates without requiring much coordination from you. It is often used in collaborative environments where multiple people are working on a document and want to ensure that both peers are in sync with each other. On the other hand, active replication is better for situations where you have a specific peer you are collaborating with and want to ensure that all of your data is being replicated to an archival node. This is because active replication involves a direct, point-to-point connection between the two nodes, allowing for more efficient and reliable data replication. 
- -## Implementation of Peer-to-Peer Networking in DefraDB - -In the DefraDB software architecture, a PubSub system is used for peer-to-peer networking. In this system, publishers send messages without specifying specific receivers, and subscribers express interest in certain types of messages without knowing which publishers they come from. This allows for a more dynamic network topology and better scalability. In the DefraDB PubSub network, nodes can publish or subscribe to specific topics. When a node publishes a message in passive replication, it is broadcasted to all nodes in the network. These nodes then coordinate with each other, re-broadcast the message, and use a process called "gossiping" to spread the published information through multiple connections, or "hops." This is known as the Gossip protocol. - -In passive replication, updates are broadcasted on a per-document level over the global PubSub network. Each document has its own topic, and nodes can subscribe to the topic corresponding to a specific document to receive updates passively. This is useful in environments where certain documents are in high demand or are being frequently updated, as the connections to these "hot documents" can be kept open to ensure they are kept up-to-date. However, if a document has not been accessed in a while, it is less important for it to be constantly updated and it is easy to resync these "cold documents" by submitting a query for the relevant updates. Passive replication and the PubSub system are therefore focused on individual documents. - -One major difference between active and passive networks is that an active network can focus on both collections and individual documents, while a passive network is only focused on individual documents. Active networks operate over a direct, point-to-point connection and allow you to select an entire collection to replicate to another node. 
For example, if you have a collection of books and specify a target node for active replication, the entire collection will be replicated to that node, including any updates to individual books. However, it is also possible to replicate granularly by selecting specific books within the collection for replication. Passive networks, on the other hand, are only concerned with replicating individual documents. - -## Concrete Features of P2P in DefraDB - -### Passive Replication Features - -The Defra Command Line Interface (CLI) allows you to modify the behavior of the peer-to-peer data network. When a DefraDB node starts up, it is assigned a libp2p host by default. - -```bash -$ defradb start -... -Jan 2 10:15:49.124 INF cli Starting DefraDB -Jan 2 10:15:49.161 INF net Created LibP2P host PeerId=12D3KooWEFCQ1iGMobsmNTPXb758kJkFc7XieQyGKpsuMxeDktz4 Address=[/ip4/127.0.0.1/tcp/9171] -Jan 2 10:15:49.162 INF net Starting internal broadcaster for pubsub network -Jan 2 10:15:49.163 INF node Providing HTTP API at http://127.0.0.1:9181 PlaygroundEnabled=false -Jan 2 10:15:49.163 INF node Providing GraphQL endpoint at http://127.0.0.1:9181/api/v0/graphql -``` - -This host has a Peer ID, which is a function of a secret private key generated when the node is started for the first time. The Peer ID is important to know as it may be relevant for different parts of the peer-to-peer networking system. The libp2p networking stack can be enabled or disabled. - -```bash -defradb start --no-p2p -``` - -The passive networking system can also be enabled or disabled. By default, if the P2P network is online, the passive networking system is turned on. - -```bash -defradb start --peers /ip4/0.0.0.0/tcp/9171/p2p/ -``` - -A node automatically listens on multiple addresses or ports when the P2P module is instantiated. These are referred to as the peer-to-peer address, which is expressed as a multi-address. 
A multi-address is a string that represents a network address and includes information about the transport protocol and addresses for multiple layers of the network stack. - -```bash -/ip4/0.0.0.0/tcp/9171/p2p/ - -scheme/ip_address/protocol/port/protocol/peer_id -``` - -The peer listens in on the p2p port 9171​ by default, which can be customized through the CLI or the configuration file. - -```bash -defradb start --p2paddr /ip4/0.0.0.0/tcp/9172 -``` - -The peer-to-peer address is the first of the addresses that the peer listens in on. - -At the start of a node, flags can be specified to enable, disable, or switch the host that the peer is listening on. When a new node is started, every existing or new document goes through an LRU (Least Recently Used) cache to identify the most important, relevant, or frequently used documents over a specific period of time. Then, by default, the passive replication system automatically subscribes to and creates the corresponding document topics on the PubSub network. - -When a node is started, it specifies a list of peers that it wants to stay connected to. The peer-to-peer node is self-organizing, meaning that if a node joins a new topic, it asks the larger network for other peers that are sharing information on that topic. This ensures that the node is always connected to some relevant nodes. A node also tries to find other relevant nodes, particularly when an individual topic is joined, subscribed to, or published. - -### Active Replication Features - -To use the active replication feature in DefraDB, you can submit an add replicator Remote Procedure Call (RPC) command through the client API. You will need to specify the multi-address and Peer ID of the peer that you want to include in the replicator set, as well as the name of the collection that you want to replicate to that peer. 
These steps handle the process of defining which peers you want to connect to, enabling or disabling the underlying subsystems, and sending additional RPC commands to add any necessary replicators. - -```bash -defradb client p2p replicator set -c Books -``` - -## Benefits of the P2P System - -One of the main benefits of the peer-to-peer (P2P) system is its robustness and ability to work even in the event of network failures. This allows developers to create local-first, offline-first applications. If a developer's node loses its internet connection, the P2P system will continue making changes and queue up updates. When the system is back online and reconnects to the network, it will automatically resolve the updates and resume publishing or replicating to the nodes specified by the developer. This means that the developer can rely on a trustless mechanism and does not need to rely on a central, trusted peer for data replication or repositories to save data. Instead, data is directly passed from the developer's node to any other collaborating node. This global P2P network allows developers to collaborate with anyone across the internet with no fundamental limitations. Additionally, since the P2P system is built on top of libp2p, developers have access to other useful features as well. These factors make it highly advantageous to work with a P2P network, especially from a local-first perspective. - -In DefraDB, the peer-to-peer system has several benefits. It is easy to connect to a server in a data center because each server has its own individual IP address. However, in a home network, there is a single IP for the modem and multiple devices connected to it are protected by a NAT firewall, making it difficult for other nodes to connect directly. The libp2p framework offers two solutions to this problem: - -Circuit Relays - This allow you to specify a third-party node that acts as an intermediary to resolve the NAT firewall issue. 
This works when you connect to the circuit relay node, which is a publicly accessible node, and another node connects to it as well. The third-party node acts as a conduit in this situation. This process requires trust in the third-party node to properly relay information, but it operates over encrypted transport layers, so the third-party node cannot use man-in-the-middle attacks to listen in on the data exchange. However, it does require the third-party node to be online and accessible. - -NAT Hole Punching - This is a technique that allows nodes to connect directly to a device behind a NAT firewall. This ensures that a user can directly connect with another node and vice versa, without the need for a trusted intermediary within the peer-to-peer network. - -## Current Limitations and Future Outlook - -Here are some of the limitations of the P2P system: - -One limitation of the peer-to-peer system is the potential scalability issue of giving every document its own independent topic. This can lead to overhead if a user has thousands or tens of thousands of documents in their node, or if an application developer has hundreds of thousands or millions of documents in their node. To address this issue, the team is exploring ways to create aggregate topics that can be scoped to subnets. These subnets can be group-specific or application-specific. However, multiple hops are required between subnets: if a user wants to synchronize and broadcast updates from their subnet to another subnet, the updates have to go from their subnet to the global net and then to the other subnet. The team is exploring ways to navigate this limitation through multi-hop mechanisms. - -In a peer-to-peer network, when a user broadcasts an update, it is sent to other nodes on the network. However, if a node is offline or experiences some other issue, it may miss some updates.
In DefraDB's passive replication mode, the most recent update is broadcast through the network using a Merkle DAG (directed acyclic graph). The broadcasting node does not verify that the receiving node has received all previous updates, so it is the responsibility of the receiving node to ensure it has received all necessary updates. If a node misses a couple of updates and then receives a new update, it must synchronize all previous updates before considering the document up to date. This is because the internal data model of the document is based on all changes made over time, not just the most recent change. The most recent update is sent over the peer-to-peer PubSub network. However, if a node needs to go back in time through the Merkle DAG to get updates from previous broadcasts, it uses a different system called the Distributed Hash Table (DHT). - -The scalability of Bitswap and the Distributed Hash Table (DHT) has been identified as a limitation of the peer-to-peer (P2P) system. To address these issues, we are exploring the use of two new protocols: - -PubSub based query system - This allows users to query and receive updates through the global PubSub network using query topics that are independent of document topics. - -Graph Sync - This is a protocol developed by Protocol Labs, which has the potential to resolve issues with the Bitswap algorithm and DHT. These two approaches show promise in improving the scalability of the P2P system. - -Currently, when a replicator is added to a node, it doesn't persist between node updates or node restarts.
This means that every time there is a restart, the user must re-add these replicators. This is a minor oversight that the Source team plans to fix in a future release. In the meantime, the team is also working on a new protocol called Head Exchange to address issues with syncing the Merkle DAG when updates have been missed or concurrent, diverged updates have been made. The Head Exchange protocol aims to efficiently establish the most recent update seen by each node, determine if there are any divergent updates, and figure out the most efficient way to synchronize the nodes with the least amount of communication. - -One issue with peer-to-peer local-first development is that it can be difficult for nodes to connect with each other when they are running on devices within the same home Wi-Fi network. This is due to the NAT firewall on the router, which protects the private network. A NAT firewall only allows internet traffic to pass through if it was requested by a device on the private network. It protects the identity of a network by not exposing internal IP addresses to the internet. This can make it difficult for other nodes to connect directly to a node running behind a NAT firewall.