diff --git a/.cursor/plans/sequencer_fastsync-style_refactor_7924dc46.plan.md b/.cursor/plans/sequencer_fastsync-style_refactor_7924dc46.plan.md new file mode 100644 index 00000000..6fea3577 --- /dev/null +++ b/.cursor/plans/sequencer_fastsync-style_refactor_7924dc46.plan.md @@ -0,0 +1,192 @@ +--- +name: Sequencer FastSync-style refactor +overview: Restructure the Sequencer and related stream handling to mirror JMDN-FastSync (single framing, Communicator, StreamRouter), unify duplicate SubmitMessageProtocol handlers, and—motivated by profiling showing JSON parse as a bottleneck—inventory hot-path payloads and migrate them to protobuf with definitions centralized under proto/ for maintainability. +todos: + - id: transport-layer + content: "Add Sequencer/transport: single JSON+delimiter Read/Write frame helpers; replace scattered ReadString/Write in Sequencer first" + status: pending + - id: communicator + content: Add Sequencer/protocol/communication.Communicator; move outbound streams from Consensus/Triggers to use transport + status: pending + - id: stream-router + content: Add StreamRouter dispatch by ACK.Stage; extract handler funcs from MessageListener/ListenerHandler switches + status: pending + - id: unify-handlers + content: Merge StructListener vs ListenerHandler SubmitMessageProtocol paths; align node.go and Streaming.go registration + status: pending + - id: split-consensus + content: Split Consensus.go into phase files (mechanical); rename Sequencer/Router if confusing + status: pending + - id: proto-inventory + content: "Profile/inventory: map JSON hot paths (stream + PubSub) to message types; document in a short ADR or checklist" + status: pending + - id: proto-schema + content: Add proto/sequencer (or extend proto/) with messages for stream envelopes + payloads; buf/protoc generation wired in Makefile + status: pending + - id: proto-framing + content: Implement length-delimited protobuf framing (reuse pattern from JMDN-FastSync pbstream); new protocol 
ID or negotiate version + status: pending + - id: proto-migrate-rollout + content: Migrate handlers and Communicator to proto on hot paths; dual-read or network upgrade window; remove JSON from critical path + status: pending + - id: decouple-maps + content: "Optional: move Triggers/Maps out of AVC→Sequencer import cycle" + status: pending +isProject: false +--- + +# Sequencer module alignment with FastSync-style layering + +## Reference pattern (what you want to mirror) + +From [JMDN-FastSync/core/sync/sync_protocols.go](file:///Users/neeraj/CodeSection/JM/JMDN-FastSync/core/sync/sync_protocols.go), [JMDN-FastSync/internal/pbstream/pbstream.go](file:///Users/neeraj/CodeSection/JM/JMDN-FastSync/internal/pbstream/pbstream.go), and [JMDN-FastSync/core/protocol/router/data_router.go](file:///Users/neeraj/CodeSection/JM/JMDN-FastSync/core/protocol/router/data_router.go): + + +| Layer | Role | +| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Stream handlers** | Thin: deadlines, `Read`*, call router, `Write`*, close | +| **Framing** | Single place: length-delimited protobuf in FastSync (`WriteDelimited` / `ReadDelimited`) | +| **Communication** | [communication.go](file:///Users/neeraj/CodeSection/JM/JMDN-FastSync/core/protocol/communication/communication.go): `Communicator` interface; all outbound `NewStream` + encode + read response | +| **Router** | `Datarouter`: `HandleX(ctx, req, remote) -> resp` — business logic without raw I/O | + + +## Profiling motivation: JSON as bottleneck + +Profiling identified **JSON parsing as a significant bottleneck** on the Sequencer-related paths. 
That strengthens the case for **protobuf on the wire** for high-frequency and large payloads: smaller frames, faster unmarshal than `encoding/json` on hot loops, and **schemas owned in one place** (`proto/`), which improves change safety and review (field numbers, deprecations, oneof for versioned envelopes). + +**Suggested follow-up before coding protos**: capture one profile artifact (pprof CPU + a short list of top frames involving `json.Unmarshal` / `DeferenceMessage` / stream read paths) so migration priorities stay evidence-based. + +## Current jmdn reality (why it feels “clumsy”) + +- **Wire format**: JSON + `config.Delimiter` (`0x1E`), not protobuf. Framing is reimplemented in many places via `bufio.Reader` + `ReadString(Delimiter)` and `Write(... + delimiter)` (e.g. [Sequencer/Consensus.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Consensus.go), [AVC/BuddyNodes/MessagePassing/MessageListener.go](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/MessageListener.go), [ListenerHandler.go](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/ListenerHandler.go)). Every path pays JSON parse cost. +- **Two different `HandleSubmitMessageStream` implementations** (~848 lines in `MessageListener.go` vs ~1900+ in `ListenerHandler.go`): + - [Streaming.go `NewListenerNode](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/Streaming.go)` registers `**StructListener.HandleSubmitMessageStream`**. + - [node/node.go](file:///Users/neeraj/CodeSection/JM/jmdn/node/node.go) registers `**ListenerHandler.HandleSubmitMessageStream`** on the same `SubmitMessageProtocol` when `ForListner` is missing or for alternate startup paths. + This splits behavior and makes debugging “which handler ran?” harder. 
+- **Routing**: Giant `switch` on `message.GetACK().GetStage()` scattered across those files; [Sequencer/Router/Router.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Router/Router.go) is **only** PubSub verification, not stream dispatch — naming collides with the FastSync “router” idea. +- **Sequencer package**: [Consensus.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Consensus.go) is **~2061 lines**; [Communication.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Communication.go) mixes subscription ACK correlation (`ResponseHandler`), `AskForSubscription`, and verification helpers. +- **Cross-package coupling**: `ListenerHandler` imports `gossipnode/Sequencer/Triggers/Maps` for vote maps — AVC depends on Sequencer for globals. + +```mermaid +flowchart LR + subgraph today [Current mess] + A[Consensus.go] --> B[raw stream writes] + C[MessageListener] --> D[duplicate read/parse/switch] + E[ListenerHandler] --> D + B --> F[Delimiter everywhere] + D --> F + D --> J[JSON parse hot path] + end +``` + + + +## Target architecture (phased: structure first, then proto end state) + +**Near term:** Introduce FastSync-shaped layers with **JSON framing centralized** in `transport` so refactors are mechanical and profiling stays comparable. + +**End state (aligned with profiling):** **Length-delimited protobuf** framing (same pattern as [JMDN-FastSync `pbstream](file:///Users/neeraj/CodeSection/JM/JMDN-FastSync/internal/pbstream/pbstream.go)`), **single `proto/` package** for Sequencer stream messages (and any shared envelopes), `**Communicator`** using only generated types + `ReadDelimited`/`WriteDelimited`. JSON delimiter path either removed or reserved for admin/legacy only. 
+ + +| New package / area | Responsibility | +| ------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `**Sequencer/transport`** | First: JSON frame helpers; later: **only** protobuf length-delimited read/write (or thin wrapper around shared `internal/pbstream`-style package). | +| `**Sequencer/protocol/communication`** | `Communicator`: all outbound streams; encode/decode via proto types on hot paths. | +| `**Sequencer/protocol/router`** | `StreamRouter`: dispatch by message type / oneof envelope — **no** per-handler `json.Unmarshal` of ad-hoc maps. | +| `**proto/`** (e.g. `proto/sequencer/v1/…`) | **Single source of truth** for wire types: subscription, vote result, BFT request/result envelopes, optional `StreamMessage` oneof for extensibility (like FastSync heartbeats). | +| `**AVC/BuddyNodes/MessagePassing`** | Thin handlers + unified inbound path calling `StreamRouter`. | + + +**Relevant data formats to migrate (inventory in `proto-inventory` todo):** + +- Stream payloads today built around `[config/PubSubMessages.Message](file:///Users/neeraj/CodeSection/JM/jmdn/config/PubSubMessages/Pubsub.go)` / `ACK` + `Stage` — replace with a **versioned envelope** proto (e.g. `SequencerStreamMessage` with `oneof payload` or `stage` enum + typed sub-messages). +- BFT request JSON in [ListenerHandler `handleBFTRequest](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/ListenerHandler.go)` — align with existing [AVC/BFT/proto/bft.proto](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BFT/proto/bft.proto) where possible, or nest under the new envelope to avoid two competing schemas. +- Vote result / subscription / verification messages that are unmarshaled on every stream — prioritize whatever the profiler ranks highest. 
+ +**Network compatibility:** New `**protocol.ID`** for proto-backed streams (e.g. extend [config/constants.go](file:///Users/neeraj/CodeSection/JM/jmdn/config/constants.go) — `BFTConsensusProtocol` is currently unused and could be repurposed or superseded by an explicit `SequencerStreamProtocolV2`) **or** negotiate version in first frame; keep old `SubmitMessageProtocol` + JSON until fleet upgrades. + +```mermaid +flowchart TB + SetHandler[SetStreamHandler proto protocol] + SetHandler --> Thin[Thin handler: ReadDelimited] + Thin --> R[StreamRouter Dispatch typed proto] + R --> H1[Subscription handler] + R --> H2[BFT handler] + R --> H3[Vote handlers] + SeqOut[Sequencer Communicator] + SeqOut --> PB[pbstream WriteDelimited] + PB --> Peer[Remote peer] +``` + + + +## Phased implementation plan + +### Phase 1 — Framing and outbound communication (low risk) + +1. Add `**Sequencer/transport**` with `ReadDelimitedMessage` / `WriteDelimitedMessage` operating on `io.Reader`/`io.Writer` and existing JSON types in `[config/PubSubMessages](file:///Users/neeraj/CodeSection/JM/jmdn/config/PubSubMessages)`. +2. Implement `**Sequencer/protocol/communication**` wrapping: + - Calls currently in [Sequencer/Consensus.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Consensus.go) (`requestVoteResultFromBuddy`, stream write/read around lines ~1595–1722). + - Paths in [Sequencer/Triggers/Triggers.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Triggers/Triggers.go) that open streams and append delimiter. + - Subscription sends already funneled through `StructListener.SendMessageToPeer` in [MessageListener.go](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/MessageListener.go) — either move client side into `Communicator` or add transport helpers there first, then consolidate. +3. 
Replace ad-hoc `fmt.Printf` debug in [Communication.go `ResponseHandler](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Communication.go)` with structured logger (optional but improves “debuggability”). + +**Exit criteria**: No new `ReadString(config.Delimiter)` in Sequencer except inside `transport`. + +### Phase 2 — Inbound router and handler extraction + +1. Introduce `**StreamRouter`** in `Sequencer/protocol/router` (or `Sequencer/inbound`) with a **registry** `map[string]StageHandler` keyed by `config.Type_*` / `ACK.Stage`. +2. Move bodies out of the giant switches in `MessageListener` / `ListenerHandler` into `**stage_*.go`** files under `MessagePassing` or under `Sequencer/protocol/handlers` with explicit dependencies (host, listener node, response handler). +3. `**ListenerHandler` stays the place for BFT/vote state** initially; only the **dispatch** and **I/O** become uniform. + +**Exit criteria**: Single dispatch path for `SubmitMessageProtocol` inbound messages; switches reduced to router registration. + +### Phase 3 — Unify duplicate `HandleSubmitMessageStream` + +1. Compare behavior of `[StructListener.HandleSubmitMessageStream](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/MessageListener.go)` vs `[ListenerHandler.HandleSubmitMessageStream](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/ListenerHandler.go)` (case coverage: `Type_AskForSubscription`, `Type_BFTRequest`, `Type_VoteResult`, legacy flags, etc.). +2. Pick **one** implementation path: + - Either always construct `ListenerHandler` inside `StructListener` and delegate, or + - Merge into one function that uses `StreamRouter`. +3. Align [node/node.go](file:///Users/neeraj/CodeSection/JM/jmdn/node/node.go) and [Streaming.go `NewListenerNode](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/Streaming.go)` so **the same handler** is registered (avoid divergent production behavior). 
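For the option of merging into one function that uses `StreamRouter`, the stage-keyed registry from Phase 2 might be sketched roughly as follows. This is only an illustration under assumptions: the `StageHandler` signature, the `ErrUnknownStage` error, the `runDemo` helper, and the stage strings used as keys are hypothetical and not taken from the jmdn code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// StageHandler is a hypothetical handler signature, loosely following the
// FastSync HandleX(ctx, req, remote) -> resp shape described in the plan.
type StageHandler func(ctx context.Context, payload []byte, remote string) ([]byte, error)

// ErrUnknownStage (hypothetical) is returned when no handler is registered.
var ErrUnknownStage = errors.New("no handler registered for stage")

// StreamRouter maps a stage name (for example the ACK.Stage string) to a handler.
type StreamRouter struct {
	handlers map[string]StageHandler
}

func NewStreamRouter() *StreamRouter {
	return &StreamRouter{handlers: make(map[string]StageHandler)}
}

// Register adds a handler and rejects duplicates, so two code paths cannot
// silently claim the same stage.
func (r *StreamRouter) Register(stage string, h StageHandler) error {
	if _, exists := r.handlers[stage]; exists {
		return fmt.Errorf("stage %q already registered", stage)
	}
	r.handlers[stage] = h
	return nil
}

// Dispatch looks up the handler for stage and runs it.
func (r *StreamRouter) Dispatch(ctx context.Context, stage string, payload []byte, remote string) ([]byte, error) {
	h, ok := r.handlers[stage]
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrUnknownStage, stage)
	}
	return h(ctx, payload, remote)
}

// runDemo registers one placeholder handler and exercises both outcomes:
// a known stage and an unregistered one.
func runDemo() (reply string, unknownErr error) {
	r := NewStreamRouter()
	_ = r.Register("Type_VoteResult", func(_ context.Context, p []byte, remote string) ([]byte, error) {
		return []byte("ack from " + remote + ": " + string(p)), nil
	})
	out, err := r.Dispatch(context.Background(), "Type_VoteResult", []byte("1"), "peerA")
	if err != nil {
		return "", err
	}
	_, unknownErr = r.Dispatch(context.Background(), "Type_Missing", nil, "peerA")
	return string(out), unknownErr
}

func main() {
	reply, err := runDemo()
	fmt.Println(reply)
	fmt.Println(err)
}
```

The map is not guarded by a mutex here, on the assumption that all registration happens at startup before any stream is served; if handlers can be added at runtime, a lock would be needed.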
+ +**Exit criteria**: One primary inbound implementation; second path removed or thin wrapper. + +### Phase 4 — Split `Consensus.go` and clarify `Sequencer/Router` + +1. Split [Consensus.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Consensus.go) by concern (mirroring FastSync “phase” files): e.g. `start.go` (orchestration), `subscription.go`, `event_flow.go`, `votes.go`, `bls.go`, `broadcast.go` — **pure move**, no logic change first. +2. Rename or namespace `**Sequencer/Router`** to something like `**verification`** or `**pubsub_verify`** to avoid confusion with the new **stream** router. + +### Phase 5 — Protobuf migration (performance + maintainability) + +**Goal:** Remove JSON parse from hot Sequencer/stream paths; **centralize contracts in `proto/`** so changes are explicit and reviewable. + +1. **Inventory (`proto-inventory`)**: From profiler + code search, list message types and call sites: stream `Message`/`ACK`/`Stage`, BFT JSON blobs, vote result payloads, PubSub gossip wrappers if they show up in top frames. +2. **Schema (`proto-schema`)**: Add package under repo `proto/` (follow existing [proto/](file:///Users/neeraj/CodeSection/JM/jmdn/proto) layout). Prefer a **single top-level envelope** with `oneof` for stage-specific payloads to mirror one router dispatch. Reuse or wrap [AVC/BFT/proto/bft.proto](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BFT/proto/bft.proto) to avoid duplicate BFT shapes. +3. **Framing (`proto-framing`)**: Implement or vendor **length-delimited** read/write (identical idea to FastSync `pbstream`). Register **new protocol ID** or version negotiation; document in config/constants. +4. **Rollout (`proto-migrate-rollout`)**: Implement dual stack if needed (read proto OR JSON for one release), then default proto and deprecate JSON on that protocol. Update `Communicator` and `StreamRouter` to use generated types only on migrated paths. 
+ +**Exit criteria**: Profiler shows JSON unmarshaling no longer in top CPU for Sequencer round-trip; all new feature work touches **proto + generated Go** first. + +### Phase 6 (optional) — Decouple `Sequencer/Triggers/Maps` from AVC + +- Move vote-result map to `**config/PubSubMessages`** or a small `**consensus/state`** package, or inject an interface into handlers so `AVC` does not import `Sequencer` for globals. + +## Risk and testing notes + +- **Regression risk** is highest in Phase 3 (two handlers) and any change to delimiter framing; **proto rollout** adds network compatibility risk — mitigate with new protocol ID + dual-read window. +- Add **table-driven tests** for `transport` (round-trip JSON, then round-trip proto), and **integration tests** for one full subscription + one vote-result round-trip if your environment allows. +- Re-profile after proto migration to confirm JSON bottleneck is gone. +- Run `**make build`** / `**make lint`** per repo norms; full `make test` may need ImmuDB per [CLAUDE.md](file:///Users/neeraj/CodeSection/JM/jmdn/CLAUDE.md). 
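A starting point for the `transport` round-trip test mentioned in the notes above might look like the sketch below. The function names follow the Phase 1 wording, but their signatures, the `Envelope` stand-in for the real `PubSubMessages` types, and the `roundTrip` helper are assumptions made for illustration. The delimiter value `0x1E` is the one this plan attributes to `config.Delimiter`.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// Delimiter mirrors the 0x1E record separator the plan attributes to config.Delimiter.
const Delimiter byte = 0x1E

// Envelope is a hypothetical stand-in for the real PubSubMessages message type.
type Envelope struct {
	Stage   string `json:"stage"`
	Payload string `json:"payload"`
}

// WriteDelimitedMessage JSON-encodes v and writes it followed by the delimiter.
func WriteDelimitedMessage(w io.Writer, v any) error {
	b, err := json.Marshal(v)
	if err != nil {
		return fmt.Errorf("marshal frame: %w", err)
	}
	_, err = w.Write(append(b, Delimiter))
	return err
}

// ReadDelimitedMessage reads up to the next delimiter and JSON-decodes into v.
// It takes a *bufio.Reader so buffered bytes survive between frames.
func ReadDelimitedMessage(r *bufio.Reader, v any) error {
	raw, err := r.ReadBytes(Delimiter)
	if err != nil {
		return fmt.Errorf("read frame: %w", err)
	}
	return json.Unmarshal(raw[:len(raw)-1], v) // strip the trailing delimiter
}

// roundTrip writes every envelope into one buffer, then reads them all back.
func roundTrip(in []Envelope) ([]Envelope, error) {
	var buf bytes.Buffer
	for _, e := range in {
		if err := WriteDelimitedMessage(&buf, e); err != nil {
			return nil, err
		}
	}
	r := bufio.NewReader(&buf)
	out := make([]Envelope, 0, len(in))
	for range in {
		var e Envelope
		if err := ReadDelimitedMessage(r, &e); err != nil {
			return nil, err
		}
		out = append(out, e)
	}
	return out, nil
}

func main() {
	cases := []Envelope{
		{Stage: "Type_AskForSubscription", Payload: "hello"},
		{Stage: "Type_VoteResult", Payload: "contains \x1e inside"},
		{Stage: "", Payload: ""},
	}
	got, err := roundTrip(cases)
	if err != nil {
		fmt.Println("round-trip failed:", err)
		return
	}
	for i := range cases {
		if got[i] != cases[i] {
			fmt.Printf("case %d mismatch: %+v vs %+v\n", i, got[i], cases[i])
			return
		}
	}
	fmt.Println("round-trip ok")
}
```

The second case is worth keeping in a real test table: `encoding/json` escapes control characters inside strings, so a payload that contains `0x1E` does not split the frame.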
+ +## Key files to touch (summary) + + +| Area | Files | +| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Framing + comm | New under `Sequencer/transport`, `Sequencer/protocol/communication`; refactor [Sequencer/Consensus.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Consensus.go), [Sequencer/Triggers/Triggers.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Triggers/Triggers.go) | +| Inbound router | New `Sequencer/protocol/router` (stream dispatch); slim [MessageListener.go](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/MessageListener.go), [ListenerHandler.go](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/ListenerHandler.go) | +| Handler registration | [Streaming.go](file:///Users/neeraj/CodeSection/JM/jmdn/AVC/BuddyNodes/MessagePassing/Streaming.go), [node/node.go](file:///Users/neeraj/CodeSection/JM/jmdn/node/node.go) | +| Naming | [Sequencer/Router/Router.go](file:///Users/neeraj/CodeSection/JM/jmdn/Sequencer/Router/Router.go) (verification) vs new stream router | +| Proto | New/updated under `proto/`, generated `*.pb.go`, [config/constants.go](file:///Users/neeraj/CodeSection/JM/jmdn/config/constants.go) for protocol IDs, Makefile or buf for codegen | + + diff --git a/.gitignore b/.gitignore index 749fe652..6683a673 100644 --- a/.gitignore +++ b/.gitignore @@ -46,4 +46,25 @@ __debug_bin vendor/ # Internal team references -docs/SONARQUBE_SETUP_GUIDE.md \ No newline at end of file +docs/SONARQUBE_SETUP_GUIDE.md +ADR-001-JMDT-Native-EVM-Smart-Contracts.docx +jmdn.yaml +contract_storage_pebble/000004.log +*.log +*.logs +*.db +*.db-wal +contract_storage_pebble/CURRENT +contract_storage_pebble/LOCK +contract_storage_pebble/MANIFEST-000001 
+contract_storage_pebble/MANIFEST-000005 +contract_storage_pebble/* +SmartContract/artifacts/* +SmartContract/artifacts/HelloWorld.json +/SmartContract/artifacts +SmartContract/artifacts/HelloWorld.json +docs/refactor-contractDBOps.md +internal/WAL/.tmp/* +.claude/* +.code-review-graph/* +.cursor/* \ No newline at end of file diff --git a/AVC/BuddyNodes/MessagePassing/ListenerHandler.go b/AVC/BuddyNodes/MessagePassing/ListenerHandler.go index b99db208..2783310c 100644 --- a/AVC/BuddyNodes/MessagePassing/ListenerHandler.go +++ b/AVC/BuddyNodes/MessagePassing/ListenerHandler.go @@ -1110,6 +1110,13 @@ func (lh *ListenerHandler) handleSubmitVote(logger_ctx context.Context, s networ ion.String("topic", TOPIC), ion.String("function", "MessagePassing.handleSubmitVote")) + // Notify the sequencer's vote collector (if this node IS the sequencer) + NotifyVoteCollector(AVCStruct.VoteNotification{ + PeerID: remotePeer.String(), + BlockHash: blockHash, + Vote: int8(voteValue), + }) + // Now publish the vote to pubsub so ALL other buddy nodes can receive it if pubSubNode != nil && pubSubNode.PubSub != nil { logger().NamedLogger.Info(voteSpanCtx, "Republishing vote to pubsub for all buddy nodes", @@ -1904,7 +1911,11 @@ func (lh *ListenerHandler) TriggerForBFTFromSequencer(s network.Stream, message if err := json.Unmarshal([]byte(msg.Message), &resultData); err == nil { if result, ok := resultData["result"].(float64); ok { voteResult := int8(result) - Maps.StoreVoteResult(buddyID.String(), voteResult) + resultBlockHash := "" + if bh, ok := resultData["block_hash"].(string); ok { + resultBlockHash = bh + } + Maps.StoreVoteResult(resultBlockHash, buddyID.String(), voteResult) fmt.Printf("✅ Stored vote result for peer %s: %d\n", buddyID.String(), voteResult) responsesMutex.Lock() responsesReceived++ diff --git a/AVC/BuddyNodes/MessagePassing/vote_collector.go b/AVC/BuddyNodes/MessagePassing/vote_collector.go new file mode 100644 index 00000000..e2cea314 --- /dev/null +++ 
b/AVC/BuddyNodes/MessagePassing/vote_collector.go @@ -0,0 +1,53 @@ +package MessagePassing + +import ( + "log" + "sync" + + PubSubMessages "gossipnode/config/PubSubMessages" +) + +// activeVoteCollector is the channel the sequencer registers to receive +// vote notifications pushed from handleSubmitVote. Only one consensus +// round is active per node at a time, so a single global channel is safe. +var ( + activeVoteCollector chan<- PubSubMessages.VoteNotification + voteCollectorMu sync.RWMutex +) + +// RegisterVoteCollector sets the active vote notification channel. +// The sequencer calls this at the start of its event-driven vote collection loop. +func RegisterVoteCollector(ch chan<- PubSubMessages.VoteNotification) { + voteCollectorMu.Lock() + activeVoteCollector = ch + voteCollectorMu.Unlock() +} + +// UnregisterVoteCollector nils out the active collector. +// Called via defer when the consensus round completes or times out. +func UnregisterVoteCollector() { + voteCollectorMu.Lock() + activeVoteCollector = nil + voteCollectorMu.Unlock() +} + +// NotifyVoteCollector sends a vote notification to the active collector +// (if one is registered). The send is non-blocking; if the channel is full +// or no collector is registered, the notification is dropped with a log warning. +func NotifyVoteCollector(notification PubSubMessages.VoteNotification) { + voteCollectorMu.RLock() + collector := activeVoteCollector + voteCollectorMu.RUnlock() + + if collector == nil { + return + } + + select { + case collector <- notification: + log.Printf("VoteCollector: notified sequencer of vote from %s for block %s (vote=%d)", + notification.PeerID, notification.BlockHash, notification.Vote) + default: + log.Printf("VoteCollector: channel full, dropping vote notification from %s", notification.PeerID) + } +} diff --git a/CLAUDE.md b/CLAUDE.md index e75f1fda..3c120a95 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -179,3 +179,42 @@ Proto definitions live in `proto/`. 
The gRPC services are: ## Linter Notes Active linters: `govet`, `ineffassign`, `unused`, `nolintlint`. `staticcheck`, `errcheck`, and `gosec` are disabled pending backlog cleanup — do not re-enable them in a PR without addressing existing violations first. + + +## MCP Tools: code-review-graph + +**IMPORTANT: This project has a knowledge graph. ALWAYS use the +code-review-graph MCP tools BEFORE using Grep/Glob/Read to explore +the codebase.** The graph is faster, cheaper (fewer tokens), and gives +you structural context (callers, dependents, test coverage) that file +scanning cannot. + +### When to use graph tools FIRST + +- **Exploring code**: `semantic_search_nodes` or `query_graph` instead of Grep +- **Understanding impact**: `get_impact_radius` instead of manually tracing imports +- **Code review**: `detect_changes` + `get_review_context` instead of reading entire files +- **Finding relationships**: `query_graph` with callers_of/callees_of/imports_of/tests_for +- **Architecture questions**: `get_architecture_overview` + `list_communities` + +Fall back to Grep/Glob/Read **only** when the graph doesn't cover what you need. + +### Key Tools + +| Tool | Use when | +|------|----------| +| `detect_changes` | Reviewing code changes — gives risk-scored analysis | +| `get_review_context` | Need source snippets for review — token-efficient | +| `get_impact_radius` | Understanding blast radius of a change | +| `get_affected_flows` | Finding which execution paths are impacted | +| `query_graph` | Tracing callers, callees, imports, tests, dependencies | +| `semantic_search_nodes` | Finding functions/classes by name or keyword | +| `get_architecture_overview` | Understanding high-level codebase structure | +| `refactor_tool` | Planning renames, finding dead code | + +### Workflow + +1. The graph auto-updates on file changes (via hooks). +2. Use `detect_changes` for code review. +3. Use `get_affected_flows` to understand impact. +4. 
Use `query_graph` pattern="tests_for" to check coverage. diff --git a/CLAUDE_CONSENSUS.md b/CLAUDE_CONSENSUS.md new file mode 100644 index 00000000..44c8950e --- /dev/null +++ b/CLAUDE_CONSENSUS.md @@ -0,0 +1,439 @@ +# CLAUDE_CONSENSUS.md + +This file is a full context document for Claude Code. Reading this file gives complete understanding of: +1. The consensus system overhaul already implemented (branch: `fix/Fastsync`) +2. The startup sync feature already implemented +3. The planned Sequencer FastSync-style refactor (6 phases, not yet implemented) + +--- + +## Part 1: Consensus System — What Was Changed and Why + +### Intended Vote Architecture (Important — Do Not Misread This) + +The vote routing uses **deliberate scattering via consistent hashing**, not direct-to-sequencer voting. Understanding this is critical before reading the bug list. + +**Flow:** +1. Sequencer/block publisher broadcasts the block to buddy nodes AND all nodes in the network +2. Each voting node hashes its own peer ID via `PickListnerWithOffset` to pick exactly **1 out of N buddy nodes** to send its vote to (deterministic — same node always picks the same buddy) +3. Votes are scattered across the N buddy nodes (~1/N of all votes per buddy) +4. Each buddy node aggregates the votes it received +5. Buddy nodes push the combined result to the sequencer +6. Sequencer converges on a final result from N buddy reports + +**Why this design:** +- **Security**: votes are spread across N nodes — an attacker must compromise all N buddy nodes to intercept all votes, not just one +- **Network efficiency**: fan-in is all_nodes/N per buddy, then N→sequencer; far less congestion than all_nodes→sequencer directly +- **Compute**: sequencer processes N aggregated results, not all_nodes raw votes + +`PickListnerWithOffset` is intentional and correct. Do not treat it as a bug. + +--- + +### The Problems (Before This Branch) + +The original consensus system had these bugs: + +1. 
**Vote trigger sent to all peers, not just committee**: `BroadcastVoteTrigger` was broadcasting to all `h.Network().Peers()`. Non-committee nodes (not buddy nodes) received the trigger and attempted to vote, causing unnecessary network traffic and compute. The trigger should only go to the committee (the N buddy nodes). + +2. **Hard-coded sleep**: `Consensus.go` used `time.Sleep(15 * time.Second)` to wait for votes. No event-driven signaling. If votes arrived early, the sequencer waited for no reason. If votes arrived late, they were missed. + +3. **Two competing vote collection paths**: `Sequencer/Triggers/Triggers.go` had `ListeningTrigger` / `BFTTrigger` / `StartBFTConsensus` with their own timers, while `Consensus.go` had a separate pull-based collection. They conflicted. + +4. **Vote results stored without block hash scope**: `Maps.StoreVoteResult` used `map[string]int8` (peerID→vote). Cross-round contamination was possible if two consensus rounds overlapped. + +### What Was Changed + +#### `config/PubSubMessages/Consensus.go` +Added two new fields to `ConsensusMessage`: +```go +SequencerID string // peer.ID of the sequencer running this round +RoundID string // == blockHash, scopes this round uniquely +``` + +#### `config/PubSubMessages/Consensus_Builder.go` +Added `SetSequencerID`/`GetSequencerID`, `SetRoundID`/`GetRoundID` getters/setters. Updated `NewConsensusMessageBuilder` to copy both fields. + +#### `config/PubSubMessages/vote_notification.go` (NEW FILE) +```go +type VoteNotification struct { + PeerID string + BlockHash string + Vote int8 +} +``` +Used to push vote events from the AVC listener into the sequencer's vote collection loop. 
+ +#### `AVC/BuddyNodes/MessagePassing/vote_collector.go` (NEW FILE) +```go +var activeVoteCollector chan<- PubSubMessages.VoteNotification +var voteCollectorMu sync.RWMutex + +func RegisterVoteCollector(ch chan<- PubSubMessages.VoteNotification) +func UnregisterVoteCollector() +func NotifyVoteCollector(notification PubSubMessages.VoteNotification) +``` +The sequencer registers a channel before each consensus round. When a vote arrives in `ListenerHandler.handleSubmitVote`, it calls `NotifyVoteCollector`, which pushes to that channel with a non-blocking send: the notification is dropped if the channel is full or no collector is registered. + +#### `AVC/BuddyNodes/MessagePassing/ListenerHandler.go` +After a vote is successfully stored in the CRDT (inside `handleSubmitVote`): +```go +NotifyVoteCollector(AVCStruct.VoteNotification{ +    PeerID:    remotePeer.String(), +    BlockHash: blockHash, +    Vote:      int8(voteValue), +}) +``` +Also fixed: `Maps.StoreVoteResult` calls were updated to pass `blockHash` as the first parameter (after the Maps API changed to be block-hash scoped). + +#### `Sequencer/Triggers/Maps/vote_results.go` +Changed from `map[string]int8` to `map[string]map[string]int8` (blockHash → peerID → vote): +```go +var voteResults = make(map[string]map[string]int8) + +func StoreVoteResult(blockHash string, peerID string, vote int8) +func GetVoteResultsCount(blockHash string) int +func GetAllVoteResults(blockHash string) map[string]int8 +func ClearVoteResultsForBlock(blockHash string) +``` +All 5 call sites updated to pass `blockHash`. + +#### `Sequencer/consensus_statemachine.go` +Added to `Consensus` struct: +```go +voteNotifyCh chan PubSubMessages.VoteNotification +roundCtx     context.Context +roundCancel  context.CancelFunc +``` +`roundCancel()` is called in `CleanupSubscriptions()`. 
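To make the block-hash scoping of `Sequencer/Triggers/Maps/vote_results.go` concrete, here is a minimal sketch of how the four listed functions could be implemented. The function names and the map type come from the listing above; the mutex, the copy-on-read in `GetAllVoteResults`, and the function bodies are assumptions, not the actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	// voteResultsMu is an assumed guard; the listing above does not show locking.
	voteResultsMu sync.RWMutex
	// voteResults maps blockHash -> peerID -> vote.
	voteResults = make(map[string]map[string]int8)
)

// StoreVoteResult records a vote under its block hash, creating the inner map on first use.
func StoreVoteResult(blockHash string, peerID string, vote int8) {
	voteResultsMu.Lock()
	defer voteResultsMu.Unlock()
	round, ok := voteResults[blockHash]
	if !ok {
		round = make(map[string]int8)
		voteResults[blockHash] = round
	}
	round[peerID] = vote
}

// GetVoteResultsCount returns how many peers have reported for this block.
func GetVoteResultsCount(blockHash string) int {
	voteResultsMu.RLock()
	defer voteResultsMu.RUnlock()
	return len(voteResults[blockHash])
}

// GetAllVoteResults returns a copy, so callers cannot mutate shared state.
func GetAllVoteResults(blockHash string) map[string]int8 {
	voteResultsMu.RLock()
	defer voteResultsMu.RUnlock()
	out := make(map[string]int8, len(voteResults[blockHash]))
	for peerID, vote := range voteResults[blockHash] {
		out[peerID] = vote
	}
	return out
}

// ClearVoteResultsForBlock drops one round without touching any other round.
func ClearVoteResultsForBlock(blockHash string) {
	voteResultsMu.Lock()
	defer voteResultsMu.Unlock()
	delete(voteResults, blockHash)
}

func main() {
	// The same peer votes in two overlapping rounds; the results stay separate.
	StoreVoteResult("0xaaa", "peerA", 1)
	StoreVoteResult("0xbbb", "peerA", -1)
	fmt.Println(GetVoteResultsCount("0xaaa"), GetAllVoteResults("0xbbb")["peerA"])
	ClearVoteResultsForBlock("0xaaa")
	fmt.Println(GetVoteResultsCount("0xaaa"), GetVoteResultsCount("0xbbb"))
}
```

With the old flat `map[string]int8`, the second `StoreVoteResult` call would have overwritten the first, which is the cross-round contamination listed as problem 4.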
+ +#### `Sequencer/Consensus.go` — Key Changes + +**SequencerID embedded in broadcast:** +```go +// After SetZKBlockData: +consensus.ZKBlockData.SetSequencerID(consensus.Host.ID().String()) +consensus.ZKBlockData.SetRoundID(zkblock.BlockHash.Hex()) +``` + +**Event-driven vote collection (replaced time.Sleep):** +```go +roundCtx, roundCancel := context.WithTimeout(trace_ctx, config.ConsensusTimeout) +consensus.roundCtx = roundCtx +consensus.roundCancel = roundCancel + +voteNotifyCh := make(chan PubSubMessages.VoteNotification, config.MaxMainPeers) +consensus.voteNotifyCh = voteNotifyCh +MessagePassing.RegisterVoteCollector(voteNotifyCh) +defer MessagePassing.UnregisterVoteCollector() + +for { + select { + case notification := <-voteNotifyCh: + // store notification, check if enough votes collected + if enoughVotes { goto VOTES_COLLECTED } + case <-roundCtx.Done(): + goto VOTES_COLLECTED + } +} +VOTES_COLLECTED: +// CollectVoteResultsFromBuddies → VerifyConsensusWithBLS → BroadcastAndProcessBlock +``` + +**Targeted vote trigger (committee only):** +```go +messaging.BroadcastVoteTriggerToCommittee(consensus.Host, consensus.ZKBlockData, consensus.PeerList.MainPeers) +``` + +**`isCommitteeMember` helper added** (package-level function before `VerifySubscriptions`): +```go +func isCommitteeMember(peerIDStr string, mainPeers []peer.ID) bool +``` + +**`Maps.StoreVoteResult` call fixed** to pass `blockHash` as first arg. + +#### `Vote/Trigger.go` +`SubmitVote()` still uses `PickListnerWithOffset` to route votes from voting nodes to a buddy node (the correct, intentional design). The `SequencerID` field is used by **buddy nodes** to know which peer to report their aggregated result back to — not by voting nodes to bypass the buddy scatter. Falls back gracefully if `SequencerID` is empty. 
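The deterministic 1-of-N routing attributed to `PickListnerWithOffset` can be illustrated with a small sketch. The real function's signature and hashing scheme are not shown in this document, so everything here is an assumption: `pickBuddy`, the FNV-1a hash, and the meaning given to the offset are hypothetical. The sketch also uses plain hash-mod-N selection, which is deterministic but is not a full consistent-hash ring.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// pickBuddy (hypothetical) maps a voter's peer ID to exactly one buddy node.
// The same voterID, buddy list, and offset always yield the same buddy.
func pickBuddy(voterID string, buddies []string, offset uint64) (string, error) {
	if len(buddies) == 0 {
		return "", errors.New("no buddy nodes available")
	}
	h := fnv.New64a()
	h.Write([]byte(voterID)) // hash.Hash writes never return an error
	idx := (h.Sum64() + offset) % uint64(len(buddies))
	return buddies[idx], nil
}

func main() {
	buddies := []string{"buddy-0", "buddy-1", "buddy-2"}
	for _, voter := range []string{"voter-a", "voter-b", "voter-c", "voter-d"} {
		b, err := pickBuddy(voter, buddies, 0)
		if err != nil {
			fmt.Println("pick failed:", err)
			return
		}
		fmt.Printf("%s -> %s\n", voter, b)
	}
}
```

One consequence of mod-N selection is that changing the size of the buddy list remaps most voters, so a sketch like this only holds if every node sees the same committee list for a given round.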
+ +#### `messaging/broadcast.go` +Added: +```go +func BroadcastVoteTriggerToCommittee(h host.Host, consensusMessage *PubSubMessages.ConsensusMessage, committeePeers []peer.ID) error +``` +Same as `BroadcastVoteTrigger` but sends only to `committeePeers` instead of all `h.Network().Peers()`. + +#### `Sequencer/Triggers/Triggers.go` +Updated all `Maps.StoreVoteResult`, `GetVoteResultsCount`, `GetAllVoteResults` calls to pass `blockHash` as the first argument. + +### Dead Code (Should Eventually Be Removed) +The timer-based path in `Sequencer/Triggers/Triggers.go` — `ListeningTrigger`, `BFTTrigger`, `StartBFTConsensus` — is now superseded by the event-driven path in `Consensus.go`. It's inert but still compiles. Do not re-enable it. + +--- + +## Part 2: Startup Sync — What Was Changed and Why + +### The Problem +When a node restarts, it may be behind the network. Previously, syncing only happened when manually triggered via `fastsyncv2 ` CLI command. + +### Implementation + +#### `FastsyncV2/fastsyncv2.go` +- Old `HandleSync(targetPeer string) error` body extracted into `handleSyncInternal(targetPeer string, startBlock uint64) error` +- `HandleSync` becomes: `return fs.handleSyncInternal(targetPeer, 0)` (preserves existing CLI behavior) +- `PriorSync` call in `handleSyncInternal` uses `startBlock` instead of hardcoded `0`: + ```go + fs.PriorRouter.PriorSync(startBlock, localBlockNum, startBlock, math.MaxUint64, targetNodeInfo, availResp.Auth) + ``` +- New method: + ```go + func (fs *FastsyncV2) HandleStartupSync(peerID peer.ID, addrs []multiaddr.Multiaddr) error { + targetMultiaddr := fmt.Sprintf("%s/p2p/%s", addrs[0].String(), peerID.String()) + localBlockNum := fs.blockInfoAdapter.GetBlockDetails().Blocknumber + startBlock := localBlockNum // 0 if fresh node → full sync + return fs.handleSyncInternal(targetMultiaddr, startBlock) + } + ``` + **Important**: multiaddr string must be built as `addrs[0].String() + "/p2p/" + peerID.String()` — the protocol functions 
require a full multiaddr, even for already-connected peers. + +#### `main.go` +After `fastSyncerV2 = initFastsyncV2(n)`, a background goroutine: +```go +if fastSyncerV2 != nil { + goMaybeTracked(MainLM, GRO.MainAM, GRO.MainLM, GRO.StartupSyncThread, func(ctx context.Context) error { + time.Sleep(5 * time.Second) // let peer connections establish + + peers := n.Host.Network().Peers() + if len(peers) == 0 { + // TODO: Query seed node for available sync peers when no direct peers are connected + log.Info().Msg("[StartupSync] No peers connected, skipping startup sync") + return nil + } + + for _, peerID := range peers { + addrs := n.Host.Peerstore().Addrs(peerID) + if len(addrs) == 0 { continue } + + log.Info().Str("peer", peerID.String()).Msg("[StartupSync] Attempting startup sync") + if err := fastSyncerV2.HandleStartupSync(peerID, addrs); err != nil { + log.Warn().Err(err).Str("peer", peerID.String()).Msg("[StartupSync] Failed, trying next peer") + continue + } + log.Info().Str("peer", peerID.String()).Msg("[StartupSync] Sync completed successfully") + return nil + } + + log.Warn().Msg("[StartupSync] Failed to sync with any connected peer") + return nil + }) +} +``` + +#### `config/GRO/constants.go` +Added: +```go +StartupSyncThread = "thread:startup:sync" +``` + +--- + +## Part 3: Planned Sequencer FastSync-Style Refactor (NOT YET IMPLEMENTED) + +### Why This Refactor Is Needed + +Six specific problems in the Sequencer module: + +1. **Two competing `HandleSubmitMessageStream` implementations** + - `StructListener.HandleSubmitMessageStream` in `AVC/BuddyNodes/MessagePassing/MessageListener.go:34` — stateless, immediately delegates to `ListenerHandler` + - `ListenerHandler.HandleSubmitMessageStream` in `AVC/BuddyNodes/MessagePassing/ListenerHandler.go:67` — stateful (`bftContexts map[string]*BFTContext`, `sequencerPeerID`), has the actual logic + - Both have near-identical switch statements on `ACK.Stage`. `StructListener` is a dead wrapper. + +2. 
**No transport abstraction** + - Current framing: JSON + `0x1E` ASCII delimiter (`config.Delimiter`) + - `bufio.ReadString(config.Delimiter)` scattered across 11+ files + - Write: `stream.Write([]byte(msg + string(rune(config.Delimiter))))` + - No shared framing package + +3. **Bidirectional AVC ↔ Sequencer dependency cycle** + - `Sequencer/Consensus.go` imports `AVC/BuddyNodes/MessagePassing` (and BLS_Signer, BLS_Verifier, Service) + - `AVC/BuddyNodes/MessagePassing/ListenerHandler.go:21` imports `gossipnode/Sequencer/Triggers/Maps` + - Both modules are tightly coupled — hard to test or refactor independently + +4. **`Consensus.go` is a 2,061-line monolith** + Mixes: connectivity checks, PubSub management, subscription negotiation, vote collection orchestration, BLS verification, CRDT synchronization + +5. **Package-level globals in `Sequencer/Triggers/Triggers.go`** + `globalVoteData`, `subscriptionService`, `bftEngine`, `consensusCancel` — not safe for concurrent rounds + +6. **JSON on hot paths** + JSON + `0x1E` on vote submission, BFT requests = identified CPU bottleneck + +### Reference Architecture: JMDN-FastSync + +Located at `/Users/neeraj/CodeSection/JM/JMDN-FastSync/`. + +#### pbstream (`internal/pbstream/pbstream.go`) +```go +func WriteDelimited(w io.Writer, msg proto.Message) error // uvarint len + proto bytes +func ReadDelimited(r io.Reader, msg proto.Message) error // read uvarint len → read bytes → unmarshal +``` +Uses `bufio.NewReader` for efficient variable-length prefix reads. Language-independent. 
+ +#### Communicator interface (`core/protocol/communication/communication.go`) +Abstracts all outbound request-response patterns: +```go +type Communicator interface { + SendPriorSync(ctx, merkle, peer, data) (*PriorSyncMessage, error) + SendMerkleRequest(ctx, peerNode, req) (*MerkleMessage, error) + SendHeaderSyncRequest(ctx, peerNode, req) (*HeaderSyncResponse, error) + SendDataSyncRequest(ctx, peerNode, req) (*DataSyncResponse, error) + SendAvailabilityRequest(ctx, peerNode, req) (*AvailabilityResponse, error) + SendPoTSRequest(ctx, peerNode, req) (*PoTSResponse, error) +} +``` + +#### DataRouter (`core/protocol/router/data_router.go`) +Dispatches by `req.Phase.PresentPhase` constant: +```go +switch state { +case constants.SYNC_REQUEST: Data := router.SYNC_REQUEST(ctx, req.Priorsync, peerNode, remote) +case constants.REQUEST_MERKLE: Data := router.REQUEST_MERKLE(ctx, merkleRange, config, remote) +// ... +} +``` + +#### Thin stream handlers (`core/sync/sync_protocols.go`) +Short ops: +``` +defer str.Close() → SetReadDeadline → ReadDelimited(req) → extract remote peer → router.Handle*(ctx, req, remote) → SetWriteDeadline → WriteDelimited(resp) +``` +Long ops: add heartbeat goroutine on a ticker; cancels compute context if heartbeat write fails. + +### The 6 Phases + +--- + +#### Phase 1: Transport Framing + SequencerCommunicator Interface + +**Create `config/transport/transport.go`** (neutral location, no new dependency cycles): +```go +func WriteMessage(w io.Writer, msg *PubSubMessages.Message) error // JSON marshal + 0x1E +func ReadMessage(r io.Reader) (*PubSubMessages.Message, error) // ReadString(0x1E) + unmarshal +``` + +**Create `Sequencer/protocol/communication/communication.go`**: +```go +type VoteResultResponse struct { PeerID, BlockHash string; Vote int8; BLSSig []byte; ... 
} + +type SequencerCommunicator interface { + AskForSubscription(ctx context.Context, peers []peer.ID, topic string, callbackCh chan<- bool) error + RequestVoteResult(ctx context.Context, peerID peer.ID, consensusMsg *PubSubMessages.ConsensusMessage) (*VoteResultResponse, error) +} + +func New(h host.Host) SequencerCommunicator +``` + +**Modify**: +- `Sequencer/consensus_statemachine.go` — add `communicator SequencerCommunicator` to `Consensus` struct +- `Sequencer/Consensus.go` — replace `requestVoteResultFromBuddy` (~115 lines) + `readVoteResultResponse` with `consensus.communicator.RequestVoteResult(...)`; replace raw framing with `transport.WriteMessage`/`transport.ReadMessage` +- `Sequencer/Communication.go` — replace inline framing in `AskForSubscription` helpers +- `Sequencer/Triggers/Triggers.go` — replace inline framing in `RequestVoteResultsFromBuddies` + +**Wire format stays JSON + 0x1E. AVC files untouched.** + +--- + +#### Phase 2: StreamRouter (Inbound Dispatch) + +Create `Sequencer/protocol/router/stream_router.go`: +- `StreamRouter` type that dispatches on `ACK.Stage` constant +- Each stage maps to a handler method (replacing the giant switch in `HandleSubmitMessageStream`) +- `RegisterHandler(stage string, fn HandlerFunc)` pattern + +Register handlers for: `Type_AskForSubscription`, `Type_SubscriptionResponse`, `Type_SubmitVote`, `Type_VoteResult`, `Type_BFTRequest`, `Type_BFTResult`, `Type_VerifySubscription` + +--- + +#### Phase 3: Unify Duplicate Handlers + +- Delete `StructListener.HandleSubmitMessageStream` entirely (it's a dead wrapper) +- Wire `ListenerHandler.HandleSubmitMessageStream` directly as the protocol's stream handler in `AVC/BuddyNodes/MessagePassing/Streaming.go` +- Optionally delete `StructListener` type if it has no other methods +- Replace remaining raw framing in `ListenerHandler.go` with `transport.ReadMessage`/`transport.WriteMessage` + +--- + +#### Phase 4: Split `Consensus.go` into Phase Files + +Break the 2,061-line 
monolith: +| File | Contents | +|------|----------| +| `Sequencer/consensus_init.go` | `ConnectedNessCheck`, `warmup`, startup helpers | +| `Sequencer/consensus_subscribe.go` | `RequestSubscriptionPermission`, `startEventDrivenFlowAfterSubscriptionPermission`, `VerifySubscriptions` | +| `Sequencer/consensus_vote.go` | `BroadcastVoteTrigger`, `ProcessVoteCollection`, `CollectVoteResultsFromBuddies`, event loop | +| `Sequencer/consensus_verify.go` | `VerifyConsensusWithBLS`, `BroadcastAndProcessBlock` | +| `Sequencer/Consensus.go` | Just `Start()` orchestrating the above + package-level doc comment | + +All in the same `Sequencer` package — pure file split, no interface changes. + +--- + +#### Phase 5: Protobuf Migration + +1. Inventory hot paths: `Type_SubmitVote`, `Type_BFTRequest`, `Type_VoteResult` send/receive +2. Add `proto/sequencer/v1/` schemas (message.proto, vote.proto, bft.proto) +3. Swap `config/transport/transport.go` implementation to `WriteDelimited`/`ReadDelimited` with protobuf instead of JSON + `0x1E` +4. Update message builders to output proto instead of JSON +5. Roll out message type by message type on hot paths; keep JSON for cold paths until complete + +--- + +#### Phase 6: Decouple `Sequencer/Triggers/Maps` from AVC + +The cycle source: `AVC/BuddyNodes/MessagePassing/ListenerHandler.go:21` imports `gossipnode/Sequencer/Triggers/Maps`. + +Fix options: +- Move `Maps/vote_results.go` into `config/VoteMaps/` (neutral location) +- Or move it into `AVC/BuddyNodes/` since AVC is the one writing to it +- Update all import paths in both `ListenerHandler.go` and `Sequencer/Triggers/Triggers.go` + +After this phase: AVC and Sequencer have no import cycle. 
+ +--- + +### Key Files Quick Reference + +| File | Role | Lines | +|------|------|-------| +| `AVC/BuddyNodes/MessagePassing/MessageListener.go` | `StructListener` — dead wrapper over `ListenerHandler` | ~300 | +| `AVC/BuddyNodes/MessagePassing/ListenerHandler.go` | Real inbound handler — stateful BFT contexts | ~1600 | +| `AVC/BuddyNodes/MessagePassing/Streaming.go` | Stream handler registration (`SetStreamHandler`) | ~200 | +| `AVC/BuddyNodes/MessagePassing/vote_collector.go` | `RegisterVoteCollector`/`NotifyVoteCollector` push mechanism | ~50 | +| `Sequencer/Consensus.go` | Main consensus orchestration — monolith | 2061 | +| `Sequencer/consensus_statemachine.go` | `Consensus` struct definition | ~300 | +| `Sequencer/Communication.go` | `AskForSubscription`, `VerifySubscriptions` | 657 | +| `Sequencer/Triggers/Triggers.go` | Timer-based triggers + `RequestVoteResultsFromBuddies` | 717 | +| `Sequencer/Triggers/Maps/vote_results.go` | Block-hash scoped vote result storage | 73 | +| `Sequencer/Router/Router.go` | Pass-through to VerificationService | 190 | +| `Vote/Trigger.go` | `SubmitVote()` — routes votes to a buddy node via `PickListnerWithOffset`; buddies use `SequencerID` to report results back | ~200 | +| `messaging/broadcast.go` | `BroadcastVoteTriggerToCommittee` | ~500 | +| `config/PubSubMessages/Consensus.go` | `ConsensusMessage` with `SequencerID`, `RoundID` | ~100 | +| `config/PubSubMessages/vote_notification.go` | `VoteNotification` struct | ~15 | +| `JMDN-FastSync/internal/pbstream/pbstream.go` | Reference: length-delimited framing | ~80 | +| `JMDN-FastSync/core/protocol/communication/communication.go` | Reference: Communicator interface | ~200 | +| `JMDN-FastSync/core/protocol/router/data_router.go` | Reference: DataRouter dispatch | ~500 | +| `JMDN-FastSync/core/sync/sync_protocols.go` | Reference: thin stream handlers | ~400 | + +### Protocol Constants +- `config.SubmitMessageProtocol` = `"/p2p/submit/message/1.0.0"` — main sequencer/buddy protocol +- `config.BuddyNodesMessageProtocol` — buddy→sequencer callback (BFT 
results) +- `config.Delimiter` = `0x1E` (ASCII Record Separator) + +### ACK Stage Constants (message type discriminator on the wire) +- `config.Type_AskForSubscription` +- `config.Type_SubscriptionResponse` +- `config.Type_SubmitVote` +- `config.Type_VoteResult` +- `config.Type_BFTRequest` +- `config.Type_BFTResult` +- `config.Type_VerifySubscription` +- `config.Type_ACK_True` / `config.Type_ACK_False` diff --git a/CLI/CLI.go b/CLI/CLI.go index a5d2ba30..ca2581d0 100644 --- a/CLI/CLI.go +++ b/CLI/CLI.go @@ -61,6 +61,7 @@ type CommandHandler struct { ChainID int FacadePort int WSPort int + PullAllowed bool } // Simple helper to print the CLI prompt in color @@ -106,9 +107,8 @@ func PrintFuncs() { fmt.Println(" mempoolStats - Show mempool statistics") fmt.Println(" stats - Show messaging statistics") fmt.Println(" broadcast - Broadcast a message to all connected peers") - fmt.Println(" fastsync - Fast sync blockchain data with a peer (V1)") - fmt.Println(" fastsyncv2 - Fast sync blockchain data with a peer (V2)") - fmt.Println(" firstsync - First sync: get all data from peer (server) or receive all data (client)") + fmt.Println(" fastsync - Fast sync blockchain data with a peer (V2 Engine)") + fmt.Println(" accountsync - Sync missing accounts only (skip block sync)") fmt.Println(" dbstate - Show current ImmuDB database state") fmt.Println(" propagateDID - Propagate a DID to the network") fmt.Println(" getDID - Get a DID document from the network") @@ -266,12 +266,10 @@ func (h *CommandHandler) handleCommand(parts []string) { h.handleShowStats() case "broadcast": h.handleBroadcast(parts) - case "fastsync": + case "fastsync", "fastsyncv2", "firstsync": h.handleFastSync(parts) - case "fastsyncv2": - h.handleFastSyncV2(parts) - case "firstsync": - h.handleFirstSync(parts) + case "accountsync": + h.handleAccountSync(parts) case "propagateDID": h.handlePropagateDID(parts) case "syncinfo": @@ -582,15 +580,8 @@ func (h *CommandHandler) handleFastSync(parts []string) { return 
} - err := h.checkDBClient() - if err != nil { - fmt.Printf("Database client not initialized: %v\n", err) - return - } - - err = h.checkDIDClient() - if err != nil { - fmt.Printf("DID database client not initialized: %v\n", err) + if h.FastSyncerV2 == nil { + fmt.Println("Error: FastsyncV2 engine is not initialized") return } @@ -608,172 +599,60 @@ func (h *CommandHandler) handleFastSync(parts []string) { return } - // Get both database states before sync - mainState, err := DB_OPs.GetDatabaseState(h.MainClient.Client) - if err != nil { - fmt.Printf("Failed to get main database state: %v\n", err) - return - } - - accountsState, err := DB_OPs.GetDatabaseState(h.DIDClient.Client) - if err != nil { - fmt.Printf("Failed to get accounts database state: %v\n", err) - return - } - - fmt.Printf("Starting blockchain sync with peer %s\n", addrInfo.ID.String()) - fmt.Printf("Our current main DB state: TxID=%d, Root=%x\n", mainState.TxId, mainState.TxHash) - fmt.Printf("Our current accounts DB state: TxID=%d, Root=%x\n", accountsState.TxId, accountsState.TxHash) - - // Start the sync process - startTime := time.Now().UTC() - - maxRetries := 3 - var syncErr error - - for retry := 0; retry < maxRetries; retry++ { - if retry > 0 { - fmt.Printf("Retry %d/%d after error: %v\n", retry+1, maxRetries, syncErr) - time.Sleep(2 * time.Second) - } - - _, syncErr = h.FastSyncer.HandleSync(addrInfo.ID) - if syncErr == nil { - break + // Show pre-sync DB state if clients are available + if h.MainClient != nil && h.DIDClient != nil { + mainState, err := DB_OPs.GetDatabaseState(h.MainClient.Client) + if err == nil { + fmt.Printf("Pre-sync main DB state: TxID=%d, Root=%x\n", mainState.TxId, mainState.TxHash) } } - if syncErr != nil { - fmt.Printf("Sync failed after %d attempts: %v\n", maxRetries, syncErr) - return - } - - // Get post-sync states - newMainState, err := DB_OPs.GetDatabaseState(h.MainClient.Client) - if err != nil { - fmt.Printf("Failed to get main database state after sync: 
%v\n", err) - return - } - - newAccountsState, err := DB_OPs.GetDatabaseState(h.DIDClient.Client) - if err != nil { - fmt.Printf("Failed to get accounts database state after sync: %v\n", err) - return - } - - fmt.Printf("Sync completed in %v\n", time.Since(startTime)) - fmt.Printf("New main DB state: TxID=%d, Root=%x\n", newMainState.TxId, newMainState.TxHash) - fmt.Printf("New accounts DB state: TxID=%d, Root=%x\n", newAccountsState.TxId, newAccountsState.TxHash) - printDashes() -} - -func (h *CommandHandler) handleFastSyncV2(parts []string) { - if len(parts) != 2 { - fmt.Println("Usage: fastsyncv2 ") - return - } - - // Parse the multiaddr - addr, err := ma.NewMultiaddr(parts[1]) - if err != nil { - fmt.Printf("Invalid multiaddress: %v\n", err) - return - } - - // Extract peer ID from multiaddr - addrInfo, err := peer.AddrInfoFromP2pAddr(addr) - if err != nil { - fmt.Printf("Failed to extract peer info: %v\n", err) - return - } - - fmt.Printf("Starting V2 blockchain fastsync with peer %s\n", addrInfo.ID.String()) + fmt.Printf("Starting blockchain fastsync (V2 Engine) with peer %s\n", addrInfo.ID.String()) startTime := time.Now().UTC() syncErr := h.FastSyncerV2.HandleSync(parts[1]) if syncErr != nil { - fmt.Printf("First sync failed: %v\n", syncErr) - return - } - fmt.Printf("FastsyncV2 completed completely in %v\n", time.Since(startTime)) - printDashes() -} - -func (h *CommandHandler) handleFirstSync(parts []string) { - if len(parts) != 3 { - fmt.Println("Usage: firstsync ") - fmt.Println(" server - Export and send all data from this node") - fmt.Println(" client - Receive and load all data from peer") + fmt.Printf("Fastsync failed: %v\n", syncErr) return } - err := h.checkDBClient() - if err != nil { - fmt.Printf("Database client not initialized: %v\n", err) - return + // Show post-sync DB state if clients are available + if h.MainClient != nil && h.DIDClient != nil { + newMainState, err := DB_OPs.GetDatabaseState(h.MainClient.Client) + if err == nil { + 
fmt.Printf("Post-sync main DB state: TxID=%d, Root=%x\n", newMainState.TxId, newMainState.TxHash) + } + newAccountsState, err := DB_OPs.GetDatabaseState(h.DIDClient.Client) + if err == nil { + fmt.Printf("Post-sync accounts DB state: TxID=%d, Root=%x\n", newAccountsState.TxId, newAccountsState.TxHash) + } } - err = h.checkDIDClient() - if err != nil { - fmt.Printf("DID database client not initialized: %v\n", err) - return - } + fmt.Printf("Fastsync completed in %v\n", time.Since(startTime)) + printDashes() +} - // Parse the multiaddr - addr, err := ma.NewMultiaddr(parts[1]) - if err != nil { - fmt.Printf("Invalid multiaddress: %v\n", err) - return - } - // Extract peer ID from multiaddr - addrInfo, err := peer.AddrInfoFromP2pAddr(addr) - if err != nil { - fmt.Printf("Failed to extract peer info: %v\n", err) +func (h *CommandHandler) handleAccountSync(parts []string) { + if len(parts) != 2 { + fmt.Println("Usage: accountsync ") return } - - mode := strings.ToLower(parts[2]) - if mode != "server" && mode != "client" { - fmt.Printf("Invalid mode: %s. 
Must be 'server' or 'client'\n", parts[2]) + if h.FastSyncerV2 == nil { + fmt.Println("Error: FastsyncV2 engine is not initialized") return } - fmt.Printf("Starting first sync with peer %s (mode: %s)\n", addrInfo.ID.String(), mode) + fmt.Printf("Starting account-only sync with peer %s\n", parts[1]) startTime := time.Now().UTC() - var syncErr error - if mode == "server" { - // Server mode: export and send all data - fmt.Println(">>> Running in SERVER mode - exporting all data...") - syncErr = h.FastSyncer.FirstSyncServer(addrInfo.ID) - } else { - // Client mode: receive and load all data - fmt.Println(">>> Running in CLIENT mode - receiving all data...") - syncErr = h.FastSyncer.FirstSyncClient(addrInfo.ID) - } - - if syncErr != nil { - fmt.Printf("First sync failed: %v\n", syncErr) - return - } - - // Get post-sync states - newMainState, err := DB_OPs.GetDatabaseState(h.MainClient.Client) - if err != nil { - fmt.Printf("Failed to get main database state after sync: %v\n", err) - return - } - - newAccountsState, err := DB_OPs.GetDatabaseState(h.DIDClient.Client) + synced, err := h.FastSyncerV2.AccountSyncOnly(parts[1]) if err != nil { - fmt.Printf("Failed to get accounts database state after sync: %v\n", err) + fmt.Printf("AccountSync failed: %v\n", err) return } - fmt.Printf("First sync completed in %v\n", time.Since(startTime)) - fmt.Printf("New main DB state: TxID=%d, Root=%x\n", newMainState.TxId, newMainState.TxHash) - fmt.Printf("New accounts DB state: TxID=%d, Root=%x\n", newAccountsState.TxId, newAccountsState.TxHash) + fmt.Printf("AccountSync complete: %d missing accounts synced in %v\n", synced, time.Since(startTime)) printDashes() } diff --git a/CLI/CLI_GRPC.go b/CLI/CLI_GRPC.go index a6dadb04..efc2fa58 100644 --- a/CLI/CLI_GRPC.go +++ b/CLI/CLI_GRPC.go @@ -226,6 +226,9 @@ func (h *CommandHandler) HandleFastSync(peeraddr string) (SyncStats, error) { if peeraddr == "" { return SyncStats{}, fmt.Errorf("usage: fastsync ") } + if !h.PullAllowed { + return 
SyncStats{}, fmt.Errorf("node is configured as a serve-only participant (pulling disabled). cannot pull data") + } err := h.checkDBClient() if err != nil { @@ -295,6 +298,9 @@ func (h *CommandHandler) HandleFastSyncV2(peeraddr string) (SyncStats, error) { if peeraddr == "" { return SyncStats{}, fmt.Errorf("usage: fastsyncv2 ") } + if !h.PullAllowed { + return SyncStats{}, fmt.Errorf("node is configured as a serve-only participant (pulling disabled). cannot pull data") + } // Make sure engine exists if h.FastSyncerV2 == nil { @@ -307,9 +313,15 @@ func (h *CommandHandler) HandleFastSyncV2(peeraddr string) (SyncStats, error) { return SyncStats{}, fmt.Errorf("FastsyncV2 failed: %w", err) } - // Re-fetch states to report - newMainState, _ := DB_OPs.GetDatabaseState(h.MainClient.Client) - newAccountsState, _ := DB_OPs.GetDatabaseState(h.DIDClient.Client) + // Re-fetch DB states to report. FastsyncV2 doesn't require MainClient/DIDClient + // for the sync itself, so guard against nil before querying. + var newMainState, newAccountsState *schema.ImmutableState + if h.MainClient != nil { + newMainState, _ = DB_OPs.GetDatabaseState(h.MainClient.Client) + } + if h.DIDClient != nil { + newAccountsState, _ = DB_OPs.GetDatabaseState(h.DIDClient.Client) + } return SyncStats{ TimeTaken: time.Since(startTime), @@ -318,6 +330,34 @@ func (h *CommandHandler) HandleFastSyncV2(peeraddr string) (SyncStats, error) { }, nil } +func (h *CommandHandler) HandleAccountSync(peeraddr string) (SyncStats, error) { + if peeraddr == "" { + return SyncStats{}, fmt.Errorf("usage: accountsync ") + } + if !h.PullAllowed { + return SyncStats{}, fmt.Errorf("node is configured as a serve-only participant (pulling disabled). 
cannot pull data") + } + if h.FastSyncerV2 == nil { + return SyncStats{}, fmt.Errorf("FastsyncV2 engine is inactive") + } + + startTime := time.Now().UTC() + _, err := h.FastSyncerV2.AccountSyncOnly(peeraddr) + if err != nil { + return SyncStats{}, fmt.Errorf("AccountSync failed: %w", err) + } + + var newAccountsState *schema.ImmutableState + if h.DIDClient != nil { + newAccountsState, _ = DB_OPs.GetDatabaseState(h.DIDClient.Client) + } + + return SyncStats{ + TimeTaken: time.Since(startTime), + AccountsState: newAccountsState, + }, nil +} + func (h *CommandHandler) HandleFirstSync(peeraddr string, mode string) (SyncStats, error) { if peeraddr == "" { return SyncStats{}, fmt.Errorf("usage: firstsync ") @@ -327,6 +367,11 @@ func (h *CommandHandler) HandleFirstSync(peeraddr string, mode string) (SyncStat return SyncStats{}, fmt.Errorf("usage: firstsync ") } + modeLower := strings.ToLower(mode) + if modeLower == "client" && !h.PullAllowed { + return SyncStats{}, fmt.Errorf("node is configured as a serve-only participant (pulling disabled). cannot pull data") + } + err := h.checkDBClient() if err != nil { return SyncStats{}, fmt.Errorf("database client not initialized: %v", err) @@ -349,7 +394,6 @@ func (h *CommandHandler) HandleFirstSync(peeraddr string, mode string) (SyncStat return SyncStats{}, fmt.Errorf("failed to extract peer info: %v", err) } - modeLower := strings.ToLower(mode) if modeLower != "server" && modeLower != "client" { return SyncStats{}, fmt.Errorf("invalid mode: %s. 
Must be 'server' or 'client'", mode) } diff --git a/CLI/GRPC_Server.go b/CLI/GRPC_Server.go index 18882eb7..1c849a9d 100644 --- a/CLI/GRPC_Server.go +++ b/CLI/GRPC_Server.go @@ -242,6 +242,17 @@ func (s *CLIServer) FastSyncV2(ctx context.Context, req *pb.PeerRequest) (*pb.Sy }, nil } +func (s *CLIServer) AccountSync(ctx context.Context, req *pb.PeerRequest) (*pb.SyncStats, error) { + stats, err := s.handler.HandleAccountSync(req.Peer) + if err != nil { + return &pb.SyncStats{Error: err.Error()}, nil + } + return &pb.SyncStats{ + TimeTaken: int64(stats.TimeTaken.Seconds()), + AccountsState: convertDBState(stats.AccountsState), + }, nil +} + func (s *CLIServer) FirstSync(ctx context.Context, req *pb.FirstSyncRequest) (*pb.SyncStats, error) { stats, err := s.handler.HandleFirstSync(req.Peer, req.Mode) if err != nil { @@ -269,6 +280,9 @@ func (s *CLIServer) GetDatabaseState(ctx context.Context, _ *emptypb.Empty) (*pb // Helper function to convert database state func convertDBState(state *schema.ImmutableState) *pb.DatabaseState { + if state == nil { + return &pb.DatabaseState{} + } return &pb.DatabaseState{ TxId: state.TxId, TxHash: state.TxHash, diff --git a/CLI/client.go b/CLI/client.go index 96be6157..a019353f 100644 --- a/CLI/client.go +++ b/CLI/client.go @@ -161,6 +161,12 @@ func (c *Client) FastSyncV2(peerAddr string) (*pb.SyncStats, error) { return c.conn.FastSyncV2(ctx, &pb.PeerRequest{Peer: peerAddr}) } +// AccountSync syncs missing accounts only (skips block sync) +func (c *Client) AccountSync(peerAddr string) (*pb.SyncStats, error) { + ctx := context.Background() + return c.conn.AccountSync(ctx, &pb.PeerRequest{Peer: peerAddr}) +} + // FirstSync performs first synchronization with a peer (server or client mode) func (c *Client) FirstSync(peerAddr string, mode string) (*pb.SyncStats, error) { // ctx, cancel := context.WithTimeout(context.Background(), 600*time.Second) diff --git a/CLI/proto/Connection.pb.go b/CLI/proto/Connection.pb.go index 
1f038e3f..39e3c52a 100644 --- a/CLI/proto/Connection.pb.go +++ b/CLI/proto/Connection.pb.go @@ -1,7 +1,7 @@ // Code generated by protoc-gen-go. DO NOT EDIT. // versions: -// protoc-gen-go v1.36.9 -// protoc v6.32.0 +// protoc-gen-go v1.36.11 +// protoc v7.34.1 // source: Connection.proto package proto @@ -1310,7 +1310,8 @@ const file_Connection_proto_rawDesc = "" + "\x0eDatabaseStates\x12+\n" + "\amain_db\x18\x01 \x01(\v2\x12.cli.DatabaseStateR\x06mainDb\x123\n" + "\vaccounts_db\x18\x02 \x01(\v2\x12.cli.DatabaseStateR\n" + - "accountsDb2\xeb\t\n" + + "accountsDb2\x9e\n" + + "\n" + "\n" + "CLIService\x124\n" + "\tListPeers\x12\x16.google.protobuf.Empty\x1a\r.cli.PeerList\"\x00\x125\n" + @@ -1328,7 +1329,8 @@ const file_Connection_proto_rawDesc = "" + "\fPropagateDID\x12\x1a.cli.DIDPropagationRequest\x1a\x16.cli.OperationResponse\"\x00\x12.\n" + "\bFastSync\x12\x10.cli.PeerRequest\x1a\x0e.cli.SyncStats\"\x00\x120\n" + "\n" + - "FastSyncV2\x12\x10.cli.PeerRequest\x1a\x0e.cli.SyncStats\"\x00\x124\n" + + "FastSyncV2\x12\x10.cli.PeerRequest\x1a\x0e.cli.SyncStats\"\x00\x121\n" + + "\vAccountSync\x12\x10.cli.PeerRequest\x1a\x0e.cli.SyncStats\"\x00\x124\n" + "\tFirstSync\x12\x15.cli.FirstSyncRequest\x1a\x0e.cli.SyncStats\"\x00\x12A\n" + "\x10GetDatabaseState\x12\x16.google.protobuf.Empty\x1a\x13.cli.DatabaseStates\"\x00\x123\n" + "\vReturnAddrs\x12\x16.google.protobuf.Empty\x1a\n" + @@ -1397,37 +1399,39 @@ var file_Connection_proto_depIdxs = []int32{ 12, // 17: cli.CLIService.PropagateDID:input_type -> cli.DIDPropagationRequest 8, // 18: cli.CLIService.FastSync:input_type -> cli.PeerRequest 8, // 19: cli.CLIService.FastSyncV2:input_type -> cli.PeerRequest - 13, // 20: cli.CLIService.FirstSync:input_type -> cli.FirstSyncRequest - 21, // 21: cli.CLIService.GetDatabaseState:input_type -> google.protobuf.Empty - 21, // 22: cli.CLIService.ReturnAddrs:input_type -> google.protobuf.Empty - 21, // 23: cli.CLIService.GetSyncInfo:input_type -> google.protobuf.Empty - 21, // 24: 
cli.CLIService.GetGethStatus:input_type -> google.protobuf.Empty - 21, // 25: cli.CLIService.DiscoverNeighbors:input_type -> google.protobuf.Empty - 21, // 26: cli.CLIService.ListAliases:input_type -> google.protobuf.Empty - 21, // 27: cli.CLIService.GetNodeVersion:input_type -> google.protobuf.Empty - 1, // 28: cli.CLIService.ListPeers:output_type -> cli.PeerList - 17, // 29: cli.CLIService.AddPeer:output_type -> cli.OperationResponse - 17, // 30: cli.CLIService.RemovePeer:output_type -> cli.OperationResponse - 18, // 31: cli.CLIService.CleanPeers:output_type -> cli.CleanPeersResponse - 17, // 32: cli.CLIService.SendMessage:output_type -> cli.OperationResponse - 17, // 33: cli.CLIService.SendYggdrasilMessage:output_type -> cli.OperationResponse - 17, // 34: cli.CLIService.SendFile:output_type -> cli.OperationResponse - 17, // 35: cli.CLIService.BroadcastMessage:output_type -> cli.OperationResponse - 2, // 36: cli.CLIService.GetMessageStats:output_type -> cli.MessageStats - 4, // 37: cli.CLIService.GetDID:output_type -> cli.DIDDocument - 17, // 38: cli.CLIService.PropagateDID:output_type -> cli.OperationResponse - 5, // 39: cli.CLIService.FastSync:output_type -> cli.SyncStats - 5, // 40: cli.CLIService.FastSyncV2:output_type -> cli.SyncStats - 5, // 41: cli.CLIService.FirstSync:output_type -> cli.SyncStats - 19, // 42: cli.CLIService.GetDatabaseState:output_type -> cli.DatabaseStates - 6, // 43: cli.CLIService.ReturnAddrs:output_type -> cli.Addrs - 14, // 44: cli.CLIService.GetSyncInfo:output_type -> cli.SyncInfo - 15, // 45: cli.CLIService.GetGethStatus:output_type -> cli.GethStatus - 17, // 46: cli.CLIService.DiscoverNeighbors:output_type -> cli.OperationResponse - 16, // 47: cli.CLIService.ListAliases:output_type -> cli.AliasList - 7, // 48: cli.CLIService.GetNodeVersion:output_type -> cli.VersionInfo - 28, // [28:49] is the sub-list for method output_type - 7, // [7:28] is the sub-list for method input_type + 8, // 20: cli.CLIService.AccountSync:input_type -> 
cli.PeerRequest + 13, // 21: cli.CLIService.FirstSync:input_type -> cli.FirstSyncRequest + 21, // 22: cli.CLIService.GetDatabaseState:input_type -> google.protobuf.Empty + 21, // 23: cli.CLIService.ReturnAddrs:input_type -> google.protobuf.Empty + 21, // 24: cli.CLIService.GetSyncInfo:input_type -> google.protobuf.Empty + 21, // 25: cli.CLIService.GetGethStatus:input_type -> google.protobuf.Empty + 21, // 26: cli.CLIService.DiscoverNeighbors:input_type -> google.protobuf.Empty + 21, // 27: cli.CLIService.ListAliases:input_type -> google.protobuf.Empty + 21, // 28: cli.CLIService.GetNodeVersion:input_type -> google.protobuf.Empty + 1, // 29: cli.CLIService.ListPeers:output_type -> cli.PeerList + 17, // 30: cli.CLIService.AddPeer:output_type -> cli.OperationResponse + 17, // 31: cli.CLIService.RemovePeer:output_type -> cli.OperationResponse + 18, // 32: cli.CLIService.CleanPeers:output_type -> cli.CleanPeersResponse + 17, // 33: cli.CLIService.SendMessage:output_type -> cli.OperationResponse + 17, // 34: cli.CLIService.SendYggdrasilMessage:output_type -> cli.OperationResponse + 17, // 35: cli.CLIService.SendFile:output_type -> cli.OperationResponse + 17, // 36: cli.CLIService.BroadcastMessage:output_type -> cli.OperationResponse + 2, // 37: cli.CLIService.GetMessageStats:output_type -> cli.MessageStats + 4, // 38: cli.CLIService.GetDID:output_type -> cli.DIDDocument + 17, // 39: cli.CLIService.PropagateDID:output_type -> cli.OperationResponse + 5, // 40: cli.CLIService.FastSync:output_type -> cli.SyncStats + 5, // 41: cli.CLIService.FastSyncV2:output_type -> cli.SyncStats + 5, // 42: cli.CLIService.AccountSync:output_type -> cli.SyncStats + 5, // 43: cli.CLIService.FirstSync:output_type -> cli.SyncStats + 19, // 44: cli.CLIService.GetDatabaseState:output_type -> cli.DatabaseStates + 6, // 45: cli.CLIService.ReturnAddrs:output_type -> cli.Addrs + 14, // 46: cli.CLIService.GetSyncInfo:output_type -> cli.SyncInfo + 15, // 47: cli.CLIService.GetGethStatus:output_type -> 
cli.GethStatus + 17, // 48: cli.CLIService.DiscoverNeighbors:output_type -> cli.OperationResponse + 16, // 49: cli.CLIService.ListAliases:output_type -> cli.AliasList + 7, // 50: cli.CLIService.GetNodeVersion:output_type -> cli.VersionInfo + 29, // [29:51] is the sub-list for method output_type + 7, // [7:29] is the sub-list for method input_type 7, // [7:7] is the sub-list for extension type_name 7, // [7:7] is the sub-list for extension extendee 0, // [0:7] is the sub-list for field type_name diff --git a/CLI/proto/Connection.proto b/CLI/proto/Connection.proto index 14fde448..95ee21a0 100644 --- a/CLI/proto/Connection.proto +++ b/CLI/proto/Connection.proto @@ -92,6 +92,7 @@ service CLIService { // Database Operations rpc FastSync(PeerRequest) returns (SyncStats) {} rpc FastSyncV2(PeerRequest) returns (SyncStats) {} + rpc AccountSync(PeerRequest) returns (SyncStats) {} rpc FirstSync(FirstSyncRequest) returns (SyncStats) {} rpc GetDatabaseState(google.protobuf.Empty) returns (DatabaseStates) {} diff --git a/CLI/proto/Connection_grpc.pb.go b/CLI/proto/Connection_grpc.pb.go index 15e6b8fe..4c4d24b5 100644 --- a/CLI/proto/Connection_grpc.pb.go +++ b/CLI/proto/Connection_grpc.pb.go @@ -1,7 +1,7 @@ // Code generated by protoc-gen-go-grpc. DO NOT EDIT. 
// versions: -// - protoc-gen-go-grpc v1.5.1 -// - protoc v6.32.0 +// - protoc-gen-go-grpc v1.6.2 +// - protoc v7.34.1 // source: Connection.proto package proto @@ -33,6 +33,7 @@ const ( CLIService_PropagateDID_FullMethodName = "/cli.CLIService/PropagateDID" CLIService_FastSync_FullMethodName = "/cli.CLIService/FastSync" CLIService_FastSyncV2_FullMethodName = "/cli.CLIService/FastSyncV2" + CLIService_AccountSync_FullMethodName = "/cli.CLIService/AccountSync" CLIService_FirstSync_FullMethodName = "/cli.CLIService/FirstSync" CLIService_GetDatabaseState_FullMethodName = "/cli.CLIService/GetDatabaseState" CLIService_ReturnAddrs_FullMethodName = "/cli.CLIService/ReturnAddrs" @@ -66,6 +67,7 @@ type CLIServiceClient interface { // Database Operations FastSync(ctx context.Context, in *PeerRequest, opts ...grpc.CallOption) (*SyncStats, error) FastSyncV2(ctx context.Context, in *PeerRequest, opts ...grpc.CallOption) (*SyncStats, error) + AccountSync(ctx context.Context, in *PeerRequest, opts ...grpc.CallOption) (*SyncStats, error) FirstSync(ctx context.Context, in *FirstSyncRequest, opts ...grpc.CallOption) (*SyncStats, error) GetDatabaseState(ctx context.Context, in *emptypb.Empty, opts ...grpc.CallOption) (*DatabaseStates, error) // Node Operations @@ -217,6 +219,16 @@ func (c *cLIServiceClient) FastSyncV2(ctx context.Context, in *PeerRequest, opts return out, nil } +func (c *cLIServiceClient) AccountSync(ctx context.Context, in *PeerRequest, opts ...grpc.CallOption) (*SyncStats, error) { + cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...) + out := new(SyncStats) + err := c.cc.Invoke(ctx, CLIService_AccountSync_FullMethodName, in, out, cOpts...) + if err != nil { + return nil, err + } + return out, nil +} + func (c *cLIServiceClient) FirstSync(ctx context.Context, in *FirstSyncRequest, opts ...grpc.CallOption) (*SyncStats, error) { cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...) 
out := new(SyncStats) @@ -320,6 +332,7 @@ type CLIServiceServer interface { // Database Operations FastSync(context.Context, *PeerRequest) (*SyncStats, error) FastSyncV2(context.Context, *PeerRequest) (*SyncStats, error) + AccountSync(context.Context, *PeerRequest) (*SyncStats, error) FirstSync(context.Context, *FirstSyncRequest) (*SyncStats, error) GetDatabaseState(context.Context, *emptypb.Empty) (*DatabaseStates, error) // Node Operations @@ -342,67 +355,70 @@ type CLIServiceServer interface { type UnimplementedCLIServiceServer struct{} func (UnimplementedCLIServiceServer) ListPeers(context.Context, *emptypb.Empty) (*PeerList, error) { - return nil, status.Errorf(codes.Unimplemented, "method ListPeers not implemented") + return nil, status.Error(codes.Unimplemented, "method ListPeers not implemented") } func (UnimplementedCLIServiceServer) AddPeer(context.Context, *PeerRequest) (*OperationResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method AddPeer not implemented") + return nil, status.Error(codes.Unimplemented, "method AddPeer not implemented") } func (UnimplementedCLIServiceServer) RemovePeer(context.Context, *PeerRequest) (*OperationResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method RemovePeer not implemented") + return nil, status.Error(codes.Unimplemented, "method RemovePeer not implemented") } func (UnimplementedCLIServiceServer) CleanPeers(context.Context, *emptypb.Empty) (*CleanPeersResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method CleanPeers not implemented") + return nil, status.Error(codes.Unimplemented, "method CleanPeers not implemented") } func (UnimplementedCLIServiceServer) SendMessage(context.Context, *MessageRequest) (*OperationResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method SendMessage not implemented") + return nil, status.Error(codes.Unimplemented, "method SendMessage not implemented") } func (UnimplementedCLIServiceServer) 
SendYggdrasilMessage(context.Context, *MessageRequest) (*OperationResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method SendYggdrasilMessage not implemented") + return nil, status.Error(codes.Unimplemented, "method SendYggdrasilMessage not implemented") } func (UnimplementedCLIServiceServer) SendFile(context.Context, *FileRequest) (*OperationResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method SendFile not implemented") + return nil, status.Error(codes.Unimplemented, "method SendFile not implemented") } func (UnimplementedCLIServiceServer) BroadcastMessage(context.Context, *MessageRequest) (*OperationResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method BroadcastMessage not implemented") + return nil, status.Error(codes.Unimplemented, "method BroadcastMessage not implemented") } func (UnimplementedCLIServiceServer) GetMessageStats(context.Context, *emptypb.Empty) (*MessageStats, error) { - return nil, status.Errorf(codes.Unimplemented, "method GetMessageStats not implemented") + return nil, status.Error(codes.Unimplemented, "method GetMessageStats not implemented") } func (UnimplementedCLIServiceServer) GetDID(context.Context, *DIDRequest) (*DIDDocument, error) { - return nil, status.Errorf(codes.Unimplemented, "method GetDID not implemented") + return nil, status.Error(codes.Unimplemented, "method GetDID not implemented") } func (UnimplementedCLIServiceServer) PropagateDID(context.Context, *DIDPropagationRequest) (*OperationResponse, error) { - return nil, status.Errorf(codes.Unimplemented, "method PropagateDID not implemented") + return nil, status.Error(codes.Unimplemented, "method PropagateDID not implemented") } func (UnimplementedCLIServiceServer) FastSync(context.Context, *PeerRequest) (*SyncStats, error) { - return nil, status.Errorf(codes.Unimplemented, "method FastSync not implemented") + return nil, status.Error(codes.Unimplemented, "method FastSync not implemented") } func 
(UnimplementedCLIServiceServer) FastSyncV2(context.Context, *PeerRequest) (*SyncStats, error) { - return nil, status.Errorf(codes.Unimplemented, "method FastSyncV2 not implemented") + return nil, status.Error(codes.Unimplemented, "method FastSyncV2 not implemented") +} +func (UnimplementedCLIServiceServer) AccountSync(context.Context, *PeerRequest) (*SyncStats, error) { + return nil, status.Error(codes.Unimplemented, "method AccountSync not implemented") } func (UnimplementedCLIServiceServer) FirstSync(context.Context, *FirstSyncRequest) (*SyncStats, error) { - return nil, status.Errorf(codes.Unimplemented, "method FirstSync not implemented") + return nil, status.Error(codes.Unimplemented, "method FirstSync not implemented") } func (UnimplementedCLIServiceServer) GetDatabaseState(context.Context, *emptypb.Empty) (*DatabaseStates, error) { - return nil, status.Errorf(codes.Unimplemented, "method GetDatabaseState not implemented") + return nil, status.Error(codes.Unimplemented, "method GetDatabaseState not implemented") } func (UnimplementedCLIServiceServer) ReturnAddrs(context.Context, *emptypb.Empty) (*Addrs, error) { - return nil, status.Errorf(codes.Unimplemented, "method ReturnAddrs not implemented") + return nil, status.Error(codes.Unimplemented, "method ReturnAddrs not implemented") } func (UnimplementedCLIServiceServer) GetSyncInfo(context.Context, *emptypb.Empty) (*SyncInfo, error) { - return nil, status.Errorf(codes.Unimplemented, "method GetSyncInfo not implemented") + return nil, status.Error(codes.Unimplemented, "method GetSyncInfo not implemented") } func (UnimplementedCLIServiceServer) GetGethStatus(context.Context, *emptypb.Empty) (*GethStatus, error) { - return nil, status.Errorf(codes.Unimplemented, "method GetGethStatus not implemented") + return nil, status.Error(codes.Unimplemented, "method GetGethStatus not implemented") } func (UnimplementedCLIServiceServer) DiscoverNeighbors(context.Context, *emptypb.Empty) (*OperationResponse, error) { - 
return nil, status.Errorf(codes.Unimplemented, "method DiscoverNeighbors not implemented") + return nil, status.Error(codes.Unimplemented, "method DiscoverNeighbors not implemented") } func (UnimplementedCLIServiceServer) ListAliases(context.Context, *emptypb.Empty) (*AliasList, error) { - return nil, status.Errorf(codes.Unimplemented, "method ListAliases not implemented") + return nil, status.Error(codes.Unimplemented, "method ListAliases not implemented") } func (UnimplementedCLIServiceServer) GetNodeVersion(context.Context, *emptypb.Empty) (*VersionInfo, error) { - return nil, status.Errorf(codes.Unimplemented, "method GetNodeVersion not implemented") + return nil, status.Error(codes.Unimplemented, "method GetNodeVersion not implemented") } func (UnimplementedCLIServiceServer) mustEmbedUnimplementedCLIServiceServer() {} func (UnimplementedCLIServiceServer) testEmbeddedByValue() {} @@ -415,7 +431,7 @@ type UnsafeCLIServiceServer interface { } func RegisterCLIServiceServer(s grpc.ServiceRegistrar, srv CLIServiceServer) { - // If the following call pancis, it indicates UnimplementedCLIServiceServer was + // If the following call panics, it indicates UnimplementedCLIServiceServer was // embedded by pointer and is nil. This will cause panics if an // unimplemented method is ever invoked, so we test this at initialization // time to prevent it from happening at runtime later due to I/O. 
@@ -659,6 +675,24 @@ func _CLIService_FastSyncV2_Handler(srv interface{}, ctx context.Context, dec fu return interceptor(ctx, in, info, handler) } +func _CLIService_AccountSync_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) { + in := new(PeerRequest) + if err := dec(in); err != nil { + return nil, err + } + if interceptor == nil { + return srv.(CLIServiceServer).AccountSync(ctx, in) + } + info := &grpc.UnaryServerInfo{ + Server: srv, + FullMethod: CLIService_AccountSync_FullMethodName, + } + handler := func(ctx context.Context, req interface{}) (interface{}, error) { + return srv.(CLIServiceServer).AccountSync(ctx, req.(*PeerRequest)) + } + return interceptor(ctx, in, info, handler) +} + func _CLIService_FirstSync_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) { in := new(FirstSyncRequest) if err := dec(in); err != nil { @@ -862,6 +896,10 @@ var CLIService_ServiceDesc = grpc.ServiceDesc{ MethodName: "FastSyncV2", Handler: _CLIService_FastSyncV2_Handler, }, + { + MethodName: "AccountSync", + Handler: _CLIService_AccountSync_Handler, + }, { MethodName: "FirstSync", Handler: _CLIService_FirstSync_Handler, diff --git a/DB_OPs/Nodeinfo/account_sync_enqueue_test.go b/DB_OPs/Nodeinfo/account_sync_enqueue_test.go new file mode 100644 index 00000000..5ed9bf61 --- /dev/null +++ b/DB_OPs/Nodeinfo/account_sync_enqueue_test.go @@ -0,0 +1,136 @@ +// White-box test for the bounded-enqueue chunking logic (enqueueRecordsChunked). +// Lives in package NodeInfo because the helper, the RedisStreamer constants, and the +// payload-type tags are unexported. No live Redis/ImmuDB needed — a recording mock +// streamer captures every XADD so we can assert chunk boundaries. +// +// NOTE: craftcode Phase 6 prefers tests under a tests/ tree; Go package-internal +// visibility forces this same-dir _test.go. 
Matches the repo convention in +// DB_OPs/sqlops/sqlops_test.go. +package NodeInfo + +import ( + "context" + "encoding/json" + "errors" + "testing" + "time" +) + +// recordingStreamer captures Enqueue payloads and optionally fails selected chunks. +// Only Enqueue is exercised; the rest satisfy RedisStreamer with inert returns. +type recordingStreamer struct { + messages []map[string]any + calls int + failEach int // if >0, every Nth Enqueue call returns an error +} + +func (r *recordingStreamer) Enqueue(_ context.Context, _ string, values map[string]any) (string, error) { + r.calls++ + if r.failEach > 0 && r.calls%r.failEach == 0 { + return "", errors.New("simulated XADD failure") + } + r.messages = append(r.messages, values) + return "id", nil +} + +func (r *recordingStreamer) EnsureConsumerGroup(context.Context, string, string) error { return nil } +func (r *recordingStreamer) ReadGroup(context.Context, string, string, string, int64, time.Duration) ([]StreamEntry, error) { + return nil, nil +} +func (r *recordingStreamer) Ack(context.Context, string, string, ...string) error { return nil } +func (r *recordingStreamer) Delete(context.Context, string, ...string) error { return nil } +func (r *recordingStreamer) AutoClaim(context.Context, string, string, string, time.Duration, string, int64) ([]StreamEntry, string, error) { + return nil, "0-0", nil +} +func (r *recordingStreamer) Len(context.Context, string) (int64, error) { return 0, nil } +func (r *recordingStreamer) PendingCount(context.Context, string, string) (int64, error) { return 0, nil } + +// decodeCount returns how many records a recorded message's "data" field holds. 
+func decodeCount(t *testing.T, msg map[string]any) int { + t.Helper() + data, ok := msg["data"].(string) + if !ok { + t.Fatalf("message missing string data field: %#v", msg) + } + var recs []json.RawMessage + if err := json.Unmarshal([]byte(data), &recs); err != nil { + t.Fatalf("data is not a JSON array: %v", err) + } + return len(recs) +} + +func TestEnqueueRecordsChunked_Boundaries(t *testing.T) { + cases := []struct { + name string + n int + wantMsgs int + }{ + {"empty", 0, 0}, + {"single", 1, 1}, + {"under_one_chunk", 499, 1}, + {"exactly_one_chunk", 500, 1}, + {"one_over", 501, 2}, + {"two_chunks", 1000, 2}, + {"uneven", 2500, 5}, + {"uneven_remainder", 2501, 6}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + items := make([]int, tc.n) + for i := range items { + items[i] = i + } + rs := &recordingStreamer{} + err := enqueueRecordsChunked(context.Background(), rs, payloadTypeAccounts, items) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if len(rs.messages) != tc.wantMsgs { + t.Fatalf("message count = %d, want %d", len(rs.messages), tc.wantMsgs) + } + total := 0 + for _, msg := range rs.messages { + if tag, _ := msg["type"].(string); tag != string(payloadTypeAccounts) { + t.Fatalf("type tag = %q, want %q", tag, payloadTypeAccounts) + } + c := decodeCount(t, msg) + if c > maxRecordsPerMessage { + t.Fatalf("chunk holds %d records, exceeds cap %d", c, maxRecordsPerMessage) + } + total += c + } + if total != tc.n { + t.Fatalf("total records across messages = %d, want %d", total, tc.n) + } + }) + } +} + +// TestEnqueueRecordsChunked_BestEffort verifies that a transient failure on one chunk +// does not drop the others: the helper attempts every chunk, returns an aggregated +// error, yet the successful chunks are still enqueued. 
+func TestEnqueueRecordsChunked_BestEffort(t *testing.T) { + const n = 2500 // 5 chunks of 500 + items := make([]int, n) + rs := &recordingStreamer{failEach: 3} // fail the 3rd Enqueue call + + err := enqueueRecordsChunked(context.Background(), rs, payloadTypeAccounts, items) + if err == nil { + t.Fatal("expected aggregated error from failed chunk, got nil") + } + if rs.calls != 5 { + t.Fatalf("Enqueue attempted %d times, want 5 (all chunks attempted despite failure)", rs.calls) + } + if len(rs.messages) != 4 { + t.Fatalf("recorded %d successful messages, want 4 (one chunk failed)", len(rs.messages)) + } +} + +func TestChunkCount(t *testing.T) { + cases := map[int]int{0: 0, 1: 1, 499: 1, 500: 1, 501: 2, 1000: 2, 2500: 5} + for n, want := range cases { + if got := chunkCount(n); got != want { + t.Errorf("chunkCount(%d) = %d, want %d", n, got, want) + } + } +} diff --git a/DB_OPs/Nodeinfo/account_sync_redis.go b/DB_OPs/Nodeinfo/account_sync_redis.go new file mode 100644 index 00000000..d00edfe1 --- /dev/null +++ b/DB_OPs/Nodeinfo/account_sync_redis.go @@ -0,0 +1,254 @@ +// MODULE: DB_OPs/Nodeinfo/account_sync_redis +// PURPOSE: Define the Redis stream transport abstraction (RedisStreamer interface) and +// adapt *redis.Client to it. Owns zero DB or business logic — pure transport. +// +// CORE DATA STRUCTURES: +// - StreamEntry: ephemeral; one per stream message read. Count per ReadGroup call +// is bounded by AccountSyncWorkerConfig.MaxDrainItems at the call site. +// - pkgAccountStreamer / pkgWorkerManager (package-level): set once by InstallAccountQueue. +// Read by every WriteAccounts / BatchUpdateAccounts call. Never replaced after set. +// +// TO MODIFY BEHAVIOR: +// - Change stream backend: implement RedisStreamer → pass to StartAccountSyncWorker. +// - Change stream key / consumer group name: update constants below; no logic changes. 
+// - Add a new stream key: define a new constant; add a corresponding Enqueue call in +// immudb_account_manager.go and a new case in processBatch. +// +// DO NOT: +// - Import *redis.Client outside redisStreamerAdapter — it is the only concrete import. +// - Store request-scoped state on redisStreamerAdapter (stateless wrapper by design). +// - Replace pkgAccountStreamer with a per-call parameter — types.AccountManager interface +// signatures are fixed by the external JMDN-FastSync module and cannot be changed. +// +// EXTENSION POINT: new queue backends → implement RedisStreamer; inject via StartAccountSyncWorker. +// +// CHANGE SCENARIOS: +// Swap Redis client lib: rewrite redisStreamerAdapter methods — interface unchanged. +// Add new stream key: add constant + Enqueue call in account_manager — this file unchanged. +// Change group/consumer: edit constants — no logic change required. + +package NodeInfo + +import ( + "context" + "strings" + "sync" + "time" + + "github.com/redis/go-redis/v9" +) + +// ─── Stream constants ───────────────────────────────────────────────────────── + +const ( + // accountSyncStream is the Redis stream key for all account sync payloads. + accountSyncStream = "accountsync:accounts" + // accountSyncGroup is the consumer group name. One group = one logical processor. + accountSyncGroup = "accountsync-workers" + // accountSyncConsumer is the consumer name within the group. Single worker model. + accountSyncConsumer = "worker-0" +) + +// syncPayloadType discriminates between WriteAccounts and BatchUpdateAccounts payloads +// stored in the same stream. +type syncPayloadType string + +const ( + payloadTypeAccounts syncPayloadType = "accounts" // payload: []*types.Account (JSON) + payloadTypeUpdates syncPayloadType = "updates" // payload: []accountUpdateWire (JSON) +) + +// ─── Domain types ───────────────────────────────────────────────────────────── + +// StreamEntry is a single Redis stream message with its assigned stream ID. 
+// ID is used for XACK after successful DB write. +// Values contains the raw message fields as returned by go-redis. +type StreamEntry struct { + ID string + Values map[string]any +} + +// ─── RedisStreamer interface ────────────────────────────────────────────────── + +// RedisStreamer is the minimal Redis stream surface required by the account sync worker. +// It uses only domain-level types — no go-redis types leak through the interface. +// The concrete implementation is redisStreamerAdapter (wraps *redis.Client). +// Tests may substitute a mock implementing this interface. +type RedisStreamer interface { + // Enqueue appends a message to the named stream. Returns the assigned message ID. + // Time: O(1) — single XADD round trip. + Enqueue(ctx context.Context, stream string, values map[string]any) (string, error) + + // EnsureConsumerGroup creates the consumer group on the stream, creating the stream + // itself if it does not exist. Idempotent: no-op if the group already exists. + // Time: O(1) — single XGROUP CREATE round trip. + EnsureConsumerGroup(ctx context.Context, stream, group string) error + + // ReadGroup performs a blocking read from the stream under the given consumer group. + // Reads at most count new (undelivered) entries; blocks up to blockDur waiting for data. + // Returns nil, nil on timeout (no data within blockDur). + // Read entries move to the Pending Entries List (PEL) until ACKed. + // Time: O(count) — single XREADGROUP round trip. + ReadGroup(ctx context.Context, stream, group, consumer string, count int64, blockDur time.Duration) ([]StreamEntry, error) + + // Ack acknowledges the given message IDs, removing them from the PEL. + // Only call after the DB write succeeds — unACKed entries are replayed via AutoClaim. + // Time: O(|ids|) — single XACK round trip. + Ack(ctx context.Context, stream, group string, ids ...string) error + + // Delete removes message IDs from the stream body (XDEL), reclaiming memory. 
+ // Call in a pipeline with Ack after every successful DB commit. XACK alone leaves + // the payload resident in the stream; XDEL is required to reclaim that space. + // Time: O(|ids|) — single XDEL round trip. + Delete(ctx context.Context, stream string, ids ...string) error + + // AutoClaim reclaims pending entries that have been idle longer than minIdle. + // start is the minimum PEL cursor ID ("0-0" to scan from the beginning). + // Returns reclaimed entries and the next cursor ID. + // "0-0" as the returned cursor means the full PEL was scanned. + // Time: O(count) — single XAUTOCLAIM round trip. + AutoClaim(ctx context.Context, stream, group, consumer string, minIdle time.Duration, start string, count int64) ([]StreamEntry, string, error) + + // Len returns the total number of messages currently in the stream (XLEN). + // Time: O(1). + Len(ctx context.Context, stream string) (int64, error) + + // PendingCount returns the count of unacked messages in the PEL for the given group. + // Time: O(1) — single XPENDING round trip. + PendingCount(ctx context.Context, stream, group string) (int64, error) +} + +// ─── Concrete adapter ───────────────────────────────────────────────────────── + +// redisStreamerAdapter adapts *redis.Client to the RedisStreamer interface. +// It is the ONLY place in DB_OPs/Nodeinfo that imports a concrete Redis type. +type redisStreamerAdapter struct { + client *redis.Client +} + +// NewRedisStreamer wraps a *redis.Client as a RedisStreamer. +// Construct in main.go and pass the result to StartAccountSyncWorker. 
+// +// Time: O(1) +func NewRedisStreamer(client *redis.Client) RedisStreamer { + return &redisStreamerAdapter{client: client} +} + +// Time: O(1) — single XADD round trip +func (r *redisStreamerAdapter) Enqueue(ctx context.Context, stream string, values map[string]any) (string, error) { + return r.client.XAdd(ctx, &redis.XAddArgs{ + Stream: stream, + Values: values, + }).Result() +} + +// Time: O(1) — single XGROUP CREATE ... MKSTREAM round trip. +// BUSYGROUP error means the group already exists; treated as success. +func (r *redisStreamerAdapter) EnsureConsumerGroup(ctx context.Context, stream, group string) error { + err := r.client.XGroupCreateMkStream(ctx, stream, group, "0").Err() + if err != nil && !strings.Contains(err.Error(), "BUSYGROUP") { + return err + } + return nil +} + +// Time: O(count) — XREADGROUP COUNT count BLOCK blockDur ms +// redis.Nil is returned on timeout; mapped to (nil, nil) so callers don't treat it as an error. +func (r *redisStreamerAdapter) ReadGroup(ctx context.Context, stream, group, consumer string, count int64, blockDur time.Duration) ([]StreamEntry, error) { + result, err := r.client.XReadGroup(ctx, &redis.XReadGroupArgs{ + Group: group, + Consumer: consumer, + Streams: []string{stream, ">"}, + Count: count, + Block: blockDur, + NoAck: false, + }).Result() + if err != nil { + if err == redis.Nil { + return nil, nil // timeout — no data; caller loops + } + return nil, err + } + var entries []StreamEntry + for _, s := range result { + for _, msg := range s.Messages { + entries = append(entries, StreamEntry{ID: msg.ID, Values: msg.Values}) + } + } + return entries, nil +} + +// Time: O(|ids|) — single XACK round trip +func (r *redisStreamerAdapter) Ack(ctx context.Context, stream, group string, ids ...string) error { + return r.client.XAck(ctx, stream, group, ids...).Err() +} + +// Time: O(|ids|) — single XDEL round trip +func (r *redisStreamerAdapter) Delete(ctx context.Context, stream string, ids ...string) error { 
+ if len(ids) == 0 { + return nil + } + return r.client.XDel(ctx, stream, ids...).Err() +} + +// Time: O(count) — single XAUTOCLAIM round trip +// go-redis v9 XAutoClaimCmd.Result() returns ([]XMessage, string, error) — three values. +func (r *redisStreamerAdapter) AutoClaim(ctx context.Context, stream, group, consumer string, minIdle time.Duration, start string, count int64) ([]StreamEntry, string, error) { + messages, next, err := r.client.XAutoClaim(ctx, &redis.XAutoClaimArgs{ + Stream: stream, + Group: group, + Consumer: consumer, + MinIdle: minIdle, + Start: start, + Count: count, + }).Result() + if err != nil { + return nil, "0-0", err + } + var entries []StreamEntry + for _, msg := range messages { + entries = append(entries, StreamEntry{ID: msg.ID, Values: msg.Values}) + } + return entries, next, nil +} + +func (r *redisStreamerAdapter) Len(ctx context.Context, stream string) (int64, error) { + return r.client.XLen(ctx, stream).Result() +} + +func (r *redisStreamerAdapter) PendingCount(ctx context.Context, stream, group string) (int64, error) { + info, err := r.client.XPending(ctx, stream, group).Result() + if err != nil { + return 0, err + } + return info.Count, nil +} + +// ─── Package-level queue singleton ─────────────────────────────────────────── + +// pkgAccountStreamer and pkgWorkerManager are set once by InstallAccountQueue. +// Read by every WriteAccounts / BatchUpdateAccounts call. types.AccountManager +// interface signatures are fixed externally — package-level injection is the only path. +var ( + pkgAccountStreamer RedisStreamer + pkgWorkerManager *WorkerManager + pkgAccountQueueMu sync.RWMutex +) + +// InstallAccountQueue stores the streamer and manager together. +// Called once from StartAccountSyncWorker during node startup. 
+func InstallAccountQueue(s RedisStreamer, m *WorkerManager) { + pkgAccountQueueMu.Lock() + pkgAccountStreamer = s + pkgWorkerManager = m + pkgAccountQueueMu.Unlock() +} + +// getAccountQueue returns the package-level streamer and worker manager. +// Both are nil if InstallAccountQueue has not yet been called. +// Time: O(1) +func getAccountQueue() (RedisStreamer, *WorkerManager) { + pkgAccountQueueMu.RLock() + defer pkgAccountQueueMu.RUnlock() + return pkgAccountStreamer, pkgWorkerManager +} diff --git a/DB_OPs/Nodeinfo/account_sync_worker.go b/DB_OPs/Nodeinfo/account_sync_worker.go new file mode 100644 index 00000000..e7ac1c53 --- /dev/null +++ b/DB_OPs/Nodeinfo/account_sync_worker.go @@ -0,0 +1,475 @@ +// MODULE: DB_OPs/Nodeinfo/account_sync_worker +// PURPOSE: Drain the accountsync Redis stream and write account batches to ImmuDB. +// Owns the at-least-once delivery contract: ACK only after successful DB write. +// +// CORE DATA STRUCTURES: +// - []StreamEntry: ephemeral per runWorker iteration. +// Bounded by AccountSyncWorkerConfig.MaxDrainItems (default 100). +// - []dbEntry: ephemeral per processBatch call. +// Bounded by MaxDrainItems × maxRecordsPerMessage (producer caps each message at +// maxRecordsPerMessage records — see immudb_account_manager.go). DID refs may add +// up to one extra entry per account. +// Sub-batched into chunks of MaxAccountsPerBatch before each BatchRestoreAccounts call. +// - PEL (Redis-side, not in-process): unacked entries in flight. +// Evicted by AutoClaim after PendingIdleTimeout; no in-process growth. +// +// TO MODIFY BEHAVIOR: +// - Tuning (batch size, timeouts): change AccountSyncWorkerConfig fields — no code change. +// - Add new payload type: add case in processBatch switch + enqueue helper in +// immudb_account_manager.go. This file changes only at the switch statement. +// - Change DB write path: edit processBatch — impacts ACK semantics and batch split. +// +// DO NOT: +// - Start this worker from a constructor. 
StartAccountSyncWorker is the only entry point. +// - ACK entries before BatchRestoreAccounts succeeds — breaks at-least-once guarantee. +// - Acquire the DB connection via GetAccountConnectionandPutBack — its auto-return +// goroutine fires on the scoped ctx deadline and can recycle the connection mid-write +// (data race). Use GetAccountsConnections + defer PutAccountsConnection, and thread the +// scoped writeCtx into BatchRestoreAccounts so the deadline bounds the DB ops directly. +// - Replace []dbEntry with a map — sequential append + slice-of-chunks is the right +// access pattern for BatchRestoreAccounts (ordered, fixed-size sub-batches). +// +// EXTENSION POINT: new payload types → add case in processBatch switch; add parse helper. +// +// CHANGE SCENARIOS: +// Add payload type: add case in processBatch switch + parse helper + enqueue in account_manager +// Change batch limits: edit DefaultWorkerConfig or pass custom AccountSyncWorkerConfig +// Change DB write: edit processBatch; ACK block is the only invariant that must not move + +package NodeInfo + +import ( + "context" + "encoding/json" + "fmt" + "log" + "math/big" + "sync/atomic" + "time" + + "github.com/JupiterMetaLabs/JMDN-FastSync/common/types" + "github.com/ethereum/go-ethereum/common" + "gossipnode/DB_OPs" +) + +// ─── dbEntry type alias ─────────────────────────────────────────────────────── + +// dbEntry is a type alias for the anonymous struct expected by DB_OPs.BatchRestoreAccounts. +// Using a type alias (=) ensures []dbEntry is assignment-compatible with the parameter type +// without a conversion loop. Access pattern: sequential append, read-once for sub-batching. +// Growth bound: MaxDrainItems × avg-accounts-per-payload (ephemeral per processBatch call). 
+type dbEntry = struct { + Key string + Value []byte +} + +// ─── Wire type for BatchUpdateAccounts payloads ─────────────────────────────── + +// accountUpdateWire is the stable JSON representation of types.AccountUpdate used +// in the stream payload. Explicit wire type prevents big.Int JSON serialization +// surprises (math/big.Int marshals as a bare, unquoted JSON number, which decoders +// that read numbers as float64 can silently round); the balance travels as a string. +// +// Stored in the stream as: {"address":"0x...","new_balance":"1000000","nonce":42} +type accountUpdateWire struct { + Address string `json:"address"` + NewBalance string `json:"new_balance"` // decimal string from big.Int.String() + Nonce uint64 `json:"nonce"` +} + +// ─── Configuration ──────────────────────────────────────────────────────────── + +// AccountSyncWorkerConfig holds tuning parameters for the account sync worker. +// All fields have safe production defaults; use DefaultWorkerConfig() to get them. +type AccountSyncWorkerConfig struct { + // MaxDrainItems is the maximum number of stream entries read per XREADGROUP call. + // Higher values coalesce more work per ImmuDB commit but increase per-batch memory. + // Default: 100. + MaxDrainItems int64 + + // MaxAccountsPerBatch is the maximum number of accounts per single BatchRestoreAccounts call. + // Prevents oversized ImmuDB writes. If a coalesced batch exceeds this, it is split into chunks. + // Default: 500. + MaxAccountsPerBatch int + + // BlockTimeout is the XREADGROUP BLOCK duration. + // The worker goroutine sleeps inside Redis until data arrives or this duration elapses. + // Must be short enough to allow clean ctx cancellation. Default: 30s. + BlockTimeout time.Duration + + // PendingIdleTimeout is the minimum idle duration before XAUTOCLAIM reclaims a PEL entry. + // Entries stuck in the PEL longer than this (due to worker crash/restart) are replayed. + // Must exceed the worst-case BatchRestoreAccounts latency to avoid spurious reclaims. 
+ // Default: 30s. + PendingIdleTimeout time.Duration + + // DBWriteTimeout bounds each GetAccountsConnections + BatchRestoreAccounts call. + // Must exceed the observed worst-case ImmuDB commit latency (~15 s). Default: 60s. + DBWriteTimeout time.Duration +} + +// DefaultWorkerConfig returns production-tuned defaults. +// Time: O(1) +func DefaultWorkerConfig() AccountSyncWorkerConfig { + return AccountSyncWorkerConfig{ + MaxDrainItems: 100, + MaxAccountsPerBatch: 500, + BlockTimeout: 30 * time.Second, + PendingIdleTimeout: 30 * time.Second, + DBWriteTimeout: 60 * time.Second, + } +} + +// ─── WorkerManager — atomic lifecycle ──────────────────────────────────────── + +// WorkerManager manages the drain goroutine lifecycle with lock-free atomics. +// The worker starts lazily on the first WriteAccounts call and shuts down after +// BlockTimeout of idle time. Producers restart it automatically via EnsureActive. +type WorkerManager struct { + isOnline atomic.Bool // true = drain goroutine is running + resetInflight atomic.Bool // true = a lastActivity-reset goroutine is in flight + lastActivity atomic.Int64 // UnixNano — last successful commit or explicit reset + + streamer RedisStreamer + cfg AccountSyncWorkerConfig +} + +// EnsureActive is called by WriteAccounts before every XADD. +// If the worker is offline it wins a CAS to start it; if it is near its idle +// deadline it wins a CAS to extend lastActivity. Always returns immediately. +// Hot-path cost (online + healthy): two atomic loads + subtract + compare ≈ single-digit ns. +func (wm *WorkerManager) EnsureActive() { + if !wm.isOnline.Load() { + if wm.isOnline.CompareAndSwap(false, true) { + wm.lastActivity.Store(time.Now().UnixNano()) + log.Printf("[accountqueue] worker offline — restarting") + go wm.runWorker() + } + // CAS loss = another caller already claimed the spawn; worker is starting. + return + } + + // Online — check remaining idle budget. Refresh if under 50%. 
+ elapsed := time.Since(time.Unix(0, wm.lastActivity.Load())) + if wm.cfg.BlockTimeout-elapsed < wm.cfg.BlockTimeout/2 { + if wm.resetInflight.CompareAndSwap(false, true) { + go func() { + defer wm.resetInflight.Store(false) + wm.lastActivity.Store(time.Now().UnixNano()) + }() + } + } +} + +// ─── Lifecycle ──────────────────────────────────────────────────────────────── + +// StartAccountSyncWorker creates a WorkerManager, installs it as the package-level +// queue, and returns. The drain goroutine starts lazily on the first WriteAccounts or +// BatchUpdateAccounts call. +// +// MUST be called exactly once from main.go before any WriteAccounts or BatchUpdateAccounts. +// If not called, both methods return an error and nothing is enqueued (no write occurs). +// +// Time: O(1) — no Redis round trip; EnsureConsumerGroup is deferred to the first runWorker call. +func StartAccountSyncWorker(streamer RedisStreamer, cfg AccountSyncWorkerConfig) *WorkerManager { + m := &WorkerManager{streamer: streamer, cfg: cfg} + InstallAccountQueue(streamer, m) + return m +} + +// ─── Worker loop ───────────────────────────────────────────────────────────── + +// runWorker is the drain loop running as a method on WorkerManager. +// It exits when BlockTimeout elapses with no data AND lastActivity is stale. +// defer sets isOnline=false so even a panic marks the worker offline. +func (wm *WorkerManager) runWorker() { + defer wm.isOnline.Store(false) + log.Printf("[accountqueue] worker started (stream=%s group=%s consumer=%s)", + accountSyncStream, accountSyncGroup, accountSyncConsumer) + defer log.Printf("[accountqueue] worker stopped") + + if err := wm.streamer.EnsureConsumerGroup(context.Background(), accountSyncStream, accountSyncGroup); err != nil { + log.Printf("[accountqueue] ERROR: EnsureConsumerGroup: %v — worker exiting", err) + return + } + + // Reclaim any entries left unACKed by a prior worker run. 
+ if err := reclaimPending(wm.streamer, wm.cfg); err != nil { + log.Printf("[accountqueue] WARN: startup reclaimPending error: %v", err) + } + + for { + entries, err := wm.streamer.ReadGroup( + context.Background(), + accountSyncStream, accountSyncGroup, accountSyncConsumer, + wm.cfg.MaxDrainItems, + wm.cfg.BlockTimeout, + ) + if err != nil { + log.Printf("[accountqueue] ReadGroup error: %v — retrying in 1s", err) + time.Sleep(time.Second) + continue + } + if entries == nil { + // BlockTimeout elapsed with no data — check idle window. + if time.Since(time.Unix(0, wm.lastActivity.Load())) >= wm.cfg.BlockTimeout { + log.Printf("[accountqueue] worker idle for %s — going offline", wm.cfg.BlockTimeout) + return + } + // lastActivity was refreshed by a concurrent EnsureActive reset; keep going. + continue + } + + if err := processBatch(wm.streamer, entries, wm.cfg); err != nil { + // Do NOT ACK. Entries remain in PEL and are replayed by reclaimPending on next start. + // BatchRestoreAccounts is LWW-idempotent — replays are safe. + log.Printf("[accountqueue] processBatch error: %v — %d entries remain in PEL for retry", + err, len(entries)) + } else { + wm.lastActivity.Store(time.Now().UnixNano()) + } + } +} + +// reclaimPending reclaims and processes all PEL entries whose idle time exceeds +// cfg.PendingIdleTimeout. Called once on worker startup to replay entries left +// unACKed by a previous crash. +// +// Iterates via cursor until the full PEL is scanned ("0-0" returned as next cursor). +// Each DB op uses context.Background() with cfg.DBWriteTimeout — no external cancellation. +// +// Time: O(PEL size / MaxDrainItems) XAUTOCLAIM round trips + processBatch cost per page. 
+func reclaimPending(s RedisStreamer, cfg AccountSyncWorkerConfig) error { + cursor := "0-0" + for { + entries, next, err := s.AutoClaim( + context.Background(), + accountSyncStream, accountSyncGroup, accountSyncConsumer, + cfg.PendingIdleTimeout, + cursor, + cfg.MaxDrainItems, + ) + if err != nil { + return fmt.Errorf("XAUTOCLAIM cursor=%s: %w", cursor, err) + } + + if len(entries) > 0 { + log.Printf("[accountqueue] reclaiming %d pending entries (cursor=%s)", len(entries), cursor) + if err := processBatch(s, entries, cfg); err != nil { + return fmt.Errorf("process reclaimed entries at cursor=%s: %w", cursor, err) + } + } + + // "0-0" means the full PEL was scanned — no more pending entries. + if next == "0-0" || next == "" { + break + } + cursor = next + } + return nil +} + +// ─── Batch processor ───────────────────────────────────────────────────────── + +// processBatch deserializes all stream entries, merges their accounts into a flat +// list, writes to ImmuDB in sub-batches of MaxAccountsPerBatch, and ACKs all +// entries only after every sub-batch succeeds. +// +// Poison pill handling: entries with undecodable payloads (parse error or unknown type) +// are ACKed immediately and discarded. They will never succeed and must not block the queue. +// +// At-least-once guarantee: +// - goodIDs are ACKed only after BatchRestoreAccounts succeeds for all chunks. +// - If any chunk fails, goodIDs are not ACKed → entries stay in PEL → replayed on restart. +// - Replay safety: BatchRestoreAccounts uses LWW (UpdatedAt timestamp) — duplicate writes +// overwrite with the same data and do not corrupt state. +// +// Time: O(N/MaxAccountsPerBatch) BatchRestoreAccounts round trips, where N = total accounts. +// Space: O(N) — ephemeral []dbEntry freed after ACK. 
+func processBatch(s RedisStreamer, entries []StreamEntry, cfg AccountSyncWorkerConfig) error { + var ( + writeEntries []dbEntry // accounts to persist to ImmuDB + goodIDs []string // stream IDs to ACK+XDEL after successful DB write + poisonIDs []string // stream IDs to ACK+XDEL immediately (unrecoverable) + ) + + for _, entry := range entries { + payloadType, _ := entry.Values["type"].(string) + dataStr, _ := entry.Values["data"].(string) + + switch syncPayloadType(payloadType) { + case payloadTypeAccounts: + parsed, err := parseAccountsPayload(dataStr) + if err != nil { + log.Printf("[accountqueue] WARN: poison pill — undecodable accounts entry %s: %v", entry.ID, err) + poisonIDs = append(poisonIDs, entry.ID) + continue + } + writeEntries = append(writeEntries, parsed...) + goodIDs = append(goodIDs, entry.ID) + + case payloadTypeUpdates: + parsed, err := parseUpdatesPayload(dataStr) + if err != nil { + log.Printf("[accountqueue] WARN: poison pill — undecodable updates entry %s: %v", entry.ID, err) + poisonIDs = append(poisonIDs, entry.ID) + continue + } + writeEntries = append(writeEntries, parsed...) + goodIDs = append(goodIDs, entry.ID) + + default: + log.Printf("[accountqueue] WARN: poison pill — unknown payload type %q in entry %s", payloadType, entry.ID) + poisonIDs = append(poisonIDs, entry.ID) + } + } + + // ACK + XDEL poison pills immediately — unrecoverable, must not block the PEL. + if len(poisonIDs) > 0 { + ackCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + if err := s.Ack(ackCtx, accountSyncStream, accountSyncGroup, poisonIDs...); err != nil { + log.Printf("[accountqueue] WARN: failed to ACK %d poison pills: %v", len(poisonIDs), err) + } else if err := s.Delete(ackCtx, accountSyncStream, poisonIDs...); err != nil { + log.Printf("[accountqueue] WARN: failed to XDEL %d poison pills: %v", len(poisonIDs), err) + } + cancel() + } + + if len(writeEntries) == 0 { + return nil + } + + // Scope a timeout to this DB write. 
writeCtx bounds connection acquisition AND + // (threaded into BatchRestoreAccounts) every GetAll/ExecAll inside the write. + writeCtx, writeCancel := context.WithTimeout(context.Background(), cfg.DBWriteTimeout) + defer writeCancel() + + // Acquire explicitly and return on processBatch exit — NOT via + // GetAccountConnectionandPutBack. That helper's auto-return goroutine fires when + // writeCtx hits its deadline, which can recycle the connection back into the pool + // while a multi-chunk BatchRestoreAccounts is still issuing gRPC on it (data race). + conn, err := DB_OPs.GetAccountsConnections(writeCtx) + if err != nil { + return fmt.Errorf("get account DB connection: %w", err) + } + defer DB_OPs.PutAccountsConnection(conn) + + // Write in sub-batches to bound individual ImmuDB commit size. + // All chunks must succeed before any ACK is issued. + start := time.Now() + for i := 0; i < len(writeEntries); i += cfg.MaxAccountsPerBatch { + end := i + cfg.MaxAccountsPerBatch + if end > len(writeEntries) { + end = len(writeEntries) + } + if err := DB_OPs.BatchRestoreAccounts(writeCtx, conn, writeEntries[i:end]); err != nil { + return fmt.Errorf("BatchRestoreAccounts chunk [%d:%d] of %d: %w", i, end, len(writeEntries), err) + } + } + commitDur := time.Since(start) + + // All sub-batches succeeded — ACK + XDEL in one pipeline round-trip. + // XACK removes entries from the PEL; XDEL removes the payload from the stream body. + // Without XDEL, ACKed entries accumulate in the stream indefinitely. + // Replay safety: BatchRestoreAccounts is LWW-idempotent if ACK fails and entries replay. 
+ ackCtx, ackCancel := context.WithTimeout(context.Background(), 5*time.Second) + defer ackCancel() + if err := s.Ack(ackCtx, accountSyncStream, accountSyncGroup, goodIDs...); err != nil { + log.Printf("[accountqueue] WARN: ACK failed for %d entries after successful DB write: %v — will be reclaimed and re-written (safe, LWW)", len(goodIDs), err) + } else if err := s.Delete(ackCtx, accountSyncStream, goodIDs...); err != nil { + log.Printf("[accountqueue] WARN: XDEL failed for %d entries after ACK: %v", len(goodIDs), err) + } else { + log.Printf("[accountqueue] wrote %d accounts from %d entries in %s; ACKed + XDELed", + len(writeEntries), len(goodIDs), commitDur.Round(time.Millisecond)) + } + + return nil +} + +// ─── Payload parsers ───────────────────────────────────────────────────────── + +// parseAccountsPayload deserializes a payloadTypeAccounts JSON blob into a flat +// list of DB write entries ready for BatchRestoreAccounts. +// +// Time: O(N) where N = number of accounts in the payload. +// Space: O(N) — one dbEntry per account. +func parseAccountsPayload(dataStr string) ([]dbEntry, error) { + var accs []*types.Account + if err := json.Unmarshal([]byte(dataStr), &accs); err != nil { + return nil, fmt.Errorf("unmarshal []*types.Account: %w", err) + } + + // We might emit up to 2 entries per account (address: and did:) + entries := make([]dbEntry, 0, len(accs)*2) + for _, acc := range accs { + if acc == nil { + continue + } + dbAcc := &DB_OPs.Account{ + DIDAddress: acc.DIDAddress, + Address: acc.Address, + Balance: acc.Balance, + Nonce: acc.Nonce, + AccountType: acc.AccountType, + CreatedAt: acc.CreatedAt, + UpdatedAt: acc.UpdatedAt, + Metadata: acc.Metadata, + } + val, err := json.Marshal(dbAcc) + if err != nil { + return nil, fmt.Errorf("marshal DB_OPs.Account for address %s: %w", acc.Address.Hex(), err) + } + + // 1. Emit the primary address key + entries = append(entries, dbEntry{ + Key: DB_OPs.Prefix + acc.Address.Hex(), + Value: val, + }) + + // 2. 
Emit the DID key so BatchRestoreAccounts creates the bound reference + if acc.DIDAddress != "" { + entries = append(entries, dbEntry{ + Key: DB_OPs.DIDPrefix + acc.DIDAddress, + Value: val, + }) + } + } + return entries, nil +} + +// parseUpdatesPayload deserializes a payloadTypeUpdates JSON blob into a flat list +// of DB write entries ready for BatchRestoreAccounts. +// Reads accountUpdateWire (not types.AccountUpdate) to avoid big.Int JSON ambiguity. +// +// Time: O(N) where N = number of updates in the payload. +// Space: O(N) — one dbEntry per update. +func parseUpdatesPayload(dataStr string) ([]dbEntry, error) { + var wires []accountUpdateWire + if err := json.Unmarshal([]byte(dataStr), &wires); err != nil { + return nil, fmt.Errorf("unmarshal []accountUpdateWire: %w", err) + } + entries := make([]dbEntry, 0, len(wires)) + for _, w := range wires { + balance := new(big.Int) + if _, ok := balance.SetString(w.NewBalance, 10); !ok { + return nil, fmt.Errorf("invalid decimal balance %q for address %s", w.NewBalance, w.Address) + } + addr := common.HexToAddress(w.Address) + dbAcc := &DB_OPs.Account{ + DIDAddress: w.Address, + Address: addr, + Balance: balance.String(), + Nonce: w.Nonce, + AccountType: "user", + UpdatedAt: time.Now().UTC().UnixNano(), + } + val, err := json.Marshal(dbAcc) + if err != nil { + return nil, fmt.Errorf("marshal DB_OPs.Account for address %s: %w", w.Address, err) + } + entries = append(entries, dbEntry{ + Key: DB_OPs.Prefix + addr.Hex(), + Value: val, + }) + } + return entries, nil +} diff --git a/DB_OPs/Nodeinfo/immudb_account_manager.go b/DB_OPs/Nodeinfo/immudb_account_manager.go index d6eafb8a..7a6f9984 100644 --- a/DB_OPs/Nodeinfo/immudb_account_manager.go +++ b/DB_OPs/Nodeinfo/immudb_account_manager.go @@ -3,17 +3,85 @@ package NodeInfo import ( "context" "encoding/json" + "errors" "fmt" "math/big" + "sort" + "strings" "time" "github.com/JupiterMetaLabs/JMDN-FastSync/common/types" "github.com/ethereum/go-ethereum/common" 
"gossipnode/DB_OPs" + "gossipnode/config" ) type account_manager struct{} +// ─── Bounded enqueue (producer side) ────────────────────────────────────────── +// +// The library's AccountSync receive path (sync_protocols.go HandleAccountsSyncData) +// accumulates every page of a sync session and calls WriteAccounts ONCE at EOF with +// the whole batch — potentially millions of records. Packing that into a single XADD +// risks exceeding Redis proto-max-bulk-len (512 MiB) and stalls/fails the enqueue; a +// failed enqueue at EOF (after all pages were ACKed) collapses the session and drives +// the dispatcher into a retry→dead-letter storm. We split into fixed-size messages so +// every XADD is small and fast, and the worker's per-drain memory stays bounded. + +// maxRecordsPerMessage caps how many account/update records are packed into one Redis +// stream message (one XADD). 500 mirrors AccountSyncWorkerConfig.MaxAccountsPerBatch so +// a single message maps to roughly one ImmuDB sub-batch; at ~300 B/record a message is +// ~150 KB — three orders of magnitude under Redis's 512 MiB bulk limit. +const maxRecordsPerMessage = 500 + +// enqueueTimeout scales the enqueue deadline with chunk count: a 10 s base plus 5 ms per +// chunk covers large syncs (e.g. 2000 chunks → ~20 s) without an unbounded wait. The +// server is not blocked on this enqueue (pages were already ACKed), so a generous, +// bounded budget is safe. +// +// Time: O(1) +func enqueueTimeout(chunks int) time.Duration { + return 10*time.Second + time.Duration(chunks)*5*time.Millisecond +} + +// enqueueRecordsChunked splits items into chunks of at most maxRecordsPerMessage, +// marshals each chunk to JSON, and XADDs it to the account sync stream tagged ptype. +// Best-effort: every chunk is attempted and errors are aggregated (errors.Join), so a +// single transient XADD failure does not drop the remaining chunks. 
Any chunk that +// fails to enqueue is backfilled by the worker's LWW write on a later sync / +// reconciliation — strictly safer than the previous all-or-nothing single message. +// +// Time: O(N) marshal + O(ceil(N/maxRecordsPerMessage)) XADD round trips, N = len(items). +// Space: O(maxRecordsPerMessage) per message — never the whole batch at once. +// DS: input []T re-sliced in place into fixed-size windows; no intermediate copy. +func enqueueRecordsChunked[T any](ctx context.Context, s RedisStreamer, ptype syncPayloadType, items []T) error { + var errs []error + for start := 0; start < len(items); start += maxRecordsPerMessage { + end := start + maxRecordsPerMessage + if end > len(items) { + end = len(items) + } + data, err := json.Marshal(items[start:end]) + if err != nil { + errs = append(errs, fmt.Errorf("marshal chunk [%d:%d]: %w", start, end, err)) + continue + } + if _, err := s.Enqueue(ctx, accountSyncStream, map[string]any{ + "type": string(ptype), + "data": string(data), + }); err != nil { + errs = append(errs, fmt.Errorf("enqueue chunk [%d:%d]: %w", start, end, err)) + } + } + return errors.Join(errs...) +} + +// chunkCount returns ceil(n / maxRecordsPerMessage): the number of messages needed to carry n records. +// Time: O(1) +func chunkCount(n int) int { + return (n + maxRecordsPerMessage - 1) / maxRecordsPerMessage +} + // Time Complexity: O(N) where N is the total number of transactions scanned or retrieved func (am *account_manager) GetTransactionsForAccount(accountAddress string) ([]types.DBTransaction, error) { ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) @@ -30,18 +98,9 @@ func (am *account_manager) GetTransactionsForAccount(accountAddress string) ([]t return nil, fmt.Errorf("failed to get transactions by account: %w", err) } - // Serialize and deserialize to map config.Transaction to types.DBTransaction. 
- // The JSON tags match between config.Transaction and types.Transaction (embedded in DBTransaction), - // so core fields are preserved. DB-specific fields (BlockNumber, TxIndex, CreatedAt) will be zero-valued. - var result []types.DBTransaction + result := make([]types.DBTransaction, 0, len(cfgTxs)) for _, tx := range cfgTxs { - b, err := json.Marshal(tx) - if err == nil { - var dbTx types.DBTransaction - if json.Unmarshal(b, &dbTx) == nil { - result = append(result, dbTx) - } - } + result = append(result, configTxToDBTx(tx)) } return result, nil } @@ -59,6 +118,9 @@ func (am *account_manager) GetAccountBalance(accountAddress string) (*big.Int, u addr := common.HexToAddress(accountAddress) acc, err := DB_OPs.GetAccount(conn, addr) if err != nil { + if strings.Contains(err.Error(), "key not found") { + return big.NewInt(0), 0, nil + } return nil, 0, fmt.Errorf("failed to get account: %w", err) } @@ -81,6 +143,9 @@ func (am *account_manager) UpdateAccountBalance(accountAddress string, balance * doc, err := DB_OPs.GetAccount(conn, addr) if err != nil { + if strings.Contains(err.Error(), "key not found") { + return am.CreateAccount(accountAddress, balance, nonce) + } return fmt.Errorf("failed to get account for update: %w", err) } @@ -133,44 +198,262 @@ func (am *account_manager) CreateAccount(accountAddress string, balance *big.Int return nil } -// Time Complexity: O(N) where N is the number of updates -func (am *account_manager) BatchUpdateAccounts(updates []types.AccountUpdate) error { - ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) +// Time Complexity: O(1) +func (am *account_manager) GetAccountByAddress(accountAddress string) (*types.Account, error) { + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) defer cancel() conn, err := DB_OPs.GetAccountConnectionandPutBack(ctx) if err != nil { - return fmt.Errorf("failed to get account DB connection: %w", err) + return nil, fmt.Errorf("failed to get account DB 
connection: %w", err) } - var entries []struct { - Key string - Value []byte - } + // Strip "address:" DB key prefix if present — the external FastSync module may pass + // DB key format; common.HexToAddress expects bare hex (0x... or unprefixed). + accountAddress = strings.TrimPrefix(accountAddress, DB_OPs.Prefix) - for _, u := range updates { - addr := common.HexToAddress(u.Address) - acc := &DB_OPs.Account{ - DIDAddress: u.Address, - Address: addr, - Balance: u.NewBalance.String(), - Nonce: u.Nonce, - AccountType: "user", - UpdatedAt: time.Now().UTC().UnixNano(), + addr := common.HexToAddress(accountAddress) + acc, err := DB_OPs.GetAccount(conn, addr) + if err != nil { + if strings.Contains(err.Error(), "key not found") { + return nil, nil } + return nil, fmt.Errorf("failed to get account: %w", err) + } + return dbOpsToTypes(acc), nil +} + +// WriteAccounts enqueues accounts to the Redis stream for async DB write, split into +// fixed-size messages of at most maxRecordsPerMessage (see enqueueRecordsChunked). +// Returns immediately after the enqueue — the caller gets an ACK without waiting for +// the ImmuDB commit (which can take up to 15 s under load). +// +// The library hands this the entire end-of-stream batch (up to millions of accounts); +// chunking keeps each XADD small so it never exceeds Redis's bulk-string limit and the +// enqueue cannot fail the whole session. Enqueue is best-effort across chunks: a +// partial failure returns an aggregated error but does not drop successful chunks; the +// worker's LWW write backfills the rest on a later sync. +// +// StartAccountSyncWorker must be called before WriteAccounts or this returns an error. +// At-least-once delivery is guaranteed by the worker via PEL + XAUTOCLAIM. +// +// Time: O(N) serialization + O(ceil(N/maxRecordsPerMessage)) XADD round trips, N = len(accounts). 
+func (am *account_manager) WriteAccounts(accounts []*types.Account) error { + if len(accounts) == 0 { + return nil + } + s, mgr := getAccountQueue() + if s == nil { + return fmt.Errorf("WriteAccounts: account queue not initialized; call StartAccountSyncWorker before use") + } + mgr.EnsureActive() + + chunks := chunkCount(len(accounts)) + ctx, cancel := context.WithTimeout(context.Background(), enqueueTimeout(chunks)) + defer cancel() + if err := enqueueRecordsChunked(ctx, s, payloadTypeAccounts, accounts); err != nil { + return fmt.Errorf("WriteAccounts: enqueue %d accounts in %d messages: %w", len(accounts), chunks, err) + } + return nil +} + +// NewAccountNonceIterator returns a cursor-based iterator over all accounts. +// Each NextBatch call advances a seekKey cursor — O(N) total scan across all batches. +func (am *account_manager) NewAccountNonceIterator(batchSize int) types.AccountNonceIterator { + return &immudbNonceIter{ + batchSize: batchSize, + } +} + +// ─── immudbNonceIter ───────────────────────────────────────────────────────── + +// MODULE: DB_OPs/Nodeinfo (immudbNonceIter) +// PURPOSE: cursor-based iterator that pages all accounts from ImmuDB in ascending key order. +// +// CORE DATA STRUCTURES: +// - lastKey []byte: scan cursor — key of the last returned account; nil = start of DB. +// Fixed size (one key). Threaded across NextBatch calls so each call resumes where the +// previous left off instead of restarting from key 0. +// +// DO NOT: +// - Replace lastKey with an offset int — that restarts the scan from key 0 each call (O(N²)). +// - Add an in-memory account cache on this struct — 2.7M entries exhaust heap during sync. 
+ +type immudbNonceIter struct { + batchSize int + lastKey []byte // scan cursor: key of last returned account, nil = start + done bool +} - val, err := json.Marshal(acc) +// Time: O(1) +func (it *immudbNonceIter) TotalAccounts() (uint64, error) { + count, err := DB_OPs.CountAccounts(nil) + return uint64(count), err +} + +// Time: O(batchSize) ImmuDB entries; Space: O(batchSize) +func (it *immudbNonceIter) NextBatch() ([]*types.Account, error) { + if it.done { + return nil, nil + } + + accs, lastKey, err := DB_OPs.ListAccountsPaginatedFrom(nil, it.batchSize, it.lastKey, "") + if err != nil { + return nil, fmt.Errorf("account nonce iterator: %w", err) + } + if len(accs) == 0 { + it.done = true + return nil, nil + } + + result := make([]*types.Account, len(accs)) + for i, acc := range accs { + result[i] = dbOpsToTypes(acc) + } + + sort.Slice(result, func(i, j int) bool { + return result[i].Nonce < result[j].Nonce + }) + + it.lastKey = lastKey + if len(accs) < it.batchSize { + it.done = true + } + return result, nil +} + +// GetAccountsByNonces scans all accounts once via cursor to find those matching the given nonces. 
+// Time: O(N) where N = total accounts; Space: O(|nonces|) +func (it *immudbNonceIter) GetAccountsByNonces(nonces []uint64) ([]*types.Account, error) { + if len(nonces) == 0 { + return nil, nil + } + + nonceSet := make(map[uint64]struct{}, len(nonces)) + for _, n := range nonces { + nonceSet[n] = struct{}{} + } + + result := make([]*types.Account, 0, len(nonces)) + var seekKey []byte + + for { + accs, lastKey, err := DB_OPs.ListAccountsPaginatedFrom(nil, 1000, seekKey, "") if err != nil { - return fmt.Errorf("failed to marshal account %s: %w", u.Address, err) + return nil, fmt.Errorf("GetAccountsByNonces scan: %w", err) + } + if len(accs) == 0 { + break } - entries = append(entries, struct { - Key string - Value []byte - }{ - Key: DB_OPs.Prefix + addr.Hex(), - Value: val, - }) + for _, acc := range accs { + ta := dbOpsToTypes(acc) + if _, ok := nonceSet[ta.Nonce]; ok { + result = append(result, ta) + if len(result) == len(nonces) { + return result, nil + } + } + } + if lastKey == nil || len(accs) < 1000 { + break + } + seekKey = lastKey } + return result, nil +} + +func (it *immudbNonceIter) Close() {} - return DB_OPs.BatchRestoreAccounts(conn, entries) +// ─── helpers ───────────────────────────────────────────────────────────────── + +func dbOpsToTypes(acc *DB_OPs.Account) *types.Account { + return &types.Account{ + DIDAddress: acc.DIDAddress, + Address: acc.Address, + Balance: acc.Balance, + Nonce: acc.Nonce, + AccountType: acc.AccountType, + CreatedAt: acc.CreatedAt, + UpdatedAt: acc.UpdatedAt, + Metadata: acc.Metadata, + } +} + +// BatchUpdateAccounts enqueues account balance/nonce updates to the Redis stream for +// async DB write, split into fixed-size messages of at most maxRecordsPerMessage. +// Returns immediately after the enqueue. Best-effort across chunks (see WriteAccounts). +// +// StartAccountSyncWorker must be called before BatchUpdateAccounts or this returns an error. +// At-least-once delivery is guaranteed by the worker via PEL + XAUTOCLAIM. 
+// +// Time: O(N) serialization + O(ceil(N/maxRecordsPerMessage)) XADD round trips, N = len(updates). +func (am *account_manager) BatchUpdateAccounts(updates []types.AccountUpdate) error { + if len(updates) == 0 { + return nil + } + s, mgr := getAccountQueue() + if s == nil { + return fmt.Errorf("BatchUpdateAccounts: account queue not initialized; call StartAccountSyncWorker before use") + } + mgr.EnsureActive() + // Convert to wire type for stable JSON serialization. + // big.Int.String() produces a decimal string; accountUpdateWire makes the format explicit. + wires := make([]accountUpdateWire, len(updates)) + for i, u := range updates { + wires[i] = accountUpdateWire{ + Address: u.Address, + NewBalance: u.NewBalance.String(), + Nonce: u.Nonce, + } + } + + chunks := chunkCount(len(wires)) + ctx, cancel := context.WithTimeout(context.Background(), enqueueTimeout(chunks)) + defer cancel() + if err := enqueueRecordsChunked(ctx, s, payloadTypeUpdates, wires); err != nil { + return fmt.Errorf("BatchUpdateAccounts: enqueue %d updates in %d messages: %w", len(updates), chunks, err) + } + return nil +} + +// configTxToDBTx converts a config.Transaction to types.DBTransaction via direct field copy. +// DB-specific fields (BlockNumber, TxIndex, CreatedAt) are zero-valued — not available from config.Transaction. +func configTxToDBTx(tx *config.Transaction) types.DBTransaction { + return types.DBTransaction{ + Transaction: types.Transaction{ + Hash: tx.Hash, + From: tx.From, + To: tx.To, + Value: tx.Value, + Type: tx.Type, + Timestamp: tx.Timestamp, + ChainID: tx.ChainID, + Nonce: tx.Nonce, + GasLimit: tx.GasLimit, + GasPrice: tx.GasPrice, + MaxFee: tx.MaxFee, + MaxPriorityFee: tx.MaxPriorityFee, + Data: tx.Data, + AccessList: configAccessListToTypes(tx.AccessList), + V: tx.V, + R: tx.R, + S: tx.S, + }, + } +} + +// configAccessListToTypes converts config.AccessList to types.AccessList. +// Both are structurally identical but defined in separate packages. 
+func configAccessListToTypes(al config.AccessList) types.AccessList { + if len(al) == 0 { + return nil + } + result := make(types.AccessList, len(al)) + for i, t := range al { + result[i] = types.AccessTuple{ + Address: t.Address, + StorageKeys: t.StorageKeys, + } + } + return result } diff --git a/DB_OPs/Nodeinfo/immudb_adapter.go b/DB_OPs/Nodeinfo/immudb_adapter.go index 2021b6e1..2c4e1313 100644 --- a/DB_OPs/Nodeinfo/immudb_adapter.go +++ b/DB_OPs/Nodeinfo/immudb_adapter.go @@ -2,12 +2,14 @@ package NodeInfo import ( "context" + "encoding/json" "log" "time" + "gossipnode/DB_OPs" + "github.com/JupiterMetaLabs/JMDN-FastSync/common/checksum/checksum_priorsync" "github.com/JupiterMetaLabs/JMDN-FastSync/common/types" - "gossipnode/DB_OPs" ) const ChecksumVersion = 2 @@ -23,18 +25,18 @@ func NewSyncStruct() types.BlockInfo { // Time Complexity: O(1) mostly, bounded by network round trip to ImmuDB. // GetBlockNumber retrieves the latest block number from the main ImmuDB. func (sync *sync_struct) GetBlockNumber() uint64 { - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) // Increased timeout defer cancel() conn, err := DB_OPs.GetMainDBConnectionandPutBack(ctx) if err != nil { - log.Printf("Error getting main DB connection for latest block number: %v", err) + log.Printf("[NodeInfo] ERROR: Failed to get main DB connection for block number: %v", err) return 0 } num, err := DB_OPs.GetLatestBlockNumber(conn) if err != nil { - log.Printf("Error getting latest block number from ImmuDB: %v", err) + log.Printf("[NodeInfo] ERROR: GetLatestBlockNumber failed: %v. Attempting manual reconciliation.", err) return 0 } return num @@ -58,6 +60,15 @@ func (sync *sync_struct) GetBlockDetails() types.PriorSync { return types.PriorSync{} } + // SyncConfirmation needs the actual highest block in DB (headers written by + // HeaderSync), not just the DataSync marker. Use whichever is higher. 
if headerLatestBytes, readErr := DB_OPs.Read(conn, "header_latest_block"); readErr == nil { + var headerLatest uint64 + if jsonErr := json.Unmarshal(headerLatestBytes, &headerLatest); jsonErr == nil && headerLatest > latestNum { + latestNum = headerLatest + } + } + latestBlock, err := DB_OPs.GetZKBlockByNumber(conn, latestNum) if err != nil { log.Printf("Error getting latest block details: %v", err) diff --git a/DB_OPs/Nodeinfo/immudb_block_nonheaders.go b/DB_OPs/Nodeinfo/immudb_block_nonheaders.go index ef79e867..a9b57c81 100644 --- a/DB_OPs/Nodeinfo/immudb_block_nonheaders.go +++ b/DB_OPs/Nodeinfo/immudb_block_nonheaders.go @@ -78,11 +78,12 @@ func convertZKBlockToNonHeaders(b *config.ZKBlock) *blockpb.NonHeaders { for idx, tx := range b.Transactions { pbTx := &blockpb.Transaction{ - Hash: tx.Hash[:], - Type: uint32(tx.Type), - Nonce: tx.Nonce, - GasLimit: tx.GasLimit, - Data: tx.Data, + Hash: tx.Hash[:], + Type: uint32(tx.Type), + Timestamp: tx.Timestamp, + Nonce: tx.Nonce, + GasLimit: tx.GasLimit, + Data: tx.Data, } if tx.From != nil { pbTx.From = tx.From[:] @@ -93,6 +94,9 @@ func convertZKBlockToNonHeaders(b *config.ZKBlock) *blockpb.NonHeaders { if tx.Value != nil { pbTx.Value = tx.Value.Bytes() } + if tx.ChainID != nil { + pbTx.ChainId = tx.ChainID.Bytes() + } if tx.GasPrice != nil { pbTx.GasPrice = tx.GasPrice.Bytes() } @@ -102,6 +106,15 @@ func convertZKBlockToNonHeaders(b *config.ZKBlock) *blockpb.NonHeaders { if tx.MaxPriorityFee != nil { pbTx.MaxPriorityFee = tx.MaxPriorityFee.Bytes() } + for _, at := range tx.AccessList { + pbAT := &blockpb.AccessTuple{ + Address: at.Address[:], + } + for _, sk := range at.StorageKeys { + pbAT.StorageKeys = append(pbAT.StorageKeys, sk[:]) + } + pbTx.AccessList = append(pbTx.AccessList, pbAT) + } if tx.V != nil { pbTx.V = tx.V.Bytes() } 
diff --git a/DB_OPs/Nodeinfo/immudb_blockheader_iterator.go b/DB_OPs/Nodeinfo/immudb_blockheader_iterator.go index 036b6cc5..21ce6c34 100644 --- a/DB_OPs/Nodeinfo/immudb_blockheader_iterator.go +++ b/DB_OPs/Nodeinfo/immudb_blockheader_iterator.go @@ -45,6 +45,7 @@ func (i *dbBlockHeaderIterator) GetBlockHeaders(blocknumbers []uint64) ([]*block GasLimit: b.GasLimit, GasUsed: b.GasUsed, BlockNumber: b.BlockNumber, + LogsBloom: b.LogsBloom, } if b.CoinbaseAddr != nil { h.CoinbaseAddr = b.CoinbaseAddr[:] @@ -87,6 +88,7 @@ func (i *dbBlockHeaderIterator) GetBlockHeadersRange(start, end uint64) ([]*bloc GasLimit: b.GasLimit, GasUsed: b.GasUsed, BlockNumber: b.BlockNumber, + LogsBloom: b.LogsBloom, } if b.CoinbaseAddr != nil { h.CoinbaseAddr = b.CoinbaseAddr[:] diff --git a/DB_OPs/Nodeinfo/immudb_data_writer.go b/DB_OPs/Nodeinfo/immudb_data_writer.go index 336a91c8..d5a04b97 100644 --- a/DB_OPs/Nodeinfo/immudb_data_writer.go +++ b/DB_OPs/Nodeinfo/immudb_data_writer.go @@ -2,14 +2,17 @@ package NodeInfo import ( "context" + "fmt" "math/big" + "strings" "time" + "gossipnode/DB_OPs" + "gossipnode/config" + blockpb "github.com/JupiterMetaLabs/JMDN-FastSync/common/proto/block" "github.com/JupiterMetaLabs/JMDN-FastSync/common/types" "github.com/ethereum/go-ethereum/common" - "gossipnode/config" - "gossipnode/DB_OPs" ) type DataWriter struct{} @@ -65,10 +68,11 @@ func (dw *DataWriter) WriteData(data []*blockpb.NonHeaders) error { } cfgTx := config.Transaction{ - Type: uint8(tx.Type), - Nonce: tx.Nonce, - GasLimit: tx.GasLimit, - Data: tx.Data, + Type: uint8(tx.Type), + Timestamp: tx.Timestamp, + Nonce: 
tx.Nonce, + GasLimit: tx.GasLimit, + Data: tx.Data, } if len(tx.Hash) > 0 { @@ -85,6 +89,9 @@ func (dw *DataWriter) WriteData(data []*blockpb.NonHeaders) error { if len(tx.Value) > 0 { cfgTx.Value = new(big.Int).SetBytes(tx.Value) } + if len(tx.ChainId) > 0 { + cfgTx.ChainID = new(big.Int).SetBytes(tx.ChainId) + } if len(tx.GasPrice) > 0 { cfgTx.GasPrice = new(big.Int).SetBytes(tx.GasPrice) } @@ -94,6 +101,18 @@ func (dw *DataWriter) WriteData(data []*blockpb.NonHeaders) error { if len(tx.MaxPriorityFee) > 0 { cfgTx.MaxPriorityFee = new(big.Int).SetBytes(tx.MaxPriorityFee) } + if len(tx.AccessList) > 0 { + cfgTx.AccessList = make(config.AccessList, 0, len(tx.AccessList)) + for _, pbAT := range tx.AccessList { + at := config.AccessTuple{ + Address: common.BytesToAddress(pbAT.Address), + } + for _, sk := range pbAT.StorageKeys { + at.StorageKeys = append(at.StorageKeys, common.BytesToHash(sk)) + } + cfgTx.AccessList = append(cfgTx.AccessList, at) + } + } if len(tx.V) > 0 { cfgTx.V = new(big.Int).SetBytes(tx.V) } @@ -103,6 +122,20 @@ func (dw *DataWriter) WriteData(data []*blockpb.NonHeaders) error { if len(tx.S) > 0 { cfgTx.S = new(big.Int).SetBytes(tx.S) } + if len(tx.ChainId) > 0 { + cfgTx.ChainID = new(big.Int).SetBytes(tx.ChainId) + } + if len(tx.AccessList) > 0 { + for _, al := range tx.AccessList { + cfgAl := config.AccessTuple{ + Address: common.BytesToAddress(al.Address), + } + for _, sk := range al.StorageKeys { + cfgAl.StorageKeys = append(cfgAl.StorageKeys, common.BytesToHash(sk)) + } + cfgTx.AccessList = append(cfgTx.AccessList, cfgAl) + } + } txs = append(txs, cfgTx) } @@ -112,7 +145,37 @@ func (dw *DataWriter) WriteData(data []*blockpb.NonHeaders) error { } if err := DB_OPs.StoreZKBlock(conn, b); err != nil { - return err + // if err is not nil, force write or update + if strings.Contains(err.Error(), "already exists") { + blockKey := fmt.Sprintf("%s%d", DB_OPs.PREFIX_BLOCK, b.BlockNumber) + if err2 := DB_OPs.Update(blockKey, b); err2 != nil { +
return fmt.Errorf("force update block %d failed: %w", b.BlockNumber, err2) + } + + hashKey := fmt.Sprintf("%s%s", DB_OPs.PREFIX_BLOCK_HASH, b.BlockHash.Hex()) + if err2 := DB_OPs.Update(hashKey, blockKey); err2 != nil { + return fmt.Errorf("force update hash mapping failed: %w", err2) + } + + if err2 := DB_OPs.Update("latest_block", b.BlockNumber); err2 != nil { + return fmt.Errorf("force update latest block failed: %w", err2) + } + + // Write tx: → blockNumber index for each transaction. + // WriteHeaders stores blocks without transactions, so StoreZKBlock's tx + // indexing loop runs 0 times there. This is the only place those index + // entries get written — required for GetTransactionByHash to work. + for _, tx := range b.Transactions { + txKey := fmt.Sprintf("%s%s", DB_OPs.DEFAULT_PREFIX_TX, tx.Hash) + if err2 := DB_OPs.Create(conn, txKey, b.BlockNumber); err2 != nil { + if !strings.Contains(err2.Error(), "already exists") { + return fmt.Errorf("store tx index for %s: %w", tx.Hash, err2) + } + } + } + } else { + return err + } } } diff --git a/DB_OPs/Nodeinfo/immudb_headers_writer.go b/DB_OPs/Nodeinfo/immudb_headers_writer.go index 5af7a9ad..16ee0631 100644 --- a/DB_OPs/Nodeinfo/immudb_headers_writer.go +++ b/DB_OPs/Nodeinfo/immudb_headers_writer.go @@ -2,6 +2,8 @@ package NodeInfo import ( "context" + "fmt" + "strings" "time" "github.com/JupiterMetaLabs/JMDN-FastSync/common/proto/block" @@ -32,6 +34,13 @@ func (hw *HeadersWriter) WriteHeaders(headers []*block.Header) error { return err } + // Snapshot latest_block before writing any headers. + // HeaderSync writes skeleton blocks (no transactions) so it must not advance + // the latest_block marker — that would make the explorer and StartupSync think + // the node is fully synced up to the last header, when DataSync hasn't run yet. + // We restore this value after all headers are written. 
+ prevLatest, prevErr := DB_OPs.GetLatestBlockNumber(conn) + for _, h := range headers { b := &config.ZKBlock{ BlockNumber: h.BlockNumber, @@ -41,9 +50,10 @@ func (hw *HeadersWriter) WriteHeaders(headers []*block.Header) error { TxnsRoot: h.TxnsRoot, ExtraData: h.ExtraData, GasLimit: h.GasLimit, - GasUsed: h.GasUsed, + GasUsed: h.GasUsed, + LogsBloom: h.LogsBloom, } - + if len(h.StateRoot) > 0 { b.StateRoot = common.BytesToHash(h.StateRoot) } @@ -61,13 +71,50 @@ func (hw *HeadersWriter) WriteHeaders(headers []*block.Header) error { addr := common.BytesToAddress(h.ZkvmAddr) b.ZKVMAddr = &addr } - + err := DB_OPs.StoreZKBlock(conn, b) if err != nil { - return err + if strings.Contains(err.Error(), "already exists") { + blockKey := fmt.Sprintf("%s%d", DB_OPs.PREFIX_BLOCK, b.BlockNumber) + if err2 := DB_OPs.Update(blockKey, b); err2 != nil { + return fmt.Errorf("force update block %d failed: %w", b.BlockNumber, err2) + } + + hashKey := fmt.Sprintf("%s%s", DB_OPs.PREFIX_BLOCK_HASH, b.BlockHash.Hex()) + if err2 := DB_OPs.Update(hashKey, blockKey); err2 != nil { + return fmt.Errorf("force update hash mapping failed: %w", err2) + } + + // Do NOT update latest_block here — DataSync owns the marker. + } else { + return err + } } } - + + // Update header_latest_block so SyncConfirmation can build the correct Merkle + // range. This is separate from latest_block (which DataSync owns) so the + // explorer still shows only fully data-synced blocks. + if len(headers) > 0 { + highestWritten := headers[0].BlockNumber + for _, h := range headers[1:] { + if h.BlockNumber > highestWritten { + highestWritten = h.BlockNumber + } + } + if err2 := DB_OPs.Update("header_latest_block", highestWritten); err2 != nil { + return fmt.Errorf("update header_latest_block failed: %w", err2) + } + } + + // Restore latest_block to the pre-HeaderSync value so the marker always + // reflects the last fully data-synced block, not just the last header. 
+ if prevErr == nil { + if err2 := DB_OPs.Update("latest_block", prevLatest); err2 != nil { + return fmt.Errorf("restore latest_block after HeaderSync failed: %w", err2) + } + } + return nil } diff --git a/DB_OPs/account_immuclient.go b/DB_OPs/account_immuclient.go index ca2301ae..a9f29c60 100644 --- a/DB_OPs/account_immuclient.go +++ b/DB_OPs/account_immuclient.go @@ -56,13 +56,36 @@ func (s *AccountsSet) Add(address common.Address) { s.Accounts[address.Hex()] = nil } -// Get the Nonce of a account - NTF -var counter uint64 +// lastNonce is used to guarantee monotonic nanosecond timestamps for PutNonceofAccount. +var lastNonce atomic.Uint64 + +// PutNonceofAccount generates a unique epoch ID for new accounts. +// +// HISTORICAL BUG (Fixed): Previously computed as `uint64(UnixNano) << 16 | counter`, +// which silently overflowed uint64 and corrupted the embedded timestamp. +// +// FIX (Option C): We now use a pure monotonic nanosecond counter. It returns +// exact UnixNano precision, gracefully bumping by +1ns on extreme collisions. +// +// LIFECYCLE WARNING: The `Nonce` field in the Account struct serves a dual purpose: +// 1. On creation: It stores this unique nanosecond timestamp ID. +// 2. Post-transaction: Reconciliation and consensus overwrite it with the account's +// highest transaction nonce (e.g., 0, 1, 2...). +// Do NOT rely on Account.Nonce remaining a timestamp if the account has sent transactions! 
func PutNonceofAccount() (uint64, error) { - ts := uint64(time.Now().UTC().UnixNano()) - c := atomic.AddUint64(&counter, 1) - return ts<<16 | (c & 0xFFFF), nil // embed counter in low bits + for { + ns := uint64(time.Now().UTC().UnixNano()) + prev := lastNonce.Load() + next := ns + if next <= prev { + next = prev + 1 // same-ns collision: bump forward + } + if lastNonce.CompareAndSwap(prev, next) { + return next, nil + } + // CAS lost race against another goroutine — retry + } } // Create Account from DID and Address and Store using StoreAccount @@ -335,7 +358,7 @@ func BatchCreateAccountsOrdered(PooledConnection *config.PooledConnection, entri // BatchRestoreAccounts applies a batch of entries into accountsdb. // For address: keys it writes KV. For did: it creates a bound reference to the corresponding address key. -func BatchRestoreAccounts(PooledConnection *config.PooledConnection, entries []struct { +func BatchRestoreAccounts(ctx context.Context, PooledConnection *config.PooledConnection, entries []struct { Key string Value []byte }) error { @@ -345,12 +368,6 @@ func BatchRestoreAccounts(PooledConnection *config.PooledConnection, entries []s var err error var shouldReturnConnection bool - // Define Function wide context for timeout - ctx := context.Background() - - // End the context.Background() - defer ctx.Done() - if PooledConnection == nil || PooledConnection.Client == nil { PooledConnection, err = GetAccountConnectionandPutBack(ctx) if err != nil { @@ -386,6 +403,96 @@ func BatchRestoreAccounts(PooledConnection *config.PooledConnection, entries []s } } + // Deduplicate address entries via hash set: the sender may include the same key + // multiple times in one page. The LWW check reads the committed DB value (not the + // in-progress ops slice), so both copies would independently pass and produce a + // duplicate key in ExecAll. Build a key→entry map keeping the highest UpdatedAt, + // then flatten back to slice. 
+ { + type entry = struct { + Key string + Value []byte + } + addrSet := make(map[string]entry, len(addressEntries)) + for _, e := range addressEntries { + cur, ok := addrSet[e.Key] + if !ok { + addrSet[e.Key] = e + continue + } + var curAcc, inAcc Account + if json.Unmarshal(cur.Value, &curAcc) == nil && + json.Unmarshal(e.Value, &inAcc) == nil && + inAcc.UpdatedAt > curAcc.UpdatedAt { + addrSet[e.Key] = e + } + } + addressEntries = make([]entry, 0, len(addrSet)) + for _, e := range addrSet { + addressEntries = append(addressEntries, e) + } + } + + // Deduplicate DID entries via hash set: refs are idempotent, last occurrence wins. + { + type entry = struct { + Key string + Value []byte + } + didSet := make(map[string]entry, len(didEntries)) + for _, e := range didEntries { + didSet[e.Key] = e + } + didEntries = make([]entry, 0, len(didSet)) + for _, e := range didSet { + didEntries = append(didEntries, e) + } + } + + // Pre-fetch all existing account values in one GetAll RPC instead of N individual Gets + // during the LWW loop. Holding a connection across 3000+ sequential Gets exhausts the + // pool (max 20) when multiple dispatchWorkers run concurrently. 
+ existingAccounts := make(map[string]Account, len(addressEntries)) + { + prefetchSet := make(map[string]struct{}, len(addressEntries)+len(didEntries)) + prefetchKeys := make([][]byte, 0, len(addressEntries)+len(didEntries)) + for _, e := range addressEntries { + if _, ok := prefetchSet[e.Key]; !ok { + prefetchSet[e.Key] = struct{}{} + prefetchKeys = append(prefetchKeys, []byte(e.Key)) + } + } + for _, e := range didEntries { + var acc Account + if json.Unmarshal(e.Value, &acc) == nil { + k := fmt.Sprintf("%s%s", Prefix, acc.Address) + if _, ok := prefetchSet[k]; !ok { + prefetchSet[k] = struct{}{} + prefetchKeys = append(prefetchKeys, []byte(k)) + } + } + } + if len(prefetchKeys) > 0 { + fetchCtx, fetchCancel := context.WithTimeout(ctx, 30*time.Second) + entriesList, getAllErr := PooledConnection.Client.Client.GetAll(fetchCtx, prefetchKeys) + fetchCancel() + if getAllErr == nil && entriesList != nil { + for _, entry := range entriesList.Entries { + if entry == nil || entry.Value == nil { + continue + } + var acc Account + if json.Unmarshal(entry.Value, &acc) == nil { + existingAccounts[string(entry.Key)] = acc + } + } + } + // GetAll failure is treated as "all accounts are new" — safe degradation; + // worst case we write data that LWW would have skipped, but correctness + // is preserved because ImmuDB is append-only and the node re-syncs on divergence. 
+ } + } + // Build a map of address keys being written in this batch for quick lookup addressKeysInBatch := make(map[string]bool) for _, e := range addressEntries { @@ -412,49 +519,57 @@ func BatchRestoreAccounts(PooledConnection *config.PooledConnection, entries []s var shouldWrite = true var incoming Account if err := json.Unmarshal(e.Value, &incoming); err == nil { - // Try read existing account - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) - entry, getErr := PooledConnection.Client.Client.Get(ctx, []byte(e.Key)) - cancel() - if getErr == nil && entry != nil && len(entry.Value) > 0 { - var existing Account - if jsonErr := json.Unmarshal(entry.Value, &existing); jsonErr == nil { - // If existing is newer, skip writing to preserve newer balance - if existing.UpdatedAt > incoming.UpdatedAt { - // Remove from batch map since we're not writing it - delete(addressKeysInBatch, e.Key) - shouldWrite = false - } else if existing.UpdatedAt == incoming.UpdatedAt { - // If timestamps are equal, only update if incoming has different balance - // This handles race conditions where sync happens during local update - if existing.Balance == incoming.Balance { - // Same timestamp and balance - skip to avoid unnecessary write - delete(addressKeysInBatch, e.Key) - shouldWrite = false - } - // Same timestamp but different balance - write it (takes newer data) + if existing, found := existingAccounts[e.Key]; found { + if existing.UpdatedAt > incoming.UpdatedAt { + delete(addressKeysInBatch, e.Key) + shouldWrite = false + } else if existing.UpdatedAt == incoming.UpdatedAt && existing.Balance == incoming.Balance { + // Same timestamp and balance - no change needed + delete(addressKeysInBatch, e.Key) + shouldWrite = false + } + if shouldWrite && existing.UpdatedAt < incoming.UpdatedAt { + loggerCtx, cancel := context.WithCancel(context.Background()) + defer cancel() + PooledConnection.Client.Logger.Debug(loggerCtx, "Updating account - incoming is newer (LWW)", 
+ ion.String("key", e.Key), + ion.Int64("existing_updated_at", existing.UpdatedAt), + ion.Int64("incoming_updated_at", incoming.UpdatedAt), + ion.String("existing_balance", existing.Balance), + ion.String("incoming_balance", incoming.Balance), + ion.String("database", config.AccountsDBName), + ion.String("created_at", time.Now().UTC().Format(time.RFC3339)), + ion.String("log_file", LOG_FILE), + ion.String("topic", TOPIC), + ion.String("function", "DB_OPs.BatchRestoreAccounts")) + } + + // FIELD MERGING: Prevent partial updates (e.g. from Reconciliation) from wiping out account metadata + if shouldWrite { + // 1. Preserve DIDAddress if incoming DID is empty or mistakenly set to the hex address + if incoming.DIDAddress == "" || incoming.DIDAddress == incoming.Address.Hex() { + incoming.DIDAddress = existing.DIDAddress } - // incoming.UpdatedAt > existing.UpdatedAt - we write the newer data - if shouldWrite && existing.UpdatedAt < incoming.UpdatedAt { - loggerCtx, cancel := context.WithCancel(context.Background()) - defer cancel() - PooledConnection.Client.Logger.Debug(loggerCtx, "Updating account - incoming is newer (LWW)", - ion.String("key", e.Key), - ion.Int64("existing_updated_at", existing.UpdatedAt), - ion.Int64("incoming_updated_at", incoming.UpdatedAt), - ion.String("existing_balance", existing.Balance), - ion.String("incoming_balance", incoming.Balance), - ion.String("database", config.AccountsDBName), - ion.String("created_at", time.Now().UTC().Format(time.RFC3339)), - ion.String("log_file", LOG_FILE), - ion.String("topic", TOPIC), - ion.String("function", "DB_OPs.BatchRestoreAccounts")) + // 2. Preserve CreatedAt + if incoming.CreatedAt == 0 { + incoming.CreatedAt = existing.CreatedAt + } + // 3. Preserve AccountType + if incoming.AccountType == "user" && existing.AccountType != "" { + incoming.AccountType = existing.AccountType + } + // 4. 
Preserve Metadata + if incoming.Metadata == nil { + incoming.Metadata = existing.Metadata + } + + // Re-serialize the merged account object to overwrite e.Value + if mergedVal, err := json.Marshal(incoming); err == nil { + e.Value = mergedVal } } - // If existing unmarshal fails, proceed with write (shouldWrite = true) } } else { - // Account doesn't exist yet - we'll create it loggerCtx, cancel := context.WithCancel(context.Background()) defer cancel() PooledConnection.Client.Logger.Debug(loggerCtx, "Creating new account during sync", @@ -495,72 +610,21 @@ func BatchRestoreAccounts(PooledConnection *config.PooledConnection, entries []s } addrKey := fmt.Sprintf("%s%s", Prefix, acc.Address) - // If address key was in batch but skipped, or not in batch at all if !addressKeysInBatch[addrKey] { - // Check if address key exists in database - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) - _, getErr := PooledConnection.Client.Client.Get(ctx, []byte(addrKey)) - cancel() - if getErr == nil { - // Address key exists in DB - create reference - didKey := []byte(e.Key) + if _, found := existingAccounts[addrKey]; found { ops = append(ops, &schema.Op{Operation: &schema.Op_Ref{Ref: &schema.ReferenceRequest{ - Key: didKey, + Key: []byte(e.Key), ReferencedKey: []byte(addrKey), AtTx: 0, BoundRef: true, }}}) } - // If getErr != nil, address key doesn't exist - skip creating orphaned reference + // addrKey not in existingAccounts → doesn't exist in DB → skip orphaned ref } - // If addressKeysInBatch[addrKey] is true, we already processed it above + // addressKeysInBatch[addrKey] == true → DID ref already appended in Pass 1 } - // Process did: keys after address: keys are updated - for _, e := range didEntries { - // For DID keys, create a reference to the address key - var acc Account - if err := json.Unmarshal(e.Value, &acc); err != nil { - // If payload is not an Account, skip creating ref to avoid corrupt data - continue - } - addrKey := 
fmt.Sprintf("%s%s", Prefix, acc.Address) - - // Check if address key is being written in this batch OR already exists in DB - // This ensures references are only created for valid address keys - shouldCreateRef := false - if addressKeysInBatch[addrKey] { - // Address key is being written in this batch - safe to create reference - shouldCreateRef = true - } else { - // Check if address key exists in database - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) - _, getErr := PooledConnection.Client.Client.Get(ctx, []byte(addrKey)) - cancel() - if getErr == nil { - // Address key exists in database - safe to create reference - shouldCreateRef = true - } - } - - if !shouldCreateRef { - // Address key doesn't exist - skip creating reference - // This can happen if address: key was skipped due to LWW or was never synced - continue - } - - didKey := []byte(e.Key) - ops = append(ops, &schema.Op{Operation: &schema.Op_Ref{Ref: &schema.ReferenceRequest{ - Key: didKey, - ReferencedKey: []byte(addrKey), - AtTx: 0, - BoundRef: true, - }}}) - } - ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) - defer cancel() if len(ops) == 0 { - // Nothing to apply (e.g., all entries skipped by LWW) -> treat as success loggerCtx, cancel := context.WithCancel(context.Background()) defer cancel() PooledConnection.Client.Logger.Debug(loggerCtx, "No operations to apply in batch restore (all skipped by LWW)", @@ -582,19 +646,32 @@ func BatchRestoreAccounts(PooledConnection *config.PooledConnection, entries []s ion.String("topic", TOPIC), ion.String("function", "DB_OPs.BatchRestoreAccounts")) - _, err = PooledConnection.Client.Client.ExecAll(ctx, &schema.ExecAllRequest{Operations: ops}) - if err != nil { - loggerCtx, cancel := context.WithCancel(context.Background()) - defer cancel() - PooledConnection.Client.Logger.Error(loggerCtx, "Batch restore ExecAll failed", - err, - ion.Int("operations_count", len(ops)), - ion.String("database", 
config.AccountsDBName), - ion.String("created_at", time.Now().UTC().Format(time.RFC3339)), - ion.String("log_file", LOG_FILE), - ion.String("topic", TOPIC), - ion.String("function", "DB_OPs.BatchRestoreAccounts")) - return fmt.Errorf("accounts batch restore failed: %w", err) + // Chunk ops to stay within ImmuDB's MaxTxEntries limit (default 1024). + // Each chunk is its own atomic transaction; LWW semantics make this safe. + const immudbMaxOpsPerTx = 1000 + for chunkStart := 0; chunkStart < len(ops); chunkStart += immudbMaxOpsPerTx { + end := chunkStart + immudbMaxOpsPerTx + if end > len(ops) { + end = len(ops) + } + chunkCtx, chunkCancel := context.WithTimeout(ctx, 30*time.Second) + _, err = PooledConnection.Client.Client.ExecAll(chunkCtx, &schema.ExecAllRequest{Operations: ops[chunkStart:end]}) + chunkCancel() + if err != nil { + loggerCtx2, cancel2 := context.WithCancel(context.Background()) + defer cancel2() + PooledConnection.Client.Logger.Error(loggerCtx2, "Batch restore ExecAll failed", + err, + ion.Int("operations_count", end-chunkStart), + ion.Int("chunk_start", chunkStart), + ion.Int("total_ops", len(ops)), + ion.String("database", config.AccountsDBName), + ion.String("created_at", time.Now().UTC().Format(time.RFC3339)), + ion.String("log_file", LOG_FILE), + ion.String("topic", TOPIC), + ion.String("function", "DB_OPs.BatchRestoreAccounts")) + return fmt.Errorf("accounts batch restore failed: %w", err) + } } loggerCtx2, cancel2 := context.WithCancel(context.Background()) @@ -1047,8 +1124,8 @@ func ListAccountsPaginated(PooledConnection *config.PooledConnection, limit, off Desc: true, // latest accounts first } ReadCtx, ReadCancel := context.WithTimeout(context.Background(), 10*time.Second) - defer ReadCancel() scanResult, err := ic.Client.Scan(ReadCtx, scanReq) + ReadCancel() if err != nil { loggerCtx, cancel := context.WithCancel(context.Background()) defer cancel() @@ -1083,7 +1160,6 @@ func ListAccountsPaginated(PooledConnection 
*config.PooledConnection, limit, off var acc Account if err := json.Unmarshal(entry.Value, &acc); err != nil { loggerCtx, cancel := context.WithCancel(context.Background()) - defer cancel() PooledConnection.Client.Logger.Warn(loggerCtx, "Skipping account due to unmarshal error", ion.String("error", err.Error()), ion.String("key", string(entry.Key)), @@ -1092,6 +1168,7 @@ func ListAccountsPaginated(PooledConnection *config.PooledConnection, limit, off ion.String("log_file", LOG_FILE), ion.String("topic", TOPIC), ion.String("function", "DB_OPs.ListAccountsPaginated")) + cancel() continue } @@ -1132,6 +1209,114 @@ func ListAccountsPaginated(PooledConnection *config.PooledConnection, limit, off return accounts, nil } +// ListAccountsPaginatedFrom retrieves up to limit accounts starting after seekKey in ascending key order. +// seekKey=nil starts from the first address: entry. Returns the accounts and the scan cursor +// (key of the last accepted account); pass it as seekKey on the next call to continue without rescanning. +// +// Time: O(limit) ImmuDB entries read; Space: O(limit) +// DS: ImmuDB ascending Scan with SeekKey cursor — no offset restart across calls. 
+func ListAccountsPaginatedFrom(PooledConnection *config.PooledConnection, limit int, seekKey []byte, extendedPrefix string) ([]*Account, []byte, error) { + var err error + var shouldReturnConnection = false + + ctx := context.Background() + + if PooledConnection == nil || PooledConnection.Client == nil { + PooledConnection, err = GetAccountConnectionandPutBack(ctx) + if err != nil { + return nil, nil, fmt.Errorf("failed to get connection from pool: %w - ListAccountsPaginatedFrom", err) + } + shouldReturnConnection = true + } + if shouldReturnConnection { + defer func() { + PutAccountsConnection(PooledConnection) + }() + } + + ic := PooledConnection.Client + if err := ensureAccountsDBSelected(PooledConnection); err != nil { + return nil, nil, fmt.Errorf("failed to ensure accounts database is selected: %w - ListAccountsPaginatedFrom", err) + } + + prefix := []byte(Prefix) + var accounts []*Account + var lastKey []byte + const internalBatch = 1000 + currentSeek := seekKey + + for len(accounts) < limit { + scanReq := &schema.ScanRequest{ + Prefix: prefix, + Limit: uint64(internalBatch), + SeekKey: currentSeek, + Desc: false, + } + + scanCtx, scanCancel := context.WithTimeout(context.Background(), 10*time.Second) + scanResult, scanErr := ic.Client.Scan(scanCtx, scanReq) + scanCancel() + + if scanErr != nil { + loggerCtx, cancel := context.WithCancel(context.Background()) + defer cancel() + ic.Logger.Error(loggerCtx, "Failed to scan for accounts", + scanErr, + ion.String("database", config.AccountsDBName), + ion.String("created_at", time.Now().UTC().Format(time.RFC3339)), + ion.String("log_file", LOG_FILE), + ion.String("topic", TOPIC), + ion.String("function", "DB_OPs.ListAccountsPaginatedFrom")) + return nil, nil, fmt.Errorf("failed to scan for accounts: %w - ListAccountsPaginatedFrom", scanErr) + } + + if len(scanResult.Entries) == 0 { + break + } + + // ImmuDB Scan is inclusive on SeekKey — skip the first entry if it is the cursor itself. 
+ startIndex := 0 + if currentSeek != nil && string(scanResult.Entries[0].Key) == string(currentSeek) { + startIndex = 1 + } + + for i := startIndex; i < len(scanResult.Entries) && len(accounts) < limit; i++ { + entry := scanResult.Entries[i] + + var acc Account + if jsonErr := json.Unmarshal(entry.Value, &acc); jsonErr != nil { + loggerCtx, cancel := context.WithCancel(context.Background()) + ic.Logger.Warn(loggerCtx, "Skipping account due to unmarshal error", + ion.String("error", jsonErr.Error()), + ion.String("key", string(entry.Key)), + ion.String("database", config.AccountsDBName), + ion.String("created_at", time.Now().UTC().Format(time.RFC3339)), + ion.String("log_file", LOG_FILE), + ion.String("topic", TOPIC), + ion.String("function", "DB_OPs.ListAccountsPaginatedFrom")) + cancel() + continue + } + + if extendedPrefix != "" && !strings.HasPrefix(acc.DIDAddress, extendedPrefix) { + continue + } + + accounts = append(accounts, &acc) + lastKey = entry.Key + } + + if len(accounts) >= limit || len(scanResult.Entries) < internalBatch { + break + } + + // Advance cursor to the end of this scan batch. + currentSeek = scanResult.Entries[len(scanResult.Entries)-1].Key + } + + return accounts, lastKey, nil +} + // CountAccounts returns the total number of Accounts in the database. // This implementation scans keys without loading them all into memory. 
func CountAccounts(PooledConnection *config.PooledConnection) (int, error) { @@ -1214,10 +1399,9 @@ func GetTransactionsByAccount(PooledConnection *config.PooledConnection, account // Process current batch of blocks for i := startBlock; i <= endBlock; i++ { - block, err := GetZKBlockByNumber(PooledConnection, i) + block, err := GetZKBlockByNumberFast(PooledConnection, i) if err != nil { loggerCtx, cancel := context.WithCancel(context.Background()) - defer cancel() ic.Logger.Warn(loggerCtx, "Error retrieving block, skipping", ion.String("error", err.Error()), ion.Uint64("block_number", i), @@ -1226,6 +1410,7 @@ func GetTransactionsByAccount(PooledConnection *config.PooledConnection, account ion.String("log_file", LOG_FILE), ion.String("topic", TOPIC), ion.String("function", "DB_OPs.GetTransactionsByAccount")) + cancel() continue } @@ -1538,10 +1723,9 @@ func GetTransactionsByAccountPaginated(PooledConnection *config.PooledConnection // Process current batch of blocks (in reverse order) for i := currentBlock; i >= startBlock && len(allMatchingTxs) < transactionsNeeded; i-- { - block, err := GetZKBlockByNumber(PooledConnection, i) + block, err := GetZKBlockByNumberFast(PooledConnection, i) if err != nil { loggerCtx, cancel := context.WithCancel(context.Background()) - defer cancel() ic.Logger.Warn(loggerCtx, "Error retrieving block, skipping", ion.String("error", err.Error()), ion.Uint64("block_number", i), @@ -1550,6 +1734,7 @@ func GetTransactionsByAccountPaginated(PooledConnection *config.PooledConnection ion.String("log_file", LOG_FILE), ion.String("topic", TOPIC), ion.String("function", "DB_OPs.GetTransactionsByAccountPaginated")) + cancel() continue } diff --git a/DB_OPs/immuclient.go b/DB_OPs/immuclient.go index 61b45baf..2dc775f8 100644 --- a/DB_OPs/immuclient.go +++ b/DB_OPs/immuclient.go @@ -847,7 +847,8 @@ func getKeysBatch(PooledConnection *config.PooledConnection, prefix string, limi Prefix: []byte(prefix), Limit: uint64(limit), SeekKey: seekKey, - Desc: 
true, // latest keys first + Desc: false, // ASC: Prefix filter is reliable only in ascending scans; + // DESC with no matching keys falls backward past the prefix and returns wrong results } ic.Logger.Debug(loggerCtx, fmt.Sprintf("Scanning keys with prefix: %s (limit: %d)", prefix, limit), @@ -2083,6 +2084,46 @@ func GetZKBlockByNumber(mainDBClient *config.PooledConnection, blockNumber uint6 return block, nil } +// GetZKBlockByNumberFast retrieves a ZK block by number using plain Get (no proof generation). +// Use for sync/reconciliation paths where tamper-proof guarantees are not required. +// 5–10× faster than GetZKBlockByNumber for bulk reads. +// +// Time: O(1); Space: O(block size) +func GetZKBlockByNumberFast(mainDBClient *config.PooledConnection, blockNumber uint64) (*config.ZKBlock, error) { + var shouldReturnConnection = false + var err error + blockKey := fmt.Sprintf("%s%d", PREFIX_BLOCK, blockNumber) + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + block := new(config.ZKBlock) + if mainDBClient == nil { + mainDBClient, err = GetMainDBConnectionandPutBack(ctx) + if err != nil { + return nil, fmt.Errorf("failed to get main DB connection: %w - GetZKBlockByNumberFast", err) + } + shouldReturnConnection = true + } + + if shouldReturnConnection { + defer func() { + PutMainDBConnection(mainDBClient) + }() + } + + entry, err := mainDBClient.Client.Client.Get(ctx, []byte(blockKey)) + if err != nil { + return nil, fmt.Errorf("failed to retrieve block %d: %w", blockNumber, err) + } + + if err := json.Unmarshal(entry.Value, block); err != nil { + return nil, fmt.Errorf("failed to unmarshal block %d: %w", blockNumber, err) + } + + return block, nil +} + // GetZKBlockByHash retrieves a ZK block by its hash (UNCHANGED) func GetZKBlockByHash(mainDBClient *config.PooledConnection, blockHash string) (*config.ZKBlock, error) { // First get the block number from the hash @@ -2612,4 +2653,4 @@ func 
ensureConnectionDatabaseSelected(pc *config.PooledConnection) error { defer cancel() _, err := pc.Client.Client.UseDatabase(ctx, &schema.Database{DatabaseName: pc.Database}) return err -} +} \ No newline at end of file diff --git a/FastsyncV2/fastsyncv2.go b/FastsyncV2/fastsyncv2.go index c7d68431..30b5c052 100644 --- a/FastsyncV2/fastsyncv2.go +++ b/FastsyncV2/fastsyncv2.go @@ -25,11 +25,13 @@ import ( NodeInfo "gossipnode/DB_OPs/Nodeinfo" "github.com/JupiterMetaLabs/JMDN-FastSync/common/WAL" + accountspb "github.com/JupiterMetaLabs/JMDN-FastSync/common/proto/accounts" availabilitypb "github.com/JupiterMetaLabs/JMDN-FastSync/common/proto/availability" blockpb "github.com/JupiterMetaLabs/JMDN-FastSync/common/proto/block" headersyncpb "github.com/JupiterMetaLabs/JMDN-FastSync/common/proto/headersync" "github.com/JupiterMetaLabs/JMDN-FastSync/common/types" wal_types "github.com/JupiterMetaLabs/JMDN-FastSync/common/types/wal" + "github.com/JupiterMetaLabs/JMDN-FastSync/core/accountsync" "github.com/JupiterMetaLabs/JMDN-FastSync/core/availability" "github.com/JupiterMetaLabs/JMDN-FastSync/core/datasync" "github.com/JupiterMetaLabs/JMDN-FastSync/core/headersync" @@ -37,6 +39,7 @@ import ( potsrequesthelper "github.com/JupiterMetaLabs/JMDN-FastSync/core/pots/helper" "github.com/JupiterMetaLabs/JMDN-FastSync/core/priorsync" "github.com/JupiterMetaLabs/JMDN-FastSync/core/reconsillation" + "github.com/ethereum/go-ethereum/common" "github.com/libp2p/go-libp2p/core/host" "github.com/libp2p/go-libp2p/core/peer" @@ -44,9 +47,6 @@ import ( ) const ( - // protocolVersion is the FastSync wire protocol version. All routers must agree on this. - protocolVersion = 1 - // checksumVersion is the checksum format used by PriorSync to validate block metadata. // Must match the version used by the NodeInfo adapter (DB_OPs/Nodeinfo.ChecksumVersion). checksumVersion = 2 @@ -55,8 +55,7 @@ const ( // V1 = TCP only, V2 = TCP + QUIC. 
commsVersion = 2 - // syncTimeout is the maximum wall-clock time for a complete sync operation. - syncTimeout = 15 * time.Minute + priorsyncVersion = 2 ) // FastsyncV2 holds the router instances and shared state for the sync engine. @@ -66,16 +65,20 @@ type FastsyncV2 struct { NodeInfo *types.Nodeinfo WAL *WAL.WAL PoTSWAL *WAL.WAL - PriorRouter priorsync.Priorsync_router - HeaderRouter headersync.Headersync_router - DataRouter datasync.DataSync_router - AvailRouter availability.Availability_router - ReconRouter reconsillation.Reconciliation_router - PoTSRouter *pots.PoTS + PriorRouter priorsync.Priorsync_router + HeaderRouter headersync.Headersync_router + DataRouter datasync.DataSync_router + AvailRouter availability.Availability_router + ReconRouter reconsillation.Reconciliation_router + PoTSRouter *pots.PoTS + AccountSyncRouter accountsync.AccountSync_router // blockInfoAdapter is the ImmuDB-backed implementation of types.BlockInfo. // Used for local block queries, header/data writes, and account management. blockInfoAdapter types.BlockInfo + + // syncTimeout is the maximum wall-clock time for a complete sync operation. + syncTimeout time.Duration } // NewFastsyncV2 initializes the JMDN-FastSync V2 engine over the given libp2p host. @@ -83,7 +86,7 @@ type FastsyncV2 struct { // It creates the NodeInfo adapter (ImmuDB), initializes both WALs (standard + PoTS), // creates and configures all protocol routers, and starts the server-side network handlers // so this node can respond to incoming sync requests from other peers. -func NewFastsyncV2(h host.Host) (*FastsyncV2, error) { +func NewFastsyncV2(h host.Host, syncTimeout time.Duration) (*FastsyncV2, error) { ctx := context.Background() // --- 1. 
Initialize the BlockInfo adapter (ImmuDB → JMDN-FastSync interface) --- @@ -123,12 +126,17 @@ func NewFastsyncV2(h host.Host) (*FastsyncV2, error) { availRouter := availability.NewAvailability() reconRouter := reconsillation.NewReconciliation() potsRouter := pots.NewPoTS() + accountSyncRouter := accountsync.NewAccountSync() // --- 5. Configure routers with shared sync variables --- - // PriorSync takes both protocol version AND checksum version (unique among routers). - priorRouter.SetSyncVars(ctx, protocolVersion, checksumVersion, *nodeinfo, h, wal) - headerRouter.SetSyncVars(ctx, protocolVersion, *nodeinfo, h, wal) - dataRouter.SetSyncVars(ctx, protocolVersion, *nodeinfo, h, wal) + // The first version parameter to SetSyncVars controls transport selection in the + // Communication layer (V1=TCP-only, V2=TCP+QUIC). Since JMDN nodes listen on both + // TCP and QUIC, we must use commsVersion (2) so server-side bisection callbacks + // can reach peers that connected over QUIC. + // PriorSync takes both comms version AND checksum version (unique among routers). + priorRouter.SetSyncVars(ctx, priorsyncVersion, checksumVersion, *nodeinfo, h, wal) + headerRouter.SetSyncVars(ctx, commsVersion, *nodeinfo, h, wal) + dataRouter.SetSyncVars(ctx, commsVersion, *nodeinfo, h, wal) // Availability and Reconciliation share the same SyncVars derived from PriorSync. syncVars := priorRouter.GetSyncVars() @@ -136,9 +144,12 @@ func NewFastsyncV2(h host.Host) (*FastsyncV2, error) { reconRouter.SetSyncVarsConfig(ctx, *syncVars) // PoTS uses its own isolated WAL for live block buffering. - potsRouter.SetSyncVars(ctx, protocolVersion, *nodeinfo, h) + // commsVersion (2) enables QUIC transport with TCP fallback, matching the other routers. + potsRouter.SetSyncVars(ctx, commsVersion, *nodeinfo, h) potsRouter.SetWAL(ctx, potsWAL) + accountSyncRouter.SetSyncVars(ctx, commsVersion, *nodeinfo, h, wal) + // --- 6. 
Mark this node as available for sync and start server-side handlers --- // IAmAvailable allows other nodes to discover us via Availability requests. availability.FastsyncReady().IAmAvailable() @@ -160,16 +171,64 @@ func NewFastsyncV2(h host.Host) (*FastsyncV2, error) { NodeInfo: nodeinfo, WAL: wal, PoTSWAL: potsWAL, - PriorRouter: priorRouter, - HeaderRouter: headerRouter, - DataRouter: dataRouter, - AvailRouter: availRouter, - ReconRouter: reconRouter, - PoTSRouter: potsRouter, - blockInfoAdapter: blockInfo, + PriorRouter: priorRouter, + HeaderRouter: headerRouter, + DataRouter: dataRouter, + AvailRouter: availRouter, + ReconRouter: reconRouter, + PoTSRouter: potsRouter, + AccountSyncRouter: accountSyncRouter, + blockInfoAdapter: blockInfo, + syncTimeout: syncTimeout, }, nil } +// AccountSyncOnly connects to targetPeer, performs Availability (to get auth), +// then runs AccountSync only — skipping block comparison and data sync entirely. +// Use this when both nodes have identical blocks but the local node is missing accounts. 
+func (fs *FastsyncV2) AccountSyncOnly(targetPeer string) (uint64, error) { + ctx, cancel := context.WithTimeout(context.Background(), fs.syncTimeout) + defer cancel() + + maddr, err := multiaddr.NewMultiaddr(targetPeer) + if err != nil { + return 0, fmt.Errorf("invalid multiaddr %q: %w", targetPeer, err) + } + info, err := peer.AddrInfoFromP2pAddr(maddr) + if err != nil { + return 0, fmt.Errorf("extract peer info: %w", err) + } + if err := fs.Host.Connect(ctx, *info); err != nil { + return 0, fmt.Errorf("connect to peer %s: %w", info.ID, err) + } + + peerAddrs := fs.Host.Peerstore().Addrs(info.ID) + if len(peerAddrs) == 0 { + peerAddrs = info.Addrs + } + targetNodeInfo := &types.Nodeinfo{ + PeerID: info.ID, + Multiaddr: peerAddrs, + Version: commsVersion, + } + + availResp, err := fs.AvailRouter.SendAvailabilityRequest( + ctx, fs.PriorRouter.GetSyncVars(), *targetNodeInfo, 0, math.MaxUint64, + ) + if err != nil { + return 0, fmt.Errorf("availability request failed: %w", err) + } + if !availResp.IsAvailable { + return 0, fmt.Errorf("peer %s reports unavailable for FastSync", info.ID) + } + if availResp.Auth == nil || availResp.Auth.UUID == "" { + return 0, fmt.Errorf("peer %s returned no auth token", info.ID) + } + log.Printf("[FastsyncV2] AccountSyncOnly: authorized (UUID=%s), starting AccountSync", availResp.Auth.UUID) + + return fs.AccountSyncRouter.AccountSync(availResp) +} + // HandleSync executes the full FastSync protocol with the target peer. // // The target peer must be a valid libp2p multiaddress with an embedded peer ID, @@ -183,10 +242,47 @@ func NewFastsyncV2(h host.Host) (*FastsyncV2, error) { // 5. Reconciliation — recompute and commit account balances. // 6. PoTS — catch up on blocks produced during steps 2–5. func (fs *FastsyncV2) HandleSync(targetPeer string) error { + return fs.handleSyncInternal(targetPeer, 0) +} + +// HandleStartupSync syncs from an already-connected peer, starting from the local +// latest block number. 
This is used on node startup/restart to catch up on blocks +// missed while offline, without re-syncing the entire chain. +func (fs *FastsyncV2) HandleStartupSync(peerID peer.ID, addrs []multiaddr.Multiaddr) error { + if len(addrs) == 0 { + return fmt.Errorf("no addresses for peer %s", peerID) + } + + // Build the full multiaddr string with embedded peer ID (required by handleSyncInternal) + targetMultiaddr := fmt.Sprintf("%s/p2p/%s", addrs[0].String(), peerID.String()) + + // Ensure local marker is up to date before determining start block + fs.reconcileLocalLatestBlock() + localBlockNum := fs.blockInfoAdapter.GetBlockDetails().Blocknumber + startBlock := localBlockNum + if startBlock == 0 { + // Fresh node with no blocks — do a full sync + log.Printf("[FastsyncV2] StartupSync: fresh node (block 0), performing full sync") + } else { + log.Printf("[FastsyncV2] StartupSync: resuming from block %d", startBlock) + } + + return fs.handleSyncInternal(targetMultiaddr, startBlock) +} + +// handleSyncInternal is the core sync engine. startBlock controls where PriorSync +// begins comparing: 0 for a full sync, or localBlockNum for incremental startup sync. +func (fs *FastsyncV2) handleSyncInternal(targetPeer string, startBlock uint64) error { syncStart := time.Now() - ctx, cancel := context.WithTimeout(context.Background(), syncTimeout) + ctx, cancel := context.WithTimeout(context.Background(), fs.syncTimeout) defer cancel() + // --- 0. Pre-sync reconciliation --- + // Ensure our local block marker is accurate before starting + log.Printf("[FastsyncV2] Reconciling local block marker before sync...") + fs.reconcileLocalLatestBlock() + + // --- Parse and connect to the target peer --- maddr, err := multiaddr.NewMultiaddr(targetPeer) if err != nil { @@ -202,12 +298,20 @@ func (fs *FastsyncV2) HandleSync(targetPeer string) error { } log.Printf("[FastsyncV2] Connected to peer %s", info.ID) + // After connecting, fetch all addresses the peer advertises from the peerstore. 
+ // info.Addrs only contains the single address from the user-supplied multiaddr, + // which may be QUIC-only. PoTS V1 requires TCP; the peerstore will have both. + peerAddrs := fs.Host.Peerstore().Addrs(info.ID) + if len(peerAddrs) == 0 { + peerAddrs = info.Addrs + } + // Construct the target's NodeInfo for all subsequent protocol calls. // BlockInfo is nil because we don't need to query the remote's DB locally — the // routers communicate with the remote via libp2p streams. targetNodeInfo := &types.Nodeinfo{ PeerID: info.ID, - Multiaddr: info.Addrs, + Multiaddr: peerAddrs, Version: commsVersion, } @@ -217,7 +321,7 @@ func (fs *FastsyncV2) HandleSync(targetPeer string) error { log.Printf("[FastsyncV2] Phase 1: Checking availability of peer %s", info.ID) availResp, err := fs.AvailRouter.SendAvailabilityRequest( - ctx, fs.PriorRouter.GetSyncVars(), *targetNodeInfo, 0, math.MaxUint64, + ctx, fs.PriorRouter.GetSyncVars(), *targetNodeInfo, startBlock, math.MaxUint64, ) if err != nil { return fmt.Errorf("availability request failed: %w", err) @@ -234,13 +338,13 @@ func (fs *FastsyncV2) HandleSync(targetPeer string) error { // PHASE 2: PriorSync — identify divergent block ranges via Merkle comparison // ========================================================================= localBlockNum := fs.blockInfoAdapter.GetBlockDetails().Blocknumber - log.Printf("[FastsyncV2] Phase 2: PriorSync (local latest block: %d)", localBlockNum) + log.Printf("[FastsyncV2] Phase 2: PriorSync (local latest block: %d, start: %d)", localBlockNum, startBlock) - // Request range [0, localBlockNum] locally vs [0, MaxUint64] on remote. - // The remote builds a Merkle tree for its blocks and uses TreeDiff to find - // all divergent ranges. Returns a Tag (list of block numbers + ranges). + // Compare [startBlock, localBlockNum] locally vs [startBlock, MaxUint64] on remote. 
+ // startBlock=0 → full sync (compare entire chain) + // startBlock=N → incremental sync (only compare from block N onward) resp, err := fs.PriorRouter.PriorSync( - 0, localBlockNum, 0, math.MaxUint64, targetNodeInfo, availResp.Auth, + startBlock, localBlockNum, startBlock, math.MaxUint64, targetNodeInfo, availResp.Auth, ) if err != nil { return fmt.Errorf("priorsync failed: %w", err) @@ -257,6 +361,22 @@ func (fs *FastsyncV2) HandleSync(targetPeer string) error { // In our case we sync from a single peer, but the API supports multi-peer failover. remotes := []*availabilitypb.AvailabilityResponse{availResp} + // ========================================================================= + // PHASE 2.5: AccountSync — sync zero-transaction accounts before header fetch + // ========================================================================= + // Upload our local account nonce ART; server diffs it against its own accounts + // and streams any missing ones back via dial-back (AccountsSyncDataProtocol). + // Those accounts are written to DB by the stream handler before this returns. + // Must run before HeaderSync so Reconciliation sees a complete account set. 
+ log.Println("[FastsyncV2] Phase 2.5: AccountSync") + + totalMissing, err := fs.AccountSyncRouter.AccountSync(availResp) + if err != nil { + log.Printf("[FastsyncV2] Phase 2.5 warning: AccountSync failed: %v", err) + } else { + log.Printf("[FastsyncV2] Phase 2.5 complete: %d missing accounts synced", totalMissing) + } + // ========================================================================= // PHASE 3: HeaderSync — fetch block headers for divergent ranges // ========================================================================= @@ -291,6 +411,41 @@ func (fs *FastsyncV2) HandleSync(targetPeer string) error { } log.Println("[FastsyncV2] Phase 4 complete: block data synchronized") + // After DataSync, ensure our latest block marker is updated to reflect the new blocks + // so that Reconciliation and PoTS work with the correct state. + fs.reconcileLocalLatestBlock() + + // ========================================================================= + // PHASE 4.5: FetchAccounts — pull any tagged accounts missing from local DB + // ========================================================================= + // DataSync returns the set of accounts touched by the synced blocks. Before + // Reconciliation replays their transactions, ensure every tagged account + // actually exists locally. Missing ones are fetched in one targeted request. 
+ if taggedAccounts != nil && len(taggedAccounts.Accounts) > 0 { + missingMap := make(map[string]bool) + accountMgr := fs.blockInfoAdapter.NewAccountManager() + for addr := range taggedAccounts.Accounts { + acc, err := accountMgr.GetAccountByAddress(addr) + if err == nil && acc == nil { + missingMap[addr] = true + } + } + if len(missingMap) > 0 { + log.Printf("[FastsyncV2] Phase 4.5: fetching %d missing tagged accounts", len(missingMap)) + resp, err := fs.AccountSyncRouter.FetchAccounts(availResp, missingMap) + if err != nil { + log.Printf("[FastsyncV2] Phase 4.5 warning: FetchAccounts failed: %v", err) + } else if resp != nil && len(resp.GetAccounts()) > 0 { + accounts := protoAccountsToTypes(resp.GetAccounts()) + if err := accountMgr.WriteAccounts(accounts); err != nil { + log.Printf("[FastsyncV2] Phase 4.5 warning: WriteAccounts failed: %v", err) + } else { + log.Printf("[FastsyncV2] Phase 4.5 complete: wrote %d missing tagged accounts", len(accounts)) + } + } + } + } + // ========================================================================= // PHASE 5: Reconciliation — recompute and commit account balances // ========================================================================= @@ -300,7 +455,7 @@ func (fs *FastsyncV2) HandleSync(targetPeer string) error { // 3. Atomic DB commit via AccountManager.BatchUpdateAccounts log.Println("[FastsyncV2] Phase 5: Reconciliation") - reconciledCount, failedAccounts, err := fs.ReconRouter.Reconcile(taggedAccounts) + reconciledCount, failedAccounts, err := fs.ReconRouter.Reconcile(taggedAccounts, availResp) if err != nil { log.Printf("[FastsyncV2] Phase 5 warning: reconciliation returned error: %v", err) } @@ -403,7 +558,7 @@ func (fs *FastsyncV2) executePoTS( // Secondary Reconciliation for accounts affected by PoTS blocks. 
if potsTaggedAccts != nil { - reconCount, failed, err := fs.ReconRouter.Reconcile(potsTaggedAccts) + reconCount, failed, err := fs.ReconRouter.Reconcile(potsTaggedAccts, availResp) if err != nil { log.Printf("[FastsyncV2] PoTS reconciliation warning: %v", err) } @@ -490,6 +645,9 @@ func (fs *FastsyncV2) Close() { if fs.PoTSRouter != nil { fs.PoTSRouter.Close() } + if fs.AccountSyncRouter != nil { + fs.AccountSyncRouter.Close() + } if fs.WAL != nil { fs.WAL.Close() } @@ -520,6 +678,7 @@ func zkBlockToProtoHeader(b *types.ZKBlock) *blockpb.Header { GasLimit: b.GasLimit, GasUsed: b.GasUsed, BlockNumber: b.BlockNumber, + LogsBloom: b.LogsBloom, } if b.CoinbaseAddr != nil { h.CoinbaseAddr = b.CoinbaseAddr[:] @@ -551,11 +710,12 @@ func zkBlockToProtoNonHeaders(b *types.ZKBlock) *blockpb.NonHeaders { for idx, tx := range b.Transactions { pbTx := &blockpb.Transaction{ - Hash: tx.Hash[:], - Type: uint32(tx.Type), - Nonce: tx.Nonce, - GasLimit: tx.GasLimit, - Data: tx.Data, + Hash: tx.Hash[:], + Type: uint32(tx.Type), + Timestamp: tx.Timestamp, + Nonce: tx.Nonce, + GasLimit: tx.GasLimit, + Data: tx.Data, } if tx.From != nil { pbTx.From = tx.From[:] @@ -566,6 +726,9 @@ func zkBlockToProtoNonHeaders(b *types.ZKBlock) *blockpb.NonHeaders { if tx.Value != nil { pbTx.Value = tx.Value.Bytes() } + if tx.ChainID != nil { + pbTx.ChainId = tx.ChainID.Bytes() + } if tx.GasPrice != nil { pbTx.GasPrice = tx.GasPrice.Bytes() } @@ -575,6 +738,15 @@ func zkBlockToProtoNonHeaders(b *types.ZKBlock) *blockpb.NonHeaders { if tx.MaxPriorityFee != nil { pbTx.MaxPriorityFee = tx.MaxPriorityFee.Bytes() } + for _, at := range tx.AccessList { + pbAT := &blockpb.AccessTuple{ + Address: at.Address[:], + } + for _, sk := range at.StorageKeys { + pbAT.StorageKeys = append(pbAT.StorageKeys, sk[:]) + } + pbTx.AccessList = append(pbTx.AccessList, pbAT) + } if tx.V != nil { pbTx.V = tx.V.Bytes() } @@ -585,6 +757,21 @@ func zkBlockToProtoNonHeaders(b *types.ZKBlock) *blockpb.NonHeaders { pbTx.S = tx.S.Bytes() 
} + if tx.ChainID != nil { + pbTx.ChainId = tx.ChainID.Bytes() + } + if len(tx.AccessList) > 0 { + for _, al := range tx.AccessList { + pbAl := &blockpb.AccessTuple{ + Address: al.Address[:], + } + for _, sk := range al.StorageKeys { + pbAl.StorageKeys = append(pbAl.StorageKeys, sk[:]) + } + pbTx.AccessList = append(pbTx.AccessList, pbAl) + } + } + nh.Transactions = append(nh.Transactions, &blockpb.DBTransaction{ Tx: pbTx, TxIndex: uint32(idx), @@ -610,3 +797,42 @@ func commitmentToBytes(c []uint32) []byte { } return buf } + +// protoAccountsToTypes converts a slice of proto Account messages to types.Account. +// The address bytes field (20 bytes) is converted to common.Address. +func protoAccountsToTypes(pbAccounts []*accountspb.Account) []*types.Account { + result := make([]*types.Account, 0, len(pbAccounts)) + for _, pb := range pbAccounts { + result = append(result, &types.Account{ + DIDAddress: pb.GetDidAddress(), + Address: common.BytesToAddress(pb.GetAddress()), + Balance: pb.GetBalance(), + Nonce: pb.GetNonce(), + AccountType: pb.GetAccountType(), + CreatedAt: pb.GetCreatedAt(), + UpdatedAt: pb.GetUpdatedAt(), + }) + } + return result +} + +// reconcileLocalLatestBlock ensures the local database marker ("latest_block") matches +// the actual highest block key present in the database. This fixes "stuck" syncs +// caused by failing or outdated markers. 
+func (fs *FastsyncV2) reconcileLocalLatestBlock() uint64 { + // We use the specialized ReconcileBlockNumber method if available on the adapter + type blockReconciler interface { + ReconcileBlockNumber() uint64 + } + + if reconciler, ok := fs.blockInfoAdapter.(blockReconciler); ok { + num := reconciler.ReconcileBlockNumber() + log.Printf("[FastsyncV2] Local block reconciliation complete: latest block is %d", num) + return num + } + + // Fallback to standard GetBlockNumber if reconciliation is not supported + num := fs.blockInfoAdapter.GetBlockNumber() + log.Printf("[FastsyncV2] Fallback block lookup complete: latest block is %d", num) + return num +} diff --git a/Scripts/block_merkle/main.go b/Scripts/block_merkle/main.go new file mode 100644 index 00000000..74fad984 --- /dev/null +++ b/Scripts/block_merkle/main.go @@ -0,0 +1,529 @@ +// Scripts/block_merkle/main.go +// +// Fetches all ZKBlocks from ImmuDB, computes a single hash per block +// (from all fields EXCEPT BlockHash), then builds ONE Merkle tree where: +// +// Level 0 — Leaves: one per block (leaf.hash = SHA256 of all block fields except BlockHash) +// Level 1+ — Parents: SHA256(left_child || right_child) +// Top — Root: the final single hash +// +// Outputs ONE JSON file: +// +// { +// "root": "", +// "total_blocks": 732, +// "from_block": 1, +// "to_block": 732, +// "generated_at": "2026-04-22T06:00:00Z", +// "leaves": [ +// { "block_number": 1, "block_hash_excluded": true, "hash": "" }, +// ... +// ], +// "levels": [ +// { "level": 0, "label": "Leaves", "nodes": [...] }, +// { "level": 1, "label": "Level 1", "nodes": [...]
}, +// { "level": N, "label": "Root", "nodes": [{ "hash": "" }] } +// ] +// } +// +// Usage: +// +// go run ./Scripts/block_merkle/main.go \ +// -out merkle_all.json \ +// -user immudb -pass immudb +// +// # specific range +// go run ./Scripts/block_merkle/main.go \ +// -from 1 -to 500 \ +// -out merkle_1_500.json \ +// -user immudb -pass immudb +// +// Flags: +// +// -from N First block (default: 1) +// -to N Last block (default: latest in DB) +// -out s Output JSON file (default: merkle_all.json) +// -workers N Concurrent fetch workers (default: 4) +// -user s ImmuDB username +// -pass s ImmuDB password +package main + +import ( + "context" + "crypto/sha256" + "encoding/binary" + "encoding/hex" + "encoding/json" + "flag" + "fmt" + "os" + "sort" + "sync" + "time" + + "gossipnode/DB_OPs" + "gossipnode/config" + "gossipnode/config/settings" + "gossipnode/logging" +) + +// --------------------------------------------------------------------------- +// Output types — one combined file for the whole chain +// --------------------------------------------------------------------------- + +// BlockLeaf is one entry in the leaf level of the global Merkle tree. +type BlockLeaf struct { + BlockNumber uint64 `json:"block_number"` + BlockHashExcluded bool `json:"block_hash_excluded"` // always true — BlockHash is never hashed + Hash string `json:"hash"` // SHA256 over all other block fields +} + +// Node is one hash in any Merkle level. +// Left/Right are indices into the level directly below (-1 for leaves). +// +// FromBlock/ToBlock record the inclusive range of block numbers this node +// covers. At leaf level they are both equal to the single block number. +// At every parent level they span left-child.FromBlock → right-child.ToBlock. +// A divergence algorithm can use these to narrow the search without walking +// all the way down to the leaves: if hashes differ, follow the child whose +// [FromBlock, ToBlock] contains the expected divergence range.
+type Node struct { + Index int `json:"index"` + Hash string `json:"hash"` + FromBlock uint64 `json:"from_block"` // first block covered by this subtree + ToBlock uint64 `json:"to_block"` // last block covered by this subtree + Left int `json:"left"` // child index in level below (-1 for leaves) + Right int `json:"right"` // child index in level below (-1 for leaves) + Duplicated bool `json:"duplicated,omitempty"` // true when padded from an odd count +} + +// Level is one horizontal row of the tree. +type Level struct { + Level int `json:"level"` // 0 = leaves, highest = root + Label string `json:"label"` + Nodes []Node `json:"nodes"` +} + +// MerkleForest is the single output JSON. +type MerkleForest struct { + Root string `json:"root"` + TotalBlocks uint64 `json:"total_blocks"` + FromBlock uint64 `json:"from_block"` + ToBlock uint64 `json:"to_block"` + GeneratedAt string `json:"generated_at"` + ErrorCount int `json:"error_count,omitempty"` + Leaves []BlockLeaf `json:"leaves"` + Levels []Level `json:"levels"` +} + +// --------------------------------------------------------------------------- +// Block hashing — one canonical hash per block (BlockHash excluded) +// +// Excluded fields (DB/derived, not part of canonical block content): +// - BlockHash: derived from the other fields; including it would be circular +// --------------------------------------------------------------------------- + +func hashBlock(b *config.ZKBlock) string { + h := sha256.New() + + buf8 := make([]byte, 8) + + // ── Scalars ─────────────────────────────────────────────────────────── + binary.BigEndian.PutUint64(buf8, b.BlockNumber) + h.Write(buf8) + + binary.BigEndian.PutUint64(buf8, uint64(b.Timestamp)) + h.Write(buf8) + + binary.BigEndian.PutUint64(buf8, b.GasLimit) + h.Write(buf8) + + binary.BigEndian.PutUint64(buf8, b.GasUsed) + h.Write(buf8) + + // ── Fixed-size hashes ───────────────────────────────────────────────── + h.Write(b.PrevHash.Bytes()) + h.Write(b.StateRoot.Bytes()) + + 
// ── Strings ─────────────────────────────────────────────────────────── + h.Write([]byte(b.TxnsRoot)) + h.Write([]byte(b.ProofHash)) + h.Write([]byte(b.Status)) + h.Write([]byte(b.ExtraData)) + + // ── Byte slices ─────────────────────────────────────────────────────── + h.Write(b.StarkProof) + h.Write(b.LogsBloom) + + // ── Commitment ([]uint32) ───────────────────────────────────────────── + buf4 := make([]byte, 4) + for _, v := range b.Commitment { + binary.BigEndian.PutUint32(buf4, v) + h.Write(buf4) + } + + // ── Nullable addresses ──────────────────────────────────────────────── + if b.CoinbaseAddr != nil { + h.Write(b.CoinbaseAddr.Bytes()) + } else { + h.Write([]byte("nil")) + } + if b.ZKVMAddr != nil { + h.Write(b.ZKVMAddr.Bytes()) + } else { + h.Write([]byte("nil")) + } + + // ── Transactions ────────────────────────────────────────────────────── + for _, tx := range b.Transactions { + th := sha256.New() + + th.Write(tx.Hash.Bytes()) + + if tx.From != nil { + th.Write(tx.From.Bytes()) + } + if tx.To != nil { + th.Write(tx.To.Bytes()) + } + if tx.Value != nil { + th.Write(tx.Value.Bytes()) + } + + th.Write([]byte{tx.Type}) + + binary.BigEndian.PutUint64(buf8, tx.Timestamp) + th.Write(buf8) + + binary.BigEndian.PutUint64(buf8, tx.Nonce) + th.Write(buf8) + + binary.BigEndian.PutUint64(buf8, tx.GasLimit) + th.Write(buf8) + + if tx.ChainID != nil { + th.Write(tx.ChainID.Bytes()) + } + if tx.GasPrice != nil { + th.Write(tx.GasPrice.Bytes()) + } + if tx.MaxFee != nil { + th.Write(tx.MaxFee.Bytes()) + } + if tx.MaxPriorityFee != nil { + th.Write(tx.MaxPriorityFee.Bytes()) + } + + th.Write(tx.Data) + + // AccessList: each entry = address + storage keys + for _, entry := range tx.AccessList { + th.Write(entry.Address.Bytes()) + for _, key := range entry.StorageKeys { + th.Write(key.Bytes()) + } + } + + if tx.V != nil { + th.Write(tx.V.Bytes()) + } + if tx.R != nil { + th.Write(tx.R.Bytes()) + } + if tx.S != nil { + th.Write(tx.S.Bytes()) + } + + 
h.Write(th.Sum(nil)) + } + + return hex.EncodeToString(h.Sum(nil)) +} + +// --------------------------------------------------------------------------- +// Global Merkle tree builder +// --------------------------------------------------------------------------- + +func buildMerkleTree(leaves []BlockLeaf) (levels []Level, root string) { + if len(leaves) == 0 { + // SHA256 of the empty string — not a zero hash + empty := hex.EncodeToString(sha256.New().Sum(nil)) + return []Level{{Level: 0, Label: "Root", Nodes: []Node{{Hash: empty, Left: -1, Right: -1}}}}, empty + } + + // Level 0: one node per leaf — FromBlock == ToBlock == the block's own number. + leafNodes := make([]Node, len(leaves)) + for i, l := range leaves { + leafNodes[i] = Node{ + Index: i, + Hash: l.Hash, + FromBlock: l.BlockNumber, + ToBlock: l.BlockNumber, + Left: -1, + Right: -1, + } + } + levels = append(levels, Level{Level: 0, Label: "Leaves", Nodes: leafNodes}) + + cur := make([]string, len(leaves)) + for i, l := range leaves { + cur[i] = l.Hash + } + + for lvl := 1; len(cur) > 1; lvl++ { + padded := false + if len(cur)%2 != 0 { + cur = append(cur, cur[len(cur)-1]) // duplicate last + padded = true + } + + prevLevel := levels[lvl-1] + prevCount := len(prevLevel.Nodes) // real node count before padding + next := make([]string, len(cur)/2) + nodes := make([]Node, len(cur)/2) + + for i := 0; i < len(cur); i += 2 { + l, _ := hex.DecodeString(cur[i]) + r, _ := hex.DecodeString(cur[i+1]) + h := sha256.Sum256(append(l, r...)) + next[i/2] = hex.EncodeToString(h[:]) + + li, ri := i, i+1 + isDup := padded && ri >= prevCount + if li >= prevCount { + li = prevCount - 1 + } + if ri >= prevCount { + ri = prevCount - 1 + } + + // Parent covers left-child.From → right-child.To. 
+ nodes[i/2] = Node{ + Index: i / 2, + Hash: next[i/2], + FromBlock: prevLevel.Nodes[li].FromBlock, + ToBlock: prevLevel.Nodes[ri].ToBlock, + Left: li, + Right: ri, + Duplicated: isDup, + } + } + + label := fmt.Sprintf("Level %d", lvl) + if len(next) == 1 { + label = "Root" + } + levels = append(levels, Level{Level: lvl, Label: label, Nodes: nodes}) + cur = next + } + + return levels, cur[0] +} + +// --------------------------------------------------------------------------- +// Worker result +// --------------------------------------------------------------------------- + +type result struct { + blockNum uint64 + hash string + err error +} + +// --------------------------------------------------------------------------- +// Main +// --------------------------------------------------------------------------- + +func main() { + fromFlag := flag.Uint64("from", 0, "First block number (default: 1)") + toFlag := flag.Uint64("to", 0, "Last block number (default: latest in DB)") + outFile := flag.String("out", "merkle_all.json", "Output JSON file") + numWorkers := flag.Int("workers", 4, "Concurrent fetch workers") + user := flag.String("user", "", "ImmuDB username") + pass := flag.String("pass", "", "ImmuDB password") + flag.Parse() + + // ── 1. Load settings ────────────────────────────────────────────────── + cfg, err := settings.Load() + if err != nil { + fmt.Fprintf(os.Stderr, "ERROR: load settings: %v\n", err) + os.Exit(1) + } + username := cfg.Database.Username + password := cfg.Database.Password + if *user != "" { + username = *user + } + if *pass != "" { + password = *pass + } + + // ── 2. Bootstrap logger ─────────────────────────────────────────────── + logging.NewAsyncLogger() + + // ── 3. 
Init DB pool ─────────────────────────────────────────────────── + poolCfg := config.DefaultConnectionPoolConfig() + if err := DB_OPs.InitMainDBPoolWithLoki(poolCfg, false, username, password); err != nil { + fmt.Fprintf(os.Stderr, "ERROR: init DB pool: %v\n", err) + os.Exit(1) + } + defer DB_OPs.CloseMainDBPool() + + // ── 4. Resolve range ────────────────────────────────────────────────── + fromBlock := *fromFlag + if fromBlock == 0 { + fromBlock = 1 + } + + toBlock := *toFlag + if toBlock == 0 { + ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second) + conn, err := DB_OPs.GetMainDBConnectionandPutBack(ctx) + if err != nil { + cancel() + fmt.Fprintf(os.Stderr, "ERROR: get connection: %v\n", err) + os.Exit(1) + } + latest, err := DB_OPs.GetLatestBlockNumber(conn) + cancel() // cancel AFTER use so GRO returns the connection at the right time + if err != nil || latest == 0 { + fmt.Fprintf(os.Stderr, "ERROR: get latest block: %v\n", err) + os.Exit(1) + } + toBlock = latest + } + + if fromBlock > toBlock { + fmt.Fprintf(os.Stderr, "ERROR: -from (%d) > -to (%d)\n", fromBlock, toBlock) + os.Exit(1) + } + + total := toBlock - fromBlock + 1 + fmt.Printf("Fetching blocks %d → %d (%d blocks, %d workers)\n", fromBlock, toBlock, total, *numWorkers) + + // ── 5. 
Concurrent fetch ─────────────────────────────────────────────── + jobs := make(chan uint64, *numWorkers*2) + results := make(chan result, *numWorkers*2) + var wg sync.WaitGroup + + for w := 0; w < *numWorkers; w++ { + wg.Add(1) + go func() { + defer wg.Done() + for num := range jobs { + ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second) + conn, err := DB_OPs.GetMainDBConnectionandPutBack(ctx) + if err != nil { + cancel() + results <- result{blockNum: num, err: err} + continue + } + block, err := DB_OPs.GetZKBlockByNumber(conn, num) + cancel() // cancel AFTER use + if err != nil { + results <- result{blockNum: num, err: err} + continue + } + if block == nil { + results <- result{blockNum: num, err: fmt.Errorf("not found")} + continue + } + results <- result{blockNum: num, hash: hashBlock(block)} + } + }() + } + + go func() { + for n := fromBlock; n <= toBlock; n++ { + jobs <- n + } + close(jobs) + }() + go func() { + wg.Wait() + close(results) + }() + + // ── 6. Collect, report progress ─────────────────────────────────────── + type blockResult struct { + hash string + err error + } + collected := make(map[uint64]blockResult, int(total)) + done, errCount := 0, 0 + + for r := range results { + done++ + if r.err != nil { + errCount++ + collected[r.blockNum] = blockResult{err: r.err} + fmt.Printf(" [%d/%d] block %-8d ERROR: %v\n", done, total, r.blockNum, r.err) + } else { + collected[r.blockNum] = blockResult{hash: r.hash} + if done%100 == 0 || done == int(total) || int(total) <= 100 { + fmt.Printf(" [%d/%d] block %-8d hash=%s...\n", done, total, r.blockNum, r.hash[:16]) + } + } + } + + // ── 7. 
Build ordered leaf list (skip errored blocks) ────────────────── + nums := make([]uint64, 0, len(collected)) + for n := range collected { + nums = append(nums, n) + } + sort.Slice(nums, func(i, j int) bool { return nums[i] < nums[j] }) + + leaves := make([]BlockLeaf, 0, len(nums)) + for _, n := range nums { + r := collected[n] + if r.err != nil { + continue + } + leaves = append(leaves, BlockLeaf{ + BlockNumber: n, + BlockHashExcluded: true, + Hash: r.hash, + }) + } + + fmt.Printf("\nBuilding Merkle tree over %d leaves...\n", len(leaves)) + + // ── 8. Build global Merkle tree ─────────────────────────────────────── + levels, root := buildMerkleTree(leaves) + + // ── 9. Write single output JSON ─────────────────────────────────────── + // from_block/to_block reflect the actually-covered range (errors skipped). + actualFrom, actualTo := fromBlock, toBlock + if len(leaves) > 0 { + actualFrom = leaves[0].BlockNumber + actualTo = leaves[len(leaves)-1].BlockNumber + } + + out := MerkleForest{ + Root: root, + TotalBlocks: uint64(len(leaves)), + FromBlock: actualFrom, + ToBlock: actualTo, + GeneratedAt: time.Now().UTC().Format(time.RFC3339), + ErrorCount: errCount, + Leaves: leaves, + Levels: levels, + } + + data, err := json.MarshalIndent(out, "", " ") + if err != nil { + fmt.Fprintf(os.Stderr, "ERROR: marshal output: %v\n", err) + os.Exit(1) + } + if err := os.WriteFile(*outFile, data, 0644); err != nil { + fmt.Fprintf(os.Stderr, "ERROR: write %s: %v\n", *outFile, err) + os.Exit(1) + } + + fmt.Printf("\n──────────────────────────────────────────────────────────\n") + fmt.Printf("Merkle root : %s\n", root) + fmt.Printf("Total leaves : %d (blocks fetched successfully)\n", len(leaves)) + fmt.Printf("Tree levels : %d (leaf level + %d parent levels)\n", len(levels), len(levels)-1) + fmt.Printf("Errors : %d\n", errCount) + fmt.Printf("Output : %s\n", *outFile) +} diff --git a/Scripts/check_nonce_dupes.go b/Scripts/check_nonce_dupes.go new file mode 100644 index 
00000000..f3a1ef87 --- /dev/null +++ b/Scripts/check_nonce_dupes.go @@ -0,0 +1,192 @@ +//go:build ignore + +// check_nonce_dupes.go — scan the accounts DB and report duplicate nonces. +// +// Usage: +// +// go run Scripts/check_nonce_dupes.go [flags] +// +// Flags: +// +// -host ImmuDB host (default: 127.0.0.1) +// -port ImmuDB port (default: 3322) +// -user ImmuDB username (default: immudb) +// -pass ImmuDB password (default: immudb) +// -db accounts DB name (default: accountsdb) +// -batch scan batch size (default: 100) +// -prefix account key prefix (default: address:) +package main + +import ( + "context" + "encoding/json" + "flag" + "fmt" + "os" + "sort" + "text/tabwriter" + "time" + + "github.com/codenotary/immudb/pkg/api/schema" + immudb "github.com/codenotary/immudb/pkg/client" + "github.com/ethereum/go-ethereum/common" + "google.golang.org/grpc/metadata" +) + +// Account mirrors DB_OPs.Account — keep in sync if fields change. +type Account struct { + DIDAddress string `json:"did,omitempty"` + Address common.Address `json:"address"` + Balance string `json:"balance,omitempty"` + Nonce uint64 `json:"nonce"` + AccountType string `json:"account_type"` + CreatedAt int64 `json:"created_at"` + UpdatedAt int64 `json:"updated_at"` +} + +func main() { + host := flag.String("host", "127.0.0.1", "ImmuDB host") + port := flag.Int("port", 3322, "ImmuDB port") + user := flag.String("user", "immudb", "ImmuDB username") + pass := flag.String("pass", "immudb", "ImmuDB password") + dbName := flag.String("db", "accountsdb", "Accounts database name") + batch := flag.Int("batch", 100, "Scan batch size") + prefix := flag.String("prefix", "address:", "Account key prefix") + flag.Parse() + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute) + defer cancel() + + // --- Connect --- + opts := immudb.DefaultOptions().WithAddress(*host).WithPort(*port) + client := immudb.NewClient().WithOptions(opts) + + if err := client.OpenSession(ctx, []byte(*user), 
[]byte(*pass), *dbName); err != nil { + fmt.Fprintf(os.Stderr, "failed to open session: %v\n", err) + os.Exit(1) + } + defer client.CloseSession(ctx) + + md := metadata.Pairs("setname", *dbName) + ctx = metadata.NewOutgoingContext(ctx, md) + + fmt.Printf("Connected to immudb %s:%d, database: %s\n\n", *host, *port, *dbName) + + // --- Scan all address: keys --- + accounts, err := scanAllAccounts(ctx, client, []byte(*prefix), *batch) + if err != nil { + fmt.Fprintf(os.Stderr, "scan error: %v\n", err) + os.Exit(1) + } + fmt.Printf("Scanned %d accounts\n\n", len(accounts)) + + // --- Group by nonce --- + // nonceMap[nonce] = list of accounts with that nonce + nonceMap := make(map[uint64][]*Account) + for _, acc := range accounts { + nonceMap[acc.Nonce] = append(nonceMap[acc.Nonce], acc) + } + + // --- Find duplicates --- + type dupeGroup struct { + nonce uint64 + accounts []*Account + } + var dupes []dupeGroup + for nonce, accs := range nonceMap { + if len(accs) > 1 { + dupes = append(dupes, dupeGroup{nonce, accs}) + } + } + + // Sort by nonce for deterministic output + sort.Slice(dupes, func(i, j int) bool { return dupes[i].nonce < dupes[j].nonce }) + + // --- Print all accounts --- + fmt.Println("=== All accounts ===") + tw := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0) + fmt.Fprintln(tw, "ADDRESS\tNONCE\tCREATED_AT\tNOTE") + fmt.Fprintln(tw, "-------\t-----\t----------\t----") + + // Sort accounts by nonce for readability + sort.Slice(accounts, func(i, j int) bool { return accounts[i].Nonce < accounts[j].Nonce }) + + for _, acc := range accounts { + createdAt := time.Unix(0, acc.CreatedAt).UTC().Format(time.RFC3339) + note := "" + if len(nonceMap[acc.Nonce]) > 1 { + note = fmt.Sprintf("DUPLICATE NONCE (shared by %d accounts)", len(nonceMap[acc.Nonce])) + } + fmt.Fprintf(tw, "%s\t%d\t%s\t%s\n", acc.Address.Hex(), acc.Nonce, createdAt, note) + } + tw.Flush() + + // --- Duplicate summary --- + fmt.Println() + if len(dupes) == 0 { + fmt.Println("No duplicate nonces 
found.") + } else { + fmt.Printf("=== Duplicate nonces (%d) ===\n", len(dupes)) + tw2 := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0) + fmt.Fprintln(tw2, "NONCE\tADDRESS\tCREATED_AT") + fmt.Fprintln(tw2, "-----\t-------\t----------") + for _, d := range dupes { + for i, acc := range d.accounts { + nStr := fmt.Sprintf("%d", d.nonce) + if i > 0 { + nStr = " (same)" + } + createdAt := time.Unix(0, acc.CreatedAt).UTC().Format(time.RFC3339) + fmt.Fprintf(tw2, "%s\t%s\t%s\n", nStr, acc.Address.Hex(), createdAt) + } + fmt.Fprintln(tw2, "") + } + tw2.Flush() + } +} + +// scanAllAccounts pages through all keys with the given prefix and returns parsed accounts. +func scanAllAccounts(ctx context.Context, c immudb.ImmuClient, prefix []byte, batchSize int) ([]*Account, error) { + var accounts []*Account + var seekKey []byte + + for { + req := &schema.ScanRequest{ + Prefix: prefix, + Limit: uint64(batchSize), + SeekKey: seekKey, + Desc: false, + } + + result, err := c.Scan(ctx, req) + if err != nil { + return nil, fmt.Errorf("scan failed: %w", err) + } + if len(result.Entries) == 0 { + break + } + + startIdx := 0 + if seekKey != nil && len(result.Entries) > 0 && + string(result.Entries[0].Key) == string(seekKey) { + startIdx = 1 // skip the seek key (inclusive pagination) + } + + for i := startIdx; i < len(result.Entries); i++ { + entry := result.Entries[i] + var acc Account + if err := json.Unmarshal(entry.Value, &acc); err != nil { + fmt.Fprintf(os.Stderr, "warn: skip key %s — unmarshal error: %v\n", entry.Key, err) + continue + } + accounts = append(accounts, &acc) + } + + if len(result.Entries) < batchSize { + break + } + seekKey = result.Entries[len(result.Entries)-1].Key + } + + return accounts, nil +} diff --git a/Scripts/merkle_check.go b/Scripts/merkle_check.go new file mode 100644 index 00000000..6eeed273 --- /dev/null +++ b/Scripts/merkle_check.go @@ -0,0 +1,539 @@ +// scripts/block_merkle/main.go +// +// Fetches all ZKBlocks from ImmuDB, computes a single hash per 
block +// (from all fields EXCEPT BlockHash), then builds ONE Merkle tree where: +// +// Level 0 — Leaves: one per block (leaf.hash = SHA256 of all block fields) +// Level 1+ — Parents: SHA256(left_child || right_child) +// Top — Root: the final single hash +// +// Outputs ONE JSON file: +// +// { +// "root": "", +// "total_blocks": 732, +// "from_block": 1, +// "to_block": 732, +// "generated_at": "2026-04-22T06:00:00Z", +// "leaves": [ +// { "block_number": 1, "block_hash_excluded": true, "hash": "" }, +// ... +// ], +// "levels": [ +// { "level": 0, "label": "Leaves", "nodes": [...] }, +// { "level": 1, "label": "Level 1", "nodes": [...] }, +// { "level": N, "label": "Root", "nodes": [{ "hash": "" }] } +// ] +// } +// +// Usage: +// +// go run ./scripts/block_merkle/main.go \ +// -out merkle_all.json \ +// -user immudb -pass immudb +// +// # specific range +// go run ./scripts/block_merkle/main.go \ +// -from 1 -to 500 \ +// -out merkle_1_500.json \ +// -user immudb -pass immudb +// +// Flags: +// +// -from N First block (default: 1) +// -to N Last block (default: latest in DB) +// -out s Output JSON file (default: merkle_all.json) +// -workers N Concurrent fetch workers (default: 4) +// -user s ImmuDB username +// -pass s ImmuDB password +package main + +import ( + "context" + "crypto/sha256" + "encoding/binary" + "encoding/hex" + "encoding/json" + "flag" + "fmt" + "os" + "sort" + "sync" + "time" + + "gossipnode/DB_OPs" + "gossipnode/config" + "gossipnode/config/settings" + "gossipnode/logging" +) + +// --------------------------------------------------------------------------- +// Output types — one combined file for the whole chain +// --------------------------------------------------------------------------- + +// BlockLeaf is one entry in the leaf level of the global Merkle tree. 
+type BlockLeaf struct { + BlockNumber uint64 `json:"block_number"` + BlockHashExcluded bool `json:"block_hash_excluded"` // always true — BlockHash is never hashed + Hash string `json:"hash"` // SHA256 over all other block fields +} + +// Node is one hash in any Merkle level. +// Left/Right are indices into the level directly below (-1 for leaves). +// +// FromBlock/ToBlock record the inclusive range of block numbers this node +// covers. At leaf level they are both equal to the single block number. +// At every parent level they span left-child.FromBlock → right-child.ToBlock. +// A divergence algorithm can use these to narrow the search without walking +// all the way down to the leaves: if hashes differ, follow the child whose +// [FromBlock, ToBlock] contains the expected divergence range. +type Node struct { + Index int `json:"index"` + Hash string `json:"hash"` + FromBlock uint64 `json:"from_block"` // first block covered by this subtree + ToBlock uint64 `json:"to_block"` // last block covered by this subtree + Left int `json:"left"` // child index in level below (-1 for leaves) + Right int `json:"right"` // child index in level below (-1 for leaves) + Duplicated bool `json:"duplicated,omitempty"` // true when padded from an odd count +} + +// Level is one horizontal row of the tree. +type Level struct { + Level int `json:"level"` // 0 = leaves, highest = root + Label string `json:"label"` + Nodes []Node `json:"nodes"` +} + +// MerkleForest is the single output JSON. 
+type MerkleForest struct { + Root string `json:"root"` + TotalBlocks uint64 `json:"total_blocks"` + FromBlock uint64 `json:"from_block"` + ToBlock uint64 `json:"to_block"` + GeneratedAt string `json:"generated_at"` + ErrorCount int `json:"error_count,omitempty"` + Leaves []BlockLeaf `json:"leaves"` + Levels []Level `json:"levels"` +} + +// --------------------------------------------------------------------------- +// Block hashing — one canonical hash per block (BlockHash excluded) +// +// Excluded fields (DB/derived, not part of canonical block content): +// - BlockHash: derived from the other fields; including it would be circular +// --------------------------------------------------------------------------- + +func hashBlock(b *config.ZKBlock) string { + h := sha256.New() + + buf8 := make([]byte, 8) + buf4len := make([]byte, 4) + + // writeVar writes a 4-byte big-endian length prefix followed by the data. + // This prevents boundary-shift collisions between adjacent variable-length + // fields, e.g. ("AB"+"CD") vs ("A"+"BCD") producing the same byte stream. 
+ writeVar := func(dst interface{ Write([]byte) (int, error) }, data []byte) { + binary.BigEndian.PutUint32(buf4len, uint32(len(data))) + dst.Write(buf4len) + dst.Write(data) + } + + // ── Scalars ─────────────────────────────────────────────────────────── + binary.BigEndian.PutUint64(buf8, b.BlockNumber) + h.Write(buf8) + + binary.BigEndian.PutUint64(buf8, uint64(b.Timestamp)) + h.Write(buf8) + + binary.BigEndian.PutUint64(buf8, b.GasLimit) + h.Write(buf8) + + binary.BigEndian.PutUint64(buf8, b.GasUsed) + h.Write(buf8) + + // ── Fixed-size hashes (always 32 bytes — no length prefix needed) ───── + h.Write(b.PrevHash.Bytes()) + h.Write(b.StateRoot.Bytes()) + + // ── Strings (variable length — length-prefixed) ──────────────────────── + writeVar(h, []byte(b.TxnsRoot)) + writeVar(h, []byte(b.ProofHash)) + writeVar(h, []byte(b.Status)) + writeVar(h, []byte(b.ExtraData)) + + // ── Byte slices (variable length — length-prefixed) ─────────────────── + writeVar(h, b.StarkProof) + writeVar(h, b.LogsBloom) + + // ── Commitment ([]uint32) ───────────────────────────────────────────── + buf4 := make([]byte, 4) + for _, v := range b.Commitment { + binary.BigEndian.PutUint32(buf4, v) + h.Write(buf4) + } + + // ── Nullable addresses ──────────────────────────────────────────────── + if b.CoinbaseAddr != nil { + h.Write(b.CoinbaseAddr.Bytes()) + } else { + h.Write([]byte("nil")) + } + if b.ZKVMAddr != nil { + h.Write(b.ZKVMAddr.Bytes()) + } else { + h.Write([]byte("nil")) + } + + // ── Transactions ────────────────────────────────────────────────────── + for _, tx := range b.Transactions { + th := sha256.New() + + th.Write(tx.Hash.Bytes()) + + if tx.From != nil { + th.Write(tx.From.Bytes()) + } + if tx.To != nil { + th.Write(tx.To.Bytes()) + } + if tx.Value != nil { + th.Write(tx.Value.Bytes()) + } + + th.Write([]byte{tx.Type}) + + binary.BigEndian.PutUint64(buf8, tx.Timestamp) + th.Write(buf8) + + binary.BigEndian.PutUint64(buf8, tx.Nonce) + th.Write(buf8) + + 
binary.BigEndian.PutUint64(buf8, tx.GasLimit) + th.Write(buf8) + + if tx.ChainID != nil { + th.Write(tx.ChainID.Bytes()) + } + if tx.GasPrice != nil { + th.Write(tx.GasPrice.Bytes()) + } + if tx.MaxFee != nil { + th.Write(tx.MaxFee.Bytes()) + } + if tx.MaxPriorityFee != nil { + th.Write(tx.MaxPriorityFee.Bytes()) + } + + writeVar(th, tx.Data) + + // AccessList: each entry = address + storage keys + for _, entry := range tx.AccessList { + th.Write(entry.Address.Bytes()) + for _, key := range entry.StorageKeys { + th.Write(key.Bytes()) + } + } + + if tx.V != nil { + th.Write(tx.V.Bytes()) + } + if tx.R != nil { + th.Write(tx.R.Bytes()) + } + if tx.S != nil { + th.Write(tx.S.Bytes()) + } + + h.Write(th.Sum(nil)) + } + + return hex.EncodeToString(h.Sum(nil)) +} + +// --------------------------------------------------------------------------- +// Global Merkle tree builder +// --------------------------------------------------------------------------- + +func buildMerkleTree(leaves []BlockLeaf) (levels []Level, root string) { + if len(leaves) == 0 { + // SHA256 of the empty string — not a zero hash + empty := hex.EncodeToString(sha256.New().Sum(nil)) + return []Level{{Level: 0, Label: "Root", Nodes: []Node{{Hash: empty, Left: -1, Right: -1}}}}, empty + } + + // Level 0: one node per leaf — FromBlock == ToBlock == the block's own number. 
+ leafNodes := make([]Node, len(leaves)) + for i, l := range leaves { + leafNodes[i] = Node{ + Index: i, + Hash: l.Hash, + FromBlock: l.BlockNumber, + ToBlock: l.BlockNumber, + Left: -1, + Right: -1, + } + } + levels = append(levels, Level{Level: 0, Label: "Leaves", Nodes: leafNodes}) + + cur := make([]string, len(leaves)) + for i, l := range leaves { + cur[i] = l.Hash + } + + for lvl := 1; len(cur) > 1; lvl++ { + padded := false + if len(cur)%2 != 0 { + cur = append(cur, cur[len(cur)-1]) // duplicate last + padded = true + } + + prevLevel := levels[lvl-1] + prevCount := len(prevLevel.Nodes) // real node count before padding + next := make([]string, len(cur)/2) + nodes := make([]Node, len(cur)/2) + + for i := 0; i < len(cur); i += 2 { + l, _ := hex.DecodeString(cur[i]) + r, _ := hex.DecodeString(cur[i+1]) + h := sha256.Sum256(append(l, r...)) + next[i/2] = hex.EncodeToString(h[:]) + + li, ri := i, i+1 + isDup := padded && ri >= prevCount + if li >= prevCount { + li = prevCount - 1 + } + if ri >= prevCount { + ri = prevCount - 1 + } + + // Parent covers left-child.From → right-child.To. 
+ nodes[i/2] = Node{ + Index: i / 2, + Hash: next[i/2], + FromBlock: prevLevel.Nodes[li].FromBlock, + ToBlock: prevLevel.Nodes[ri].ToBlock, + Left: li, + Right: ri, + Duplicated: isDup, + } + } + + label := fmt.Sprintf("Level %d", lvl) + if len(next) == 1 { + label = "Root" + } + levels = append(levels, Level{Level: lvl, Label: label, Nodes: nodes}) + cur = next + } + + return levels, cur[0] +} + +// --------------------------------------------------------------------------- +// Worker result +// --------------------------------------------------------------------------- + +type result struct { + blockNum uint64 + hash string + err error +} + +// --------------------------------------------------------------------------- +// Main +// --------------------------------------------------------------------------- + +func main() { + fromFlag := flag.Uint64("from", 0, "First block number (default: 1)") + toFlag := flag.Uint64("to", 0, "Last block number (default: latest in DB)") + outFile := flag.String("out", "merkle_all.json", "Output JSON file") + numWorkers := flag.Int("workers", 4, "Concurrent fetch workers") + user := flag.String("user", "", "ImmuDB username") + pass := flag.String("pass", "", "ImmuDB password") + flag.Parse() + + // ── 1. Load settings ────────────────────────────────────────────────── + cfg, err := settings.Load() + if err != nil { + fmt.Fprintf(os.Stderr, "ERROR: load settings: %v\n", err) + os.Exit(1) + } + username := cfg.Database.Username + password := cfg.Database.Password + if *user != "" { + username = *user + } + if *pass != "" { + password = *pass + } + + // ── 2. Bootstrap logger ─────────────────────────────────────────────── + logging.NewAsyncLogger() + + // ── 3. 
Init DB pool ─────────────────────────────────────────────────── + poolCfg := config.DefaultConnectionPoolConfig() + if err := DB_OPs.InitMainDBPoolWithLoki(poolCfg, false, username, password); err != nil { + fmt.Fprintf(os.Stderr, "ERROR: init DB pool: %v\n", err) + os.Exit(1) + } + defer DB_OPs.CloseMainDBPool() + + // ── 4. Resolve range ────────────────────────────────────────────────── + fromBlock := *fromFlag + if fromBlock == 0 { + fromBlock = 1 + } + + toBlock := *toFlag + if toBlock == 0 { + ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second) + conn, err := DB_OPs.GetMainDBConnectionandPutBack(ctx) + if err != nil { + cancel() + fmt.Fprintf(os.Stderr, "ERROR: get connection: %v\n", err) + os.Exit(1) + } + latest, err := DB_OPs.GetLatestBlockNumber(conn) + cancel() // cancel AFTER use so GRO returns the connection at the right time + if err != nil || latest == 0 { + fmt.Fprintf(os.Stderr, "ERROR: get latest block: %v\n", err) + os.Exit(1) + } + toBlock = latest + } + + if fromBlock > toBlock { + fmt.Fprintf(os.Stderr, "ERROR: -from (%d) > -to (%d)\n", fromBlock, toBlock) + os.Exit(1) + } + + total := toBlock - fromBlock + 1 + fmt.Printf("Fetching blocks %d → %d (%d blocks, %d workers)\n", fromBlock, toBlock, total, *numWorkers) + + // ── 5. 
Concurrent fetch ─────────────────────────────────────────────── + jobs := make(chan uint64, *numWorkers*2) + results := make(chan result, *numWorkers*2) + var wg sync.WaitGroup + + for w := 0; w < *numWorkers; w++ { + wg.Add(1) + go func() { + defer wg.Done() + for num := range jobs { + ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second) + conn, err := DB_OPs.GetMainDBConnectionandPutBack(ctx) + if err != nil { + cancel() + results <- result{blockNum: num, err: err} + continue + } + block, err := DB_OPs.GetZKBlockByNumber(conn, num) + cancel() // cancel AFTER use + if err != nil { + results <- result{blockNum: num, err: err} + continue + } + if block == nil { + results <- result{blockNum: num, err: fmt.Errorf("not found")} + continue + } + results <- result{blockNum: num, hash: hashBlock(block)} + } + }() + } + + go func() { + for n := fromBlock; n <= toBlock; n++ { + jobs <- n + } + close(jobs) + }() + go func() { + wg.Wait() + close(results) + }() + + // ── 6. Collect, report progress ─────────────────────────────────────── + type blockResult struct { + hash string + err error + } + collected := make(map[uint64]blockResult, int(total)) + done, errCount := 0, 0 + + for r := range results { + done++ + if r.err != nil { + errCount++ + collected[r.blockNum] = blockResult{err: r.err} + fmt.Printf(" [%d/%d] block %-8d ERROR: %v\n", done, total, r.blockNum, r.err) + } else { + collected[r.blockNum] = blockResult{hash: r.hash} + if done%100 == 0 || done == int(total) || int(total) <= 100 { + fmt.Printf(" [%d/%d] block %-8d hash=%s...\n", done, total, r.blockNum, r.hash[:16]) + } + } + } + + // ── 7. 
Build ordered leaf list (skip errored blocks) ────────────────── + nums := make([]uint64, 0, len(collected)) + for n := range collected { + nums = append(nums, n) + } + sort.Slice(nums, func(i, j int) bool { return nums[i] < nums[j] }) + + leaves := make([]BlockLeaf, 0, len(nums)) + for _, n := range nums { + r := collected[n] + if r.err != nil { + continue + } + leaves = append(leaves, BlockLeaf{ + BlockNumber: n, + BlockHashExcluded: true, + Hash: r.hash, + }) + } + + fmt.Printf("\nBuilding Merkle tree over %d leaves...\n", len(leaves)) + + // ── 8. Build global Merkle tree ─────────────────────────────────────── + levels, root := buildMerkleTree(leaves) + + // ── 9. Write single output JSON ─────────────────────────────────────── + // from_block/to_block reflect the actually-covered range (errors skipped). + actualFrom, actualTo := fromBlock, toBlock + if len(leaves) > 0 { + actualFrom = leaves[0].BlockNumber + actualTo = leaves[len(leaves)-1].BlockNumber + } + + out := MerkleForest{ + Root: root, + TotalBlocks: uint64(len(leaves)), + FromBlock: actualFrom, + ToBlock: actualTo, + GeneratedAt: time.Now().UTC().Format(time.RFC3339), + ErrorCount: errCount, + Leaves: leaves, + Levels: levels, + } + + data, err := json.MarshalIndent(out, "", " ") + if err != nil { + fmt.Fprintf(os.Stderr, "ERROR: marshal output: %v\n", err) + os.Exit(1) + } + if err := os.WriteFile(*outFile, data, 0644); err != nil { + fmt.Fprintf(os.Stderr, "ERROR: write %s: %v\n", *outFile, err) + os.Exit(1) + } + + fmt.Printf("\n──────────────────────────────────────────────────────────\n") + fmt.Printf("Merkle root : %s\n", root) + fmt.Printf("Total leaves : %d (blocks fetched successfully)\n", len(leaves)) + fmt.Printf("Tree levels : %d (leaf level + %d parent levels)\n", len(levels), len(levels)-1) + fmt.Printf("Errors : %d\n", errCount) + fmt.Printf("Output : %s\n", *outFile) +} diff --git a/Sequencer/Consensus.go b/Sequencer/Consensus.go index f8a4c36f..9ae19d6b 100644 --- 
a/Sequencer/Consensus.go +++ b/Sequencer/Consensus.go @@ -412,6 +412,10 @@ func (consensus *Consensus) Start(zkblock *config.ZKBlock) error { ion.String("function", "Consensus.Start.setZKBlockData")) setZKBlockSpan.End() + // Set sequencer identity and round ID so voters know where to send votes + consensus.ZKBlockData.SetSequencerID(consensus.Host.ID().String()) + consensus.ZKBlockData.SetRoundID(zkblock.BlockHash.Hex()) + // Validate consensus configuration validateCtx, validateSpan := tracer.Start(trace_ctx, "Consensus.Start.validateConfiguration") validateStartTime := time.Now().UTC() @@ -938,104 +942,141 @@ func (consensus *Consensus) startEventDrivenFlowAfterSubscriptionPermission(trac ion.String("function", "Consensus.startEventDrivenFlow.broadcastVoteTrigger")) broadcastSpan.End() - // Step 4: Wait for votes to be collected and processed, then print CRDT state and process votes + // Step 4: Event-driven vote collection + // Create a round context with ConsensusTimeout deadline + roundCtx, roundCancel := context.WithTimeout(trace_ctx, config.ConsensusTimeout) + consensus.roundCtx = roundCtx + consensus.roundCancel = roundCancel + + // Create vote notification channel and register it so handleSubmitVote can push votes + voteNotifyCh := make(chan PubSubMessages.VoteNotification, config.MaxMainPeers) + consensus.voteNotifyCh = voteNotifyCh + MessagePassing.RegisterVoteCollector(voteNotifyCh) + processVotesCtx, processVotesSpan := tracer.Start(trace_ctx, "Consensus.startEventDrivenFlow.processVotes") processVotesStartTime := time.Now().UTC() - logger().NamedLogger.Info(processVotesCtx, "Waiting for votes to be collected and processed", + + blockHash := consensus.ZKBlockData.GetZKBlock().BlockHash.Hex() + requiredVotes := config.MaxMainPeers + collectedVotes := make(map[string]int8) // peerID -> vote + + logger().NamedLogger.Info(processVotesCtx, "Starting event-driven vote collection", + ion.Int("required_votes", requiredVotes), + ion.String("block_hash", 
blockHash), + ion.Float64("timeout_seconds", config.ConsensusTimeout.Seconds()), ion.String("function", "Consensus.startEventDrivenFlow.processVotes")) - // TODO: Replace this with actual event-driven trigger from vote collection completion - // For now, use a delay but log that it should be event-driven - common.LocalGRO.Go(GRO.SequencerRequestEventDrivenFlowThread, func(ctx context.Context) error { - processCtx, processSpan := tracer.Start(processVotesCtx, "Consensus.startEventDrivenFlow.processVotes.waitAndProcess") - defer processSpan.End() + // Event loop: wait for votes or timeout + for { + select { + case notification := <-voteNotifyCh: + // Only accept votes for this round's block hash + if notification.BlockHash != blockHash { + logger().NamedLogger.Warn(processVotesCtx, "Ignoring vote for different block hash", + ion.String("expected", blockHash), + ion.String("got", notification.BlockHash), + ion.String("peer", notification.PeerID), + ion.String("function", "Consensus.startEventDrivenFlow.processVotes")) + continue + } - // Wait for votes to be collected (this should be replaced with event-driven trigger) - waitTime := 15 * time.Second - processSpan.SetAttributes(attribute.Float64("wait_time_seconds", waitTime.Seconds())) - logger().NamedLogger.Info(processCtx, "Waiting for vote collection", - ion.Float64("wait_time_seconds", waitTime.Seconds()), - ion.String("function", "Consensus.startEventDrivenFlow.processVotes.waitAndProcess")) - time.Sleep(waitTime) - - // Print CRDT state - printCtx, printSpan := tracer.Start(processCtx, "Consensus.startEventDrivenFlow.processVotes.printCRDTState") - printStartTime := time.Now().UTC() - logger().NamedLogger.Info(printCtx, "Triggering CRDT state print", - ion.String("function", "Consensus.startEventDrivenFlow.processVotes.printCRDTState")) - - if err := consensus.PrintCRDTState(printCtx); err != nil { - printSpan.RecordError(err) - printSpan.SetAttributes(attribute.String("status", "failed")) - printDuration := 
time.Since(printStartTime).Seconds() - printSpan.SetAttributes(attribute.Float64("duration", printDuration)) - logger().NamedLogger.Error(printCtx, "PrintCRDTState failed", - err, - ion.Float64("duration", printDuration), - ion.String("function", "Consensus.startEventDrivenFlow.processVotes.printCRDTState")) - } else { - printDuration := time.Since(printStartTime).Seconds() - printSpan.SetAttributes( - attribute.Float64("duration", printDuration), - attribute.String("status", "success"), - ) - logger().NamedLogger.Info(printCtx, "CRDT state printed successfully", - ion.Float64("duration", printDuration), - ion.String("function", "Consensus.startEventDrivenFlow.processVotes.printCRDTState")) - } - printSpan.End() + // Only accept votes from committee members + if !isCommitteeMember(notification.PeerID, consensus.PeerList.MainPeers) { + logger().NamedLogger.Warn(processVotesCtx, "Ignoring vote from non-committee peer", + ion.String("peer", notification.PeerID), + ion.String("function", "Consensus.startEventDrivenFlow.processVotes")) + continue + } - // Process vote collection - collectCtx, collectSpan := tracer.Start(processCtx, "Consensus.startEventDrivenFlow.processVotes.processVoteCollection") - collectStartTime := time.Now().UTC() - logger().NamedLogger.Info(collectCtx, "Triggering vote collection and processing", - ion.String("function", "Consensus.startEventDrivenFlow.processVotes.processVoteCollection")) - - if err := consensus.ProcessVoteCollection(); err != nil { - collectSpan.RecordError(err) - collectSpan.SetAttributes(attribute.String("status", "failed")) - collectDuration := time.Since(collectStartTime).Seconds() - collectSpan.SetAttributes(attribute.Float64("duration", collectDuration)) - logger().NamedLogger.Error(collectCtx, "ProcessVoteCollection failed", - err, - ion.Float64("duration", collectDuration), - ion.String("function", "Consensus.startEventDrivenFlow.processVotes.processVoteCollection")) - } else { - collectDuration := 
time.Since(collectStartTime).Seconds() - collectSpan.SetAttributes( - attribute.Float64("duration", collectDuration), - attribute.String("status", "success"), - ) - logger().NamedLogger.Info(collectCtx, "Vote collection and processing initiated successfully", - ion.Float64("duration", collectDuration), - ion.String("function", "Consensus.startEventDrivenFlow.processVotes.processVoteCollection")) + // Store vote (idempotent) + collectedVotes[notification.PeerID] = notification.Vote + Maps.StoreVoteResult(blockHash, notification.PeerID, notification.Vote) + + logger().NamedLogger.Info(processVotesCtx, "Vote received via push notification", + ion.String("peer", notification.PeerID), + ion.Int("vote", int(notification.Vote)), + ion.Int("collected", len(collectedVotes)), + ion.Int("required", requiredVotes), + ion.String("function", "Consensus.startEventDrivenFlow.processVotes")) + + fmt.Printf("📥 Vote received: peer=%s vote=%d (%d/%d)\n", + notification.PeerID[:16], notification.Vote, len(collectedVotes), requiredVotes) + + // Exit early if we have all votes (quorum) + if len(collectedVotes) >= requiredVotes { + logger().NamedLogger.Info(processVotesCtx, "All votes collected - quorum reached", + ion.Int("collected", len(collectedVotes)), + ion.Int("required", requiredVotes), + ion.String("function", "Consensus.startEventDrivenFlow.processVotes")) + goto VOTES_COLLECTED + } + + case <-roundCtx.Done(): + logger().NamedLogger.Warn(processVotesCtx, "Round deadline reached, proceeding with partial votes", + ion.Int("collected", len(collectedVotes)), + ion.Int("required", requiredVotes), + ion.String("function", "Consensus.startEventDrivenFlow.processVotes")) + fmt.Printf("⏰ Consensus timeout: collected %d/%d votes\n", len(collectedVotes), requiredVotes) + goto VOTES_COLLECTED } - collectSpan.End() + } - processDuration := time.Since(processVotesStartTime).Seconds() - processSpan.SetAttributes( - attribute.Float64("duration", processDuration), - attribute.String("status", 
"success"), - ) - return nil - }) +VOTES_COLLECTED: + // Unregister the vote collector now that collection is done + MessagePassing.UnregisterVoteCollector() + roundCancel() processVotesDuration := time.Since(processVotesStartTime).Seconds() - processVotesSpan.SetAttributes(attribute.Float64("duration", processVotesDuration)) + processVotesSpan.SetAttributes( + attribute.Int("votes_collected", len(collectedVotes)), + attribute.Int("votes_required", requiredVotes), + attribute.Float64("duration", processVotesDuration), + ) processVotesSpan.End() + // Print CRDT state + consensus.PrintCRDTState(trace_ctx) + + // Collect BLS results from buddy nodes (pull-based for BLS signatures) + listenerNode := PubSubMessages.NewGlobalVariables().Get_ForListner() + blsResults := consensus.CollectVoteResultsFromBuddies(listenerNode) + + // Verify consensus with BLS signatures + consensusReached := consensus.VerifyConsensusWithBLS(blsResults) + + // Broadcast and process the block + if err := consensus.BroadcastAndProcessBlock(blsResults, consensusReached); err != nil { + logger().NamedLogger.Error(trace_ctx, "Failed to broadcast and process block", + err, + ion.String("function", "Consensus.startEventDrivenFlowAfterSubscriptionPermission")) + } + totalDuration := time.Since(startTime).Seconds() asyncFlowSpan.SetAttributes( attribute.Float64("duration", totalDuration), attribute.String("status", "success"), + attribute.Int("votes_collected", len(collectedVotes)), + attribute.Bool("consensus_reached", consensusReached), ) logger().NamedLogger.Info(trace_ctx, "Event-driven consensus flow completed", ion.Float64("total_duration", totalDuration), + ion.Int("votes_collected", len(collectedVotes)), + ion.Bool("consensus_reached", consensusReached), ion.String("function", "Consensus.startEventDrivenFlowAfterSubscriptionPermission")) } 
+// isCommitteeMember checks if a peer ID string is in the committee (MainPeers list) +func isCommitteeMember(peerIDStr string, mainPeers []peer.ID) bool { + for _, p := range mainPeers { + if p.String() == peerIDStr { + return true + } + } + return false +} + // VerifySubscriptions checks if nodes are actually subscribed to the pubsub channel // This method now uses the new pubsub-based verification system func (consensus *Consensus) VerifySubscriptions(logger_ctx context.Context) error { tracer := logger().NamedLogger.Tracer("Consensus") trace_ctx, span := tracer.Start(logger_ctx, "Consensus.VerifySubscriptions") @@ -1145,8 +1186,8 @@ func (consensus *Consensus) BroadcastVoteTrigger() error { attribute.String("block_hash", consensus.ZKBlockData.GetZKBlock().BlockHash.Hex()), ) - // Use the messaging.BroadcastVoteTrigger function to broadcast the vote trigger - if err := messaging.BroadcastVoteTrigger(consensus.Host, consensus.ZKBlockData); err != nil { + // Send vote trigger only to committee members (not all connected peers) + if err := messaging.BroadcastVoteTriggerToCommittee(consensus.Host, consensus.ZKBlockData, consensus.PeerList.MainPeers); err != nil { span.RecordError(err) span.SetAttributes(attribute.String("status", "failed")) duration := time.Since(startTime).Seconds() @@ -1766,7 +1807,7 @@ func (consensus *Consensus) parseVoteResultResponse(response string, peerID peer // Extract and store numeric vote if result, ok := resultData["result"].(float64); ok { - Maps.StoreVoteResult(peerID.String(), int8(result)) + Maps.StoreVoteResult(consensus.ZKBlockData.GetZKBlock().BlockHash.Hex(), peerID.String(), int8(result)) span.SetAttributes(attribute.Int64("vote_result", int64(result))) logger().NamedLogger.Info(trace_ctx, "Received vote result from peer", ion.String("peer_id", peerID.String()), diff --git a/Sequencer/Triggers/Maps/vote_results.go b/Sequencer/Triggers/Maps/vote_results.go index 0f3410fa..19b5f592 100644 --- a/Sequencer/Triggers/Maps/vote_results.go 
+++ b/Sequencer/Triggers/Maps/vote_results.go @@ -5,50 +5,69 @@ import ( "sync" ) -// Global map to store vote results from buddy nodes: 
map[peerID]voteResult -var voteResultsMap = make(map[string]int8) +// voteResultsMap stores vote results scoped by block hash: map[blockHash]map[peerID]voteResult +var voteResultsMap = make(map[string]map[string]int8) // Mutex to protect voteResultsMap var voteResultsMutex sync.Mutex -// StoreVoteResult stores a vote result from a buddy node -func StoreVoteResult(peerID string, result int8) { +// StoreVoteResult stores a vote result from a buddy node, scoped by block hash +func StoreVoteResult(blockHash, peerID string, result int8) { voteResultsMutex.Lock() defer voteResultsMutex.Unlock() - voteResultsMap[peerID] = result - log.Printf("Stored vote result for peer %s: %d", peerID, result) + if voteResultsMap[blockHash] == nil { + voteResultsMap[blockHash] = make(map[string]int8) + } + voteResultsMap[blockHash][peerID] = result + log.Printf("Stored vote result for block %s, peer %s: %d", blockHash, peerID, result) } -// GetVoteResult retrieves a vote result for a peer -func GetVoteResult(peerID string) (int8, bool) { +// GetVoteResult retrieves a vote result for a peer in a given round +func GetVoteResult(blockHash, peerID string) (int8, bool) { voteResultsMutex.Lock() defer voteResultsMutex.Unlock() - result, exists := voteResultsMap[peerID] - return result, exists + if round, exists := voteResultsMap[blockHash]; exists { + result, ok := round[peerID] + return result, ok + } + return 0, false } -// GetAllVoteResults retrieves all vote results -func GetAllVoteResults() map[string]int8 { +// GetAllVoteResults retrieves all vote results for a given block hash +func GetAllVoteResults(blockHash string) map[string]int8 { voteResultsMutex.Lock() defer voteResultsMutex.Unlock() result := make(map[string]int8) - for k, v := range voteResultsMap { - result[k] = v + if round, exists := voteResultsMap[blockHash]; exists { + for k, v := range round { + result[k] = v + } } return result } -// ClearVoteResults clears all vote results +// ClearVoteResults clears all vote results 
across all rounds func ClearVoteResults() { voteResultsMutex.Lock() defer voteResultsMutex.Unlock() - voteResultsMap = make(map[string]int8) + voteResultsMap = make(map[string]map[string]int8) log.Printf("Cleared all vote results") } -// GetVoteResultsCount returns the number of stored vote results -func GetVoteResultsCount() int { +// ClearVoteResultsForBlock clears vote results for a specific block hash +func ClearVoteResultsForBlock(blockHash string) { voteResultsMutex.Lock() defer voteResultsMutex.Unlock() - return len(voteResultsMap) + delete(voteResultsMap, blockHash) + log.Printf("Cleared vote results for block %s", blockHash) +} + +// GetVoteResultsCount returns the number of stored vote results for a given block hash +func GetVoteResultsCount(blockHash string) int { + voteResultsMutex.Lock() + defer voteResultsMutex.Unlock() + if round, exists := voteResultsMap[blockHash]; exists { + return len(round) + } + return 0 } diff --git a/Sequencer/Triggers/Triggers.go b/Sequencer/Triggers/Triggers.go index 3afbb28f..e4665e86 100644 --- a/Sequencer/Triggers/Triggers.go +++ b/Sequencer/Triggers/Triggers.go @@ -422,7 +422,7 @@ func RequestVoteResultsFromBuddies(blockhash string) error { var resultData map[string]interface{} if err := json.Unmarshal([]byte(responseMsg.Message), &resultData); err == nil { if result, ok := resultData["result"].(float64); ok { - Maps.StoreVoteResult(peerID.String(), int8(result)) + Maps.StoreVoteResult(blockhash, peerID.String(), int8(result)) log.Printf("RequestVoteResultsFromBuddies: Stored vote result from %s: %d", peerID, int8(result)) } } @@ -452,7 +452,7 @@ func StartBFTConsensus(blockhash string) error { elapsed := time.Duration(0) for elapsed < maxWait { - count := Maps.GetVoteResultsCount() + count := Maps.GetVoteResultsCount(blockhash) if count > 0 { log.Printf("StartBFTConsensus: Found %d vote results, proceeding with BFT", count) break @@ -471,7 +471,7 @@ func StartBFTConsensus(blockhash string) error { // Prepare buddy 
input data for BFT using vote results buddyNode.Mutex.RLock() - allVoteResults := Maps.GetAllVoteResults() + allVoteResults := Maps.GetAllVoteResults(blockhash) allBuddies := make([]bft.BuddyInput, len(buddyNode.BuddyNodes.Buddies_Nodes)) for i, peerID := range buddyNode.BuddyNodes.Buddies_Nodes { diff --git a/Sequencer/consensus_statemachine.go b/Sequencer/consensus_statemachine.go index abcf0a9a..fe371f2b 100644 --- a/Sequencer/consensus_statemachine.go +++ b/Sequencer/consensus_statemachine.go @@ -45,6 +45,10 @@ type Consensus struct { // Guards to prevent infinite loops isProcessingVotes bool processedBlockHash string + // Event-driven vote collection + voteNotifyCh chan PubSubMessages.VoteNotification + roundCtx context.Context + roundCancel context.CancelFunc } // @constructor function @@ -267,6 +271,11 @@ func (consensus *Consensus) BroadcastAndProcessBlock(blsResults []BLS_Signer.BLS // CleanupSubscriptions unsubscribes from consensus-related topics to prevent resource leaks // This should be called after each consensus round completes (success or failure) func (consensus *Consensus) CleanupSubscriptions() { + // Cancel the round context if active + if consensus.roundCancel != nil { + consensus.roundCancel() + } + if consensus.gossipnode == nil { return } diff --git a/Vote/Trigger.go b/Vote/Trigger.go index 95d4af85..d7d4da99 100644 --- a/Vote/Trigger.go +++ b/Vote/Trigger.go @@ -192,31 +192,52 @@ func (vt *VoteTrigger) SubmitVote() error { // Reuse existing logger_ctx from above (already created with tracer) - // Try to send to multiple nodes if first attempt fails + // Determine the target node: prefer the sequencer (consensus creator) if known + var targetPeerID peer.ID + sequencerIDStr := vt.ConsensusMessage.GetSequencerID() + if sequencerIDStr != "" { + decoded, decErr := peer.Decode(sequencerIDStr) + if decErr != nil { + fmt.Printf("⚠️ Failed to decode SequencerID %q, falling back to consistent hashing: %v\n", sequencerIDStr, decErr) + } else { + 
targetPeerID = decoded + } + } + + // Try to send to the sequencer (or fallback to consistent hashing) maxAttempts := 3 for attempt := 0; attempt < maxAttempts; attempt++ { - // Pick up the listener node using the consistent hashing with offset - NodeToSendTo := vt.PickListnerWithOffset(listenerNode.PeerID, attempt) + var sendTo peer.ID + if targetPeerID != "" { + // Send directly to the sequencer + sendTo = targetPeerID + } else { + // Fallback: use consistent hashing (backward compatibility with old sequencer nodes) + NodeToSendTo := vt.PickListnerWithOffset(listenerNode.PeerID, attempt) + sendTo = NodeToSendTo.PeerID + } // Check if trying to send to self - skip and try next - if NodeToSendTo.PeerID == listenerNode.PeerID && attempt < maxAttempts-1 { + if sendTo == listenerNode.PeerID && attempt < maxAttempts-1 { continue } - // Send the message to the listener node + // Send the message to the target node err := MessagePassing.NewListenerStruct(listenerNode). - SendMessageToPeer(logger_ctx, NodeToSendTo.PeerID, string(messageBytes)) + SendMessageToPeer(logger_ctx, sendTo, string(messageBytes)) if err != nil { // If this is not the last attempt, try again if attempt < maxAttempts-1 { + fmt.Printf("⚠️ Failed to send vote to %s (attempt %d/%d): %v\n", sendTo, attempt+1, maxAttempts, err) continue } // Last attempt failed - return fmt.Errorf("failed to send message to listener node after %d attempts: %v", maxAttempts, err) + return fmt.Errorf("failed to send vote to sequencer %s after %d attempts: %v", sendTo, maxAttempts, err) } // Success! 
+ fmt.Printf("✅ Vote sent to sequencer %s\n", sendTo) return nil } diff --git a/config/GRO/constants.go b/config/GRO/constants.go index ad7c2d26..81f2ac77 100644 --- a/config/GRO/constants.go +++ b/config/GRO/constants.go @@ -68,6 +68,7 @@ const ( DIDThread = "thread:did" ShutdownThread = "thread:shutdown" BlockPollerThread = "thread:block:poller" + StartupSyncThread = "thread:startup:sync" // SequencerTriggerThread = "thread:sequencer:trigger" SequencerConsensusThread = "thread:sequencer:consensus" diff --git a/config/PubSubMessages/Consensus.go b/config/PubSubMessages/Consensus.go index 325f96c7..ea52cb90 100644 --- a/config/PubSubMessages/Consensus.go +++ b/config/PubSubMessages/Consensus.go @@ -18,6 +18,8 @@ type ConsensusMessage struct { StartTime time.Time InteriumTime time.Time TotalNodes int + SequencerID string // Peer ID of the consensus creator (sequencer) so voters know where to send votes + RoundID string // Unique round identifier (block hash) for vote scoping } type Buddy_PeerMultiaddr struct { diff --git a/config/PubSubMessages/Consensus_Builder.go b/config/PubSubMessages/Consensus_Builder.go index bf6ba206..47d65d9b 100644 --- a/config/PubSubMessages/Consensus_Builder.go +++ b/config/PubSubMessages/Consensus_Builder.go @@ -15,6 +15,8 @@ func NewConsensusMessageBuilder(consensusMessage *ConsensusMessage) *ConsensusMe StartTime: consensusMessage.StartTime, InteriumTime: consensusMessage.InteriumTime, TotalNodes: consensusMessage.TotalNodes, + SequencerID: consensusMessage.SequencerID, + RoundID: consensusMessage.RoundID, } } return &ConsensusMessage{} @@ -116,3 +118,21 @@ func (consensusMessage *ConsensusMessage) ClearGloalVarCacheConsensusMessage() * CacheConsensuMessage = make(map[string]*ConsensusMessage) return consensusMessage } + +func (consensusMessage *ConsensusMessage) SetSequencerID(id string) *ConsensusMessage { + consensusMessage.SequencerID = id + return consensusMessage +} + +func (consensusMessage *ConsensusMessage) GetSequencerID() 
string { + return consensusMessage.SequencerID +} + +func (consensusMessage *ConsensusMessage) SetRoundID(id string) *ConsensusMessage { + consensusMessage.RoundID = id + return consensusMessage +} + +func (consensusMessage *ConsensusMessage) GetRoundID() string { + return consensusMessage.RoundID +} diff --git a/config/PubSubMessages/vote_notification.go b/config/PubSubMessages/vote_notification.go new file mode 100644 index 00000000..5bbbcb09 --- /dev/null +++ b/config/PubSubMessages/vote_notification.go @@ -0,0 +1,9 @@ +package PubSubMessages + +// VoteNotification is pushed to the sequencer's vote collector channel +// when a vote arrives at the listener's handleSubmitVote handler. +type VoteNotification struct { + PeerID string // peer ID of the voter + BlockHash string // block hash this vote is for (round scoping) + Vote int8 // +1 accept, -1 reject +} diff --git a/config/settings/config.go b/config/settings/config.go index b26faa99..61b3825b 100644 --- a/config/settings/config.go +++ b/config/settings/config.go @@ -10,15 +10,16 @@ import ( // NodeConfig is the top-level configuration for a JMDN node. // Each section maps to a YAML key in jmdn.yaml. 
type NodeConfig struct { - Node NodeSettings `mapstructure:"node"` - Network NetworkSettings `mapstructure:"network"` - Ports PortSettings `mapstructure:"ports"` - Binds BindSettings `mapstructure:"binds"` - Database DatabaseSettings `mapstructure:"database"` - Logging LoggingSettings `mapstructure:"logging"` - Features FeatureSettings `mapstructure:"features"` - Security SecurityConfig `mapstructure:"security"` - Alerts AlertsConfig `mapstructure:"alerts"` + Node NodeSettings `mapstructure:"node"` + Network NetworkSettings `mapstructure:"network"` + Ports PortSettings `mapstructure:"ports"` + Binds BindSettings `mapstructure:"binds"` + Database DatabaseSettings `mapstructure:"database"` + Logging LoggingSettings `mapstructure:"logging"` + Features FeatureSettings `mapstructure:"features"` + Security SecurityConfig `mapstructure:"security"` + Alerts AlertsConfig `mapstructure:"alerts"` + FastSync FastSyncSettings `mapstructure:"fastsync"` } // NodeSettings defines the identity of this node. @@ -62,12 +63,23 @@ type BindSettings struct { Profiler string `mapstructure:"profiler" yaml:"profiler"` } -// DatabaseSettings controls ImmuDB connection parameters. -type DatabaseSettings struct { - Username string `mapstructure:"username" yaml:"username"` +// RedisSettings controls the Redis connection used by the account sync worker. +// The worker uses a Redis Stream (XADD/XREADGROUP/XACK) to decouple the +// WriteAccounts / BatchUpdateAccounts callers from the ~15 s ImmuDB commit latency. +// URL format: "host:port" (e.g. "localhost:6379"). +// Env override: JMDN_DATABASE_REDIS_URL, JMDN_DATABASE_REDIS_PASSWORD +type RedisSettings struct { + URL string `mapstructure:"url" yaml:"url"` Password string `mapstructure:"password" yaml:"password"` } +// DatabaseSettings controls ImmuDB and Redis connection parameters. 
+type DatabaseSettings struct { + Username string `mapstructure:"username" yaml:"username"` + Password string `mapstructure:"password" yaml:"password"` + Redis RedisSettings `mapstructure:"redis" yaml:"redis"` +} + // LoggingSettings mirrors Ion's Config struct so jmdn.yaml can fully configure // the logger (console, file, OTEL, tracing, metrics) in one place. // This replaces the old otelconfig.LogConfig and scattered env vars. @@ -102,14 +114,15 @@ type LogFileSettings struct { // LogOTELSettings configures OpenTelemetry log/trace export. type LogOTELSettings struct { - Enabled bool `mapstructure:"enabled" yaml:"enabled"` - Endpoint string `mapstructure:"endpoint" yaml:"endpoint"` - Protocol string `mapstructure:"protocol" yaml:"protocol"` // grpc or http - Insecure bool `mapstructure:"insecure" yaml:"insecure"` - Username string `mapstructure:"username" yaml:"username"` - Password string `mapstructure:"password" yaml:"password"` - BatchSize int `mapstructure:"batch_size" yaml:"batch_size"` - ExportInterval time.Duration `mapstructure:"export_interval" yaml:"export_interval"` + Enabled bool `mapstructure:"enabled" yaml:"enabled"` + Endpoint string `mapstructure:"endpoint" yaml:"endpoint"` + Protocol string `mapstructure:"protocol" yaml:"protocol"` // grpc or http + Insecure bool `mapstructure:"insecure" yaml:"insecure"` + Headers map[string]string `mapstructure:"headers" yaml:"headers"` + Username string `mapstructure:"username" yaml:"username"` + Password string `mapstructure:"password" yaml:"password"` + BatchSize int `mapstructure:"batch_size" yaml:"batch_size"` + ExportInterval time.Duration `mapstructure:"export_interval" yaml:"export_interval"` } // LogTracingSettings configures distributed tracing. @@ -123,3 +136,35 @@ type FeatureSettings struct { UseLegacyBFT bool `mapstructure:"use_legacy_bft" yaml:"use_legacy_bft"` GROTrack bool `mapstructure:"grotrack" yaml:"grotrack"` } + +// FastSyncSettings controls FastSync V2 behaviour for this node. 
+// +// Serving vs syncing are independent: +// - enabled=true → this node registers FastSync protocol handlers and serves +// block/account data to any peer that requests it. +// - sync=true → this node is allowed to pull data from peers and update +// its own local database (HeaderSync, DataSync, Reconciliation). +// +// A sequencer should set sync=false so it never overwrites its own authoritative +// state, while keeping enabled=true so other nodes can still sync from it. +type FastSyncSettings struct { + // Enabled controls whether the FastSync engine is initialized and protocol + // handlers are registered. Set false to disable FastSync entirely. + Enabled bool `mapstructure:"enabled" yaml:"enabled"` + + // EnablePulling controls whether this node will pull data from peers and write to its + // local DB. false = read-only participant (serves data, never updates itself). + EnablePulling bool `mapstructure:"enable_pulling" yaml:"enable_pulling"` + + // PullOnStartup controls whether the node attempts to catch up on missed blocks + // automatically when it (re)starts and connects to peers. + PullOnStartup bool `mapstructure:"pull_on_startup" yaml:"pull_on_startup"` + + // SyncTimeout is the maximum wall-clock time allowed for a single full sync + // operation before it is cancelled. + SyncTimeout time.Duration `mapstructure:"sync_timeout" yaml:"sync_timeout"` + + // AllowedPeers is an optional whitelist of libp2p peer IDs this node will + // accept sync data FROM. Empty list = accept from any peer. 
+ AllowedPeers []string `mapstructure:"allowed_peers" yaml:"allowed_peers"` +} diff --git a/config/settings/defaults.go b/config/settings/defaults.go index 8c660631..60ae4cac 100644 --- a/config/settings/defaults.go +++ b/config/settings/defaults.go @@ -42,6 +42,10 @@ func DefaultConfig() NodeConfig { Database: DatabaseSettings{ Username: "", Password: "", + Redis: RedisSettings{ + URL: "127.0.0.1:6379", // required for account sync worker; set via jmdn.yaml or JMDN_DATABASE_REDIS_URL + Password: "jmdnredissync", // optional: set if Redis requires authentication + }, }, Logging: LoggingSettings{ Level: "warn", @@ -64,6 +68,7 @@ func DefaultConfig() NodeConfig { Enabled: false, Protocol: "grpc", Insecure: false, + Headers: map[string]string{}, BatchSize: 512, ExportInterval: 5 * time.Second, }, @@ -76,6 +81,13 @@ func DefaultConfig() NodeConfig { UseLegacyBFT: false, GROTrack: false, }, + FastSync: FastSyncSettings{ + Enabled: true, + EnablePulling: true, + PullOnStartup: true, + SyncTimeout: 10 * time.Minute, + AllowedPeers: []string{}, + }, Security: DefaultSecurityConfig(), Alerts: DefaultAlertsConfig(), } diff --git a/config/settings/loader.go b/config/settings/loader.go index cf5dda78..3c60233a 100644 --- a/config/settings/loader.go +++ b/config/settings/loader.go @@ -123,6 +123,8 @@ func setDefaults(v *viper.Viper) { // Database v.SetDefault("database.username", d.Database.Username) v.SetDefault("database.password", d.Database.Password) + v.SetDefault("database.redis.url", d.Database.Redis.URL) + v.SetDefault("database.redis.password", d.Database.Redis.Password) // Logging v.SetDefault("logging.level", d.Logging.Level) @@ -148,6 +150,7 @@ func setDefaults(v *viper.Viper) { v.SetDefault("logging.otel.endpoint", d.Logging.OTEL.Endpoint) v.SetDefault("logging.otel.protocol", d.Logging.OTEL.Protocol) v.SetDefault("logging.otel.insecure", d.Logging.OTEL.Insecure) + v.SetDefault("logging.otel.headers", d.Logging.OTEL.Headers) v.SetDefault("logging.otel.username", 
d.Logging.OTEL.Username) v.SetDefault("logging.otel.password", d.Logging.OTEL.Password) v.SetDefault("logging.otel.batch_size", d.Logging.OTEL.BatchSize) @@ -161,10 +164,38 @@ func setDefaults(v *viper.Viper) { v.SetDefault("features.use_legacy_bft", d.Features.UseLegacyBFT) v.SetDefault("features.grotrack", d.Features.GROTrack) + // FastSync + v.SetDefault("fastsync.enabled", d.FastSync.Enabled) + v.SetDefault("fastsync.enable_pulling", d.FastSync.EnablePulling) + v.SetDefault("fastsync.pull_on_startup", d.FastSync.PullOnStartup) + v.SetDefault("fastsync.sync_timeout", d.FastSync.SyncTimeout) + v.SetDefault("fastsync.allowed_peers", d.FastSync.AllowedPeers) + // Security + v.SetDefault("security.enabled", d.Security.Enabled) + v.SetDefault("security.cert_dir", d.Security.CertDir) + v.SetDefault("security.ip_cache_size", d.Security.IPCacheSize) + v.SetDefault("security.global_rate_limit", d.Security.GlobalRateLimit) + v.SetDefault("security.global_burst", d.Security.GlobalBurst) + v.SetDefault("security.trust_forwarded_headers", d.Security.TrustForwardedHeaders) + v.SetDefault("security.trusted_proxies", d.Security.TrustedProxies) + v.SetDefault("security.trusted_clients", d.Security.TrustedClients) v.SetDefault("security.explorer_api_key", d.Security.ExplorerAPIKey) v.SetDefault("security.jwt_secret", d.Security.JWTSecret) + // Register defaults for all predefined Security Services so Viper can pick up ENV overrides + for svcName, policy := range d.Security.Services { + prefix := "security.services." + svcName + "." 
+ v.SetDefault(prefix+"tls", policy.TLS) + v.SetDefault(prefix+"auth_type", string(policy.AuthType)) + v.SetDefault(prefix+"token_env", policy.TokenEnv) + v.SetDefault(prefix+"rate_limit", policy.RateLimit) + v.SetDefault(prefix+"burst", policy.Burst) + v.SetDefault(prefix+"cert_file", policy.CertFile) + v.SetDefault(prefix+"key_file", policy.KeyFile) + v.SetDefault(prefix+"ca_file", policy.CAFile) + } + // Alerts v.SetDefault("alerts.url", d.Alerts.URL) v.SetDefault("alerts.api_key", d.Alerts.APIKey) diff --git a/docs/phases/account-enqueue-chunking.md b/docs/phases/account-enqueue-chunking.md new file mode 100644 index 00000000..cdb0a1aa --- /dev/null +++ b/docs/phases/account-enqueue-chunking.md @@ -0,0 +1,67 @@ +# Bounded Account Enqueue (Chunking) — Implementation Phases + +Mode: **Prod**. Scope: **accounts only**, consumer-side (`DB_OPs/Nodeinfo`). No library change, no `go.mod` bump. + +## Problem (evidence) +- Library client receive handler `core/sync/sync_protocols.go:666 HandleAccountsSyncData` ACKs each page (`:713`, before any DB) then accumulates **all** pages into one in-memory `batch` and calls `WriteAccountsBatch` **once at EOF** (`:720`). +- That single call hands `account_manager.WriteAccounts` (`immudb_account_manager.go:170`) the entire batch (up to millions). It does one `json.Marshal` + one `XADD` → a single huge message. Redis caps a bulk string at `proto-max-bulk-len` (512 MiB default): the message stalls or is rejected → EOF write fails *after* all pages were ACKed → server session fails → dispatcher retries the range (`DispatchACKTimeout=10s`, `DispatchMaxRetries=3`) → dead-letter storm → sync never converges. +- Consumer enqueue path is already async (returns before ImmuDB commit); the worker drain/ACK/XDEL contract is correct. The ONLY defect is the unbounded single message. + +## SOLID Gates +- **S** — Invariant owned by the producer methods: "deliver an account/update batch to the stream as bounded, individually-valid messages." 
(Worker owns the separate "ACK only after commit" invariant — unchanged.) +- **O** — New record kinds extend via the existing `syncPayloadType` tag + a `processBatch` case; the generic helper accepts any `[]T`. No switch edit needed to change chunk size (const) or add a caller. +- **I** — Helper depends only on `RedisStreamer.Enqueue` (1 method of the 8-method interface) — minimal surface; no new interface added. +- **D** — Helper depends on the `RedisStreamer` interface, not `*redis.Client`. No new concrete cross-package import. (Pre-existing `DB_OPs` import in the worker is out of scope.) + +## Pattern Selection +Primary pattern: **none new** — bounded iteration inside the established Producer (account_manager) → Adapter (RedisStreamer) structure. +Why: the fix is a loop + size bound, not a new abstraction. Adding a Strategy/Builder would be ceremony. +Trade-off: chunk size is a const, not injected — promote to config later via the documented extension point if needed. +Anti-pattern avoided: a "MessageChunker" service object / new interface for a 12-line helper. + +## Phase 1.0: Bounded enqueue helper + const + timeout +- What: add `maxRecordsPerMessage` const (500), `enqueueTimeout(chunks)` helper, and generic `enqueueRecordsChunked[T any](ctx, s RedisStreamer, ptype syncPayloadType, items []T) error` in `immudb_account_manager.go`. Best-effort over chunks, `errors.Join` aggregation. +- Data structures: input `[]T` (sequential, read-once, O(1) re-slicing into fixed chunks — no map, no copy). Bound: each marshalled message holds ≤ `maxRecordsPerMessage` records. +- Inputs: none (uses existing `RedisStreamer`, stream constants). +- Done when: helper compiles; never marshals more than `maxRecordsPerMessage` records into one message. +- Status: [x] + +## Phase 1.1: Rewire WriteAccounts + BatchUpdateAccounts +- Trigger: 1.0 helper exists; both producers must use it instead of the single `json.Marshal`+`XADD`. 
+- What: replace the one-shot marshal/enqueue in `WriteAccounts` (`:170`) and `BatchUpdateAccounts` (`:324`) with chunk-count computation + `enqueueRecordsChunked`. Sized context via `enqueueTimeout`. +- Done when: both methods enqueue N records as `ceil(N/500)` messages; error wraps record + message counts. +- Status: [x] + +## Phase 1.2: Docs — module headers / function docs +- Trigger: behavior of the two interface methods changed; worker `writeEntries` bound is now finite. +- What: update `WriteAccounts`/`BatchUpdateAccounts` doc comments (chunking + best-effort semantics); update `account_sync_worker.go` module header `[]dbEntry` growth-bound note (now `MaxDrainItems × maxRecordsPerMessage`, previously unbounded). +- Done when: doc comments reflect chunking; worker header bound corrected. +- Status: [x] + +## Phase 1.3: White-box test +- Trigger: helper is unexported; needs same-package test. +- What: `account_sync_enqueue_test.go` (package NodeInfo). Mock `RedisStreamer` records `Enqueue` payloads. Table: 0,1,499,500,501,1000,2500 → assert message count = ceil(n/500), each decoded chunk ≤ 500, sum == n, correct type tag. Failure case: every 3rd chunk errors → `errors.Join` non-nil AND remaining chunks still enqueued (best-effort). +- Deviation (documented): craftcode Phase 6 wants tests under `tests/`; Go package-internal visibility forces a same-dir `_test.go`. Matches existing repo convention (`DB_OPs/sqlops/sqlops_test.go`). +- Done when: `go test ./DB_OPs/Nodeinfo/ -run Enqueue` passes (no live infra needed — mock streamer). +- Status: [x] + +## Phase 2.0: Library durable-before-ACK (scope B — JMDN-FastSync) +Repo: `../JMDN-FastSync` (NOT this repo). Files: +- `common/types/constants/accounts_constants.go` — added `AccountReceiveFlushThreshold = 20_000` (documents the bound; the per-page rewrite makes peak receive memory one page/stream). 
+- `core/sync/sync_protocols.go HandleAccountsSyncData` — rewrote the receive loop from "accumulate whole stream → one `WriteAccountsBatch` at EOF" to **per-page: read → `WriteAccountsBatch` (WAL + Redis enqueue) → ACK**. Success = `BatchAck` (Ok); persist failure = `ErrBatchAck` (Ok=false) → dispatcher retries page → dead-letter on repeated failure. +- What it fixes: the client previously buffered the entire diff range (up to ~2.7M accounts, ~10 streams) in one slice → OOM. The server's 200k nonce-buffer cap does NOT bound the client. Now receive memory = one page (~3k) per stream. +- Server impact: **none** — stateless dispatcher unchanged; still one-ack-per-page contract; NAK rides the pre-existing retry→DLQ path (`DispatcherCallbacks.go:134`, `run.go handleFailure`). ACK now reflects true durability. +- Done when: `JMDN-FastSync` builds + vets clean; `jmdn` builds end-to-end against it. +- Status: [x] (code) / [ ] (published — see Integration) + +## Phase 2.1: Integration / publish +- Local verify (done): `go mod edit -replace github.com/JupiterMetaLabs/JMDN-FastSync=../JMDN-FastSync` in `jmdn/go.mod`; `CGO_ENABLED=1 go build ./...` → exit 0. **This replace is DEV-ONLY — must be reverted before merge.** +- Production path (NOT yet done — requires user): commit + push `JMDN-FastSync` (branch `fix/accountsync/performance`), obtain the new pseudo-version, then `go get github.com/JupiterMetaLabs/JMDN-FastSync@` in `jmdn` and remove the `replace`. Until then the library change is not in any published artifact. +- Status: [ ] + +## Non-goals (explicit) +- No fix to `parseUpdatesPayload` AccountType/DIDAddress behavior (separate concern). +- No worker config or drain-logic change. +- No change to the server dispatcher (statelessness preserved). 
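The bounded enqueue described in Phase 1.0 can be sketched roughly as follows. This is a minimal illustration, not the code in `immudb_account_manager.go`: the one-method `enqueuer` interface, the `envelope` wire shape, the string type tag, and the `recordingEnqueuer` mock are all assumptions made for the example. Only the 500-record bound, the best-effort behaviour, and the `errors.Join` aggregation come from the plan above.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
)

// maxRecordsPerMessage mirrors the 500-record bound named in Phase 1.0.
const maxRecordsPerMessage = 500

// enqueuer is a hypothetical one-method stand-in for the RedisStreamer
// interface; the real method signature may differ.
type enqueuer interface {
	Enqueue(ctx context.Context, payload []byte) error
}

// envelope is an assumed wire shape: a type tag plus one chunk of records.
type envelope[T any] struct {
	Type    string `json:"type"`
	Records []T    `json:"records"`
}

// enqueueRecordsChunked re-slices items into chunks of at most
// maxRecordsPerMessage and enqueues each chunk as its own message.
// It is best-effort: a failed chunk does not stop later chunks, and all
// failures are aggregated with errors.Join.
func enqueueRecordsChunked[T any](ctx context.Context, s enqueuer, ptype string, items []T) error {
	var errs []error
	for start := 0; start < len(items); start += maxRecordsPerMessage {
		end := start + maxRecordsPerMessage
		if end > len(items) {
			end = len(items)
		}
		payload, err := json.Marshal(envelope[T]{Type: ptype, Records: items[start:end]})
		if err != nil {
			errs = append(errs, fmt.Errorf("marshal records %d-%d: %w", start, end, err))
			continue
		}
		if err := s.Enqueue(ctx, payload); err != nil {
			errs = append(errs, fmt.Errorf("enqueue records %d-%d: %w", start, end, err))
		}
	}
	return errors.Join(errs...)
}

// recordingEnqueuer is a hypothetical mock that stores accepted payloads
// and fails every failEvery-th call when failEvery > 0.
type recordingEnqueuer struct {
	payloads  [][]byte
	failEvery int
	calls     int
}

func (r *recordingEnqueuer) Enqueue(_ context.Context, payload []byte) error {
	r.calls++
	if r.failEvery > 0 && r.calls%r.failEvery == 0 {
		return fmt.Errorf("enqueue call %d failed", r.calls)
	}
	r.payloads = append(r.payloads, payload)
	return nil
}

// runDemo enqueues n dummy records through the mock and reports how many
// messages were accepted.
func runDemo(n, failEvery int) (int, error) {
	rec := &recordingEnqueuer{failEvery: failEvery}
	err := enqueueRecordsChunked(context.Background(), rec, "accounts", make([]int, n))
	return len(rec.payloads), err
}

func main() {
	sent, err := runDemo(1001, 0)
	fmt.Println(sent, err) // 1001 records split into 500 + 500 + 1
}
```

Re-slicing `items[start:end]` avoids copying the input, which matches the "no map, no copy" note in Phase 1.0. Whether a partially failed enqueue should be retried as a whole is a caller decision that this sketch leaves open.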
+ + diff --git a/docs/phases/accountsync-cursor-pagination.md b/docs/phases/accountsync-cursor-pagination.md new file mode 100644 index 00000000..8f73b234 --- /dev/null +++ b/docs/phases/accountsync-cursor-pagination.md @@ -0,0 +1,73 @@ +# AccountSync Performance Fixes — Implementation Phases + +## Context +AccountSync wall-clock >2 days on 10k blocks + 2.7M accounts. +All issues below are in this repo (`jmdn`). Issues in `JMDN-FastSync` library are tracked separately. + +## SOLID Gates +**S:** Each fix owns one invariant (scan, block read, type conversion). +**O:** New scan behaviour → pass `extendedPrefix`; no existing code modified. +**I:** No fat interfaces introduced. +**D:** No new cross-package concrete imports. + +## Pattern Selection +Iterator (Behavioral) for pagination; Facade (Structural) for fast block read variant. + +--- + +## Phase 1: Cursor-based pagination — DONE +- What: Replace `offset int` with `seekKey []byte` cursor in `immudbNonceIter`. + Add `ListAccountsPaginatedFrom` (ascending, cursor-based) in `account_immuclient.go`. + Remove dead `nonceToAccount map` + `sync.Mutex` from iterator. +- Impact: ~365M ImmuDB scan entries → ~2.7M. O(N²) → O(N). +- Files: `DB_OPs/account_immuclient.go`, `DB_OPs/Nodeinfo/immudb_account_manager.go` +- Done when: build passes, `offset` field gone from `immudbNonceIter`. ✅ + +--- + +## Phase 2: Fix `defer ReadCancel()` inside loop in `ListAccountsPaginated` — DONE +- What: Line 1085 — `defer ReadCancel()` is inside a `for` loop. Each iteration + schedules a cancel that only fires on function return, not on loop iteration end. + All cancel funcs accumulate for the function lifetime → goroutine/context leak. + Fix: call `ReadCancel()` immediately after the `Scan` call (not deferred). +- Files: `DB_OPs/account_immuclient.go` +- Done when: no `defer` inside the scan loop of `ListAccountsPaginated`. 
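The Phase 2 fix can be illustrated with a small sketch. The names `scanPage`, `scanAll`, and `runScanDemo` are hypothetical stand-ins, not functions in `account_immuclient.go`; the point shown is only that the cancel function is called right after each scan returns, so at most one timeout context is live per iteration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// scanPage is a hypothetical stand-in for a single Scan call against the
// database. Here it only reports whether its context is still valid.
func scanPage(ctx context.Context, page int) error {
	return ctx.Err()
}

// scanAll runs one scan per page, each with its own timeout context.
// readCancel is called as soon as the scan returns instead of being
// deferred, so the timer behind each context is released per iteration
// rather than accumulating until the function returns.
func scanAll(parent context.Context, pages int, timeout time.Duration) (int, error) {
	done := 0
	for page := 0; page < pages; page++ {
		readCtx, readCancel := context.WithTimeout(parent, timeout)
		err := scanPage(readCtx, page)
		readCancel() // release now, not at function return
		if err != nil {
			return done, fmt.Errorf("scan page %d: %w", page, err)
		}
		done++
	}
	return done, nil
}

// runScanDemo scans the given number of pages with a one-second timeout each.
func runScanDemo(pages int) (int, error) {
	return scanAll(context.Background(), pages, time.Second)
}

func main() {
	n, err := runScanDemo(5)
	fmt.Println(n, err)
}
```

An alternative with the same effect is to move the loop body into its own function and keep the `defer` there, since the deferred call then runs at the end of every iteration.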
+ +--- + +## Phase 3: Add `GetZKBlockByNumberFast` (plain Get, no proof generation) — DONE +- What: `GetZKBlockByNumber` uses `VerifiedGet` — generates a cryptographic Merkle + proof per read (5–10× slower than plain `Get`). Sync/reconciliation paths do not + need tamper-proof guarantees. Add `GetZKBlockByNumberFast` using `ic.Client.Get`. + Keep `GetZKBlockByNumber` (VerifiedGet) for client-facing verified queries. +- Data structures: none new; same `*config.ZKBlock` return type. +- Files: `DB_OPs/immuclient.go` +- Done when: `GetZKBlockByNumberFast` exported, compiles, uses plain `Get`. + +--- + +## Phase 4: `GetTransactionsByAccount` uses `GetZKBlockByNumberFast` — DONE +- What: `GetTransactionsByAccount` (line 1293) loops every block 0→latestBlock, + calling `GetZKBlockByNumber` (VerifiedGet) per block. This is called per tagged + account during reconciliation → O(accounts × blocks) VerifiedGet calls. + Switch to `GetZKBlockByNumberFast`. Also fix `GetTransactionsByAccountPaginated` + (line 1576) which has the same issue. +- Data structures: none new. +- Files: `DB_OPs/account_immuclient.go` +- Done when: both functions call `GetZKBlockByNumberFast`, no `GetZKBlockByNumber` + call remains inside a block-scan loop. + +--- + +## Phase 5: Remove JSON round-trip in `GetTransactionsForAccount` (#15) — DONE +- What: `immudb_account_manager.go:40-48` marshals each `config.Transaction` to JSON + then unmarshals into `types.DBTransaction` just to convert types. Direct field copy + eliminates two allocs + two reflect traversals per transaction. +- Files: `DB_OPs/Nodeinfo/immudb_account_manager.go` +- Done when: no `json.Marshal` / `json.Unmarshal` in the tx conversion loop. + +--- + +## Phase 6: Build verification — DONE +- What: `go build ./...` — zero errors, zero new import cycles. +- Done when: clean build across all changed packages. 
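
The cursor-based iteration from Phase 1 can be sketched against an in-memory ordered key set. This is an assumption-laden illustration: the `store` type, `scanFrom`, `iterateAll`, and the `acct:` key format are invented for the example and do not reflect the ImmuDB client API. It shows why carrying the last key of each page as `seekKey` touches every entry once, whereas an offset would make each page re-skip everything before it.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// store is a hypothetical in-memory stand-in for an ordered key-value scan.
// keys must be sorted ascending.
type store struct {
	keys [][]byte
}

// newStore builds a store holding n zero-padded keys in ascending order.
func newStore(n int) *store {
	s := &store{}
	for i := 0; i < n; i++ {
		s.keys = append(s.keys, []byte(fmt.Sprintf("acct:%06d", i)))
	}
	return s
}

// scanFrom returns up to limit keys strictly greater than seekKey.
// A nil seekKey starts from the first key.
func (s *store) scanFrom(seekKey []byte, limit int) [][]byte {
	start := 0
	if seekKey != nil {
		start = sort.Search(len(s.keys), func(i int) bool {
			return bytes.Compare(s.keys[i], seekKey) > 0
		})
	}
	end := start + limit
	if end > len(s.keys) {
		end = len(s.keys)
	}
	return s.keys[start:end]
}

// iterateAll walks the whole store page by page, carrying the last key of
// each page forward as the cursor for the next scan.
func iterateAll(s *store, pageSize int) (total, pages int) {
	var cursor []byte
	for {
		page := s.scanFrom(cursor, pageSize)
		if len(page) == 0 {
			return total, pages
		}
		total += len(page)
		pages++
		cursor = page[len(page)-1]
	}
}

func main() {
	total, pages := iterateAll(newStore(2500), 1000)
	fmt.Println(total, pages) // 2500 keys read in pages of 1000, 1000, 500
}
```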
diff --git a/docs/phases/redis-accountsync-queue.md b/docs/phases/redis-accountsync-queue.md new file mode 100644 index 00000000..89728a74 --- /dev/null +++ b/docs/phases/redis-accountsync-queue.md @@ -0,0 +1,113 @@ +# Redis AccountSync Queue — Implementation Phases + +## Context + +**Problem:** `BatchRestoreAccounts` (ImmuDB commit) takes ~15 s. AccountSync callers time +out waiting, push to DLQ, retry, and waste throughput. + +**Solution:** `WriteAccounts` and `BatchUpdateAccounts` enqueue payloads to a Redis Stream +and return an immediate ACK. A single background worker (`XREADGROUP` + `XAUTOCLAIM`) +drains the stream, coalesces batches, and writes to ImmuDB asynchronously. + +## Design Decisions (locked) + +| Decision | Choice | Rationale | +|---|---|---| +| Interface contract | Unchanged (`types.AccountManager`) | External module; signatures fixed | +| Redis unavailable | Fail fast | Caller already has DLQ/retry; B degrades to 15 s latency | +| Worker lifecycle | Explicit `StartAccountSyncWorker(ctx, streamer, cfg)` from main.go | main.go owns all infra lifecycles | +| Queue mechanism | Redis Streams (`XADD`/`XREADGROUP`/`XACK`/`XAUTOCLAIM`) | Built-in PEL, ACK semantics, crash recovery | +| Batch coalescing | Drain `MaxDrainItems` entries per `XREADGROUP`; write in `MaxAccountsPerBatch` sub-batches | Reduces DB round trips under burst | +| ACK semantics | ACK only after `BatchRestoreAccounts` succeeds | At-least-once; `BatchRestoreAccounts` is LWW-idempotent | +| Redis client injection | Interface `RedisStreamer` injected via `StartAccountSyncWorker`; `NewRedisStreamer(*redis.Client)` adapter in package | DIP; no concrete cross-package import | + +## SOLID Gates + +**S — Single Responsibility** +- `account_sync_redis.go`: owns "define the Redis stream transport abstraction" +- `account_sync_worker.go`: owns "drain Redis stream → write to ImmuDB (at-least-once)" +- `immudb_account_manager.go`: owns "enqueue account sync payloads and return ACK immediately" + 
+**O — Open/Closed** +Extension point: new payload types (e.g., DID sync) → add `case` in `processBatch` switch + +new `enqueue*` helper in `immudb_account_manager.go`. Worker loop and stream infra untouched. + +**I — Interface Segregation** +`RedisStreamer` has exactly 5 methods: `Enqueue`, `EnsureConsumerGroup`, `ReadGroup`, `Ack`, +`AutoClaim`. All 5 are used by the worker. No caller sees unused methods. + +**D — Dependency Inversion** +Worker and account_manager both depend on `RedisStreamer` (interface in this package). +Only `redisStreamerAdapter` imports `*redis.Client` (concrete, local to the adapter). +No concrete cross-package import anywhere else in `DB_OPs/Nodeinfo`. + +## Pattern Selection + +**Primary pattern: Adapter** (Structural) +`redisStreamerAdapter` adapts the concrete `*redis.Client` API to the domain `RedisStreamer` +interface. Callers depend on the interface; the adapter is the only concrete import. + +**Secondary: Command** (Behavioral) +Each stream entry is a serialized command (account write operation) consumed by the worker. +Enables at-least-once replay via PEL without reissuing the original RPC. + +**Anti-pattern avoided:** Direct concrete dependency on `*redis.Client` throughout +`DB_OPs/Nodeinfo` — would couple the package to a specific Redis client library forever. + +--- + +## Phase 1.0: RedisStreamer interface + adapter +- **What:** New file `account_sync_redis.go`. + - `StreamEntry` struct + - `RedisStreamer` interface (5 methods; no go-redis types exposed) + - `redisStreamerAdapter` wrapping `*redis.Client` + - `NewRedisStreamer(*redis.Client) RedisStreamer` factory + - Package-level `pkgStreamer`/`pkgStreamerMu` + `setStreamer`/`getStreamer` + - Stream constants: `accountSyncStream`, `accountSyncGroup`, `accountSyncConsumer` + - Payload type constants: `payloadTypeAccounts`, `payloadTypeUpdates` +- **Data structures:** + - `StreamEntry`: ephemeral per read; unbounded count, capped by `MaxDrainItems` at call site. 
+ - `pkgStreamer`: singleton reference; set once by Phase 2's `StartAccountSyncWorker`. +- **Inputs:** none +- **Done when:** package compiles; `NewRedisStreamer` returns a non-nil `RedisStreamer` +- **Status:** [x] + +## Phase 2.0: Worker — `account_sync_worker.go` +- **What:** New file with: + - `AccountSyncWorkerConfig` struct + `DefaultWorkerConfig()` + - `StartAccountSyncWorker(ctx, streamer, cfg) error` + - `runWorker` (XREADGROUP BLOCK loop, ctx-aware exit) + - `reclaimPending` (XAUTOCLAIM on startup for crash recovery) + - `processBatch` (parse → coalesce → sub-batch write → ACK; poison pill handling) + - `parseAccountsPayload` / `parseUpdatesPayload` + - `accountUpdateWire` (stable JSON wire type for `types.AccountUpdate`) + - `dbEntry` type alias for `struct { Key string; Value []byte }` +- **Data structures:** + - `[]StreamEntry`: ephemeral per `runWorker` iteration; bounded by `MaxDrainItems` (100) + - `[]dbEntry`: ephemeral per `processBatch`; bounded by `MaxDrainItems × avg-accounts-per-payload`; sub-batched by `MaxAccountsPerBatch` (500) + - PEL (Redis-side): unbounded count of unacked entries; evicted by XAUTOCLAIM after `PendingIdleTimeout` (30 s) +- **Inputs:** Phase 1.0 complete +- **Done when:** `StartAccountSyncWorker` compiles; worker exits cleanly on ctx cancel +- **Status:** [x] + +## Phase 3.0: Modify `immudb_account_manager.go` +- **What:** + - `WriteAccounts` → `getStreamer()` → `json.Marshal(accounts)` → `s.Enqueue(...)` → return + - `BatchUpdateAccounts` → convert to `[]accountUpdateWire` → `json.Marshal` → `s.Enqueue(...)` → return + - Remove: direct `DB_OPs.GetAccountConnectionandPutBack` + `DB_OPs.BatchRestoreAccounts` calls from these two methods +- **Data structures:** none introduced; removes ephemeral `[]struct{Key,Value}` from both methods +- **Inputs:** Phase 2.0 complete (`accountUpdateWire` defined there; same package) +- **Done when:** `go build ./DB_OPs/Nodeinfo/...` succeeds; both methods no longer block on ImmuDB +- 
**Status:** [x] + +## Phase 4.0: main.go wiring (caller's responsibility) +- **What:** In main.go (or lifecycle coordinator), after Redis client is initialized: + ```go + streamer := NodeInfo.NewRedisStreamer(redisClient) + if err := NodeInfo.StartAccountSyncWorker(rootCtx, streamer, NodeInfo.DefaultWorkerConfig()); err != nil { + log.Fatalf("account sync worker: %v", err) + } + ``` +- **Inputs:** Phase 3.0 complete +- **Done when:** node boots, worker log line appears, WriteAccounts returns in < 100 ms +- **Status:** [ ] — caller's responsibility diff --git a/explorer/api.go b/explorer/api.go index a6bf5672..c6c3b561 100644 --- a/explorer/api.go +++ b/explorer/api.go @@ -301,8 +301,8 @@ func (s *ImmuDBServer) StartWithContext(ctx context.Context, addr string) error srv := &http.Server{ Addr: bindAddr, Handler: s.router, - ReadTimeout: 10 * time.Second, - WriteTimeout: 10 * time.Second, + ReadTimeout: 60 * time.Second, + WriteTimeout: 60 * time.Second, MaxHeaderBytes: 1 << 20, // 1 MB } diff --git a/fastsync/fastsync.go b/fastsync/fastsync.go index c3514460..bfefb709 100644 --- a/fastsync/fastsync.go +++ b/fastsync/fastsync.go @@ -1110,7 +1110,7 @@ func (fs *FastSync) batchCreateOrderedWithRetry(entries []struct { } case AccountsDB: fmt.Printf(">>> [DB] Calling BatchRestoreAccounts for AccountsDB with %d entries...\n", len(entries)) - err = DB_OPs.BatchRestoreAccounts(dbClient, entries) + err = DB_OPs.BatchRestoreAccounts(context.Background(), dbClient, entries) if err != nil { fmt.Printf(">>> [DB] ERROR: BatchRestoreAccounts failed for AccountsDB: %v\n", err) } else { diff --git a/go.mod b/go.mod index 8124b053..eb14307a 100644 --- a/go.mod +++ b/go.mod @@ -3,10 +3,10 @@ module gossipnode go 1.25.0 require ( - github.com/JupiterMetaLabs/JMDN-FastSync v0.0.0-20260303175904-869ab7d63ad2 - github.com/JupiterMetaLabs/JMDN_Merkletree v0.0.0-20260213044906-5629a60edea4 + github.com/JupiterMetaLabs/JMDN-FastSync v0.0.0-20260601052219-40e74741de7c + 
github.com/JupiterMetaLabs/JMDN_Merkletree v0.0.0-20260413092720-b819e61566f8 github.com/JupiterMetaLabs/goroutine-orchestrator v0.1.5 - github.com/JupiterMetaLabs/ion v0.3.5 + github.com/JupiterMetaLabs/ion v0.4.2 github.com/bits-and-blooms/bloom/v3 v3.7.1 github.com/codenotary/immudb v1.10.0 github.com/ethereum/go-ethereum v1.17.0 @@ -25,15 +25,16 @@ require ( github.com/olekukonko/tablewriter v0.0.5 github.com/prometheus/client_golang v1.23.2 github.com/prometheus/client_model v0.6.2 + github.com/redis/go-redis/v9 v9.19.0 github.com/rs/zerolog v1.34.0 github.com/spf13/viper v1.21.0 github.com/stretchr/testify v1.11.1 github.com/tyler-smith/go-bip39 v1.1.0 github.com/yahoo/coname v0.0.0-20170609175141-84592ddf8673 go.dedis.ch/dela v0.2.0 - go.opentelemetry.io/otel v1.40.0 + go.opentelemetry.io/otel v1.42.0 golang.org/x/time v0.12.0 - google.golang.org/grpc v1.78.0 + google.golang.org/grpc v1.79.3 google.golang.org/protobuf v1.36.11 ) @@ -73,14 +74,14 @@ require ( github.com/golang/snappy v1.0.0 // indirect github.com/grpc-ecosystem/go-grpc-middleware v1.3.0 // indirect github.com/grpc-ecosystem/grpc-gateway v1.16.0 // indirect - github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.3 // indirect + github.com/grpc-ecosystem/grpc-gateway/v2 v2.28.0 // indirect github.com/huin/goupnp v1.3.0 // indirect github.com/inconshreveable/mousetrap v1.1.0 // indirect github.com/ipfs/go-cid v0.5.0 // indirect github.com/jackpal/go-nat-pmp v1.0.2 // indirect github.com/jbenet/go-temp-err-catcher v0.1.0 // indirect github.com/json-iterator/go v1.1.12 // indirect - github.com/klauspost/compress v1.18.2 // indirect + github.com/klauspost/compress v1.18.5 // indirect github.com/klauspost/cpuid/v2 v2.3.0 // indirect github.com/koron/go-ssdp v0.0.6 // indirect github.com/kylelemons/godebug v1.1.0 // indirect @@ -148,6 +149,7 @@ require ( github.com/rogpeppe/go-internal v1.14.1 // indirect github.com/rs/xid v1.6.0 // indirect github.com/sagikazarmark/locafero v0.11.0 // indirect + 
github.com/shirou/gopsutil v3.21.11+incompatible // indirect github.com/sourcegraph/conc v0.3.1-0.20240121214520-5f936abd7ae8 // indirect github.com/spaolacci/murmur3 v1.1.0 // indirect github.com/spf13/afero v1.15.0 // indirect @@ -164,24 +166,26 @@ require ( github.com/twitchyliquid64/golang-asm v0.15.1 // indirect github.com/ugorji/go/codec v1.3.0 // indirect github.com/wlynxg/anet v0.0.5 // indirect + github.com/yusufpapurcu/wmi v1.2.4 // indirect go.dedis.ch/fixbuf v1.0.3 // indirect go.dedis.ch/kyber/v3 v3.1.0 // indirect go.opentelemetry.io/auto/sdk v1.2.1 // indirect - go.opentelemetry.io/contrib/bridges/otelzap v0.14.0 // indirect - go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.15.0 // indirect - go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.15.0 // indirect - go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.39.0 // indirect - go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.39.0 // indirect - go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.39.0 // indirect - go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.39.0 // indirect - go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.39.0 // indirect - go.opentelemetry.io/otel/log v0.15.0 // indirect - go.opentelemetry.io/otel/metric v1.40.0 // indirect - go.opentelemetry.io/otel/sdk v1.39.0 // indirect - go.opentelemetry.io/otel/sdk/log v0.15.0 // indirect - go.opentelemetry.io/otel/sdk/metric v1.39.0 // indirect - go.opentelemetry.io/otel/trace v1.40.0 // indirect - go.opentelemetry.io/proto/otlp v1.9.0 // indirect + go.opentelemetry.io/contrib/bridges/otelzap v0.17.0 // indirect + go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.18.0 // indirect + go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.18.0 // indirect + go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.42.0 // indirect + go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.42.0 
// indirect + go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.42.0 // indirect + go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.42.0 // indirect + go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.42.0 // indirect + go.opentelemetry.io/otel/log v0.18.0 // indirect + go.opentelemetry.io/otel/metric v1.42.0 // indirect + go.opentelemetry.io/otel/sdk v1.42.0 // indirect + go.opentelemetry.io/otel/sdk/log v0.18.0 // indirect + go.opentelemetry.io/otel/sdk/metric v1.42.0 // indirect + go.opentelemetry.io/otel/trace v1.42.0 // indirect + go.opentelemetry.io/proto/otlp v1.10.0 // indirect + go.uber.org/atomic v1.11.0 // indirect go.uber.org/dig v1.19.0 // indirect go.uber.org/fx v1.24.0 // indirect go.uber.org/mock v0.6.0 // indirect @@ -190,20 +194,20 @@ require ( go.yaml.in/yaml/v2 v2.4.2 // indirect go.yaml.in/yaml/v3 v3.0.4 // indirect golang.org/x/arch v0.20.0 // indirect - golang.org/x/crypto v0.46.0 // indirect + golang.org/x/crypto v0.49.0 // indirect golang.org/x/exp v0.0.0-20250606033433-dcc06ee1d476 // indirect - golang.org/x/mod v0.30.0 // indirect - golang.org/x/net v0.48.0 // indirect - golang.org/x/sync v0.19.0 // indirect - golang.org/x/sys v0.39.0 // indirect - golang.org/x/telemetry v0.0.0-20251111182119-bc8e575c7b54 // indirect - golang.org/x/term v0.38.0 // indirect - golang.org/x/text v0.32.0 // indirect - golang.org/x/tools v0.39.0 // indirect + golang.org/x/mod v0.33.0 // indirect + golang.org/x/net v0.52.0 // indirect + golang.org/x/sync v0.20.0 // indirect + golang.org/x/sys v0.42.0 // indirect + golang.org/x/telemetry v0.0.0-20260209163413-e7419c687ee4 // indirect + golang.org/x/term v0.41.0 // indirect + golang.org/x/text v0.35.0 // indirect + golang.org/x/tools v0.42.0 // indirect golang.org/x/xerrors v0.0.0-20231012003039-104605ab7028 // indirect google.golang.org/genproto v0.0.0-20230803162519-f966b187b2e5 // indirect - google.golang.org/genproto/googleapis/api v0.0.0-20251222181119-0a764e51fe1b // 
indirect - google.golang.org/genproto/googleapis/rpc v0.0.0-20251222181119-0a764e51fe1b // indirect + google.golang.org/genproto/googleapis/api v0.0.0-20260319201613-d00831a3d3e7 // indirect + google.golang.org/genproto/googleapis/rpc v0.0.0-20260319201613-d00831a3d3e7 // indirect gopkg.in/natefinch/lumberjack.v2 v2.2.1 // indirect gopkg.in/yaml.v3 v3.0.1 // indirect lukechampine.com/blake3 v1.4.1 // indirect diff --git a/go.sum b/go.sum index 91671a1a..51b870f2 100644 --- a/go.sum +++ b/go.sum @@ -1,16 +1,16 @@ cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= cloud.google.com/go v0.34.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= -github.com/JupiterMetaLabs/JMDN_Merkletree v0.0.0-20260213044906-5629a60edea4 h1:L8laR48CLdrTsV8mfxaBfpVR+/U22tQyIxbP25i2d90= -github.com/JupiterMetaLabs/JMDN_Merkletree v0.0.0-20260213044906-5629a60edea4/go.mod h1:9AvHMXXjd0dSPiPmsjKRfgUPTIyxRyoUC0RtVPIVVlc= +github.com/JupiterMetaLabs/JMDN-FastSync v0.0.0-20260601052219-40e74741de7c h1:2Kgkf8pb/FEkLllenyy48GsHda4501EvwHOSdEXabNY= +github.com/JupiterMetaLabs/JMDN-FastSync v0.0.0-20260601052219-40e74741de7c/go.mod h1:0erT7gGH4TYtitRik+Y3GfxSa5KGLacr9rJovV3vNB0= +github.com/JupiterMetaLabs/JMDN_Merkletree v0.0.0-20260413092720-b819e61566f8 h1:yPrYb6g6NnqGsiCVqMf0zndEYTuelL3B03Fee+utLWA= +github.com/JupiterMetaLabs/JMDN_Merkletree v0.0.0-20260413092720-b819e61566f8/go.mod h1:zM8F31G2SiPXzTo1WzbDFZ5iOOAkqrkuZjS0QVDW4ew= github.com/JupiterMetaLabs/goroutine-orchestrator v0.1.5 h1:S9+s6JeWSrGJ6ooYb4f8iRlJxwPUZ8X/EA4EgxKS3zc= github.com/JupiterMetaLabs/goroutine-orchestrator v0.1.5/go.mod h1:SNkJRVlUwZM7Lt5ZhojWaimBljLg/pV6IKgn8oyViOA= -github.com/JupiterMetaLabs/ion v0.3.5 h1:L5xg2rSuyxaMjY/y0uxQfNc5lg/hEHofVUec5Bok1Ik= -github.com/JupiterMetaLabs/ion v0.3.5/go.mod h1:R64AKOZ4AFLSr/Hp9eBBK1rwvQwuIUx5Ebhqerq63RU= +github.com/JupiterMetaLabs/ion v0.4.2 
h1:hogqCgUAQuy6yvLUdXoFOtJlvczFVaRvHGB7NgnFFfc= +github.com/JupiterMetaLabs/ion v0.4.2/go.mod h1:7RPjP/Zo+qJ+PC/yhfz0/I7/i6rHDuopistQivoY8jc= github.com/ProjectZKM/Ziren/crates/go-runtime/zkvm_runtime v0.0.0-20251001021608-1fe7b43fc4d6 h1:1zYrtlhrZ6/b6SAjLSfKzWtdgqK0U+HtH/VcBWh1BaU= github.com/ProjectZKM/Ziren/crates/go-runtime/zkvm_runtime v0.0.0-20251001021608-1fe7b43fc4d6/go.mod h1:ioLG6R+5bUSO1oeGSDxOV3FADARuMoytZCSX6MEMQkI= -github.com/StackExchange/wmi v1.2.1 h1:VIkavFPXSjcnS+O8yTq7NI32k0R5Aj+v39y29VYDOSA= -github.com/StackExchange/wmi v1.2.1/go.mod h1:rcmrprowKIVzvc+NUiLncP2uuArMWLCbu9SBzvHz7e8= github.com/aead/chacha20 v0.0.0-20180709150244-8b13a72661da h1:KjTM2ks9d14ZYCvmHS9iAKVt9AyzRSqNU1qabPih5BY= github.com/aead/chacha20 v0.0.0-20180709150244-8b13a72661da/go.mod h1:eHEWzANqSiWQsof+nXEI9bUVUyV6F53Fp89EuCh2EAA= github.com/aead/chacha20poly1305 v0.0.0-20170617001512-233f39982aeb/go.mod h1:UzH9IX1MMqOcwhoNOIjmTQeAxrFgzs50j4golQtXXxU= @@ -27,6 +27,10 @@ github.com/bits-and-blooms/bitset v1.24.2 h1:M7/NzVbsytmtfHbumG+K2bremQPMJuqv1JD github.com/bits-and-blooms/bitset v1.24.2/go.mod h1:7hO7Gc7Pp1vODcmWvKMRA9BNmbv6a/7QIWpPxHddWR8= github.com/bits-and-blooms/bloom/v3 v3.7.1 h1:WXovk4TRKZttAMJfoQx6K2DM0zNIt8w+c67UqO+etV0= github.com/bits-and-blooms/bloom/v3 v3.7.1/go.mod h1:rZzYLLje2dfzXfAkJNxQQHsKurAyK55KUnL43Euk0hU= +github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs= +github.com/bsm/ginkgo/v2 v2.12.0/go.mod h1:SwYbGRRDovPVboqFv0tPTcG1sN61LM1Z4ARdbAV9g4c= +github.com/bsm/gomega v1.27.10 h1:yeMWxP2pV2fG3FgAODIY8EiRE3dy0aeFYt4l7wh6yKA= +github.com/bsm/gomega v1.27.10/go.mod h1:JyEr/xRbxbtgWNi8tIEVPUYZ5Dzef52k01W3YH0H+O0= github.com/bytedance/sonic v1.14.0 h1:/OfKt8HFw0kh2rj8N0F6C/qPGRESq0BbaNZgcNXXzQQ= github.com/bytedance/sonic v1.14.0/go.mod h1:WoEbx8WTcFJfzCe0hbmyTGrfjt8PzNEBdxlNUO24NhA= github.com/bytedance/sonic/loader v0.3.0 h1:dskwH8edlzNMctoruo8FPTJDF3vLtDT0sXZwvZJyqeA= @@ -91,6 +95,7 @@ github.com/go-logr/logr v1.4.3 
h1:CjnDlHq8ikf6E492q6eKboGOC0T8CDaOvkHCIg8idEI= github.com/go-logr/logr v1.4.3/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY= github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag= github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE= +github.com/go-ole/go-ole v1.2.6/go.mod h1:pprOEPIfldk/42T2oK7lQ4v4JSDwmV0As9GaiUsvbm0= github.com/go-ole/go-ole v1.3.0 h1:Dt6ye7+vXGIKZ7Xtk4s6/xVdGDQynvom7xCFEdWr6uE= github.com/go-ole/go-ole v1.3.0/go.mod h1:5LS6F96DhAwUc7C+1HLexzMXY1xGRSryjyPPKW6zv78= github.com/go-playground/assert/v2 v2.2.0 h1:JvknZsQTYeFEAhQwI4qEt9cyV5ONwRHC+lYKSsYSR8s= @@ -143,8 +148,8 @@ github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0 h1:Ovs26xHkKqVztRpIrF/92Bcuy github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0/go.mod h1:8NvIoxWQoOIhqOTXgfV/d3M/q6VIi02HzZEHgUlZvzk= github.com/grpc-ecosystem/grpc-gateway v1.16.0 h1:gmcG1KaJ57LophUzW0Hy8NmPhnMZb4M0+kPpLofRdBo= github.com/grpc-ecosystem/grpc-gateway v1.16.0/go.mod h1:BDjrQk3hbvj6Nolgz8mAMFbcEtjT1g+wF4CSlocrBnw= -github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.3 h1:NmZ1PKzSTQbuGHw9DGPFomqkkLWMC+vZCkfs+FHv1Vg= -github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.3/go.mod h1:zQrxl1YP88HQlA6i9c63DSVPFklWpGX4OWAc9bFuaH4= +github.com/grpc-ecosystem/grpc-gateway/v2 v2.28.0 h1:HWRh5R2+9EifMyIHV7ZV+MIZqgz+PMpZ14Jynv3O2Zs= +github.com/grpc-ecosystem/grpc-gateway/v2 v2.28.0/go.mod h1:JfhWUomR1baixubs02l85lZYYOm7LV6om4ceouMv45c= github.com/hashicorp/golang-lru/v2 v2.0.7 h1:a+bsQ5rvGLjzHuww6tVxozPZFVghXaHOwFs4luLUK2k= github.com/hashicorp/golang-lru/v2 v2.0.7/go.mod h1:QeFd9opnmA6QUJc5vARoKUSoFhyfM2/ZepoAG6RGpeM= github.com/holiman/uint256 v1.3.2 h1:a9EgMPSC1AAaj1SZL5zIQD3WbwTuHrMGOerLjGmM/TA= @@ -163,8 +168,8 @@ github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnr github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo= github.com/kisielk/errcheck v1.5.0/go.mod 
h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8= github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck= -github.com/klauspost/compress v1.18.2 h1:iiPHWW0YrcFgpBYhsA6D1+fqHssJscY/Tm/y2Uqnapk= -github.com/klauspost/compress v1.18.2/go.mod h1:R0h/fSBs8DE4ENlcrlib3PsXS61voFxhIs2DeRhCvJ4= +github.com/klauspost/compress v1.18.5 h1:/h1gH5Ce+VWNLSWqPzOVn6XBO+vJbCNGvjoaGBFW2IE= +github.com/klauspost/compress v1.18.5/go.mod h1:cwPg85FWrGar70rWktvGQj8/hthj3wpl0PGDogxkrSQ= github.com/klauspost/cpuid/v2 v2.3.0 h1:S4CRMLnYUhGeDFDqkGriYKdfoFlDnMtqTiI/sFzhA9Y= github.com/klauspost/cpuid/v2 v2.3.0/go.mod h1:hqwkgyIinND0mEev00jJYCxPNVRVXFQeu1XKlok6oO0= github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= @@ -346,6 +351,8 @@ github.com/quic-go/quic-go v0.59.0 h1:OLJkp1Mlm/aS7dpKgTc6cnpynnD2Xg7C1pwL6vy/SA github.com/quic-go/quic-go v0.59.0/go.mod h1:upnsH4Ju1YkqpLXC305eW3yDZ4NfnNbmQRCMWS58IKU= github.com/quic-go/webtransport-go v0.10.0 h1:LqXXPOXuETY5Xe8ITdGisBzTYmUOy5eSj+9n4hLTjHI= github.com/quic-go/webtransport-go v0.10.0/go.mod h1:LeGIXr5BQKE3UsynwVBeQrU1TPrbh73MGoC6jd+V7ow= +github.com/redis/go-redis/v9 v9.19.0 h1:XPVaaPSnG6RhYf7p+rmSa9zZfeVAnWsH5h3lxthOm/k= +github.com/redis/go-redis/v9 v9.19.0/go.mod h1:v/M13XI1PVCDcm01VtPFOADfZtHf8YW3baQf57KlIkA= github.com/rivo/uniseg v0.2.0 h1:S1pD9weZBuJdFmowNwbpi7BJ8TNftyUImj/0WQi72jY= github.com/rivo/uniseg v0.2.0/go.mod h1:J6wj4VEh+S6ZtnVlnTBMWIodfgj8LQOQFoIToxlJtxc= github.com/rogpeppe/fastuuid v1.2.0/go.mod h1:jVj6XXZzXRy/MSR5jhDC/2q6DgLz+nrA6LYCDYWNEvQ= @@ -358,8 +365,8 @@ github.com/rs/zerolog v1.34.0/go.mod h1:bJsvje4Z08ROH4Nhs5iH600c3IkWhwp44iRc54W6 github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= github.com/sagikazarmark/locafero v0.11.0 h1:1iurJgmM9G3PA/I+wWYIOw/5SyBtxapeHDcg+AAIFXc= github.com/sagikazarmark/locafero v0.11.0/go.mod h1:nVIGvgyzw595SUSUE6tvCp3YYTeHs15MvlmU87WwIik= 
-github.com/shirou/gopsutil v3.21.4-0.20210419000835-c7a38de76ee5+incompatible h1:Bn1aCHHRnjv4Bl16T8rcaFjYSrGrIZvpiGO6P3Q4GpU= -github.com/shirou/gopsutil v3.21.4-0.20210419000835-c7a38de76ee5+incompatible/go.mod h1:5b4v6he4MtMOwMlS0TUMTu2PcXUg8+E1lC7eC3UO/RA= +github.com/shirou/gopsutil v3.21.11+incompatible h1:+1+c1VGhc88SSonWP6foOcLhvnKlUeu/erjjvaPEYiI= +github.com/shirou/gopsutil v3.21.11+incompatible/go.mod h1:5b4v6he4MtMOwMlS0TUMTu2PcXUg8+E1lC7eC3UO/RA= github.com/sirupsen/logrus v1.4.2/go.mod h1:tLMulIdttU9McNUspp0xgXVQah82FyeX6MwdIuYE2rE= github.com/sourcegraph/conc v0.3.1-0.20240121214520-5f936abd7ae8 h1:+jumHNA0Wrelhe64i8F6HNlS8pkoyMv5sreGx2Ry5Rw= github.com/sourcegraph/conc v0.3.1-0.20240121214520-5f936abd7ae8/go.mod h1:3n1Cwaq1E1/1lhQhtRK2ts/ZwZEhjcQeJQ1RuC6Q/8U= @@ -425,6 +432,10 @@ github.com/yahoo/coname v0.0.0-20170609175141-84592ddf8673/go.mod h1:Wq2sZrP++Us github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74= github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74= github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY= +github.com/yusufpapurcu/wmi v1.2.4 h1:zFUKzehAFReQwLys1b/iSMl+JQGSCSjtVqQn9bBrPo0= +github.com/yusufpapurcu/wmi v1.2.4/go.mod h1:SBZ9tNy3G9/m5Oi98Zks0QjeHVDvuK0qfxQmPyzfmi0= +github.com/zeebo/xxh3 v1.1.0 h1:s7DLGDK45Dyfg7++yxI0khrfwq9661w9EN78eP/UZVs= +github.com/zeebo/xxh3 v1.1.0/go.mod h1:IisAie1LELR4xhVinxWS5+zf1lA4p0MW4T+w+W07F5s= go.dedis.ch/dela v0.2.0 h1:ZwMvLzMBeVfl2LDIB4gQNsrRFIGPAuSLX2TwCz9zQas= go.dedis.ch/dela v0.2.0/go.mod h1:2qkjZawF0II6GCPFC8LnP6XaxHoq/IEbuLvcsM4wT8o= go.dedis.ch/fixbuf v1.0.3 h1:hGcV9Cd/znUxlusJ64eAlExS+5cJDIyTyEG+otu5wQs= @@ -441,43 +452,45 @@ go.etcd.io/bbolt v1.3.9 h1:8x7aARPEXiXbHmtUwAIv7eV2fQFHrLLavdiJ3uzJXoI= go.etcd.io/bbolt v1.3.9/go.mod h1:zaO32+Ti0PK1ivdPtgMESzuzL2VPoIG1PCQNvOdo/dE= go.opentelemetry.io/auto/sdk v1.2.1 h1:jXsnJ4Lmnqd11kwkBV2LgLoFMZKizbCi5fNZ/ipaZ64= go.opentelemetry.io/auto/sdk 
v1.2.1/go.mod h1:KRTj+aOaElaLi+wW1kO/DZRXwkF4C5xPbEe3ZiIhN7Y= -go.opentelemetry.io/contrib/bridges/otelzap v0.14.0 h1:2nKw2ZXZOC0N8RBsBbYwGwfKR7kJWzzyCZ6QfUGW/es= -go.opentelemetry.io/contrib/bridges/otelzap v0.14.0/go.mod h1:kvyVt0WEI5BB6XaIStXPIkCSQ2nSkyd8IZnAHLEXge4= -go.opentelemetry.io/otel v1.40.0 h1:oA5YeOcpRTXq6NN7frwmwFR0Cn3RhTVZvXsP4duvCms= -go.opentelemetry.io/otel v1.40.0/go.mod h1:IMb+uXZUKkMXdPddhwAHm6UfOwJyh4ct1ybIlV14J0g= -go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.15.0 h1:W+m0g+/6v3pa5PgVf2xoFMi5YtNR06WtS7ve5pcvLtM= -go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.15.0/go.mod h1:JM31r0GGZ/GU94mX8hN4D8v6e40aFlUECSQ48HaLgHM= -go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.15.0 h1:EKpiGphOYq3CYnIe2eX9ftUkyU+Y8Dtte8OaWyHJ4+I= -go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.15.0/go.mod h1:nWFP7C+T8TygkTjJ7mAyEaFaE7wNfms3nV/vexZ6qt0= -go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.39.0 h1:cEf8jF6WbuGQWUVcqgyWtTR0kOOAWY1DYZ+UhvdmQPw= -go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.39.0/go.mod h1:k1lzV5n5U3HkGvTCJHraTAGJ7MqsgL1wrGwTj1Isfiw= -go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.39.0 h1:nKP4Z2ejtHn3yShBb+2KawiXgpn8In5cT7aO2wXuOTE= -go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.39.0/go.mod h1:NwjeBbNigsO4Aj9WgM0C+cKIrxsZUaRmZUO7A8I7u8o= -go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.39.0 h1:f0cb2XPmrqn4XMy9PNliTgRKJgS5WcL/u0/WRYGz4t0= -go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.39.0/go.mod h1:vnakAaFckOMiMtOIhFI2MNH4FYrZzXCYxmb1LlhoGz8= -go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.39.0 h1:in9O8ESIOlwJAEGTkkf34DesGRAc/Pn8qJ7k3r/42LM= -go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.39.0/go.mod h1:Rp0EXBm5tfnv0WL+ARyO/PHBEaEAT8UUHQ6AGJcSq6c= -go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.39.0 
h1:Ckwye2FpXkYgiHX7fyVrN1uA/UYd9ounqqTuSNAv0k4= -go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.39.0/go.mod h1:teIFJh5pW2y+AN7riv6IBPX2DuesS3HgP39mwOspKwU= -go.opentelemetry.io/otel/log v0.15.0 h1:0VqVnc3MgyYd7QqNVIldC3dsLFKgazR6P3P3+ypkyDY= -go.opentelemetry.io/otel/log v0.15.0/go.mod h1:9c/G1zbyZfgu1HmQD7Qj84QMmwTp2QCQsZH1aeoWDE4= -go.opentelemetry.io/otel/log/logtest v0.15.0 h1:porNFuxAjodl6LhePevOc3n7bo3Wi3JhGXNWe7KP8iU= -go.opentelemetry.io/otel/log/logtest v0.15.0/go.mod h1:c8epqBXGHgS1LiNgmD+LuNYK9lSS3mqvtMdxLsfJgLg= -go.opentelemetry.io/otel/metric v1.40.0 h1:rcZe317KPftE2rstWIBitCdVp89A2HqjkxR3c11+p9g= -go.opentelemetry.io/otel/metric v1.40.0/go.mod h1:ib/crwQH7N3r5kfiBZQbwrTge743UDc7DTFVZrrXnqc= -go.opentelemetry.io/otel/sdk v1.39.0 h1:nMLYcjVsvdui1B/4FRkwjzoRVsMK8uL/cj0OyhKzt18= -go.opentelemetry.io/otel/sdk v1.39.0/go.mod h1:vDojkC4/jsTJsE+kh+LXYQlbL8CgrEcwmt1ENZszdJE= -go.opentelemetry.io/otel/sdk/log v0.15.0 h1:WgMEHOUt5gjJE93yqfqJOkRflApNif84kxoHWS9VVHE= -go.opentelemetry.io/otel/sdk/log v0.15.0/go.mod h1:qDC/FlKQCXfH5hokGsNg9aUBGMJQsrUyeOiW5u+dKBQ= -go.opentelemetry.io/otel/sdk/log/logtest v0.14.0 h1:Ijbtz+JKXl8T2MngiwqBlPaHqc4YCaP/i13Qrow6gAM= -go.opentelemetry.io/otel/sdk/log/logtest v0.14.0/go.mod h1:dCU8aEL6q+L9cYTqcVOk8rM9Tp8WdnHOPLiBgp0SGOA= -go.opentelemetry.io/otel/sdk/metric v1.39.0 h1:cXMVVFVgsIf2YL6QkRF4Urbr/aMInf+2WKg+sEJTtB8= -go.opentelemetry.io/otel/sdk/metric v1.39.0/go.mod h1:xq9HEVH7qeX69/JnwEfp6fVq5wosJsY1mt4lLfYdVew= -go.opentelemetry.io/otel/trace v1.40.0 h1:WA4etStDttCSYuhwvEa8OP8I5EWu24lkOzp+ZYblVjw= -go.opentelemetry.io/otel/trace v1.40.0/go.mod h1:zeAhriXecNGP/s2SEG3+Y8X9ujcJOTqQ5RgdEJcawiA= -go.opentelemetry.io/proto/otlp v1.9.0 h1:l706jCMITVouPOqEnii2fIAuO3IVGBRPV5ICjceRb/A= -go.opentelemetry.io/proto/otlp v1.9.0/go.mod h1:xE+Cx5E/eEHw+ISFkwPLwCZefwVjY+pqKg1qcK03+/4= +go.opentelemetry.io/contrib/bridges/otelzap v0.17.0 h1:oCltVHJcblcth2z9B9dRTeZIZTe2Sf9Ad9h8bcc+s8M= 
+go.opentelemetry.io/contrib/bridges/otelzap v0.17.0/go.mod h1:G/VE1A/hRn6mEWdfC8rMvSdQVGM64KUPi4XilLkwcQw= +go.opentelemetry.io/otel v1.42.0 h1:lSQGzTgVR3+sgJDAU/7/ZMjN9Z+vUip7leaqBKy4sho= +go.opentelemetry.io/otel v1.42.0/go.mod h1:lJNsdRMxCUIWuMlVJWzecSMuNjE7dOYyWlqOXWkdqCc= +go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.18.0 h1:deI9UQMoGFgrg5iLPgzueqFPHevDl+28YKfSpPTI6rY= +go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.18.0/go.mod h1:PFx9NgpNUKXdf7J4Q3agRxMs3Y07QhTCVipKmLsMKnU= +go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.18.0 h1:icqq3Z34UrEFk2u+HMhTtRsvo7Ues+eiJVjaJt62njs= +go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.18.0/go.mod h1:W2m8P+d5Wn5kipj4/xmbt9uMqezEKfBjzVJadfABSBE= +go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.42.0 h1:MdKucPl/HbzckWWEisiNqMPhRrAOQX8r4jTuGr636gk= +go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.42.0/go.mod h1:RolT8tWtfHcjajEH5wFIZ4Dgh5jpPdFXYV9pTAk/qjc= +go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.42.0 h1:H7O6RlGOMTizyl3R08Kn5pdM06bnH8oscSj7o11tmLA= +go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.42.0/go.mod h1:mBFWu/WOVDkWWsR7Tx7h6EpQB8wsv7P0Yrh0Pb7othc= +go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.42.0 h1:THuZiwpQZuHPul65w4WcwEnkX2QIuMT+UFoOrygtoJw= +go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.42.0/go.mod h1:J2pvYM5NGHofZ2/Ru6zw/TNWnEQp5crgyDeSrYpXkAw= +go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.42.0 h1:zWWrB1U6nqhS/k6zYB74CjRpuiitRtLLi68VcgmOEto= +go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.42.0/go.mod h1:2qXPNBX1OVRC0IwOnfo1ljoid+RD0QK3443EaqVlsOU= +go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.42.0 h1:uLXP+3mghfMf7XmV4PkGfFhFKuNWoCvvx5wP/wOXo0o= +go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.42.0/go.mod h1:v0Tj04armyT59mnURNUJf7RCKcKzq+lgJs6QSjHjaTc= 
+go.opentelemetry.io/otel/log v0.18.0 h1:XgeQIIBjZZrliksMEbcwMZefoOSMI1hdjiLEiiB0bAg= +go.opentelemetry.io/otel/log v0.18.0/go.mod h1:KEV1kad0NofR3ycsiDH4Yjcoj0+8206I6Ox2QYFSNgI= +go.opentelemetry.io/otel/log/logtest v0.18.0 h1:2QeyoKJdIgK2LJhG1yn78o/zmpXx1EditeyRDREqVS8= +go.opentelemetry.io/otel/log/logtest v0.18.0/go.mod h1:v1vh3PYR9zIa5MK6HwkH2lMrLBg/Y9Of6Qc+krlesX0= +go.opentelemetry.io/otel/metric v1.42.0 h1:2jXG+3oZLNXEPfNmnpxKDeZsFI5o4J+nz6xUlaFdF/4= +go.opentelemetry.io/otel/metric v1.42.0/go.mod h1:RlUN/7vTU7Ao/diDkEpQpnz3/92J9ko05BIwxYa2SSI= +go.opentelemetry.io/otel/sdk v1.42.0 h1:LyC8+jqk6UJwdrI/8VydAq/hvkFKNHZVIWuslJXYsDo= +go.opentelemetry.io/otel/sdk v1.42.0/go.mod h1:rGHCAxd9DAph0joO4W6OPwxjNTYWghRWmkHuGbayMts= +go.opentelemetry.io/otel/sdk/log v0.18.0 h1:n8OyZr7t7otkeTnPTbDNom6rW16TBYGtvyy2Gk6buQw= +go.opentelemetry.io/otel/sdk/log v0.18.0/go.mod h1:C0+wxkTwKpOCZLrlJ3pewPiiQwpzycPI/u6W0Z9fuYk= +go.opentelemetry.io/otel/sdk/log/logtest v0.18.0 h1:l3mYuPsuBx6UKE47BVcPrZoZ0q/KER57vbj2qkgDLXA= +go.opentelemetry.io/otel/sdk/log/logtest v0.18.0/go.mod h1:7cHtiVJpZebB3wybTa4NG+FUo5NPe3PROz1FqB0+qdw= +go.opentelemetry.io/otel/sdk/metric v1.42.0 h1:D/1QR46Clz6ajyZ3G8SgNlTJKBdGp84q9RKCAZ3YGuA= +go.opentelemetry.io/otel/sdk/metric v1.42.0/go.mod h1:Ua6AAlDKdZ7tdvaQKfSmnFTdHx37+J4ba8MwVCYM5hc= +go.opentelemetry.io/otel/trace v1.42.0 h1:OUCgIPt+mzOnaUTpOQcBiM/PLQ/Op7oq6g4LenLmOYY= +go.opentelemetry.io/otel/trace v1.42.0/go.mod h1:f3K9S+IFqnumBkKhRJMeaZeNk9epyhnCmQh/EysQCdc= +go.opentelemetry.io/proto/otlp v1.10.0 h1:IQRWgT5srOCYfiWnpqUYz9CVmbO8bFmKcwYxpuCSL2g= +go.opentelemetry.io/proto/otlp v1.10.0/go.mod h1:/CV4QoCR/S9yaPj8utp3lvQPoqMtxXdzn7ozvvozVqk= go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE= +go.uber.org/atomic v1.11.0 h1:ZvwS0R+56ePWxUNi+Atn9dWONBPp/AUETXlHW0DxSjE= +go.uber.org/atomic v1.11.0/go.mod h1:LUxbIzbOniOlMKjJjyPfpl4v+PKK2cNJn91OQbhoJI0= go.uber.org/dig v1.19.0 h1:BACLhebsYdpQ7IROQ1AGPjrXcP5dF80U3gKoFzbaq/4= 
go.uber.org/dig v1.19.0/go.mod h1:Us0rSJiThwCv2GteUN0Q7OKvU7n5J4dxZ9JKUXozFdE= go.uber.org/fx v1.24.0 h1:wE8mruvpg2kiiL1Vqd0CC+tr0/24XIB10Iwp2lLWzkg= @@ -510,8 +523,8 @@ golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5y golang.org/x/crypto v0.8.0/go.mod h1:mRqEX+O9/h5TFCrQhkgjo2yKi0yYA+9ecGkdQoHrywE= golang.org/x/crypto v0.12.0/go.mod h1:NF0Gs7EO5K4qLn+Ylc+fih8BSTeIjAP05siRnAh98yw= golang.org/x/crypto v0.18.0/go.mod h1:R0j02AL6hcrfOiy9T4ZYp/rcWeMxM3L6QYxlOuEG1mg= -golang.org/x/crypto v0.46.0 h1:cKRW/pmt1pKAfetfu+RCEvjvZkA9RimPbh7bhFjGVBU= -golang.org/x/crypto v0.46.0/go.mod h1:Evb/oLKmMraqjZ2iQTwDwvCtJkczlDuTmdJXoZVzqU0= +golang.org/x/crypto v0.49.0 h1:+Ng2ULVvLHnJ/ZFEq4KdcDd/cfjrrjjNSXNzxg0Y4U4= +golang.org/x/crypto v0.49.0/go.mod h1:ErX4dUh2UM+CFYiXZRTcMpEcN8b/1gxEuv3nODoYtCA= golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= golang.org/x/exp v0.0.0-20250606033433-dcc06ee1d476 h1:bsqhLWFR6G6xiQcb+JoGqdKdRU6WzPWmK8E0jxTjzo4= golang.org/x/exp v0.0.0-20250606033433-dcc06ee1d476/go.mod h1:3//PLf8L/X+8b4vuAfHzxeRUl04Adcb341+IGKfnqS8= @@ -522,8 +535,8 @@ golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA= golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA= golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4= golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= -golang.org/x/mod v0.30.0 h1:fDEXFVZ/fmCKProc/yAXXUijritrDzahmwwefnjoPFk= -golang.org/x/mod v0.30.0/go.mod h1:lAsf5O2EvJeSFMiBxXDki7sCgAxEUcZHXoXMKT4GJKc= +golang.org/x/mod v0.33.0 h1:tHFzIWbBifEmbwtGz65eaWyGiGZatSrT9prnU8DbVL8= +golang.org/x/mod v0.33.0/go.mod h1:swjeQEj+6r7fODbD2cqrnje9PnziFuw4bmLbBZFrQ5w= golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod 
h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20190108225652-1e06a53dbb7e/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= @@ -543,8 +556,8 @@ golang.org/x/net v0.9.0/go.mod h1:d48xBJpPfHeWQsugry2m+kC02ZBRGRgulfHnEXEuWns= golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg= golang.org/x/net v0.14.0/go.mod h1:PpSgVXXLK0OxS0F31C1/tv6XNguvCrnXIDrFMspZIUI= golang.org/x/net v0.20.0/go.mod h1:z8BVo6PvndSri0LbOE3hAn0apkU+1YvI6E70E9jsnvY= -golang.org/x/net v0.48.0 h1:zyQRTTrjc33Lhh0fBgT/H3oZq9WuvRR5gPC70xpDiQU= -golang.org/x/net v0.48.0/go.mod h1:+ndRgGjkh8FGtu1w1FGbEC31if4VrNVMuKTgcAAnQRY= +golang.org/x/net v0.52.0 h1:He/TN1l0e4mmR3QqHMT2Xab3Aj3L9qjbhRm78/6jrW0= +golang.org/x/net v0.52.0/go.mod h1:R1MAz7uMZxVMualyPXb+VaqGSa3LIaUqk0eEt3w36Sw= golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= @@ -556,14 +569,15 @@ golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJ golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= -golang.org/x/sync v0.19.0 h1:vV+1eWNmZ5geRlYjzm2adRgW2/mcpevXNg50YZtPCE4= -golang.org/x/sync v0.19.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI= +golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4= +golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0= golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys 
v0.0.0-20181026203630-95b1ffbd15a5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20190124100055-b90733256f2e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20190422165155-953cdadca894/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20190916202348-b4ddaad3f8a3/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20200602225109-6fdc65e7d980/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= @@ -582,10 +596,10 @@ golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.11.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.16.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= -golang.org/x/sys v0.39.0 h1:CvCKL8MeisomCi6qNZ+wbb0DN9E5AATixKsvNtMoMFk= -golang.org/x/sys v0.39.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks= -golang.org/x/telemetry v0.0.0-20251111182119-bc8e575c7b54 h1:E2/AqCUMZGgd73TQkxUMcMla25GB9i/5HOdLr+uH7Vo= -golang.org/x/telemetry v0.0.0-20251111182119-bc8e575c7b54/go.mod h1:hKdjCMrbv9skySur+Nek8Hd0uJ0GuxJIoIX2payrIdQ= +golang.org/x/sys v0.42.0 h1:omrd2nAlyT5ESRdCLYdm3+fMfNFE/+Rf4bDIQImRJeo= +golang.org/x/sys v0.42.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw= +golang.org/x/telemetry v0.0.0-20260209163413-e7419c687ee4 h1:bTLqdHv7xrGlFbvf5/TXNxy/iUwwdkjhqQTJDjW7aj0= +golang.org/x/telemetry v0.0.0-20260209163413-e7419c687ee4/go.mod 
h1:g5NllXBEermZrmR51cJDQxmJUHUOfRAaNyWBM+R+548= golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8= golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k= @@ -593,8 +607,8 @@ golang.org/x/term v0.7.0/go.mod h1:P32HKFT3hSsZrRxla30E9HqToFYAQPCMs/zFMBUFqPY= golang.org/x/term v0.8.0/go.mod h1:xPskH00ivmX89bAKVGSKKtLOWNx2+17Eiy94tnKShWo= golang.org/x/term v0.11.0/go.mod h1:zC9APTIj3jG3FdV/Ons+XE1riIZXG4aZ4GTHiPZJPIU= golang.org/x/term v0.16.0/go.mod h1:yn7UURbUtPyrVJPGPq404EukNFxcm/foM+bV/bfcDsY= -golang.org/x/term v0.38.0 h1:PQ5pkm/rLO6HnxFR7N2lJHOZX6Kez5Y1gDSJla6jo7Q= -golang.org/x/term v0.38.0/go.mod h1:bSEAKrOT1W+VSu9TSCMtoGEOUcKxOKgl3LE5QEF/xVg= +golang.org/x/term v0.41.0 h1:QCgPso/Q3RTJx2Th4bDLqML4W6iJiaXFq2/ftQF13YU= +golang.org/x/term v0.41.0/go.mod h1:3pfBgksrReYfZ5lvYM0kSO0LIkAl4Yl2bXOkKP7Ec2A= golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= @@ -603,8 +617,8 @@ golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8= golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8= golang.org/x/text v0.12.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE= golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU= -golang.org/x/text v0.32.0 h1:ZD01bjUt1FQ9WJ0ClOL5vxgxOI/sVCNgX1YtKwcY0mU= -golang.org/x/text v0.32.0/go.mod h1:o/rUWzghvpD5TXrTIBuJU77MTaN0ljMWE47kxGJQ7jY= +golang.org/x/text v0.35.0 h1:JOVx6vVDFokkpaq1AEptVzLTpDe9KGpj5tR4/X+ybL8= +golang.org/x/text v0.35.0/go.mod h1:khi/HExzZJ2pGnjenulevKNX1W67CUy0AsXcNubPGCA= golang.org/x/time v0.12.0 h1:ScB/8o8olJvc+CQPWrK3fPZNfh7qgwCrY0zJmoEQLSE= golang.org/x/time 
v0.12.0/go.mod h1:CDIdPxbZBQxdj6cxyCIdrNogrJKMJ7pr37NYpMcMDSg= golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= @@ -617,8 +631,8 @@ golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roY golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA= golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc= golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU= -golang.org/x/tools v0.39.0 h1:ik4ho21kwuQln40uelmciQPp9SipgNDdrafrYA4TmQQ= -golang.org/x/tools v0.39.0/go.mod h1:JnefbkDPyD8UU2kI5fuf8ZX4/yUeh9W877ZeBONxUqQ= +golang.org/x/tools v0.42.0 h1:uNgphsn75Tdz5Ji2q36v/nsFSfR/9BRFvqhGBaJGd5k= +golang.org/x/tools v0.42.0/go.mod h1:Ma6lCIwGZvHK6XtgbswSoWroEkhugApmsXyrUmBhfr0= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= @@ -635,18 +649,18 @@ google.golang.org/genproto v0.0.0-20200423170343-7949de9c1215/go.mod h1:55QSHmfG google.golang.org/genproto v0.0.0-20200513103714-09dca8ec2884/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c= google.golang.org/genproto v0.0.0-20230803162519-f966b187b2e5 h1:L6iMMGrtzgHsWofoFcihmDEMYeDR9KN/ThbPWGrh++g= google.golang.org/genproto v0.0.0-20230803162519-f966b187b2e5/go.mod h1:oH/ZOT02u4kWEp7oYBGYFFkCdKS/uYR9Z7+0/xuuFp8= -google.golang.org/genproto/googleapis/api v0.0.0-20251222181119-0a764e51fe1b h1:uA40e2M6fYRBf0+8uN5mLlqUtV192iiksiICIBkYJ1E= -google.golang.org/genproto/googleapis/api v0.0.0-20251222181119-0a764e51fe1b/go.mod h1:Xa7le7qx2vmqB/SzWUBa7KdMjpdpAHlh5QCSnjessQk= -google.golang.org/genproto/googleapis/rpc v0.0.0-20251222181119-0a764e51fe1b 
h1:Mv8VFug0MP9e5vUxfBcE3vUkV6CImK3cMNMIDFjmzxU= -google.golang.org/genproto/googleapis/rpc v0.0.0-20251222181119-0a764e51fe1b/go.mod h1:j9x/tPzZkyxcgEFkiKEEGxfvyumM01BEtsW8xzOahRQ= +google.golang.org/genproto/googleapis/api v0.0.0-20260319201613-d00831a3d3e7 h1:41r6JMbpzBMen0R/4TZeeAmGXSJC7DftGINUodzTkPI= +google.golang.org/genproto/googleapis/api v0.0.0-20260319201613-d00831a3d3e7/go.mod h1:EIQZ5bFCfRQDV4MhRle7+OgjNtZ6P1PiZBgAKuxXu/Y= +google.golang.org/genproto/googleapis/rpc v0.0.0-20260319201613-d00831a3d3e7 h1:ndE4FoJqsIceKP2oYSnUZqhTdYufCYYkqwtFzfrhI7w= +google.golang.org/genproto/googleapis/rpc v0.0.0-20260319201613-d00831a3d3e7/go.mod h1:4Hqkh8ycfw05ld/3BWL7rJOSfebL2Q+DVDeRgYgxUU8= google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c= google.golang.org/grpc v1.23.0/go.mod h1:Y5yQAOtifL1yxbo5wqy6BxZv8vAUGQwXBOALyacEbxg= google.golang.org/grpc v1.25.1/go.mod h1:c3i+UQWmh7LiEpx4sFZnkU36qjEYZ0imhYfXVyQciAY= google.golang.org/grpc v1.27.0/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk= google.golang.org/grpc v1.29.1/go.mod h1:itym6AZVZYACWQqET3MqgPpjcuV5QH3BxFS3IjizoKk= google.golang.org/grpc v1.33.1/go.mod h1:fr5YgcSWrqhRRxogOsw7RzIpsmvOZ6IcH4kBYTpR3n0= -google.golang.org/grpc v1.78.0 h1:K1XZG/yGDJnzMdd/uZHAkVqJE+xIDOcmdSFZkBUicNc= -google.golang.org/grpc v1.78.0/go.mod h1:I47qjTo4OKbMkjA/aOOwxDIiPSBofUtQUI5EfpWvW7U= +google.golang.org/grpc v1.79.3 h1:sybAEdRIEtvcD68Gx7dmnwjZKlyfuc61Dyo9pGXXkKE= +google.golang.org/grpc v1.79.3/go.mod h1:KmT0Kjez+0dde/v2j9vzwoAScgEPx/Bw1CYChhHLrHQ= google.golang.org/protobuf v1.36.11 h1:fV6ZwhNocDyBLK0dj+fg8ektcVegBBuEolpbTQyBNVE= google.golang.org/protobuf v1.36.11/go.mod h1:HTf+CrKn2C3g5S8VImy6tdcUvCska2kB7j23XfzDpco= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= diff --git a/jmdn_default.yaml b/jmdn_default.yaml index 065a7f76..ddc82a58 100644 --- a/jmdn_default.yaml +++ b/jmdn_default.yaml @@ -41,6 +41,9 @@ binds: database: 
username: "" password: "" + redis: + url: "127.0.0.1:6379" + password: "" # ── Logging (Ion) ──────────────────────────────────────── # Maps directly to Ion's config struct. All env vars like @@ -69,6 +72,7 @@ logging: endpoint: "" # e.g. "collector.example.com:4317" protocol: "grpc" insecure: false + headers: {} # e.g. {"Authorization": "Bearer "} username: "" # Prefer env: JMDN_LOGGING_OTEL_USERNAME password: "" # Prefer env: JMDN_LOGGING_OTEL_PASSWORD batch_size: 512 @@ -83,14 +87,15 @@ features: use_legacy_bft: false grotrack: false # Requires ports.metrics > 0 -# ── Alerts ───────────────────────────────────────────── -# External alerting service (Telegram via tg.jmdt.io). -# Prefer env vars for secrets: JMDN_ALERTS_API_KEY, JMDN_ALERTS_CHAT_ID -alerts: - url: "" # e.g. "https://tg.jmdt.io/multi-channel" - api_key: "" # Prefer env: JMDN_ALERTS_API_KEY - chat_id: "" # Prefer env: JMDN_ALERTS_CHAT_ID - http_timeout: 10s +# ── FastSync V2 ───────────────────────────────────────── +fastsync: + enabled: true # Register protocol handlers and serve data to peers + sync: true # Allow this node to pull data and update its local DB + # Set false for sequencers/authoritative nodes that must + # never overwrite their own state (they still serve data) + startup_sync: true # Catch up on missed blocks automatically on node restart + sync_timeout: 10m # Max wall-clock time for a single full sync operation + allowed_peers: [] # Whitelist of peer IDs to sync FROM (empty = any peer) # ── Security ──────────────────────────────────────────── # Enterprise Security Module (Gatekeeper) @@ -192,3 +197,12 @@ security: auth_type: "mtls" rate_limit: 0 # NEVER rate-limit BFT consensus burst: 0 + +# ── Alerts ───────────────────────────────────────────── +# External alerting service (Telegram via tg.jmdt.io). +# Prefer env vars for secrets: JMDN_ALERTS_API_KEY, JMDN_ALERTS_CHAT_ID +alerts: + url: "" # e.g. 
"https://tg.jmdt.io/multi-channel" + api_key: "" # Prefer env: JMDN_ALERTS_API_KEY + chat_id: "" # Prefer env: JMDN_ALERTS_CHAT_ID + http_timeout: 10s diff --git a/logging/otelsetup/setup.go b/logging/otelsetup/setup.go index 4509ada8..31a5db3d 100644 --- a/logging/otelsetup/setup.go +++ b/logging/otelsetup/setup.go @@ -66,16 +66,15 @@ func Setup(logDir string, logFileName string) (*ion.Ion, []ion.Warning, error) { // OTEL export if logCfg.OTEL.Enabled && logCfg.OTEL.Endpoint != "" { - cfg.OTEL = ion.OTELConfig{ - Enabled: true, - Endpoint: logCfg.OTEL.Endpoint, - Protocol: logCfg.OTEL.Protocol, - Insecure: logCfg.OTEL.Insecure, - Username: logCfg.OTEL.Username, - Password: logCfg.OTEL.Password, - BatchSize: logCfg.OTEL.BatchSize, - ExportInterval: logCfg.OTEL.ExportInterval, - } + cfg.OTEL.Enabled = true + cfg.OTEL.Endpoint = logCfg.OTEL.Endpoint + cfg.OTEL.Protocol = logCfg.OTEL.Protocol + cfg.OTEL.Insecure = logCfg.OTEL.Insecure + cfg.OTEL.Headers = logCfg.OTEL.Headers + cfg.OTEL.Username = logCfg.OTEL.Username + cfg.OTEL.Password = logCfg.OTEL.Password + cfg.OTEL.BatchSize = logCfg.OTEL.BatchSize + cfg.OTEL.ExportInterval = logCfg.OTEL.ExportInterval // Tracing (inherits OTEL endpoint) cfg.Tracing = ion.TracingConfig{ diff --git a/main.go b/main.go index c03a0413..a2008d14 100644 --- a/main.go +++ b/main.go @@ -27,6 +27,7 @@ import ( "gossipnode/CA/ImmuDB_CA" cli "gossipnode/CLI" "gossipnode/DB_OPs" + NodeInfo "gossipnode/DB_OPs/Nodeinfo" "gossipnode/DID" "gossipnode/Pubsub" "gossipnode/Security" @@ -51,6 +52,7 @@ import ( "github.com/libp2p/go-libp2p/core/host" "github.com/libp2p/go-libp2p/core/network" _ "github.com/mattn/go-sqlite3" + "github.com/redis/go-redis/v9" "github.com/rs/zerolog/log" ) @@ -258,9 +260,8 @@ func runCommand(command string, args []string, grpcPort int) { fmt.Println(" broadcast - Broadcast message") fmt.Println(" getdid - Get DID document") fmt.Println(" propagatedid [balance] - Propagate DID to network") - fmt.Println(" fastsync - Fast 
sync with peer") - fmt.Println(" fastsyncv2 - Fast sync with peer using JMDN-FastSync V2 engine") - fmt.Println(" firstsync - First sync: get all data from peer (server) or receive all data (client)") + fmt.Println(" fastsync - Fast sync with peer (V2 Engine)") + fmt.Println(" accountsync - Sync missing accounts only (skip block sync)") fmt.Println("\nUsage: ./jmdn -cmd [args...]") fmt.Println("\nNote: Some interactive commands (mempoolStats, seednodeStats, etc.)") fmt.Println("are only available in interactive mode.") @@ -417,65 +418,54 @@ func runCommand(command string, args []string, grpcPort int) { os.Exit(1) } - case "fastsync": + case "fastsync", "fastsyncv2", "firstsync": if len(args) < 1 { fmt.Println("Usage: jmdn -cmd fastsync ") os.Exit(1) } - fmt.Println("Starting fast sync...") - stats, err := client.FastSync(args[0]) + fmt.Println("Starting FastSync (V2 Engine)...") + stats, err := client.FastSyncV2(args[0]) if err != nil { fmt.Printf("Error: %v\n", err) os.Exit(1) } - // Defensive guards against nil responses to prevent panics if stats == nil { - fmt.Println("FastSync returned no stats (nil). The target peer may be unreachable or rejected the request.") + fmt.Println("FastSync returned no stats. 
The target peer may be unreachable.") os.Exit(1) } - fmt.Printf("Sync completed in %dms\n", stats.TimeTaken) + if stats.Error != "" { + fmt.Printf("FastSync failed: %s\n", stats.Error) + os.Exit(1) + } + fmt.Printf("Sync completed in %ds\n", stats.TimeTaken) if stats.MainState == nil { - fmt.Println(" Main DB TxID: unavailable (no state returned)") + fmt.Println(" Main DB TxID: unavailable") } else { fmt.Printf(" Main DB TxID: %d\n", stats.MainState.TxId) } if stats.AccountsState == nil { - fmt.Println(" Accounts DB TxID: unavailable (no state returned)") + fmt.Println(" Accounts DB TxID: unavailable") } else { fmt.Printf(" Accounts DB TxID: %d\n", stats.AccountsState.TxId) } - case "firstsync": - if len(args) < 2 { - fmt.Println("Usage: jmdn -cmd firstsync ") - os.Exit(1) - } - mode := args[1] - if mode != "server" && mode != "client" { - fmt.Println("Error: mode must be 'server' or 'client'") - fmt.Println("Usage: jmdn -cmd firstsync ") + case "accountsync": + if len(args) < 1 { + fmt.Println("Usage: jmdn -cmd accountsync ") os.Exit(1) } - fmt.Printf("Starting first sync in %s mode...\n", mode) - stats, err := client.FirstSync(args[0], mode) + fmt.Println("Starting AccountSync (accounts only, no block sync)...") + stats, err := client.AccountSync(args[0]) if err != nil { fmt.Printf("Error: %v\n", err) os.Exit(1) } - // Defensive guards against nil responses to prevent panics - if stats == nil { - fmt.Println("FirstSync returned no stats (nil). 
The target peer may be unreachable or rejected the request.") + if stats.Error != "" { + fmt.Printf("AccountSync failed: %s\n", stats.Error) os.Exit(1) } - fmt.Printf("Sync completed in %dms\n", stats.TimeTaken) - if stats.MainState == nil { - fmt.Println(" Main DB TxID: unavailable (no state returned)") - } else { - fmt.Printf(" Main DB TxID: %d\n", stats.MainState.TxId) - } - if stats.AccountsState == nil { - fmt.Println(" Accounts DB TxID: unavailable (no state returned)") - } else { + fmt.Printf("AccountSync completed in %ds\n", stats.TimeTaken) + if stats.AccountsState != nil { fmt.Printf(" Accounts DB TxID: %d\n", stats.AccountsState.TxId) } @@ -519,9 +509,8 @@ func runCommand(command string, args []string, grpcPort int) { fmt.Println(" sendfile - Send file") fmt.Println(" broadcast - Broadcast message") fmt.Println(" getdid - Get DID document") - fmt.Println(" fastsync - Fast sync with peer") - fmt.Println(" fastsyncv2 - Fast sync with peer using V2 Engine") - fmt.Println(" firstsync - First sync: get all data from peer (server) or receive all data (client)") + fmt.Println(" fastsync - Fast sync with peer (V2 Engine)") + fmt.Println(" accountsync - Sync missing accounts only (skip block sync)") os.Exit(1) } } @@ -631,8 +620,8 @@ func initFastSync(n *config.Node, mainClient *config.PooledConnection, accountsC } // initFastsyncV2 initializes the FastSync V2 service -func initFastsyncV2(n *config.Node) *FastsyncV2.FastsyncV2 { - fs, err := FastsyncV2.NewFastsyncV2(n.Host) +func initFastsyncV2(n *config.Node, syncTimeout time.Duration) *FastsyncV2.FastsyncV2 { + fs, err := FastsyncV2.NewFastsyncV2(n.Host, syncTimeout) if err != nil { log.Error().Err(err).Msg("Failed to start FastsyncV2 engine") return nil @@ -875,6 +864,24 @@ func main() { log.Fatal().Err(err).Msg("Failed to initialize accounts database pool") } + // ── Account Sync Worker (Redis Stream) ─────────────────────────────────── + // WriteAccounts and BatchUpdateAccounts enqueue to a Redis Stream and 
return + // immediately, decoupling callers from the ~15 s ImmuDB commit latency. + // The worker drains the stream and writes batches to ImmuDB asynchronously. + // Required before FastsyncV2 starts — it calls WriteAccounts during sync. + if cfg.Database.Redis.URL == "" { + log.Warn().Msg("[AccountSyncWorker] database.redis.url not configured — WriteAccounts will fail; set url in jmdn.yaml or JMDN_DATABASE_REDIS_URL") + } else { + redisClient := redis.NewClient(&redis.Options{ + Addr: cfg.Database.Redis.URL, + Password: cfg.Database.Redis.Password, + }) + accountStreamer := NodeInfo.NewRedisStreamer(redisClient) + NodeInfo.StartAccountSyncWorker(accountStreamer, NodeInfo.DefaultWorkerConfig()) + log.Info().Str("redis_url", cfg.Database.Redis.URL).Msg("[accountqueue] installed — WriteAccounts is now async, worker starts lazily") + fmt.Println("✅ Account sync worker started (Redis Stream → ImmuDB async)") + } + // Discover Yggdrasil address BEFORE creating the node fmt.Println("Discovering Yggdrasil address...") ipv6, err := helper.GetTun0GlobalIPv6() @@ -954,7 +961,66 @@ func main() { // Initialize FastSync service fastSyncer = initFastSync(n, mainDBClient, didDBClient) - fastSyncerV2 = initFastsyncV2(n) + if cfg.FastSync.Enabled { + fastSyncerV2 = initFastsyncV2(n, cfg.FastSync.SyncTimeout) + } else { + log.Info().Msg("[FastSync] disabled by config — protocol handlers not registered") + } + + // Startup sync: catch up on blocks missed while offline. 
+ if fastSyncerV2 != nil && cfg.FastSync.EnablePulling && cfg.FastSync.PullOnStartup { + if err := goMaybeTracked(MainLM, GRO.MainAM, GRO.MainLM, GRO.StartupSyncThread, func(ctx context.Context) error { + // Wait for peer connections to establish after node startup + time.Sleep(5 * time.Second) + + peers := n.Host.Network().Peers() + if len(peers) == 0 { + // TODO: Query seed node for available sync peers when no direct peers are connected + log.Info().Msg("[StartupSync] No peers connected, skipping startup sync") + return nil + } + + log.Info().Int("peers", len(peers)).Msg("[StartupSync] Attempting startup sync with connected peers") + + for _, peerID := range peers { + // Honour allowed_peers whitelist if configured + if len(cfg.FastSync.AllowedPeers) > 0 { + allowed := false + for _, ap := range cfg.FastSync.AllowedPeers { + if ap == peerID.String() { + allowed = true + break + } + } + if !allowed { + log.Info().Str("peer", peerID.String()).Msg("[StartupSync] Skipping peer not in allowed_peers") + continue + } + } + + addrs := n.Host.Peerstore().Addrs(peerID) + if len(addrs) == 0 { + continue + } + + log.Info().Str("peer", peerID.String()).Msg("[StartupSync] Trying peer") + if err := fastSyncerV2.HandleStartupSync(peerID, addrs); err != nil { + log.Warn().Err(err).Str("peer", peerID.String()).Msg("[StartupSync] Failed, trying next peer") + continue + } + + log.Info().Str("peer", peerID.String()).Msg("[StartupSync] Sync completed successfully") + return nil + } + + log.Warn().Msg("[StartupSync] Failed to sync with any connected peer") + return nil + }); err != nil { + log.Error().Err(err).Str("thread", GRO.StartupSyncThread).Msg("Failed to start startup sync goroutine") + } + } else if fastSyncerV2 != nil && !cfg.FastSync.EnablePulling { + log.Info().Msg("[FastSync] Node configured with enable_pulling=false (serve-only participant); skipping StartupSync") + } // Initialize Yggdrasil messaging if enabled if cfg.Network.Yggdrasil { @@ -1124,6 +1190,7 @@ func main() 
{ ChainID: cfg.Network.ChainID, FacadePort: cfg.Ports.Facade, WSPort: cfg.Ports.WS, + PullAllowed: cfg.FastSync.EnablePulling, } // Only set database clients if they're properly initialized diff --git a/messaging/broadcast.go b/messaging/broadcast.go index 2c082eb6..2dd03325 100644 --- a/messaging/broadcast.go +++ b/messaging/broadcast.go @@ -575,6 +575,116 @@ func BroadcastVoteTrigger(h host.Host, consensusMessage *PubSubMessages.Consensu return nil } +// BroadcastVoteTriggerToCommittee sends a vote trigger message only to the specified committee peers +// instead of broadcasting to all connected peers. This prevents non-committee nodes from receiving +// the trigger and submitting votes that go nowhere. +func BroadcastVoteTriggerToCommittee(h host.Host, consensusMessage *PubSubMessages.ConsensusMessage, committeePeers []peer.ID) error { + if BroadcastLocalGRO == nil { + var err error + BroadcastLocalGRO, err = GROHelper.InitializeGRO(GRO.BroadcastLocal) + if err != nil { + log.Error().Err(err).Msg("Failed to initialize BroadcastLocalGRO") + return err + } + } + + if consensusMessage == nil { + return fmt.Errorf("consensus message cannot be nil") + } + + if consensusMessage.GetZKBlock().BlockHash.String() == "" { + return fmt.Errorf("consensus message ZKBlock block hash is empty") + } + + if len(committeePeers) == 0 { + return fmt.Errorf("no committee peers to broadcast vote trigger to") + } + + // Set the voting timer when broadcast starts + now := time.Now().UTC() + consensusMessage.SetStartTime(now) + consensusMessage.SetEndTimeout(now.Add(config.ConsensusTimeout)) + + // Marshal the consensus message to JSON + consensusData, err := json.Marshal(consensusMessage) + if err != nil { + return fmt.Errorf("failed to marshal consensus message: %w", err) + } + + // Create a vote trigger broadcast message + msg := BroadcastMessageStruct{ + Sender: h.ID().String(), + Content: "Vote trigger broadcast - initiate voting process", + Timestamp: now.Unix(), + Hops: 0, + Type: 
"vote_trigger", + Data: string(consensusData), + } + + msg.ID = generateMessageID(msg.Sender, msg.Content, now.Unix()) + markMessageSeen(msg.ID) + + msgBytes, err := json.Marshal(msg) + if err != nil { + return fmt.Errorf("failed to marshal vote trigger broadcast message: %w", err) + } + msgBytes = append(msgBytes, '\n') + + log.Info(). + Str("msg_id", msg.ID). + Int("committee_peers", len(committeePeers)). + Msg("Starting targeted vote trigger broadcast to committee peers") + + wg, err := BroadcastLocalGRO.NewFunctionWaitGroup(context.Background(), GRO.BroadcastVoteTriggerWG) + if err != nil { + log.Error().Err(err).Msg("Failed to create waitgroup for committee broadcast vote trigger") + return fmt.Errorf("failed to create waitgroup for committee broadcast vote trigger: %w", err) + } + var successCount int + var successMutex sync.Mutex + + for _, peerID := range committeePeers { + BroadcastLocalGRO.Go(GRO.BroadcastVoteTriggerThread, func(ctx context.Context) error { + peer := peerID + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + stream, err := h.NewStream(ctx, peer, config.BroadcastProtocol) + if err != nil { + log.Error().Err(err).Str("peer", peer.String()).Msg("Failed to open broadcast stream for committee vote trigger") + return err + } + defer stream.Close() + + _, err = stream.Write(msgBytes) + if err != nil { + log.Error().Err(err).Str("peer", peer.String()).Msg("Failed to send committee vote trigger message") + return err + } + + successMutex.Lock() + successCount++ + successMutex.Unlock() + + metrics.MessagesSentCounter.WithLabelValues("broadcast", peer.String()).Inc() + return nil + }, local.AddToWaitGroup(GRO.BroadcastVoteTriggerWG)) + } + + wg.Wait() + + if successCount == 0 { + return fmt.Errorf("failed to broadcast vote trigger to any committee peers") + } + + log.Info(). + Str("msg_id", msg.ID). + Int("success", successCount). + Int("total", len(committeePeers)). 
+ Msg("Committee vote trigger broadcast complete") + return nil +} + // BroadcastBlockToEveryNodeWithExtraData sends a block to all connected peers and attaches extra metadata. // The extra map will be merged into msg.Data. Keys in extra override existing keys. func BroadcastBlockToEveryNodeWithExtraData(h host.Host, block *config.ZKBlock, result bool, extra map[string]string, bls []BLS_Signer.BLSresponse) error {