Status
Discovered while securing cluster WAL replication for GHSA-wfgr-8x84-22q7 (X1 audit). The internal/cluster/sharding package compiles and has tests, but none of its constructors are called from production code:
- NewShardReplicationManager: only called from internal/cluster/sharding/shard_replication_test.go
- NewShardReceiverManager: only called from internal/cluster/sharding/shard_replication_test.go
- NewShardRouter: only called from internal/cluster/sharding/router_test.go (route helpers RouteShardedWrite / RouteShardedQuery exist in internal/api/routing.go but are not invoked from any non-test caller either)
The coordinator dispatch in internal/cluster/coordinator.go has no server-side listener for the sharding MsgReplicateSync flow. Both shard_replication.go (primary side) and shard_receiver.go (replica side) define connect() functions that dial out, so neither side accepts the inbound handshake. The flow is incomplete: two dialers and no listener.
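The "no production callers" claim above can be re-checked mechanically instead of by hand. The following is a minimal sketch, not the audit's actual tooling: nonTestCallers is a hypothetical helper, the in-memory files are made-up fixtures, and matching is by function name only (no type or package resolution), so a same-named function in another package would be a false positive. A real check would read the repository's .go files from disk.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
	"strings"
)

// nonTestCallers parses the given files (file name -> source) and returns,
// for each requested function name, the non-test files that call it.
// Matching is purely syntactic: it compares the called identifier's name.
func nonTestCallers(files map[string]string, names []string) (map[string][]string, error) {
	want := make(map[string]bool, len(names))
	for _, n := range names {
		want[n] = true
	}
	out := make(map[string][]string)
	fset := token.NewFileSet()
	for fname, src := range files {
		if strings.HasSuffix(fname, "_test.go") {
			continue // test-only callers do not count as production use
		}
		f, err := parser.ParseFile(fset, fname, src, 0)
		if err != nil {
			return nil, err
		}
		seen := map[string]bool{} // report each file once per name
		ast.Inspect(f, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			var name string
			switch fn := call.Fun.(type) {
			case *ast.Ident: // NewShardRouter(...)
				name = fn.Name
			case *ast.SelectorExpr: // sharding.NewShardRouter(...)
				name = fn.Sel.Name
			}
			if want[name] && !seen[name] {
				seen[name] = true
				out[name] = append(out[name], fname)
			}
			return true
		})
	}
	for _, v := range out {
		sort.Strings(v)
	}
	return out, nil
}

func main() {
	// Hypothetical fixtures standing in for real repository files.
	files := map[string]string{
		"coordinator.go":            "package cluster\nfunc start() { run() }\nfunc run() {}\n",
		"shard_replication_test.go": "package sharding\nfunc TestX() { NewShardReplicationManager() }\n",
	}
	names := []string{"NewShardReplicationManager", "NewShardReceiverManager", "NewShardRouter"}
	callers, err := nonTestCallers(files, names)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, n := range names {
		fmt.Printf("%s: %d non-test caller file(s) %v\n", n, len(callers[n]), callers[n])
	}
}
```

A check along these lines could run in CI so that any future production caller of the sharding constructors is noticed rather than rediscovered during the next audit.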
What Arc Enterprise actually uses today
The single-writer + multi-reader Raft + replication model from internal/cluster/coordinator.go + internal/cluster/replication/ (the path secured by GHSA-wfgr-8x84-22q7). Customer deployments run this model. Sharding is not a deployed feature.
Why this matters
- Confused-reviewer hazard. Security audits + Gemini reviews keep flagging internal/cluster/sharding/* as relevant attack surface. It isn't, but every reviewer has to re-derive that fact from scratch. The X1 fix had to explicitly carve it out of scope in the release notes.
- Maintenance surface without a customer. ~1.2k lines of replication code that compiles and has tests but ships dead in every binary. Every refactor in internal/cluster/replication/protocol.go or internal/cluster/security/auth.go has to either also touch the sharding mirrors or knowingly skip them.
- Half-finished design. When sharding IS picked up, the implementer should redesign the auth model from scratch — the current handshake direction is ambiguous (both ends dial). Retrofitting the X1 HMAC onto today's shape would be wasted work.
Options
- (a) Finish wiring. Add the coordinator-side dispatch for MsgReplicateSync to the shard receiver, plumb the HMAC primitives, and ship sharding as an Enterprise feature. Significant work: neither the routing nor the failover paths are wired either. Not on the 26.05.1 or 26.06.1 roadmap.
- (b) Remove the package. Delete internal/cluster/sharding/. Anyone who needs sharding later starts from a clean design. Small PR, removes maintenance liability, removes confused-reviewer noise.
- (c) Document as intentional scaffolding. Add a doc.go to the package explaining that it's scaffolding for a future feature, no production callers expected, security primitives must be redesigned at activation time. Keeps the code, mutes the reviewer noise.
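For option (c), the doc.go could look roughly like the sketch below. The wording is only a suggestion assembled from the findings in this issue, and the package name sharding is assumed from the directory path.

```go
// Package sharding is scaffolding for a future sharding feature.
//
// Status: not wired into production. No constructor in this package
// (NewShardReplicationManager, NewShardReceiverManager, NewShardRouter)
// has a non-test caller, and the coordinator has no server-side listener
// for the sharding MsgReplicateSync flow.
//
// Security: this package is outside the scope of the GHSA-wfgr-8x84-22q7
// fix. The handshake direction is unresolved (both ends dial), so the auth
// model must be redesigned at activation time rather than retrofitted.
package sharding
```

Putting the status in the package comment places it where reviewers and godoc readers see it first, which is the point of option (c).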
Recommendation
(b) remove unless we have a concrete near-term plan to ship sharding. (c) documents the status but leaves the maintenance debt; (a) is unbudgeted. Removing is reversible — anyone who picks it up later has the git history and the linked memory note as design context.
References
- X1 / CVE-2026-48106 release notes (RELEASE_NOTES_2026.06.1.md, section "Cluster replication stream now HMAC-authenticated end-to-end") explicitly carve sharding out of scope.
- Audit finding lives at GHSA-wfgr-8x84-22q7