feat: opt-in mobile client-RX coverage (crowdsourced RF reach) + /api/nodes/resolve#1728
feat: opt-in mobile client-RX coverage (crowdsourced RF reach) + /api/nodes/resolve#1728efiten wants to merge 41 commits into
Conversation
…ated) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… gate Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…default-off) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rescope-rx) The PR was missing docs/client-rx-coverage.md (the MQTT payload contract) and gave operators/users no pointer to the mobile capture app. Add the doc with a 'Companion app' section + operator enable steps, link corescope-rx from the config.example.json flag comment, and add a 'Get the companion app' link on the Coverage dashboard. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… it in static nav) The static nav link broke upstream's nav-overflow e2e (test-nav-priority-1391): it counts all .nav-link elements regardless of display, so a hidden opt-in link still failed the expected-nav-set assertion. Remove the link from index.html and inject it from roles.js after Analytics only when clientRxCoverage is enabled, nudging applyNavPriority via a resize event. Default-off nav now exactly matches upstream (deterministic CI), and the link appears when the feature is on. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Kent Beck Gate (round 1) — TDD + test qualityVerdict: APPROVED (with TDD-history caveat noted under net-new-feature exemption). TDD historyPR commit log on
No pure red commit precedes any of the three feature commits — production and tests land together per layer. Strict red-then-green: FAIL. I verified both halves of the exemption (see test-quality below). Per-commit CI history is no longer queryable on individual SHAs ( Recommendation for next time on net-new layers: split each layer into two commits — Test-quality interrogation (anti-tautology + six questions)
None of these read like "test the stub returns what the stub returns." Every test would catch a plausible bug.
Six Questions roll-up
Must-fix: 0Should-fix (non-blocking, file:line):
Out of scope
Tests demonstrate behavior coverage, anti-tautology holds, six questions satisfactorily answered. Net-new-feature exemption applies. Approved on TDD + test quality. |
Munger Review (round 1) — schema & data model
The opt-in gate is genuinely clean: The schema is additive ( Where it dies: Must-fix
Out-of-scope (pre-existing — file separate issues)
Verdict: NEEDS-WORK. The feature is well-isolated and the gate is correct. The data-model decisions, however, are written for the demo dataset, not for "this table grows by 10⁵ rows/day forever." Fix points 1–6 before merge; 7–11 are cheap and should land in the same pass. — Charlie |
DJB Review (round 1) — input parsing & opt-in gateThreat model: this PR exposes three new attack surfaces — (1) a JSON ingest path tied to a public MQTT topic, (2) a per-prefix node-resolution endpoint, and (3) anonymous GeoJSON endpoints that materialize opted-in users' positions. The opt-in gate is well-placed (single chokepoint, fails closed), but several inputs cross trust boundaries without being parsed-into-a-validated-type. The gate itself ( Must-fix
Out-of-scope (pre-existing, not introduced by this PR)
Quiet approval
Verdict: comment-only, do not merge until #1–#3 and #5 are addressed. Items #4, #6, #7, #8, #9 are must-fix for the feature to be safe to enable on a public deployment, but can be argued as follow-ups if the feature ships off by default (it does) and the docs flag the open issues loudly. |
Independent review (round 1)Cold-read adversarial review. Diff matches PR title scope (no scope creep). The Verdict: NEEDS-WORK (no hard blockers; multiple correctness, perf and discipline items). Must-fix
Out-of-scope (don't block; file follow-ups)
Re-spawn with |
…lawbot#10) The companion identity is the topic segment in meshcore/client/<pk>/packets, which the broker is expected to ACL-bind to the publisher. On a broker without ACLs an attacker could publish under an arbitrary topic (e.g. !@#$) and pollute client_receptions / client_observers with junk pubkeys. Reject any topic pubkey that is not lowercase hex (^[0-9a-f]{2,64}$, mirroring the server-side hexPrefixRe) before any write. Because the topic value is now always validated, the firstNonEmpty(rxPubkey, origin_id) payload fallback is both unreachable (Kpa-clawbot#10) and a trust hole, so it is removed; the companion identity comes only from the ACL-bound topic. Test: TestHandleClientPacketRejectsNonHexPubkey drives non-hex topic segments and asserts zero rows in both tables (fails without the guard). Existing fixtures updated to use a valid hex companion pubkey. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
handleMessage dispatched the meshcore/client/<pk>/packets topic to handleClientPacket and returned before the IsObserverBlacklisted check that guards the observer path. A blacklisted operator could therefore keep feeding coverage data through the client topic. Check the blacklist for the companion pubkey at the top of the client dispatch branch and drop (with a log) before any write. Test: TestClientRxCoverageBlacklistedDropped drives handleMessage with the feature ON and the companion pubkey blacklisted, asserting zero rows; it fails without the gate because the old order inserted before the blacklist ran. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
requireClientRxCoverage dereferenced s.cfg directly while the coverage routes are registered unconditionally, so a nil server/cfg would panic instead of returning a clean 404. Guard s == nil / s.cfg == nil and make the (*Config).ClientRxCoverageEnabled() receiver nil-safe too. Test: TestRequireClientRxCoverageNilSafe drives handleRxCoverage with a nil cfg and asserts 404 (panics without the guard), plus the nil-receiver helper. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Kpa-clawbot#15) The prefix resolver accepted 2-char hex prefixes, so the 256 one-byte prefixes enumerated every node name; and, unlike /api/nodes/search and /api/resolve-hops, it returned blacklisted / hidden-prefix (Kpa-clawbot#1181) node identities the rest of the API hides. - Require >= 4 hex chars. 1-byte (2 hex) keys are never stored (the ingestor rejects heard keys shorter than 2 bytes), so the floor matches the data model while ruling out trivial full-table enumeration. - A unique match that is blacklisted or hidden now resolves as not-found. - Apply the same identity-hiding to resolveHeardKey so coverage tooltips don't leak hidden node names either. The endpoint is kept (single-prefix -> name lookup, distinct from the q= fuzzy search and the hop-context resolver) and stays in openapi_known_gaps.json. Per-IP rate limiting is left to follow-up #1. Tests: TestResolvePrefix gains <4-hex 400 cases and an aabb-collision ambiguous case; TestResolvePrefixHidesBlacklistedAndHidden asserts blacklisted/hidden matches resolve as not-found (both fail without the fix). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lawbot#14) renderBoard built each leaderboard row by string concatenation and interpolated o.pubkey raw into data-rx="..." (and into the truncated fallback label when the observer has no name) while only o.name was escaped. A non-hex pubkey (possible on a no-ACL broker, or in rows ingested before the #2 validation) could break out of the attribute and inject markup. escapeHtml() both the data-rx value and the truncated-pubkey label. Test: test-rx-coverage-escape.js slices the real row-builder out of rx-coverage.js and renders it with a markup-bearing pubkey, asserting no raw tag survives (fails when the escaping is reverted). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…a-clawbot#8, Kpa-clawbot#20) aggregateCoverage built its GeoJSON features by ranging a Go map, so feature order was randomized — defeating ETag/caching and making "first feature" e2e checks flaky (Kpa-clawbot#8). And the per-node name was set with `if name != "" { name = ... }`, so when several heard_keys mapped to the same node the displayed name depended on row/map order (Kpa-clawbot#20). - Sort fc.Features by cell before returning. - Lock the node identity (name and display-prefix fallback) to the most specific (longest) heard_key that resolved, tie-broken lexicographically, so a full-pubkey reception outranks a short prefix independent of order. Tests: TestAggregateCoverageDeterministicFeatureOrder asserts features come out sorted by cell; TestAggregateCoverageNamePrecedenceOrderIndependent feeds the same rows in both orders with a resolver that returns different names per heard_key and asserts the name is stable (fails under the old last-writer rule). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…bot#12) A wide bbox at high zoom over the 30-day window could return unbounded GeoJSON: every hex cell, each with every node heard there. - Cap the per-cell node breakdown at coverageCellNodeCap (25); set properties.nodes_truncated when more nodes were heard than returned (Kpa-clawbot#11). The client only renders ~10, so this just bounds the wire payload. - Cap the feature collection at coverageFeatureCap (5000) cells, keeping the densest (count desc, cell asc tie-break for determinism) and setting the top-level truncated flag (Kpa-clawbot#12). Both flags are omitempty so untruncated responses are unchanged. truncated/nodes_truncated are GeoJSON foreign members that Leaflet ignores. Tests: TestAggregateCoverageCapsNodesPerCell (30 nodes -> 25 + flag) and TestAggregateCoverageCapsFeatures (5625 cells -> 5000 + flag, still cell-sorted, small query untruncated). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mobileRxStats was defined and tested but never called by a handler — dead production code. Rather than drop it, surface the node-wide totals it computes: the per-node coverage endpoint now returns mobile_receptions and mobile_clients (distinct contributing companions) as foreign members on the FeatureCollection, so the UI can show "heard by N clients" independent of the current bbox/pan. Both are omitempty, so the global /api/rx-coverage payload is unchanged. Test: TestRxCoverageEndpointGeoJSON now asserts the wired-in totals (1/1 for the single seeded reception); fails if the stats aren't attached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…bot#17) hexMercator diverges toward the poles (tan(π/4 + lat·π/360) → ∞), so a coverage submission past ~85.05° produced NaN hex rings via hexInvMercator. Clamp lat to ±hexMaxLat (85.05112878) in hexCellAt and document that coverage is defined only within that range. Test: TestHexCellAtClampsPolarLatitude drives ±89.9/±90° and asserts they bin to the clamped edge cell with a finite (non-NaN/Inf) boundary ring. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TestHandleClientPacketAdvertWritesReception used a RELAYED advert (non-empty path), so it exercised the rxlog last-hop branch — not the 0-hop src='advert' path its name implied. Rename it to ...RelayedAdvert... and add TestHandleClientPacketZeroHopAdvertWritesReception, which rebuilds the same advert with zero hops (header + "00" + payload) and asserts the advertiser is stored by its full pubkey with src='advert', plus gps/snr capture. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…a-clawbot#18) The per-node coverage query filters by lat/lon bbox AND matches the heard node by full key or 2-3 byte prefix; the only indexes were single-column heard_key / rx_pubkey, so the bbox path (the sargable common filter) fell back to a full table scan. Add a composite (heard_key, heard_keylen, lat, lon) — which serves the heard_key-equality seek, carries lat/lon for the range, and supersedes the old single-column heard_key index — plus idx_client_recept_latlon so the planner can drive from a selective bbox (Kpa-clawbot#18 covers heard_keylen). CREATE INDEX IF NOT EXISTS in the base schema covers fresh and existing DBs (the table is new in this PR). Test: TestClientReceptionsCoverageQueryUsesIndex asserts EXPLAIN QUERY PLAN uses a client_recept index and no longer SCANs the table. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…, Kpa-clawbot#7) Both coverage layers bound moveend/zoomend straight to an immediate fetch, so a single drag fired a storm of /api/rx-coverage requests (Kpa-clawbot#6); and every coverage fetch swallowed errors with an empty .catch (Kpa-clawbot#7), so failures were invisible. - Wrap the pan/zoom redraw in the shared 200ms debounce (keeping a stable reference so node-reach-coverage can still off() the handler). Direct redraws (day switch, fit-to-observer) stay immediate. - Replace the empty catches with console.warn; the leaderboard failure also shows a one-line in-page message. Test: test-node-reach-coverage-debounce.js loads addLayer with controllable timers + the real debounce, fires a 6-event burst and asserts exactly one coalesced fetch after the settle (7 fetches without the debounce). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Kpa-clawbot#19) nqCovLegend's visibility was driven by an inline style="display:flex|none" that applyCoverage rewrote, so CSS (print rules, themes) couldn't override it. Use an .is-hidden class instead (matching the .nav-more-wrap.is-hidden pattern) toggled via classList, and add .nq-cov-legend.is-hidden { display:none !important } to node-reach-coverage.css. Test: test-coverage-gate.js now asserts the rendered legend carries the is-hidden class and uses no inline display style (fails if the inline style returns). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…, carmack) Adds BenchmarkCoverageQuery: seeds ~1M receptions across a metro-area bbox and times the per-node coverage query three ways. On a Ryzen 7 PRO 8845HS, modernc.org/sqlite, 1M rows (benchtime 5x): or_query_indexed 2.28 s/op (original OR/substr query, latlon index) or_query_table_scan 0.30 s/op (same query, indexes dropped) inlist_query_indexed 0.42 ms/op (sargable heard_key IN-list + composite) The OR/substr shape can't use the heard_key index, so the planner drives from the bbox and pays a random row fetch per candidate — slower than a plain scan. Rewriting the per-node match as a heard_key IN-list (next commit) lets the (heard_key, …) composite seek the few hundred matching rows: ~5400x faster than the indexed OR query and bounded by node, not table size. Benchmarks don't run in CI; this is on-demand evidence for the perf claim. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…N-list (#5) The benchmark (previous commit) showed the OR/substr match — "heard_keylen=32 AND heard_key=? OR heard_keylen IN (2,3) AND substr(?,1,keylen*2)=heard_key" — can't use the heard_key index, so the planner drove from the bbox and paid a random row fetch per candidate: 2.28 s/op at 1M rows, slower even than a full scan (0.30 s). A node's heard_key is always exactly its full pubkey or its 2-/3-byte prefix, so replace the OR/substr with heard_key IN (pubkey, pubkey[:6], pubkey[:4]) — an equivalent but sargable predicate. The (heard_key, …) composite index then seeks the few hundred matching rows: 0.42 ms/op (~5400x faster), bounded by node not table size. Applied to queryCoverageRows, queryCoverageFiltered (node filter) and mobileRxStats via the shared coverageHeardKeyCandidates helper. Test: TestClientReceptionsCoverageQueryUsesIndex now EXPLAINs the IN-list shape and asserts an index seek, not a table scan. Existing coverage/endpoint tests (which assert the same result rows) still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Kent Beck Gate (round 2) — TDD + test qualityVerdict: PASS (1 minor / non-blocking). Round-1 findings I sampled each have a dedicated test that fails on revert with an assertion (not a build error), and the commit history shows one fix-per-commit with the test bundled — clean red-bait → green pattern for a backfill of issues found in review. I did not require classic red-then-green commits per finding because the round-1 work is a review-rework batch (not greenfield), and the tests demonstrably gate the behaviour. Per-finding gate (8 sampled)
Spot-checked the JS layer fixes too:
CI on tip Minor (non-blocking) — M1
Six Questions roll-up:
Gate: PASS. Ship it. |
Munger Review (round 2) — schema & data modelRe-reviewed after the sargable rewrite (15b20f8/f46f67df/67358053), cap policy (22655fc), identity lock (c780646), and the new retention reaper (6eba937). Verdict: 1 must-fix, 2 strong recommendations, the rest is good work. The OR/substr → Cap policy (#11/#12) is correct in shape: per-cell sorted by SNR desc → top 25 with MUST-FIX1. Three places do
UNIQUE(rx_pubkey, heard_key, rx_at) does NOT help — rx_at is not the leading column. Fix: This is the lollapalooza I'm warning about: a feature that grows unbounded between reaper sweeps (PR shipped with no growth-rate ceiling other than retention), a reaper that scans the whole table, and an opt-in dashboard whose own queries also scan. Each of the three pieces looks fine in isolation; together they compound into "coverage works for a month, then ingest mysteriously stalls every midnight." Strong recommendations (not blocking)2. Reaper holds the writer lock for the entire delete in one transaction. (
3. Same shape: Out-of-scope (UI / future)The server flags
|
Kpa-clawbot
left a comment
There was a problem hiding this comment.
Independent review (round 2)
Cold review — verified the 20 round-1 must-fix items are real (read every fix in gh pr diff, cross-checked against the noted commits). All accounted for:
- #1 blacklist on client topic —
handleMessagecheckscfg.IsObserverBlacklisted(parts[2])before dispatch, plus test (TestClientRxCoverageBlacklistedDropped). ✓ - #2/#10 hex pubkey validation —
clientPubkeyReenforced; payloadorigin_idfallback removed; test covers!@#$,companionpk,xyz. ✓ - #3 mobileRxStats wired in — per-node response carries
mobile_receptions/mobile_clients; asserted. ✓ - #4 nil-safe gate —
Server/Config/ClientRxCoverageEnabledall nil-safe;TestRequireClientRxCoverageNilSafecovers it. ✓ - #5/#18 sargable IN-list + composite index —
coverageHeardKeyCandidates+ composite(heard_key, heard_keylen, lat, lon); EXPLAIN-QUERY-PLAN test + 1M-row benchmark with numbers. ✓ - #6/#7 debounced redraws + surfaced errors —
debounce(refresh,200)with stable handler,console.warnreplaces empty.catch; in-DOM test fires 6-event burst → 1 fetch. ✓ - #8/#20 deterministic features + name precedence —
sort.Sliceby cell, longest-heard-key name wins; order-independent test. ✓ - #9 true 0-hop advert — rebuilt
1100 + payload, assertssrc='advert'/keylen=32+ GPS captured. ✓ - #11/#12 response bounds —
coverageCellNodeCap=25,coverageFeatureCap=5000, omitempty flags; tests cover both caps. ✓ - #13 MeshConfigReady race — both
rx-coverage.init()andnode-reach.load()await;loadGenrace guarded; controllable-promise test. ✓ - #14 escape pubkey —
escapeHtml(o.pubkey)fordata-rx+ label fallback; sliced-source test asserts no raw<imgsurvives. ✓ - #15 resolve hardening — 4-hex floor, blacklist/hidden-name parity (also applied to
resolveHeardKey); both tests. ✓ - #16 docs — single-flag/ACL-required language, retention/indexes/clamp/bounds documented. ✓
- #17 polar clamp —
±hexMaxLat=85.05112878inhexCellAt; ±90° test checks finite ring. ✓ - #19 class-based legend hiding —
.is-hiddenclass + CSS!important; gate test grep-asserts no inlinestyle="display:. ✓ - #1727 retention reaper —
PruneOldClientReceptionsviaWriterTx(#1283 honored), startup + 24h ticker, independent of feature flag,clientRxDaysinconfig.example.json. ✓
Scope matches the PR description (opt-in mobile client-RX coverage + /api/nodes/resolve); no scope creep beyond the gofmt/struct-alignment whitespace touch-ups in cmd/server/{config,types}.go.
New must-fix (nits surfaced in round 2)
public/rx-coverage.js:9,118,134—selectedNameis assigned in three places but never read or rendered. Dead var; either drop it or surface it (e.g., a "Filtered to: " badge above the map).cmd/server/rx_coverage.go:160-176—aggregateCoveragesorts features twice when truncation applies (densest-desc to slice, then cell-asc unconditionally). One pass after the slice is enough — secondsort.Sliceis cheap but the densest-first sort can be replaced with a partial selection.cmd/server/rx_dashboard.go:33-45—heardKeyResolver's per-request cache is unbounded. A wide query touching N distinct heard_keys grows the map O(N) without an eviction cap. Bound to, say, 4096 entries or skip caching when over a threshold; the existing tests pass either way.cmd/server/config.go:320-325— serverRetentionConfigis missingClientRxDays. The ingestor reads it; the server's JSON struct silently drops the operator'sretention.clientRxDayskey on parse. Even though the server doesn't prune, the field-set mismatch is a future footgun — mirror the field (or add ajson:"-"placeholder with a comment) for symmetry.cmd/server/rx_dashboard.go:107-126,165-183—rxLeaderboardandqueryCoverageFiltereddon't filter blacklistedrx_pubkey. Ingest is blocked now (#1), but pre-blacklist legacy rows still surface a banned companion's name in the leaderboard and let an attacker target?rx=<blacklisted>. Defense-in-depth: add aNOT IN (blacklist)filter (or aWHERE NOT EXISTS …) mirroring how/api/nodes/resolvehides identities.cmd/server/rx_coverage.go:queryCoverageRows / mobileRxStats— no guard against emptypubkey. Today the mux route guarantees non-empty, butcoverageHeardKeyCandidates("")returns[]which would yieldWHERE heard_key IN ()(SQL syntax error). One-line guard (if pubkey == "" { return nil, nil }) is cheap insurance against a future caller.
Out-of-scope
- Per-IP rate limiting on
/api/nodes/resolve(acknowledged in author commit message as follow-up #1). - Optional companion-signed broker token (called out in docs as future hardening).
Verdict: request changes on the six nits above. The 20 round-1 fixes are real and well-tested.
DJB Review (round 2) — input parsing & opt-in gateVerdict: request changes — round 1 must-fixes are present and correct, but the same threat class survives on two read-side endpoints. Two new must-fixes, one out-of-scope hardening note. Round 1 fixes — verified
Must-fix #1 —
|
…d (review r2)
Round-1 hid the node *name* via the resolver, but two read endpoints still leaked
the identities the rest of the API suppresses:
- handleNodeRxCoverage returned GPS hex bins + mobile_receptions/mobile_clients
for any {pubkey}, so a blacklisted or hidden-prefix node's coverage was
mappable by anyone who knew the key. Now mirrors handleNodeReach: reject
non-hex pubkeys (400) and 404 blacklisted / isPubkeyHidden nodes before any
query.
- rxLeaderboard had no blacklist/hidden filter, so a pre-PR or post-blacklist
client_receptions row kept a banned operator on the board (with name). Now
drops IsObserverBlacklisted contributors and blanks the name when
IsBlacklisted(pubkey) || IsNameHidden(name). Result set is <=100, filtered in
Go with the same helpers the resolver uses.
Tests: TestNodeRxCoverageHidesBlacklistedAndHidden (404 for both) and
TestRxLeaderboardHidesBlacklistedAndHidden (drop + blank); both fail without the
gates. setupTestDBv2's nodes table gains foreign_advert so GetNodeByPubkey (used
by isPubkeyHidden) works against the test schema. The per-node endpoint now
requires a full 64-hex pubkey, matching handleNodeReach.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Automated polish review — 4-persona parallel fan-outReviewers: adversarial · carmack (backend) · kent-beck (tests) · tufte (frontend). CI all-green ✅ (Go build, Playwright, Docker). Verdict: not merge-ready as-is. 1 BLOCKER + 10 MAJOR across all four reviewers, predominantly clustered around privacy + N+1 SQL + frontend a11y/dark-theme. Holding for operator triage rather than auto-merging. BLOCKER (1)
MAJOR (10)Privacy / abuse (adversarial)
Backend efficiency (carmack)
Frontend (tufte)
Tests (kent-beck)
MINOR (selected — full lists available on request)
Test discipline (kent-beck)Strong on assertions — every Go Clean surfaces noted
This is an automated review; a human operator will follow up on the BLOCKER (privacy) and MAJOR triage. Bot will not auto-merge or push fixes to a community branch. |
Review round 2 — both must-fixes addressed (
|
… index (polish review) The retention reaper's DELETE … WHERE rx_at < ? and the leaderboard's WHERE rx_at >= ? both filtered on an unindexed column → full scan under the writer lock (the reaper from 6eba937 introduced the DELETE). Add idx_client_recept_rxat. Drop idx_client_recept_rxpk: it duplicates the rx_pubkey-leading index the UNIQUE(rx_pubkey, heard_key, rx_at) constraint already creates, which serves the ?rx= filter and leaderboard GROUP BY. Also drop the dead `na.count == 1` guard in aggregateCoverage (latestAt starts "", so the first row always satisfies rx_at >= latestAt) — no behavior change. Test: TestClientReceptionsRetentionUsesRxAtIndex asserts the DELETE plan seeks idx_client_recept_rxat instead of scanning. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…CKER) The polish review flagged that the per-observer view (/api/rx-coverage?rx=) is an unauthenticated, fine-grained movement trail of a single contributor, with no warning in onboarding. Per operator decision this stays an accepted tradeoff (opt-in, default OFF, fine resolution is what makes the aggregate map useful), but consent must be informed. Add a "Privacy — contributor location is public" section: it states plainly that contributions are world-readable, that the per-observer view reconstructs movements, and that a pseudonymous companion name does NOT mitigate it (locations are identifying; the pubkey links all points). Operators are told to warn users; contributors are told not to use a carried device if that matters. The enabling steps cross-link to it. Notes the further-hardening levers (lower clientRxDays, auth proxy, future coarsening / k-anonymity). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…view) heardKeyResolver issued one `LIKE ? LIMIT 2` per distinct heard_key per request on the writer-shared read connection. Replace it with heardKeyResolverFor(rows): collect the distinct heard_keys once and resolve them all in a single round-trip per 200-key chunk — a UNION ALL of per-prefix `LIMIT 2` subqueries (subquery form because a bare LIMIT on a UNION ALL term is a SQLite syntax error; prefixes are hexPrefixRe-validated so literal interpolation is injection-safe). Per-key work stays bounded at 2 rows, and the Kpa-clawbot#15/Kpa-clawbot#1181 blacklist/hidden hiding is preserved. resolveHeardKey is now a thin wrapper over the batch (single code path). Test: TestBatchResolveHeardKeys checks unique/ambiguous/unknown/hidden in one call; existing resolver + endpoint tests still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tufte) - Dark theme: the saturated --nq-cov-* palette only existed in :root and glared on dark basemaps. Add [data-theme="dark"] variants (matches the dashboard's theme-aware tokens). - Colour-blind: SNR tiers were hue-only. Add a redundant, monotonic fill-opacity ramp (strong>mid>weak>grey) in both coverage layers so orange vs red are distinguishable without relying on hue; the per-cell SNR stays in the tooltip. - Keyboard/SR: the leaderboard .rxb-row was a click-only <div>. Add role="button", tabindex=0, aria-pressed, aria-label, an Enter/Space keydown handler, and a :focus-visible outline. aria-pressed added to the day buttons. Tests: test-node-reach-coverage.js asserts the opacity ramp is strictly decreasing; test-rx-coverage-escape.js asserts the row carries role/tabindex/ aria-pressed. coverageFillOpacity is exported for the unit test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Polish review (4-persona) — triaged; most items fixedThanks — genuinely useful sweep. Most of it is now addressed; the rest is deferred with rationale. Opt-in/default-OFF gate untouched, bleed-clean, CI green. 🚫 BLOCKER — per-observer movement trailOperator decision: kept as a documented tradeoff rather than changing the map — fine resolution is what makes the aggregate map useful, and the feature is opt-in / OFF by default. You're right there's no auth; the gap the review most clearly identifies is informed consent, which was missing. Added a "Privacy — contributor location is public" section to Fixed — backend (carmack)
Fixed — frontend a11y/theming (tufte) (
|
Bot watcher — operator decision needed (no further auto-review)Polish loop hard cap (2 rounds) reached. Verified the 5 fixup commits since round-2 polish ( CI ✅ on tip. mergeable=MERGEABLE, mergeStateStatus=CLEAN. Three open items, all author-deferred to the maintainer:
Minor adversarial nits from the round-2 review remain unaddressed ( Not auto-merging: community PR + unresolved privacy BLOCKER deferred to maintainer. |
Kpa-clawbot
left a comment
There was a problem hiding this comment.
Kent Beck Gate (round 2) — TDD + test quality
Verdict: APPROVED — all 5 follow-up commits land with appropriate tests; no must-fix issues.
Per-commit TDD audit
588b4afd fix(coverage): hide blacklisted/hidden nodes on coverage + leaderboard — ✅
TestNodeRxCoverageHidesBlacklistedAndHidden(rx_coverage_endpoint_test.go) asserts 404 for blacklisted and hidden-prefix pubkeys at/api/nodes/{pk}/rx-coverage. Reverting the gate inhandleNodeRxCoverageflips the response from 404→200 → assertion fails. ✓TestRxLeaderboardHidesBlacklistedAndHidden(rx_dashboard_test.go) covers three distinct behaviors: observer-blacklisted dropped entirely, node-blacklisted name blanked but row kept, hidden-prefix name blanked but row kept. Reverting the post-loop filter loop flips all three → assertions fail. ✓- Six questions: tests assert observable behavior (HTTP code, map of pubkey→row), not internal call order. Setup is wide (4 rows × 3 hide policies) but justified — it's the matrix the fix is designed to cover.
2b41dd25 perf(coverage): index rx_at; drop redundant rx_pubkey index — ✅
- Perf-exempt per AGENTS.md, but the commit still lands a real perf gate:
TestClientReceptionsRetentionUsesRxAtIndexusesEXPLAIN QUERY PLANand assertsidx_client_recept_rxatappears. A future drop of the index, or a query rewrite that bypasses it, fails the test on the assertion (not on perf flake). This is the right shape for a perf regression test. ✓ - Drive-by behavior change in
aggregateCoverage(dropcount==1guard, rely on lexical>= "") is invariant-preserving but unasserted by a new test. Out-of-scope nit only.
e147d8df docs(coverage): warn contributor location is public — ✅ pure-docs exemption.
c11465a0 perf(coverage): batch heard_key resolution to kill N+1 — ✅
- Perf-exempt, but
TestBatchResolveHeardKeyscovers the four resolution semantics (unique, ambiguous, unknown, hidden-prefix) of the new batch function with one call. RevertingbatchResolveHeardKeysback to a per-key loop wouldn't fail this test (it asserts correctness, not query count), so the test gates semantic regressions in the new batched code path, not the N+1 itself. Parent flagged this as a "is the query count pinned?" question — answer: not pinned, accepted under perf exemption. resolveHeardKeyis now a thin wrapper over the batch path, so existing single-key resolve tests transitively exercise the batched code. Good consolidation.
d6ea742f fix(coverage): a11y + dark-theme for coverage layers — ✅
test-node-reach-coverage.jsasserts monotonic opacity ramp (strong > mid > weak > grey). RevertingcoverageFillOpacityto constant0.45fails the strict-inequality assertion. ✓test-rx-coverage-escape.jsadds three keyboard-a11y assertions on the rendered row (role="button",tabindex="0",aria-pressed="(true|false)"). Reverting therenderBoardHTML changes fails all three. ✓- Six questions: tests name the behavior (
opacity must ramp strong>mid>weak>grey,row must be focusable), not the implementation. Setup is minimal.
Six-question pass over test files in this batch
cmd/server/rx_coverage_endpoint_test.go: behavior-named, minimal setup, no tautologies.cmd/server/rx_dashboard_test.go: three new tests, each with a single distinct invariant. TheBatchResolveHeardKeyscases map is the smallest test that catches each semantic class.cmd/ingestor/client_reception_test.go:EXPLAIN QUERY PLANis exactly the right tool for the perf invariant; no flake surface.test-node-reach-coverage.js/test-rx-coverage-escape.js: strict-inequality + regex assertions are tight; no implementation-coupling.
Out-of-scope (do not block)
cmd/server/rx_dashboard.goaggregateCoverageremovedcount==1guard — behaviorally equivalent (lexical>=""always true on first row) but unasserted; consider a one-line test if revisited.- Dropping
idx_client_recept_rxpkis reasoned-about ("UNIQUE leading column"), but no test pins that the?rx=filter still hits an index. EXPLAIN-QUERY-PLAN-style test would harden it.
Kpa-clawbot
left a comment
There was a problem hiding this comment.
Munger Review (round 2)
Cold re-review of the 5 follow-up commits (588b4af, 2b41dd2, e147d8d, c11465a, d6ea742) against the round-1 findings, applied with inversion + second-order incentives.
What the round-2 fixes got right
- 588b4af correctly closes the two identity leaks:
handleNodeRxCoveragenow mirrorshandleNodeReach(400 on non-hex, 404 on blacklist/hidden) so GPS hex bins andmobile_*counts aren't fetchable at a hidden pubkey, andrxLeaderboarddropsIsObserverBlacklistedand blanks names for blacklisted/hidden identities. Test coverage matches both gates. ThesetupTestDBv2schema patch (addingforeign_advert) is the right minimal change to makeisPubkeyHiddenreal in tests. - 2b41dd2 is a clean swap:
idx_client_recept_rxatbacks the reaperDELETE WHERE rx_at < ?, andidx_client_recept_rxpkis genuinely redundant because theUNIQUE(rx_pubkey, heard_key, rx_at)constraint already creates anrx_pubkey-leading index. Dropping the deadna.count == 1guard inaggregateCoverageis correct (""≤ any RFC3339). - e147d8d says the quiet part out loud: opt-in + default-OFF + fine resolution + per-observer view = informed-consent territory. The doc is explicit that a pseudonymous companion name does not mitigate the trail, names the further-hardening levers, and links it from the enabling steps.
- c11465a kills the per-key round-trip cleanly. Resolver is built once per request from the rows that will be aggregated, so every key the resolver is asked about is in the batch (no fall-through to a slow path). Per-request scoping means no cross-request cache to poison. The
'pfx' AS pfxecho-back keys the per-prefix aggregator deterministically, andLIMIT 2per prefix preserves the unique-vs-ambiguous distinction. - d6ea742 addresses both round-1 a11y items (keyboard-reachable role="button" rows with aria-pressed + focus-visible, monotonic opacity ramp as a non-hue SNR cue) and the dark-theme palette gap.
Must-fix
-
cmd/server/rx_dashboard.go(batchResolveHeardKeys, ~lines 78–96): literal-interpolatedLIKEis one regex change away from SQL injection. The safety argument ("k ishexPrefixRe-validated") lives in a comment, not in the function. Anyone who later widenshexPrefixReinnode_resolve.go(uppercase,:, etc., for whatever reason) silently turns this into an injection sink. Either (a) parameterize: build?placeholders for both thepfxliteral and theLIKEoperand and passargs ...interface{}, or (b) re-assert inside the loop with a function-local stricter check before interpolating. "Show me the incentive": a future contributor optimizes the prefix regex and has no reason to grep this file. -
cmd/server/rx_dashboard.go(rxLeaderboard, ~lines 199–215): drop-after-LIMIT shrinks the leaderboard below what the caller asked for. The SQLLIMIT ?is applied before the Go-sideIsObserverBlacklisted/IsBlacklisted/IsNameHiddenfilter. If N of the top-100 contributors are blacklisted observers, the public leaderboard returns 100−N rows — not 100. Not a leak, but a quality regression on a public surface, and the second-order incentive is bad (a blacklisted operator can occupy a top slot for free, just invisibly). Fix options: pushNOT IN (observer_blacklist)into the SQL (cheapest), or over-fetch (LIMIT ?*2) and re-cap after filtering. -
cmd/ingestor/client_reception_test.go(TestClientReceptionsRetentionUsesRxAtIndex): only the reaper is asserted; the leaderboard plan is not. The commit message and index comment both promise the new index coversWHERE rx_at >= ?inrxLeaderboard, but there's noEXPLAIN QUERY PLANtest for the leaderboard SELECT. SQLite's planner is fond of choosing the UNIQUE-backedrx_pubkey-leading index instead, especially with theGROUP BY rx_pubkey. Add a second EXPLAIN test against the actual leaderboard query (or a representative form) so a future schema tweak doesn't quietly regress to a full scan under the writer lock. -
cmd/server/rx_dashboard.go(batchResolveHeardKeys, query error path, ~lines 102–106): silent fallback hides a real bug. Ondb.Queryerror, every key in the batch is mapped to(k, "")and the function continues. Operationally this presents as "all resolutions ambiguous" — a coverage map with no names — and there's no log line. Either log the error once (already have a logger on*Server), or surface it up so the handler can decide. Inversion: how would I notice this is broken? — answer: I wouldn't.
Out-of-scope (pre-existing or accepted)
- The privacy tradeoff itself (per-observer view = movement trail) is an accepted, documented operator decision.
clientRxDaysretention default lives outside this PR.- The companion-side signed-token follow-up is already tracked.
Verdict
NEEDS-WORK — 4 must-fix items, all small. The round-1 BLOCKERs are genuinely closed; the remaining items are hardening against the next change rather than today's behavior, which is the right time to fix them.
…og query errors (review r2) Round-2 must-fix items on the coverage dashboard: - #1 batchResolveHeardKeys: parameterize the per-prefix LIKE (bound args, no literal interpolation) so it stays injection-safe if hexPrefixRe ever widens. - #2 rxLeaderboard: the SQL LIMIT runs before the Go-side observer-blacklist drop, shrinking the public board below the requested limit. Over-fetch by the blacklist size and re-cap to limit. Test: blacklisted top contributors no longer shrink the result. - #4 batchResolveHeardKeys: log.Printf the query error instead of swallowing it (a silent fallback presents as 'every name ambiguous' with no signal). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ent (review r2 #3) EXPLAIN shows the leaderboard SELECT is served by the UNIQUE(rx_pubkey,heard_key, rx_at) covering index, NOT idx_client_recept_rxat. Add an EXPLAIN test that pins it as index-backed (guards against a regression to a bare table scan under the writer lock; the table is retention-bounded so a covering scan is fine), and correct the db.go comment that wrongly claimed rx_at backs the leaderboard window. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Round-2 must-fix items addressed (
|
Heads-up: the e2e red here is a pre-existing fixture time-bomb on
|
Replace raw COUNT(*) leaderboard ranking with a per-observer score that sums 1/(observers covering each ~150m cell), plus cells/score fields. Spam-proof (parked node = 1 cell) and kills dense-area bias. Aggregated in Go over the window; ties broken by receptions then pubkey. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… rewrite The leaderboard no longer GROUPs BY rx_pubkey in SQL (it range-scans rx_at and aggregates in Go), so the old "served by the UNIQUE index as a COVERING scan" note was inaccurate. The conclusion stands: a dedicated rx_pubkey index is still redundant. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Render the Top mobile observers board as a table with score/cells columns, every column sortable (default score desc), and a hover tooltip on the score header explaining the frontier weighting. Row click-to-filter preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
386841d to
547e562
Compare
deploy-live.sh and deploy-staging.sh contain host-specific deploy logic (on8ar.eu, mesh-internal, container names) — they belong only on the deploy host, not in version control. Remove them from the repo and gitignore them so future deploys (git reset --hard) no longer overwrite the host-local copies. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Incremental review — commits 2206564..eabbae6Scope: Adversarial
Carmack (perf/correctness)
Kent Beck (TDD gate)
Munger (incentives/footguns)
Verdict
|
…ew r2) Two production-data MAJORs in the new frontier-weighted leaderboard: - Unbounded scan: the Go-side rarity weighting dropped the SQL LIMIT, so an unauthenticated /api/rx-leaderboard streamed the whole window into maps. rxLeaderboard now takes a context.Context (QueryContext + batched ctx.Err() checks every 2048 rows), caps the scan at leaderboardScanCap (500k) ORDER BY rx_at DESC (keep most recent on truncation), and logs when the cap is hit. - Score dilution: cellObservers was populated before the blacklist filter, so a blacklisted-node operator parked in a cell silently lowered every legitimate observer's 1/N frontier weight. The per-cell denominator (cellCount) now excludes observer- and node-blacklisted pubkeys; name-hidden (non-blacklisted) contributors still count. Test: TestRxLeaderboardScoreNotDilutedByBlacklisted asserts a legit observer sharing a cell with a blacklisted one scores 1.0, not 0.5 (fails without the fix). Existing leaderboard/frontier tests updated for the new ctx parameter. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Incremental review (frontier leaderboard) — both MAJORs fixedThanks. Both production-data MAJORs are fixed in MAJOR 1 — unbounded leaderboard scan
MAJOR 2 — score dilution by blacklisted observersThe per-cell denominator is now computed from a Test (kent-beck)
NITs (not done — flagging for your call)
CI status
Not merging — maintainer's call. Re-spawn review: |
Round-3 review — both r2 MAJORs verified fixedVerified MAJOR 1 (unbounded scan) — FIXED. MAJOR 2 (score dilution by blacklisted observers) — FIXED. Pre-pass builds Test (kent-beck) — present and correctly behavior-asserting. Findings
3-axis merge gate
Not auto-merging — strict gate requires CLEAN, not UNSTABLE. Recommend merging #1747 first to fix the fixture, then this PR's mergeStateStatus should flip to CLEAN on next CI run and become safely auto-mergeable. Maintainer's call. |
|
Heads-up on CI: the Fixed separately in #1747 (a one-line
|
Implements #1727.
What this adds
Mobile client-RX coverage — an opt-in, crowdsourced RF-coverage feature. A roaming MeshCore companion radio (driven by the open-source corescope-rx PWA, GPLv3) reports which nodes it heard directly, tagged with the phone's GPS and the packet's SNR/RSSI. CoreScope ingests these into a new
client_receptionstable and renders per-node hex coverage on the Reach page, plus a standalone Coverage dashboard (#/rx-coverage) with a top-mobile-observers leaderboard.Also includes
GET /api/nodes/resolve?prefix=<hex>— a read-only node-name lookup by pubkey prefix ({name, pubkey, ambiguous}), used by the companion app for friendly names.Opt-in — default OFF (zero impact on existing deployments)
The whole feature is gated behind one config flag, disabled by default:
When disabled (the default): the ingestor writes no
client_receptions; the three coverage endpoints return a clean 404; the UI hides the Coverage nav link, the#/rx-coverageroute, and the Reach-page toggle./api/nodes/resolveis always available (not coverage-specific).How it works
path[last](last forwarder) for FLOOD routes; ≥2-byte path-hash required. Upstream hops discarded.cmd/server/hexgrid.go) computed at query time (CGO_ENABLED=0/modernc.org/sqlitefriendly); frontend uses the existing Leaflet.meshcore/client/{pubkey}/packetstopic. Payload contract indocs/client-rx-coverage.md.How to enable / try it
config.json, set"clientRxCoverage": { "enabled": true }and restart server + ingestor.meshcore/client/<pubkey>/packets; the ingestor already subscribes undermeshcore/#.#/rx-coverage.What's where
cmd/ingestor/client_reception.go(ingest),db.go(client_receptions+client_observersschema),main.go(gated dispatch),config.go(flag).cmd/server/rx_coverage.go+rx_dashboard.go(endpoints, self-guard 404 when off),hexgrid.go(pure-Go grid),node_resolve.go(resolve),routes.go/types.go/config.go(wiring + flag +/api/config/clientfield).public/rx-coverage.js(dashboard),node-reach-coverage.js+.css(overlay),node-reach.js(Reach toggle, flag-gated),roles.js(reads the flag, hides nav when off).docs/client-rx-coverage.md.Testing
cd cmd/server && go test ./...andcd cmd/ingestor && go test ./...— green, including new gate tests (coverage_gate_test.goin both: off → no rows / 404, on → works) and the rx-coverage / resolve / hexgrid suites.node test-coverage-gate.js,node test-node-reach-coverage.js(wired into CI). The Playwrighttest-node-reach-coverage-e2e.jsis wired into the e2e job and skips whenclientRxCoverageis disabled, so it's safe under the default-off config.Notes for reviewers
cmd/server/openapi_known_gaps.json(the existing OpenAPI-completeness ratchet), matching how other not-yet-spec'd routes are tracked. Happy to write full OpenAPI spec entries instead if you prefer.