Context
Raised by Carmack in PR #1735 (round-1 review) as out-of-scope for that PR's chunked tx_last_seen backfill fix. Filing separately to track.
Proposal
Add a composite index (transmission_id, timestamp) on the observations table so that the per-transmission MAX(timestamp) lookup that powers the tx_last_seen backfill (and the inline stmtBumpTxLastSeen) becomes a single index probe, instead of a walk over every observation row matched through the transmission_id-only index.
Why
- The chunked backfill in cmd/ingestor/tx_last_seen_backfill.go issues SELECT MAX(timestamp) FROM observations WHERE transmission_id = ? per row in each UPDATE batch. With only idx_observations_transmission_id, SQLite has to visit every observation for the tx to find the MAX.
- The inline write-path equivalent, stmtBumpTxLastSeen, runs the same query shape on every observation insert.
- A composite (transmission_id, timestamp) index lets SQLite answer the MAX from the index alone, equivalent to a reverse scan with LIMIT 1.
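As a rough illustration of the proposal, the sketch below shows the index DDL next to the query shape it is meant to serve. The index name idx_observations_tx_timestamp and the helper EnsureTxTimestampIndex are hypothetical (the issue names neither), and the helper is a standalone example that is not wired into RunAsyncMigration, whose signature is not shown here.

```go
package main

import (
	"context"
	"database/sql"
)

// Hypothetical index name; the issue only specifies the column pair.
const createTxTimestampIndex = "CREATE INDEX IF NOT EXISTS idx_observations_tx_timestamp ON observations (transmission_id, timestamp)"

// Query shape quoted in the issue for the backfill and stmtBumpTxLastSeen.
const maxTimestampQuery = "SELECT MAX(timestamp) FROM observations WHERE transmission_id = ?"

// Running this statement before and after the index exists is one way to
// check whether SQLite actually picks the composite index for the lookup.
const explainMaxTimestamp = "EXPLAIN QUERY PLAN " + maxTimestampQuery

// EnsureTxTimestampIndex creates the composite index if it is missing.
// Assumption: db is an open handle to the SQLite database that holds the
// observations table; driver registration is left to the caller.
func EnsureTxTimestampIndex(ctx context.Context, db *sql.DB) error {
	_, err := db.ExecContext(ctx, createTxTimestampIndex)
	return err
}
```

With the index in place, the equality on transmission_id pins the leading column, so the largest timestamp for that transmission should sit at one end of the matching index range. Comparing the EXPLAIN QUERY PLAN output before and after would confirm this on a real database.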
Risk
- Disk: one more index on a hot table. Operator-scale DB has ~1.5M observations; index entry overhead ~32 bytes → ~50MB.
- Build cost: must run via RunAsyncMigration. Estimated 30–120s on prod.
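The disk figure above is plain multiplication, sketched here so it can be re-run with other row counts. The function names are hypothetical, and the 32 bytes per entry is the issue's own rough estimate, not a measured value.

```go
package main

import "fmt"

// EstimateIndexBytes returns the approximate index size as
// row count times per-entry overhead. Both inputs are estimates.
func EstimateIndexBytes(rows, bytesPerEntry int64) int64 {
	return rows * bytesPerEntry
}

// FormatMB renders a byte count in decimal megabytes (1 MB = 1e6 bytes).
func FormatMB(b int64) string {
	return fmt.Sprintf("%.0f MB", float64(b)/1e6)
}
```

For the numbers in this issue, EstimateIndexBytes(1_500_000, 32) gives 48,000,000 bytes, which FormatMB renders as "48 MB", in line with the ~50MB quoted above.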