Skip to content

feat: composite (transmission_id, timestamp) index on observations for tx_last_seen MAX lookup #1738

@Kpa-clawbot

Description

@Kpa-clawbot

Context

Raised by Carmack in PR #1735 (round-1 review) as out-of-scope for that PR's chunked tx_last_seen backfill fix. Filing separately to track.

Proposal

Add a composite index (transmission_id, timestamp) on the observations table to make the per-transmission MAX(timestamp) lookup that powers the tx_last_seen backfill (and inline stmtBumpTxLastSeen) a single index probe instead of a correlated scan on transmission_id-only.

Why

  • The chunked backfill in cmd/ingestor/tx_last_seen_backfill.go issues SELECT MAX(timestamp) FROM observations WHERE transmission_id = ? per row in each UPDATE batch. With only idx_observations_transmission_id, SQLite scans every observation for the tx to find MAX.
  • Inline write-path equivalent stmtBumpTxLastSeen runs the same shape on every observation insert.
  • A composite (transmission_id, timestamp) index gives an index-only MAX via reverse scan with LIMIT 1.

Risk

  • Disk: one more index on a hot table. Operator-scale DB has ~1.5M observations; index entry overhead ~32 bytes → ~50MB.
  • Build cost: must run via RunAsyncMigration. Estimated 30–120s on prod.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions