Skip to content

fix(compact): wipe evolu_timestamp + clear first/lastTimestamp atomically#10

Merged
elkimek merged 2 commits into
mainfrom
fix/compact-owner-wipes-evolu-timestamp
May 6, 2026
Merged

fix(compact): wipe evolu_timestamp + clear first/lastTimestamp atomically#10
elkimek merged 2 commits into
mainfrom
fix/compact-owner-wipes-evolu-timestamp

Conversation

@elkimek
Copy link
Copy Markdown
Owner

@elkimek elkimek commented May 6, 2026

Summary

Both /admin/compact-owner and /self/compact-owner previously left the owner's evolu_timestamp table populated and evolu_usage.firstTimestamp/lastTimestamp pointing at deleted messages. The merkle/fingerprint mismatch this creates strands every subsequent client push — Evolu's negentropy reconciliation reports "I already have those timestamps" based on the stale fingerprint table, and peers' fresh per-row deltas get silently rejected.

This patch performs all three cleanups (DELETE FROM evolu_message, DELETE FROM evolu_timestamp, UPDATE evolu_usage SET storedBytes=0, firstTimestamp=NULL, lastTimestamp=NULL) inside the existing transaction in both endpoints, so the owner's storage is genuinely empty after compact (matches the contract).

Production reproduction (2026-05-06)

  • 6.5K-message owner triggered HMAC-authed /self/compact-owner
  • Response: { deletedMessages: 6481, beforeStoredBytes: 52,348,941, afterStoredBytes: 0 }
  • Post-compact relay state:
    evolu_message:    3      (3 stragglers from in-flight pushes)
    evolu_timestamp:  6786   ← stranded fingerprints
    evolu_usage:      270 KB ← matches the 3 stragglers (correct)
    
  • Every subsequent client forceResendCurrentProfile reported Push committed locally with 444 delta ops planned + applied, but only the legacy profileData blob (1 message) actually landed on the relay. The 443 per-row deltas evaporated.
  • Manual DELETE FROM evolu_timestamp WHERE ownerId = ? immediately unstuck the owner; on the next push all 444 ops materialized as 2720 small + 101 large messages.

What changed

  • src/lib/admin-server.ts and src/lib/self-server.ts: extend the existing db.transaction(() => {...}) to also DELETE from evolu_timestamp and clear first/lastTimestamp on evolu_usage.
  • test/self-server.integration.test.mjs:
    • fixture seeds 5 evolu_timestamp rows + non-null first/lastTimestamp alongside the 5 evolu_message rows
    • existing compact test now asserts all three cleanups happened (regression guard)

All 21 self-server tests pass (12 unit + 9 integration).

Test plan

  • npm test green locally (21/21)
  • Greptile review
  • Smoke test on staging or sync.getbased.health: after deploy, run a /self/compact-owner + verify evolu_timestamp is empty for that owner

🤖 Generated with Claude Code

…mp atomically

`/admin/compact-owner` and `/self/compact-owner` previously deleted
`evolu_message` rows + zeroed `evolu_usage.storedBytes` but left two
pieces of state behind:

1. The whole `evolu_timestamp` table for that owner. This is the
   merkle/fingerprint structure Evolu's negentropy reconciliation
   uses to decide what messages to exchange. With message rows gone
   but timestamp rows still present, the relay reports fingerprints
   for timestamps whose underlying payloads no longer exist — peers'
   subsequent per-row pushes get rejected as "you already have it"
   and silently disappear instead of refilling the log.

2. `evolu_usage.firstTimestamp` and `lastTimestamp`, which still
   pointed at deleted message rows. `getOwnerUsage` falls back to
   these on the next write, leaving stale bookkeeping.

Verified in production 2026-05-06 on a 6.5K-message owner: an HMAC-
authed `/self/compact-owner` returned `deletedMessages=6481,
afterStoredBytes=0` (correct), but `evolu_timestamp` retained 6786
rows. Every subsequent client `forceResendCurrentProfile` reported
`Push committed` locally with 444 ops planned, yet only the legacy
profileData blob (1 message) ever landed on the relay. The 443
per-row deltas got stranded by the stale fingerprints. Manual
`DELETE FROM evolu_timestamp WHERE ownerId = ?` immediately
unstuck the owner; full state replicated on the next push.

Both deletes + the usage update now run inside the same DB
transaction so a partial-failure scenario can't leave the owner in
the same wedged state. Test extended to seed `evolu_timestamp`
rows + non-null first/lastTimestamp in the fixture and assert all
three are cleaned post-compact (regression guard).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 6, 2026

Greptile Summary

This PR fixes a production bug where compacting an owner's data left evolu_timestamp rows and stale firstTimestamp/lastTimestamp pointers intact, causing Evolu's negentropy reconciliation to silently reject all subsequent client pushes. The fix extracts the compact logic into a shared compactOwner() helper that adds DELETE FROM evolu_timestamp and clears the usage bookkeeping timestamps inside the same atomic transaction used for evolu_message cleanup.

  • src/lib/compact-owner.ts — new module containing the single compactOwner() transaction; both endpoints now call it instead of duplicating the SQL inline, eliminating the drift risk that caused the original bug.
  • src/lib/admin-server.ts + src/lib/self-server.ts — inline transactions replaced with compactOwner() calls; response shapes, auth flows, and error handling are untouched.
  • test/compact-owner.test.mjs — new unit tests covering deletion, timestamp cleanup (regression guard), usage zeroing, idempotency, and cross-owner isolation; test/self-server.integration.test.mjs extended to assert all three cleanup steps end-to-end.

Confidence Score: 5/5

Safe to merge — well-scoped atomicity fix backed by unit and integration tests, with no auth or data-flow changes.

The extraction into compactOwner() is a straightforward refactor: the SQL executed is identical to the original except for two provably missing statements. The transaction boundary is unchanged, both call-sites pass a fresh DB handle with busy_timeout already set, and the response contract is preserved. Tests cover the regression scenario, idempotency, and cross-owner isolation.

No files require special attention.

Important Files Changed

Filename Overview
src/lib/compact-owner.ts New shared helper that atomically wipes evolu_message, evolu_timestamp, and clears evolu_usage bookkeeping in one better-sqlite3 transaction; logic is correct and well-documented.
src/lib/admin-server.ts Inline transaction replaced with compactOwner() call; response shape unchanged; error handling preserved.
src/lib/self-server.ts Same inline-to-compactOwner() refactor as admin-server; auth flow, DB handle lifecycle, and error paths are untouched.
test/compact-owner.test.mjs Five unit tests covering message deletion, timestamp deletion (regression guard), usage bookkeeping, idempotency, and cross-owner isolation.
test/self-server.integration.test.mjs Fixture extended to seed evolu_timestamp rows and non-null first/lastTimestamp; compact test now asserts all three cleanup steps end-to-end.

Sequence Diagram

sequenceDiagram
    participant C as Client
    participant EP as /compact-owner endpoint
    participant CO as compactOwner()
    participant DB as SQLite

    C->>EP: POST compact request (auth)
    EP->>DB: Open fresh write handle
    EP->>CO: compactOwner(db, ownerId)
    CO->>DB: BEGIN TRANSACTION
    CO->>DB: SELECT storedBytes (before)
    CO->>DB: SELECT COUNT(*) FROM evolu_message
    CO->>DB: DELETE FROM evolu_message
    CO->>DB: DELETE FROM evolu_timestamp (NEW)
    CO->>DB: UPDATE evolu_usage firstTimestamp=NULL lastTimestamp=NULL (NEW)
    CO->>DB: SELECT storedBytes (after)
    CO->>DB: COMMIT
    CO-->>EP: CompactOwnerResult
    EP->>C: 200 JSON response
    EP->>DB: db.close()
Loading

Reviews (2): Last reviewed commit: "refactor: extract compactOwner helper so..." | Re-trigger Greptile

Comment thread src/lib/admin-server.ts Outdated
…action

Greptile flagged on PR #10 that admin-server.ts had no integration test
for the new evolu_timestamp + first/lastTimestamp cleanup steps —
correct: the two endpoints had identical-but-duplicated transactions,
so a future tweak that landed on one path could silently regress the
other.

Move the transaction into `src/lib/compact-owner.ts` exporting a
single `compactOwner(db, ownerIdBytes)` function. Both /admin/compact-
owner and /self/compact-owner now call it. There's no longer
per-endpoint compact code to drift.

Added direct unit coverage at `test/compact-owner.test.mjs`:
  - deletes evolu_message rows
  - deletes evolu_timestamp rows (the production-wedge regression guard)
  - zeroes storedBytes + clears first/lastTimestamp
  - idempotent on already-empty owner
  - other owners' state untouched (defence-in-depth against a bad
    WHERE clause in a future refactor)

Net: admin-server.ts -51 lines, self-server.ts -36 lines, +1 helper
file, +1 unit test file. 26 tests pass (5 new + 21 existing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@elkimek
Copy link
Copy Markdown
Owner Author

elkimek commented May 6, 2026

Addressed Greptile's drift concern by extracting the shared transaction:

  • src/lib/compact-owner.ts — new helper. Both /admin/compact-owner and /self/compact-owner call it. Single source of truth for what "compact" means.
  • test/compact-owner.test.mjs — direct unit coverage with 5 scenarios:
    • deletes evolu_message rows ✓
    • deletes evolu_timestamp rows ✓ (the production-wedge regression guard)
    • zeroes storedBytes + clears first/lastTimestamp
    • idempotent on already-empty owner ✓
    • other owners' state untouched ✓ (defence-in-depth against a bad WHERE clause)

Both endpoints now exercise the same code path, so future tweaks can't silently regress one and not the other. 26/26 tests green.

@elkimek elkimek merged commit 9244b1f into main May 6, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant