Add audit DB retention policy#5
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an audit database retention mechanism to keep the SQLite audit DB from growing indefinitely by pruning sessions (and related audit data) older than a configured age.
Changes:
- Add
Store.DeleteSessionsBefore(cutoff)to delete stale sessions and dependent rows in a transaction. - Add
--retention-days Nflag tobeacon-proxyto run retention cleanup on startup. - Update README to document the new flag and revise the limitations section; bump Go version in
go.modand CI.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
internal/audit/store.go |
Adds transactional retention deletion across audit tables. |
internal/audit/store_test.go |
Adds tests for retention deletion behavior. |
cmd/beacon-proxy/main.go |
Adds --retention-days flag and runs cleanup on startup. |
README.md |
Documents the retention flag and updates limitations text. |
go.mod |
Bumps minimum Go version to 1.24. |
.github/workflows/ci.yml |
Updates CI Go version to 1.24. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| module github.com/ottojongerius/beacon | ||
|
|
||
| go 1.22 | ||
| go 1.26 |
There was a problem hiding this comment.
The PR description doesn’t mention raising the module’s minimum Go version to 1.24. If this bump isn’t required for a new language/runtime feature, consider keeping the previous Go version (or document the minimum supported Go version change in the PR/README) since it can break consumers/build environments pinned to older Go.
There was a problem hiding this comment.
Intentional — govulncheck flagged 16 CVEs in Go 1.22 stdlib. Bumped to 1.26 (latest release). The CLAUDE.md already says Go 1.22+ which is still accurate as a minimum.
| - uses: actions/setup-go@v5 | ||
| with: | ||
| go-version: '1.22' | ||
| go-version: '1.26' | ||
|
|
There was a problem hiding this comment.
CI is now pinned to Go 1.24, but the PR description doesn’t call out a minimum Go version change. If the project is intentionally raising the required Go version, it would be good to document that (and ensure it matches go.mod); otherwise consider keeping CI on the previously supported version.
| store.db.QueryRow("SELECT COUNT(*) FROM intent_contexts WHERE session_id = ?", oldID).Scan(&count) | ||
| if count != 0 { | ||
| t.Error("old intent contexts still exist") | ||
| } |
There was a problem hiding this comment.
This test verifies sessions/messages/tool_calls/intent_contexts are removed, but it doesn’t assert that rows in the join table intent_tool_calls are pruned. Add an assertion on intent_tool_calls to ensure the cascade is complete (and to catch dangling references).
| } | |
| } | |
| store.db.QueryRow("SELECT COUNT(*) FROM intent_tool_calls").Scan(&count) | |
| if count != 0 { | |
| t.Error("old intent tool call links still exist") | |
| } |
| store.db.QueryRow("SELECT COUNT(*) FROM sessions WHERE id = ?", oldID).Scan(&count) | ||
| if count != 0 { | ||
| t.Error("old session still exists") | ||
| } | ||
| store.db.QueryRow("SELECT COUNT(*) FROM messages WHERE session_id = ?", oldID).Scan(&count) | ||
| if count != 0 { | ||
| t.Error("old messages still exist") | ||
| } | ||
| store.db.QueryRow("SELECT COUNT(*) FROM tool_calls WHERE session_id = ?", oldID).Scan(&count) | ||
| if count != 0 { | ||
| t.Error("old tool calls still exist") | ||
| } | ||
| store.db.QueryRow("SELECT COUNT(*) FROM intent_contexts WHERE session_id = ?", oldID).Scan(&count) | ||
| if count != 0 { | ||
| t.Error("old intent contexts still exist") | ||
| } | ||
|
|
||
| // New session should still be there | ||
| store.db.QueryRow("SELECT COUNT(*) FROM sessions WHERE id = ?", newID).Scan(&count) | ||
| if count != 1 { | ||
| t.Error("new session was deleted") | ||
| } | ||
| store.db.QueryRow("SELECT COUNT(*) FROM messages WHERE session_id = ?", newID).Scan(&count) |
There was a problem hiding this comment.
These COUNT(*) checks ignore Scan() errors. If the query fails, the test may pass with a stale count value. Capture and assert the error from each Scan() to make the test reliable.
| store.db.QueryRow("SELECT COUNT(*) FROM sessions WHERE id = ?", oldID).Scan(&count) | |
| if count != 0 { | |
| t.Error("old session still exists") | |
| } | |
| store.db.QueryRow("SELECT COUNT(*) FROM messages WHERE session_id = ?", oldID).Scan(&count) | |
| if count != 0 { | |
| t.Error("old messages still exist") | |
| } | |
| store.db.QueryRow("SELECT COUNT(*) FROM tool_calls WHERE session_id = ?", oldID).Scan(&count) | |
| if count != 0 { | |
| t.Error("old tool calls still exist") | |
| } | |
| store.db.QueryRow("SELECT COUNT(*) FROM intent_contexts WHERE session_id = ?", oldID).Scan(&count) | |
| if count != 0 { | |
| t.Error("old intent contexts still exist") | |
| } | |
| // New session should still be there | |
| store.db.QueryRow("SELECT COUNT(*) FROM sessions WHERE id = ?", newID).Scan(&count) | |
| if count != 1 { | |
| t.Error("new session was deleted") | |
| } | |
| store.db.QueryRow("SELECT COUNT(*) FROM messages WHERE session_id = ?", newID).Scan(&count) | |
| if err := store.db.QueryRow("SELECT COUNT(*) FROM sessions WHERE id = ?", oldID).Scan(&count); err != nil { | |
| t.Fatalf("count old sessions: %v", err) | |
| } | |
| if count != 0 { | |
| t.Error("old session still exists") | |
| } | |
| if err := store.db.QueryRow("SELECT COUNT(*) FROM messages WHERE session_id = ?", oldID).Scan(&count); err != nil { | |
| t.Fatalf("count old messages: %v", err) | |
| } | |
| if count != 0 { | |
| t.Error("old messages still exist") | |
| } | |
| if err := store.db.QueryRow("SELECT COUNT(*) FROM tool_calls WHERE session_id = ?", oldID).Scan(&count); err != nil { | |
| t.Fatalf("count old tool calls: %v", err) | |
| } | |
| if count != 0 { | |
| t.Error("old tool calls still exist") | |
| } | |
| if err := store.db.QueryRow("SELECT COUNT(*) FROM intent_contexts WHERE session_id = ?", oldID).Scan(&count); err != nil { | |
| t.Fatalf("count old intent contexts: %v", err) | |
| } | |
| if count != 0 { | |
| t.Error("old intent contexts still exist") | |
| } | |
| // New session should still be there | |
| if err := store.db.QueryRow("SELECT COUNT(*) FROM sessions WHERE id = ?", newID).Scan(&count); err != nil { | |
| t.Fatalf("count new sessions: %v", err) | |
| } | |
| if count != 1 { | |
| t.Error("new session was deleted") | |
| } | |
| if err := store.db.QueryRow("SELECT COUNT(*) FROM messages WHERE session_id = ?", newID).Scan(&count); err != nil { | |
| t.Fatalf("count new messages: %v", err) | |
| } |
| } | ||
| defer store.Close() | ||
|
|
||
| store.CreateSession("test", "cat") |
There was a problem hiding this comment.
The CreateSession() call’s error is ignored here, which can cause the test to proceed with an invalid setup. Please check and fail the test if CreateSession returns an error.
| store.CreateSession("test", "cat") | |
| if _, err := store.CreateSession("test", "cat"); err != nil { | |
| t.Fatalf("CreateSession: %v", err) | |
| } |
| return 0, fmt.Errorf("delete intent_tool_calls: %w", err) | ||
| } |
There was a problem hiding this comment.
The retention cascade deletes from intent_tool_calls only by stale intent_id. If an intent context from a non-stale session references a tool_call from a stale session (possible given intent/tool-call linking isn’t enforced at the DB layer), the later DELETE FROM tool_calls ... will leave dangling intent_tool_calls rows (or fail if foreign keys are enabled). Delete intent_tool_calls for stale sessions by tool_call_id as well (e.g., intent_id IN stale intents OR tool_call_id IN stale tool_calls) before deleting tool_calls.
| return 0, fmt.Errorf("delete intent_tool_calls: %w", err) | |
| } | |
| return 0, fmt.Errorf("delete intent_tool_calls by intent_id: %w", err) | |
| } | |
| if _, err := tx.Exec(`DELETE FROM intent_tool_calls WHERE tool_call_id IN ( | |
| SELECT id FROM tool_calls WHERE session_id IN (`+staleSessionQ+`))`, cutoff); err != nil { | |
| return 0, fmt.Errorf("delete intent_tool_calls by tool_call_id: %w", err) | |
| } |
| // Run retention cleanup if configured | ||
| if *retentionDays > 0 { | ||
| cutoff := time.Now().UTC().AddDate(0, 0, -*retentionDays) | ||
| deleted, err := store.DeleteSessionsBefore(cutoff) | ||
| if err != nil { | ||
| log.Printf("beacon: retention cleanup error: %v", err) | ||
| } else if deleted > 0 { | ||
| log.Printf("beacon: retention cleanup removed %d session(s) older than %d days", deleted, *retentionDays) | ||
| } |
There was a problem hiding this comment.
Retention cleanup runs synchronously on the main goroutine before the HTTP server and proxy start. On large audit DBs this can noticeably delay startup, and it conflicts with the issue’s requirement that cleanup “runs without blocking the proxy”. Consider running cleanup in a background goroutine (optionally with a timeout) or exposing a separate cleanup command.
| // Run retention cleanup if configured | |
| if *retentionDays > 0 { | |
| cutoff := time.Now().UTC().AddDate(0, 0, -*retentionDays) | |
| deleted, err := store.DeleteSessionsBefore(cutoff) | |
| if err != nil { | |
| log.Printf("beacon: retention cleanup error: %v", err) | |
| } else if deleted > 0 { | |
| log.Printf("beacon: retention cleanup removed %d session(s) older than %d days", deleted, *retentionDays) | |
| } | |
| // Run retention cleanup if configured (in background to avoid blocking startup) | |
| if *retentionDays > 0 { | |
| days := *retentionDays | |
| go func(s *audit.Store, d int) { | |
| cutoff := time.Now().UTC().AddDate(0, 0, -d) | |
| deleted, err := s.DeleteSessionsBefore(cutoff) | |
| if err != nil { | |
| log.Printf("beacon: retention cleanup error: %v", err) | |
| } else if deleted > 0 { | |
| log.Printf("beacon: retention cleanup removed %d session(s) older than %d days", deleted, d) | |
| } | |
| }(store, days) |
Prune sessions and all related data (messages, tool calls, intents) older than N days on proxy startup. Closes #3.
6a2f8d9 to
f24069e
Compare
Summary
--retention-days Nflag to auto-delete sessions older than N days on proxy startupDeleteSessionsBeforemethod cascades through all related tables (intent_tool_calls → intent_contexts → tool_calls → messages → sessions) in a single transactionCloses #3
Test plan
TestDeleteSessionsBefore— verifies old session + all related data is pruned, recent data is keptTestDeleteSessionsBefore_NothingToDelete— verifies no-op when nothing is stale--retention-days 30against a DB with old sessions