Skip to content
This repository was archived by the owner on Apr 2, 2026. It is now read-only.

Add tamper-proof hash chain to audit trail#6

Merged
ojongerius merged 4 commits intomainfrom
tamper-proof-audit
Mar 26, 2026
Merged

Add tamper-proof hash chain to audit trail#6
ojongerius merged 4 commits intomainfrom
tamper-proof-audit

Conversation

@ojongerius
Copy link
Copy Markdown
Collaborator

Summary

  • SHA-256 hash chain on the messages table: each message stores prev_hash and hash, linking it to the previous record
  • VerifyChain() method walks the chain and detects modifications, deletions, or reordering
  • Works with both plaintext and encrypted stores (decrypts before verifying)
  • Chain survives proxy restarts by resuming from the last stored hash
  • Updated README: document hash chain in security section, update known limitations

Closes #2

Test plan

  • TestVerifyChain_ValidChain — intact chain passes
  • TestVerifyChain_DetectsModifiedContent — tampered raw content detected
  • TestVerifyChain_DetectsDeletedMessage — deleted message breaks chain linkage
  • TestVerifyChain_EmptyDB — empty DB is valid
  • TestVerifyChain_WithEncryption — works with encryption enabled
  • TestHashChain_SurvivesRestart — chain continues correctly after close/reopen

Each message stores a SHA-256 hash chained to the previous message,
making modifications, deletions, and reordering detectable. Includes
VerifyChain() for offline integrity verification. Works with both
plaintext and encrypted stores, and survives proxy restarts. Closes #2.
govulncheck flagged 16 CVEs in Go 1.22 stdlib (crypto/tls, net/url,
os/exec, etc). All fixed in Go 1.24+.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a tamper-evident integrity mechanism to the audit messages log by introducing a SHA-256 hash chain, along with verification and tests, and bumps the Go toolchain version used in CI.

Changes:

  • Extend messages schema with sequence, prev_hash, and hash, compute chained hashes on insert, and add VerifyChain() to validate integrity (including decrypt-then-verify when encryption is enabled).
  • Add unit tests covering valid chains, tampering, deletion, empty DB, encryption, and restart behavior.
  • Update documentation and CI/toolchain Go version to 1.24.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
internal/audit/store.go Adds message hash-chain fields/state, hash computation, and VerifyChain() implementation.
internal/audit/store_test.go Adds tests for hash chain verification and restart continuity.
README.md Documents the new hash-chain integrity feature and updates limitations.
go.mod Bumps module Go version to 1.24.
.github/workflows/ci.yml Updates CI to use Go 1.24.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/audit/store.go
Comment on lines +212 to +221
// computeHash creates a SHA-256 hash of the message content chained with the previous hash.
func computeHash(id, sessionID, direction, raw, prevHash string) string {
h := sha256.New()
h.Write([]byte(prevHash))
h.Write([]byte(id))
h.Write([]byte(sessionID))
h.Write([]byte(direction))
h.Write([]byte(raw))
return hex.EncodeToString(h.Sum(nil))
}
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computeHash concatenates variable-length fields without separators/length-prefixing, so different tuples of (sessionID,direction,raw,...) can produce the same byte stream without needing a SHA-256 collision (e.g., boundary-shifting attacks). Also, the hash currently omits timestamp, jsonrpc_id, method, and sequence, so tampering with those columns would not be detected. Use an unambiguous encoding (length-prefixed/binary or canonical JSON of a struct) and include all fields whose integrity you want to protect.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — now uses length-prefixed encoding (8-byte big-endian length prefix per field) and includes all fields: jsonrpc_id, method, sequence in addition to the existing ones.

Comment thread internal/audit/store.go Outdated
Comment on lines 35 to 40
sequence INTEGER,
prev_hash TEXT NOT NULL DEFAULT '',
hash TEXT NOT NULL DEFAULT ''
);

CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, timestamp);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sequence is nullable and has no uniqueness guarantee, but VerifyChain() relies on ORDER BY sequence and scans into a non-nullable int. A NULL/duplicate sequence (possible during migration or manual DB edits) will either error or make ordering ambiguous. Consider making sequence INTEGER NOT NULL with a UNIQUE constraint (or primary key) and add an index on it for efficient startup resume and verification.

Suggested change
sequence INTEGER,
prev_hash TEXT NOT NULL DEFAULT '',
hash TEXT NOT NULL DEFAULT ''
);
CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, timestamp);
sequence INTEGER NOT NULL,
prev_hash TEXT NOT NULL DEFAULT '',
hash TEXT NOT NULL DEFAULT ''
);
CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, timestamp);
CREATE UNIQUE INDEX IF NOT EXISTS idx_messages_session_sequence ON messages(session_id, sequence);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — sequence is now NOT NULL with a UNIQUE index. Also added a schema migration (ALTER TABLE) for existing databases.

Comment thread internal/audit/store.go
Comment on lines 193 to 205
s.mu.Lock()
defer s.mu.Unlock()

s.sequence++
prevHash := s.lastHash
hash := computeHash(id, sessionID, direction, stored, prevHash)
s.lastHash = hash

_, err := s.db.Exec(
"INSERT INTO messages (id, session_id, direction, timestamp, jsonrpc_id, method, raw) VALUES (?, ?, ?, ?, ?, ?, ?)",
"INSERT INTO messages (id, session_id, direction, timestamp, jsonrpc_id, method, raw, sequence, prev_hash, hash) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
id, sessionID, direction, time.Now().UTC(), nullIfEmpty(jsonrpcID), nullIfEmpty(method), s.encrypt(stored),
s.sequence, prevHash, hash,
)
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hash chaining state (s.lastHash/s.sequence) is maintained only in-process. If two processes (or two Store instances) write to the same SQLite DB concurrently, they can both resume the same last hash/sequence and generate conflicting sequence/prev_hash/hash, breaking the chain. To make this robust, compute the next sequence + previous hash inside a single DB transaction with an appropriate write lock (e.g., BEGIN IMMEDIATE) and derive prev_hash from the last committed row.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. By design each proxy instance owns its own Store — concurrent multi-process writes to the same DB isn't a supported use case. The UNIQUE constraint on sequence will catch any accidental collision. Adding a comment to document this assumption.

Comment thread internal/audit/store.go Outdated
Comment on lines +370 to +379
// decrypt decrypts a string if it looks encrypted. Returns as-is otherwise.
func (s *Store) decrypt(ciphertext string) string {
if s.enc == nil {
return ciphertext
}
decrypted, err := s.enc.Decrypt(ciphertext)
if err != nil {
return ciphertext // not encrypted or wrong key
}
return decrypted
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decrypt() silently returns the ciphertext on decrypt errors (including wrong key), which makes VerifyChain() report a generic "hash mismatch" rather than a clear decryption/key problem. This can be misleading during offline verification for encrypted DBs. Consider having VerifyChain() treat decrypt failures as an explicit verification error (or include a distinct status/error message for decryption failure) when s.enc != nil.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — VerifyChain now uses a separate decryptVerify() that returns errors explicitly. Decrypt failures now produce a clear 'decryption failed (wrong key or corrupted data)' status instead of a misleading hash mismatch.

Comment thread README.md Outdated
- **Minimal stderr logging**: error messages do not leak paths, session IDs, or message contents
- **Encryption at rest**: sensitive columns (`raw`, `arguments`, `result`, `error`) are encrypted with AES-256-GCM when an encryption key is provided. Enable by setting the `BEACON_ENCRYPTION_KEY` environment variable. Without a key, data is stored in plaintext (redaction still applies).
- **Parameterized SQL**: all database queries use parameterized statements — no SQL injection risk
- **Tamper-proof hash chain**: each audit message includes a SHA-256 hash chained to the previous message. Any modification, deletion, or reordering of records is detectable. Verify with `VerifyChain()` — works offline against the DB file.
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README calls this a "Tamper-proof hash chain", but the implementation is only tamper-evident and (as noted below) cannot prevent an attacker with DB access from rewriting the entire chain. To avoid overstating guarantees, rename this to "tamper-evident" (or similar) and ensure the description matches the actual threat model.

Suggested change
- **Tamper-proof hash chain**: each audit message includes a SHA-256 hash chained to the previous message. Any modification, deletion, or reordering of records is detectable. Verify with `VerifyChain()` — works offline against the DB file.
- **Tamper-evident hash chain**: each audit message includes a SHA-256 hash chained to the previous message. Any modification, deletion, or reordering of existing records is detectable. Verify with `VerifyChain()` — works offline against the DB file.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — renamed to 'tamper-evident' throughout.

Comment thread internal/audit/store_test.go Outdated
Comment on lines +311 to +315
sessionID, _ := store.CreateSession("test", "cat")
store.LogMessage(sessionID, "client_to_server", "1", "initialize", `{"jsonrpc":"2.0"}`)
store.LogMessage(sessionID, "server_to_client", "1", "", `{"jsonrpc":"2.0","result":{}}`)
store.LogMessage(sessionID, "client_to_server", "2", "tools/call", `{"jsonrpc":"2.0"}`)

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new tests ignore errors from CreateSession and LogMessage. If an insert fails (e.g., due to a schema/migration issue), the test failure can become harder to diagnose or may not fail at the intended assertion. Please check and assert err for these calls in the new hash-chain tests for clearer failures.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — all CreateSession and LogMessage calls now check and assert errors.

Comment thread internal/audit/store.go
Comment on lines +135 to +145
// Resume hash chain from the most recent message (supports restarts)
var lastHash sql.NullString
var seq sql.NullInt64
err = db.QueryRow("SELECT hash, sequence FROM messages ORDER BY sequence DESC LIMIT 1").Scan(&lastHash, &seq)
if err == nil {
store.lastHash = lastHash.String
store.sequence = seq.Int64
}
// err == sql.ErrNoRows is fine — empty DB, chain starts fresh

return store, nil
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new resume query assumes the messages table already has sequence and hash columns, but the schema is applied via CREATE TABLE IF NOT EXISTS, which does not add columns on upgrade. On an existing DB, SELECT hash, sequence ... will fail (e.g. "no such column"), and Open() currently ignores that error—later LogMessage() will fail on insert. Add an explicit migration step (e.g., PRAGMA table_info(messages) + ALTER TABLE ... ADD COLUMN), and decide how to backfill/handle pre-chain rows so upgraded databases remain usable.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added migrateHashChain() that checks PRAGMA table_info(messages) and runs ALTER TABLE ADD COLUMN for missing columns. Pre-chain rows will have empty hashes (sequence=0, hash='') which is acceptable — the chain starts from the first post-migration message.

- Use length-prefixed encoding in computeHash to prevent boundary attacks
- Include all fields (jsonrpc_id, method, sequence) in hash computation
- Make sequence column NOT NULL with UNIQUE index
- Add schema migration for existing DBs (ALTER TABLE for hash columns)
- Surface decryption failures explicitly in VerifyChain
- Rename "tamper-proof" to "tamper-evident" in README
- Check all errors in tests
@ojongerius ojongerius merged commit 97db5af into main Mar 26, 2026
1 check passed
@ojongerius ojongerius deleted the tamper-proof-audit branch March 26, 2026 07:54
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tamper-proof audit trail

2 participants