fix(store): skip corrupted part_json and info_json rows instead of crashing#7
Closed
Milofax wants to merge 5 commits into Plutarch01:master from
Conversation
Add shared message validation (getValidMessageInfo) and filter out malformed messages across all archive, resume, describe, transform, search-index, and capture code paths. Prevents crashes when message.info is missing required fields (id, sessionID, role, time.created). Logs warnings for dropped messages instead of throwing. Fixes Plutarch01#4
transformMessages() now splices malformed entries out of the caller's messages array so they never reach the opencode backend, preventing the Bad Request that followed the earlier TypeError fix.
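The validation described above can be sketched as follows. This is a hypothetical reconstruction: the required fields (`id`, `sessionID`, `role`, `time.created`) come from the PR description, but the actual signature and return shape of `getValidMessageInfo` are assumptions.

```typescript
// Hypothetical sketch of the shared message validator. The field list
// mirrors the PR description; the function shape is an assumption.
interface MessageInfo {
  id: string;
  sessionID: string;
  role: string;
  time: { created: number };
}

function getValidMessageInfo(info: unknown): MessageInfo | null {
  if (typeof info !== "object" || info === null) return null;
  const candidate = info as Partial<MessageInfo>;
  if (
    typeof candidate.id !== "string" ||
    typeof candidate.sessionID !== "string" ||
    typeof candidate.role !== "string" ||
    typeof candidate.time?.created !== "number"
  ) {
    // Malformed message: caller logs a warning and drops it instead of throwing.
    return null;
  }
  return candidate as MessageInfo;
}
```

A caller like `transformMessages()` can then splice out every entry for which this returns `null` before the array reaches the backend.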
A single corrupted `parts.part_json`, `messages.info_json`, `summary_nodes.message_ids_json`, `summary_state.root_node_ids_json`, or `artifacts.metadata_json` row in the SQLite archive currently kills the entire grep / describe / transform / summary pipeline via an uncaught `Failed to parse JSON` error from `parseJson`. Add a non-throwing `parseJsonSafe<T>()` to utils.ts and replace every stored-row `parseJson` call site with it.

Store read paths (14 call sites):
- readSessionsBatchSync: part_json + info_json + artifact metadata_json
- readSessionSync: part_json + info_json
- readMessageSync: info_json + part_json
- diagnoseSummarySession: root_node_ids_json
- ensureSummaryGraphSync: root_node_ids_json
- readSummaryNodeSync: message_ids_json
- materializeArtifactRow: metadata_json

Legacy migration paths (2 call sites):
- migrateLegacyArtifacts: per-file try/catch so one bad sessions/*.json no longer kills the remaining files, and corrupted resume.json is gracefully skipped

Each skipped row is logged with operation/sessionID/messageID/partID context and a 120-char preview of the offending blob, so the damage is auditable without crashing the plugin.

The three in-memory `parseJson(JSON.stringify(...))` round-trips in store-artifacts.ts and the already-try/catch-guarded events.jsonl line parser stay as throwing `parseJson`; they cannot fail from disk corruption.

Regression test:
- Capture a session, corrupt one part_json row in SQLite, clear all FTS indexes to force the scan path, call grep, and assert the surviving messages are returned instead of throwing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
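The core helper can be sketched like this. The real `utils.ts` code and its logger are not shown on this page, so the signature, context shape, and log call below are assumptions, not the actual implementation.

```typescript
// Hedged sketch of the non-throwing parseJsonSafe<T>() companion. The
// context/logging shape is illustrative; only the behavior (return null
// instead of throwing, log a 120-char preview) comes from the PR text.
function parseJsonSafe<T>(
  raw: string,
  context: Record<string, string | undefined> = {},
): T | null {
  try {
    return JSON.parse(raw) as T;
  } catch (error) {
    // Log the skip with enough context to audit the damage later.
    console.warn("skipping corrupted JSON row", {
      ...context,
      error: error instanceof Error ? error.message : String(error),
      preview: raw.slice(0, 120), // 120-char preview of the offending blob
    });
    return null;
  }
}
```

Call sites that currently do `parseJson<Part>(row.part_json)` would switch to `parseJsonSafe<Part>(row.part_json, { operation, sessionID, messageID, partID })` and `continue` on `null`.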
Force-pushed from 9c28ceb to 11aad29
Owner

Closing in favor of #8, which lands the same corrupted-JSON-row fix cleanly against current master without the stale conflicts and version churn in this PR.
Summary
One corrupted `parts.part_json` or `messages.info_json` row in the SQLite archive currently takes out `grep`, `describe`, and the automatic retrieval transform pipeline. `parseJson` is called directly in six stored-row load sites inside `store.ts`, so a single unparseable blob bubbles up as `Failed to parse JSON: Unterminated string` and kills the entire batch load instead of skipping the bad row.

This PR adds a non-throwing `parseJsonSafe<T>()` companion to `parseJson` in `utils.ts` and replaces the six unsafe call sites:

- `readSessionsBatchSync`: the `part_json` and `info_json` loops (this is the exact stack in the production crash report).
- `readSessionSync`: the `part_json` and `info_json` loops.
- `readMessageSync`: the `info_json` load and the inner `part_json` map.

A corrupted part or message row now gets skipped with a structured logger warning that carries `operation`, `sessionID`, `messageID`, `partID`, `error`, and a 120-char preview of the offending blob; the surrounding batch load keeps going and returns the healthy rows instead of throwing.
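An illustrative call-site rewrite, to show the skip-and-continue pattern the six sites adopt. The row shape and loop structure here are assumptions that mirror the description above, not the actual `store.ts` code.

```typescript
// Sketch of a batch read loop that skips unparseable rows instead of
// letting one bad blob kill the whole load. Column names mirror the PR
// description; everything else is illustrative.
interface PartRow {
  session_id: string;
  message_id: string;
  part_id: string;
  part_json: string;
}

function safeParse<T>(raw: string): T | null {
  try {
    return JSON.parse(raw) as T;
  } catch {
    return null;
  }
}

function loadParts(rows: PartRow[]): Array<Record<string, unknown>> {
  const parts: Array<Record<string, unknown>> = [];
  for (const row of rows) {
    const parsed = safeParse<Record<string, unknown>>(row.part_json);
    if (parsed === null) {
      // Skipped row: the real code logs operation/sessionID/messageID/
      // partID plus a 120-char preview of the blob here.
      continue;
    }
    parts.push(parsed);
  }
  return parts;
}
```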
Reproducer
Triggered by an `lcm_grep` call whose FTS path returned zero hits and fell through to the scan path against a session that had one on-disk-corrupted `parts.part_json` row.

Regression test

`tests/store-transform.test.mjs` gets a new test, `grep scan survives a single corrupted part_json row`. It:

1. Runs `UPDATE parts SET part_json = '{"id": "m2-p", "type": "text", "text": "cosmic-ray corrupt b'` to mimic on-disk truncation.
2. `DELETE`s all three FTS tables (`message_fts`, `summary_fts`, `artifact_fts`) so grep is forced down the scan path, the exact code path the user crashed on.
3. Calls `store.grep({ query: 'cosmic-ray', sessionID: 's1' })`.

Without the fix, the grep call throws. With the fix, it returns m1 and m3.
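The scenario can be simulated without SQLite, which is a useful sanity check of the skip logic on its own. The `m1`/`m2`/`m3` names and the truncated blob follow the test description above; the scan function itself is a hypothetical stand-in for the real fallback grep path.

```typescript
// Minimal simulation of the regression scenario: three stored part_json
// blobs, one truncated mid-string, scanned the way a fallback (non-FTS)
// grep path would scan rows. Not the real store code.
const storedRows = [
  { id: "m1", part_json: '{"type": "text", "text": "cosmic-ray flip"}' },
  // Truncated blob, mimicking on-disk corruption:
  { id: "m2", part_json: '{"id": "m2-p", "type": "text", "text": "cosmic-ray corrupt b' },
  { id: "m3", part_json: '{"type": "text", "text": "cosmic-ray shower"}' },
];

function grepScan(rows: typeof storedRows, query: string): string[] {
  const hits: string[] = [];
  for (const row of rows) {
    let part: { text?: string };
    try {
      part = JSON.parse(row.part_json);
    } catch {
      continue; // corrupted row is skipped, not fatal
    }
    if (part.text?.includes(query)) hits.push(row.id);
  }
  return hits;
}
```

With the skip in place, scanning for `cosmic-ray` returns `["m1", "m3"]`; without the `try`/`catch`, the m2 row would throw and kill the whole scan.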
Test plan
npm run typecheck— cleannpm run lint— cleannpm test— 169/169 green (was 168 before the new regression test).lcm/lcm.db—lcm_grepno longer crashes, logs the skipped row, returns the surviving hitsNotes
parseJsonitself is unchanged — structural invariants (e.g.summary_state.root_node_ids_json) still throw on corruption, which is the right behavior for places where a failure means the store is unrecoverable rather than just one row being bad.🤖 Generated with Claude Code