fix: resolve SQLite transaction bugs, FTS5 sync, and atomic operations#94
fix: resolve SQLite transaction bugs, FTS5 sync, and atomic operations#94murilloimparavel wants to merge 3 commits intotirth8205:mainfrom
Conversation
tirth8205
left a comment
There was a problem hiding this comment.
This fixes a real and impactful SQLite transaction nesting bug (closes #135, #110). The in_transaction guard and atomic batch wrapping are the right approach.
Please rebase on latest main and add test coverage for the transaction nesting scenario (e.g., update with deleted files in diff). Once done, this is ready to merge.
…205#135, tirth8205#181) Python's sqlite3 with default isolation_level="" silently opens a DEFERRED transaction on the first DML statement. When remove_file_data() is called for stale/deleted files, the DELETE statements open an implicit transaction that is never committed. The subsequent store_file_nodes_edges() then executes BEGIN IMMEDIATE inside that already-open transaction, causing: sqlite3.OperationalError: cannot start a transaction within a transaction This affects both full_build (stale-file purge loop) and incremental_update (deleted-file handling). Fix: - graph.py: Add in_transaction guard before BEGIN IMMEDIATE in store_file_nodes_edges() to flush any pending implicit transaction - graph.py: Add public rollback() method to GraphStore - incremental.py: Commit after stale-file purge loop in full_build() - incremental.py: Commit after deleted-file removal in incremental_update() Tests: - test_store_after_remove_no_transaction_error (single remove → store) - test_store_after_multiple_removes_no_transaction_error (bulk removes → store) Closes tirth8205#135, tirth8205#181, relates to tirth8205#162, tirth8205#193, tirth8205#94 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: flush implicit SQLite transaction before BEGIN IMMEDIATE (#135, #181) Python's sqlite3 with default isolation_level="" silently opens a DEFERRED transaction on the first DML statement. When remove_file_data() is called for stale/deleted files, the DELETE statements open an implicit transaction that is never committed. The subsequent store_file_nodes_edges() then executes BEGIN IMMEDIATE inside that already-open transaction, causing: sqlite3.OperationalError: cannot start a transaction within a transaction This affects both full_build (stale-file purge loop) and incremental_update (deleted-file handling). Fix: - graph.py: Add in_transaction guard before BEGIN IMMEDIATE in store_file_nodes_edges() to flush any pending implicit transaction - graph.py: Add public rollback() method to GraphStore - incremental.py: Commit after stale-file purge loop in full_build() - incremental.py: Commit after deleted-file removal in incremental_update() Tests: - test_store_after_remove_no_transaction_error (single remove → store) - test_store_after_multiple_removes_no_transaction_error (bulk removes → store) Closes #135, #181, relates to #162, #193, #94 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address root cause — set isolation_level=None, atomic summaries The in_transaction guard from the previous commit addresses the symptom. This commit fixes the root cause: Python's sqlite3 default isolation_level="" silently opens implicit DEFERRED transactions on every DML statement. Setting isolation_level=None disables this legacy behavior entirely, making all transaction management explicit. Changes: - graph.py: Set isolation_level=None in sqlite3.connect() to disable implicit transactions — this is the actual root-cause fix - graph.py: Downgrade in_transaction guard log from debug to warning (it should never fire now; if it does, something is wrong) - tools/build.py: Wrap each _compute_summaries block in explicit BEGIN IMMEDIATE / COMMIT / ROLLBACK — previously these DELETE+INSERT sequences ran without transaction protection, risking partial writes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: improve comments — explain why, not what - incremental.py: Reword commit-after-delete comments to explain the reason (avoiding nested transactions) instead of describing the mechanism (flushing implicit transactions) - test_graph.py: Reword test comments to explain the scenario being guarded against rather than implementation details that changed with the isolation_level=None fix Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: apply isolation_level=None to all connection sites, atomic batch ops Audit of the entire codebase found two additional sqlite3.connect() calls that were still using the default isolation_level="" (implicit transactions), plus two batch-write functions without transaction protection. Connections: - registry.py: ConnectionPool.get() now passes isolation_level=None to prevent implicit transactions on pooled connections - embeddings.py: EmbeddingStore.__init__ now passes isolation_level=None to match GraphStore behavior Atomic batch operations: - flows.py: store_flows() wrapped in BEGIN IMMEDIATE / COMMIT / ROLLBACK to prevent partial flow data if an exception interrupts the loop - communities.py: store_communities() wrapped in BEGIN IMMEDIATE / COMMIT / ROLLBACK to prevent partial community data on crash Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: update uv.lock to match pyproject.toml v2.2.2 Lock file was out of sync — still referenced v2.0.0 while pyproject.toml already declared v2.2.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review: PR #94 — fix: resolve SQLite transaction bugs, FTS5 sync, and atomic operationsThis is a high-quality, focused bug fix PR. The core fixes are correct:
One issue to fix: Rebase status: The owner requested a rebase and test coverage for the transaction nesting scenario. The author rebased (comment from April 8). Please confirm the test for the nesting scenario was added — the diff shown does not include a test for concurrent Merge order note: The owner stated this should merge before #95 (which overlaps on post-processing). Please ensure rebase is current with main and CI passes. Verdict: Nearly ready to merge. Fix the duplicate logger declaration, confirm CI passes, and this is mergeable. |
|
Update: Merge conflict detected. GitHub reports this PR has merge conflicts with main (mergeStateStatus: DIRTY). Despite the author rebasing on April 8, there have been additional merges since then. Please rebase again on the current main branch. Once conflicts are resolved, fix the duplicate logger in search.py, and CI passes, this is ready to merge. |
…perations
Fixes multiple SQLite transaction management issues:
1. **BEGIN IMMEDIATE conflict** (graph.py): Add `in_transaction` guard
before `BEGIN IMMEDIATE` to prevent "cannot start a transaction within
a transaction" errors when implicit transactions are left open.
2. **FTS5 content sync** (search.py): Recreate FTS5 table with
`content='nodes'` and use `INSERT INTO nodes_fts(nodes_fts)
VALUES('rebuild')` for proper content-sync population.
3. **Atomic batch operations** (flows.py, communities.py): Wrap
DELETE + INSERT loops in `BEGIN IMMEDIATE` / `try` / `COMMIT` /
`except ROLLBACK` to prevent partial writes on crash.
4. **Missing rollback on error** (tools/build.py): Add `store.rollback()`
in post-processing except blocks to prevent lingering transactions.
5. **Public rollback API** (graph.py): Expose `GraphStore.rollback()`
method instead of accessing `_conn` directly.
6. **Diagnostic logging**: Add `logger.warning()` before silent rollbacks
in transaction guards for debuggability.
Tested on monorepo with 8,352 nodes, 50,259 edges — zero transaction
errors after fix.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Rebased on latest main and addressed reviewer feedback:
|
d6f6c58 to
c91141e
Compare
Summary
Fixes multiple SQLite transaction management issues that cause
cannot start a transaction within a transactionerrors and potential data corruption:graph.py): Addin_transactionguard beforeBEGIN IMMEDIATEto prevent errors when implicit transactions are left open by prior DML operationssearch.py): Recreate FTS5 table withcontent='nodes'and useINSERT INTO nodes_fts(nodes_fts) VALUES('rebuild')for proper content-sync population (was creating standalone table disconnected from nodes)flows.py,communities.py): Wrap DELETE + INSERT loops inBEGIN IMMEDIATE/try/COMMIT/except ROLLBACKto prevent partial writes on crashtools/build.py): Addstore.rollback()in post-processing except blocks to prevent lingering transactions that corrupt the DB on next rungraph.py): ExposeGraphStore.rollback()method instead of accessing_conndirectlylogger.warning()before silent rollbacks in transaction guards for debuggabilityReproduction
The
cannot start a transaction within a transactionerror occurs when:store_file_nodes_edges()callsBEGIN IMMEDIATEwhile a prior operation (e.g.,set_metadata,upsert_node) left an implicit Python sqlite3 transaction openTest plan
🤖 Generated with Claude Code