Handle nested SQLite transactions during builds#193
Handle nested SQLite transactions during builds#193GrimoireScribe wants to merge 1 commit intotirth8205:mainfrom
Conversation
…205#135, tirth8205#181) Python's sqlite3 with default isolation_level="" silently opens a DEFERRED transaction on the first DML statement. When remove_file_data() is called for stale/deleted files, the DELETE statements open an implicit transaction that is never committed. The subsequent store_file_nodes_edges() then executes BEGIN IMMEDIATE inside that already-open transaction, causing: sqlite3.OperationalError: cannot start a transaction within a transaction This affects both full_build (stale-file purge loop) and incremental_update (deleted-file handling). Fix: - graph.py: Add in_transaction guard before BEGIN IMMEDIATE in store_file_nodes_edges() to flush any pending implicit transaction - graph.py: Add public rollback() method to GraphStore - incremental.py: Commit after stale-file purge loop in full_build() - incremental.py: Commit after deleted-file removal in incremental_update() Tests: - test_store_after_remove_no_transaction_error (single remove → store) - test_store_after_multiple_removes_no_transaction_error (bulk removes → store) Closes tirth8205#135, tirth8205#181, relates to tirth8205#162, tirth8205#193, tirth8205#94 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: flush implicit SQLite transaction before BEGIN IMMEDIATE (#135, #181) Python's sqlite3 with default isolation_level="" silently opens a DEFERRED transaction on the first DML statement. When remove_file_data() is called for stale/deleted files, the DELETE statements open an implicit transaction that is never committed. The subsequent store_file_nodes_edges() then executes BEGIN IMMEDIATE inside that already-open transaction, causing: sqlite3.OperationalError: cannot start a transaction within a transaction This affects both full_build (stale-file purge loop) and incremental_update (deleted-file handling). Fix: - graph.py: Add in_transaction guard before BEGIN IMMEDIATE in store_file_nodes_edges() to flush any pending implicit transaction - graph.py: Add public rollback() method to GraphStore - incremental.py: Commit after stale-file purge loop in full_build() - incremental.py: Commit after deleted-file removal in incremental_update() Tests: - test_store_after_remove_no_transaction_error (single remove → store) - test_store_after_multiple_removes_no_transaction_error (bulk removes → store) Closes #135, #181, relates to #162, #193, #94 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address root cause — set isolation_level=None, atomic summaries The in_transaction guard from the previous commit addresses the symptom. This commit fixes the root cause: Python's sqlite3 default isolation_level="" silently opens implicit DEFERRED transactions on every DML statement. Setting isolation_level=None disables this legacy behavior entirely, making all transaction management explicit. Changes: - graph.py: Set isolation_level=None in sqlite3.connect() to disable implicit transactions — this is the actual root-cause fix - graph.py: Downgrade in_transaction guard log from debug to warning (it should never fire now; if it does, something is wrong) - tools/build.py: Wrap each _compute_summaries block in explicit BEGIN IMMEDIATE / COMMIT / ROLLBACK — previously these DELETE+INSERT sequences ran without transaction protection, risking partial writes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: improve comments — explain why, not what - incremental.py: Reword commit-after-delete comments to explain the reason (avoiding nested transactions) instead of describing the mechanism (flushing implicit transactions) - test_graph.py: Reword test comments to explain the scenario being guarded against rather than implementation details that changed with the isolation_level=None fix Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: apply isolation_level=None to all connection sites, atomic batch ops Audit of the entire codebase found two additional sqlite3.connect() calls that were still using the default isolation_level="" (implicit transactions), plus two batch-write functions without transaction protection. Connections: - registry.py: ConnectionPool.get() now passes isolation_level=None to prevent implicit transactions on pooled connections - embeddings.py: EmbeddingStore.__init__ now passes isolation_level=None to match GraphStore behavior Atomic batch operations: - flows.py: store_flows() wrapped in BEGIN IMMEDIATE / COMMIT / ROLLBACK to prevent partial flow data if an exception interrupts the loop - communities.py: store_communities() wrapped in BEGIN IMMEDIATE / COMMIT / ROLLBACK to prevent partial community data on crash Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: update uv.lock to match pyproject.toml v2.2.2 Lock file was out of sync — still referenced v2.0.0 while pyproject.toml already declared v2.2.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Closing in favor of #205 which provides a more complete fix. Your savepoint approach is technically valid and clever — it preserves atomic per-file replace semantics when the connection is already in a transaction. However, it is a symptom-level fix: it works around the implicit transaction left open by PR #205 goes further in three ways:
Additionally, the |
Summary
Why
On Windows, rebuilding the graph could fail with:
ext sqlite3.OperationalError: cannot start a transaction within a transactionThis happens when store_file_nodes_edges() is called while the connection is already in a transaction. Using a savepoint preserves the atomic per-file replace behavior without assuming top-level transaction ownership.
Verification