Skip to content

fix: flush implicit transaction before BEGIN IMMEDIATE (#135)#162

Closed
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:claude/interesting-johnson
Closed

fix: flush implicit transaction before BEGIN IMMEDIATE (#135)#162
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:claude/interesting-johnson

Conversation

@azizur100389
Copy link
Copy Markdown
Contributor

Summary

Root Cause

Python's sqlite3 with default isolation_level="" auto-begins a DEFERRED transaction on the first DML. In incremental_update and full_build, remove_file_data() issues DELETE statements for deleted/stale files, silently opening an implicit transaction. The subsequent store_file_nodes_edges() then executes BEGIN IMMEDIATE inside that already-open transaction, which SQLite rejects.

Fix

Check self._conn.in_transaction before BEGIN IMMEDIATE and commit any pending implicit transaction. This preserves atomic replace semantics while being robust against prior implicit transaction state. The guard is a no-op when no implicit transaction is open.

Tests Added

  • test_store_after_remove_no_transaction_error — direct GraphStore unit test
  • test_incremental_deleted_and_modified_files — end-to-end incremental update with mixed deletes and modifications
  • test_full_build_stale_files_then_parse — end-to-end full build with stale file purging

Test Plan

  • All 3 new regression tests pass
  • All existing test_graph.py and test_incremental.py tests pass (46/46)
  • ruff check clean on all changed files

When git diff includes deleted files, remove_file_data() DELETEs open
an implicit DEFERRED transaction. The subsequent store_file_nodes_edges()
call then fails with "cannot start a transaction within a transaction"
because BEGIN IMMEDIATE cannot nest inside DEFERRED.

Commit any pending implicit transaction before BEGIN IMMEDIATE.
Add regression tests for both incremental_update and full_build paths.
@azizur100389 azizur100389 force-pushed the claude/interesting-johnson branch from 18a9c1b to bcb3dad Compare April 8, 2026 22:12
ghiemer added a commit to ghiemer/code-review-graph that referenced this pull request Apr 10, 2026
…205#135, tirth8205#181)

Python's sqlite3 with default isolation_level="" silently opens a
DEFERRED transaction on the first DML statement. When remove_file_data()
is called for stale/deleted files, the DELETE statements open an implicit
transaction that is never committed. The subsequent
store_file_nodes_edges() then executes BEGIN IMMEDIATE inside that
already-open transaction, causing:

  sqlite3.OperationalError: cannot start a transaction within a transaction

This affects both full_build (stale-file purge loop) and
incremental_update (deleted-file handling).

Fix:
- graph.py: Add in_transaction guard before BEGIN IMMEDIATE in
  store_file_nodes_edges() to flush any pending implicit transaction
- graph.py: Add public rollback() method to GraphStore
- incremental.py: Commit after stale-file purge loop in full_build()
- incremental.py: Commit after deleted-file removal in incremental_update()

Tests:
- test_store_after_remove_no_transaction_error (single remove → store)
- test_store_after_multiple_removes_no_transaction_error (bulk removes → store)

Closes tirth8205#135, tirth8205#181, relates to tirth8205#162, tirth8205#193, tirth8205#94

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tirth8205 pushed a commit that referenced this pull request Apr 11, 2026
* fix: flush implicit SQLite transaction before BEGIN IMMEDIATE (#135, #181)

Python's sqlite3 with default isolation_level="" silently opens a
DEFERRED transaction on the first DML statement. When remove_file_data()
is called for stale/deleted files, the DELETE statements open an implicit
transaction that is never committed. The subsequent
store_file_nodes_edges() then executes BEGIN IMMEDIATE inside that
already-open transaction, causing:

  sqlite3.OperationalError: cannot start a transaction within a transaction

This affects both full_build (stale-file purge loop) and
incremental_update (deleted-file handling).

Fix:
- graph.py: Add in_transaction guard before BEGIN IMMEDIATE in
  store_file_nodes_edges() to flush any pending implicit transaction
- graph.py: Add public rollback() method to GraphStore
- incremental.py: Commit after stale-file purge loop in full_build()
- incremental.py: Commit after deleted-file removal in incremental_update()

Tests:
- test_store_after_remove_no_transaction_error (single remove → store)
- test_store_after_multiple_removes_no_transaction_error (bulk removes → store)

Closes #135, #181, relates to #162, #193, #94

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address root cause — set isolation_level=None, atomic summaries

The in_transaction guard from the previous commit addresses the symptom.
This commit fixes the root cause: Python's sqlite3 default
isolation_level="" silently opens implicit DEFERRED transactions on
every DML statement. Setting isolation_level=None disables this legacy
behavior entirely, making all transaction management explicit.

Changes:
- graph.py: Set isolation_level=None in sqlite3.connect() to disable
  implicit transactions — this is the actual root-cause fix
- graph.py: Downgrade in_transaction guard log from debug to warning
  (it should never fire now; if it does, something is wrong)
- tools/build.py: Wrap each _compute_summaries block in explicit
  BEGIN IMMEDIATE / COMMIT / ROLLBACK — previously these DELETE+INSERT
  sequences ran without transaction protection, risking partial writes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: improve comments — explain why, not what

- incremental.py: Reword commit-after-delete comments to explain the
  reason (avoiding nested transactions) instead of describing the
  mechanism (flushing implicit transactions)
- test_graph.py: Reword test comments to explain the scenario being
  guarded against rather than implementation details that changed
  with the isolation_level=None fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: apply isolation_level=None to all connection sites, atomic batch ops

Audit of the entire codebase found two additional sqlite3.connect()
calls that were still using the default isolation_level="" (implicit
transactions), plus two batch-write functions without transaction
protection.

Connections:
- registry.py: ConnectionPool.get() now passes isolation_level=None
  to prevent implicit transactions on pooled connections
- embeddings.py: EmbeddingStore.__init__ now passes isolation_level=None
  to match GraphStore behavior

Atomic batch operations:
- flows.py: store_flows() wrapped in BEGIN IMMEDIATE / COMMIT / ROLLBACK
  to prevent partial flow data if an exception interrupts the loop
- communities.py: store_communities() wrapped in BEGIN IMMEDIATE / COMMIT /
  ROLLBACK to prevent partial community data on crash

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update uv.lock to match pyproject.toml v2.2.2

Lock file was out of sync — still referenced v2.0.0 while
pyproject.toml already declared v2.2.2.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tirth8205
Copy link
Copy Markdown
Owner

Closing in favor of #205 which was just merged and provides a more complete fix.

Your PR correctly identifies the problem and the in_transaction guard in store_file_nodes_edges() is the right instinct. The three regression tests (direct unit test, incremental update, full build) are thorough and well-structured.

However, PR #205 goes further:

  1. Root cause elimination: sets isolation_level=None on all sqlite3.connect() calls — disabling implicit transactions project-wide so the guard becomes a true last-resort safety net rather than the primary defense.
  2. Full incremental path: also commits after the deleted-file loop in incremental_update() (your incremental.py diff only patches full_build()).
  3. Other connect() sites: your fix only touches graph.py, while fix: resolve SQLite transaction nesting errors (#135, #181) #205 also patches registry.py and embeddings.py.
  4. Atomicity across batch ops: wraps store_flows(), store_communities(), and _compute_summaries blocks in explicit transactions.

Your end-to-end regression tests were valuable — they were the most comprehensive integration coverage among all the competing PRs. Thank you for the contribution.

@tirth8205 tirth8205 closed this Apr 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants