[codex] Fix session creation D1 failure ordering#563

Merged
ColeMurray merged 1 commit into main from codex/d1-session-index-before-init
Apr 26, 2026

@ColeMurray ColeMurray commented Apr 26, 2026

Summary

Moves the POST /sessions D1 session-index write before SessionDO initialization. SessionDO init starts sandbox warming, so this makes D1 failures fail before any sandbox can be spawned.

Adds a focused router regression test proving that a D1 session-index failure does not call /internal/init, and that successful creates write D1 before initializing the SessionDO.
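
The ordering described above can be sketched roughly as follows. This is a minimal illustration and not the actual router.ts code: `createSession`, `SessionIndex`, and `SessionBackend` are hypothetical stand-ins for handleCreateSession, SessionIndexStore, and the SessionDO stub, and the exact error handling is an assumption.

```typescript
// Hypothetical stand-in for the D1-backed session index (SessionIndexStore in the PR).
interface SessionIndex {
  create(row: { id: string; status: string }): Promise<void>;
}

// Hypothetical stand-in for the SessionDO stub; init() models the /internal/init call.
interface SessionBackend {
  init(sessionId: string): Promise<{ ok: boolean }>;
}

interface CreateResult {
  status: number;
  sessionId?: string;
}

async function createSession(
  sessionId: string,
  index: SessionIndex,
  backend: SessionBackend
): Promise<CreateResult> {
  // 1. Persist the index row first. If this fails, nothing has been warmed yet.
  try {
    await index.create({ id: sessionId, status: "created" });
  } catch {
    return { status: 500 };
  }

  // 2. Only now initialize the backend, the step that starts sandbox warming.
  const initResponse = await backend.init(sessionId);
  if (!initResponse.ok) {
    return { status: 500 };
  }

  return { status: 201, sessionId };
}
```

A regression test in the same spirit could pass fakes that append to a shared call log, then assert that the log reads D1 first and init second on success, and contains no init entry when the D1 write throws.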

Why

During D1 stalls, the previous order could initialize a SessionDO and start sandbox warming before the caller received a sessionId. Bot flows send prompts in a second request after POST /sessions returns, so if the D1 write stalled or failed after DO init, a sandbox could boot without ever receiving the prompt.

Validation

  • npm run build -w @open-inspect/shared
  • npm test -w @open-inspect/control-plane -- router.create-session.test.ts
  • npm run typecheck -w @open-inspect/control-plane
  • npm run lint -w @open-inspect/control-plane

Summary by CodeRabbit

  • Tests

    • Added test suite for session creation flow, verifying correct error handling and request sequencing.
  • Bug Fixes

    • Updated session creation to validate database persistence before initializing the session backend, ensuring database failures are caught earlier.

coderabbitai Bot commented Apr 26, 2026

Walkthrough

This pull request reorders session creation operations in the handler to persist records to D1 before initializing SessionDO, ensuring database failures abort early. A new test suite verifies error handling and call ordering for this behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Session Creation Test Suite (packages/control-plane/src/router.create-session.test.ts) | New Vitest suite with 2 tests validating handleCreateSession behavior: confirms HTTP 500 response and skipped session fetch when D1 creation fails; verifies HTTP 201, successful D1 creation, and session fetch invocation on success, asserting create is called before DO initialization. |
| Router Session Handler (packages/control-plane/src/router.ts) | Reordered handleCreateSession to call SessionIndexStore.create() before SessionDO initialization via stub.fetch. Removed post-initialization D1 write block, ensuring D1 failures prevent durable object sandbox warming. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • ColeMurray/background-agents#550: Modifies handleCreateSession in router.ts to add userId resolution and persistence during session creation, directly related to the session handling flow reordering.

Suggested reviewers

  • open-inspect

Poem

🐰 A rabbit hops through sessions new,
D1 writes before DO's debut,
Early failures save the day—no waste,
Tests confirm the proper pace,
Thump thump goes the ordered trace! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title '[codex] Fix session creation D1 failure ordering' clearly and concisely summarizes the main change: reordering when D1 writes occur during session creation to improve failure handling. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@github-actions

Terraform Validation Results

Step Status
Format
Init
Validate

Note: Terraform plan was skipped because secrets are not configured. This is expected for external contributors. See docs/GETTING_STARTED.md for setup instructions.

Pushed by: @ColeMurray, Action: pull_request

@ColeMurray ColeMurray marked this pull request as ready for review April 26, 2026 07:18

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/control-plane/src/router.ts (1)

884-940: ⚠️ Potential issue | 🟡 Minor

Orphaned D1 row when SessionDO init fails after D1 write.

The reorder correctly prevents orphaned sandboxes when D1 fails, but the inverse case is now possible: if stub.fetch(SessionInternalPaths.init) returns non-OK (line 938) or throws, the D1 row written at lines 887–902 with status: "created" is left behind. It will surface in handleListSessions/the sidebar without a working DO, and the caller never receives the sessionId so it cannot retry or clean up.

For symmetry with the prompt-failure path in handleSpawnChild (lines 1991, 2004 mark the child as "failed"), consider compensating on init failure—either delete the row or mark it as "failed"—so the user-visible session list stays consistent with reality.

♻️ Suggested compensating cleanup on init failure
   if (!initResponse.ok) {
+    // Compensate: mark the just-created D1 row so it doesn't appear as a live "created" session.
+    try {
+      await sessionStore.updateStatus(sessionId, "failed");
+    } catch (e) {
+      logger.warn("Failed to mark session as failed after init failure", {
+        session_id: sessionId,
+        error: e instanceof Error ? e : String(e),
+      });
+    }
     return error("Failed to create session", 500);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/control-plane/src/router.ts` around lines 884 - 940, The D1 row
created by SessionIndexStore.create can be left orphaned if
stub.fetch(buildSessionInternalUrl(SessionInternalPaths.init)) fails or returns
non-OK; wrap the init call in a try/catch and on any failure (non-ok response or
thrown error) perform a compensating update via the SessionIndexStore instance
(sessionStore) to mark the session id as failed (e.g., set status: "failed" and
updatedAt: Date.now()) or remove the row, then return the error—do this
immediately around the await stub.fetch(...) block so sessionStore.create(...)
(the call that created the row) is always reconciled on init failure.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b55f0a38-098e-42b8-b5e2-132ea9086ace

📥 Commits

Reviewing files that changed from the base of the PR and between 8bee466 and 34964b4.

📒 Files selected for processing (2)
  • packages/control-plane/src/router.create-session.test.ts
  • packages/control-plane/src/router.ts

@ColeMurray ColeMurray merged commit a2db39a into main Apr 26, 2026
18 checks passed
@ColeMurray ColeMurray deleted the codex/d1-session-index-before-init branch April 26, 2026 07:28