fix(code-review): retry broken sandboxes#3110

Open
alex-alecu wants to merge 4 commits into main from heartbreaking-ragamuffin

@alex-alecu
Contributor

Why

Sometimes a code review fails because the sandbox it is running in breaks. When that happens, the review should get one clean second try instead of staying failed.

What changed

Code reviews now remember which sandbox and attempt they are using. When cloud-agent-next confirms a sandbox failed and destroys it, the web app claims affected reviews only once and starts them again in a fresh attempt. Old updates from the first attempt are ignored, so they cannot overwrite the retry. The same pull request check is reused and shows that the review is trying again.
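
The "old updates are ignored" rule can be sketched as a small attempt check. This is only an in-memory illustration, not the code in this PR: the type names, field names, and both functions below are hypothetical, and the real change works against the database and the status route.

```typescript
// Hypothetical in-memory model. Names are illustrative assumptions,
// not the schema or functions used in this PR.
type ReviewStatus = "running" | "completed" | "failed";

interface Review {
  id: string;
  status: ReviewStatus;
  currentAttempt: number;
  sandboxId: string | null;
}

interface StatusUpdate {
  reviewId: string;
  attempt: number; // the attempt the sender was started for
  status: ReviewStatus;
}

// Apply an update only when it belongs to the review's current attempt.
// Returns true when applied and false when ignored as stale.
function applyStatusUpdate(review: Review, update: StatusUpdate): boolean {
  if (update.reviewId !== review.id) return false;
  if (update.attempt !== review.currentAttempt) return false;
  review.status = update.status;
  return true;
}

// Move the review to a fresh attempt after its sandbox was destroyed.
function startRetry(review: Review): void {
  review.currentAttempt += 1;
  review.sandboxId = null; // the next attempt gets a new sandbox
  review.status = "running";
}

const review: Review = { id: "r1", status: "running", currentAttempt: 1, sandboxId: "sb-1" };
startRetry(review);

// A late "failed" from attempt 1 cannot overwrite the retry.
console.log(applyStatusUpdate(review, { reviewId: "r1", attempt: 1, status: "failed" })); // false
// The update from attempt 2 is accepted.
console.log(applyStatusUpdate(review, { reviewId: "r1", attempt: 2, status: "completed" })); // true
```

Comparing attempt numbers keeps the check cheap: a late message needs no extra bookkeeping, it just fails to match the current attempt.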

How to test

  1. Run pnpm drizzle migrate.
  2. Run pnpm test -- "apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts" apps/web/src/app/api/internal/code-review-sandbox-destroyed/route.test.ts apps/web/src/lib/code-reviews/sandbox-retry.test.ts apps/web/src/lib/code-reviews/db/code-reviews.test.ts apps/web/src/lib/code-reviews/dispatch/dispatch-pending-reviews.test.ts apps/web/src/routers/code-reviews-router.test.ts.
  3. Run pnpm --filter kilo-code-review-worker test.
  4. Run pnpm --filter cloud-agent-next test -- sandbox-recovery.test.ts session-prepare.test.ts.
  5. Run pnpm --filter @kilocode/worker-utils test.

Comment thread packages/db/src/migrations/0117_new_jasper_sitwell.sql Outdated
@kilo-code-bot
Contributor

kilo-code-bot Bot commented May 7, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (3 files)
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts
  • packages/db/src/migrations/0117_new_jasper_sitwell.sql

Resolved Findings
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts - sandbox retry now only skips terminal cleanup when a retry was actually claimed
  • packages/db/src/migrations/0117_new_jasper_sitwell.sql - sandbox ID index now uses CREATE INDEX CONCURRENTLY

Reviewed by gpt-5.5-2026-04-23 · 810,280 tokens

@alex-alecu
Contributor Author

Manual test passed.

Tested:

  • Dispatched a code review through code-review-infra using a public git URL and cloud-agent-next.
  • Simulated sandbox destruction through the internal sandbox-destroyed endpoint.
  • Dispatched and verified retry attempt 2.
  • Exercised stale attempt-1 callback handling and duplicate sandbox-destroyed notification handling.

Verified:

  • cloud-agent-next prepared a sandbox, cloned the repo, produced session IDs, and completed the review callback flow.
  • Retry claim persisted current_attempt = 2, sandbox_retry_count = 1, sandbox_retry_reason = sandbox_500_destroyed, and cleared old session/sandbox fields.
  • Retry attempt 2 used a fresh session_id, cli_session_id, and sandbox_id, then completed successfully.
  • Stale attempt-1 callback was ignored, and duplicate sandbox-destroyed notification returned claimed = 0.
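
For readers following along, the claim behavior verified above can be sketched roughly like this. It is a simplified in-memory model, not the query in this PR: the column names come from the results above, while claimSandboxRetries, MAX_SANDBOX_RETRIES, and the one-retry limit are assumptions made for the illustration.

```typescript
// Simplified row shape. Column names are taken from the verified results above;
// everything else in this sketch is a hypothetical illustration.
interface ReviewRow {
  id: string;
  current_attempt: number;
  sandbox_retry_count: number;
  sandbox_retry_reason: string | null;
  session_id: string | null;
  cli_session_id: string | null;
  sandbox_id: string | null;
}

// Assumption: a review gets at most one sandbox retry.
const MAX_SANDBOX_RETRIES = 1;

// Claim every review that was running on the destroyed sandbox and still has
// a retry left. Claimed rows no longer reference the sandbox, so a duplicate
// notification for the same sandbox matches nothing.
function claimSandboxRetries(
  rows: ReviewRow[],
  destroyedSandboxId: string,
  reason: string,
): { claimed: number } {
  let claimed = 0;
  for (const row of rows) {
    const eligible =
      row.sandbox_id === destroyedSandboxId &&
      row.sandbox_retry_count < MAX_SANDBOX_RETRIES;
    if (!eligible) continue;

    row.current_attempt += 1;
    row.sandbox_retry_count += 1;
    row.sandbox_retry_reason = reason;
    // Clear identifiers from the old attempt so the retry starts clean.
    row.session_id = null;
    row.cli_session_id = null;
    row.sandbox_id = null;
    claimed += 1;
  }
  return { claimed };
}

const rows: ReviewRow[] = [
  {
    id: "review-1",
    current_attempt: 1,
    sandbox_retry_count: 0,
    sandbox_retry_reason: null,
    session_id: "session-a",
    cli_session_id: "cli-a",
    sandbox_id: "sandbox-a",
  },
];

console.log(claimSandboxRetries(rows, "sandbox-a", "sandbox_500_destroyed")); // { claimed: 1 }
console.log(claimSandboxRetries(rows, "sandbox-a", "sandbox_500_destroyed")); // { claimed: 0 }
```

In a real database the eligibility check and the update would need to happen in one atomic statement so that two concurrent notifications cannot both claim the same review.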
