ClawSweeper re-review fails with Codex transient_transport on moderate fork PRs (3/3 retries, then falls back to off-meta tidepool) #282
Open
Labels
`P1`: Urgent regression or broken agent/channel workflow affecting real users now.
`clawsweeper:current-main-repro`: ClawSweeper found a high-confidence current-main issue reproduction.
`clawsweeper:fix-shape-clear`: ClawSweeper found a clear likely implementation shape for this issue.
`clawsweeper:queueable-fix`: ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
`impact:other`: This issue has meaningful maintainer-visible impact outside the owned taxonomy.
`issue-rating: 🦀 challenger crab`: Exceptional issue quality: high-confidence current-main reproduction and actionable evidence.
Symptom

`@clawsweeper re-review` on a fork PR consistently fails with `reason=transient_transport` from Codex retries, then falls back to a placeholder review:

    decision=keep_open confidence=low action=kept_open
    review_summary: "Review failed before ClawSweeper could summarize the requested change."
    🌊 off-meta tidepool / [P1] Review did not complete (retryable codex transport failure)

The run itself reports `conclusion: success` (all GitHub Actions steps succeed), so the failure is only visible by reading the step 13 "Review exact event item" logs.

Repro
PR: openclaw/openclaw#92181 (fork PR, +135 src / +103 tests, 4 files, body 4243 B, diff 21460 B; moderate size).

Three independent `@clawsweeper re-review` attempts:

Run 27389971002 step 13 log excerpt
Each attempt completed in well under the configured `--codex-timeout-ms 600000` and `timeout 12m` outer limits: Codex returned the transport error early rather than hitting the time budget.

Codex CLI invocation
From the same job log:
`codex-cli 0.139.0` is what was installed in the cache hit step.

Why this looks like a backend issue, not a content issue
- `openclaw/openclaw` PRs in the same dispatch event window succeeded), so the runner / setup / token plumbing is fine

Possible directions
The failing attempts all use `--codex-reasoning-effort high`, which produces long thinking streams. If the internal Codex backend or any intermediate proxy has a stream / idle timeout shorter than the long-tail thinking time on certain prompts, the connection would close and surface as `transient_transport` regardless of the explicit per-call timeout. That would also explain why this PR specifically keeps reproducing while other PRs in the same workflow complete.

Possible mitigations on the ClawSweeper side, in case the upstream Codex fix is not quick:
- `🌊 off-meta tidepool / Review did not complete`
- `transient_transport` in the verdict comment (currently it just says `did not finish cleanly`, which reads like a queue / dispatcher problem)
- `additional-prompt` knob to let an author request a smaller-context retry without changing the underlying PR

Affected user-visible state
- `🌊 off-meta tidepool` rating with `[P1] Review did not complete (retryable codex transport failure)`
- `clawsweeper-command-progress:end` block shows `State: Failed / Detail: The targeted re-review did not finish cleanly. Check the workflow run for details.`
- `@clawsweeper re-review` calls reproduce the same outcome

Happy to provide more logs, retry traces, or attempt a smaller-diff variant of the same PR if it helps narrow down whether prompt size, reasoning effort, or backend capacity is the dominant factor.
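
To make the ClawSweeper-side mitigation ideas more concrete, here is a rough Python sketch, not ClawSweeper's actual code. It assumes two things that are not confirmed anywhere in this issue: that reasoning effort can be stepped down between retry rounds (the `medium` and `low` values are assumptions; only `high` appears in the logs), and that each attempt exposes its failure reason as a string. `Attempt`, `review_with_fallback`, `verdict_detail`, and the `run_review` callback are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Assumed effort ladder. Only "high" is confirmed by the job log in this issue.
EFFORT_LADDER = ("high", "medium", "low")


@dataclass
class Attempt:
    effort: str
    ok: bool
    reason: Optional[str] = None  # e.g. "transient_transport"


def review_with_fallback(
    run_review: Callable[[str], Attempt],
    retries_per_effort: int = 3,
) -> List[Attempt]:
    """Retry at each effort level, stepping down only on transient_transport."""
    attempts: List[Attempt] = []
    for effort in EFFORT_LADDER:
        for _ in range(retries_per_effort):
            attempt = run_review(effort)
            attempts.append(attempt)
            if attempt.ok:
                return attempts
            if attempt.reason != "transient_transport":
                # A different failure class: lowering effort is unlikely to help.
                return attempts
    return attempts


def verdict_detail(attempts: List[Attempt]) -> str:
    """Build a detail line that names the failure reason instead of hiding it."""
    last = attempts[-1]
    if last.ok:
        return (
            f"Review completed at reasoning effort '{last.effort}' "
            f"after {len(attempts)} attempt(s)."
        )
    reasons = sorted({a.reason or "unknown" for a in attempts if not a.ok})
    return (
        f"Review did not complete after {len(attempts)} attempt(s); "
        f"failure reason(s): {', '.join(reasons)}."
    )
```

If a lower-effort round succeeds where three `high` rounds fail, that would also be a data point toward the stream / idle timeout hypothesis. If all rounds fail, the detail line at least tells the PR author that the cause was a transport failure rather than a queue or dispatcher problem.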