fix(review): gate exact reviews by provider lease #281
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Codex review: needs maintainer review before merge. Reviewed June 12, 2026, 11:47 AM ET / 15:47 UTC.
Summary
Reproducibility: yes, at source level. Current main reduces background capacity when exact runs are active but has no shared gate before independent exact-review workflows invoke Codex. The live run reproduces the proposed contention model, although it intentionally does not reproduce an actual Codex-provider overload.
Review metrics: 3 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security
Review details
Best possible solution: Use one capacity model where practical. First determine whether the existing worker limiter can provide cross-workflow exact-review admission and terminal waiting semantics; retain the provider lease only if stronger provider-key protection requires it, with the state-outage policy and defaults explicitly approved.
Do we have a high-confidence way to reproduce the issue? Yes, at source level: current main reduces background capacity when exact runs are active but has no shared gate before independent exact-review workflows invoke Codex. The live run reproduces the proposed contention model, although it intentionally does not reproduce an actual Codex-provider overload.
Is this the best way to solve the issue? Unclear. The lease is a coherent solution with strong lifecycle proof, but the repository-member author has appropriately paused the branch because extending the existing limiter may provide a simpler single source of capacity policy.
AGENTS.md: found and applied where relevant.
Codex review notes: model internal, reasoning high; reviewed against be65dd0db8a3.
Label changes
Label justifications:
Evidence reviewed
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
How this review workflow works
|
9aa183f to 071ccec
071ccec to c562d12
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
c562d12 to 2e5a31d
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
@clawsweeper re-review |
2e5a31d to 14eed4e
14eed4e to fe01297
fe01297 to aca476a
aca476a to 64760bc
|
Putting this back to draft for now.
The immediate overload symptoms appear to have been mitigated by lowering the ClawSweeper worker/shard defaults on
That said, this implementation adds a separate provider-lease system backed by
Before treating this as merge-ready, I think we should first decide whether exact re-review should be folded into the existing worker/capacity limiter instead of introducing a parallel provider-level limiter. If the existing limiter can count active exact-review runs as workers and make new exact-review runs wait or time out consistently, that would keep the capacity model easier to reason about.
A provider lease may still be the right second-stage hardening if we need stronger cross-workflow/provider-key protection than GitHub Actions active-run accounting can provide. |
Why
ClawSweeper exact reviews can run from many GitHub workflow instances at once. Worker sizing only controls local/shard concurrency; it does not protect the shared Codex provider/API-key capacity across independent runs. The important behavior change in this PR is to move overload handling from "start Codex, hit provider capacity, then retry/fail" to "reserve a global provider slot before starting Codex."
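To make the "reserve a slot first" idea concrete, here is a minimal sketch of weighted admission against a fixed capacity. It is illustrative only: the class `ProviderLeases`, its method names, and the in-memory `Map` are assumptions for this example, whereas the PR keeps lease state in shared state and its real API in `src/repair/provider-lease.ts` may differ. The result fields and the reason string mirror the JSON quoted in the proof sections of this description.

```typescript
// Hypothetical in-memory model of a weighted provider lease table.
// The PR stores this state remotely; a Map stands in for it here.
interface Lease {
  weight: number;
  expiresAt: number; // epoch milliseconds
}

interface AcquireResult {
  acquired: boolean;
  activeWeight: number;
  capacity: number;
  reason?: string;
}

class ProviderLeases {
  private leases = new Map<string, Lease>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Drop leases whose TTL has passed so abandoned runs free their slot.
  private prune(now: number): void {
    this.leases.forEach((lease, owner) => {
      if (lease.expiresAt <= now) this.leases.delete(owner);
    });
  }

  private activeWeight(): number {
    let total = 0;
    this.leases.forEach((lease) => {
      total += lease.weight;
    });
    return total;
  }

  // Admit the caller only if its weight still fits under the capacity.
  tryAcquire(owner: string, weight: number, ttlMs: number, now: number): AcquireResult {
    this.prune(now);
    const active = this.activeWeight();
    if (active + weight > this.capacity) {
      return {
        acquired: false,
        activeWeight: active,
        capacity: this.capacity,
        reason: `provider capacity full: active weight ${active}/${this.capacity}, requested ${weight}`,
      };
    }
    this.leases.set(owner, { weight, expiresAt: now + ttlMs });
    return { acquired: true, activeWeight: active + weight, capacity: this.capacity };
  }

  // Returns true only if the owner actually held a lease.
  release(owner: string): boolean {
    return this.leases.delete(owner);
  }
}
```

With `capacity` set to 1, a second caller is refused until the first releases or its TTL lapses, which is the contention shape the proofs exercise. Time is passed in explicitly so the behavior is easy to test without real waiting.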
That gives us a controlled queue/backpressure point inside ClawSweeper: while global provider capacity is full, an exact review waits for a lease and reports `Waiting for Codex capacity` instead of stampeding the provider and amplifying transient transport/capacity failures. If capacity remains full until the configured wait deadline, the workflow now writes a terminal `Capacity timeout` command status instead of leaving the command looking like it is still waiting. If the provider-lease state update itself fails, the run fails explicitly rather than pretending capacity is merely full.
The lease is renewed while the exact review is running, so review runtime is not bounded in practice by the original lease TTL. A long-running review keeps its slot; if renewal stops or the lease disappears, the review step fails closed instead of continuing after its slot has expired.
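The wait, timeout, and fail-closed rules described above can be sketched as follows. This is a hedged illustration, not the PR's code: `LeaseStore`, `Clock`, `waitForLease`, and `renewOrFail` are hypothetical names, and the clock and sleep are injected (and synchronous) purely so the sketch is deterministic. A real workflow step would sleep asynchronously between attempts.

```typescript
// Hypothetical store interface; the real API in src/repair/provider-lease.ts may differ.
interface LeaseStore {
  tryAcquire(owner: string): boolean; // may throw if the shared state update fails
  renew(owner: string): boolean; // false when the lease has expired or disappeared
}

// Injected time source so the waiting logic can be exercised without real delays.
interface Clock {
  now(): number;
  sleep(ms: number): void;
}

type WaitOutcome = "acquired" | "capacity-timeout";

// Poll for a slot until the wait deadline, then report a terminal timeout.
function waitForLease(
  store: LeaseStore,
  owner: string,
  waitMs: number,
  pollMs: number,
  clock: Clock,
): WaitOutcome {
  const deadline = clock.now() + waitMs;
  for (;;) {
    // A thrown error propagates on purpose: a state outage is a failure,
    // not something to report as "capacity full".
    if (store.tryAcquire(owner)) return "acquired";
    if (clock.now() >= deadline) return "capacity-timeout";
    clock.sleep(Math.min(pollMs, deadline - clock.now()));
  }
}

// Fail closed: a review must not keep running once its slot is gone.
function renewOrFail(store: LeaseStore, owner: string): void {
  if (!store.renew(owner)) {
    throw new Error(`provider lease lost for ${owner}; stopping review`);
  }
}
```

Returning a distinct `"capacity-timeout"` outcome, rather than looping forever or throwing, is what lets the caller write a terminal status for the command. Errors from the store are deliberately left uncaught so they surface as explicit failures.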
Summary
Testing
- `pnpm run build:repair`
- `node --test test/repair/provider-lease.test.ts`
- `node --test --test-name-pattern "provider lease|exact event reviews|live proof mode" test/clawsweeper.test.ts`
- `pnpm run check` on Node v24.15.0
- `pnpm run format:check`, `node --test --test-name-pattern "event re-review status|provider capacity|provider lease" test/clawsweeper.test.ts`, and `autoreview --mode local` clean after fixes

Observed local results on head `64760bcb6afa74b427c716f9aecb7eab87df73ea`:
- `pnpm run check` passed: 454 unit tests, 504 repair tests, coverage checks, and format check.

Code locations
Head: `64760bcb6afa74b427c716f9aecb7eab87df73ea`
- `src/repair/provider-lease.ts#L205-L225`
- `src/repair/provider-lease.ts#L237-L265`
- `src/repair/provider-lease.ts#L282-L323`
- `.github/workflows/sweep.yml#L424-L507`
- `.github/workflows/sweep.yml#L509-L519`
- `.github/workflows/sweep.yml#L1949-L2335`
- `test/repair/provider-lease.test.ts#L110-L142`
- `test/repair/provider-lease.test.ts#L181-L209`
- `test/repair/provider-lease.test.ts#L211-L331`
- `test/repair/provider-lease.test.ts#L333-L408`
- `test/repair/provider-lease.test.ts#L423-L445`
- `test/clawsweeper.test.ts#L17789-L17797`, `test/clawsweeper.test.ts#L18491-L18498`, `test/clawsweeper.test.ts#L18501-L18534`

Automated proof
The local provider-lease proof is CI-safe and deterministic. It uses a temporary local bare git repository as the shared state remote, then clones it into multiple independent checkouts to model multiple workflow runners racing on the same `state` branch.
What the shared-state proof executes:
- `state` branch
- `capacity=1` from the first checkout and verifies `acquired=true` with `activeWeight=1`
- `acquired=false`, `activeWeight=1`, and a `provider capacity full` reason
- `provider capacity full`
- `pre-receive` hook on the bare origin and verifies a failed shared-state push throws `shared state push failed` instead of returning a false successful acquisition

Live Actions proof
Run: https://github.com/openclaw/clawsweeper/actions/runs/27425442802
How it was triggered:

    gh workflow run sweep.yml \
      --repo openclaw/clawsweeper \
      --ref codex/provider-lease-throttle \
      -f target_repo=openclaw/clawsweeper \
      -f additional_prompt='[provider-lease-live-proof]'

The marker `[provider-lease-live-proof]` skips normal review/planning/publish/shard jobs and runs only the proof jobs. This proof uses real GitHub Actions runners and the real `openclaw/clawsweeper-state` repository, but an isolated provider namespace `codex-live-proof-27425442802`; it does not invoke Codex.
Observed workflow metadata:
- `workflow_dispatch`
- `64760bcb6afa74b427c716f9aecb7eab87df73ea`
- `success`
- `Plan review candidates`, `Review shard ${{ matrix.shard }}`, `Review, comment, and apply event item`, `Publish review artifacts`, `Recover failed review shards`, `Apply close proposals`, `Audit state`

Observed real concurrency:
- `Provider lease live proof concurrent waiter` started at `2026-06-12T15:27:05Z`
- `Provider lease live proof holder` started at `2026-06-12T15:27:06Z`
- `provider-leases/codex-live-proof-27425442802.json`

Observed holder/acquire/renew/release behavior:
- `proof-holder-27425442802-1` at `2026-06-12T15:27:41Z` with JSON `acquired: true`, `activeWeight: 1`, `capacity: 1`
- `15:27:52Z`, `15:28:04Z`, `15:28:16Z`, and `15:28:28Z`, each returning JSON `renewed: true`
- `15:28:29Z` with JSON `released: true`

Observed concurrent capacity timeout:
- `2026-06-12T15:27:42Z`
- `capacity=1`, `weight=1`, `wait-seconds=20`
- `2026-06-12T15:28:09Z` with JSON `acquired: false`, `activeWeight: 1`, `capacity: 1`, and reason `provider capacity full: active weight 1/1, requested 1`

Observed release recovery:
- `Provider lease live proof release check` acquired `proof-release-check-27425442802-1` at `2026-06-12T15:29:08Z` with JSON `acquired: true`, `activeWeight: 1`, `capacity: 1`
- `15:29:10Z` with JSON `released: true`

Observed cancellation/TTL recovery:
- `proof-abandoned-27425442802-1` at `2026-06-12T15:29:53Z` with `ttl-seconds=15`, JSON `acquired: true`, `activeWeight: 1`, `capacity: 1`
- `proof-recovered-27425442802-1` succeeded at `2026-06-12T15:30:15Z` with JSON `acquired: true`, `activeWeight: 1`, `capacity: 1`
- `15:30:17Z` with JSON `released: true`
- `15:30:19Z`

What this proof does not claim: