Fix CI failures to re-dispatch instead of reject#736
Conversation
When CI is failing, the system was rejecting PRs and starting from scratch. Now it uses "changes requested" flow which: - Keeps the existing branch/PR - Re-dispatches the agent with feedback about CI failures - Agent can fix the issue without losing work Changes: - Remove "DO NOT APPROVE" language from CI status in prompt - Add CI failure check before approval/rejection flow - Use request_changes_merge_entry for CI failures - Update system prompt to clarify CI is handled separately
There was a problem hiding this comment.
This PR correctly addresses the core problem — using request_changes_merge_entry instead of full rejection keeps the branch alive and lets the agent fix CI without starting over. The prompt changes are clean and the system prompt update makes the LLM/CI separation explicit.
Two correctness issues worth fixing before this ships:
Double API fetch (run_loop.rs ~line 2193–2233): The first get_pull_request call detects the CI failure but drops the PR object. A second identical call fetches check names for the feedback string. These should be one call — the current approach wastes an API round-trip and creates a window where the second fetch could return different CI state than the first.
Comment/event ordering (run_loop.rs ~line 2191): emit_orchestrator_decision and post_issue_comment both fire before the CI gate is checked. When CI is failing, the PR gets an "Approved" or "Rejected" comment and a mismatched decision event in the log, while the actual outcome is "Changes Requested". This confuses the agent (sees "Approved" but gets re-dispatched) and leaves an inconsistent audit trail. Moving the CI gate earlier — before the comment and event — is the simplest fix.
One suggestion: when both CI and code quality are failing, only CI feedback reaches the agent. Including the quality rejection reasoning would prevent a cycle where the agent fixes CI but reintroduces the same code problem.
Reviewed by PR / Review
| "CI checks are failing. Please fix the build/test failures and push again.".to_string() | ||
| }; | ||
|
|
||
| if let Err(e) = orch_server.request_changes_merge_entry(&entry_id, "CI checks failing", &feedback).await { |
There was a problem hiding this comment.
[IMPORTANT]
Priority: Correctness
Two separate get_pull_request calls are made when CI is failing — one at line 2193 to detect the failure, and a second one here to get the check names for feedback. The PR object fetched on the first call is dropped after extracting the boolean, so it can't be reused.
This is wasteful (an extra API call per evaluation cycle when CI is red) and creates a subtle correctness gap: the two fetches may return different data if checks change between calls, so ci_failing could be true while the second fetch now shows Success.
A straightforward fix is to change the first fetch to return an Option(PullRequest) and reuse it:
let ci_pr = if let Some((owner, repo, number)) = tasks_orchestrator::parse_pr_url(&pr_url) {
match merge_github.get_pull_request(&owner, &repo, number).await {
Ok(pr) => Some(pr),
Err(e) => {
warn!(entry_id = %entry_id, error = %e, "failed to fetch PR for CI check");
None
}
}
} else {
None
};
let ci_failing = ci_pr.as_ref().map(|pr| {
use tasks_github::model::StatusCheckRollupState;
matches!(pr.ci_status, Some(StatusCheckRollupState::Failure) | Some(StatusCheckRollupState::Error))
}).unwrap_or(false);
if ci_failing {
let feedback = ci_pr.as_ref().map(|pr| {
// build feedback from pr.check_runs directly, no second fetch needed
...
}).unwrap_or_else(|| "CI checks are failing. ...".to_string());
...
}| evaluated_prs.insert(pr_url.clone(), sha.clone()); | ||
| } | ||
|
|
||
| // GATE: Check CI status before acting on evaluation (#690) |
There was a problem hiding this comment.
[IMPORTANT]
Priority: Correctness
The emit_orchestrator_decision event (line ~2148) and post_issue_comment (line ~2173) both run before this CI gate, so by the time ci_failing is checked the PR already has an "Orchestrator Evaluation: Approved" (or "Rejected") comment and the event log records the evaluation decision.
When CI is failing:
- If code was approved: PR shows "Approved" but
request_changes_merge_entryis immediately called → the entry transitions toChangesRequested. The agent re-dispatches seeing a confusing "Approved" comment. - If code was rejected: same mismatch — PR shows "Rejected" but the system re-dispatches rather than deleting the branch.
The audit trail in the event log is also inconsistent: a Decision(approved=true) event is emitted that doesn't reflect the actual outcome.
The CI gate should run before posting the comment and emitting the decision event, or the comment/event should reflect the CI-gated final action. At minimum, when the CI gate fires it should post a follow-up comment (or amend the decision event) so the PR history isn't misleading.
| false | ||
| }; | ||
|
|
||
| if ci_failing { |
There was a problem hiding this comment.
[SUGGESTION]
Priority: Code Quality
When CI is failing and the orchestrator also rejected the PR for code quality reasons, only the CI feedback is forwarded to the agent (evaluation.feedback is ignored here). The agent is told "fix CI" but not "also the code has these problems". It may push a CI-green commit that still has the same code quality issues, triggering another evaluation → rejection cycle.
Consider combining both signals:
let mut feedback = ci_feedback_string;
if !evaluation.approved {
if let Some(quality_feedback) = &evaluation.feedback {
feedback.push_str(&format!("\n\nAdditionally, the code review found issues:\n{}", quality_feedback));
}
}
Summary
Fix the CI failure handling to re-dispatch agents instead of rejecting PRs.
Problem
When CI was failing, the orchestrator would:
approved: false)reject_merge_entrywhich deletes the branch and re-dispatches from scratchThis broke the whole loop because agents kept recreating the same broken code.
Solution
request_changes_merge_entrywhich:ChangesRequestedstate (priority dispatch)Test plan
Fixes the issue where all PRs were being rejected due to CI failures.