Fetch full file context during review, not just diffs by iamnbutler · Pull Request #700 · iamnbutler/tasks

iamnbutler · 2026-03-31T16:38:12Z

Summary

Adds proactive context fetching to the orchestrator's PR evaluation flow: before pass 1, the diff is parsed to identify the most-changed files (up to 3), and surrounding code (±50 lines around each changed line) is fetched from the PR branch and included in the evaluation prompt
Adds a new diff module with a unified diff parser (parse_diff_files) and context windowing (extract_context_window) with configurable limits (max 3 files, ~1000 lines total, ±50 line window)
Updates the system prompt to instruct the reviewer to use the proactive file context for checking callers/dependents

Test plan

All 65 orchestrator tests pass (including 7 new tests for diff parsing, context extraction, and prompt integration)
Full workspace compiles cleanly
Verify proactive context appears in evaluation prompts during live review
Confirm token budget stays reasonable with context included

Closes #668

🤖 Generated with Claude Code

Add proactive context fetching to the orchestrator's PR evaluation flow. Before pass 1, the diff is parsed to identify the most-changed files (up to 3), and surrounding code (±50 lines around each change) is fetched from the PR branch and included in the evaluation prompt. - Add `diff` module with unified diff parser and context windowing - Extend `build_evaluation_prompt` with `file_contexts` parameter - Update system prompt to instruct reviewer to use file context - Fetch proactive context in `evaluate()` before pass 1 LLM call - Add comprehensive tests for diff parsing and context extraction Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions

This PR adds proactive context fetching to the Pass 1 evaluation flow. The approach is well-scoped: parse the diff, rank files by change count, fetch ±50-line windows around changed lines for the top 3 files, and include them in the prompt. The new diff module is cleanly isolated with good unit tests. All 65 tests pass, clippy is clean.

Two suggestions inline — no blockers.

Full Review

Summary

What changed: New diff module (parse_diff_files, extract_context_window) + proactive context fetching loop in ClaudeOrchestrator::evaluate() before Pass 1 + build_evaluation_prompt extended with file_contexts parameter.

Correctness

Error handling in the fetch loop is correct: Ok(None) (new file, not yet on branch) is silently skipped; Err is warned and skipped. Neither case aborts evaluation — appropriate for a best-effort enhancement.
build_deep_review_prompt correctly passes &[] for file contexts — deep review already includes full file contents, so no duplication.
std::mem::take correctly clears changed_lines on each file flush.
Deleted files (+++ /dev/null) are correctly excluded: current_path = None prevents accumulation, and the empty changed_lines guard would catch it even if it didn't.
The parse_hunk_new_start parser handles @@ -0,0 +1,5 @@ (new file) and single-number hunks (@@ -1 +1 @@) correctly, as verified by the tests.

Spec Compliance

No spec impact — this is an internal orchestrator enhancement to the evaluation prompt.

Architecture

Crate boundaries are correct. The diff module lives in orchestrator where it's used; the public visibility is consistent with other orchestrator modules. get_file_content_at_ref is an existing GitHub client method — no new API surface needed.

Code Quality

Constants (MAX_CONTEXT_FILES, MAX_CONTEXT_LINES, CONTEXT_WINDOW) are well-named and centralized.
The two suggestions (budget enforcement + removal-only files) are worth addressing but not blockers.
Test coverage is solid for the pure functions; the fetch loop in claude.rs is untested but that's consistent with how the rest of ClaudeOrchestrator::evaluate() is structured (it makes real network calls).

Reviewed by PR / Review

github-actions · 2026-03-31T16:46:11Z

+                                context_lines = line_count,
+                                "Fetched proactive context"
+                            );
+                            total_lines += line_count;


[SUGGESTION] Priority: Correctness

MAX_CONTEXT_LINES is checked before fetching the next file, but never enforced after extracting context for a single file. A file with many scattered changes can blow well past 1000 lines — e.g., 50 changes each 110 lines apart yields ~50 × 101 = ~5,000 output lines from extract_context_window.

The const comment says "Maximum total lines of context to include (across all files)" but the current loop treats it as a "stop fetching more files after exceeding the budget" cap, not a hard cap.

If token budget matters here (and it will on large refactoring PRs), consider one of:

// Option A: enforce a per-file cap inside extract_context_window (simplest) let context_window = extract_context_window( &content, &file.changed_lines, CONTEXT_WINDOW, ); let context_window_capped: String = context_window .lines() .take(MAX_CONTEXT_LINES.saturating_sub(total_lines)) .collect::(Vec<_)>() .join("\n");

Or truncate before pushing. Either way, updating the doc comment on MAX_CONTEXT_LINES to match actual behavior would clarify intent.

github-actions · 2026-03-31T16:46:11Z

+        } else if current_path.is_some() && new_line_num > 0 {
+            if line.starts_with('+') && !line.starts_with("+++") {
+                // Added line
+                changed_lines.push(new_line_num);


[SUGGESTION] Priority: Code Quality

Only added lines are pushed to changed_lines; removed lines only increment change_count. As a result, a file whose diff is entirely removals (e.g., deleting a function) ends up with empty changed_lines and gets filtered out of context fetching at the flush step (if !changed_lines.is_empty()).

This is arguably correct — there's no "new" line position to anchor a context window to — but it creates a blind spot for the exact case the system prompt calls out: "For removals: is anything left that depends on the removed code?" That check needs context around the removal site in the post-PR file, which is currently not fetched.

Might be worth a comment explaining the trade-off, or tracking the nearest surviving context line for pure-deletion hunks.

iamnbutler · 2026-04-02T02:16:28Z

Closing duplicate - newer PR #701 exists

github-actions bot reviewed Mar 31, 2026

View reviewed changes

iamnbutler closed this Apr 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fetch full file context during review, not just diffs#700

Fetch full file context during review, not just diffs#700
iamnbutler wants to merge 1 commit intomainfrom
tasks/gh-iamnbutler-tasks-issue-668--92055ba7

iamnbutler commented Mar 31, 2026

Uh oh!

github-actions bot left a comment

Uh oh!

github-actions bot Mar 31, 2026

Uh oh!

github-actions bot Mar 31, 2026

Uh oh!

iamnbutler commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

iamnbutler commented Mar 31, 2026

Summary

Test plan

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Summary

Correctness

Spec Compliance

Architecture

Code Quality

Uh oh!

github-actions bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

iamnbutler commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant