[codex] Add Phase 2 validate_patch decision engine by qkal · Pull Request #2 · qkal/techne

qkal · 2026-06-19T20:51:02Z

Summary

Add a Phase 2 validate_patch response contract centered on deterministic decisions, blockers, required checks, evidence, next actions, and fix plans.
Wire the service and MCP tool output through agent_quality_mcp.response while preserving shadow-workspace validation and structured rejection for unsupported mutation modes.
Add regression coverage for decision precedence, optional quick-mode tooling, diagnostic truncation evidence, resource-limit classification, and stale response-contract exports.

Why

The previous response shape exposed raw status buckets and suggestions but did not give callers a clear, deterministic decision path. This adds a decision-engine-first contract so agents can tell whether to apply, revise, fix tooling, reject, or escalate.

Validation

.venv/bin/python -m pytest -v -> 277 passed
.venv/bin/ruff check . -> passed
.venv/bin/pyright --pythonpath .venv/bin/python -> 0 errors
git diff --check -> passed

Notes

The branch was rebased onto master before publishing so the PR diff is scoped to the Phase 2 implementation.

Summary by CodeRabbit

Release Notes

New Features
- Redesigned validation response with structured decision outcomes, confidence levels, actionable next steps, and targeted fix plans replacing previous status-based format.
- Added comprehensive evidence tracking including diagnostic counts, truncation flags, tool availability, and workspace modification state.
Bug Fixes
- Improved error handling for resource limit violations with dedicated error type.
Documentation
- Updated documentation explaining new Phase 2 response structure and field mappings from previous format.

coderabbitai · 2026-06-19T20:51:10Z

Warning

Review limit reached

@qkal, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 26 minutes and 1 second. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9a044544-6760-4d6b-9296-250548aa5665

📥 Commits

Reviewing files that changed from the base of the PR and between fafca53 and cbaa070.

📒 Files selected for processing (2)

src/agent_quality_mcp/response.py
tests/unit/test_response_contract.py

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/phase-2-decision-engine-split

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

devin-ai-integration

Devin Review found 1 potential issue.

devin-ai-integration · 2026-06-19T21:10:56Z

🟡 Incomplete transformation: _final_response computes risk_score from compressed while success path uses diagnostics

The PR intentionally changed the success path (line 172-176) from compute_risk_score(compressed, ...) to compute_risk_score(diagnostics, ...) so that decision-making uses the full uncompressed diagnostic set. The _final_response function was partially updated — the _response_from_parts call was changed to pass diagnostics instead of compressed (line 422), but the compute_risk_score and _missing_tools calls on lines 407-410 still use compressed. This creates a semantic inconsistency: the decision pipeline in build_validate_patch_response (response.py:157-169) operates on the full diagnostics, but the risk_score passed alongside was computed from a potentially different compressed subset. Currently this has no observable impact because _final_response is always called with a single blocking diagnostic that compression never removes, but the inconsistency would surface if _final_response were ever called with multiple diagnostics where compression truncates some.

(Refers to lines 406-410)

Was this helpful? React with 👍 or 👎 to provide feedback.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/agent_quality_mcp/response.py`:
- Around line 60-66: The decision field in the ValidatePatchResponse class is
currently typed as a raw str, which weakens type safety and contract validation.
Change the type annotation of the decision field from str to PatchDecision enum
to ensure deterministic schema validation and stronger contract guarantees while
maintaining string serialization for API responses.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c404e0e4-8928-4bb8-9321-0d1fe15dc866

📥 Commits

Reviewing files that changed from the base of the PR and between f4c7cd0 and fafca53.

📒 Files selected for processing (19)

README.md
src/agent_quality_mcp/actions.py
src/agent_quality_mcp/decision.py
src/agent_quality_mcp/exceptions.py
src/agent_quality_mcp/grouping.py
src/agent_quality_mcp/models.py
src/agent_quality_mcp/response.py
src/agent_quality_mcp/service.py
src/agent_quality_mcp/shadow.py
src/agent_quality_mcp/tools.py
tests/integration/test_validate_patch_demo.py
tests/unit/test_actions.py
tests/unit/test_decision.py
tests/unit/test_grouping.py
tests/unit/test_models.py
tests/unit/test_response_contract.py
tests/unit/test_service.py
tests/unit/test_tools_server.py
tests/unit/test_workspace_shadow.py

💤 Files with no reviewable changes (1)

src/agent_quality_mcp/models.py

📜 Review details

🧰 Additional context used

🪛 ast-grep (0.43.0)

tests/unit/test_response_contract.py

[info] 19-19: Do not hardcode temporary file or directory names
Context: "/tmp/demo"
Note: [CWE-377].

(hardcoded-tmp-file)

tests/unit/test_service.py

[info] 26-26: Do not hardcode temporary file or directory names
Context: "/tmp/shadow"
Note: [CWE-377].

(hardcoded-tmp-file)

🔇 Additional comments (24)

README.md (1)

10-11: LGTM!

Also applies to: 110-164

src/agent_quality_mcp/response.py (1)

76-335: LGTM!

tests/unit/test_response_contract.py (1)

24-209: LGTM!

tests/unit/test_models.py (1)

3-5: LGTM!

Also applies to: 103-105

tests/unit/test_service.py (1)

23-31: LGTM!

Also applies to: 37-38, 52-52, 65-65, 113-116, 138-173, 190-190, 202-202, 212-212, 268-270, 306-307, 336-337, 366-369, 386-389, 392-629, 676-677

src/agent_quality_mcp/tools.py (1)

10-11: LGTM!

Also applies to: 41-45, 74-89

tests/integration/test_validate_patch_demo.py (1)

14-14: LGTM!

Also applies to: 54-70, 95-106

tests/unit/test_tools_server.py (1)

5-5: LGTM!

Also applies to: 88-88, 144-148, 151-203

src/agent_quality_mcp/exceptions.py (1)

24-26: LGTM!

src/agent_quality_mcp/shadow.py (1)

12-12: LGTM!

Also applies to: 99-106

src/agent_quality_mcp/service.py (1)

25-25: LGTM!

Also applies to: 48-48, 173-183, 368-371, 384-384, 422-422, 453-478, 481-489, 530-531

tests/unit/test_workspace_shadow.py (1)

6-6: LGTM!

Also applies to: 75-78, 91-94

src/agent_quality_mcp/decision.py (4)

1-99: LGTM!

101-146: LGTM!

149-203: LGTM!

205-298: LGTM!

tests/unit/test_decision.py (1)

1-268: LGTM!

src/agent_quality_mcp/grouping.py (3)

1-98: LGTM!

101-169: LGTM!

172-238: LGTM!

tests/unit/test_grouping.py (1)

1-232: LGTM!

src/agent_quality_mcp/actions.py (2)

1-104: LGTM!

106-226: LGTM!

tests/unit/test_actions.py (1)

1-190: LGTM!

qkal added 16 commits June 19, 2026 22:52

feat: add phase 2 diagnostic grouping

0380f23

fix: honor diagnostic grouping review findings

9c3d24c

fix: include diagnostic ranges in grouping counts

894b3b8

fix: narrow compressed grouping types

cdaf02f

feat: add phase 2 decision precedence

5e5842f

fix: honor record timeouts in phase 2 decisions

f4ccfd7

feat: add phase 2 next actions

96fec7f

fix: align phase 2 action plan defaults

7f1579c

feat: add phase 2 response assembly

7671b89

fix: prefer reject blocker summary

8f91515

feat: switch validate_patch to decision response

f9dcc9f

fix: restore validate_patch response contract

e4e37af

fix: harden phase 2 response contracts

bd55f20

docs: document phase 2 decision response

3dc2ecb

docs: use phase 2 evidence path

53b2b18

fix: close phase 2 final review gaps

fafca53

qkal force-pushed the feat/phase-2-decision-engine-split branch from 3bf8e04 to fafca53 Compare June 19, 2026 20:53

qkal marked this pull request as ready for review June 19, 2026 20:59

devin-ai-integration Bot reviewed Jun 19, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

Comment thread src/agent_quality_mcp/response.py

fix: type validate patch decision enum

cbaa070

qkal merged commit 554b512 into master Jun 19, 2026
3 checks passed

qkal deleted the feat/phase-2-decision-engine-split branch June 19, 2026 21:35

coderabbitai Bot mentioned this pull request Jun 22, 2026

[codex] add pyright LSP validation #3

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[codex] Add Phase 2 validate_patch decision engine#2

[codex] Add Phase 2 validate_patch decision engine#2
qkal merged 17 commits into
masterfrom
feat/phase-2-decision-engine-split

qkal commented Jun 19, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

Review limit reached

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

devin-ai-integration Bot Jun 19, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

qkal commented Jun 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Validation

Notes

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qkal commented Jun 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading