feat: add dry_run preview to apply_refactor_tool (#176) by azizur100389 · Pull Request #226 · tirth8205/code-review-graph

azizur100389 · 2026-04-11T20:02:38Z

Summary

Adds a dry_run: bool = False parameter to apply_refactor_tool so callers can preview a unified diff of every edited file without writing to disk. When dry_run=True the pending refactor is kept in the cache so the caller can review the diffs and call the tool again with the same refactor_id to actually apply — the two-step commit pattern the maintainer explicitly validated in the issue thread.

Closes #176.

Root cause

apply_refactor_tool wrote changes to disk immediately with no confirmation step. In agentic workflows where multiple MCP tools are chained, this tool could be called as a side effect of a broader task the developer didn't intend to modify files for — and there was no undo path beyond git checkout.

Maintainer validation from the issue thread:

Valid security/UX concern. apply_refactor_tool currently writes changes to disk immediately with no confirmation step. The suggested two-step approach (dry_run first, then confirmed=true to apply) is the right pattern for agentic workflows...

Fix

code_review_graph/refactor.py — apply_refactor() gains dry_run: bool = False. When True, compute a unified diff via stdlib difflib.unified_diff between the original file content and the post-edit content, collect per-file entries in a diffs list, and return without writing. The preview is not popped from _pending_refactors in dry-run mode so a follow-up real apply works.
code_review_graph/tools/refactor_tools.py — apply_refactor_func() threads dry_run through.
code_review_graph/main.py — apply_refactor_tool MCP decorator exposes dry_run to clients with an updated docstring.
Path-traversal validation and expiry checks still run in dry-run mode — dry-run is not an escape hatch for security.

Return shape

The non-dry-run path is unchanged (no new keys) so no existing caller breaks. Dry-run adds two keys:

{
    "status": "ok",
    "applied": 0,
    "files_modified": [],
    "edits_applied": 0,
    "dry_run": True,
    "diffs": [{"file": "...", "diff": "--- a/...\n+++ b/...\n@@ ..."}]
}

Tests added (`tests/test_refactor.py::TestApplyRefactor`)

test_apply_refactor_dry_run_returns_diff — valid unified diff, file on disk untouched, preview still cached
test_apply_refactor_dry_run_then_apply — dry-run followed by real apply with the same refactor_id writes the file and consumes the preview
test_apply_refactor_dry_run_blocks_path_traversal — path-traversal check still fires
test_apply_refactor_dry_run_rejects_expired — expired previews still rejected

Test results

Stage	Result
Stage 1 — new targeted tests	4/4 passed
Stage 2 — `tests/test_refactor.py` full	26 passed, 16 pre-existing Windows file-lock teardown errors (verified identical on unchanged `main`)
Stage 3 — adjacent `tests/test_tools.py`	62 passed, 15 pre-existing Windows teardown errors
Stage 4 — full suite	699 passed, 6 pre-existing Windows failures in `test_incremental`/`test_notebook` (verified identical on `main`)
Stage 5 — `ruff check` on all 4 touched files	clean
Stage 6 — MCP tool-wrapper smoke test	dry_run=True returns valid diff without writing; dry_run=False after dry-run writes and consumes

Zero regressions. All pre-existing test issues are Windows-specific file-lock teardown failures that reproduce on unchanged main.

Why this fix is safe

No new dependency — difflib is stdlib.
Non-dry-run return shape is unchanged; existing callers and tests work unmodified.
Dry-run preserves the refactor_id so a real apply can follow without re-running rename_preview.
Security invariants (path traversal, expiry) are enforced in both modes.

Add a dry_run: bool = False parameter to apply_refactor_tool so callers can preview a unified diff of every edited file without touching disk. When dry_run is True, the pending refactor is left in the cache so the caller can review the diffs and invoke the tool again with the same refactor_id to actually apply. This is the two-step commit pattern the maintainer explicitly validated in the issue thread. Root cause ---------- apply_refactor_tool wrote changes to disk immediately with no confirmation step. Agentic workflows that chain MCP tool calls could trigger file rewrites as a side effect of a broader task the developer did not intend to modify files for. There was no undo path beyond `git checkout`, and unstaged work could be lost. Fix --- 1. refactor.py::apply_refactor gains `dry_run: bool = False`. When true, compute a unified diff via stdlib difflib.unified_diff between the original file content and the post-edit content, collect per-file entries in a `diffs` list, and return without writing. The preview is NOT popped from _pending_refactors in dry-run mode so the caller can apply it afterwards. 2. tools/refactor_tools.py::apply_refactor_func threads the flag through. 3. main.py::apply_refactor_tool exposes dry_run to MCP clients. 4. Path-traversal validation and expiry checks still run in dry-run mode so dry-run is not an escape hatch for security. Return shape ------------ Non-dry-run path is unchanged (no new keys) so no existing caller breaks. Dry-run adds `dry_run: True` and `diffs: [{file, diff}, ...]`. Tests added (tests/test_refactor.py::TestApplyRefactor) ------------------------------------------------------- - test_apply_refactor_dry_run_returns_diff: valid unified diff returned, file unchanged on disk, preview still cached. - test_apply_refactor_dry_run_then_apply: dry-run followed by real apply on the same refactor_id writes the file and consumes the preview. - test_apply_refactor_dry_run_blocks_path_traversal: path-traversal check still fires in dry-run mode. - test_apply_refactor_dry_run_rejects_expired: expired previews still rejected in dry-run mode. Test results ------------ Stage 1 (new targeted tests): 4/4 passed. Stage 2 (tests/test_refactor.py): 26 passed, 16 pre-existing Windows file-lock teardown errors (verified identical on unchanged main). Stage 3 (tests/test_tools.py adjacent): 62 passed, 15 pre-existing Windows teardown errors. Stage 4 (full suite): 699 passed, 6 pre-existing Windows failures in test_incremental/test_notebook (verified identical on main). Stage 5 (ruff lint on all 4 touched files): clean. Stage 6 (MCP tool wrapper smoke test): dry_run=True returns valid diff without writing; dry_run=False after dry-run writes and consumes. Zero regressions from this change. Why this fix is safe -------------------- - No new dependency (difflib is stdlib). - Non-dry-run return shape is unchanged; all existing callers and tests work with no modification. - Dry-run preserves the refactor_id so a real apply can follow without re-running rename_preview. - Security invariants (path traversal, expiry) still enforced in both modes.

Register .sh / .bash / .zsh / .ksh with tree-sitter-bash. Extract File and Function nodes for shell function definitions, CALLS edges for command invocations (both to local functions and external binaries), and IMPORTS_FROM edges for `source` / `.` dot-includes. Test functions (test_* prefix) are classified as Test kind. Root cause of tirth8205#197 ------------------ Shell-script-heavy repos (installers, infra, CI tooling) previously indexed to 0 nodes / 0 edges because .sh was not in EXTENSION_TO_LANGUAGE. build_or_update_graph reported `parsed 0 files` for FragHub-style repos, which blocked architecture mapping and flow detection for the category of projects that most need them. What's supported ---------------- * Function definitions in both tree-sitter-bash shapes: foo() { ... } function foo { ... } function foo() { ... } * Call extraction for `command` nodes (tree-sitter-bash wraps every invocation in a `command` node with a `command_name` > `word` child). Local calls resolve to qualified names; external binaries (curl, grep, awk) are recorded with their raw name, consistent with how unresolved calls work in other languages. * `source lib.sh` and `. lib.sh` emit IMPORTS_FROM edges (both bareword and quoted-string arguments handled). * `test_*` prefix -> Test kind via the existing _TEST_PATTERNS list. Implementation -------------- 1. EXTENSION_TO_LANGUAGE: +4 shell extensions -> "bash" 2. _CLASS_TYPES["bash"] = [] (no class concept in bash) 3. _FUNCTION_TYPES["bash"] = ["function_definition"] 4. _IMPORT_TYPES["bash"] = [] (source handled via constructs handler) 5. _CALL_TYPES["bash"] = ["command"] 6. New _extract_bash_constructs() + _bash_get_source_target() helpers following the _extract_lua_constructs pattern. Dispatched next to the existing Lua/Luau handler. 7. _get_name: new "bash" branch that reads the first `word` child of a function_definition (handles both paren and function-keyword forms). 8. _get_call_name: new "bash" branch that reads command_name > word. Scope caveats documented in PR body ----------------------------------- * No cross-file call resolution beyond direct source includes (bash has no formal module system). * Calls whose name is a variable expansion ($foo, `$(cmd)`) are not statically resolvable and are skipped by _get_call_name. * Pipelines, subshells, heredocs are parsed by tree-sitter but not semantically linked yet. Tests added (tests/test_multilang.py::TestBashParsing — 10 tests) ----------------------------------------------------------------- - test_detects_language (.sh, .bash, .zsh) - test_finds_function_definitions_paren_form - test_finds_function_definitions_function_keyword_form - test_finds_source_imports (source + . + quoted string) - test_finds_calls_between_local_functions - test_finds_external_command_calls - test_finds_contains_edges - test_detects_test_functions - test_nodes_have_bash_language - test_calls_inside_functions New fixture: tests/fixtures/sample.sh exercises both function-definition forms, both import forms (bareword + quoted), local + external call mix, and test_* naming. Test results ------------ Stage 1 (new targeted tests): 10/10 passed. Stage 2 (tests/test_multilang.py full): 140/140 passed — no regressions across any language. Stage 3 (tests/test_parser.py adjacent): 67/67 passed. Stage 4 (full suite): 705 passed (up from 699 baseline — +6 new), 6 pre-existing Windows failures in test_incremental/test_notebook (verified identical on unchanged main in PR tirth8205#226 work). Stage 5 (ruff check): clean. Stage 6 (fixture smoke parse): 8 nodes, 26 edges. All expected function, call, import, and test nodes present. Zero regressions. All new code lives behind the bash language check so existing languages are untouched.

azizur100389 · 2026-04-11T22:12:05Z

Closing this — I just noticed v2.3.0 ships the dry-run preview as part of your super-PR #228. My implementation and the shipped one converge on the same two-step pattern (dry_run=True returns a unified diff via difflib, pending refactor is retained so a follow-up dry_run=False call applies it), so nothing to salvage here.

Thanks for landing it — glad the pattern matched what you had in mind. I've moved on to a smaller follow-up for the .ksh extension you flagged in #230 (tracked in #235, PR #236).

azizur100389 force-pushed the feat/refactor-dry-run branch from 4aff6b2 to 69abec6 Compare April 11, 2026 20:49

azizur100389 closed this Apr 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add dry_run preview to apply_refactor_tool (#176)#226

feat: add dry_run preview to apply_refactor_tool (#176)#226
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:feat/refactor-dry-run

azizur100389 commented Apr 11, 2026

Uh oh!

azizur100389 commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

azizur100389 commented Apr 11, 2026

Summary

Root cause

Fix

Return shape

Tests added (tests/test_refactor.py::TestApplyRefactor)

Test results

Why this fix is safe

Uh oh!

azizur100389 commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Tests added (`tests/test_refactor.py::TestApplyRefactor`)