feat: add dry_run preview to apply_refactor_tool (#176)#226
Closed
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
Closed
feat: add dry_run preview to apply_refactor_tool (#176)#226azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
Conversation
Add a dry_run: bool = False parameter to apply_refactor_tool so callers
can preview a unified diff of every edited file without touching disk.
When dry_run is True, the pending refactor is left in the cache so the
caller can review the diffs and invoke the tool again with the same
refactor_id to actually apply. This is the two-step commit pattern the
maintainer explicitly validated in the issue thread.
Root cause
----------
apply_refactor_tool wrote changes to disk immediately with no confirmation
step. Agentic workflows that chain MCP tool calls could trigger file
rewrites as a side effect of a broader task the developer did not intend
to modify files for. There was no undo path beyond `git checkout`, and
unstaged work could be lost.
Fix
---
1. refactor.py::apply_refactor gains `dry_run: bool = False`. When true,
compute a unified diff via stdlib difflib.unified_diff between the
original file content and the post-edit content, collect per-file
entries in a `diffs` list, and return without writing. The preview
is NOT popped from _pending_refactors in dry-run mode so the caller
can apply it afterwards.
2. tools/refactor_tools.py::apply_refactor_func threads the flag through.
3. main.py::apply_refactor_tool exposes dry_run to MCP clients.
4. Path-traversal validation and expiry checks still run in dry-run mode
so dry-run is not an escape hatch for security.
Return shape
------------
Non-dry-run path is unchanged (no new keys) so no existing caller breaks.
Dry-run adds `dry_run: True` and `diffs: [{file, diff}, ...]`.
Tests added (tests/test_refactor.py::TestApplyRefactor)
-------------------------------------------------------
- test_apply_refactor_dry_run_returns_diff: valid unified diff returned,
file unchanged on disk, preview still cached.
- test_apply_refactor_dry_run_then_apply: dry-run followed by real apply
on the same refactor_id writes the file and consumes the preview.
- test_apply_refactor_dry_run_blocks_path_traversal: path-traversal
check still fires in dry-run mode.
- test_apply_refactor_dry_run_rejects_expired: expired previews still
rejected in dry-run mode.
Test results
------------
Stage 1 (new targeted tests): 4/4 passed.
Stage 2 (tests/test_refactor.py): 26 passed, 16 pre-existing Windows
file-lock teardown errors (verified identical on unchanged main).
Stage 3 (tests/test_tools.py adjacent): 62 passed, 15 pre-existing
Windows teardown errors.
Stage 4 (full suite): 699 passed, 6 pre-existing Windows failures in
test_incremental/test_notebook (verified identical on main).
Stage 5 (ruff lint on all 4 touched files): clean.
Stage 6 (MCP tool wrapper smoke test): dry_run=True returns valid diff
without writing; dry_run=False after dry-run writes and consumes.
Zero regressions from this change.
Why this fix is safe
--------------------
- No new dependency (difflib is stdlib).
- Non-dry-run return shape is unchanged; all existing callers and tests
work with no modification.
- Dry-run preserves the refactor_id so a real apply can follow without
re-running rename_preview.
- Security invariants (path traversal, expiry) still enforced in both
modes.
4aff6b2 to
69abec6
Compare
azizur100389
added a commit
to azizur100389/code-review-graph
that referenced
this pull request
Apr 11, 2026
Register .sh / .bash / .zsh / .ksh with tree-sitter-bash. Extract File and Function nodes for shell function definitions, CALLS edges for command invocations (both to local functions and external binaries), and IMPORTS_FROM edges for `source` / `.` dot-includes. Test functions (test_* prefix) are classified as Test kind. Root cause of tirth8205#197 ------------------ Shell-script-heavy repos (installers, infra, CI tooling) previously indexed to 0 nodes / 0 edges because .sh was not in EXTENSION_TO_LANGUAGE. build_or_update_graph reported `parsed 0 files` for FragHub-style repos, which blocked architecture mapping and flow detection for the category of projects that most need them. What's supported ---------------- * Function definitions in both tree-sitter-bash shapes: foo() { ... } function foo { ... } function foo() { ... } * Call extraction for `command` nodes (tree-sitter-bash wraps every invocation in a `command` node with a `command_name` > `word` child). Local calls resolve to qualified names; external binaries (curl, grep, awk) are recorded with their raw name, consistent with how unresolved calls work in other languages. * `source lib.sh` and `. lib.sh` emit IMPORTS_FROM edges (both bareword and quoted-string arguments handled). * `test_*` prefix -> Test kind via the existing _TEST_PATTERNS list. Implementation -------------- 1. EXTENSION_TO_LANGUAGE: +4 shell extensions -> "bash" 2. _CLASS_TYPES["bash"] = [] (no class concept in bash) 3. _FUNCTION_TYPES["bash"] = ["function_definition"] 4. _IMPORT_TYPES["bash"] = [] (source handled via constructs handler) 5. _CALL_TYPES["bash"] = ["command"] 6. New _extract_bash_constructs() + _bash_get_source_target() helpers following the _extract_lua_constructs pattern. Dispatched next to the existing Lua/Luau handler. 7. _get_name: new "bash" branch that reads the first `word` child of a function_definition (handles both paren and function-keyword forms). 8. _get_call_name: new "bash" branch that reads command_name > word. Scope caveats documented in PR body ----------------------------------- * No cross-file call resolution beyond direct source includes (bash has no formal module system). * Calls whose name is a variable expansion ($foo, `$(cmd)`) are not statically resolvable and are skipped by _get_call_name. * Pipelines, subshells, heredocs are parsed by tree-sitter but not semantically linked yet. Tests added (tests/test_multilang.py::TestBashParsing — 10 tests) ----------------------------------------------------------------- - test_detects_language (.sh, .bash, .zsh) - test_finds_function_definitions_paren_form - test_finds_function_definitions_function_keyword_form - test_finds_source_imports (source + . + quoted string) - test_finds_calls_between_local_functions - test_finds_external_command_calls - test_finds_contains_edges - test_detects_test_functions - test_nodes_have_bash_language - test_calls_inside_functions New fixture: tests/fixtures/sample.sh exercises both function-definition forms, both import forms (bareword + quoted), local + external call mix, and test_* naming. Test results ------------ Stage 1 (new targeted tests): 10/10 passed. Stage 2 (tests/test_multilang.py full): 140/140 passed — no regressions across any language. Stage 3 (tests/test_parser.py adjacent): 67/67 passed. Stage 4 (full suite): 705 passed (up from 699 baseline — +6 new), 6 pre-existing Windows failures in test_incremental/test_notebook (verified identical on unchanged main in PR tirth8205#226 work). Stage 5 (ruff check): clean. Stage 6 (fixture smoke parse): 8 nodes, 26 edges. All expected function, call, import, and test nodes present. Zero regressions. All new code lives behind the bash language check so existing languages are untouched.
Contributor
Author
|
Closing this — I just noticed v2.3.0 ships the dry-run preview as part of your super-PR #228. My implementation and the shipped one converge on the same two-step pattern ( Thanks for landing it — glad the pattern matched what you had in mind. I've moved on to a smaller follow-up for the |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a
dry_run: bool = Falseparameter toapply_refactor_toolso callers can preview a unified diff of every edited file without writing to disk. Whendry_run=Truethe pending refactor is kept in the cache so the caller can review the diffs and call the tool again with the samerefactor_idto actually apply — the two-step commit pattern the maintainer explicitly validated in the issue thread.Closes #176.
Root cause
apply_refactor_toolwrote changes to disk immediately with no confirmation step. In agentic workflows where multiple MCP tools are chained, this tool could be called as a side effect of a broader task the developer didn't intend to modify files for — and there was no undo path beyondgit checkout.Maintainer validation from the issue thread:
Fix
code_review_graph/refactor.py—apply_refactor()gainsdry_run: bool = False. WhenTrue, compute a unified diff via stdlibdifflib.unified_diffbetween the original file content and the post-edit content, collect per-file entries in adiffslist, and return without writing. The preview is not popped from_pending_refactorsin dry-run mode so a follow-up real apply works.code_review_graph/tools/refactor_tools.py—apply_refactor_func()threadsdry_runthrough.code_review_graph/main.py—apply_refactor_toolMCP decorator exposesdry_runto clients with an updated docstring.Return shape
The non-dry-run path is unchanged (no new keys) so no existing caller breaks. Dry-run adds two keys:
{ "status": "ok", "applied": 0, "files_modified": [], "edits_applied": 0, "dry_run": True, "diffs": [{"file": "...", "diff": "--- a/...\n+++ b/...\n@@ ..."}] }Tests added (
tests/test_refactor.py::TestApplyRefactor)test_apply_refactor_dry_run_returns_diff— valid unified diff, file on disk untouched, preview still cachedtest_apply_refactor_dry_run_then_apply— dry-run followed by real apply with the samerefactor_idwrites the file and consumes the previewtest_apply_refactor_dry_run_blocks_path_traversal— path-traversal check still firestest_apply_refactor_dry_run_rejects_expired— expired previews still rejectedTest results
tests/test_refactor.pyfullmain)tests/test_tools.pytest_incremental/test_notebook(verified identical onmain)ruff checkon all 4 touched filesZero regressions. All pre-existing test issues are Windows-specific file-lock teardown failures that reproduce on unchanged
main.Why this fix is safe
difflibis stdlib.refactor_idso a real apply can follow without re-runningrename_preview.