From accff81d868ecf1dfc05827655a51f43c2fade48 Mon Sep 17 00:00:00 2001 From: azizur100389 Date: Sun, 12 Apr 2026 00:45:00 +0100 Subject: [PATCH] fix(incremental): add stop_at to find_repo_root / find_project_root (#241) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two tests in tests/test_incremental.py have been failing on any developer machine whose tmp_path has a git-initialized ancestor: tests/test_incremental.py::TestFindRepoRoot::test_returns_none_without_git tests/test_incremental.py::TestFindProjectRoot::test_falls_back_to_start Root cause ---------- pytest's tmp_path lives under $TMPDIR, which on Windows 11 resolves to ``C:/Users//AppData/Local/Temp/pytest-of-/...``. If the user has ever run ``git init`` anywhere under their home directory (dotfiles repo, chezmoi, yadm, a bare ``git init ~`` — all very common on developer machines), then ``find_repo_root()`` walking up from ``tmp_path/no_git`` finds the ancestor .git and returns ``C:/Users/`` instead of ``None``. The same failure mode affects Linux and macOS users with a git-initialized home directory. ``find_repo_root`` has no way to bound its ancestor walk — it always climbs to the filesystem root. This is correct production behavior (a nested script inside a git repo *should* find its repo) but makes the function un-testable in environments where the tmp path sits inside a git-initialized ancestor. Fix --- Add an optional ``stop_at: Path | None = None`` parameter to both ``find_repo_root()`` and ``find_project_root()``. When provided, the walk examines ``stop_at`` for a .git directory and then stops without crossing above it. ``stop_at=None`` preserves the existing "walk to filesystem root" semantics, so this is a zero-risk backward-compatible API addition. All 7 production callers in cli.py (x4), tools/refactor_tools.py (x1), tools/_common.py (x1), and the internal call in find_project_root pass no argument for stop_at — their behavior is unchanged. The two tests above are updated to pass ``stop_at=tmp_path`` so the walk is bounded to the test sandbox. ``test_falls_back_to_start`` is also hardened with ``monkeypatch.delenv("CRG_REPO_ROOT", raising=False)`` so a developer env var can no longer shadow the test's expectation. stop_at is also a legitimate production API — callers that want to bound the ancestor walk (multi-repo orchestrators, CI containers with bind-mounted volumes, embedded sandboxes, Docker build contexts) can now do so without resorting to monkeypatching. Tests added / updated --------------------- 1. TestFindRepoRoot::test_returns_none_without_git - Now passes ``stop_at=tmp_path`` (the actual fix for the flakiness). 2. TestFindRepoRoot::test_stop_at_prevents_escape_to_outer_git [NEW] - Positive regression test: when an outer directory has .git, an unbounded walk finds it but a bounded walk with stop_at=inner returns None. Locks in the new parameter's semantics. 3. TestFindRepoRoot::test_stop_at_finds_git_at_boundary [NEW] - Regression test: stop_at must NOT suppress a .git that lives at the boundary itself. The walk examines stop_at for .git before stopping, so a repo rooted exactly at the boundary is still found. 4. TestFindProjectRoot::test_falls_back_to_start - Now passes stop_at=tmp_path AND monkeypatches CRG_REPO_ROOT. 5. TestFindProjectRoot::test_stop_at_forwarded_to_find_repo_root [NEW] - Regression test: find_project_root must forward stop_at to find_repo_root, not silently drop it. Also verifies that when the bounded walk finds nothing, the fall-through to "return start" still works correctly. Test results ------------ Stage 1 (TestFindRepoRoot + TestFindProjectRoot): 8/8 passed (up from 6/8 on unchanged main — the 2 previously-flaky tests and the 3 new regression tests all green). Stage 2 (tests/test_incremental.py full): 48 passed, 1 pre-existing failure (TestDataDir::test_default_uses_repo_subdir — UTF-8 gitignore bug fixed on separate branch in the PR that closes #239). Stage 3 (tests/test_tools.py adjacent — find_project_root used there): 62 passed, 15 pre-existing Windows file-lock teardown errors. Stage 4 (full suite): 737 passed, 6 pre-existing Windows failures (all covered by the parallel #239 fix PR — none are regressions from this change), 165 pre-existing Windows teardown errors. Net effect: this PR independently removes 2 failures from the baseline (8 -> 6); once both PRs merge, all 8 pre-existing Windows failures will be resolved. Stage 5 (ruff check on incremental.py + test_incremental.py): clean. Zero regressions. Fully backward-compatible. --- code_review_graph/incremental.py | 40 +++++++++++++++--- tests/test_incremental.py | 69 ++++++++++++++++++++++++++++++-- 2 files changed, 101 insertions(+), 8 deletions(-) diff --git a/code_review_graph/incremental.py b/code_review_graph/incremental.py index cfa672c0..22848309 100644 --- a/code_review_graph/incremental.py +++ b/code_review_graph/incremental.py @@ -71,19 +71,44 @@ ] -def find_repo_root(start: Path | None = None) -> Optional[Path]: - """Walk up from start to find the nearest .git directory.""" +def find_repo_root( + start: Path | None = None, + stop_at: Path | None = None, +) -> Optional[Path]: + """Walk up from ``start`` to find the nearest ``.git`` directory. + + Args: + start: Starting directory. Defaults to ``Path.cwd()``. + stop_at: Optional boundary — if provided, the walk examines + ``stop_at`` for a ``.git`` directory and then stops without + crossing above it. Useful for tests that create a synthetic + repo under ``tmp_path`` (so the walk does not accidentally + climb into a developer's home-directory dotfiles repo) and + for any production caller that wants to bound the ancestor + walk — e.g. multi-repo orchestrators, CI containers with + bind-mounted volumes, embedded sandboxes. See #241. + + Returns: + The first ancestor containing ``.git``, or ``None`` if no + ancestor up to and including ``stop_at`` (when set) or the + filesystem root (when ``stop_at is None``) contains one. + """ current = start or Path.cwd() while current != current.parent: if (current / ".git").exists(): return current + if stop_at is not None and current == stop_at: + return None current = current.parent if (current / ".git").exists(): return current return None -def find_project_root(start: Path | None = None) -> Path: +def find_project_root( + start: Path | None = None, + stop_at: Path | None = None, +) -> Path: """Find the project root. Resolution order (highest precedence first): @@ -91,15 +116,20 @@ def find_project_root(start: Path | None = None) -> Path: 1. ``CRG_REPO_ROOT`` environment variable — explicit override for anyone scripting the CLI from outside the repo (CI jobs, daemons, multi-repo orchestrators). See: #155 - 2. Git repository root via :func:`find_repo_root` from ``start``. + 2. Git repository root via :func:`find_repo_root` from ``start``, + honoring ``stop_at`` if provided. 3. ``start`` itself (or cwd if no start given). + + ``stop_at`` is forwarded to :func:`find_repo_root` so callers that + want to bound the ancestor walk (typically tests; see #241) can do so + without having to call ``find_repo_root`` directly. """ env_override = os.environ.get("CRG_REPO_ROOT", "").strip() if env_override: p = Path(env_override).expanduser().resolve() if p.exists(): return p - root = find_repo_root(start) + root = find_repo_root(start, stop_at=stop_at) if root: return root return start or Path.cwd() diff --git a/tests/test_incremental.py b/tests/test_incremental.py index d438ccb0..2fe5a12b 100644 --- a/tests/test_incremental.py +++ b/tests/test_incremental.py @@ -35,9 +35,48 @@ def test_finds_parent_git_dir(self, tmp_path): assert find_repo_root(sub) == tmp_path def test_returns_none_without_git(self, tmp_path): + """No .git between ``sub`` and ``tmp_path`` -> None. + + Bounded with ``stop_at=tmp_path`` so the walk does not climb into + ancestors outside the test sandbox. On Windows in particular, + ``tmp_path`` lives under ``C:/Users//AppData/Local/Temp/...`` + and if the user has ``git init`` anywhere under their home (dotfiles, + chezmoi, etc.) the unbounded walk would find that ancestor .git and + the test would fail for reasons unrelated to the product. See #241. + """ sub = tmp_path / "no_git" sub.mkdir() - assert find_repo_root(sub) is None + assert find_repo_root(sub, stop_at=tmp_path) is None + + def test_stop_at_prevents_escape_to_outer_git(self, tmp_path): + """Positive regression test for #241: ``stop_at`` must halt the + walk even when an ancestor *does* contain ``.git``. + + Without ``stop_at`` the walk correctly finds the outer .git; with + ``stop_at=inner`` the walk is bounded and returns None. + """ + outer = tmp_path / "outer" + outer.mkdir() + (outer / ".git").mkdir() + inner = outer / "inner" + inner.mkdir() + + # Unbounded walk finds the ancestor .git (existing behavior). + assert find_repo_root(inner) == outer + + # Bounded walk stops at ``inner`` and never climbs to ``outer``. + assert find_repo_root(inner, stop_at=inner) is None + + def test_stop_at_finds_git_at_boundary(self, tmp_path): + """stop_at does not suppress a .git that lives *at* the boundary.""" + boundary = tmp_path / "boundary" + boundary.mkdir() + (boundary / ".git").mkdir() + inner = boundary / "inner" + inner.mkdir() + + # The walk examines ``boundary`` and finds the .git before stopping. + assert find_repo_root(inner, stop_at=boundary) == boundary class TestFindProjectRoot: @@ -45,10 +84,34 @@ def test_returns_git_root(self, tmp_path): (tmp_path / ".git").mkdir() assert find_project_root(tmp_path) == tmp_path - def test_falls_back_to_start(self, tmp_path): + def test_falls_back_to_start(self, tmp_path, monkeypatch): + """With no .git and no env override, find_project_root returns ``sub``. + + Bounded with ``stop_at=tmp_path`` to prevent the ancestor walk from + escaping the test sandbox (see #241), and ``CRG_REPO_ROOT`` is + cleared so a developer env var cannot shadow the test expectation. + """ + monkeypatch.delenv("CRG_REPO_ROOT", raising=False) sub = tmp_path / "no_git" sub.mkdir() - assert find_project_root(sub) == sub + assert find_project_root(sub, stop_at=tmp_path) == sub + + def test_stop_at_forwarded_to_find_repo_root(self, tmp_path, monkeypatch): + """Positive regression test for #241: find_project_root must forward + stop_at to find_repo_root, not silently drop it.""" + monkeypatch.delenv("CRG_REPO_ROOT", raising=False) + outer = tmp_path / "outer" + outer.mkdir() + (outer / ".git").mkdir() + inner = outer / "inner" + inner.mkdir() + + # Without stop_at, find_project_root climbs to outer (existing behavior). + assert find_project_root(inner) == outer + + # With stop_at=inner, the walk is bounded and find_project_root falls + # back to its third resolution rule (the start path itself). + assert find_project_root(inner, stop_at=inner) == inner class TestGetDbPath: