Skip to content

fix: three Windows-only bugs in v2.3.1 (#239)#240

Open
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:fix/windows-v231-bugs
Open

fix: three Windows-only bugs in v2.3.1 (#239)#240
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:fix/windows-v231-bugs

Conversation

@azizur100389
Copy link
Copy Markdown
Contributor

Summary

Three distinct, deterministic bugs that all reproduce on clean origin/main on Windows. They are not flakiness — each has a clear root cause and a targeted regression test that fails before the fix and passes after. All three were masking real product bugs behind the "CI is just flaky on Windows" narrative.

Closes #239.


Bug 1 — get_data_dir() writes non-UTF-8 .gitignore on Windows

File: code_review_graph/incremental.py:128-138

inner_gitignore.write_text(...) was called without an encoding= argument. The string literal contains an em-dash (, U+2014). On Windows, Path.write_text uses the system default codepage (cp1252 in most locales), which encodes U+2014 as the single byte 0x97. Anything that later reads the file as UTF-8 — including get_data_dir's own test test_default_uses_repo_subdir — raises:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 in position 38

Fix: Add encoding="utf-8" to the write_text() call. The sibling function _ensure_repo_gitignore at line 170 already uses this — this newer writer was simply missed when the inner .gitignore feature was added.

Regression guardTestDataDir::test_auto_gitignore_is_valid_utf8:

  • Reads the generated file as raw bytes
  • Asserts the em-dash is stored as UTF-8 bytes 0xE2 0x80 0x94
  • Asserts cp1252 byte 0x97 does not appear
  • Round-trips cleanly through strict UTF-8 decoding

Bug 2 — Databricks notebook auto-detection fails on CRLF line endings

File: code_review_graph/parser.py:390-394 (in parse_bytes)

The check was hard-coded to LF:

if language == "python" and source.startswith(
    b"# Databricks notebook source\n",
):

On Windows, git config core.autocrlf=true (the default) rewrites text files to CRLF on checkout. The Databricks fixture starts with b"# Databricks notebook source\r\n" on any Windows checkout, so the check returns False, the file is parsed as regular Python, and all Databricks-specific handling is bypassed:

  • notebook_format metadata is never tagged on the File node
  • # MAGIC %sql cells are not parsed as SQL (no IMPORTS_FROM edges for table references)
  • # MAGIC %md cells are not skipped
  • Per-cell cell_index metadata is missing

This silently breaks four tests in TestDatabricksPyNotebook on Windows:

  • test_detects_databricks_header
  • test_extracts_sql_tables
  • test_skips_magic_md_cells
  • test_cell_index_tracking

Fix: Parse the first line robustly by finding the first b"\n" and stripping any trailing b"\r" before comparing to the exact header:

if language == "python":
    first_newline = source.find(b"\n")
    first_line = (
        source[:first_newline].rstrip(b"\r")
        if first_newline != -1
        else source.rstrip(b"\r")
    )
    if first_line == b"# Databricks notebook source":
        return self._parse_databricks_py_notebook(path, source)

This matches both LF and CRLF endings and is strictly more restrictive than startswith — a file whose first line is "# Databricks notebook source code examples" will correctly not trigger Databricks parsing (regression guard test_databricks_header_prefix_false_positive_rejected locks this in).

Regression guardsTestDatabricksPyNotebook:

  • test_databricks_header_crlf_line_endings — CRLF path (the actual bug)
  • test_databricks_header_lf_line_endings_still_work — LF path unchanged
  • test_databricks_header_prefix_false_positive_rejected — guards against a naive "just use startswith" fix

Bug 3 — Stale FastMCP API in test_heavy_tools_are_coroutines

File: tests/test_main.py:68-71

await crg_main.mcp.get_tools()FastMCP.get_tools() does not exist in fastmcp>=2.14.0 (pinned via fastmcp>=2.14.0,<3 in pyproject.toml). The current API is list_tools(), and it returns MCP protocol Tool pydantic objects that do not expose the underlying Python function at all — so the old lookup pattern cannot be directly ported.

The test dies at runtime with AttributeError on every platform. This means the async regression guard promised by PR #231

There are regression tests (test_heavy_tools_are_coroutines + test_heavy_tool_source_uses_to_thread) that will fail at CI collection time if anyone converts one of the 5 tools back to sync in a future refactor, so we shouldn't hit this class of bug again.

— has been silently inert since the test was merged. A future refactor that converted any of the 5 heavy tools back to sync would not be caught by CI because the guard is already red.

Fix: Mirror the sibling test_heavy_tool_source_uses_to_thread, which resolves each heavy tool by getattr(crg_main, name) — a resilient approach independent of any FastMCP internal surface. Also drop @pytest.mark.asyncio on both guards since they no longer need an event loop.

Regression guardtest_regression_guard_does_not_depend_on_fastmcp_internals:

  • AST-walks the two async-guard functions' source
  • Fails if they reference mcp.get_tools / mcp._tools / mcp.tool_manager / mcp._tool_manager on actual Attribute nodes (docstrings ignored via ast.walk)
  • Fails if any heavy tool cannot be resolved via getattr(crg_main, name)

This is a meta-guard — it protects the guards themselves from future FastMCP API drift.


Test results

Stage Result
Stage 1 — new targeted regression tests 7/7 passed
Stage 2 — test_incremental.py + test_notebook.py + test_main.py 99 passed + 2 xpassed, 2 pre-existing failures in TestFindRepoRoot/TestFindProjectRoot (environmental — not addressed in this PR, see issue comment)
Stage 3 — adjacent tests/test_parser.py (parser.py touched) 67/67 passed
Stage 4 — full suite 743 passed, 165 pre-existing Windows teardown errors (unchanged). +10 net tests and -6 pre-existing Windows failures resolved vs baseline on main
Stage 5 — ruff check on all 5 changed files clean

Why this fix is safe

  • Fix 1 (encoding="utf-8") matches the established pattern in the sibling function in the same file. No API change.
  • Fix 2 uses a more restrictive match than the original (exact first line, not startswith), so it cannot loosen prior behavior — only the CRLF false negative is newly accepted. Prefix false positives are explicitly rejected by a test.
  • Fix 3 is test-only. The underlying product code (the 5 heavy tools) is unchanged. The regression guard is now actually enforced.

Not addressed in this PR (intentional)

Two additional failing tests on Windows (test_returns_none_without_git, test_falls_back_to_start) are true environmental flakiness — they assume no ancestor of tmp_path has a .git directory, which is false on any Windows user whose home directory contains a git repo (e.g. dotfiles). Fixing these requires either a product API change (stop_at parameter on find_repo_root) or invasive monkeypatching. Left for a separate discussion, called out in the tracking issue.

Three distinct, deterministic bugs that all reproduce on clean
origin/main on Windows.  They are NOT flakiness — each has a clear root
cause and a targeted regression test that fails before the fix and
passes after.  All three have been masking real product bugs behind
"CI is just flaky on Windows".

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows
============================================================
File: code_review_graph/incremental.py:128-138

``inner_gitignore.write_text(...)`` was called without an encoding
argument.  The string literal contains an em-dash (U+2014).  On Windows,
Path.write_text uses the system default codepage (cp1252 in most
locales), which encodes U+2014 as the single byte 0x97.  Anything that
later reads the file as UTF-8 — including get_data_dir's own test
`test_default_uses_repo_subdir` — raises
`UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 ...`.

Fix: add encoding="utf-8" to write_text.  Sibling function
_ensure_repo_gitignore at line 170 already uses encoding="utf-8"; this
one was missed when the inner .gitignore writer was added.

Regression guard: TestDataDir::test_auto_gitignore_is_valid_utf8
- reads the generated file as raw bytes
- asserts the em-dash is stored as UTF-8 bytes 0xE2 0x80 0x94
- asserts cp1252 byte 0x97 does NOT appear
- round-trips through strict UTF-8 decoding

Bug 2: Databricks notebook auto-detection fails on CRLF line endings
====================================================================
File: code_review_graph/parser.py:390-394 (in parse_bytes)

The check was hard-coded to LF:
    source.startswith(b"# Databricks notebook source\n")

On Windows, `git config core.autocrlf=true` (the default) rewrites text
files to CRLF on checkout.  The Databricks fixture
`tests/fixtures/sample_databricks_export.py` starts with
`b"# Databricks notebook source\r\n"` on any Windows checkout, so the
check returns False, the file is parsed as regular Python, and ALL
Databricks-specific handling is bypassed: notebook_format metadata is
never tagged, # MAGIC %sql cells are not parsed as SQL, # MAGIC %md
cells are not skipped, and per-cell cell_index metadata is missing.
This silently breaks four tests in TestDatabricksPyNotebook on Windows:
- test_detects_databricks_header
- test_extracts_sql_tables
- test_skips_magic_md_cells
- test_cell_index_tracking

Fix: parse the first line robustly by finding the first b"\n" and
stripping any trailing b"\r" before comparing to the exact header.  This
matches both LF and CRLF endings AND rejects prefix false positives
(e.g. a file whose first line is "# Databricks notebook source code
examples" — only the exact header now triggers Databricks parsing).

Regression guards in TestDatabricksPyNotebook:
- test_databricks_header_crlf_line_endings (CRLF path, the actual bug)
- test_databricks_header_lf_line_endings_still_work (LF path unchanged)
- test_databricks_header_prefix_false_positive_rejected (guard against
  naive "just use startswith" fix)

Bug 3: Stale FastMCP API in test_heavy_tools_are_coroutines
===========================================================
File: tests/test_main.py:68-71

``tools = await crg_main.mcp.get_tools()`` — `FastMCP.get_tools()` does
not exist in fastmcp>=2.14.0 (which pyproject.toml pins via
`fastmcp>=2.14.0,<3`).  The current API is `list_tools()` and it
returns MCP protocol Tool pydantic objects that do NOT expose the
underlying Python function at all, so the old lookup pattern cannot be
directly ported.

The test dies at runtime with AttributeError on EVERY platform, which
means the async regression guard promised by PR tirth8205#231 ("There are
regression tests that will fail at CI collection time if anyone
converts one of the 5 tools back to sync") has been silently inert
since the test was merged.  A future refactor converting any of the
5 heavy tools back to sync would NOT be caught by CI because the
guard is already red.

Fix: mirror the sibling test `test_heavy_tool_source_uses_to_thread`,
which resolves each heavy tool by ``getattr(crg_main, name)`` — a
resilient approach that does not depend on any FastMCP internal
surface.  Also drop @pytest.mark.asyncio on both guards since they no
longer need an event loop.

Regression guard:
test_regression_guard_does_not_depend_on_fastmcp_internals
- AST-walks the two guard functions' source
- fails if they reference mcp.get_tools / mcp._tools / mcp.tool_manager
  / mcp._tool_manager on actual Attribute nodes (docstrings ignored)
- fails if any heavy tool cannot be resolved via getattr(crg_main, name)

Test results
============
Stage 1 (new targeted regression tests):      7/7 passed.
Stage 2 (tests/test_incremental.py + test_notebook.py + test_main.py):
  99 passed + 2 xpassed, 2 pre-existing failures in TestFindRepoRoot /
  TestFindProjectRoot that are environmental (user's home dir
  contains .git; walk finds it).  Intentionally NOT fixed by this PR —
  worth a separate discussion, see the tracking issue.
Stage 3 (tests/test_parser.py adjacent — parser.py touched): 67/67.
Stage 4 (full suite): 743 passed, 2 unrelated find_repo_root failures,
  165 pre-existing Windows file-lock teardown errors.  That's +10 net
  tests and -6 pre-existing failures resolved compared to baseline on
  main.
Stage 5 (ruff check on all 5 changed files): clean.

Why this fix is safe
====================
- Fix 1 (encoding="utf-8") matches the established pattern in the
  sibling function in the same file.  No API change.
- Fix 2 uses a more restrictive match than the original (exact first
  line, not startswith), so it CANNOT loosen the prior behavior — only
  the CRLF false negative is newly accepted.  Prefix false positives
  are explicitly rejected by a test.
- Fix 3 is test-only.  The underlying product code (the 5 heavy tools)
  is unchanged.  The regression guard is now actually enforced.

See the umbrella tracking issue for the full root-cause analysis.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Three Windows-only bugs in v2.3.1 (CRLF Databricks detection, UTF-8 gitignore, stale FastMCP test API)

1 participant