Skip to content

Fix CI test failures (T-856, T-857, T-859)#92

Merged
ArjenSchwarz merged 1 commit into
mainfrom
fix/ci-test-failures
Apr 20, 2026
Merged

Fix CI test failures (T-856, T-857, T-859)#92
ArjenSchwarz merged 1 commit into
mainfrom
fix/ci-test-failures

Conversation

@ArjenSchwarz

Copy link
Copy Markdown
Owner

Summary

Fix the four pre-existing test failures that were red on main:

  • T-859 (code bug)cmd/renumber.go looked up marker.AfterTaskID in a map keyed by raw file IDs, but ExtractPhaseMarkers was changed in T-742 to return AfterTaskID as a 1-based sequential count of preceding top-level tasks. The mismatched lookup silently dropped later phase markers in files with non-sequential top-level IDs (multi-phase + gapped IDs would render with wrong anchors or lost headers). Replaced extractTaskIDOrder + the idToPosition map with a direct N → N-1 conversion in convertPhaseMarkersToPositions, dropped the dead helper, and rewrote its unit test for the new contract.
  • T-856 (test bug)internal/task/phase_test.go:1553 still called RenderMarkdownWithPhases with two arguments after T-698 added a phaseSource *TaskList parameter. All other call sites pass nil; the regression test was just missed.
  • T-857 (test bug)resetBatchFlags() reset batchInput but not the format global. Two batch tests (TestBatchCommand_JSONOutput, TestBatchCommand_RemoveOnPhasedFilePreservesPhases) called rootCmd.Execute() with --format json, leaving format = "json" for every subsequent test. TestRunCompleteDryRun then failed because it depends on the default "table" format. Reset format and the format flag's Changed bit in resetBatchFlags, and register t.Cleanup(resetBatchFlags) on the two leaking tests.
  • TestResolveFilename (test bug) — the "no filename provided" subtest claimed discovery is "disabled by default", but Discovery.Enabled defaults to true. The test passed only when the host branch + filesystem produced no discovery match; on CI for branches like T-824/homebrew-install (with specs/homebrew-install/tasks.md present), discovery succeeded and the test failed. Now isolated in a fresh git repo with discovery.enabled: false.

Also updates docs/agent-notes/testing.md with the new resetBatchFlags contract, the discovery-isolation requirement, and a note on the ExtractPhaseMarkers / renumber.go invariant.

Test plan

  • make check (fmt + lint + unit tests) passes locally
  • make test-integration passes locally
  • CI green on the PR (this is the point)

- T-859 (code): cmd/renumber.go was building an idToPosition map from raw
  file task IDs, but ExtractPhaseMarkers (changed in T-742) now returns
  AfterTaskID as a 1-based sequential count of preceding top-level tasks.
  The mismatched lookup silently dropped later phase markers in files with
  non-sequential top-level IDs. Replace extractTaskIDOrder + idToPosition
  with a direct N→N-1 conversion in convertPhaseMarkersToPositions, drop
  the dead helper, and rewrite TestConvertPhaseMarkersToPositions for the
  new contract.
- T-856 (test): internal/task/phase_test.go:1553 still called
  RenderMarkdownWithPhases with two arguments after T-698 added a
  phaseSource *TaskList parameter for filtered phase rendering. All other
  callers pass nil; the regression test was simply missed. Pass nil.
- T-857 (test): resetBatchFlags() reset batchInput but not the format
  global, so any batch test that called rootCmd.Execute() with
  --format json (TestBatchCommand_JSONOutput,
  TestBatchCommand_RemoveOnPhasedFilePreservesPhases) left format = "json"
  for every subsequent test in the package, breaking
  TestRunCompleteDryRun which depends on the default. Reset format and
  the format flag's Changed bit in resetBatchFlags, and register
  t.Cleanup(resetBatchFlags) on the two leaking tests.
- TestResolveFilename (test): the "no filename provided" case asserted
  that discovery is "disabled by default", but Discovery.Enabled defaults
  to true. The test happened to pass when the host branch+filesystem
  produced no discovery match; on CI for branches like
  T-824/homebrew-install (with specs/homebrew-install/tasks.md present)
  discovery succeeded and the test failed. Isolate the test in a fresh
  git repo with discovery.enabled: false so the outcome is deterministic.

Also update docs/agent-notes/testing.md with the new resetBatchFlags
contract, the discovery-isolation requirement, and a note on the
ExtractPhaseMarkers / renumber.go invariant.
@claude

claude Bot commented Apr 20, 2026

Copy link
Copy Markdown

Code Review

Overall this is a well-structured, well-documented fix for four pre-existing CI failures. Each bug is clearly identified, the root causes are accurately diagnosed, and the changes are minimal and targeted.


T-859 — cmd/renumber.go (code bug fix)

Verdict: correct and cleaner than the code it replaces.

The old approach extracted raw file IDs via regex, built an idToPosition map, then looked up AfterTaskID values that ExtractPhaseMarkers had already normalized to sequential counts — a guaranteed mismatch for any file with non-sequential top-level IDs. The new N-1 conversion is both correct and far simpler (~40 lines removed).

One minor point: when AfterTaskID is non-numeric or n < 1, convertPhaseMarkersToPositions silently falls back to -1 (before all tasks). This is a reasonable default, but if ExtractPhaseMarkers ever emits a malformed value due to a future bug, the renumber command will silently misplace phase markers with no diagnostic. A log.Printf or structured warning at debug verbosity would make this easier to trace. Not a blocker.

The edge-case test for AfterTaskID = "abc" is good. Consider also adding AfterTaskID = "0" to exercise the n < 1 branch explicitly.


T-856 — internal/task/phase_test.go (test bug fix)

Verdict: correct one-line fix.

Adding nil as the phaseSource argument matches every other call site. Nothing to add.


T-857 — cmd/batch_test.go (test bug fix)

Verdict: correct, and the t.Cleanup pattern is better than manual cleanup.

Resetting both format and its Changed bit in resetBatchFlags is exactly right. t.Cleanup runs even on panic, which is strictly safer than a deferred call at the top of the test.

TestBatchCommand_RemoveOnPhasedFilePreservesPhases now calls resetBatchFlags() directly at the start and registers t.Cleanup(resetBatchFlags). The intent is clear (clean before + clean after), but the comment resetBatchFlags() at line 618 is slightly misleading if readers expect only one of the two patterns. A brief inline comment like // reset before + register cleanup after would prevent confusion.


TestResolveFilenamecmd/next_test.go (test isolation)

Verdict: correct approach, minor robustness note.

Using a fresh git init + .rune.yml with discovery.enabled: false is the right way to make this deterministic. The config.ResetConfigCache() bookend in t.Cleanup is important and present.

Two observations:

  1. os.Chdir is global process state. If any other test in the package ever calls t.Parallel(), this will be a data race. The current codebase doesn't appear to use t.Parallel(), so it's safe now, but a comment noting "do not parallelize this test" would reduce the risk of a future maintainer breaking it.
  2. If git is not in $PATH (unusual but possible in minimal CI containers), exec.Command("git", ...) will produce an opaque failure. The existing t.Fatalf message is clear enough; no change needed.

docs/agent-notes/testing.md

The rewrite is clear and actionable. Removing the now-resolved "Current Known Test Failures" section is correct. The new sections on config/discovery isolation and the ExtractPhaseMarkers invariant are exactly the kind of tribal knowledge that prevents the same bug classes from recurring.


Summary

Area Verdict
T-859 code fix ✅ Correct, significantly cleaner
T-856 test fix ✅ Trivial one-liner, correct
T-857 state leak fix ✅ Correct; minor comment suggestion
Discovery isolation ✅ Correct; minor parallelism note
Docs update ✅ Well-written, good future-proofing
Test coverage ✅ New edge-case test added; AfterTaskID = "0" worth adding
Security ✅ No concerns
Performance ✅ Slight improvement (regex+map → O(1) arithmetic)

No blocking issues. The fallback-without-warning behavior in convertPhaseMarkersToPositions and the missing "0" test case are the only things worth a follow-up, and neither is urgent.

@ArjenSchwarz

Copy link
Copy Markdown
Owner Author

PR Review Overview - Iteration 1

PR: #92 | Branch: fix/ci-test-failures | Date: 2026-04-20

CI Status

All 12 checks PASSED (Test and Build, claude-review, all Integration Tests across ubuntu/macos/windows × 1.0.0/latest, caching, error handling). The PR achieves its stated goal of turning CI green.

Valid Issues (fixed)

None — review found no blocking issues.

Invalid/Skipped Issues (Minor — deferred)

Add AfterTaskID = "0" test case

  • Reviewer: @claude
  • Comment: New convertPhaseMarkersToPositions test covers "abc" (non-numeric) but not "0" (the n < 1 branch).
  • Reason: Minor coverage gap. Both branches in if err != nil || n < 1 already collapse to the same -1 fallback, and the "abc" case exercises the early err != nil short-circuit. Adding "0" would only document intent. Deferred.

Inline comment clarifying reset+cleanup pattern

  • Reviewer: @claude
  • Comment: TestBatchCommand_RemoveOnPhasedFilePreservesPhases calls resetBatchFlags() directly and registers t.Cleanup(resetBatchFlags); a comment would help.
  • Reason: Stylistic. Both calls are intentional (existing pre-call protected against state from earlier tests; new cleanup prevents leaking to later tests). Skipped.

Note that TestResolveFilename cannot be parallelized

  • Reviewer: @claude
  • Comment: The test uses os.Chdir, which is process-global; parallelizing would race.
  • Reason: True but theoretical — the package doesn't use t.Parallel() anywhere. The test would already break under parallelism for unrelated reasons (shared format, dryRun, configCache). Skipped as a class-wide concern, not specific to this test.

Optional warning when convertPhaseMarkersToPositions falls back to -1

  • Reviewer: @claude
  • Comment: A debug log would aid future diagnosis.
  • Reason: Reviewer explicitly said "Not a blocker." Skipped.

@claude

claude Bot commented Apr 20, 2026

Copy link
Copy Markdown

Claude finished @ArjenSchwarz's task in 43s —— View job


PR Review — Iteration 1 Response

Thanks for the disposition summary. All four deferred items are well-reasoned — agreed with the conclusions:

  • AfterTaskID = "0" test case — The "abc" case already exercises the same -1 fallback path; adding "0" would be documentation only. Deferred is correct.
  • Inline reset+cleanup comment — The intent is clear from reading the two lines together. Skipping is fine.
  • t.Parallel() note — Accurate but theoretical given the package's current state. A class-wide concern rather than specific to this test.
  • Debug log in convertPhaseMarkersToPositions — Not a blocker; the fallback is intentional and the function is small enough to trace quickly.

Review complete. No outstanding issues.

The PR achieves its stated goal — all four pre-existing CI failures fixed, CI fully green, and the testing.md docs updated to prevent recurrence. Clean, minimal, well-targeted changes.

If you'd like any of the deferred items tracked, I can file follow-up issues for them — just say the word.


@ArjenSchwarz ArjenSchwarz merged commit 60b54e0 into main Apr 20, 2026
12 checks passed
@ArjenSchwarz ArjenSchwarz deleted the fix/ci-test-failures branch April 20, 2026 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant