Fix CI test failures (T-856, T-857, T-859) #92
- T-859 (code): cmd/renumber.go was building an idToPosition map from raw file task IDs, but ExtractPhaseMarkers (changed in T-742) now returns AfterTaskID as a 1-based sequential count of preceding top-level tasks. The mismatched lookup silently dropped later phase markers in files with non-sequential top-level IDs. Replace extractTaskIDOrder + idToPosition with a direct N→N-1 conversion in convertPhaseMarkersToPositions, drop the dead helper, and rewrite TestConvertPhaseMarkersToPositions for the new contract.
- T-856 (test): internal/task/phase_test.go:1553 still called RenderMarkdownWithPhases with two arguments after T-698 added a phaseSource *TaskList parameter for filtered phase rendering. All other callers pass nil; the regression test was simply missed. Pass nil.
- T-857 (test): resetBatchFlags() reset batchInput but not the format global, so any batch test that called rootCmd.Execute() with --format json (TestBatchCommand_JSONOutput, TestBatchCommand_RemoveOnPhasedFilePreservesPhases) left format = "json" for every subsequent test in the package, breaking TestRunCompleteDryRun, which depends on the default. Reset format and the format flag's Changed bit in resetBatchFlags, and register t.Cleanup(resetBatchFlags) on the two leaking tests.
- TestResolveFilename (test): the "no filename provided" case asserted that discovery is "disabled by default", but Discovery.Enabled defaults to true. The test happened to pass when the host branch + filesystem produced no discovery match; on CI for branches like T-824/homebrew-install (with specs/homebrew-install/tasks.md present) discovery succeeded and the test failed. Isolate the test in a fresh git repo with discovery.enabled: false so the outcome is deterministic.

Also update docs/agent-notes/testing.md with the new resetBatchFlags contract, the discovery-isolation requirement, and a note on the ExtractPhaseMarkers / renumber.go invariant.
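The T-859 mismatch is easier to see with a small stand-alone model. The sketch below is not the project's code: `PhaseMarker`, `lookupByRawID`, and `convertByCount` are hypothetical names, the result shape (phase name to 0-based position) is chosen for illustration, and the `-1` fallback for an empty or non-positive `AfterTaskID` is an assumption rather than the documented behavior of `convertPhaseMarkersToPositions`.

```go
package main

import (
	"fmt"
	"strconv"
)

// PhaseMarker is a simplified, hypothetical stand-in for the project's type.
// AfterTaskID holds a 1-based count of preceding top-level tasks.
type PhaseMarker struct {
	Name        string
	AfterTaskID string
}

// lookupByRawID mimics the old approach: it treats AfterTaskID as a raw
// file task ID and looks it up in a map built from the file's IDs.
func lookupByRawID(markers []PhaseMarker, idToPosition map[string]int) map[string]int {
	out := make(map[string]int)
	for _, m := range markers {
		// A lookup miss silently drops the marker.
		if pos, ok := idToPosition[m.AfterTaskID]; ok {
			out[m.Name] = pos
		}
	}
	return out
}

// convertByCount applies the direct N -> N-1 conversion: a 1-based count
// of preceding tasks becomes the 0-based position of the last of them.
func convertByCount(markers []PhaseMarker) map[string]int {
	out := make(map[string]int)
	for _, m := range markers {
		n, err := strconv.Atoi(m.AfterTaskID)
		if err != nil || n < 1 {
			// Assumed fallback: the phase precedes every task.
			out[m.Name] = -1
			continue
		}
		out[m.Name] = n - 1
	}
	return out
}

func main() {
	// A file whose top-level IDs are gapped: 1, 3, 7 at positions 0, 1, 2.
	idToPosition := map[string]int{"1": 0, "3": 1, "7": 2}
	markers := []PhaseMarker{
		{Name: "Phase A", AfterTaskID: "2"}, // after the second task
		{Name: "Phase B", AfterTaskID: "3"}, // after the third task
	}

	fmt.Println(lookupByRawID(markers, idToPosition)) // map[Phase B:1]
	fmt.Println(convertByCount(markers))              // map[Phase A:1 Phase B:2]
}
```

In this model the raw-ID lookup loses "Phase A" entirely (no task has ID 2) and anchors "Phase B" one position too early (ID 3 sits at position 1), while the arithmetic conversion needs no knowledge of the file's IDs at all.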
Code Review

Overall this is a well-structured, well-documented fix for four pre-existing CI failures. Each bug is clearly identified, the root causes are accurately diagnosed, and the changes are minimal and targeted.

T-859 —

| Area | Verdict |
|---|---|
| T-859 code fix | ✅ Correct, significantly cleaner |
| T-856 test fix | ✅ Trivial one-liner, correct |
| T-857 state leak fix | ✅ Correct; minor comment suggestion |
| Discovery isolation | ✅ Correct; minor parallelism note |
| Docs update | ✅ Well-written, good future-proofing |
| Test coverage | ✅ New edge-case test added; AfterTaskID = "0" worth adding |
| Security | ✅ No concerns |
| Performance | ✅ Slight improvement (regex+map → O(1) arithmetic) |
No blocking issues. The fallback-without-warning behavior in convertPhaseMarkersToPositions and the missing "0" test case are the only things worth a follow-up, and neither is urgent.
PR Review Overview - Iteration 1

PR: #92 | Branch: fix/ci-test-failures | Date: 2026-04-20

CI Status

All 12 checks PASSED (Test and Build, claude-review, all Integration Tests across ubuntu/macos/windows × 1.0.0/latest, caching, error handling). The PR achieves its stated goal of turning CI green.

Valid Issues (fixed)

None: review found no blocking issues.

Invalid/Skipped Issues (Minor, deferred)

Add
Claude finished @ArjenSchwarz's task in 43s.

PR Review - Iteration 1 Response

Thanks for the disposition summary. All four deferred items are well-reasoned; agreed with the conclusions:
Review complete. No outstanding issues. The PR achieves its stated goal: all four pre-existing CI failures are fixed, CI is fully green, and the testing.md docs are updated to prevent recurrence. Clean, minimal, well-targeted changes. If you'd like any of the deferred items tracked, I can file follow-up issues for them; just say the word.
Summary
Fix the four pre-existing test failures that were red on main:
- T-859: `cmd/renumber.go` looked up `marker.AfterTaskID` in a map keyed by raw file IDs, but `ExtractPhaseMarkers` was changed in T-742 to return `AfterTaskID` as a 1-based sequential count of preceding top-level tasks. The mismatched lookup silently dropped later phase markers in files with non-sequential top-level IDs (multi-phase + gapped IDs would render with wrong anchors or lost headers). Replaced `extractTaskIDOrder` + the `idToPosition` map with a direct `N → N-1` conversion in `convertPhaseMarkersToPositions`, dropped the dead helper, and rewrote its unit test for the new contract.
- T-856: `internal/task/phase_test.go:1553` still called `RenderMarkdownWithPhases` with two arguments after T-698 added a `phaseSource *TaskList` parameter. All other call sites pass `nil`; the regression test was just missed.
- T-857: `resetBatchFlags()` reset `batchInput` but not the `format` global. Two batch tests (`TestBatchCommand_JSONOutput`, `TestBatchCommand_RemoveOnPhasedFilePreservesPhases`) called `rootCmd.Execute()` with `--format json`, leaving `format = "json"` for every subsequent test. `TestRunCompleteDryRun` then failed because it depends on the default `"table"` format. Reset `format` and the format flag's `Changed` bit in `resetBatchFlags`, and register `t.Cleanup(resetBatchFlags)` on the two leaking tests.
- `TestResolveFilename`: `Discovery.Enabled` defaults to `true`. The test passed only when the host branch + filesystem produced no discovery match; on CI for branches like `T-824/homebrew-install` (with `specs/homebrew-install/tasks.md` present), discovery succeeded and the test failed. Now isolated in a fresh git repo with `discovery.enabled: false`.

Also updates `docs/agent-notes/testing.md` with the new `resetBatchFlags` contract, the discovery-isolation requirement, and a note on the `ExtractPhaseMarkers` / `renumber.go` invariant.

Test plan

- `make check` (fmt + lint + unit tests) passes locally
- `make test-integration` passes locally
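The T-857 state leak described in the summary can be pictured with a stdlib-only model. This is a sketch under stated assumptions, not the project's `resetBatchFlags`: `flagState`, `execute`, and `resetFlags` are hypothetical, and `flagState` only mirrors the `Value`, `DefValue`, and `Changed` state that a cobra/pflag flag carries.

```go
package main

import "fmt"

// flagState is a hypothetical model of the flag state that matters here:
// the current value, the default, and whether the flag was set explicitly.
type flagState struct {
	Value    string
	DefValue string
	Changed  bool
}

// format is a package-level global, shared by every test in the package.
var format = "table"

// formatFlag models the --format flag bound to that global.
var formatFlag = &flagState{Value: "table", DefValue: "table"}

// execute simulates running a command. A non-empty formatArg stands in
// for passing --format on the command line.
func execute(formatArg string) string {
	if formatArg != "" {
		formatFlag.Value = formatArg
		formatFlag.Changed = true
		format = formatArg
	}
	return format
}

// resetFlags restores the global, the flag value, and the Changed bit.
// Resetting only some of these leaves state behind for the next caller.
func resetFlags() {
	format = formatFlag.DefValue
	formatFlag.Value = formatFlag.DefValue
	formatFlag.Changed = false
}

func main() {
	execute("json")          // an earlier test passes --format json
	fmt.Println(execute("")) // json: a later test sees the leaked value

	resetFlags()
	fmt.Println(execute("")) // table: the default is back after the reset
}
```

In a real test, the reset would typically be registered with `t.Cleanup` at the top of each test that sets the flag, so it runs even when the test fails partway through.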