Context
The `github-actions[bot]` test-summary comment on PRs sometimes reports a non-zero failed-tests count even when every per-package job in the same workflow run reports `success`. The header is then rendered as ❌ Test Results, which makes the PR look broken at a glance.
Reproduction
Workflow run 26316743941 (PR #100, commit `85ab9cb`):
| Status | Suites | Tests |
| --- | --- | --- |
| ✅ Passed | 132 | 331 |
| ❌ Failed | 0 | 9 |
| ⏭️ Skipped | 0 | 0 |
| Total | 132 | 340 |
But `gh api repos/saga-ed/soa/actions/runs/26316743941/jobs` shows that all 22 per-package jobs concluded `success`. No suite failed, and the aggregate is internally inconsistent: it reports 0 failed suites but 9 failed tests.
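To make the job-level check repeatable, the response from the jobs endpoint can be tallied by conclusion. The sketch below assumes the JSON has the usual shape for this endpoint (a top-level `jobs` array whose entries carry a `conclusion` field). The function name and the sample payload are hypothetical and are not taken from the actual run.

```python
import json
from collections import Counter


def tally_job_conclusions(jobs_payload: dict) -> Counter:
    """Count workflow jobs by their `conclusion` value."""
    return Counter(job.get("conclusion") for job in jobs_payload.get("jobs", []))


# Hypothetical, trimmed payload for illustration only (not the real run's data).
sample = json.loads(
    '{"total_count": 2, "jobs": ['
    '{"name": "pkg-a", "conclusion": "success"},'
    '{"name": "pkg-b", "conclusion": "success"}]}'
)
print(tally_job_conclusions(sample))
```

If the only key in the tally is `success`, the failures in the summary are not coming from the jobs themselves. The endpoint is paginated, so a run with many more jobs than the 22 here may need more than one page.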
Hypotheses
- Aggregator counts `it.todo` / `it.skip` as "failed" instead of "skipped"
- Vitest retry semantics: failed-then-passed attempts double-count
- JUnit XML parser bug on suite/test status mismatch
- Stale cached test reports merged into the summary
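One way to probe the suite/test status mismatch hypothesis is to compare each suite's declared counters with what its test cases actually contain. This is a rough sketch, not the aggregator's real logic. It assumes the reports follow the common JUnit convention (`<testsuite>` with `failures` and `errors` attributes, `<testcase>` with optional `<failure>`, `<error>` or `<skipped>` children) and that test cases are direct children of their suite. The function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def classify(testcase: ET.Element) -> str:
    """Classify one <testcase> by its child elements."""
    if testcase.find("failure") is not None or testcase.find("error") is not None:
        return "failed"
    if testcase.find("skipped") is not None:
        return "skipped"
    return "passed"


def suite_mismatches(xml_text: str) -> list[tuple[str, int, int]]:
    """Return (suite name, declared failed, observed failed) where the two differ."""
    root = ET.fromstring(xml_text)
    mismatches = []
    for suite in root.iter("testsuite"):
        # Declared count comes from the suite's own attributes.
        declared = int(suite.get("failures", "0")) + int(suite.get("errors", "0"))
        # Observed count comes from the suite's direct <testcase> children.
        observed = sum(1 for tc in suite.findall("testcase") if classify(tc) == "failed")
        if declared != observed:
            mismatches.append((suite.get("name", ""), declared, observed))
    return mismatches


# Hypothetical report: the suite claims one failure, but its only
# non-passing test case is a skipped one.
SAMPLE = """
<testsuites>
  <testsuite name="pkg-a" tests="2" failures="1" errors="0" skipped="0">
    <testcase classname="pkg-a" name="works"/>
    <testcase classname="pkg-a" name="todo item"><skipped/></testcase>
  </testsuite>
</testsuites>
"""
print(suite_mismatches(SAMPLE))
```

A non-empty result would point toward the skipped-as-failed or parser hypotheses. An empty result across all reports would make retries or stale cached reports the more likely suspects.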
Suggested next step
Pull the raw JUnit XMLs from the artifacts for that run and diff against the rendered table to identify which 9 tests are being classified as failed.
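A minimal sketch of that step: walk a directory of downloaded reports and list every test case that carries a `<failure>` or `<error>` child. It assumes the artifacts have already been downloaded and contain JUnit-style `.xml` files. The directory layout and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def failed_testcases(report_dir: str) -> list[tuple[str, str, str]]:
    """Return (file name, classname, test name) for each failed test case."""
    failed = []
    for path in sorted(Path(report_dir).rglob("*.xml")):
        root = ET.parse(path).getroot()
        for tc in root.iter("testcase"):
            # A test case counts as failed only if it has a <failure> or <error> child.
            if tc.find("failure") is not None or tc.find("error") is not None:
                failed.append((path.name, tc.get("classname", ""), tc.get("name", "")))
    return failed
```

If this returns fewer than 9 entries for the run above, the extra failures are being introduced by the aggregator rather than by the reports. If it returns exactly 9, their names show which tests to inspect for retries or todo/skip markers.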
Why this matters
Every PR currently shows ❌ in the test-summary comment even when CI is green. This trains reviewers to ignore the comment, which means a real regression would also be ignored.