Fix triage regex, CLI validation, and the self-approving daily-triage loop by Nagendhra-web · Pull Request #63 · cobusgreyling/loop-engineering

Nagendhra-web · 2026-06-25T04:59:21Z

Summary

While reading through the tooling I found three small bugs and one workflow issue. This PR fixes all four. Each one is a separate commit so they can be reviewed or cherry-picked on their own. Every package test suite passes: loop-audit 10/10, loop-init 6/6, loop-cost 7/7.

Area	Type	What changed
loop-audit	bug + refactor	Repair a dead regex in activity detection; move scoring weights into named constants
loop-cost	bug	Validate the readiness level instead of silently treating an invalid one as L3
loop-init	robustness	Reject an unknown pattern or tool with a clear error instead of a confusing crash
daily-triage.yml	safety	Post commit statuses from the real gate outcomes instead of a hardcoded success

1. loop-audit: dead triage regex, plus a scoring-weights cleanup

In detectLoopActivity, the git history scan matched the literal token " t riage " (spaces around the word, and a space inside it) instead of triage:

// before
if (/state\.md|loop| t riage |changelog-drafter|post-merge|daily triage|audit/i.test(lower)) {
// after
if (/state\.md|loop|triage|changelog-drafter|post-merge|daily triage|audit/i.test(lower)) {

That branch could never match a real commit subject, so commits whose only loop signal was the word triage were not counted as activity. It looked fine because the other terms in the alternation (loop, audit, state.md) covered most real cases, and the existing tests checked the final result rather than this specific branch. I added a regression test that creates a temporary git repo with a single triage commit and asserts that git evidence is detected. The test skips cleanly if git is not available.
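To make the failure concrete, here is a minimal sketch that runs both patterns against one commit subject. The two regexes are copied from the before/after lines above; the commit subject is a made-up example, and the variable names are illustrative only.

```typescript
// Patterns copied from the before/after lines in this description.
const beforeFix = /state\.md|loop| t riage |changelog-drafter|post-merge|daily triage|audit/i;
const afterFix = /state\.md|loop|triage|changelog-drafter|post-merge|daily triage|audit/i;

// Hypothetical commit subject whose only loop signal is the word "triage".
const subject = "chore: triage stale issues".toLowerCase();

const matchedBefore = beforeFix.test(subject); // false: no alternative matches
const matchedAfter = afterFix.test(subject); // true: the repaired "triage" branch matches

console.log({ matchedBefore, matchedAfter });
```

A subject containing "daily triage" would have matched even before the fix, which is part of why the dead branch went unnoticed.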

While I was in the file I also moved the scoring weights out of the function body. The score contributions (18, 14, 9, and so on) and the level cutoffs (38, 58, 78) were inline magic numbers, and the L3 threshold was written out twice, once in computeScore and once in auditProject, which is easy to get out of sync. They now live in documented SCORE_WEIGHTS and LEVEL_THRESHOLDS constants. This part is a pure refactor. The numbers are identical and the existing boundary tests pass unchanged.
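As a rough illustration of the shape of this refactor (not the actual source), the sketch below uses the numbers quoted above. The key names signalA to signalC, the helper names, and the inclusive comparison are all assumptions; the description does not say which readiness level each cutoff gates, so the sketch only counts cutoffs reached.

```typescript
// Values are the ones quoted in the description; the key names are hypothetical.
const SCORE_WEIGHTS = {
  signalA: 18,
  signalB: 14,
  signalC: 9,
} as const;

// Cutoffs quoted in the description, in ascending order.
// Defined once so that no caller has to repeat a literal threshold.
const LEVEL_THRESHOLDS = [38, 58, 78] as const;

type SignalName = keyof typeof SCORE_WEIGHTS;

// Sum the weights of the signals that were detected (hypothetical helper).
function computeScoreSketch(present: SignalName[]): number {
  return present.reduce<number>((sum, s) => sum + SCORE_WEIGHTS[s], 0);
}

// Count how many cutoffs the score reaches (inclusive comparison is an assumption).
function cutoffsReached(score: number): number {
  return LEVEL_THRESHOLDS.filter((t) => score >= t).length;
}

console.log(cutoffsReached(computeScoreSketch(["signalA", "signalB", "signalC"])));
```

Because both helpers read from the same constants, a threshold can only be changed in one place.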

2. loop-cost: an invalid level silently became L3

A value such as --level garbage was cast to ReadinessLevel and handed to realisticMix. That function has no matching branch for an unknown level, so it fell through to the L3 return and produced a confident but wrong estimate. The CLI now rejects an unknown level with a clear message and a non-zero exit. estimateCost also checks the level at the library boundary and throws, so code that calls it directly cannot hit the same silent fallthrough. Added a test for the rejection.
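The general technique is a runtime type guard in place of a bare cast. The sketch below is an assumption-laden illustration, not the code in this PR: the list of level names is hypothetical (the description only confirms that L3 exists), and isReadinessLevel and parseLevel are illustrative names.

```typescript
// Hypothetical list of levels; only "L3" is confirmed by the description.
const READINESS_LEVELS = ["L1", "L2", "L3", "L4"] as const;
type ReadinessLevel = (typeof READINESS_LEVELS)[number];

// Runtime check that narrows a raw string to the union type.
function isReadinessLevel(value: string): value is ReadinessLevel {
  return READINESS_LEVELS.some((level) => level === value);
}

// Throws instead of silently falling through to a default branch.
function parseLevel(raw: string): ReadinessLevel {
  if (!isReadinessLevel(raw)) {
    throw new Error(
      `Unknown level "${raw}". Valid levels: ${READINESS_LEVELS.join(", ")}`
    );
  }
  return raw;
}

console.log(parseLevel("L3"));
```

A CLI wrapper would catch this error, print the message, and exit non-zero, while library callers see the exception directly.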

3. loop-init: unknown pattern or tool gave a confusing error

parseArgs cast --pattern and --tool straight to their union types with no validation, so a typo like --tool emacs flowed into an undefined lookup later and failed in a way that did not point at the real problem. Both are now validated against the existing PATTERN_STARTERS and TOOL_SUFFIX keys, so there is still a single source of truth, and the CLI exits with the list of valid values. Added two tests.
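One way to validate a flag against an existing record without duplicating its keys is sketched below. The record contents here are placeholders (the real PATTERN_STARTERS and TOOL_SUFFIX entries are not shown in this description), and parseChoice is a hypothetical helper name.

```typescript
// Stand-ins for the real records; these keys and values are placeholders.
const PATTERN_STARTERS = { "pattern-a": "starter A", "pattern-b": "starter B" } as const;
const TOOL_SUFFIX = { "tool-a": ".a", "tool-b": ".b" } as const;

// Validate a raw flag value against the keys of a lookup table.
// hasOwnProperty avoids accepting inherited names such as "toString".
function parseChoice<T extends Record<string, unknown>>(
  flag: string,
  raw: string,
  table: T
): keyof T & string {
  if (!Object.prototype.hasOwnProperty.call(table, raw)) {
    const valid = Object.keys(table).join(", ");
    throw new Error(`Unknown ${flag} "${raw}". Valid values: ${valid}`);
  }
  return raw as keyof T & string;
}

const pattern = parseChoice("--pattern", "pattern-a", PATTERN_STARTERS);
const tool = parseChoice("--tool", "tool-b", TOOL_SUFFIX);
console.log(pattern, tool);
```

Because the list of valid values is derived from Object.keys, adding an entry to either record updates both the validation and the error message.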

4. daily-triage.yml: the loop was approving its own merge

The daily-triage workflow posted the validate and audit commit statuses as a hardcoded success to satisfy branch protection, then auto-merged its own PR. Since those statuses were unconditional, the loop could merge itself whether or not the real gates passed. The real gate scripts were already being run in the job, but their results were not used. This is the kind of unattended change without verification that docs/safety.md and docs/concepts.md warn about, so it felt worth fixing in the reference repo itself.

The change keeps the automation, since the statuses still need to be posted (a GITHUB_TOKEN push does not trigger the PR workflows), but ties them to reality:

- The two gate steps now run with continue-on-error and an explicit id, and a follow-up step fails the run if either gate failed. So a red gate stops the PR from being opened or merged.
- The status step now derives each state from the gate's real step outcome rather than a literal success, so a green status can only be posted when the gate actually passed.
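The decision logic in those two steps can be modeled outside the workflow file. The sketch below is a TypeScript model, not the YAML in this PR. It relies on two documented facts: a GitHub Actions step outcome (the result before continue-on-error is applied) is one of success, failure, cancelled, or skipped, and a commit status state is one of success, failure, error, or pending. The function names, the mapping of cancelled and skipped to error, and the choice to fail the run on anything other than success are assumptions.

```typescript
type StepOutcome = "success" | "failure" | "cancelled" | "skipped";
type StatusState = "success" | "failure" | "error" | "pending";

// Derive the commit status from the gate's real outcome.
// Only a real success can yield a green status.
function statusStateFor(outcome: StepOutcome): StatusState {
  if (outcome === "success") return "success";
  if (outcome === "failure") return "failure";
  return "error"; // assumption: a cancelled or skipped gate is not a pass
}

// Follow-up check: stop before opening or merging the PR unless every gate passed.
// Treating anything other than success as blocking is a conservative assumption.
function shouldFailRun(outcomes: StepOutcome[]): boolean {
  return outcomes.some((o) => o !== "success");
}

// Hypothetical run where validate passed and audit failed.
const gates: Record<string, StepOutcome> = { validate: "success", audit: "failure" };
console.log(statusStateFor(gates.validate), statusStateFor(gates.audit));
console.log(shouldFailRun(Object.values(gates)));
```

The key property is that no code path returns success without reading the gate's outcome, which is what the hardcoded status lacked.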

The blast radius here was already small, only STATE.md and loop-run-log.md, but the change lines the repo's own loop up with the verifier discipline the docs describe. If you would rather discuss this one separately, I am happy to pull it into its own PR or open an issue instead.

Testing

loop-audit:  10 passed   (includes the new triage-commit activity test)
loop-init:    6 passed   (includes 2 new pattern/tool validation tests)
loop-cost:    7 passed   (includes the new invalid-level test)

The compiled dist/ output is committed next to each source change, since the packages publish from committed dist/ and the test suites import from it, so the build artifacts stay in sync with the source.

Two related fixes to the audit engine, shipped together because they live in the same recompiled source file.

1. Dead triage regex (bug)

The git-history scan in detectLoopActivity matched the literal token " t riage " (spaces around it and a space inside the word) rather than "triage". That alternation branch could never match a real commit subject, so triage commits were silently not counted as loop activity. It looked correct only because adjacent terms (loop, audit, state.md) masked it. The fix changes the pattern to "triage" and adds a regression test that initializes a temporary git repo with a single triage commit and asserts that git evidence is detected. The test skips gracefully when git is unavailable.

2. Magic scoring weights (maintainability)

The score contributions (18, 14, 9, and so on) and the level cutoffs (38, 58, 78) were inlined as magic numbers, and the L3 threshold was duplicated across computeScore and auditProject, which made it easy to desync. They are now extracted into documented SCORE_WEIGHTS and LEVEL_THRESHOLDS constants. This is a pure refactor: the values are unchanged and the existing boundary tests pass without modification, which proves behavior parity.

All 10 loop-audit tests pass.

An invalid level such as "--level garbage" was cast straight to ReadinessLevel and flowed into realisticMix, whose if/else chain has no matching branch and falls through to the L3 return. The result was a confident but wrong estimate for an invalid level. The CLI now rejects an unknown level with a clear message and exit code 1. estimateCost also guards at the library boundary and throws on an invalid level, so callers using it programmatically cannot hit the silent fallthrough either. A test asserts that estimateCost rejects an invalid level. All 7 loop-cost tests pass.

parseArgs cast --pattern and --tool straight to their union types with no validation, so a typo such as "--tool emacs" flowed downstream into an undefined record lookup and a confusing failure. Both values are now validated against the existing PATTERN_STARTERS and TOOL_SUFFIX keys, which keeps a single source of truth with no duplicated lists, and the CLI exits with code 1 and the list of valid values. Two CLI tests assert the friendly errors. All 6 loop-init tests pass.

The daily-triage loop posted the validate and audit commit statuses as a hardcoded success state to satisfy branch protection, then auto-merged its own PR. Because the green statuses were unconditional, the loop could merge itself regardless of whether the real gates passed. That is the unattended-without-verification failure mode that docs/safety.md and docs/concepts.md (comprehension debt) explicitly warn against. The real gate scripts (ci-validate-gates.sh and ci-audit-gates.sh) were already run in the job, but their results were ignored.

This change keeps the automation, since statuses still need to be posted because GITHUB_TOKEN-pushed commits do not trigger PR workflows, but it makes the automation honest:

- The two gate steps now run with continue-on-error and an explicit step id. A follow-up step fails the run if either gate failed, so no PR is opened or merged on a red gate.
- The commit-status step derives each status state from the gate's real step outcome instead of a literal success value. A green status can now only be posted when the gate actually passed.

The blast radius was already small, limited to STATE.md and loop-run-log.md, but this aligns the reference repo's own loop with the verifier discipline it teaches.

Nagendhra-web requested a review from cobusgreyling as a code owner June 25, 2026 04:59

Nagendhra-web added 4 commits June 25, 2026 01:01

Nagendhra-web force-pushed the fix/loop-tooling-bugs branch from 62d2d41 to e9f1fcf June 25, 2026 05:01

Nagendhra-web wants to merge 4 commits into cobusgreyling:main from Nagendhra-web:fix/loop-tooling-bugs
