Fix dead triage regex, CLI input validation, and self-grading dogfood loop by Nagendhra-web · Pull Request #62 · cobusgreyling/loop-engineering

Nagendhra-web · 2026-06-25T04:50:04Z

Summary

Four focused correctness/maintainability fixes to the tooling and the dogfood workflow, found while reading through the repo. Each is a separate commit so they can be reviewed (or cherry-picked) independently. All package test suites pass: loop-audit 10/10, loop-init 6/6, loop-cost 7/7.

| # | Area | Type | Commit |
|---|------|------|--------|
| 1 | `loop-audit` | bug + refactor | `fix(loop-audit): repair dead triage regex + centralize scoring weights` |
| 2 | `loop-cost` | bug | `fix(loop-cost): validate --level instead of silently defaulting to L3` |
| 3 | `loop-init` | robustness | `fix(loop-init): reject unknown --pattern / --tool with a clear error` |
| 4 | `daily-triage.yml` | safety | `fix(ci): derive daily-triage commit statuses from real gate outcomes` |

1. `loop-audit` — dead `t riage` regex (bug) + magic-number scoring (refactor)

Bug: In `detectLoopActivity()`, the git-history scan matched the literal ` t riage ` (spaces around it and a space inside the word) instead of `triage`:

// before
if (/state\.md|loop| t riage |changelog-drafter|post-merge|daily triage|audit/i.test(lower)) {
// after
if (/state\.md|loop|triage|changelog-drafter|post-merge|daily triage|audit/i.test(lower)) {

That alternation branch could never match a real commit subject, so plain triage commits were silently not counted as loop activity. It looked fine only because adjacent terms (loop/audit/state.md) masked it — the existing tests asserted the outcome, not the mechanism. Added a regression test that inits a temp git repo with a single triage commit and asserts git evidence is detected (skips gracefully when git is unavailable).
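To make the masking effect concrete, here is a minimal sketch that runs both patterns against a few commit subjects. The two regexes are copied from the before/after lines above; the sample subjects and the names `beforeFix`, `afterFix`, `sampleSubjects`, and `matchResults` are hypothetical and exist only for this illustration.

```typescript
// Both patterns are copied from the before/after lines of the fix.
const beforeFix = /state\.md|loop| t riage |changelog-drafter|post-merge|daily triage|audit/i;
const afterFix = /state\.md|loop|triage|changelog-drafter|post-merge|daily triage|audit/i;

// Hypothetical commit subjects, invented for illustration only.
const sampleSubjects: string[] = [
  "triage: label stale issues", // plain triage commit
  "chore: nightly triage run",  // triage without the word "daily"
  "update loop docs",           // matched by the neighbouring `loop` branch
];

interface MatchResult {
  subject: string;
  before: boolean;
  after: boolean;
}

// Lowercase each subject first, mirroring the `lower` variable in the snippet above.
const matchResults: MatchResult[] = sampleSubjects.map((subject) => {
  const lower = subject.toLowerCase();
  return { subject, before: beforeFix.test(lower), after: afterFix.test(lower) };
});

for (const r of matchResults) {
  console.log(`${r.subject} | before: ${r.before} | after: ${r.after}`);
}
```

Only the third subject matches the old pattern, and it does so through the `loop` branch, which is how a test that checks the outcome alone could pass while the triage branch stayed dead.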

Refactor (no behavior change): the +18/+14/+9… score contributions and the 38/58/78 level cutoffs were inlined as magic numbers, and the L3 threshold (78) was duplicated across computeScore() and auditProject() — easy to desync. Extracted into documented SCORE_WEIGHTS and LEVEL_THRESHOLDS constants. Values are identical; the existing L0/L1/L2/L3 boundary tests pass unmodified, proving parity.
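The shape of that extraction might look roughly like the sketch below. Only the constant names `SCORE_WEIGHTS` and `LEVEL_THRESHOLDS` and the numbers 18/14/9 and 38/58/78 come from the description above. The signal names, the mapping of each cutoff to a level, the inclusive `>=` comparison, and the helpers `scoreFor` and `levelFor` are assumptions made for illustration, not the package's actual code.

```typescript
type ReadinessLevel = "L0" | "L1" | "L2" | "L3";

// Cutoffs from the PR text; which level each one gates is an assumption.
const LEVEL_THRESHOLDS = { L1: 38, L2: 58, L3: 78 } as const;

// Weights from the PR text; the signal names are hypothetical placeholders.
const SCORE_WEIGHTS = { signalA: 18, signalB: 14, signalC: 9 } as const;

type Signal = keyof typeof SCORE_WEIGHTS;

// Hypothetical stand-in for the scoring step: sum the weight of each detected signal.
function scoreFor(signals: Signal[]): number {
  return signals.reduce((sum: number, s) => sum + SCORE_WEIGHTS[s], 0);
}

// Hypothetical level lookup; every caller reads the same thresholds,
// so the L3 cutoff cannot drift between two functions.
function levelFor(score: number): ReadinessLevel {
  if (score >= LEVEL_THRESHOLDS.L3) return "L3";
  if (score >= LEVEL_THRESHOLDS.L2) return "L2";
  if (score >= LEVEL_THRESHOLDS.L1) return "L1";
  return "L0";
}

console.log(scoreFor(["signalA", "signalC"]), levelFor(77), levelFor(78));
```

With a single definition, a change to the L3 cutoff is one edit, and the existing boundary tests exercise it from both call sites.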

2. `loop-cost` — `--level` silently defaulted to L3

--level garbage was cast to ReadinessLevel and flowed into realisticMix(), whose if/else has no matching branch and falls through to the L3 return — a confident, wrong estimate. Now the CLI rejects an unknown level with a friendly message, and estimateCost() also guards at the library boundary (throws) so programmatic callers can't hit the silent fallthrough. Added a test.
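A boundary guard of this kind can be sketched as follows. It assumes `ReadinessLevel` is the union `"L0" | "L1" | "L2" | "L3"` (inferred from the levels mentioned in this PR); `VALID_LEVELS`, `isReadinessLevel`, and `parseLevel` are hypothetical names, and the error wording is invented.

```typescript
// Assumed shape of the level type; the real definition lives in loop-cost.
type ReadinessLevel = "L0" | "L1" | "L2" | "L3";

const VALID_LEVELS: readonly ReadinessLevel[] = ["L0", "L1", "L2", "L3"];

// Type guard: narrows an arbitrary string to ReadinessLevel at runtime,
// replacing the unchecked `as ReadinessLevel` cast.
function isReadinessLevel(value: string): value is ReadinessLevel {
  return (VALID_LEVELS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: throws instead of letting an unknown level
// fall through to whatever branch happens to be last.
function parseLevel(raw: string): ReadinessLevel {
  if (!isReadinessLevel(raw)) {
    throw new Error(`Unknown --level "${raw}". Valid levels: ${VALID_LEVELS.join(", ")}`);
  }
  return raw;
}

console.log(parseLevel("L2"));
```

Throwing in the library function and reporting a friendly message in the CLI keeps the two layers independent: a programmatic caller gets an exception, while a command-line user gets a readable error and a non-zero exit.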

3. `loop-init` — unknown `--pattern` / `--tool` produced a confusing failure

parseArgs() cast both straight to their union types with no validation, so a typo like --tool emacs flowed into an undefined record lookup. Now validated against the existing PATTERN_STARTERS / TOOL_SUFFIX keys (no duplicated lists) with exit 1 and the list of valid values. Added two CLI tests.
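Validating against the keys of an existing record, rather than a second hand-written list, could look like the sketch below. The record name `TOOL_SUFFIX` comes from the description above, but its contents here are placeholder values, and `parseChoice` is a hypothetical helper rather than the function used in `loop-init`.

```typescript
// Placeholder contents: the real TOOL_SUFFIX record in loop-init has different keys and values.
const TOOL_SUFFIX = {
  "example-tool-a": "-a",
  "example-tool-b": "-b",
} as const;

// Hypothetical helper: accepts `raw` only if it is an own key of `table`,
// so the record itself is the single source of truth for valid values.
function parseChoice<T extends Record<string, unknown>>(
  flag: string,
  raw: string,
  table: T,
): keyof T & string {
  if (!Object.prototype.hasOwnProperty.call(table, raw)) {
    const valid = Object.keys(table).join(", ");
    throw new Error(`Unknown ${flag} "${raw}". Valid values: ${valid}`);
  }
  return raw as keyof T & string;
}

const chosenTool = parseChoice("--tool", "example-tool-a", TOOL_SUFFIX);
console.log(chosenTool, TOOL_SUFFIX[chosenTool]);
```

In a CLI entry point, the caller would catch this error, print the message, and exit with status 1. Checking own properties (instead of using the `in` operator) also rejects inherited names such as `toString`.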

4. `daily-triage.yml` — the loop marked its own homework green

The daily-triage loop posted validate and audit commit statuses as a hardcoded state: 'success' to satisfy branch protection, then auto-merged its own PR. Since the green statuses were unconditional, the loop could merge itself regardless of whether the real gates passed — the unattended-without-verification failure mode that docs/safety.md and docs/concepts.md (comprehension debt) explicitly warn about. The real gate scripts were already run in the job, but their results were ignored.

This keeps the automation (statuses still need posting because GITHUB_TOKEN-pushed commits don't trigger PR workflows) but makes it honest:

- The two gate steps now run with continue-on-error + an explicit id; a follow-up step fails the run if either gate failed, so no PR is opened or merged on a red gate.
- The commit-status step derives each state from the gate's real steps.<id>.outcome instead of a literal 'success'. Green can only be posted when the gate actually passed.
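The mapping from a step outcome to a commit-status state can be sketched as a small pure function. The outcome values (`success`, `failure`, `cancelled`, `skipped`) and the status states (`success`, `failure`, `error`, `pending`) are the ones GitHub defines; treating every non-success outcome as `failure` is an assumption of this sketch, and `stateFromOutcome` and `allGatesPassed` are hypothetical names, not code from the workflow.

```typescript
// Values a step's `outcome` can take in GitHub Actions.
type StepOutcome = "success" | "failure" | "cancelled" | "skipped";

// States accepted by the GitHub commit status API.
type CommitState = "success" | "failure" | "error" | "pending";

// Assumption: anything other than a real pass is reported as a failure,
// so a skipped or cancelled gate can never turn the status green.
function stateFromOutcome(outcome: StepOutcome): CommitState {
  return outcome === "success" ? "success" : "failure";
}

// Hypothetical follow-up check: the run proceeds to open or merge a PR
// only if every gate step actually succeeded.
function allGatesPassed(outcomes: Record<string, StepOutcome>): boolean {
  return Object.keys(outcomes).every((id) => outcomes[id] === "success");
}

const gateOutcomes: Record<string, StepOutcome> = { validate: "success", audit: "failure" };
for (const id of Object.keys(gateOutcomes)) {
  console.log(id, stateFromOutcome(gateOutcomes[id]));
}
console.log("proceed:", allGatesPassed(gateOutcomes));
```

The point of the design is that the posted state is a function of the gate result, so there is no code path that posts `success` without a passing gate.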

Blast radius was already small (only STATE.md + loop-run-log.md), but this aligns the reference repo's own loop with the verifier discipline it teaches. (Happy to split #4 into a separate PR or downgrade it to an issue if you'd prefer to discuss the workflow change on its own — it's the one opinionated change here.)

Testing

loop-audit:  10 passed   (incl. new triage-commit activity regression test)
loop-init:    6 passed   (incl. 2 new --pattern/--tool validation tests)
loop-cost:    7 passed   (incl. new invalid-level test)

Compiled dist/ is committed alongside each source change (the packages publish from committed dist/, and the test suites import from dist/), so the npm artifacts stay in sync.

🤖 Generated with Claude Code

Two related correctness/maintainability fixes to the audit engine, shipped together because they live in the same recompiled source file.

1. Dead `t riage` regex (bug): detectLoopActivity()'s git-history scan matched the literal ` t riage ` (surrounding spaces + a space inside the word) instead of `triage`, so it could never match a real commit subject and triage commits were silently not counted as loop activity. The branch appeared to "work" only because adjacent alternatives (loop/audit/state.md) masked it. Fixed to `triage`, with a regression test that inits a temp git repo with a single triage commit and asserts git evidence is detected (skips gracefully if git is unavailable).

2. Magic scoring weights (maintainability): The +18/+14/+9... contributions and the 38/58/78 level cutoffs were inlined as magic numbers across computeScore() and auditProject(), so the L3 gate threshold was duplicated and easy to desync. Extracted into documented SCORE_WEIGHTS and LEVEL_THRESHOLDS constants. Pure refactor: values are unchanged and the existing threshold tests (L0/L1/L2/L3 boundaries) pass unmodified, proving behavior parity.

All 10 loop-audit tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`--level garbage` was cast straight to ReadinessLevel and flowed into realisticMix(), whose if/else chain has no matching branch and falls through to the L3 return, producing a confident, wrong estimate for an invalid level.

- CLI now rejects an unknown --level with a friendly message + exit 1.
- estimateCost() also guards at the library boundary (throws on an invalid level), so callers using it programmatically can't hit the silent fallthrough either.
- Add a test asserting estimateCost rejects an invalid level.

All 7 loop-cost tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

parseArgs() cast --pattern and --tool straight to their union types with no validation, so a typo (e.g. `--tool emacs`) flowed downstream into an undefined record lookup and a confusing failure.

- Validate both against the existing PATTERN_STARTERS / TOOL_SUFFIX keys (single source of truth, no duplicated lists) and exit 1 with the list of valid values.
- Add two CLI tests asserting the friendly errors.

All 6 loop-init tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The daily-triage loop posted `validate` and `audit` commit statuses as a hardcoded `state: 'success'` to satisfy branch protection, then auto-merged its own PR. Because the green statuses were unconditional, the loop structurally "marked its own homework", which is exactly the unattended-without-verification failure mode this repo's own docs/safety.md and concepts.md (comprehension debt) warn against. The real gates (ci-validate-gates.sh, ci-audit-gates.sh) were already run in the job but their results were ignored.

This change keeps the automation (statuses still need posting because GITHUB_TOKEN-pushed commits don't trigger PR workflows) but makes it honest:

- The two gate steps now run with continue-on-error and an explicit step id; a follow-up step fails the run if either gate failed, so no PR is opened or merged on a red gate.
- The commit-status step derives each status's state from the corresponding gate's real `steps.<id>.outcome` instead of a literal 'success'. A green status can now only be posted when the gate actually passed.

Blast radius was already small (only STATE.md + loop-run-log.md), but this aligns the reference repo's own loop with the L1/L2/L3 + verifier discipline it teaches.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

nagendhra-tech and others added 4 commits June 25, 2026 00:45

Nagendhra-web requested a review from cobusgreyling as a code owner June 25, 2026 04:50

Nagendhra-web closed this Jun 25, 2026

Nagendhra-web deleted the fix/auditor-regex-cli-validation-scoring branch June 25, 2026 04:53

Nagendhra-web wants to merge 4 commits into cobusgreyling:main from Nagendhra-web:fix/auditor-regex-cli-validation-scoring
