feat(assess): add Agent Maturity Assessment as a first-class command#6
feat(assess): add Agent Maturity Assessment as a first-class command#6asabaylus wants to merge 6 commits into
Conversation
Adds `teamhero assess` (CLI) and a new TUI subcommand alongside report / setup / doctor. Scores an engineering org against the 12-criterion Agent Maturity Assessment (4 weighted categories: engineering basics, knowledge & context 1.5×, AI governance & quality 1.25×, hiring), producing a weighted percentage, a raw /12, item-level evidence, top-3 fixes, strengths, and a maturity band marker. Pipeline: preflight (gh / GitHub MCP / git-only) → adjacent-repo detection → Phase-1 interview (7 questions, asked one at a time over a bidirectional JSON-lines protocol) → 12 deterministic evidence collectors → AI scoring (OpenAI Responses API + strict JSON schema, with tier-3 caps enforced post-hoc on items 2/3/9/11) → audit markdown matching the canonical template + sibling .json artifact + docs/audits/CONFIG.md round-trip. The TUI flow uses the same visual design as `teamhero report`: two- pane Bubble Tea progress display with monotonic progress bar, spinner-driven step list (✔/✖/○ icons), right-side configuration summary with AI/dry-run badge, and a tabbed Glamour-rendered preview (Audit / Evidence / JSON Data) that mirrors the report preview. Scope is configurable: local repo, GitHub org, or both. Headless mode accepts a JSON file of pre-supplied interview answers; interactive mode round-trips each question through huh forms one at a time. Rubric is hardcoded in src/services/maturity/rubric.ts (RUBRIC_VERSION participates in the cache key) so the binary doesn't depend on external skill files at runtime. Tests: 81 TS specs (rubric, scoring, interview, audit-writer, audit- store, evidence-collectors, adjacent-repos, maturity-prompts, stdin- interview, end-to-end dry-run) + Go specs for the TUI (progress state machine, summary panel, preview tab bar, config round-trip, runner glue). All passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first thing the user saw running `teamhero assess` was a bare huh select prompt — no banner, no two-pane layout. The wizard now hosts its huh forms inside a Bubble Tea program that renders the same shell-header + left-form / right-summary / nav-hints layout the report wizard uses. - New `assessWizardModel` Bubble Tea model with state machine: awScopeMode → awLocalPath / awOrg / awBoth → awConfirm → awDone - View() composes renderShellHeader + form panel + renderAssessSummary + "esc back • ctrl+c quit" hints, identical structure to wizard.go::View - Esc navigates back through history; ctrl+c aborts; final confirm step matches the report's wsConfirmRun flow - 14 new unit tests cover state transitions, scope-specific advance paths, history navigation, abort handling, View() rendering, and the parseRepoCSV / requireNonEmpty helpers Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The interview was breaking the layout: each Phase-1 question called
huh.Form.Run() in a separate full-screen huh program after releasing
the alt-screen. No shell header, no two-pane summary, just a bare
prompt — discontinuous with the rest of the assess flow.
The progress model now hosts the interview form directly, mirroring how
the report wizard handles its wsConfirmRun step inside the same Bubble
Tea program. When an interview-question event arrives:
- Build a huh.Form for the option select (with "Other (type your own)"
appended when allowFreeText is true)
- Mount it on the model's `interviewForm` field
- View() swaps the left pane from the progress panel to the form panel
while keeping renderShellHeader on top, renderAssessSummary on the
right, and an updated nav-hints footer ("↑↓ navigate • enter submit")
- Update() routes key events to the form first; on huh.StateCompleted
the model either transitions to a free-text follow-up (when the
sentinel was chosen) or invokes the sendAnswer callback that writes
the answer JSON line back to the runner's stdin
Also reworked RunAssessProgressDisplay's signature: it now takes
sendAnswer directly instead of an askInterview prompter, since prompting
lives inside the model.
Tests added/updated: TestAssessProgress_InterviewQuestionMountsForm,
TestAssessProgress_SubmitInterviewAdvances,
TestAssessProgress_FreeTextSentinelTransitions,
TestAssessProgress_FreeTextEmptyMapsToUnknown,
TestAssessProgress_QKeyDuringInterviewIsRoutedToForm.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…kill CI's test-go job failed at 82.6% < 85% block coverage. Added tui/assess_coverage_test.go covering the uncovered functions in the new assess package: - assess_progress: Init, View with mounted interview form, hintsText, renderInterviewPanel, fitLine truncation paths, WindowSizeMsg reflow, failed-step rendering, report-data event handling, contentWidth / formWidth minimums - assess_preview: Init, reflow, previewFrameHeight floor, Update for rendered/keymsg/window-resize, tab switching forward+reverse with wraparound, q-quit, RunAssessPreview happy-path with stubbed teaProgramRun, View while rendering and after render - assess_wizard: Init, WindowSizeMsg, esc with empty history aborts, validateLocalPath edge cases (empty / nonexistent / file / dir), defaultScopeMode / defaultLocalPath - assess_flags: applyAssessFlagsTo with all flags set vs none set - assess_config: SaveAssessConfig dir-creation branch New total: 85.5%. Also ships share/skills/agent-maturity-assessment/ — a self-contained, harness-agnostic copy of the skill (SKILL.md + references/criteria.md + interview.md + output-template.md + preflight.md) intended for distribution to other Claude harnesses. Mentions the teamhero binary as an optional accelerator but works standalone in pure-Claude mode. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two new docs and wires them into the existing index: - docs/MATURITY_ASSESSMENT.md — complete user-facing reference for `teamhero assess`: what gets scored (12 items, 4 weighted categories, bands), the two ways to run (interactive TUI / headless), full Phase-1 interview table with Q→criterion mapping, answers.json shape, evidence tiers, every CLI flag, scoring pipeline, output structure, env vars, re-audit cadence, troubleshooting - share/skills/agent-maturity-assessment/INSTALL.md — install guide for the standalone harness-agnostic skill bundle: covers Claude Code (~/.claude/skills/), Cowork / Workbench, custom SDK harnesses, verifying the install, customizing rubric/triggers/interview, and attribution Cross-links: - README.md — links to docs/MATURITY_ASSESSMENT.md in the assess section and the Learn-more list; adds a "Shareable maturity- assessment skill" subsection that points at the share/skills/ bundle - docs/ARCHITECTURE.md — adds a "MaturityService" key-components entry summarizing the pipeline and linking to the full reference No code changes. CI should still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@claude review this code |
📝 WalkthroughWalkthroughAdds a new teamhero assess command with a Go TUI and TypeScript service: preflight tier detection, adjacent repo scan, interview round-trip, deterministic evidence collection, AI scoring, audit writing (markdown+JSON), CLI wiring, docs, fixtures, and tests. ChangesEnd-to-end maturity assessment (CLI → TUI → Service)
Sequence Diagram(s)sequenceDiagram
participant User
participant teamhero as CLI (Node)
participant TUI as Go TUI
participant Runner as scripts/run-assess.ts
participant Service as MaturityService
participant FS as Filesystem
User->>teamhero: teamhero assess --flags
teamhero->>TUI: spawn assess
TUI->>Runner: send AssessConfig (stdin)
Runner->>Service: run(input)
Service->>Service: detectTier → detectAdjacentRepos
Service-->>TUI: interview-frame/question (JSONL via Runner)
TUI-->>Service: interview-answer (JSONL via Runner)
Service->>FS: collect deterministic evidence
Service->>Service: AI score (schema JSON)
Service->>FS: write audit.md + audit.json
Service-->>Runner: result (paths, artifact)
Runner-->>TUI: result (JSONL)
TUI-->>User: progress → preview paths
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
|
Note Docstrings generation - SUCCESS |
Docstrings generation was requested by @asabaylus. * #6 (comment) The following files were modified: * `scripts/run-assess.ts` * `src/cli/index.ts` * `src/services/maturity/adjacent-repos.ts` * `src/services/maturity/ai-scorer.ts` * `src/services/maturity/audit-store.ts` * `src/services/maturity/audit-writer.ts` * `src/services/maturity/evidence-collectors.ts` * `src/services/maturity/fs-utils.ts` * `src/services/maturity/interview.ts` * `src/services/maturity/maturity-prompts.ts` * `src/services/maturity/maturity.service.ts` * `src/services/maturity/preflight.ts` * `src/services/maturity/rubric.ts` * `src/services/maturity/scoring.ts` * `tui/assess.go` * `tui/assess_config.go` * `tui/assess_flags.go` * `tui/assess_preview.go` * `tui/assess_progress.go` * `tui/assess_runner.go` * `tui/assess_summary.go` * `tui/assess_wizard.go` * `tui/main.go`
There was a problem hiding this comment.
Actionable comments posted: 14
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (12)
.env.schema-45-47 (1)
45-47:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winKeep
.env.schemakeys in expected order to satisfy dotenv-linter.
MATURITY_AI_MODELcurrently appears afterVISIBLE_WINS_AI_MODEL, which triggersUnorderedKey.Suggested diff
-# `@type`=string -VISIBLE_WINS_AI_MODEL= # `@type`=string AI_DISCREPANCY_ANALYSIS_MODEL= # `@type`=string AI_TECHNICAL_WINS_MODEL= # Override AI model for the Agent Maturity Assessment scorer (falls back to AI_MODEL). # `@type`=string MATURITY_AI_MODEL= +# `@type`=string +VISIBLE_WINS_AI_MODEL=🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.env.schema around lines 45 - 47, The MATURITY_AI_MODEL key in .env.schema is out of order and triggers dotenv-linter's UnorderedKey; move the MATURITY_AI_MODEL entry so it appears in the expected order (alphabetical/grouping consistent with the surrounding keys) relative to VISIBLE_WINS_AI_MODEL and other AI model vars, ensuring the comment and `@type`=string line stay with the key (i.e., cut the MATURITY_AI_MODEL block and paste it into the correct position to satisfy dotenv-linter).docs/maturity-skill-ref/references/output-template.md-98-102 (1)
98-102:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winEscape pipe characters inside table code spans.
The
||snippets are parsed as extra table separators, which causes MD056 and truncates cell content in some renderers.Suggested diff
-|Too long |`pnpm -r test resolves to nothing — no package implements test. ci.yml line 80: dotnet test || true with comment 'no real tests yet'. Zero test files anywhere. Architecture is testable in principle but the inner loop runs nothing.`| +|Too long |`pnpm -r test resolves to nothing — no package implements test. ci.yml line 80: dotnet test \|\| true with comment 'no real tests yet'. Zero test files anywhere. Architecture is testable in principle but the inner loop runs nothing.`| |Too vague |`No tests exist.` | -|Right size|`CI runs dotnet test || true, no test files exist anywhere, and the architecture's seams sit unused.` | +|Right size|`CI runs dotnet test \|\| true, no test files exist anywhere, and the architecture's seams sit unused.` |🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/maturity-skill-ref/references/output-template.md` around lines 98 - 102, Escape the pipe characters inside the inline code spans so the table parser doesn't treat "||" as column separators; update the cell containing "CI runs dotnet test || true" to use escaped pipes (e.g., `dotnet test \|\| true`) and any other inline code with "||" so MD056/noise stops truncating content, leaving the rest of the text unchanged.src/services/maturity/interview.ts-106-114 (1)
106-114:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winNormalize smart-quote variants in
isUnknownAnswer().
I don’t knowwith a curly apostrophe currently returnsfalse, which is a realistic copy/paste or mobile-keyboard input. That would score the criterion from free text instead of marking it not assessed.Suggested fix
const UNKNOWN_TOKENS = new Set( ["i don't know", "i dont know", "unknown", "n/a", "na", "skip"].map((s) => s.toLowerCase().trim(), ), ); +function normalizeUnknownToken(value: string): string { + return value.trim().toLowerCase().replace(/[’]/g, "'"); +} + export function isUnknownAnswer(value: string): boolean { - return UNKNOWN_TOKENS.has(value.trim().toLowerCase()); + return UNKNOWN_TOKENS.has(normalizeUnknownToken(value)); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/maturity/interview.ts` around lines 106 - 114, The isUnknownAnswer check misses smart-quote variants like “I don’t know”; update normalization so it converts curly apostrophes/quotes to straight ones before trimming/lowering and checking UNKNOWN_TOKENS (i.e., in function isUnknownAnswer replace characters such as U+2019/U+2018 and other common smart quotes/backticks with "'" and optionally normalize other unicode apostrophe variants, then trim().toLowerCase() and call UNKNOWN_TOKENS.has on that normalized string); reference symbols: UNKNOWN_TOKENS and isUnknownAnswer.src/services/maturity/audit-writer.ts-72-160 (1)
72-160:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse
formatDateUTC()when displaying the audit date.Lines 75 and 155 display
artifact.auditDatedirectly without callingformatDateUTC(). Per coding guidelines, all display dates must useformatDateUTC()to prevent timezone shift. Convert the date string to a Date object when formatting:formatDateUTC(new Date(artifact.auditDate + 'T00:00:00Z')).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/maturity/audit-writer.ts` around lines 72 - 160, renderAuditMarkdown is inserting artifact.auditDate directly in two places (the title header and the footer) instead of formatting it via formatDateUTC; update both occurrences inside renderAuditMarkdown to call formatDateUTC(new Date(artifact.auditDate + 'T00:00:00Z')) so the displayed dates use UTC formatting and avoid timezone shifts (replace the raw artifact.auditDate in the title string and the footer string).share/skills/agent-maturity-assessment/INSTALL.md-7-16 (1)
7-16:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd language identifiers to fenced code blocks.
These fences are missing a language hint, which can trip markdown lint and reduce rendering consistency.
💡 Suggested patch
-``` +```text agent-maturity-assessment/ ├── SKILL.md ← entry point with frontmatter ├── INSTALL.md ← this file (delete after install) └── references/ ├── criteria.md ← 12-criterion rubric (full text) ├── interview.md ← 7 Phase-1 questions + Q→criterion mapping ├── output-template.md ← canonical audit template └── preflight.md ← evidence tiers + multi-repo handling@@
-+text
You: audit this repo's agent readinessAlso applies to: 38-40
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@share/skills/agent-maturity-assessment/INSTALL.md` around lines 7 - 16, The fenced code blocks in INSTALL.md (the directory tree block and the short "You: audit this repo's agent readiness" block and other similar fences around lines showing the example) lack language identifiers; update each triple-backtick fence to include a language hint (use "text" for the ASCII tree and short examples) so linting and rendering are consistent—e.g., change ``` to ```text for the directory tree block and the `You: ...` block and for the other fences referenced in the comment.README.md-283-285 (1)
283-285:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winSpecify a language for the new fenced block.
The added code fence should include a language token (e.g.,
text) to satisfy markdown lint rules.💡 Suggested patch
-``` +```text share/skills/agent-maturity-assessment/ ← copy this folder to ~/.claude/skills/</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In
@README.mdaround lines 283 - 285, Update the fenced code block containing
the line "share/skills/agent-maturity-assessment/ ← copy this folder to
~/.claude/skills/" to include a language token (e.g., changetotext) so
the block becomes a labeled code fence; modify the block around that exact
string to usetext at the opening fence and close withat the end.</details> </blockquote></details> <details> <summary>claude-plugin/skills/agent-maturity-assessment/SKILL.md-10-11 (1)</summary><blockquote> `10-11`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Remove Asana from maturity evidence-source wording.** This line currently implies maturity detectors pull from Asana, which conflicts with the documented assess pipeline (repo/GitHub evidence tiers + interview). That mismatch can set incorrect user expectations. <details> <summary>💡 Suggested patch</summary> ```diff -It scores a 12-criterion diagnostic across four weighted categories using a -hybrid pipeline: deterministic detectors gather evidence from the local repo / -GitHub / Asana, a Phase-1 interview captures the org-level signals that aren't +It scores a 12-criterion diagnostic across four weighted categories using a +hybrid pipeline: deterministic detectors gather evidence from the local repo / +GitHub, a Phase-1 interview captures the org-level signals that aren't visible in code, and an AI scorer (OpenAI Responses API + strict JSON schema) ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@claude-plugin/skills/agent-maturity-assessment/SKILL.md` around lines 10 - 11, Update the SKILL.md wording so it no longer lists Asana as an evidence source: locate the line containing "hybrid pipeline: deterministic detectors gather evidence from the local repo / GitHub / Asana, a Phase-1 interview captures the org-level signals that aren't" and remove "/ Asana" (or replace the segment with "local repo / GitHub") so the sentence matches the documented assess pipeline and retains the reference to the Phase-1 interview for org-level signals. ``` </details> </blockquote></details> <details> <summary>share/skills/agent-maturity-assessment/SKILL.md-122-122 (1)</summary><blockquote> `122-122`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Use the same audit path convention as the rest of the docs.** Line 122 points to `audits/CHANGELOG.md`, but this skill consistently stores audit artifacts under `docs/audits/…`. This inconsistency can send users to the wrong location. <details> <summary>📝 Proposed fix</summary> ```diff -As organizations mature and the AI tooling landscape shifts, expect items to be added, dropped, or re-weighted. Track changes to the assessment itself (not just individual audits) in an `audits/CHANGELOG.md` so historical scores remain interpretable. +As organizations mature and the AI tooling landscape shifts, expect items to be added, dropped, or re-weighted. Track changes to the assessment itself (not just individual audits) in a `docs/audits/CHANGELOG.md` so historical scores remain interpretable. ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@share/skills/agent-maturity-assessment/SKILL.md` at line 122, The reference to the changelog uses the wrong path string "audits/CHANGELOG.md"; update the text in SKILL.md so it points to the canonical audit location "docs/audits/CHANGELOG.md" (and sweep any other occurrences in this document to the "docs/audits/…" convention) to keep audit artifact paths consistent with the rest of the repo. ``` </details> </blockquote></details> <details> <summary>share/skills/agent-maturity-assessment/SKILL.md-85-94 (1)</summary><blockquote> `85-94`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Add a language hint to the fenced scoring block.** Line 85 uses an unlabeled fenced block, which triggers markdownlint `MD040`. <details> <summary>📝 Proposed fix</summary> ```diff -``` +```text A_total = sum(items 1–4) × 1.00 // max 4.00 B_total = sum(items 5–7) × 1.50 // max 4.50 C_total = sum(items 8–11) × 1.25 // max 5.00 D_total = sum(item 12) × 1.00 // max 1.00 ────────── weighted = A + B + C + D max = 14.50 score% = (weighted / 14.50) × 100 ``` ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In
@share/skills/agent-maturity-assessment/SKILL.mdaround lines 85 - 94, The
fenced scoring block containing the A_total/B_total/C_total/D_total equations is
unlabeled and triggers markdownlint MD040; fix it by adding a language hint to
the opening fence (e.g., change the openingtotext) for that scoring
block so the block is explicitly labeled as plain text—update the fenced block
around the scoring equations (the block that starts with A_total = ...)
accordingly.</details> </blockquote></details> <details> <summary>share/skills/agent-maturity-assessment/references/preflight.md-37-37 (1)</summary><blockquote> `37-37`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Tier-3 cap wording conflicts with current scorer behavior.** “unless filesystem evidence alone is sufficient” suggests a possible 1.0 at Tier 3, but implementation applies a hard 0.5 cap for items 2/3/9/11. Please align this sentence with actual behavior. As per coding guidelines: `src/services/maturity/ai-scorer.ts`: Tier-3 (git-only) audits must cap maturity items 2, 3, 9, 11 at 0.5 even when AI awards 1.0 — use `applyTier3Caps` in `ai-scorer.ts`. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@share/skills/agent-maturity-assessment/references/preflight.md` at line 37, The sentence currently implies Tier-3 items 2, 3, 9, and 11 can reach 1.0 if filesystem evidence alone suffices, but the scorer enforces a hard cap of 0.5; change the wording to state that Tier‑3 (git-only) audits always cap items 2, 3, 9, and 11 at 0.5 and cannot be auto-promoted to 1.0 (even when AI awards 1.0), and reference the enforcement in applyTier3Caps within ai-scorer.ts so the doc matches implementation. ``` </details> </blockquote></details> <details> <summary>docs/maturity-skill-ref/SKILL.md-61-70 (1)</summary><blockquote> `61-70`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Add a language identifier to the fenced code block.** This block is missing a language tag (`MD040`), which can trip markdown lint in stricter CI setups. <details> <summary>Suggested patch</summary> ```diff -``` +```text A_total = sum(items 1–4) × 1.00 // max 4.00 B_total = sum(items 5–7) × 1.50 // max 4.50 C_total = sum(items 8–11) × 1.25 // max 5.00 D_total = sum(item 12) × 1.00 // max 1.00 ────────── weighted = A + B + C + D max = 14.50 score% = (weighted / 14.50) × 100 ``` ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In
@docs/maturity-skill-ref/SKILL.mdaround lines 61 - 70, The fenced code block
containing the scoring formula is missing a language identifier (triggers
MD040); update the opening fence fromtotext (or another appropriate
language like ```none) so the block is explicitly tagged, e.g., change the
opening delimiter for the block that begins "A_total = sum(items 1–4)..." to
include the language identifier; no other content changes needed.</details> </blockquote></details> <details> <summary>tui/assess_coverage_test.go-469-473 (1)</summary><blockquote> `469-473`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Convert `TestMain_FlagPackageState` to a proper `TestMain(m *testing.M)` hook to ensure flag initialization.** The function is a regular test (not Go's `TestMain(m *testing.M)` hook), so it won't execute before tests and cannot guarantee flag state is initialized before `flag.Visit` is called in `tui/flags.go:48`. Additionally, it only references `flag.CommandLine` without actually parsing it, which doesn't satisfy the comment's intent. <details> <summary>Suggested fix</summary> ```diff -func TestMain_FlagPackageState(t *testing.T) { - _ = flag.CommandLine -} +func TestMain(m *testing.M) { + if !flag.Parsed() { + _ = flag.CommandLine.Parse([]string{}) + } + os.Exit(m.Run()) +} ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tui/assess_coverage_test.go` around lines 469 - 473, Replace the regular test TestMain_FlagPackageState with a real TestMain hook by renaming it to func TestMain(m *testing.M) and initializing the flag package there (e.g., call flag.CommandLine.Parse([]string{}) or flag.Parse() to ensure flags are set up), then call os.Exit(m.Run()); update imports to include "os" and remove the old TestMain_FlagPackageState test function. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🧹 Nitpick comments (4)</summary><blockquote> <details> <summary>tui/assess_config_test.go (1)</summary><blockquote> `117-137`: _⚡ Quick win_ **Add regression cases for `scope-mode=both` path requirements.** Given `both` mode semantics, this suite should assert that `Org`-only is invalid and `Org+LocalPath+DisplayName` is valid. <details> <summary>Suggested diff</summary> ```diff func TestHasMinimalAssessConfig(t *testing.T) { @@ if !hasMinimalAssessConfig(&AssessConfig{ Scope: AssessScope{Mode: "local-repo", LocalPath: "/foo", DisplayName: "foo"}, }) { t.Error("local-repo+path should be valid") } + if hasMinimalAssessConfig(&AssessConfig{ + Scope: AssessScope{Mode: "both", Org: "acme", DisplayName: "acme"}, + }) { + t.Error("both without local path should be invalid") + } + if !hasMinimalAssessConfig(&AssessConfig{ + Scope: AssessScope{Mode: "both", Org: "acme", LocalPath: "/foo", DisplayName: "acme"}, + }) { + t.Error("both with org+path should be valid") + } } ``` </details> Based on learnings: Every non-trivial source code change must include corresponding test additions or updates. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tui/assess_config_test.go` around lines 117 - 137, Add regression test cases to TestHasMinimalAssessConfig to cover scope Mode "both": call hasMinimalAssessConfig with an AssessConfig whose Scope is AssessScope{Mode: "both", Org: "acme"} and assert it returns false (Org-only invalid), and call hasMinimalAssessConfig with AssessConfig{Scope: AssessScope{Mode: "both", Org: "acme", LocalPath: "/foo", DisplayName: "acme"}} and assert it returns true (Org+LocalPath+DisplayName valid); refer to the existing TestHasMinimalAssessConfig, function hasMinimalAssessConfig, and types AssessConfig/AssessScope and fields Mode, Org, LocalPath, DisplayName when adding these assertions. ``` </details> </blockquote></details> <details> <summary>tests/unit/services/maturity/interview.spec.ts (1)</summary><blockquote> `9-45`: _⚡ Quick win_ **Pin the exact prompt text in this suite.** Right now a paraphrased question would still pass as long as the ids and options stayed the same. Since these prompts are supposed to remain verbatim, the tests should assert the exact `prompt` strings, not just the shape. As per coding guidelines "`src/services/maturity/interview.ts`: The 7 Phase-1 interview questions in `src/services/maturity/interview.ts` must be verbatim from `references/interview.md` — do not paraphrase". <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/services/maturity/interview.spec.ts` around lines 9 - 45, The test must assert the exact prompt text for each interview question instead of only checking ids/options: update the suite that references INTERVIEW_QUESTIONS (and optionally getQuestion) to compare INTERVIEW_QUESTIONS.map(q => q.prompt) (or getQuestion("qN").prompt) against an array of the seven verbatim prompt strings copied from references/interview.md so the test fails on any paraphrase; keep the existing checks (ids, options, allowFreeText, configHeading, FRAMING_MESSAGE) intact and replace or augment the "questions are in id order" test with a strict equality assertion of the exact prompt strings in the correct q1..q7 order. ``` </details> </blockquote></details> <details> <summary>tests/unit/services/maturity/adjacent-repos.spec.ts (1)</summary><blockquote> `5-24`: _⚡ Quick win_ **These tests won't catch parser regressions.** The first case only proves the function returns an array against the current repo, and the second only covers the trivial early return. Please add fixture-driven assertions for workflow `uses:`, Terraform sources, `.gitmodules`, README refs, and the owner/`.git` normalization rules so this detector is actually pinned down. Based on learnings "Every non-trivial source code change must include corresponding test additions or updates" and "Test only our code, not library dependencies — mock at boundaries and verify our logic handles responses/errors correctly". <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/services/maturity/adjacent-repos.spec.ts` around lines 5 - 24, The tests for detectAdjacentRepos are too shallow; add fixture-driven unit cases in tests/unit/services/maturity/adjacent-repos.spec.ts that load small repo fixtures (containing workflow files with uses:, Terraform module/source blocks, .gitmodules, README file refs, and repos with owner/.git URL variants) and assert detectAdjacentRepos returns the expected normalized repo descriptors (including detection of workflow uses, tf sources, submodule entries, README links, and normalization of owner vs owner/.git forms). Keep tests focused on our parsing logic by mocking file system/git-reading helpers used by detectAdjacentRepos (e.g., any fs/git helper functions invoked) so you control inputs and only verify our normalization/parsing code paths and error handling rather than external library behavior. ``` </details> </blockquote></details> <details> <summary>src/services/maturity/maturity.service.ts (1)</summary><blockquote> `59-60`: _⚡ Quick win_ **Use `formatDateUTC()` for the audit date.** This is a display/output date, so it should go through the shared date utility instead of open-coding `toISOString().slice(0, 10)`. As per coding guidelines, "Always format display dates with `formatDateUTC()` to prevent local-timezone shift". <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/maturity/maturity.service.ts` around lines 59 - 60, Replace the inline date construction in the run method (async run(input: AssessCommandInput): Promise<AssessResult>) that sets the today variable using new Date().toISOString().slice(0, 10) with a call to the shared formatDateUTC() utility so the audit/display date is properly formatted for UTC; update the imports in maturity.service.ts to import formatDateUTC from the shared date util if not already present and ensure the variable name today continues to hold the formatted string used elsewhere in the method. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.Inline comments:
In@docs/maturity-skill-ref/references/preflight.md:
- Around line 33-38: Remove the conditional exception in the Tier 3 scoring text
so the cap on items 2, 3, 9, and 11 is unconditional: delete the clause "unless
filesystem evidence alone is sufficient" in the Tier 3 bullet and change the
sentence to state that items 2, 3, 9, and 11 are capped at 0.5 at Tier 3
unconditionally (i.e., always cap those items at 0.5), while leaving the other
Tier 3 bullets (the Summary one-line take and Notes for re-audit) unchanged.In
@scripts/run-assess.ts:
- Around line 107-114: The current onProgress callback always emits status
"active", preventing phases from ever completing; update the onProgress wiring
so it forwards a status value from MaturityService (e.g.,
"active"|"complete"|"failed") to emitProgress instead of hardcoding "active":
change the passed callback when constructing MaturityService (in
scripts/run-assess.ts) from onProgress: (step, message) => emitProgress(step,
"active", message) to accept and forward a status (onProgress: (step, status,
message) => emitProgress(step, status, message)), and update
MaturityService/MaturityAIScorer call sites to pass the appropriate status for
phase transitions (preflight/interview/evidence/scoring/writing) so each phase
can emit "complete" when finished.- Around line 74-89: The code calls process.exit() immediately after emit(...)
which can truncate asynchronously buffered stdout; instead, set process.exitCode
= 1 (or 0 for success) and return so the process can flush stdout. Locate the
branches that call process.exit() after emit in the functions using
reader.nextLine(), the config parsing block that assigns configLine / input (and
the other four locations noted), replace each process.exit(...) with setting
process.exitCode appropriately and then return from the function (do not call
process.exit()), ensuring emit(...) has time to flush.In
@src/services/maturity/adjacent-repos.ts:
- Around line 64-69: Update the regex used in the content.matchAll call to
recognize common Terraform module source syntaxes (e.g.
git::https://github.com/org/repo.git//path?ref=..., git@github.com:org/repo.git,
https://github.com/org/repo, and with optional .git and //subpath or ?ref) so
those sources capture the owner/repo correctly; keep using OWNER_REPO to
validate the captured group and call addRepo(found, m[1], m[2], "Terraform
module source") as before. In short: broaden the matchAll pattern to accept
optional prefixes (git::, ssh/git@, https?://), both ":" and "/" host
separators, optional .git, and optional subpath/query tail so the same loop and
OWNER_REPO check will find and add adjacent repos for those real-world Terraform
source forms.- Around line 101-109: addRepo currently stores raw owner/name values and lets
inputs like "org/repo.git" bypass normalization and STDLIB_OWNERS filtering;
update addRepo to canonicalize the repo id by trimming whitespace, removing any
trailing ".git", extracting only the owner and repo segments from possible path
inputs, lowercasing both owner and repo, then recompose key =${owner}/${name}
and skip adding if owner is in STDLIB_OWNERS or if the normalized key already
exists in map; refer to function addRepo and the map parameter and STDLIB_OWNERS
constant when making the change.In
@src/services/maturity/ai-scorer.ts:
- Around line 71-90: The ensureAllItems function can leave duplicate ItemScore
entries from the AI response which skews scoring; change the logic to
deduplicate by itemId (use a Map<number, ItemScore> or object) when consuming
the incoming items so only one score per RUBRIC_ITEMS id is retained (choose
first or last consistently), then iterate RUBRIC_ITEMS to build the final filled
array in the canonical order: if an id exists in the deduped map use it,
otherwise push the default "n/a" ItemScore and record the id in missing; finally
return the ordered items array and missing list so the result contains exactly
one entry per RUBRIC_ITEMS id and no duplicates.In
@src/services/maturity/audit-store.ts:
- Around line 118-130: The parsed JSON is assumed to be Record<string, string>
but non-string values (e.g., numbers or null) can slip through and later cause
renderConfigMd().trim() to throw; in the block that iterates parsed entries
(where readFile, JSON.parse, getQuestion, and answers: InterviewAnswer[] are
used) check the runtime type of value before constructing InterviewAnswer: only
push entries where typeof value === "string" (or coerce to a string explicitly
if that is acceptable), otherwise skip or log the invalid question id so
non-string values are not stored and renderConfigMd() receives only strings.In
@src/services/maturity/maturity.service.ts:
- Around line 168-195: collectInterviewAnswers currently only uses
readPriorAnswers for headless runs so interactive re-audits still prompt all
INTERVIEW_QUESTIONS; change the interactive path to consult the byId map (built
from readPriorAnswers()) before calling this.interview.ask: for each question q,
if byId.has(q.id) push the stored InterviewAnswer (normalizing unknown via
isUnknownAnswer) and skip this.interview.ask, otherwise ask and push the
returned answer; ensure you still call this.interview.frame(FRAMING_MESSAGE)
before any interactive asks and preserve existing normalization of "I don't
know" answers.In
@src/services/maturity/preflight.ts:
- Around line 39-46: The ghIsAuthenticated function can hang if the spawned "gh
auth status" never returns; modify ghIsAuthenticated to enforce a timeout (e.g.,
a few seconds) that treats the check as unauthenticated on expiry: when spawning
the process (the child returned by spawn in ghIsAuthenticated), start a timer
that on expiration kills the child, clears listeners, and resolves false; ensure
the normal child.on("error") and child.on("close") handlers clear the timeout
and resolve appropriately (resolve true only for code === 0, false otherwise) to
avoid leaks and races.In
@tui/assess_flags.go:
- Around line 107-115: The check in hasMinimalAssessConfig currently treats
cfg.Scope.Mode == "both" the same as "org" and only verifies cfg.Scope.Org;
update hasMinimalAssessConfig so that when cfg.Scope.Mode is "both" it requires
both non-empty cfg.Scope.Org and non-empty cfg.Scope.LocalPath (i.e., trim and
validate cfg.Scope.LocalPath as you do elsewhere), or alternately split the
cases so "both" explicitly validates both cfg.Scope.Org and cfg.Scope.LocalPath;
refer to the cfg.Scope.Mode switch and the cfg.Scope.Org / cfg.Scope.LocalPath
fields to locate where to add the additional validation.In
@tui/assess_runner.go:
- Around line 47-60: AssessRunResult.Close currently only runs closeFns (which
just EOFs stdin) but must also stop the spawned subprocess; add a field to
AssessRunResult to hold the exec.Cmd or its *os.Process (e.g., Cmd *exec.Cmd or
Proc *os.Process) and update Close to first interrupt/kill the process (try
Process.Signal(os.Interrupt) then Process.Kill if needed) and wait for it to
exit before running closeFns; ensure you handle nil checks and avoid
double-closing, and update any code that constructs AssessRunResult to populate
the new Cmd/Proc field.- Around line 117-138: The loop currently swallows JSON parse failures and
scanner errors; change the scanner loop in which you read from stdoutPipe
(scanner, scanner.Scan(), scanner.Text()) so that any json.Unmarshal error for
GenericEvent is sent to errCh and the loop breaks (stop scanning) instead of
continuing, and after the loop check scanner.Err() and if non-nil send that
error to errCh as well; keep the existing cmd.Wait() handling for process exit
but ensure you do not double-report the same error (i.e., only send
parse/scanner errors via errCh and then return/stop before calling the
wait-error branch if appropriate).In
@tui/assess_wizard.go:
- Around line 434-446: validateLocalPath currently allows any directory; update
it to reject non-Git repos by verifying repo-ness before returning nil. Inside
validateLocalPath, after the existing directory checks, run a lightweight repo
check—preferably exec.Command("git","rev-parse","--is-inside-work-tree") and
treat non-zero exit as not-a-repo (or fallback to checking for a ".git" entry
via os.Stat(filepath.Join(trimmed, ".git"))); return a clear error like "path is
not a git repository: " when the check fails so the wizard rejects plain
folders early.- Around line 50-66: The loop over runner.Errors can block preventing the
deferred runner.Close() and the cancellation path from running; before iterating
runner.Errors in assess_wizard.go check progress.Cancelled (and handle
progress.ErrorMsg) and exit early, or use a non-blocking/select pattern that
listens for a cancellation signal (progress.Cancelled) alongside reading from
runner.Errors so you can break and call runner.Close() promptly; reference
runner.Errors, runner.Close(), progress.Cancelled, progress.ErrorMsg and
RenderError when making the change.
Minor comments:
In @.env.schema:
- Around line 45-47: The MATURITY_AI_MODEL key in .env.schema is out of order
and triggers dotenv-linter's UnorderedKey; move the MATURITY_AI_MODEL entry so
it appears in the expected order (alphabetical/grouping consistent with the
surrounding keys) relative to VISIBLE_WINS_AI_MODEL and other AI model vars,
ensuring the comment and@type=string line stay with the key (i.e., cut the
MATURITY_AI_MODEL block and paste it into the correct position to satisfy
dotenv-linter).In
@claude-plugin/skills/agent-maturity-assessment/SKILL.md:
- Around line 10-11: Update the SKILL.md wording so it no longer lists Asana as
an evidence source: locate the line containing "hybrid pipeline: deterministic
detectors gather evidence from the local repo / GitHub / Asana, a Phase-1
interview captures the org-level signals that aren't" and remove "/ Asana" (or
replace the segment with "local repo / GitHub") so the sentence matches the
documented assess pipeline and retains the reference to the Phase-1 interview
for org-level signals.In
@docs/maturity-skill-ref/references/output-template.md:
- Around line 98-102: Escape the pipe characters inside the inline code spans so
the table parser doesn't treat "||" as column separators; update the cell
containing "CI runs dotnet test || true" to use escaped pipes (e.g.,dotnet test \|\| true) and any other inline code with "||" so MD056/noise stops
truncating content, leaving the rest of the text unchanged.In
@docs/maturity-skill-ref/SKILL.md:
- Around line 61-70: The fenced code block containing the scoring formula is
missing a language identifier (triggers MD040); update the opening fence from
totext (or another appropriate language like ```none) so the block is
explicitly tagged, e.g., change the opening delimiter for the block that begins
"A_total = sum(items 1–4)..." to include the language identifier; no other
content changes needed.In
@README.md:
- Around line 283-285: Update the fenced code block containing the line
"share/skills/agent-maturity-assessment/ ← copy this folder to
~/.claude/skills/" to include a language token (e.g., changetotext) so
the block becomes a labeled code fence; modify the block around that exact
string to usetext at the opening fence and close withat the end.In
@share/skills/agent-maturity-assessment/INSTALL.md:
- Around line 7-16: The fenced code blocks in INSTALL.md (the directory tree
block and the short "You: audit this repo's agent readiness" block and other
similar fences around lines showing the example) lack language identifiers;
update each triple-backtick fence to include a language hint (use "text" for the
ASCII tree and short examples) so linting and rendering are consistent—e.g.,
changetotext for the directory tree block and theYou: ...block and
for the other fences referenced in the comment.In
@share/skills/agent-maturity-assessment/references/preflight.md:
- Line 37: The sentence currently implies Tier-3 items 2, 3, 9, and 11 can reach
1.0 if filesystem evidence alone suffices, but the scorer enforces a hard cap of
0.5; change the wording to state that Tier‑3 (git-only) audits always cap items
2, 3, 9, and 11 at 0.5 and cannot be auto-promoted to 1.0 (even when AI awards
1.0), and reference the enforcement in applyTier3Caps within ai-scorer.ts so the
doc matches implementation.In
@share/skills/agent-maturity-assessment/SKILL.md:
- Line 122: The reference to the changelog uses the wrong path string
"audits/CHANGELOG.md"; update the text in SKILL.md so it points to the canonical
audit location "docs/audits/CHANGELOG.md" (and sweep any other occurrences in
this document to the "docs/audits/…" convention) to keep audit artifact paths
consistent with the rest of the repo.- Around line 85-94: The fenced scoring block containing the
A_total/B_total/C_total/D_total equations is unlabeled and triggers markdownlint
MD040; fix it by adding a language hint to the opening fence (e.g., change the
openingtotext) for that scoring block so the block is explicitly
labeled as plain text—update the fenced block around the scoring equations (the
block that starts with A_total = ...) accordingly.In
@src/services/maturity/audit-writer.ts:
- Around line 72-160: renderAuditMarkdown is inserting artifact.auditDate
directly in two places (the title header and the footer) instead of formatting
it via formatDateUTC; update both occurrences inside renderAuditMarkdown to call
formatDateUTC(new Date(artifact.auditDate + 'T00:00:00Z')) so the displayed
dates use UTC formatting and avoid timezone shifts (replace the raw
artifact.auditDate in the title string and the footer string).In
@src/services/maturity/interview.ts:
- Around line 106-114: The isUnknownAnswer check misses smart-quote variants
like “I don’t know”; update normalization so it converts curly
apostrophes/quotes to straight ones before trimming/lowering and checking
UNKNOWN_TOKENS (i.e., in function isUnknownAnswer replace characters such as
U+2019/U+2018 and other common smart quotes/backticks with "'" and optionally
normalize other unicode apostrophe variants, then trim().toLowerCase() and call
UNKNOWN_TOKENS.has on that normalized string); reference symbols: UNKNOWN_TOKENS
and isUnknownAnswer.In
@tui/assess_coverage_test.go:
- Around line 469-473: Replace the regular test TestMain_FlagPackageState with a
real TestMain hook by renaming it to func TestMain(m *testing.M) and
initializing the flag package there (e.g., call
flag.CommandLine.Parse([]string{}) or flag.Parse() to ensure flags are set up),
then call os.Exit(m.Run()); update imports to include "os" and remove the old
TestMain_FlagPackageState test function.
Nitpick comments:
In@src/services/maturity/maturity.service.ts:
- Around line 59-60: Replace the inline date construction in the run method
(async run(input: AssessCommandInput): Promise) that sets the
today variable using new Date().toISOString().slice(0, 10) with a call to the
shared formatDateUTC() utility so the audit/display date is properly formatted
for UTC; update the imports in maturity.service.ts to import formatDateUTC from
the shared date util if not already present and ensure the variable name today
continues to hold the formatted string used elsewhere in the method.In
@tests/unit/services/maturity/adjacent-repos.spec.ts:
- Around line 5-24: The tests for detectAdjacentRepos are too shallow; add
fixture-driven unit cases in tests/unit/services/maturity/adjacent-repos.spec.ts
that load small repo fixtures (containing workflow files with uses:, Terraform
module/source blocks, .gitmodules, README file refs, and repos with owner/.git
URL variants) and assert detectAdjacentRepos returns the expected normalized
repo descriptors (including detection of workflow uses, tf sources, submodule
entries, README links, and normalization of owner vs owner/.git forms). Keep
tests focused on our parsing logic by mocking file system/git-reading helpers
used by detectAdjacentRepos (e.g., any fs/git helper functions invoked) so you
control inputs and only verify our normalization/parsing code paths and error
handling rather than external library behavior.In
@tests/unit/services/maturity/interview.spec.ts:
- Around line 9-45: The test must assert the exact prompt text for each
interview question instead of only checking ids/options: update the suite that
references INTERVIEW_QUESTIONS (and optionally getQuestion) to compare
INTERVIEW_QUESTIONS.map(q => q.prompt) (or getQuestion("qN").prompt) against an
array of the seven verbatim prompt strings copied from references/interview.md
so the test fails on any paraphrase; keep the existing checks (ids, options,
allowFreeText, configHeading, FRAMING_MESSAGE) intact and replace or augment the
"questions are in id order" test with a strict equality assertion of the exact
prompt strings in the correct q1..q7 order.In
@tui/assess_config_test.go:
- Around line 117-137: Add regression test cases to TestHasMinimalAssessConfig
to cover scope Mode "both": call hasMinimalAssessConfig with an AssessConfig
whose Scope is AssessScope{Mode: "both", Org: "acme"} and assert it returns
false (Org-only invalid), and call hasMinimalAssessConfig with
AssessConfig{Scope: AssessScope{Mode: "both", Org: "acme", LocalPath: "/foo",
DisplayName: "acme"}} and assert it returns true (Org+LocalPath+DisplayName
valid); refer to the existing TestHasMinimalAssessConfig, function
hasMinimalAssessConfig, and types AssessConfig/AssessScope and fields Mode, Org,
LocalPath, DisplayName when adding these assertions.</details> <details> <summary>🪄 Autofix (Beta)</summary> Fix all unresolved CodeRabbit comments on this PR: - [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended) - [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes </details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Organization UI **Review profile**: CHILL **Plan**: Pro Plus **Run ID**: `7a6f5c98-028b-43ab-9f8e-8dad2e5f7521` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between ecf24fe046d80f9c3c76fe316f6ef064bb24b47d and 99dc3a9df5a768748ca0c5da1ac073a86fe86461. </details> <details> <summary>📒 Files selected for processing (65)</summary> * `.env.schema` * `CLAUDE.md` * `README.md` * `claude-plugin/skills/agent-maturity-assessment/SKILL.md` * `docs/2026-05-03-agent-maturity-assessment-plan.md` * `docs/ARCHITECTURE.md` * `docs/MATURITY_ASSESSMENT.md` * `docs/maturity-skill-ref/SKILL.md` * `docs/maturity-skill-ref/references/criteria.md` * `docs/maturity-skill-ref/references/interview.md` * `docs/maturity-skill-ref/references/output-template.md` * `docs/maturity-skill-ref/references/preflight.md` * `justfile` * `scripts/run-assess.ts` * `share/skills/agent-maturity-assessment/INSTALL.md` * `share/skills/agent-maturity-assessment/SKILL.md` * `share/skills/agent-maturity-assessment/references/criteria.md` * `share/skills/agent-maturity-assessment/references/interview.md` * `share/skills/agent-maturity-assessment/references/output-template.md` * `share/skills/agent-maturity-assessment/references/preflight.md` * `src/cli/index.ts` * `src/core/types.ts` * `src/services/maturity/adjacent-repos.ts` * `src/services/maturity/ai-scorer.ts` * `src/services/maturity/audit-store.ts` * `src/services/maturity/audit-writer.ts` * `src/services/maturity/evidence-collectors.ts` * `src/services/maturity/fs-utils.ts` * `src/services/maturity/interview.ts` * `src/services/maturity/maturity-prompts.ts` * `src/services/maturity/maturity.service.ts` * `src/services/maturity/preflight.ts` * `src/services/maturity/rubric.ts` * `src/services/maturity/scoring.ts` * `src/services/maturity/stdin-interview.ts` * `src/services/maturity/types.ts` * `tests/fixtures/maturity/teamhero-answers.json` * `tests/integration/maturity-end-to-end.spec.ts` * `tests/unit/services/maturity/adjacent-repos.spec.ts` * `tests/unit/services/maturity/audit-store.spec.ts` * `tests/unit/services/maturity/audit-writer.spec.ts` * `tests/unit/services/maturity/evidence-collectors.spec.ts` * `tests/unit/services/maturity/interview.spec.ts` * `tests/unit/services/maturity/maturity-prompts.spec.ts` * `tests/unit/services/maturity/rubric.spec.ts` * `tests/unit/services/maturity/scoring.spec.ts` * `tests/unit/services/maturity/stdin-interview.spec.ts` * `tui/assess.go` * `tui/assess_config.go` * `tui/assess_config_test.go` * `tui/assess_coverage_test.go` * `tui/assess_flags.go` * `tui/assess_preview.go` * `tui/assess_preview_test.go` * `tui/assess_progress.go` * `tui/assess_progress_test.go` * `tui/assess_protocol.go` * `tui/assess_runner.go` * `tui/assess_runner_test.go` * `tui/assess_summary.go` * `tui/assess_summary_test.go` * `tui/assess_wizard.go` * `tui/assess_wizard_test.go` * `tui/main.go` * `tui/protocol.go` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
| **At Tier 3, the audit MUST:** | ||
|
|
||
| - State “Tier 3 (git-only) audit — limited GitHub-side evidence” in the Summary’s *One-line take*. | ||
| - Add an entry to *Notes for re-audit* listing which items were scored against fallback evidence and what to re-verify when running at Tier 1. | ||
| - Never auto-promote a Tier 3 score to 1.0 on items 2, 3, 9, or 11 — the missing GitHub-side data could pull them down. Cap those at 0.5 unless filesystem evidence alone is sufficient. | ||
|
|
There was a problem hiding this comment.
Remove the Tier-3 cap exception to keep scoring deterministic.
Line 37 introduces an exception (unless filesystem evidence alone is sufficient) that weakens the hard-cap rule and can produce inconsistent Tier-3 scoring across auditors. Keep items 2, 3, 9, and 11 capped at 0.5 unconditionally at Tier 3.
🔧 Proposed wording fix
-- Never auto-promote a Tier 3 score to 1.0 on items 2, 3, 9, or 11 — the missing GitHub-side data could pull them down. Cap those at 0.5 unless filesystem evidence alone is sufficient.
+- Never auto-promote a Tier 3 score to 1.0 on items 2, 3, 9, or 11 — the missing GitHub-side data could pull them down. Cap those at 0.5.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/maturity-skill-ref/references/preflight.md` around lines 33 - 38, Remove
the conditional exception in the Tier 3 scoring text so the cap on items 2, 3,
9, and 11 is unconditional: delete the clause "unless filesystem evidence alone
is sufficient" in the Tier 3 bullet and change the sentence to state that items
2, 3, 9, and 11 are capped at 0.5 at Tier 3 unconditionally (i.e., always cap
those items at 0.5), while leaving the other Tier 3 bullets (the Summary
one-line take and Notes for re-audit) unchanged.
| const configLine = await reader.nextLine(); | ||
| if (!configLine) { | ||
| emit({ type: "error", message: "No config received on stdin" }); | ||
| process.exit(1); | ||
| } | ||
|
|
||
| let input: AssessCommandInput; | ||
| try { | ||
| input = JSON.parse(configLine) as AssessCommandInput; | ||
| } catch (err) { | ||
| emit({ | ||
| type: "error", | ||
| message: `Failed to parse config JSON: ${(err as Error).message}`, | ||
| }); | ||
| process.exit(1); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does process.exit()in Node.js or Bun risk truncating pendingprocess.stdout.write() output when stdout is piped?
💡 Result:
Yes—calling process.exit() can truncate pending process.stdout.write() output when stdout is piped/redirected. In Node.js, process.exit() “forces the process to exit as quickly as possible” even if there are still asynchronous operations pending, explicitly including “I/O operations to process.stdout and process.stderr”; Node docs further explain that process.stdout writes can be asynchronous across multiple event-loop ticks, so process.exit() can occur before those writes complete [1]. This is consistent with long-standing Node issue reports about buffered/asynchronous stdio not being flushed when process.exit() is called immediately after process.stdout.write() [2][3]. For Bun: the provided sources I found cover Node’s process.exit() flushing behavior and Bun’s stdio/write infrastructure generally, but they do not provide an authoritative, explicit statement equivalent to Node’s docs about whether process.exit() truncates pending process.stdout.write() when stdout is piped. So I can’t confidently answer the Bun-specific part from the evidence gathered. Practical guidance: in Node, prefer setting process.exitCode (and let the process exit naturally) rather than calling process.exit() directly, since docs call out the truncation risk with pending stdout/stderr writes [1].
Citations:
- 1: https://nodejs.org/api/process.html
- 2: Stdout is not completely flushed on process exit nodejs/node#2972
- 3: stdio buffered writes (chunked) issues & process.exit() truncation nodejs/node#6456
🏁 Script executed:
# First, locate and examine the run-assess.ts file
fd -e ts -e js | grep -i "run-assess" | head -20Repository: asabaylus/teamhero.cli
Length of output: 88
🏁 Script executed:
# Read the full file to examine the implementation
cat -n scripts/run-assess.tsRepository: asabaylus/teamhero.cli
Length of output: 5898
🏁 Script executed:
# Verify the emit function and confirm stdout.write behavior
sed -n '40,44p' scripts/run-assess.tsRepository: asabaylus/teamhero.cli
Length of output: 231
Don't call process.exit() immediately after writing protocol events.
The emit() function writes directly to process.stdout via process.stdout.write(), which is asynchronous/buffered. These branches exit synchronously right after sending error/result lines, which can truncate the very events the Go TUI is waiting for before stdout flushes. Set process.exitCode and return instead so stdout has a chance to complete. This applies at five locations: lines 77, 88, 141, 145, and 152.
Suggested fix
if (!configLine) {
emit({ type: "error", message: "No config received on stdin" });
- process.exit(1);
+ process.exitCode = 1;
+ return;
}
@@
} catch (err) {
emit({
type: "error",
message: `Failed to parse config JSON: ${(err as Error).message}`,
});
- process.exit(1);
+ process.exitCode = 1;
+ return;
}
@@
emit({
type: "result",
outputPath: result.outputPath,
@@
data: result.artifact as unknown as Record<string, unknown>,
});
- process.exit(0);
+ return;
} catch (err) {
emit({ type: "error", message: (err as Error).message });
consola.error(err);
- process.exit(1);
+ process.exitCode = 1;
+ return;
}
}
main().catch((err) => {
emit({ type: "error", message: (err as Error).message });
consola.error(err);
- process.exit(1);
+ process.exitCode = 1;
});🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/run-assess.ts` around lines 74 - 89, The code calls process.exit()
immediately after emit(...) which can truncate asynchronously buffered stdout;
instead, set process.exitCode = 1 (or 0 for success) and return so the process
can flush stdout. Locate the branches that call process.exit() after emit in the
functions using reader.nextLine(), the config parsing block that assigns
configLine / input (and the other four locations noted), replace each
process.exit(...) with setting process.exitCode appropriately and then return
from the function (do not call process.exit()), ensuring emit(...) has time to
flush.
| const scorer = new MaturityAIScorer({ dryRun: input.dryRun ?? false }); | ||
| const service = new MaturityService({ | ||
| logger, | ||
| scorer, | ||
| ...(interview ? { interview } : {}), | ||
| ...(auditStore ? { auditStore } : {}), | ||
| onProgress: (step, message) => emitProgress(step, "active", message), | ||
| }); |
There was a problem hiding this comment.
Intermediate steps never complete in the protocol.
This callback always emits status: "active", so preflight/interview/evidence/scoring/writing never transition to complete. The Go progress model will keep old rows spinning and its ratio will never reflect a fully finished run. This needs a status-bearing callback or explicit complete emits per phase.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/run-assess.ts` around lines 107 - 114, The current onProgress
callback always emits status "active", preventing phases from ever completing;
update the onProgress wiring so it forwards a status value from MaturityService
(e.g., "active"|"complete"|"failed") to emitProgress instead of hardcoding
"active": change the passed callback when constructing MaturityService (in
scripts/run-assess.ts) from onProgress: (step, message) => emitProgress(step,
"active", message) to accept and forward a status (onProgress: (step, status,
message) => emitProgress(step, status, message)), and update
MaturityService/MaturityAIScorer call sites to pass the appropriate status for
phase transitions (preflight/interview/evidence/scoring/writing) so each phase
can emit "complete" when finished.
| const matches = content.matchAll( | ||
| /source\s*=\s*"(?:git::|github\.com\/)([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/g, | ||
| ); | ||
| for (const match of matches) { | ||
| const m = OWNER_REPO.exec(match[1]); | ||
| if (m) addRepo(found, m[1], m[2], "Terraform module source"); |
There was a problem hiding this comment.
Handle the common Terraform GitHub source forms.
The current pattern only matches github.com/org/repo or git::org/repo. Standard module sources such as git::https://github.com/org/repo.git//path?ref=v1 or git@github.com:org/repo.git never match, so adjacent repos are silently missed for many real Terraform/OpenTofu repos.
Suggested fix
const matches = content.matchAll(
- /source\s*=\s*"(?:git::|github\.com\/)([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/g,
+ /source\s*=\s*"(?:(?:git::)?(?:https?:\/\/|ssh:\/\/git@|git@)?github\.com[:/])([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/g,
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const matches = content.matchAll( | |
| /source\s*=\s*"(?:git::|github\.com\/)([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/g, | |
| ); | |
| for (const match of matches) { | |
| const m = OWNER_REPO.exec(match[1]); | |
| if (m) addRepo(found, m[1], m[2], "Terraform module source"); | |
| const matches = content.matchAll( | |
| /source\s*=\s*"(?:(?:git::)?(?:https?:\/\/|ssh:\/\/git@|git@)?github\.com[:/])([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/g, | |
| ); | |
| for (const match of matches) { | |
| const m = OWNER_REPO.exec(match[1]); | |
| if (m) addRepo(found, m[1], m[2], "Terraform module source"); |
🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 68-68: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.
(coderabbit.command-injection.exec-js)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/services/maturity/adjacent-repos.ts` around lines 64 - 69, Update the
regex used in the content.matchAll call to recognize common Terraform module
source syntaxes (e.g. git::https://github.com/org/repo.git//path?ref=...,
git@github.com:org/repo.git, https://github.com/org/repo, and with optional .git
and //subpath or ?ref) so those sources capture the owner/repo correctly; keep
using OWNER_REPO to validate the captured group and call addRepo(found, m[1],
m[2], "Terraform module source") as before. In short: broaden the matchAll
pattern to accept optional prefixes (git::, ssh/git@, https?://), both ":" and
"/" host separators, optional .git, and optional subpath/query tail so the same
loop and OWNER_REPO check will find and add adjacent repos for those real-world
Terraform source forms.
| function addRepo( | ||
| map: Map<string, AdjacentRepo>, | ||
| owner: string, | ||
| name: string, | ||
| reason: string, | ||
| ): void { | ||
| const key = `${owner}/${name}`.toLowerCase(); | ||
| if (map.has(key)) return; | ||
| map.set(key, { owner, name, reason }); |
There was a problem hiding this comment.
Canonicalize and filter repo ids inside addRepo().
addRepo() stores raw owner/name values. That means Terraform and .gitmodules inputs like org/repo.git are kept as repo.git, and those same paths also bypass STDLIB_OWNERS. Since adjacentRepos is emitted in the audit and fed back into scoring context, these false positives skew the assessment.
Suggested fix
function addRepo(
map: Map<string, AdjacentRepo>,
owner: string,
name: string,
reason: string,
): void {
- const key = `${owner}/${name}`.toLowerCase();
+ const normalizedOwner = owner.trim();
+ const normalizedName = name.trim().replace(/\.git$/i, "");
+ if (STDLIB_OWNERS.has(normalizedOwner.toLowerCase())) return;
+ const key = `${normalizedOwner}/${normalizedName}`.toLowerCase();
if (map.has(key)) return;
- map.set(key, { owner, name, reason });
+ map.set(key, {
+ owner: normalizedOwner,
+ name: normalizedName,
+ reason,
+ });
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/services/maturity/adjacent-repos.ts` around lines 101 - 109, addRepo
currently stores raw owner/name values and lets inputs like "org/repo.git"
bypass normalization and STDLIB_OWNERS filtering; update addRepo to canonicalize
the repo id by trimming whitespace, removing any trailing ".git", extracting
only the owner and repo segments from possible path inputs, lowercasing both
owner and repo, then recompose key = `${owner}/${name}` and skip adding if owner
is in STDLIB_OWNERS or if the normalized key already exists in map; refer to
function addRepo and the map parameter and STDLIB_OWNERS constant when making
the change.
| switch cfg.Scope.Mode { | ||
| case "org", "both": | ||
| if strings.TrimSpace(cfg.Scope.Org) == "" { | ||
| return false | ||
| } | ||
| case "local-repo": | ||
| if strings.TrimSpace(cfg.Scope.LocalPath) == "" { | ||
| return false | ||
| } |
There was a problem hiding this comment.
Require LocalPath when scope-mode is both.
hasMinimalAssessConfig currently accepts both with just Org, but your flag contract says --path is required for both. This can incorrectly bypass interactive completion and pass an incomplete config downstream.
Suggested diff
func hasMinimalAssessConfig(cfg *AssessConfig) bool {
if cfg == nil {
return false
}
switch cfg.Scope.Mode {
- case "org", "both":
+ case "org":
if strings.TrimSpace(cfg.Scope.Org) == "" {
return false
}
+ case "both":
+ if strings.TrimSpace(cfg.Scope.Org) == "" {
+ return false
+ }
+ if strings.TrimSpace(cfg.Scope.LocalPath) == "" {
+ return false
+ }
case "local-repo":
if strings.TrimSpace(cfg.Scope.LocalPath) == "" {
return false
}
default:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess_flags.go` around lines 107 - 115, The check in
hasMinimalAssessConfig currently treats cfg.Scope.Mode == "both" the same as
"org" and only verifies cfg.Scope.Org; update hasMinimalAssessConfig so that
when cfg.Scope.Mode is "both" it requires both non-empty cfg.Scope.Org and
non-empty cfg.Scope.LocalPath (i.e., trim and validate cfg.Scope.LocalPath as
you do elsewhere), or alternately split the cases so "both" explicitly validates
both cfg.Scope.Org and cfg.Scope.LocalPath; refer to the cfg.Scope.Mode switch
and the cfg.Scope.Org / cfg.Scope.LocalPath fields to locate where to add the
additional validation.
| type AssessRunResult struct { | ||
| Events <-chan GenericEvent | ||
| Errors <-chan error | ||
| Stderr *bytes.Buffer | ||
| StdinW io.WriteCloser | ||
| closeFns []func() | ||
| } | ||
|
|
||
| // Close cleans up the stdin writer if not already closed. | ||
| func (r *AssessRunResult) Close() { | ||
| for _, fn := range r.closeFns { | ||
| fn() | ||
| } | ||
| } |
There was a problem hiding this comment.
Close() needs to stop the subprocess, not just close stdin.
Right now cleanup only EOFs stdin. If the service is in a non-interview phase, it can keep running evidence/AI work after the TUI exits. Store the spawned exec.Cmd (or at least its Process) and interrupt/kill it during cleanup.
Also applies to: 141-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess_runner.go` around lines 47 - 60, AssessRunResult.Close currently
only runs closeFns (which just EOFs stdin) but must also stop the spawned
subprocess; add a field to AssessRunResult to hold the exec.Cmd or its
*os.Process (e.g., Cmd *exec.Cmd or Proc *os.Process) and update Close to first
interrupt/kill the process (try Process.Signal(os.Interrupt) then Process.Kill
if needed) and wait for it to exit before running closeFns; ensure you handle
nil checks and avoid double-closing, and update any code that constructs
AssessRunResult to populate the new Cmd/Proc field.
| scanner := bufio.NewScanner(stdoutPipe) | ||
| scanner.Buffer(make([]byte, 0, 256*1024), 4*1024*1024) // 4MB max line for full audit JSON | ||
|
|
||
| for scanner.Scan() { | ||
| line := strings.TrimSpace(scanner.Text()) | ||
| if line == "" { | ||
| continue | ||
| } | ||
| var evt GenericEvent | ||
| if err := json.Unmarshal([]byte(line), &evt); err != nil { | ||
| continue | ||
| } | ||
| eventCh <- evt | ||
| } | ||
|
|
||
| if err := cmd.Wait(); err != nil { | ||
| if exitErr, ok := err.(*exec.ExitError); ok { | ||
| errCh <- fmt.Errorf("assess runner exited with code %d", exitErr.ExitCode()) | ||
| } else { | ||
| errCh <- fmt.Errorf("assess runner error: %w", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
Don't silently swallow protocol parse failures.
Both malformed JSON lines and scanner.Err() are ignored here. That turns runner/protocol regressions into blank exits with no surfaced error. Send the parse/scanner error to errCh and stop scanning.
Suggested fix
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" {
continue
}
var evt GenericEvent
if err := json.Unmarshal([]byte(line), &evt); err != nil {
- continue
+ errCh <- fmt.Errorf("failed to decode assess event: %w", err)
+ return
}
eventCh <- evt
}
+
+ if err := scanner.Err(); err != nil {
+ errCh <- fmt.Errorf("failed to read assess runner output: %w", err)
+ return
+ }
if err := cmd.Wait(); err != nil {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess_runner.go` around lines 117 - 138, The loop currently swallows
JSON parse failures and scanner errors; change the scanner loop in which you
read from stdoutPipe (scanner, scanner.Scan(), scanner.Text()) so that any
json.Unmarshal error for GenericEvent is sent to errCh and the loop breaks (stop
scanning) instead of continuing, and after the loop check scanner.Err() and if
non-nil send that error to errCh as well; keep the existing cmd.Wait() handling
for process exit but ensure you do not double-report the same error (i.e., only
send parse/scanner errors via errCh and then return/stop before calling the
wait-error branch if appropriate).
| for runErr := range runner.Errors { | ||
| if runErr != nil { | ||
| if runner.Stderr != nil && runner.Stderr.Len() > 0 { | ||
| fmt.Fprintln(os.Stderr, runner.Stderr.String()) | ||
| } | ||
| return runErr | ||
| } | ||
| } | ||
|
|
||
| if progress.Cancelled { | ||
| fmt.Fprintln(os.Stderr, "\nAssessment cancelled.") | ||
| return nil | ||
| } | ||
| if progress.ErrorMsg != "" { | ||
| RenderError(progress.ErrorMsg) | ||
| return fmt.Errorf("assess: %s", progress.ErrorMsg) | ||
| } |
There was a problem hiding this comment.
Handle cancellation before blocking on runner.Errors.
If the user cancels during an interview, the child can still be waiting on stdin. This code then blocks on for runErr := range runner.Errors before the deferred runner.Close() runs, so the cancel path can hang indefinitely.
Suggested fix
progress := RunAssessProgressDisplay(
"Agent Maturity Assessment",
cfg,
runner,
@@
},
)
+ if progress.Cancelled {
+ runner.Close()
+ fmt.Fprintln(os.Stderr, "\nAssessment cancelled.")
+ return nil
+ }
+
for runErr := range runner.Errors {
if runErr != nil {
if runner.Stderr != nil && runner.Stderr.Len() > 0 {
fmt.Fprintln(os.Stderr, runner.Stderr.String())
}
@@
- if progress.Cancelled {
- fmt.Fprintln(os.Stderr, "\nAssessment cancelled.")
- return nil
- }
if progress.ErrorMsg != "" {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess_wizard.go` around lines 50 - 66, The loop over runner.Errors can
block preventing the deferred runner.Close() and the cancellation path from
running; before iterating runner.Errors in assess_wizard.go check
progress.Cancelled (and handle progress.ErrorMsg) and exit early, or use a
non-blocking/select pattern that listens for a cancellation signal
(progress.Cancelled) alongside reading from runner.Errors so you can break and
call runner.Close() promptly; reference runner.Errors, runner.Close(),
progress.Cancelled, progress.ErrorMsg and RenderError when making the change.
| func validateLocalPath(s string) error { | ||
| trimmed := strings.TrimSpace(s) | ||
| if trimmed == "" { | ||
| return fmt.Errorf("path is required") | ||
| } | ||
| info, err := os.Stat(trimmed) | ||
| if err != nil { | ||
| return fmt.Errorf("path does not exist: %s", trimmed) | ||
| } | ||
| if !info.IsDir() { | ||
| return fmt.Errorf("path is not a directory: %s", trimmed) | ||
| } | ||
| return nil |
There was a problem hiding this comment.
Reject non-Git directories here.
validateLocalPath currently accepts any directory, so a plain folder gets through the wizard and only fails later in preflight/collectors. Validate repo-ness at this boundary too.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess_wizard.go` around lines 434 - 446, validateLocalPath currently
allows any directory; update it to reject non-Git repos by verifying repo-ness
before returning nil. Inside validateLocalPath, after the existing directory
checks, run a lightweight repo check—preferably
exec.Command("git","rev-parse","--is-inside-work-tree") and treat non-zero exit
as not-a-repo (or fallback to checking for a ".git" entry via
os.Stat(filepath.Join(trimmed, ".git"))); return a clear error like "path is not
a git repository: <path>" when the check fails so the wizard rejects plain
folders early.
Docstrings generation was requested by @asabaylus. * #6 (comment) The following files were modified: * `scripts/run-assess.ts` * `src/cli/index.ts` * `src/services/maturity/adjacent-repos.ts` * `src/services/maturity/ai-scorer.ts` * `src/services/maturity/audit-store.ts` * `src/services/maturity/audit-writer.ts` * `src/services/maturity/evidence-collectors.ts` * `src/services/maturity/fs-utils.ts` * `src/services/maturity/interview.ts` * `src/services/maturity/maturity-prompts.ts` * `src/services/maturity/maturity.service.ts` * `src/services/maturity/preflight.ts` * `src/services/maturity/rubric.ts` * `src/services/maturity/scoring.ts` * `tui/assess.go` * `tui/assess_config.go` * `tui/assess_flags.go` * `tui/assess_preview.go` * `tui/assess_progress.go` * `tui/assess_runner.go` * `tui/assess_summary.go` * `tui/assess_wizard.go` * `tui/main.go` Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (2)
src/services/maturity/adjacent-repos.ts (2)
114-116:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCanonicalize and filter in
addRepo()to prevent false positives and duplicate keys.
addRepo()currently stores raw values. Inputs likerepo.git(or casing variants) bypass normalization and produce duplicate/skewed adjacent-repo entries; stdlib-owner filtering is also inconsistent across callers.💡 Suggested fix
function addRepo( map: Map<string, AdjacentRepo>, owner: string, name: string, reason: string, ): void { - const key = `${owner}/${name}`.toLowerCase(); + const normalizedOwner = owner.trim().toLowerCase(); + const normalizedName = name.trim().replace(/\.git$/i, "").toLowerCase(); + if (!normalizedOwner || !normalizedName) return; + if (STDLIB_OWNERS.has(normalizedOwner)) return; + const key = `${normalizedOwner}/${normalizedName}`; if (map.has(key)) return; - map.set(key, { owner, name, reason }); + map.set(key, { owner: normalizedOwner, name: normalizedName, reason }); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/maturity/adjacent-repos.ts` around lines 114 - 116, addRepo is storing raw owner/name values and building keys that allow duplicates and false positives (e.g., repo.git or mixed case); modify addRepo to canonicalize inputs before using map: trim and remove trailing ".git" from name, normalize owner and name to lower-case, and then build the key (e.g., `${owner}/${name}`). Also apply the stdlib-owner filter inside addRepo (so callers don't need to) and use the normalized key for the map.has and map.set checks (symbols: addRepo, map, key, owner, name, reason) to prevent duplicate/skewed adjacent-repo entries.
63-68:⚠️ Potential issue | 🟠 Major | ⚡ Quick winTerraform source regex is still too narrow for common GitHub module forms.
Line 64 still misses valid forms like
git::https://github.com/org/repo.git//path?ref=...andgit@github.com:org/repo.git, so adjacent repos are silently dropped.💡 Suggested fix
const matches = content.matchAll( - /source\s*=\s*"(?:git::|github\.com\/)([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/g, + /source\s*=\s*"(?:(?:git::)?(?:https?:\/\/|ssh:\/\/git@|git@)?github\.com[:/])([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)(?:\.git)?(?:\/\/[^"\s]*)?(?:\?[^"\s]*)?/g, );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/maturity/adjacent-repos.ts` around lines 63 - 68, The current regex used in the matchAll call (the matches variable) is too narrow and misses common Terraform source forms; update the regex used where matches is defined so it also accepts git::https://github.com/... (with optional .git, //path and ?ref), scp-style git@github.com:org/repo.git, and both single and double-quoted sources while still capturing just the owner/repo string so OWNER_REPO.exec(match[1]) continues to work; then let addRepo(found, owner, repo, "Terraform module source") run as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/maturity/adjacent-repos.ts`:
- Around line 41-47: The code is treating local action references like "uses:
./.github/actions/..." as external owner/repo entries; after extracting
usesMatch, skip any local paths by checking usesMatch[1] for leading "./", "../"
or "/" (or other local-path indicators) before running OWNER_REPO.exec; keep
using OWNER_REPO, STDLIB_OWNERS and addRepo as-is, so only call
OWNER_REPO.exec(usesMatch[1]) and addRepo(...) when the uses string is not a
local path.
In `@tui/assess_config.go`:
- Around line 75-80: The code currently ignores errors from os.Getwd() when
building an AssessConfig/AssessScope default; update the logic around os.Getwd()
so you check the returned error, and if it fails set a sensible fallback (for
example "." or os.TempDir()) before assigning Scope.LocalPath and computing
DisplayName (both in the AssessScope construction), ensuring AssessConfig and
AssessScope creation uses the fallback path when Getwd returns an error.
In `@tui/assess.go`:
- Around line 68-73: The current check conflates LoadAssessConfig() errors with
a missing config; instead, check the error first and surface it: call
LoadAssessConfig(), if err != nil print the actual error (including context from
assessConfigPath()) to stderr and exit non-zero, and only if err == nil and cfg
== nil print the “No saved assess configuration found…” message; reference
flagAssessShowConfig, LoadAssessConfig, and assessConfigPath to locate and
update the logic so parse/read/permission failures are not masked as “not
found.”
- Around line 96-101: The helper loadOrInitAssessConfig currently swallows
errors from LoadAssessConfig causing silent fallback to defaults and potential
overwrites; change loadOrInitAssessConfig to return (AssessConfig, error)
instead of just AssessConfig, propagate the error returned by LoadAssessConfig
(return the saved config when non-nil, otherwise return DefaultAssessConfig plus
the load error), and update callers such as runAssess to handle the error
(surface it to the user and avoid saving/overwriting the config when
LoadAssessConfig returned an error) so corrupted/unreadable config files are
reported instead of silently replaced.
---
Duplicate comments:
In `@src/services/maturity/adjacent-repos.ts`:
- Around line 114-116: addRepo is storing raw owner/name values and building
keys that allow duplicates and false positives (e.g., repo.git or mixed case);
modify addRepo to canonicalize inputs before using map: trim and remove trailing
".git" from name, normalize owner and name to lower-case, and then build the key
(e.g., `${owner}/${name}`). Also apply the stdlib-owner filter inside addRepo
(so callers don't need to) and use the normalized key for the map.has and
map.set checks (symbols: addRepo, map, key, owner, name, reason) to prevent
duplicate/skewed adjacent-repo entries.
- Around line 63-68: The current regex used in the matchAll call (the matches
variable) is too narrow and misses common Terraform source forms; update the
regex used where matches is defined so it also accepts
git::https://github.com/... (with optional .git, //path and ?ref), scp-style
git@github.com:org/repo.git, and both single and double-quoted sources while
still capturing just the owner/repo string so OWNER_REPO.exec(match[1])
continues to work; then let addRepo(found, owner, repo, "Terraform module
source") run as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 2be1c0e0-988f-492e-ae6e-c90fbcfd73e1
📒 Files selected for processing (23)
scripts/run-assess.tssrc/cli/index.tssrc/services/maturity/adjacent-repos.tssrc/services/maturity/ai-scorer.tssrc/services/maturity/audit-store.tssrc/services/maturity/audit-writer.tssrc/services/maturity/evidence-collectors.tssrc/services/maturity/fs-utils.tssrc/services/maturity/interview.tssrc/services/maturity/maturity-prompts.tssrc/services/maturity/maturity.service.tssrc/services/maturity/preflight.tssrc/services/maturity/rubric.tssrc/services/maturity/scoring.tstui/assess.gotui/assess_config.gotui/assess_flags.gotui/assess_preview.gotui/assess_progress.gotui/assess_runner.gotui/assess_summary.gotui/assess_wizard.gotui/main.go
✅ Files skipped from review due to trivial changes (1)
- src/services/maturity/interview.ts
🚧 Files skipped from review as they are similar to previous changes (19)
- src/services/maturity/scoring.ts
- tui/assess_summary.go
- tui/assess_flags.go
- tui/assess_preview.go
- src/services/maturity/maturity-prompts.ts
- src/services/maturity/audit-writer.ts
- src/services/maturity/ai-scorer.ts
- src/services/maturity/preflight.ts
- tui/assess_runner.go
- src/cli/index.ts
- tui/assess_progress.go
- scripts/run-assess.ts
- src/services/maturity/maturity.service.ts
- tui/assess_wizard.go
- src/services/maturity/audit-store.ts
- tui/main.go
- src/services/maturity/evidence-collectors.ts
- src/services/maturity/rubric.ts
- src/services/maturity/fs-utils.ts
| const usesMatch = /\buses:\s*([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/.exec( | ||
| line, | ||
| ); | ||
| if (usesMatch) { | ||
| const m = OWNER_REPO.exec(usesMatch[1]); | ||
| if (m && !STDLIB_OWNERS.has(m[1].toLowerCase())) { | ||
| addRepo(found, m[1], m[2], `Workflow uses: ${m[0]}`); |
There was a problem hiding this comment.
Skip local-action uses: entries before owner/repo parsing.
Line 41 currently matches local workflow actions like uses: ./.github/actions/... and records invalid adjacent repos (e.g., ./.github). That pollutes scoring context.
💡 Suggested fix
- const usesMatch = /\buses:\s*([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/.exec(
- line,
- );
+ const usesMatch = /\buses:\s*"?([^"\s]+)"?/.exec(line);
if (usesMatch) {
- const m = OWNER_REPO.exec(usesMatch[1]);
+ const usesRef = usesMatch[1].trim();
+ if (usesRef.startsWith("./") || usesRef.startsWith("docker://")) continue;
+ const repoRef = usesRef.split("@", 1)[0];
+ const m = OWNER_REPO.exec(repoRef);
if (m && !STDLIB_OWNERS.has(m[1].toLowerCase())) {
addRepo(found, m[1], m[2], `Workflow uses: ${m[0]}`);
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const usesMatch = /\buses:\s*([a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+)/.exec( | |
| line, | |
| ); | |
| if (usesMatch) { | |
| const m = OWNER_REPO.exec(usesMatch[1]); | |
| if (m && !STDLIB_OWNERS.has(m[1].toLowerCase())) { | |
| addRepo(found, m[1], m[2], `Workflow uses: ${m[0]}`); | |
| const usesMatch = /\buses:\s*"?([^"\s]+)"?/.exec(line); | |
| if (usesMatch) { | |
| const usesRef = usesMatch[1].trim(); | |
| if (usesRef.startsWith("./") || usesRef.startsWith("docker://")) continue; | |
| const repoRef = usesRef.split("@", 1)[0]; | |
| const m = OWNER_REPO.exec(repoRef); | |
| if (m && !STDLIB_OWNERS.has(m[1].toLowerCase())) { | |
| addRepo(found, m[1], m[2], `Workflow uses: ${m[0]}`); | |
| } | |
| } |
🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 41-43: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.
(coderabbit.command-injection.exec-js)
[ERROR] 45-45: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.
(coderabbit.command-injection.exec-js)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/services/maturity/adjacent-repos.ts` around lines 41 - 47, The code is
treating local action references like "uses: ./.github/actions/..." as external
owner/repo entries; after extracting usesMatch, skip any local paths by checking
usesMatch[1] for leading "./", "../" or "/" (or other local-path indicators)
before running OWNER_REPO.exec; keep using OWNER_REPO, STDLIB_OWNERS and addRepo
as-is, so only call OWNER_REPO.exec(usesMatch[1]) and addRepo(...) when the uses
string is not a local path.
| cwd, _ := os.Getwd() | ||
| return AssessConfig{ | ||
| Scope: AssessScope{ | ||
| Mode: "local-repo", | ||
| LocalPath: cwd, | ||
| DisplayName: filepath.Base(cwd), |
There was a problem hiding this comment.
Handle os.Getwd() failure before deriving default scope.
At Line 75, ignoring the os.Getwd() error can leave Scope.LocalPath empty and produce brittle defaults. Add a fallback path when Getwd fails.
Suggested patch
func DefaultAssessConfig() AssessConfig {
- cwd, _ := os.Getwd()
+ cwd, err := os.Getwd()
+ if err != nil || cwd == "" {
+ cwd = "."
+ }
return AssessConfig{
Scope: AssessScope{
Mode: "local-repo",
LocalPath: cwd,
DisplayName: filepath.Base(cwd),
},
EvidenceTier: "auto",
OutputFormat: "both",
}
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess_config.go` around lines 75 - 80, The code currently ignores errors
from os.Getwd() when building an AssessConfig/AssessScope default; update the
logic around os.Getwd() so you check the returned error, and if it fails set a
sensible fallback (for example "." or os.TempDir()) before assigning
Scope.LocalPath and computing DisplayName (both in the AssessScope
construction), ensuring AssessConfig and AssessScope creation uses the fallback
path when Getwd returns an error.
| if *flagAssessShowConfig { | ||
| cfg, err := LoadAssessConfig() | ||
| if err != nil || cfg == nil { | ||
| fmt.Fprintln(os.Stderr, "No saved assess configuration found at "+assessConfigPath()) | ||
| os.Exit(1) | ||
| } |
There was a problem hiding this comment.
Don’t mask parse/read failures as “no saved config.”
At Line 70, err != nil || cfg == nil collapses permission/read/JSON errors into the same “not found” message, which hides the real problem and makes recovery harder.
Suggested patch
if *flagAssessShowConfig {
cfg, err := LoadAssessConfig()
- if err != nil || cfg == nil {
- fmt.Fprintln(os.Stderr, "No saved assess configuration found at "+assessConfigPath())
+ if err != nil {
+ fmt.Fprintf(os.Stderr, "Failed to load assess configuration at %s: %v\n", assessConfigPath(), err)
+ os.Exit(1)
+ }
+ if cfg == nil {
+ fmt.Fprintln(os.Stderr, "No saved assess configuration found at "+assessConfigPath())
os.Exit(1)
}
data, _ := json.MarshalIndent(cfg, "", " ")
fmt.Println(string(data))
return nil
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess.go` around lines 68 - 73, The current check conflates
LoadAssessConfig() errors with a missing config; instead, check the error first
and surface it: call LoadAssessConfig(), if err != nil print the actual error
(including context from assessConfigPath()) to stderr and exit non-zero, and
only if err == nil and cfg == nil print the “No saved assess configuration
found…” message; reference flagAssessShowConfig, LoadAssessConfig, and
assessConfigPath to locate and update the logic so parse/read/permission
failures are not masked as “not found.”
| func loadOrInitAssessConfig() AssessConfig { | ||
| saved, _ := LoadAssessConfig() | ||
| if saved != nil { | ||
| return *saved | ||
| } | ||
| return DefaultAssessConfig() |
There was a problem hiding this comment.
Avoid silent fallback when saved config is unreadable/corrupt.
At Line 97, ignoring LoadAssessConfig() errors can silently reset users to defaults and later overwrite the broken file state on save. This should be surfaced to the caller.
Suggested patch
-func loadOrInitAssessConfig() AssessConfig {
- saved, _ := LoadAssessConfig()
+func loadOrInitAssessConfig() (AssessConfig, error) {
+ saved, err := LoadAssessConfig()
+ if err != nil {
+ return AssessConfig{}, fmt.Errorf("assess: failed to load saved config: %w", err)
+ }
if saved != nil {
- return *saved
+ return *saved, nil
}
- return DefaultAssessConfig()
+ return DefaultAssessConfig(), nil
}And in runAssess:
- cfg := loadOrInitAssessConfig()
+ cfg, err := loadOrInitAssessConfig()
+ if err != nil {
+ return err
+ }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tui/assess.go` around lines 96 - 101, The helper loadOrInitAssessConfig
currently swallows errors from LoadAssessConfig causing silent fallback to
defaults and potential overwrites; change loadOrInitAssessConfig to return
(AssessConfig, error) instead of just AssessConfig, propagate the error returned
by LoadAssessConfig (return the saved config when non-nil, otherwise return
DefaultAssessConfig plus the load error), and update callers such as runAssess
to handle the error (surface it to the user and avoid saving/overwriting the
config when LoadAssessConfig returned an error) so corrupted/unreadable config
files are reported instead of silently replaced.
Summary
Adds
teamhero assess(CLI) and a new TUI subcommand alongsidereport/setup/doctor. Scores an engineering organization against the 12-criterion Agent Maturity Assessment (4 weighted categories: engineering basics 1.0x, knowledge & context 1.5x, AI governance & quality 1.25x, hiring 1.0x). Output is a markdown audit (matching the canonical template — per-category tables, summary, maturity-scale row marker, top-3 fixes, strengths, adjacent repos, notes for re-audit) plus a.jsonsidecar with the full data.Pipeline
Mirrors how the existing
reportflow is composed:ghCLI authed -> Tier 1, GitHub MCP -> Tier 2, otherwise -> Tier 3 git-only).uses:, Terraform module sources, submodules, and README cross-refs to map sibling repos.json_schema. Tier-3 caps on items 2/3/9/11 enforced post-hoc.docs/audits/CONFIG.mdround-trip for re-audit confirm-or-refresh.Visual design parity with
teamhero reportThe whole interactive flow runs inside a single Bubble Tea program so the framed two-pane layout is continuous from scope-pick to audit preview — no layout breaks anywhere along the way.
renderShellHeaderbanner + left-form / right-summary / nav-hints layout the report wizard usesprogress.go: monotonic progress bar, spinner-driven step list with checkmark/cross/circle icons, inline elapsed timers, right-side configuration summary in a rounded box,dry-runpill styled like theflexpillinterview-questionevent arrives, the left pane swaps from the progress panel to a huh form (same shell header, same right-pane summary, same nav hints) instead of releasing the terminal and breaking the framed layoutRunReportPreviewFullexactly, including the checkmark on the JSON tab when data is availableShareable skill bundle
share/skills/agent-maturity-assessment/ships a self-contained, harness-agnostic copy of the skill (SKILL.md + 4 references) suitable for distribution to other Claude harnesses. Mentions theteamherobinary as an optional accelerator but works standalone in pure-Claude mode.Headless / scripted usage
Test plan
Commits
39e287cfbabfa0ccc5fdcbf0d9f0File summary
src/services/maturity/(14 files: rubric, scoring, interview, preflight, evidence-collectors, adjacent-repos, audit-writer, audit-store, maturity-prompts, ai-scorer, maturity.service, stdin-interview, fs-utils, types)scripts/run-assess.ts(bidirectional JSON-lines stdin/stdout)tui/assess.go,assess_config.go,assess_flags.go,assess_protocol.go,assess_runner.go,assess_progress.go,assess_summary.go,assess_preview.go,assess_wizard.go+ 6 test filesclaude-plugin/skills/agent-maturity-assessment/SKILL.mdshare/skills/agent-maturity-assessment/(zip-ready, harness-agnostic)docs/maturity-skill-ref/(upstream skill files for human readers)docs/2026-05-03-agent-maturity-assessment-plan.mdsrc/cli/index.ts,src/core/types.ts,tui/main.go,tui/protocol.go,README.md,CLAUDE.md,justfile,.env.schema🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
teamhero assesscommand supporting interactive TUI and headless execution modesdocs/audits/CONFIG.mdConfiguration
MATURITY_AI_MODELenvironment variable to override assessment AI modelTEAMHERO_GITHUB_MCPflag to enable higher-fidelity assessmentsDocumentation