3 changes: 3 additions & 0 deletions plugins/maister/CLAUDE.md
@@ -455,6 +455,7 @@ Skills are automatically invoked by Claude when appropriate. Details live in eac
|-------|---------|---------|
| `codebase-analyzer` | Thin dispatcher: selects agent roles adaptively, launches parallel Explore subagents, delegates report synthesis to `codebase-analysis-reporter` subagent | `skills/codebase-analyzer/SKILL.md` |
| `implementer` | Executes plans with **mandatory** standards reading (INDEX.md + implementation-plan.md Standards Compliance section + keyword-triggered) and **test step enforcement** (requires user approval to skip N.1 tests) | `skills/implementer/SKILL.md` |
| `implementation-plan-executor` | Executes implementation plans with two-mode adaptive execution. Mode A (≤5 steps): direct. Mode B (6+ steps): delegates to `task-group-implementer` subagent with **model escalation** (sonnet → opus on BLOCKED) | `skills/implementation-plan-executor/SKILL.md` |
| `implementation-verifier` | Read-only QA orchestrator: delegates completeness checks, test execution, code review, and production readiness to specialized subagents; compiles results into verification report | `skills/implementation-verifier/SKILL.md` |
| `standards-discover` | Parallel multi-source standards discovery (config, code, docs, PRs/CI) with confidence scoring | `skills/standards-discover/SKILL.md` |
| `docs-manager` | Internal engine for doc file operations, INDEX.md generation, CLAUDE.md integration. Not user-invocable — accessed via `docs-operator` agent (Task tool) by init, standards-update, standards-discover | `skills/docs-manager/skill.md` |
@@ -601,6 +602,7 @@ Subagents are specialized AI agents invoked by skills and orchestrators. All age
| `spec-auditor` | Independent spec audit with senior auditor perspective | orchestrators | `agents/spec-auditor.md` |
| `reality-assessor` | Validates work actually solves the problem | implementation-verifier | `agents/reality-assessor.md` |
| `implementation-changes-planner` | Creates detailed change plans (no file modifications) | implementer | `agents/implementation-changes-planner.md` |
| `task-group-implementer` | Executes a single task group: writes code, runs tests, reports status. Supports model escalation (sonnet → opus on BLOCKED). | implementation-plan-executor | `agents/task-group-implementer.md` |

**See**: Individual `agents/*.md` files for detailed workflows and philosophies.

@@ -614,6 +616,7 @@ Subagents are specialized AI agents invoked by skills and orchestrators. All age
6. **Incremental Verification**: Run only new tests after each group, not entire suite
7. **Comprehensive Verification Before Commit**: Run full test suite and create verification report before code review
8. **Task Directory Artifact Anchoring**: ALL workflow artifacts (reports, documentation, screenshots) MUST be saved under the task directory (`.maister/tasks/[type]/[task-name]/`). NEVER save task artifacts to project directories like `docs/`, `src/`, or project root.
9. **Model Escalation**: Subagents start on sonnet; if BLOCKED, automatically retry with opus before asking the user

**For detailed workflow documentation, see**: individual skill `SKILL.md` files

36 changes: 30 additions & 6 deletions plugins/maister/agents/task-group-implementer.md
@@ -1,7 +1,7 @@
---
name: task-group-implementer
description: Execute a single task group from an implementation plan with continuous standards discovery. Writes code, runs tests, returns structured execution report. Does NOT mark checkboxes - main agent handles progress tracking.
-model: inherit
+model: sonnet
color: green
---

@@ -25,6 +25,24 @@ Execute one task group from an implementation plan: write tests, implement code,
4. **Structured reporting**: Return results in expected format for main agent
5. **No progress tracking**: Do NOT mark checkboxes - main agent owns that responsibility

## When You're Stuck

It is always OK to stop and report that you can't complete the task. Bad work is worse than no work. You will not be penalized for escalating.

**Report BLOCKED when:**
- The task requires architectural decisions with multiple valid approaches
- You need to understand code beyond what was provided and can't find clarity
- You feel uncertain about whether your approach is correct
- The task involves restructuring existing code in ways the plan didn't anticipate
- You've been reading file after file trying to understand the system without progress

**Report NEEDS_CONTEXT when:**
- You need information about a specific file, function, or pattern not provided
- The spec is ambiguous about a specific requirement
- You need to know which of two approaches the project prefers

**How to report:** Set your status to BLOCKED or NEEDS_CONTEXT. Describe specifically what you're stuck on, what you've tried, and what kind of help you need. The coordinator can provide more context, re-dispatch with a more capable model, or break the task into smaller pieces.

## Decision-Making Framework

When facing implementation choices:
@@ -139,7 +157,7 @@ Output structured report in expected format (see Output Format section).
```markdown
## Group [N] Execution Report

-### Status: [SUCCESS/PARTIAL/FAILED]
+### Status: [SUCCESS/SUCCESS_WITH_CONCERNS/PARTIAL/NEEDS_CONTEXT/BLOCKED]

### Steps Completed
- [x] N.1 - [brief description]
@@ -216,15 +234,21 @@ If you encounter errors during implementation:
1. **Syntax/compile errors**: Fix before proceeding
2. **Missing dependencies**: Note in report, attempt reasonable fix
3. **Unclear requirements**: Make reasonable choice, document in notes
-4. **Blocking issues**: Report FAILED status with details
+4. **Blocking issues**: Report BLOCKED status with details

### What Triggers Each Status

| Status | When to Use |
|--------|-------------|
| **SUCCESS** | All steps complete, all tests pass |
-| **PARTIAL** | Some steps complete, tests failing, or minor issues |
-| **FAILED** | Blocking issue prevents completion, needs main agent intervention |
+| **SUCCESS_WITH_CONCERNS** | All steps complete, but flagging doubts (e.g., file growing too large, uncertain edge case) |
+| **PARTIAL** | Some steps complete, tests failing, or minor issues — you made progress but couldn't finish |
+| **NEEDS_CONTEXT** | Missing information that wasn't provided. You know what you need — specify it precisely |
+| **BLOCKED** | Cannot complete due to complexity, unclear architecture, or conflicting requirements. Describe what you're stuck on and what you've tried |
+
+**BLOCKED vs PARTIAL:** Use BLOCKED when the problem is reasoning/understanding (you don't know HOW), not execution (you know how but hit errors). BLOCKED triggers model escalation; PARTIAL triggers main agent investigation.
+
+**NEEDS_CONTEXT vs BLOCKED:** Use NEEDS_CONTEXT when you can name the specific missing information. Use BLOCKED when you can't articulate a specific ask — you're stuck.
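
The status rules above can be read as a small decision function. The Python sketch below is only an illustration and is not part of the plugin: the `Status` values mirror the five report statuses, while `choose_status` and its boolean parameters are hypothetical names for judgments the subagent makes on its own. It encodes one possible reading of the table and the two "vs" rules.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "SUCCESS"
    SUCCESS_WITH_CONCERNS = "SUCCESS_WITH_CONCERNS"
    PARTIAL = "PARTIAL"
    NEEDS_CONTEXT = "NEEDS_CONTEXT"
    BLOCKED = "BLOCKED"


def choose_status(all_steps_done: bool, tests_pass: bool, has_concerns: bool,
                  knows_how: bool, can_name_missing_info: bool) -> Status:
    """Pick a report status (hypothetical helper, one reading of the rules)."""
    if all_steps_done and tests_pass:
        # Finished work: only the presence of doubts changes the status.
        return Status.SUCCESS_WITH_CONCERNS if has_concerns else Status.SUCCESS
    if knows_how:
        # Execution problem: the approach is clear but errors remain.
        return Status.PARTIAL
    if can_name_missing_info:
        # Understanding problem with a specific, nameable ask.
        return Status.NEEDS_CONTEXT
    # Understanding problem with no specific ask: stuck.
    return Status.BLOCKED
```

The ordering matters: PARTIAL is checked before the two "stuck" statuses because it describes an execution failure, whereas NEEDS_CONTEXT and BLOCKED both describe a gap in understanding.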

## Integration

@@ -279,4 +303,4 @@ During step N.3, realize auth pattern needed → Check INDEX.md → Find and rea

### Scenario 4: Blocking Issue

-Can't proceed due to missing dependency or unclear spec → Report FAILED with clear explanation → Main agent will use AskUserQuestion to decide path forward
+Can't proceed due to missing dependency or unclear spec → Report BLOCKED with clear explanation → Main agent will escalate model or use AskUserQuestion to decide path forward
88 changes: 69 additions & 19 deletions plugins/maister/skills/implementation-plan-executor/SKILL.md
@@ -131,12 +131,42 @@ For each task group:

5. Use `TaskUpdate` to set the group task to `status: "completed"` with `metadata: {completed_at, tests_passed, files_modified, standards_applied}`

-6. **If subagent reports failure**:
-   - Do NOT auto-rollback (see Critical Principle in CLAUDE.md)
-   - Assess: config issue? test setup? logic error?
-   - Use AskUserQuestion for recovery path
+6. **Process subagent status**:

**SUCCESS / SUCCESS_WITH_CONCERNS**: Proceed normally. If concerns flagged, log them in work-log.

**PARTIAL**: Subagent made progress but couldn't finish. Assess root cause:
- Test failures → analyze, apply fix if obvious, re-run
- If unclear → AskUserQuestion with recovery options
- Keep group task as `in_progress` with `metadata: {failed_at, failure_reason}`

**NEEDS_CONTEXT**: Subagent needs specific information. Read what they're asking for, provide it, and re-dispatch with the **same model** (sonnet):
- Extract the specific ask from subagent output
- Gather the requested context (read files, check standards, etc.)
- Re-dispatch task-group-implementer with original prompt + additional context section
- No model change — the problem is missing data, not reasoning

**BLOCKED**: Subagent is stuck on complexity/reasoning. **Escalate model**:
- Re-dispatch task-group-implementer with `model: opus` parameter
- Include the original prompt + subagent's BLOCKED explanation as additional context
- If opus also returns BLOCKED → stop and use AskUserQuestion:
```
Question: "Task group [N] blocked even with escalated model. [Brief reason from subagent]. How to proceed?"
Header: "Model Escalation Failed"
Options:
- "Break into smaller pieces" - Split this group and retry
- "Provide more context" - I'll give additional information
- "Skip this group" - Mark as skipped, continue
- "Stop implementation" - Pause for investigation
```
- Log escalation in work-log: "Group N: escalated sonnet → opus. Reason: [from BLOCKED status]"

**Key rules:**
- Never retry the same model without changes
- NEEDS_CONTEXT → same model (missing data)
- BLOCKED → opus (reasoning/complexity)
- Opus BLOCKED → always ask user
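
The status handling in step 6 can be sketched as a loop on the coordinator side. This Python sketch is illustrative only: `run_group`, `dispatch`, and `gather_context` are hypothetical stand-ins for the real subagent call and context lookup, and the return values `"completed"`, `"investigate"`, and `"ask_user"` are labels invented for the example. It assumes the limit of one NEEDS_CONTEXT re-dispatch per group stated in the error-handling rules of this skill.

```python
from typing import Callable, List, Tuple

# (status, explanation) as returned by one subagent run.
Report = Tuple[str, str]
# dispatch(model, extra_context) -> Report; hypothetical stand-in for the subagent call.
Dispatch = Callable[[str, List[str]], Report]


def run_group(group: int, dispatch: Dispatch,
              gather_context: Callable[[str], str],
              work_log: List[str]) -> str:
    """Drive one task group through the sonnet -> opus escalation chain."""
    model, extra, context_retries = "sonnet", [], 0
    while True:
        status, explanation = dispatch(model, extra)
        if status in ("SUCCESS", "SUCCESS_WITH_CONCERNS"):
            if status == "SUCCESS_WITH_CONCERNS":
                work_log.append(f"Group {group}: concerns: {explanation}")
            return "completed"
        if status == "NEEDS_CONTEXT" and context_retries == 0:
            # Missing data, not reasoning: keep the model, add context once.
            context_retries += 1
            extra = extra + [gather_context(explanation)]
            continue
        if status == "BLOCKED" and model == "sonnet":
            work_log.append(
                f"Group {group}: escalated sonnet → opus. Reason: {explanation}")
            model, extra = "opus", extra + [explanation]
            continue
        if status == "PARTIAL":
            # Coordinator investigates; asks the user only if the cause is unclear.
            return "investigate"
        # Opus BLOCKED, or NEEDS_CONTEXT again after one retry.
        return "ask_user"
```

Every `continue` changes either the model or the context, so the sketch never retries the same model with an unchanged prompt.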

## Continuous Standards Discovery

**Philosophy**: Standards are discovered when relevant, not memorized upfront.
@@ -237,14 +267,42 @@ You have access to `.maister/docs/INDEX.md` for continuous standards discovery.
[See Subagent Output Format section]
```

### Re-dispatch on BLOCKED (Model Escalation)

When re-dispatching with opus after BLOCKED:

````markdown
## Task: Execute Task Group [N] (Escalated)

**Previous attempt status**: BLOCKED
**Previous attempt explanation**: [paste BLOCKED explanation from subagent]
**Model**: opus (escalated from sonnet)

### Task Group Content
[Same as original dispatch]

### Specification Excerpt
[Same as original dispatch]

### Standards
[Same as original dispatch]

### Additional Context
[Any context gathered based on the BLOCKED explanation]

### Requirements
[Same as original dispatch, plus:]
5. You are running on a more capable model because the previous attempt was blocked. Use your additional reasoning capability to work through the complexity described above.
````
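
As a rough illustration of how a coordinator might fill in the template above, the Python sketch below assembles the escalated prompt from its parts. `build_escalated_prompt` and its parameters are hypothetical names, not part of the skill; the section headings and the extra requirement text are copied from the template, and the extra requirement is numbered after however many original requirements are passed in (5 when there are four).

```python
from typing import List

# Extra requirement appended on escalation, copied from the template.
ESCALATION_NOTE = (
    "You are running on a more capable model because the previous attempt was "
    "blocked. Use your additional reasoning capability to work through the "
    "complexity described above."
)


def build_escalated_prompt(group: int, blocked_explanation: str, task_group: str,
                           spec_excerpt: str, standards: str,
                           requirements: List[str],
                           additional_context: str = "") -> str:
    """Assemble the re-dispatch prompt (hypothetical helper)."""
    numbered = [f"{i}. {text}" for i, text in
                enumerate(requirements + [ESCALATION_NOTE], start=1)]
    sections = [
        f"## Task: Execute Task Group {group} (Escalated)",
        "**Previous attempt status**: BLOCKED\n"
        f"**Previous attempt explanation**: {blocked_explanation}\n"
        "**Model**: opus (escalated from sonnet)",
        # These three sections are passed through unchanged from the original dispatch.
        f"### Task Group Content\n{task_group}",
        f"### Specification Excerpt\n{spec_excerpt}",
        f"### Standards\n{standards}",
        f"### Additional Context\n{additional_context or '(none gathered)'}",
        "### Requirements\n" + "\n".join(numbered),
    ]
    return "\n\n".join(sections)
```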

## Subagent Output Format

The task-group-implementer returns structured output:

```markdown
## Group [N] Execution Report

-### Status: [SUCCESS/PARTIAL/FAILED]
+### Status: [SUCCESS/SUCCESS_WITH_CONCERNS/PARTIAL/NEEDS_CONTEXT/BLOCKED]

### Steps Completed
- [x] N.1 - [description]
@@ -355,22 +413,14 @@ After each task group:

### Subagent Failure (Mode B)

-If task-group-implementer reports failure:
+Subagent status handling is defined in Mode B step 6 above. Additional rules:

1. **Do NOT auto-rollback** - User-confirmed rollback only
-2. **Analyze root cause** from subagent output
-3. **Check for easy fixes**: config issues, missing dependencies, test setup
-4. **Use AskUserQuestion**:
-```
-Question: "Group [N] implementation failed: [brief reason]. How to proceed?"
-Header: "Failure"
-Options:
-- "Try suggested fix" - [if easy fix identified]
-- "Retry group" - Re-invoke subagent
-- "Complete manually" - Switch to direct execution for this group
-- "Rollback changes" - Revert this group's changes
-- "Stop" - Pause for investigation
-```
+2. **Model escalation is automatic** - BLOCKED → opus happens without asking user
+3. **User involvement triggers**:
+   - Opus returns BLOCKED (end of escalation chain)
+   - PARTIAL status with unclear root cause
+   - NEEDS_CONTEXT returned again after one re-dispatch with added context (max 1 NEEDS_CONTEXT re-dispatch per group) → AskUserQuestion

### Test Failure

@@ -324,3 +324,60 @@ If prerequisites missing, use AskUserQuestion: "Start from Phase 1", "Specify di
| User chooses "Proceed with known issues" | Proceed with warning logged |
| Max iterations (3) reached | Ask user how to proceed |
| Critical issues remain unresolved | **MUST NOT proceed** — require user approval first |

---

## 7. Model Escalation Pattern

When a subagent reports BLOCKED status, the coordinator re-dispatches the task with a more capable model. The first escalation tier (sonnet → opus) is automatic and needs no user confirmation.

### Escalation Chain

````
sonnet (default) → BLOCKED → opus → BLOCKED → AskUserQuestion
````

### Status-to-Action Mapping

| Subagent Status | Action | Model Change |
|----------------|--------|--------------|
| SUCCESS / SUCCESS_WITH_CONCERNS | Proceed | None |
| PARTIAL | Investigate, fix if obvious, ask user if unclear | None |
| NEEDS_CONTEXT | Provide requested context, re-dispatch | Same model |
| BLOCKED | Re-dispatch with more capable model | sonnet → opus |

### Key Rules

1. **Never retry same model without changes** — if BLOCKED, something must change (model, context, or task scope)
2. **NEEDS_CONTEXT ≠ BLOCKED** — missing data → same model; reasoning limit → higher model
3. **End of chain → user** — when the most capable model is BLOCKED, always AskUserQuestion
4. **Log escalations** — record in work-log for visibility and cost tracking
5. **No automatic rollback** — BLOCKED does not mean "undo what was done"

### When to Apply

This pattern applies to any agent that:
- Has `model: sonnet` in frontmatter (not `inherit` or `opus`)
- Implements the enriched status protocol (SUCCESS/SUCCESS_WITH_CONCERNS/PARTIAL/NEEDS_CONTEXT/BLOCKED)
- Is dispatched by a coordinator skill that processes the output

Currently applies to:
- `task-group-implementer` (dispatched by `implementation-plan-executor`)
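
The three applicability conditions above could be checked mechanically. The Python sketch below is a hedged illustration, not plugin code: `frontmatter_model` and `escalation_applies` are hypothetical helpers, and the parser only handles simple `key: value` frontmatter between `---` lines rather than full YAML.

```python
from typing import Optional


def frontmatter_model(markdown_text: str) -> Optional[str]:
    """Return the `model` value from leading `---` frontmatter, if present."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model":
            return value.strip()
    return None


def escalation_applies(agent_markdown: str, implements_status_protocol: bool,
                       has_coordinator: bool) -> bool:
    """All three conditions from the list must hold (hypothetical helper)."""
    return (frontmatter_model(agent_markdown) == "sonnet"
            and implements_status_protocol
            and has_coordinator)
```

An agent declared with `model: inherit` or `model: opus` is excluded by the first condition, since it has no lower tier to escalate from.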

### Re-dispatch Prompt Structure

When escalating, the coordinator includes:
- Original task prompt (unchanged)
- Previous attempt's BLOCKED explanation
- Any additional context gathered
- Note that this is an escalated dispatch with a more capable model

### Anti-Patterns

| Anti-Pattern | Why It's Wrong |
|--------------|----------------|
| Retrying same model on BLOCKED | Wastes tokens, same result |
| Escalating on NEEDS_CONTEXT | Problem is data, not reasoning — provide context first |
| Escalating on PARTIAL | Subagent made progress — investigate the specific failure |
| Skipping user when opus is BLOCKED | End of chain, user must decide next step |
| Auto-rollback on BLOCKED | BLOCKED means "stuck", not "failed" — work may be partially valid |
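
Several of the anti-patterns in the table can be detected from a record of dispatch attempts. The Python sketch below is illustrative only: the `Attempt` record, the `RANK` ordering, and `find_anti_patterns` are hypothetical and assume the two-tier sonnet/opus chain described above. Auto-rollback is not covered, because it cannot be inferred from model and status alone.

```python
from typing import List, NamedTuple

# Capability ordering assumed by the escalation chain.
RANK = {"sonnet": 0, "opus": 1}


class Attempt(NamedTuple):
    model: str                # model used for this dispatch
    status: str               # status the subagent returned
    asked_user: bool = False  # whether the coordinator asked the user afterwards


def find_anti_patterns(attempts: List[Attempt]) -> List[str]:
    """Flag anti-patterns visible in a sequence of attempts for one group."""
    found = []
    for prev, nxt in zip(attempts, attempts[1:]):
        if prev.status == "BLOCKED" and nxt.model == prev.model:
            found.append("Retrying same model on BLOCKED")
        if (prev.status in ("NEEDS_CONTEXT", "PARTIAL")
                and RANK[nxt.model] > RANK[prev.model]):
            found.append(f"Escalating on {prev.status}")
    last = attempts[-1] if attempts else None
    if last and last.model == "opus" and last.status == "BLOCKED" and not last.asked_user:
        found.append("Skipping user when opus is BLOCKED")
    return found
```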