diff --git a/.gitignore b/.gitignore
index 5a627d84..761ad209 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,7 +2,8 @@ close-bureau
 .worktrees/
 *.pyc
 .serena/
-.vscode/
+.vscode/*
+!.vscode/settings.json
 .npm-cache/
 bureau.egg-info/
 .coverage
diff --git a/.vscode/settings.json b/.vscode/settings.json
new file mode 100644
index 00000000..3d8cd419
--- /dev/null
+++ b/.vscode/settings.json
@@ -0,0 +1,5 @@
+{
+    // ensure VSCode extension picks up Mypy when the uv-installed Python interpreter
+    // (in bureau/.venv/) is selected in VSCode
+    "mypy-type-checker.importStrategy": "fromEnvironment"
+}
\ No newline at end of file
diff --git a/README.md b/README.md
index 166b85db..0b05bc45 100644
--- a/README.md
+++ b/README.md
@@ -19,9 +19,11 @@
     - spawnable as **cross-CLI subagents** with *minimal* task delegation overhead
     - usable in *every* supported CLI as both:
 
-        - **isolated subagents**
+        - **isolated subagents**
         - **interactive main agents**
 
+- **Built-in workflow skills** — structured, multi-step protocols (like [two-phase code assessment](protocols/context/static/skills/assess-mode/SKILL.md)) that agents activate automatically when they recognise a matching task
+
 - A ***near-zero* learning curve** via:
 
     1. **context injection** that ensures:
@@ -110,15 +112,16 @@ All agents automatically read these files at startup:
 
     - Serves as an entrypoint to documentation progressively disclosing each MCP servers' tool capabilities
 
-- Guidance on automatically activating [skills](https://github.com/obra/superpowers/tree/main/skills) highly relevant and useful for key dev workflows/tasks *(provided by the [Superpowers plugin](https://github.com/obra/superpowers), currently **only for Claude Code or Codex**)*
+- **Custom Bureau skills**: structured workflow protocols (e.g. `bureau-assess-mode`) installed for all supported CLIs and activated automatically by matching prompts
+- **[Superpowers](https://github.com/obra/superpowers) skills** — community-maintained skill library *(currently Claude Code and Codex only)*
 
-Injected via these files (created in setup steps)
+Injected via these files
 
 - `~/.claude/CLAUDE.md` (Claude Code)
 - `~/.gemini/GEMINI.md` (Gemini CLI)
 - `~/.codex/AGENTS.md` (Codex)
 
-with each of the 3 files above generated from templates (for portability regardless of repo clone location).
+with each of the 3 files above generated from [templates](protocols/context/templates/) and symlinked (for portability).
 
 ### Spec-driven development *(maintainer favourite)*
 
@@ -132,9 +135,44 @@ with each of the 3 files above generated from templates (for portability regardl
 - can seamlessly handle on-the-fly updates, accordingly synchronize/adjust specs, plans, tasks, etc. in a cascading fashion
 
 > [!TIP]
->
+>
 > To get started fast, **read [Bureau's 5-minute guide to `spec-kit`](docs/USAGE.md#using-github-speckit-cli)**.
 
+### Workflow skills that actually help
+
+> *Structured, multi-step protocols that agents activate automatically when they recognise a matching task.*
+
+> [!NOTE]
+>
+> All skill names below appear in agent interfaces prefixed with `bureau-` (e.g. `assess-mode` → `bureau-assess-mode`).
+
+#### Skills installed by default
+
+| Skill | What it does |
+| :--- | :--- |
+| **[Assess mode](protocols/context/static/skills/assess-mode/SKILL.md)** | **Two-phase guided review**: first builds a mental model of changes (with 4 comprehension styles to choose from), then audits every file against [configurable quality standards](docs/CONFIGURATION.md#assess_mode). Interactive tour when used as a main agent; structured report when delegated to a subagent. |
+| **[Micro mode](protocols/context/static/skills/micro-mode/SKILL.md)** | **Step-gated editing with DAG-based planning:** offers maximum control over each atomic edit, with pause points after every change. |
+
+#### Additional skills available in the catalog
+
+The [`protocols/context/static/skills/`](protocols/context/static/skills/) directory ships several more skills that can be enabled on demand:
+
+| Skill | What it does |
+| :--- | :--- |
+| [Scrimmage mode](protocols/context/static/skills/scrimmage-mode/SKILL.md) | Systematic self-attack testing after every code change: generates attack vectors across 5 categories (input validation, state, failure modes, concurrency, security) and blocks progression until vulnerabilities are fixed. |
+| [Blast radius mode](protocols/context/static/skills/blast-radius-mode/SKILL.md) | Runs impact analysis before edits by enumerating callers, dependents, tests, and contracts affected, then classifying changes as *safe/needs review/breaking/blocked*. |
+| [Clearance mode](protocols/context/static/skills/clearance-mode/SKILL.md) | Rigorous completion verification that defines measurable "done" criteria upfront and blocks clearance until they're satisfied, with evidence. |
+| [Safeguard mode](protocols/context/static/skills/safeguard-mode/SKILL.md) | Defines system invariants (value constraints, state machines, relationships, ordering) that must never break and verifies them after all changes. |
+| [Prompt engineering](protocols/context/static/skills/prompt-engineering/SKILL.md) | Guided prompt creation and refinement for system prompts, agent instructions, skill definitions, or any LLM-facing text. |
+| [Shadow mode](protocols/context/static/skills/shadow-mode/SKILL.md) | Propose-only editing: the agent shows diffs without touching files, with the user applying changes manually. Ideal for learning, maximum transparency, or untrusted environments. |
+
+To enable any of these, add them to the `skills.enabled` [config setting](docs/CONFIGURATION.md#skills):
+
+```yaml
+skills:
+  enabled: [micro-mode, assess-mode, shadow-mode, scrimmage-mode]
+```
+
 ## Agent role usage patterns
 
 ### Spawning subagents
@@ -190,12 +228,12 @@ Use the built-in [primary agents mechanism](https://opencode.ai/docs/agents/#pri
 
 | File | Purpose | Tracked? |
 | :--- | :--- | :--- |
-| `charter.yml` | Fixed, rarely-changed system defaults | Yes |
-| `directives.yml` | Streamlined collection of user-oriented, often-tweaked settings | Yes |
+| `defaults.yml` | All git-tracked package defaults (ships with Bureau) | Yes |
+| `.bureau.yml` | Optional project-level config (discovered by CWD walk-up) | Yes (in *your* project) |
 | **`local.yml`** | **Personal customizations/overrides** (gitignored) | **No** (gitignored) |
 
 Configuration loads based on the following hierarchy *(later config sources override earlier ones)*: \
-**`charter.yml` → `directives.yml` → `local.yml` → environment variables**
+**`defaults.yml` → `.bureau.yml` → `local.yml` → environment variables**
 
 See [`docs/CONFIGURATION.md`](docs/CONFIGURATION.md) for full reference.
 
@@ -203,7 +241,7 @@ See [`docs/CONFIGURATION.md`](docs/CONFIGURATION.md) for full reference.
 
 ```
 bureau/
-├── bin/           # CLI entry points (open-bureau, close-bureau, check-prereqs)
+├── bin/           # CLI entry points (open-bureau, close-bureau, ensure-prereqs)
 ├── agents/        # Agent definitions and setup
 ├── protocols/     # Context/guidance files for agents
 ├── tools/         # MCP servers and their documentation
diff --git a/agents/README.md b/agents/README.md
deleted file mode 100644
index a33bc360..00000000
--- a/agents/README.md
+++ /dev/null
@@ -1,302 +0,0 @@
-# Agent-ready context/config files
-
-## Role prompts
-
-- **`role-prompts/`**: contains role prompts to be used for clink and OpenCode subagents
-
-  - Body-only "delta" role prompts (no YAML)
-  - 600–2,000 chars
-
-- **`claude-subagents`**: contains Claude Code subagent files
-
-  - Generated (using script) to contain:
-
-    - YAML frontmatter
-    - Body is the agent template in `role-prompts/` for the same role
-
-  - 1,200-5,000 chars
-
-### General role prompt pattern
-
-- **Role and scope**: 2-3 bullets about what this agent does + boundaries
-- **When to invoke checklist**: 3-6 triggers that should activate this role
-- **Approach/workflow**: 3-6 bullets
-- **Must-read files**: 4-5 max, see below
-- **Output format**: deliverable structure
-- **Constraints**: 3-6 bullets containing guardrails + handoff conditions
-
-## Linked files (in `reference/`)
-
-### Must-read (all roles)
-
-#### MCP quick decision tree (`tools-guide.md`)
-
-- 1,000-1,500 chars (~330 tokens)
-- Fast decision tree for MCPs available by use case (code search, web research, memory, etc.)
-- Tool selection hierarchy by category (code search, web research, memory, etc.)
-
-  - Provides first-choice tool per category, with limits
-
-- "Tier 1" document
-
-  - Links to per-category MCP decision guides ("tier 2") and per-MCP reference docs ("tier 3")
-
-#### Handoff guidelines (`handoff-guide.md`):
-
-**Guide for agents: when to delegate vs. when to ask the user for guidance**
-
-  - When to use clink to spawn another CLI (cross-model orchestration)
-
-    - This section will include a quick guide to model selection for each `clink` role prompt
-
-  - When to use Task tool to spawn Claude Code subagents
-  - When to stop and ask user (AskUserQuestion scenarios)
-  - When to hand off between agents (e.g., research → planning → implementation)
-  - What requires explicit approval (commits, deployments, deletions)
-  - Decision matrix: `"If X, then delegate to Y agent with Z role"`
-
-> ### Integrating must-read files within role prompts
->
-> Reference must‑read files in role prompts *without* including their content.
->
-> #### Claude Code subagent frontmatter example:
->
-> ```markdown
-> ---
-> name: code-reviewer
-> description: Reviews code for quality, security, maintainability
-> tools: Read, Grep, Glob, Bash, mcp__semgrep
-> model: sonnet
-> ---
->
-> You are a senior code reviewer. Before starting, read these files:
-> - `protocols/context/static/tools-guide.md` – Tier 1 tool quick ref
-> - `protocols/context/static/style-guide.md` – Project coding standards
-> - `protocols/context/static/handoff-guide.md` – When to delegate
->
-> If you need detailed Semgrep usage, read:
-> - `protocols/context/static/tools/semgrep.md` – Tier 3 deep dive
->
-> [Rest of role prompt body...]
-> ```
->
-> #### `clink` role prompt example:
->
-> ```markdown
-> You are a research synthesis specialist.
->
-> At startup, read:
-> - protocols/context/static/tools-guide.md (tier 1: tool selection)
-> - protocols/context/static/handoff-guide.md (delegation rules)
->
-> When comparing web research tools, read:
-> - protocols/context/static/category/web-research.md (tier 2: Tavily vs Brave vs Fetch)
->
-> [Rest of role prompt body...]
-> ```
-
-### Must-read (for certain roles only)
-
-- **Style guides** (in `style-guides/`):
-
-  - Code-specific: `code-style-guide.md` *(unused for now, rely instead on guides provided by repos themselves)*
-  - Docs-specific: `docs-style-guide.md`
-
-### Read as needed (progressive disclosure)
-
-#### Per-category decision guides (read when exploring options)
-
-- **Location**: `reference/category/*.md`
-
-- **Size**: 3,500-5,000 characters each
-
-- **Categories**:
-
-  - `web-research.md` - Tavily, Brave, Fetch (simple/medium tools)
-  - `code-search.md` - Serena, Grep patterns (simple tools)
-  - `memory.md` - Qdrant, Memory MCP, claude-mem (simple/medium tools)
-  - `documentation.md` - Context7 (simple tool)
-
-- **Content (for each category)**:
-
-  - Side-by-side comparison table (tool vs strengths vs use cases)
-  - 2-3 examples per tool
-  - Common parameters and patterns
-  - When to escalate to tier 3 complex tools
-
-- **Read when**:
-
-  - Need to compare alternatives within a category
-  - Learning basic usage
-  - "Tier 2"
-
-### Per-MCP deep dives (read on-demand for complex tools)
-
-- **Location**: `reference/deep-dives/*.md` (planned)
-- **Size**: 4,000-6,000 characters each
-- **Content per MCP**:
-
-  - Detailed tool-by-tool breakdown
-  - Advanced usage patterns and examples
-  - Parameter reference tables
-  - Common pitfalls and gotchas
-
-- **Read when**: deep understanding of an MCP being used is required ("tier 3")
-
-> ### Benefits of progressive disclosure for tool docs
->
-> | Benefit | Description |
-> | :------ | :---------- |
-> | Low startup cost | Tier 1 only costs ~330 tokens (every agent) |
-> | Progressive disclosure | Agents drill down only when needed |
-> | Comparison enabled | Tier 2 shows trade-offs between related tools in the same category |
-> | Depth available | Tier 3 provides comprehensive guidance for complex tools |
-> | Maintenance decoupled | Update complex tool docs without touching simple ones |
-> | Token efficient | Most tasks complete with tier 1 + tier 2 (~1,200-1,500 tokens total) |
-
-## Using with PAL's `clink`
-
-This repo's role bodies in `agents/role-prompts/` can be used as clink roles across any project. clink loads CLI client configs and role prompt files at startup.
-
-- Prereqs
-  - Run PAL MCP with clink enabled (for example via this repo's setup script). Uses stdio transport.
-  - Create user‑level CLI client configs under `~/.pal/cli_clients/*.json` (or point `CLI_CLIENTS_CONFIG_PATH` to your configs).
-
-- Config search precedence
-  - Repo built‑ins: `conf/cli_clients`
-  - `CLI_CLIENTS_CONFIG_PATH` (directory or single JSON)
-  - User overrides: `~/.pal/cli_clients`
-
-- Map roles to these prompt files
-  - Option A (absolute paths): point `prompt_path` at files in `agents/role-prompts/`.
-  - Option B (symlink + relative): symlink this folder under `~/.pal/cli_clients/systemprompts/` and use a relative `prompt_path`.
-
-Example (`~/.pal/cli_clients/gemini.json`):
-
-```json
-{
-  "name": "gemini",
-  "command": "gemini",
-  "additional_args": ["--yolo", "-o", "json"],
-  "env": {},
-  "roles": {
-    "frontend": {
-      "prompt_path": "/path/to/bureau/agents/role-prompts/frontend.md"
-    },
-    "testing": {
-      "prompt_path": "/path/to/bureau/agents/role-prompts/testing.md"
-    }
-  }
-}
-```
-
-Or using a symlinked layout:
-
-```bash
-mkdir -p ~/.pal/cli_clients/systemprompts/clink
-ln -s "$PWD/bureau/agents/role-prompts" \
-  ~/.pal/cli_clients/systemprompts/clink/for-use-prompts
-```
-
-```json
-{
-  "name": "codex",
-  "command": "codex",
-  "additional_args": ["--json", "--dangerously-bypass-approvals-and-sandbox"],
-  "roles": {
-    "architecture_audit": {
-      "prompt_path": "systemprompts/clink/for-use-prompts/architecture-audit.md"
-    }
-  }
-}
-```
-
-- Prompt resolution rules
-  - Relative `prompt_path` resolves relative to the JSON's directory, then falls back to the PAL project root.
-  - Absolute paths are used as‑is.
-  - Role names are per‑CLI. If duplicate CLI `name` definitions exist across search paths, later ones override earlier.
-
-- Verify and use
-  - Restart the PAL server to reload configs.
-  - Invoke from your agent: `clink with gemini role=frontend to assess UI components in src/ui/`.
-  - You can pass file paths as context, for example: `clink with codex role=architecture_audit on src/, services/auth/`.
-  - If only one CLI is configured, clink can default to it and allow omitting `cli_name`.
-  - Troubleshoot: bad paths raise "Prompt file not found: …". Check PAL server logs for details.
-  - The clink tool schema enumerates available `cli_name` and `role` values: use it to confirm your roles are loaded.
-  - The example JSONs include permissive flags (for example, Codex `--dangerously-bypass-approvals-and-sandbox`). Remove or adjust them for stricter guardrails.
-
-Tip: Keep role bodies short and reference Tier‑1/2/3 docs from `protocols/context/static/` in the prompt text (don’t inline).
-
-## Using with Claude Code subagents
-
-This repo’s subagent files in `agents/claude-subagents/` are ready to install.
-
-- Where they live (precedence)
-  - Project: `.claude/agents/` (overrides user for same `name`)
-  - User: `~/.claude/agents/` (available everywhere)
-
-- Install (copy or symlink)
-
-```bash
-mkdir -p ~/.claude/agents
-ln -s "$PWD/bureau/agents/claude-subagents/frontend.md" ~/.claude/agents/frontend.md
-ln -s "$PWD/bureau/agents/claude-subagents/testing.md" ~/.claude/agents/testing.md
-```
-
-- Minimal, correct frontmatter (already included):
-
-```markdown
----
-name: frontend
-description: Use for frontend implementation reviews and UI architecture decisions; prefer concrete, file:line guidance and least‑invasive fixes.
-tools: Read, Grep, Glob, Bash
-model: inherit
----
-```
-
-- Make delegation proactive
-  - Write specific triggers in `description` (for example, “Use after UI code changes; review only the diff.”).
-  - Grant only the tools the role needs (omit `tools` to inherit all; include to restrict).
-  - Prefer `model: inherit`; set default via `CLAUDE_CODE_SUBAGENT_MODEL` if desired.
-  - Name must be unique, lowercase, hyphenated (for example, `architecture-agent`).
-
-- Verify and use
-  - Run `/agents` in Claude to confirm the subagents appear.
-  - Use natural prompts and let Claude auto‑delegate, or explicitly invoke a subagent by name.
-  - You can define session‑only subagents via CLI:
-
-```bash
-claude --agents '{
-  "code-reviewer": {
-    "description": "Expert code reviewer. Use proactively after code changes.",
-    "prompt": "You are a senior code reviewer. Focus on code quality, security, and best practices.",
-    "tools": ["Read", "Grep", "Glob", "Bash"],
-    "model": "sonnet"
-  }
-}'
-```
-
-Note: Role bodies should reference `protocols/context/static/tools-guide.md` (Tier 1) and any relevant Tier‑2/3 guides. We'll add `agents/handoff-guide.md` next and reference it as a must‑read for delegation rules.
-
-## Using with OpenCode subagents
-
-This repo's role prompts in `agents/role-prompts/` are automatically configured as OpenCode subagents by the setup script.
-
-- Where they live
-  - Role prompts are symlinked to `~/.config/opencode/agent/bureau-agents/`
-  - Agents are registered in `~/.config/opencode/opencode.json` under the `agent` key with `mode: "subagent"`
-
-- Install (via setup script)
-
-```bash
-agents/scripts/set-up-agents.sh
-```
-
-The setup script:
-1. Creates symlinks from `agents/role-prompts/*.md` to `~/.config/opencode/agent/bureau-agents/`
-2. Registers each agent in `opencode.json` with `mode: "all"`
-
-- Verify and use
-  - Press **Tab** in OpenCode to cycle through available agents: Bureau agents should appear
-  - Use natural prompts and let OpenCode auto-delegate, or explicitly mention a subagent by name
-  - Example: `Have the debugger agent investigate this stack trace`
diff --git a/agents/claude-subagents/accessibility-auditor.md b/agents/claude-subagents/accessibility-auditor.md
index fd99fdb2..d67a3f8c 100644
--- a/agents/claude-subagents/accessibility-auditor.md
+++ b/agents/claude-subagents/accessibility-auditor.md
@@ -27,8 +27,8 @@ Approach:
 - Motion: respect prefers-reduced-motion; provide pause controls for animations.
 
 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)
 
 Output format:
 - Audit report: issues by WCAG criterion, severity, affected elements.
diff --git a/agents/claude-subagents/ai-ml-eng.md b/agents/claude-subagents/ai-ml-eng.md
index 84cb9728..597b5faa 100644
--- a/agents/claude-subagents/ai-ml-eng.md
+++ b/agents/claude-subagents/ai-ml-eng.md
@@ -24,10 +24,8 @@ Approach:
 - Prepare safe diffs and a staged rollout; validate on representative data.
 
 Must-read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)
 
 Output format:
 - Findings: risks, hotspots, and evidence (paths:lines) with short rationale.
diff --git a/agents/claude-subagents/api-client-designer.md b/agents/claude-subagents/api-client-designer.md index 53b14a83..7b5578cc 100644 --- a/agents/claude-subagents/api-client-designer.md +++ b/agents/claude-subagents/api-client-designer.md @@ -29,10 +29,8 @@ Approach: - Request coalescing: batch APIs, deduplication windows, streaming for high volume. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [documentation category guide](../reference/by-category/documentation.md) (Tier 2) -- the [Context7 deep dive](../reference/deep-dives/context7.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - SDK design: initialization, auth, client lifecycle, resource cleanup. diff --git a/agents/claude-subagents/api-integration.md b/agents/claude-subagents/api-integration.md index 270e8b8d..dd01c7d8 100644 --- a/agents/claude-subagents/api-integration.md +++ b/agents/claude-subagents/api-integration.md @@ -26,12 +26,9 @@ Approach: - Set per-endpoint SLOs and error taxonomy; add schema-change alerts and observability hooks. 
Must-read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: quick tool selection) -- the [documentation guide](../reference/by-category/documentation.md) (Tier 2: Context7 vs alternatives; versioned docs) -- the [Context7 deep dive](../reference/deep-dives/context7.md) (Tier 3: official framework/library docs) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (security/hygiene checks for APIs) -- the [docs style guide](../reference/style-guides/docs-style-guide.md) (structure and formatting for deliverables) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: quick tool selection) +- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (structure and formatting for deliverables) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Charter: goals, boundaries, protocol decision and trade-offs. diff --git a/agents/claude-subagents/api-mocking-specialist.md b/agents/claude-subagents/api-mocking-specialist.md index 9c2addcd..de3c0171 100644 --- a/agents/claude-subagents/api-mocking-specialist.md +++ b/agents/claude-subagents/api-mocking-specialist.md @@ -27,8 +27,8 @@ Approach: - Record/replay: capture real responses for realistic mocking. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Mock implementation: MSW handlers, WireMock mappings, Pact contracts. 
diff --git a/agents/claude-subagents/architect.md b/agents/claude-subagents/architect.md index 82aa2298..13e602dc 100644 --- a/agents/claude-subagents/architect.md +++ b/agents/claude-subagents/architect.md @@ -25,9 +25,8 @@ Approach: - Specify artifacts (ADRs/diagrams/SLO sketches) and enable a walking skeleton. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Decision brief: objectives, constraints, options, trade‑offs, chosen path, revisit triggers; evidence links. diff --git a/agents/claude-subagents/architecture-audit.md b/agents/claude-subagents/architecture-audit.md index 475750dd..50cc9436 100644 --- a/agents/claude-subagents/architecture-audit.md +++ b/agents/claude-subagents/architecture-audit.md @@ -19,9 +19,8 @@ When to invoke: - Onboarding to inherited/acquired codebases At startup, read: -- the [compact MCP list](../reference/tools-guide.md) (tier 1: tool selection) -- the [code search guide](../reference/category/code-search.md) (tier 2: finding patterns) -- the [handoff guidelines](../reference/handoff-guide.md) (delegation rules) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) (delegation rules) Approach: 1. 
Map high-level structure: layers, modules, boundaries, data flows @@ -42,5 +41,4 @@ Constraints and handoffs: - Distinguish must-fix (security, correctness, blockers) from should-improve - Use clink to delegate specialized deep dives (Semgrep for patterns, performance analysis) - AskUserQuestion for business constraints, roadmap priorities, team capacity -- Read tier 3 deep dives when specialized analysis needed diff --git a/agents/claude-subagents/auth-specialist.md b/agents/claude-subagents/auth-specialist.md index 0324dd6b..755a9457 100644 --- a/agents/claude-subagents/auth-specialist.md +++ b/agents/claude-subagents/auth-specialist.md @@ -29,10 +29,8 @@ Approach: - Zero‑trust: mTLS, service mesh identities, short‑lived certs, least privilege. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Flow diagrams: auth/authz sequences with security boundaries and token exchanges. diff --git a/agents/claude-subagents/background-job-architect.md b/agents/claude-subagents/background-job-architect.md index b2c4fe3e..9b570654 100644 --- a/agents/claude-subagents/background-job-architect.md +++ b/agents/claude-subagents/background-job-architect.md @@ -28,8 +28,8 @@ Approach: - Graceful shutdown: finish current job or checkpoint before worker dies. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Job definition: class/function, arguments, queue, retry policy, timeout. diff --git a/agents/claude-subagents/build-optimizer.md b/agents/claude-subagents/build-optimizer.md index 70f9092b..e2a88fc4 100644 --- a/agents/claude-subagents/build-optimizer.md +++ b/agents/claude-subagents/build-optimizer.md @@ -27,8 +27,8 @@ Approach: - Minimize transforms: avoid unnecessary Babel plugins, use native ESM. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Config changes: webpack.config.js, vite.config.ts, etc. with comments. diff --git a/agents/claude-subagents/caching-specialist.md b/agents/claude-subagents/caching-specialist.md index 2c6c1f9e..0ae08ac8 100644 --- a/agents/claude-subagents/caching-specialist.md +++ b/agents/claude-subagents/caching-specialist.md @@ -29,10 +29,8 @@ Approach: - Monitoring: track hit rate, miss rate, latency, evictions, memory usage. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Cache topology: layers (browser/CDN/app/DB), responsibilities, TTLs. diff --git a/agents/claude-subagents/chaos-engineer.md b/agents/claude-subagents/chaos-engineer.md index 8f96c0fd..61315709 100644 --- a/agents/claude-subagents/chaos-engineer.md +++ b/agents/claude-subagents/chaos-engineer.md @@ -28,10 +28,8 @@ Approach: - Automate: codify experiments in CI/staging; run continuously or on schedule. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Experiment plan: hypothesis, failure scenario, blast radius, abort criteria. diff --git a/agents/claude-subagents/ci-pipeline-builder.md b/agents/claude-subagents/ci-pipeline-builder.md index 791a03c7..b9361b11 100644 --- a/agents/claude-subagents/ci-pipeline-builder.md +++ b/agents/claude-subagents/ci-pipeline-builder.md @@ -27,8 +27,8 @@ Approach: - Reusability: composite actions, reusable workflows, shared templates. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Pipeline config: .github/workflows/*.yml, .gitlab-ci.yml, Jenkinsfile, etc. diff --git a/agents/claude-subagents/code-reviewer.md b/agents/claude-subagents/code-reviewer.md index 57dd6bb9..85f5299b 100644 --- a/agents/claude-subagents/code-reviewer.md +++ b/agents/claude-subagents/code-reviewer.md @@ -29,10 +29,8 @@ Approach: - Provide specific feedback: file:line, suggested fix, rationale, severity (blocker/optional). Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: PR scope, risk level, overall assessment (approve/request changes/block). diff --git a/agents/claude-subagents/concurrency-specialist.md b/agents/claude-subagents/concurrency-specialist.md index d377b809..ee1bfdd9 100644 --- a/agents/claude-subagents/concurrency-specialist.md +++ b/agents/claude-subagents/concurrency-specialist.md @@ -28,8 +28,8 @@ Approach: - Reason about happens-before: understand memory models and ordering. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Analysis: shared state identification, potential race conditions. diff --git a/agents/claude-subagents/cost-optimization-finops.md b/agents/claude-subagents/cost-optimization-finops.md index 36673910..8f3f0eeb 100644 --- a/agents/claude-subagents/cost-optimization-finops.md +++ b/agents/claude-subagents/cost-optimization-finops.md @@ -23,10 +23,8 @@ Approach: - Governance/anomalies: enforce tags, budgets/alerts, PR checks; correlate spikes; rollback/guardrail; document. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: baseline spend, top drivers, savings targets, constraints. diff --git a/agents/claude-subagents/cpp-pro.md b/agents/claude-subagents/cpp-pro.md index ecb8c235..cc901e73 100644 --- a/agents/claude-subagents/cpp-pro.md +++ b/agents/claude-subagents/cpp-pro.md @@ -26,10 +26,8 @@ Approach: - Validate: benchmark deltas, perf counters, compile‑time checks; document diffs/risks. 

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Summary: context, target standard, constraints, assumptions.
diff --git a/agents/claude-subagents/data-eng.md b/agents/claude-subagents/data-eng.md
index de7eb268..16843217 100644
--- a/agents/claude-subagents/data-eng.md
+++ b/agents/claude-subagents/data-eng.md
@@ -28,10 +28,8 @@ Approach:
 - Backfills: incremental with checkpointing; validate output; minimal recompute.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Data flow: DAG with sources/transforms/sinks, dependencies, SLAs.
diff --git a/agents/claude-subagents/datetime-specialist.md b/agents/claude-subagents/datetime-specialist.md
index f400c673..77f8b8e1 100644
--- a/agents/claude-subagents/datetime-specialist.md
+++ b/agents/claude-subagents/datetime-specialist.md
@@ -28,8 +28,8 @@ Approach:
 - Document assumptions: what timezone is input, what timezone is output.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Data model: how datetime is stored, what timezone context is preserved.
diff --git a/agents/claude-subagents/db-internals.md b/agents/claude-subagents/db-internals.md
index 98592954..dec3cd1a 100644
--- a/agents/claude-subagents/db-internals.md
+++ b/agents/claude-subagents/db-internals.md
@@ -26,10 +26,8 @@ Approach:
 - Prepare staged, reversible migrations/backfills; validate in staging.

 Must-read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Findings: affected queries/tables with evidence (paths:lines, plan snippets).
diff --git a/agents/claude-subagents/debugger.md b/agents/claude-subagents/debugger.md
index aeff6757..7faa32b6 100644
--- a/agents/claude-subagents/debugger.md
+++ b/agents/claude-subagents/debugger.md
@@ -27,10 +27,8 @@ Approach:
 - Document findings: timeline, evidence (logs/traces/dumps), root cause, fix validation.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Repro: minimal steps and environment to trigger the issue.
diff --git a/agents/claude-subagents/dependency-auditor.md b/agents/claude-subagents/dependency-auditor.md
index 3965cbb8..11b7c58c 100644
--- a/agents/claude-subagents/dependency-auditor.md
+++ b/agents/claude-subagents/dependency-auditor.md
@@ -27,8 +27,8 @@ Approach:
 - Minimize attack surface: remove unused deps, prefer well-maintained packages.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Vulnerability report: CVE ID, severity, affected versions, fix version, exploitability.
diff --git a/agents/claude-subagents/devops-infra-as-code.md b/agents/claude-subagents/devops-infra-as-code.md
index 30d34363..13969546 100644
--- a/agents/claude-subagents/devops-infra-as-code.md
+++ b/agents/claude-subagents/devops-infra-as-code.md
@@ -25,11 +25,9 @@ Approach:
 - Document runbooks and patterns; standardize as reusable modules/templates.

 Must-read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: quick MCP decision guide)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2: Serena vs grep vs Sourcegraph for infra/code searches)
-- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3: scanning IaC/K8s/CI for security/hygiene)
-- the [docs style guide](../reference/style-guides/docs-style-guide.md) (structure and formatting for deliverables)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: quick MCP decision guide)
+- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (structure and formatting for deliverables)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Current vs target state: risks, constraints, goals.
diff --git a/agents/claude-subagents/distributed-systems-architect.md b/agents/claude-subagents/distributed-systems-architect.md
index 7610d18d..f799601c 100644
--- a/agents/claude-subagents/distributed-systems-architect.md
+++ b/agents/claude-subagents/distributed-systems-architect.md
@@ -25,10 +25,8 @@ Approach:

 Must‑read at startup:
 - the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
-- the [web research guide](../../protocols/context/static/by-category/web-research.md) (Tier 2)
-- the [Sourcegraph deep dive](../../protocols/context/static/deep-dives/sourcegraph.md) (Tier 3 as needed)
 - the [docs style guide](../../protocols/context/static/tools-guide.md) (for concise outputs)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Architecture brief: goals, CAP stance, consistency model, failure assumptions.
diff --git a/agents/claude-subagents/distributed-systems.md b/agents/claude-subagents/distributed-systems.md
index 61bbbd5f..16e098a1 100644
--- a/agents/claude-subagents/distributed-systems.md
+++ b/agents/claude-subagents/distributed-systems.md
@@ -25,10 +25,8 @@ Approach:
 - Validate: tracing/correlation IDs; chaos tests for partitions/clocks.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [web research guide](../reference/by-category/web-research.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Architecture brief: goals, CAP stance, consistency model, failure assumptions.
diff --git a/agents/claude-subagents/environment-debugger.md b/agents/claude-subagents/environment-debugger.md
index 305ae91c..ba777571 100644
--- a/agents/claude-subagents/environment-debugger.md
+++ b/agents/claude-subagents/environment-debugger.md
@@ -29,8 +29,8 @@ Approach:
 - Automate checks: version validation in CI, environment linting.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Diagnosis: what's different between environments.
diff --git a/agents/claude-subagents/event-driven.md b/agents/claude-subagents/event-driven.md
index 38ce2c2a..f0543869 100644
--- a/agents/claude-subagents/event-driven.md
+++ b/agents/claude-subagents/event-driven.md
@@ -29,10 +29,8 @@ Approach:
 - Error handling: dead‑letter queues, retries with backoff, poison message detection.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Event flow: diagram with producers/consumers/topics, partitioning, ordering guarantees.
diff --git a/agents/claude-subagents/explainer.md b/agents/claude-subagents/explainer.md
index f690438c..1fab0a6d 100644
--- a/agents/claude-subagents/explainer.md
+++ b/agents/claude-subagents/explainer.md
@@ -18,7 +18,7 @@ When to invoke:
 - Clarifying why architectural decisions were made

 At startup, read:
-- the [compact MCP list](../reference/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them)

 Approach:
 1. Start with the README, package.json/requirements.txt, or entry points to map high-level structure
@@ -26,7 +26,6 @@ Approach:
 3. Trace a single, simple execution path end-to-end before generalizing
 4. Use analogies and diagrams when helpful (offer to create mermaid visualizations)
 5. Pause after each concept to check understanding before proceeding
-6. For complex tools, read references in [per-category MCP guides](../reference/by-category/) and [per-MCP deep dives](../reference/deep-dives/) as needed

 Output format:
 - Layered explanations: overview → key components → detailed walkthrough
diff --git a/agents/claude-subagents/feature-flag-engineer.md b/agents/claude-subagents/feature-flag-engineer.md
index 0a5a77d4..ac3bd2b0 100644
--- a/agents/claude-subagents/feature-flag-engineer.md
+++ b/agents/claude-subagents/feature-flag-engineer.md
@@ -28,8 +28,8 @@ Approach:
 - Cleanup automation: lint rules for stale flags, alerts for expired flags, removal PRs.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Flag specification: name, description, type, default, targeting rules, expiration.
diff --git a/agents/claude-subagents/finops-optimizer.md b/agents/claude-subagents/finops-optimizer.md
index c702cfee..59a74979 100644
--- a/agents/claude-subagents/finops-optimizer.md
+++ b/agents/claude-subagents/finops-optimizer.md
@@ -27,10 +27,8 @@ Approach:

 Must‑read at startup:
 - the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../../protocols/context/static/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../../protocols/context/static/deep-dives/sourcegraph.md) (Tier 3 as needed)
 - the [docs style guide](../../protocols/context/static/tools-guide.md) (for concise outputs)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Summary: baseline, top cost drivers, savings targets, constraints.
diff --git a/agents/claude-subagents/frontend.md b/agents/claude-subagents/frontend.md
index edbe8b89..8c0559de 100644
--- a/agents/claude-subagents/frontend.md
+++ b/agents/claude-subagents/frontend.md
@@ -26,11 +26,9 @@ Approach:
 - Measure every change (Lighthouse/RUM); block regressions.

 Must-read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: quick tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2: code navigation/search choices)
-- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3: scanning UI/a11y/security anti-patterns)
-- the [docs style guide](../reference/style-guides/docs-style-guide.md) (structure and formatting for deliverables)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: quick tool selection)
+- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (structure and formatting for deliverables)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Audit: baseline Web Vitals, a11y gaps, bundle analysis, and prioritized issues.
diff --git a/agents/claude-subagents/git-surgeon.md b/agents/claude-subagents/git-surgeon.md
index 87b5af1b..cacf6a26 100644
--- a/agents/claude-subagents/git-surgeon.md
+++ b/agents/claude-subagents/git-surgeon.md
@@ -28,8 +28,8 @@ Approach:
 - Clean history: atomic commits, meaningful messages, no merge commits in features.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Commands: exact git commands to run, in order, with explanations.
diff --git a/agents/claude-subagents/golang-pro.md b/agents/claude-subagents/golang-pro.md
index 96b36e32..3f854974 100644
--- a/agents/claude-subagents/golang-pro.md
+++ b/agents/claude-subagents/golang-pro.md
@@ -26,9 +26,8 @@ Approach:
 - Stage small commits with clear rationale and rollback paths.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Diff summary with rationale and Go idioms.
diff --git a/agents/claude-subagents/graphql-specialist.md b/agents/claude-subagents/graphql-specialist.md
index 861621e9..d4e64667 100644
--- a/agents/claude-subagents/graphql-specialist.md
+++ b/agents/claude-subagents/graphql-specialist.md
@@ -27,8 +27,8 @@ Approach:
 - Monitor: field-level tracing, query complexity scoring, slow query logging.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Schema SDL: types, queries, mutations with descriptions and deprecation notices.
diff --git a/agents/claude-subagents/historian.md b/agents/claude-subagents/historian.md
index a49937f2..1aa5f868 100644
--- a/agents/claude-subagents/historian.md
+++ b/agents/claude-subagents/historian.md
@@ -29,10 +29,8 @@ Approach:
 - Extract rationale: commit messages, PR descriptions, code comments, design docs.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Timeline: chronological narrative with commits, PRs, issues, and decision points.
diff --git a/agents/claude-subagents/http-client-specialist.md b/agents/claude-subagents/http-client-specialist.md
index a504eb34..4cd0e3fb 100644
--- a/agents/claude-subagents/http-client-specialist.md
+++ b/agents/claude-subagents/http-client-specialist.md
@@ -28,8 +28,8 @@ Approach:
 - Test failure modes: simulate timeouts, 5xx errors, connection refused.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Client configuration: timeouts, pool sizes, retry policy, circuit breaker settings.
diff --git a/agents/claude-subagents/implementation-helper.md b/agents/claude-subagents/implementation-helper.md
index d90c895c..2aec5a6c 100644
--- a/agents/claude-subagents/implementation-helper.md
+++ b/agents/claude-subagents/implementation-helper.md
@@ -20,8 +20,8 @@ When to invoke:
 - Making principled engineering decisions under constraints

 At startup, read:
-- the [compact MCP list](../reference/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them)
-- the [handoff guidelines](../reference/handoff-guide.md) (delegation rules)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) (delegation rules)

 Approach:
 1. Clarify scope and success criteria; identify ambiguities upfront
diff --git a/agents/claude-subagents/incident-commander.md b/agents/claude-subagents/incident-commander.md
index 60bdef55..15f89335 100644
--- a/agents/claude-subagents/incident-commander.md
+++ b/agents/claude-subagents/incident-commander.md
@@ -28,10 +28,8 @@ Approach:
 - Patterns: track recurring issues, MTTR, alert noise; propose systemic fixes.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Incident brief: severity, impact, start time, status, war room link.
diff --git a/agents/claude-subagents/interviewer.md b/agents/claude-subagents/interviewer.md
index d0f4378d..5dc1b2ab 100644
--- a/agents/claude-subagents/interviewer.md
+++ b/agents/claude-subagents/interviewer.md
@@ -29,10 +29,8 @@ Approach:
 - Provide feedback: acknowledge correct reasoning, gently correct misconceptions.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Question sequence: progressive difficulty, building on previous answers.
diff --git a/agents/claude-subagents/kubernetes-operator.md b/agents/claude-subagents/kubernetes-operator.md
index 85d6a5d8..ae2b26bc 100644
--- a/agents/claude-subagents/kubernetes-operator.md
+++ b/agents/claude-subagents/kubernetes-operator.md
@@ -27,8 +27,8 @@ Approach:
 - Observability: prometheus annotations, structured logging, tracing headers.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Manifests: Deployment, Service, ConfigMap, etc. with inline comments.
diff --git a/agents/claude-subagents/localization-engineer.md b/agents/claude-subagents/localization-engineer.md
index e99ba122..a35f3f7f 100644
--- a/agents/claude-subagents/localization-engineer.md
+++ b/agents/claude-subagents/localization-engineer.md
@@ -28,8 +28,8 @@ Approach:
 - Plan for text expansion: translations can be 30-50% longer.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Message catalog: extracted strings in ICU format or framework-specific.
diff --git a/agents/claude-subagents/log-analyst.md b/agents/claude-subagents/log-analyst.md
index 3bba9cfe..3b9b7664 100644
--- a/agents/claude-subagents/log-analyst.md
+++ b/agents/claude-subagents/log-analyst.md
@@ -27,8 +27,8 @@ Approach:
 - Context gathering: what happened before the error, related logs.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Timeline: sequence of events with timestamps and service names.
diff --git a/agents/claude-subagents/message-queue-architect.md b/agents/claude-subagents/message-queue-architect.md
index ad37e4b4..89ea4913 100644
--- a/agents/claude-subagents/message-queue-architect.md
+++ b/agents/claude-subagents/message-queue-architect.md
@@ -28,8 +28,8 @@ Approach:
 - Dead letter strategy: capture failures, enable investigation, support replay.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Architecture diagram: producers, topics/queues, consumers, DLQ flow.
diff --git a/agents/claude-subagents/migration-refactoring.md b/agents/claude-subagents/migration-refactoring.md
index 6171c7b0..6e2f339f 100644
--- a/agents/claude-subagents/migration-refactoring.md
+++ b/agents/claude-subagents/migration-refactoring.md
@@ -26,11 +26,9 @@ Approach:
 - Plan data migration (expand–migrate–contract), backfills, reconciliation; clean up flags/shims.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [docs style guide](../reference/style-guides/docs-style-guide.md) (concise specs/ADRs)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (concise specs/ADRs)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Migration spec: target, scope, assumptions, risks, success criteria.
diff --git a/agents/claude-subagents/mobile-architect.md b/agents/claude-subagents/mobile-architect.md
index c4176898..1e2bfc58 100644
--- a/agents/claude-subagents/mobile-architect.md
+++ b/agents/claude-subagents/mobile-architect.md
@@ -26,10 +26,8 @@ Approach:

 Must‑read at startup:
 - the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../../protocols/context/static/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../../protocols/context/static/deep-dives/sourcegraph.md) (Tier 3 as needed)
 - the [docs style guide](../../protocols/context/static/tools-guide.md) (for concise outputs)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Architecture brief: platform targets, framework decision, modules, state/nav pattern, risks.
diff --git a/agents/claude-subagents/monorepo-architect.md b/agents/claude-subagents/monorepo-architect.md
index 09fb198a..932b4b8d 100644
--- a/agents/claude-subagents/monorepo-architect.md
+++ b/agents/claude-subagents/monorepo-architect.md
@@ -27,8 +27,8 @@ Approach:
 - Code sharing: internal packages with proper exports, avoid barrel files at scale.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Structure diagram: folder hierarchy with package boundaries and dependencies.
diff --git a/agents/claude-subagents/networking-edge-infra.md b/agents/claude-subagents/networking-edge-infra.md
index 10e98ed6..012830fb 100644
--- a/agents/claude-subagents/networking-edge-infra.md
+++ b/agents/claude-subagents/networking-edge-infra.md
@@ -27,11 +27,9 @@ Approach:
 - Plan rollout: canary by path/region, staged shifts; monitoring + rollback.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [docs style guide](../reference/style-guides/docs-style-guide.md) (decision docs)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (decision docs)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Summary: objectives, constraints, SLOs, assumptions.
diff --git a/agents/claude-subagents/observability.md b/agents/claude-subagents/observability.md
index a56b56b5..64cffcf0 100644
--- a/agents/claude-subagents/observability.md
+++ b/agents/claude-subagents/observability.md
@@ -25,10 +25,8 @@ Approach:
 - For incidents: declare, stabilize, correlate traces/metrics/logs, document timeline and follow‑ups.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Telemetry plan: coverage deltas, propagation, key metrics/log fields.
diff --git a/agents/claude-subagents/optimization.md b/agents/claude-subagents/optimization.md
index 1cc28e99..383bc573 100644
--- a/agents/claude-subagents/optimization.md
+++ b/agents/claude-subagents/optimization.md
@@ -24,10 +24,8 @@ Approach:
 - Re-measure after each change; keep diffs small and reversible.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Summary: baseline vs targets; constraints and risks.
diff --git a/agents/claude-subagents/orm-optimization-specialist.md b/agents/claude-subagents/orm-optimization-specialist.md
index 36e47563..93aa5a7b 100644
--- a/agents/claude-subagents/orm-optimization-specialist.md
+++ b/agents/claude-subagents/orm-optimization-specialist.md
@@ -28,8 +28,8 @@ Approach:
 - Test with realistic data: N+1 is invisible with 2 records, catastrophic with 1000.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Query analysis: current queries, N+1 identification, query count per operation.
diff --git a/agents/claude-subagents/platform-eng.md b/agents/claude-subagents/platform-eng.md
index 563865bf..6536525e 100644
--- a/agents/claude-subagents/platform-eng.md
+++ b/agents/claude-subagents/platform-eng.md
@@ -24,11 +24,9 @@ Approach:
 - Document in the portal; version templates/actions; provide upgrade paths and escape hatches.

 Must-read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2)
-- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3)
-- the [docs style guide](../reference/style-guides/docs-style-guide.md)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1)
+- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Findings: sprawl hotspots/manual steps with evidence (paths:lines).
diff --git a/agents/claude-subagents/realtime.md b/agents/claude-subagents/realtime.md
index 8c844954..7bdfaf85 100644
--- a/agents/claude-subagents/realtime.md
+++ b/agents/claude-subagents/realtime.md
@@ -25,11 +25,9 @@ Approach:
 - Validate via load/soak/replay/chaos; stage behind flags/canaries; re-measure.

 Must-read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: quick tool selection)
-- the [code search guide](../reference/by-category/code-search.md) (Tier 2: code navigation/search choices)
-- the [Context7 deep dive](../reference/deep-dives/context7.md) (Tier 3: official framework/RTOS/protocol docs)
-- the [docs style guide](../reference/style-guides/docs-style-guide.md) (structure and formatting for deliverables)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: quick tool selection)
+- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (structure and formatting for deliverables)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Budget: topology, per-hop targets, assumptions, constraints.
diff --git a/agents/claude-subagents/regex-wizard.md b/agents/claude-subagents/regex-wizard.md
index b49f0e33..8892aaf8 100644
--- a/agents/claude-subagents/regex-wizard.md
+++ b/agents/claude-subagents/regex-wizard.md
@@ -27,8 +27,8 @@ Approach:
 - Document: explain each part of the pattern inline.

 Must‑read at startup:
-- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection)
-- the [handoff guidelines](../reference/handoff-guide.md)
+- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection)
+- the [handoff guidelines](../../protocols/context/static/handoff-guide.md)

 Output format:
 - Pattern: the regex with inline comments (x-flag style when supported).
diff --git a/agents/claude-subagents/rust-pro.md b/agents/claude-subagents/rust-pro.md
index ad33455b..12499927 100644
--- a/agents/claude-subagents/rust-pro.md
+++ b/agents/claude-subagents/rust-pro.md
@@ -26,9 +26,8 @@ Approach:
 - Stage small commits with rationale and rollback paths.
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Diff summary with rationale, referenced Rust idioms. diff --git a/agents/claude-subagents/scalability-reliability.md b/agents/claude-subagents/scalability-reliability.md index 08783d11..ea77bbc0 100644 --- a/agents/claude-subagents/scalability-reliability.md +++ b/agents/claude-subagents/scalability-reliability.md @@ -25,11 +25,9 @@ Approach: - Codify runbooks, dashboards‑as‑code, and post‑incident action tracking; reduce alert noise. Must-read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: quick tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2: navigating code/config for reliability patterns) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3: scanning for reliability/security anti‑patterns) -- the [docs style guide](../reference/style-guides/docs-style-guide.md) (structure and formatting for deliverables) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: quick tool selection) +- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (structure and formatting for deliverables) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - SLO spec: SLIs, targets, windows; error‑budget policy and alerts. 
diff --git a/agents/claude-subagents/schema-evolution.md b/agents/claude-subagents/schema-evolution.md index 5a950bdb..38ab30f0 100644 --- a/agents/claude-subagents/schema-evolution.md +++ b/agents/claude-subagents/schema-evolution.md @@ -29,10 +29,8 @@ Approach: - API versioning: URL/header versioning, deprecation notices, sunset timelines. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Migration plan: phases (expand/migrate/contract), DDL statements, rollback steps. diff --git a/agents/claude-subagents/search-implementation-specialist.md b/agents/claude-subagents/search-implementation-specialist.md index f8150a6c..994d3a10 100644 --- a/agents/claude-subagents/search-implementation-specialist.md +++ b/agents/claude-subagents/search-implementation-specialist.md @@ -28,8 +28,8 @@ Approach: - Iterate: relevance is never "done"; instrument, measure, improve continuously. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Index mapping: field types, analyzers, multi-fields for different query patterns. 
diff --git a/agents/claude-subagents/searcher.md b/agents/claude-subagents/searcher.md index 21d0ce8b..6c7b0e36 100644 --- a/agents/claude-subagents/searcher.md +++ b/agents/claude-subagents/searcher.md @@ -26,10 +26,8 @@ Approach: - Present the user with a list of relevant code snippets. Must-read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - A list of code snippets, each with the file path and line number. diff --git a/agents/claude-subagents/security-compliance.md b/agents/claude-subagents/security-compliance.md index 15ab206a..add9cf56 100644 --- a/agents/claude-subagents/security-compliance.md +++ b/agents/claude-subagents/security-compliance.md @@ -24,10 +24,8 @@ Approach: - Verify/capture: targeted tests; adherence dashboards; store decisions/evidence. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- a [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Risk summary: top findings, severity, assets, impact. 
diff --git a/agents/claude-subagents/serverless-specialist.md b/agents/claude-subagents/serverless-specialist.md index 5a6084a5..e832a9ff 100644 --- a/agents/claude-subagents/serverless-specialist.md +++ b/agents/claude-subagents/serverless-specialist.md @@ -28,8 +28,8 @@ Approach: - Idempotency: event sources may duplicate; design handlers to be replayable. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Function design: handler structure, initialization, event parsing, response format. diff --git a/agents/claude-subagents/shell-scripter.md b/agents/claude-subagents/shell-scripter.md index 5d5bbba5..6226c886 100644 --- a/agents/claude-subagents/shell-scripter.md +++ b/agents/claude-subagents/shell-scripter.md @@ -27,8 +27,8 @@ Approach: - Clean up: trap EXIT for cleanup, use mktemp for temp files. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Script: complete, commented, with shebang and set flags. diff --git a/agents/claude-subagents/state-machine-designer.md b/agents/claude-subagents/state-machine-designer.md index 89c06ffa..46963f83 100644 --- a/agents/claude-subagents/state-machine-designer.md +++ b/agents/claude-subagents/state-machine-designer.md @@ -28,8 +28,8 @@ Approach: - Visualize always: state diagrams are documentation and debugging tools. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - State diagram: visual representation (Mermaid, XState Viz, or ASCII). diff --git a/agents/claude-subagents/task-decomposer.md b/agents/claude-subagents/task-decomposer.md index a2c5d899..96940983 100644 --- a/agents/claude-subagents/task-decomposer.md +++ b/agents/claude-subagents/task-decomposer.md @@ -29,8 +29,8 @@ Approach: - Agent routing: map subtasks to roles (architect, debugger, testing, etc.). Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) (when to delegate to which agent) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) (when to delegate to which agent) Output format: - Summary: problem statement, success criteria, constraints, assumptions. diff --git a/agents/claude-subagents/tech-debt.md b/agents/claude-subagents/tech-debt.md index c66ff89f..ca278648 100644 --- a/agents/claude-subagents/tech-debt.md +++ b/agents/claude-subagents/tech-debt.md @@ -26,11 +26,9 @@ Approach: - Sequence work into safe iterations; measure deltas after each step. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [docs style guide](../reference/style-guides/docs-style-guide.md) (for concise decision docs) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (for concise decision docs) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: context, objectives, constraints; current risk level. diff --git a/agents/claude-subagents/terraform-specialist.md b/agents/claude-subagents/terraform-specialist.md index ed73a325..cda89874 100644 --- a/agents/claude-subagents/terraform-specialist.md +++ b/agents/claude-subagents/terraform-specialist.md @@ -27,8 +27,8 @@ Approach: - Prefer data sources over hardcoded IDs; use `moved` blocks for refactors. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Plan summary: resources to add/change/destroy with risk assessment (low/medium/high). diff --git a/agents/claude-subagents/testing.md b/agents/claude-subagents/testing.md index 6db366f7..f6815ef7 100644 --- a/agents/claude-subagents/testing.md +++ b/agents/claude-subagents/testing.md @@ -25,10 +25,8 @@ Approach: - Prepare minimal, safe diffs and rollout notes; validate trends in CI. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Findings: flake hotspots and gaps with evidence (paths:lines). diff --git a/agents/claude-subagents/type-system-expert.md b/agents/claude-subagents/type-system-expert.md index 0982e6d3..7bfb1401 100644 --- a/agents/claude-subagents/type-system-expert.md +++ b/agents/claude-subagents/type-system-expert.md @@ -28,8 +28,8 @@ Approach: - Document complex types: JSDoc, comments, examples. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Type definitions: with inline comments explaining each part. diff --git a/agents/claude-subagents/webhook-integration-specialist.md b/agents/claude-subagents/webhook-integration-specialist.md index ae9bd3f1..b35f29bd 100644 --- a/agents/claude-subagents/webhook-integration-specialist.md +++ b/agents/claude-subagents/webhook-integration-specialist.md @@ -27,8 +27,8 @@ Approach: - **Sending:** provide event logs, retry buttons, and delivery status in dashboard. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Endpoint implementation: signature verification, quick response, async processing. diff --git a/agents/role-prompts/accessibility-auditor.md b/agents/role-prompts/accessibility-auditor.md index ee06db81..59b72332 100644 --- a/agents/role-prompts/accessibility-auditor.md +++ b/agents/role-prompts/accessibility-auditor.md @@ -21,8 +21,8 @@ Approach: - Motion: respect prefers-reduced-motion; provide pause controls for animations. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Audit report: issues by WCAG criterion, severity, affected elements. diff --git a/agents/role-prompts/ai-ml-eng.md b/agents/role-prompts/ai-ml-eng.md index 71a44b08..35e5e503 100644 --- a/agents/role-prompts/ai-ml-eng.md +++ b/agents/role-prompts/ai-ml-eng.md @@ -20,10 +20,8 @@ Approach: - Document decisions and a staged plan; prepare safe diffs. 
Must-read at startup: -- the [compact MCP list](../reference/tools-guide.md) -- the [code search guide](../reference/by-category/code-search.md) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Findings: risks, hotspots, and evidence (paths:lines) with short rationale. diff --git a/agents/role-prompts/api-client-designer.md b/agents/role-prompts/api-client-designer.md index 79974e3e..c0b5bdab 100644 --- a/agents/role-prompts/api-client-designer.md +++ b/agents/role-prompts/api-client-designer.md @@ -23,10 +23,8 @@ Approach: - Request coalescing: batch APIs, deduplication windows, streaming for high volume. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [documentation category guide](../reference/by-category/documentation.md) (Tier 2) -- the [Context7 deep dive](../reference/deep-dives/context7.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - SDK design: initialization, auth, client lifecycle, resource cleanup. diff --git a/agents/role-prompts/api-integration.md b/agents/role-prompts/api-integration.md index d00e81b8..fb2a4fe6 100644 --- a/agents/role-prompts/api-integration.md +++ b/agents/role-prompts/api-integration.md @@ -20,11 +20,9 @@ Approach: - Wire contract/conformance tests to CI; add per‑endpoint SLOs; plan rollout/rollback. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [documentation category guide](../reference/by-category/documentation.md) (Tier 2) -- the [Context7 deep dive](../reference/deep-dives/context7.md) (Tier 3 as needed) -- the [docs style guide](../reference/style-guides/docs-style-guide.md) (clear ADRs/decision docs) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (clear ADRs/decision docs) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: goals, scope, trust boundaries, SLOs, assumptions. diff --git a/agents/role-prompts/api-mocking-specialist.md b/agents/role-prompts/api-mocking-specialist.md index ae912934..2add22b8 100644 --- a/agents/role-prompts/api-mocking-specialist.md +++ b/agents/role-prompts/api-mocking-specialist.md @@ -21,8 +21,8 @@ Approach: - Record/replay: capture real responses for realistic mocking. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Mock implementation: MSW handlers, WireMock mappings, Pact contracts. diff --git a/agents/role-prompts/architect.md b/agents/role-prompts/architect.md index 3e7ff457..b250fccb 100644 --- a/agents/role-prompts/architect.md +++ b/agents/role-prompts/architect.md @@ -19,9 +19,8 @@ Approach: - Specify artifacts (ADRs/diagrams/SLO sketches) and enable a walking skeleton. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Decision brief: objectives, constraints, options, trade‑offs, chosen path, revisit triggers; evidence links. diff --git a/agents/role-prompts/architecture-audit.md b/agents/role-prompts/architecture-audit.md index 5ca63b0d..ae8b226f 100644 --- a/agents/role-prompts/architecture-audit.md +++ b/agents/role-prompts/architecture-audit.md @@ -13,9 +13,8 @@ When to invoke: - Onboarding to inherited/acquired codebases At startup, read: -- the [compact MCP list](../reference/tools-guide.md) (tier 1: tool selection) -- the [code search guide](../reference/category/code-search.md) (tier 2: finding patterns) -- the [handoff guidelines](../reference/handoff-guide.md) (delegation rules) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) (delegation rules) Approach: 1. 
Map high-level structure: layers, modules, boundaries, data flows @@ -36,4 +35,3 @@ Constraints and handoffs: - Distinguish must-fix (security, correctness, blockers) from should-improve - Use clink to delegate specialized deep dives (Semgrep for patterns, performance analysis) - AskUserQuestion for business constraints, roadmap priorities, team capacity -- Read tier 3 deep dives when specialized analysis needed diff --git a/agents/role-prompts/auth-specialist.md b/agents/role-prompts/auth-specialist.md index b30442fe..b61edab8 100644 --- a/agents/role-prompts/auth-specialist.md +++ b/agents/role-prompts/auth-specialist.md @@ -23,10 +23,8 @@ Approach: - Zero‑trust: mTLS, service mesh identities, short‑lived certs, least privilege. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Flow diagrams: auth/authz sequences with security boundaries and token exchanges. diff --git a/agents/role-prompts/background-job-architect.md b/agents/role-prompts/background-job-architect.md index 2d09e111..c0f5dcfa 100644 --- a/agents/role-prompts/background-job-architect.md +++ b/agents/role-prompts/background-job-architect.md @@ -22,8 +22,8 @@ Approach: - Graceful shutdown: finish current job or checkpoint before worker dies. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Job definition: class/function, arguments, queue, retry policy, timeout. diff --git a/agents/role-prompts/build-optimizer.md b/agents/role-prompts/build-optimizer.md index 17f208e0..027ef243 100644 --- a/agents/role-prompts/build-optimizer.md +++ b/agents/role-prompts/build-optimizer.md @@ -21,8 +21,8 @@ Approach: - Minimize transforms: avoid unnecessary Babel plugins, use native ESM. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Config changes: webpack.config.js, vite.config.ts, etc. with comments. diff --git a/agents/role-prompts/caching-specialist.md b/agents/role-prompts/caching-specialist.md index 68f4d7d6..d5de113b 100644 --- a/agents/role-prompts/caching-specialist.md +++ b/agents/role-prompts/caching-specialist.md @@ -23,10 +23,8 @@ Approach: - Monitoring: track hit rate, miss rate, latency, evictions, memory usage. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Cache topology: layers (browser/CDN/app/DB), responsibilities, TTLs. diff --git a/agents/role-prompts/chaos-engineer.md b/agents/role-prompts/chaos-engineer.md index 999ce6fc..1359dccd 100644 --- a/agents/role-prompts/chaos-engineer.md +++ b/agents/role-prompts/chaos-engineer.md @@ -22,10 +22,8 @@ Approach: - Automate: codify experiments in CI/staging; run continuously or on schedule. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Experiment plan: hypothesis, failure scenario, blast radius, abort criteria. diff --git a/agents/role-prompts/ci-pipeline-builder.md b/agents/role-prompts/ci-pipeline-builder.md index 59226eab..cc5571fd 100644 --- a/agents/role-prompts/ci-pipeline-builder.md +++ b/agents/role-prompts/ci-pipeline-builder.md @@ -21,8 +21,8 @@ Approach: - Reusability: composite actions, reusable workflows, shared templates. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Pipeline config: .github/workflows/*.yml, .gitlab-ci.yml, Jenkinsfile, etc. diff --git a/agents/role-prompts/code-reviewer.md b/agents/role-prompts/code-reviewer.md index 54ebf284..6f4e900e 100644 --- a/agents/role-prompts/code-reviewer.md +++ b/agents/role-prompts/code-reviewer.md @@ -23,10 +23,8 @@ Approach: - Provide specific feedback: file:line, suggested fix, rationale, severity (blocker/optional). Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: PR scope, risk level, overall assessment (approve/request changes/block). diff --git a/agents/role-prompts/concurrency-specialist.md b/agents/role-prompts/concurrency-specialist.md index 0097f2dc..d56eb1f3 100644 --- a/agents/role-prompts/concurrency-specialist.md +++ b/agents/role-prompts/concurrency-specialist.md @@ -22,8 +22,8 @@ Approach: - Reason about happens-before: understand memory models and ordering. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Analysis: shared state identification, potential race conditions. diff --git a/agents/role-prompts/cost-optimization-finops.md b/agents/role-prompts/cost-optimization-finops.md index d9813e1b..90abe4a0 100644 --- a/agents/role-prompts/cost-optimization-finops.md +++ b/agents/role-prompts/cost-optimization-finops.md @@ -17,10 +17,8 @@ Approach: - Governance/anomalies: enforce tags, budgets/alerts, PR checks; correlate spikes; rollback/guardrail; document. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: baseline spend, top drivers, savings targets, constraints. diff --git a/agents/role-prompts/cpp-pro.md b/agents/role-prompts/cpp-pro.md index 7e3c3bab..1f63f301 100644 --- a/agents/role-prompts/cpp-pro.md +++ b/agents/role-prompts/cpp-pro.md @@ -20,10 +20,8 @@ Approach: - Validate: benchmark deltas, perf counters, compile‑time checks; document diffs/risks. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: context, target standard, constraints, assumptions. diff --git a/agents/role-prompts/data-eng.md b/agents/role-prompts/data-eng.md index d789da2b..cadcee28 100644 --- a/agents/role-prompts/data-eng.md +++ b/agents/role-prompts/data-eng.md @@ -22,10 +22,8 @@ Approach: - Backfills: incremental with checkpointing; validate output; minimal recompute. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Data flow: DAG with sources/transforms/sinks, dependencies, SLAs. diff --git a/agents/role-prompts/datetime-specialist.md b/agents/role-prompts/datetime-specialist.md index 4372a3ae..295b9811 100644 --- a/agents/role-prompts/datetime-specialist.md +++ b/agents/role-prompts/datetime-specialist.md @@ -22,8 +22,8 @@ Approach: - Document assumptions: what timezone is input, what timezone is output. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Data model: how datetime is stored, what timezone context is preserved. diff --git a/agents/role-prompts/db-internals.md b/agents/role-prompts/db-internals.md index 1f056f51..f9bc6a98 100644 --- a/agents/role-prompts/db-internals.md +++ b/agents/role-prompts/db-internals.md @@ -21,10 +21,8 @@ Approach: - Prepare staged, reversible migrations/backfills; validate in staging. Must-read at startup: -- the [compact MCP list](../reference/tools-guide.md) -- the [code search guide](../reference/by-category/code-search.md) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Findings: affected queries/tables with evidence (paths:lines, plans). diff --git a/agents/role-prompts/debugger.md b/agents/role-prompts/debugger.md index be97397d..c5d6fbd4 100644 --- a/agents/role-prompts/debugger.md +++ b/agents/role-prompts/debugger.md @@ -21,10 +21,8 @@ Approach: - Document findings: timeline, evidence (logs/traces/dumps), root cause, fix validation. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Repro: minimal steps and environment to trigger the issue. diff --git a/agents/role-prompts/dependency-auditor.md b/agents/role-prompts/dependency-auditor.md index edc375ba..151701a5 100644 --- a/agents/role-prompts/dependency-auditor.md +++ b/agents/role-prompts/dependency-auditor.md @@ -21,8 +21,8 @@ Approach: - Minimize attack surface: remove unused deps, prefer well-maintained packages. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Vulnerability report: CVE ID, severity, affected versions, fix version, exploitability. diff --git a/agents/role-prompts/devops-infra-as-code.md b/agents/role-prompts/devops-infra-as-code.md index 3daa07ed..cb8b3e8c 100644 --- a/agents/role-prompts/devops-infra-as-code.md +++ b/agents/role-prompts/devops-infra-as-code.md @@ -19,10 +19,8 @@ Approach: - Document runbooks and patterns; standardize as reusable modules/templates. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Current vs target state: risks, constraints, and goals. diff --git a/agents/role-prompts/distributed-systems.md b/agents/role-prompts/distributed-systems.md index bcc267e0..0594034c 100644 --- a/agents/role-prompts/distributed-systems.md +++ b/agents/role-prompts/distributed-systems.md @@ -19,10 +19,8 @@ Approach: - Validate: tracing/correlation IDs; chaos tests for partitions/clocks. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [web research guide](../reference/by-category/web-research.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Architecture brief: goals, CAP stance, consistency model, failure assumptions. diff --git a/agents/role-prompts/environment-debugger.md b/agents/role-prompts/environment-debugger.md index 5a024591..a3e6d360 100644 --- a/agents/role-prompts/environment-debugger.md +++ b/agents/role-prompts/environment-debugger.md @@ -23,8 +23,8 @@ Approach: - Automate checks: version validation in CI, environment linting. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Diagnosis: what's different between environments. diff --git a/agents/role-prompts/event-driven.md b/agents/role-prompts/event-driven.md index 0e7f365b..dc070a3f 100644 --- a/agents/role-prompts/event-driven.md +++ b/agents/role-prompts/event-driven.md @@ -23,10 +23,8 @@ Approach: - Error handling: dead‑letter queues, retries with backoff, poison message detection. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Event flow: diagram with producers/consumers/topics, partitioning, ordering guarantees. 
diff --git a/agents/role-prompts/explainer.md b/agents/role-prompts/explainer.md index 29ea1b37..fc158c9e 100644 --- a/agents/role-prompts/explainer.md +++ b/agents/role-prompts/explainer.md @@ -12,7 +12,7 @@ When to invoke: - Clarifying why architectural decisions were made At startup, read: -- the [compact MCP list](../reference/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them) Approach: 1. Start with the README, package.json/requirements.txt, or entry points to map high-level structure @@ -20,7 +20,6 @@ Approach: 3. Trace a single, simple execution path end-to-end before generalizing 4. Use analogies and diagrams when helpful (offer to create mermaid visualizations) 5. Pause after each concept to check understanding before proceeding -6. For complex tools, read references in [per-category MCP guides](../reference/by-category/) and [per-MCP deep dives](../reference/deep-dives/) as needed Output format: - Layered explanations: overview → key components → detailed walkthrough diff --git a/agents/role-prompts/feature-flag-engineer.md b/agents/role-prompts/feature-flag-engineer.md index 03379be3..52118662 100644 --- a/agents/role-prompts/feature-flag-engineer.md +++ b/agents/role-prompts/feature-flag-engineer.md @@ -22,8 +22,8 @@ Approach: - Cleanup automation: lint rules for stale flags, alerts for expired flags, removal PRs. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Flag specification: name, description, type, default, targeting rules, expiration. diff --git a/agents/role-prompts/frontend.md b/agents/role-prompts/frontend.md index 3ba9708c..3aa8bd15 100644 --- a/agents/role-prompts/frontend.md +++ b/agents/role-prompts/frontend.md @@ -20,10 +20,8 @@ Approach: - Measure every change (Lighthouse/RUM); block regressions. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Audit report: baseline Web Vitals, a11y gaps, bundle analysis. diff --git a/agents/role-prompts/git-surgeon.md b/agents/role-prompts/git-surgeon.md index 7cfbf49b..63473116 100644 --- a/agents/role-prompts/git-surgeon.md +++ b/agents/role-prompts/git-surgeon.md @@ -22,8 +22,8 @@ Approach: - Clean history: atomic commits, meaningful messages, no merge commits in features. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Commands: exact git commands to run, in order, with explanations. diff --git a/agents/role-prompts/golang-pro.md b/agents/role-prompts/golang-pro.md index 18da6df4..7ee0e7f6 100644 --- a/agents/role-prompts/golang-pro.md +++ b/agents/role-prompts/golang-pro.md @@ -20,9 +20,8 @@ Approach: - Stage small commits with clear rationale and rollback paths. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Diff summary with rationale and Go idioms. diff --git a/agents/role-prompts/graphql-specialist.md b/agents/role-prompts/graphql-specialist.md index 039dcb6a..9477aa22 100644 --- a/agents/role-prompts/graphql-specialist.md +++ b/agents/role-prompts/graphql-specialist.md @@ -21,8 +21,8 @@ Approach: - Monitor: field-level tracing, query complexity scoring, slow query logging. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Schema SDL: types, queries, mutations with descriptions and deprecation notices. 
diff --git a/agents/role-prompts/historian.md b/agents/role-prompts/historian.md index 1db6ec0a..677fd4ba 100644 --- a/agents/role-prompts/historian.md +++ b/agents/role-prompts/historian.md @@ -23,10 +23,8 @@ Approach: - Extract rationale: commit messages, PR descriptions, code comments, design docs. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Timeline: chronological narrative with commits, PRs, issues, and decision points. diff --git a/agents/role-prompts/http-client-specialist.md b/agents/role-prompts/http-client-specialist.md index 646e2138..f2fbe559 100644 --- a/agents/role-prompts/http-client-specialist.md +++ b/agents/role-prompts/http-client-specialist.md @@ -22,8 +22,8 @@ Approach: - Test failure modes: simulate timeouts, 5xx errors, connection refused. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Client configuration: timeouts, pool sizes, retry policy, circuit breaker settings. 
diff --git a/agents/role-prompts/implementation-helper.md b/agents/role-prompts/implementation-helper.md index 1205bf13..90150806 100644 --- a/agents/role-prompts/implementation-helper.md +++ b/agents/role-prompts/implementation-helper.md @@ -14,8 +14,8 @@ When to invoke: - Making principled engineering decisions under constraints At startup, read: -- the [compact MCP list](../reference/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them) -- the [handoff guidelines](../reference/handoff-guide.md) (delegation rules) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) to make yourself fully aware of the MCP tools available to you, as well as the extra resources about them in this repo (for when you need them) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) (delegation rules) Approach: 1. Clarify scope and success criteria; identify ambiguities upfront diff --git a/agents/role-prompts/incident-commander.md b/agents/role-prompts/incident-commander.md index 856852b3..50380a74 100644 --- a/agents/role-prompts/incident-commander.md +++ b/agents/role-prompts/incident-commander.md @@ -22,10 +22,8 @@ Approach: - Patterns: track recurring issues, MTTR, alert noise; propose systemic fixes. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Incident brief: severity, impact, start time, status, war room link. 
diff --git a/agents/role-prompts/interviewer.md b/agents/role-prompts/interviewer.md index 0f7a99f1..40a941f3 100644 --- a/agents/role-prompts/interviewer.md +++ b/agents/role-prompts/interviewer.md @@ -23,10 +23,8 @@ Approach: - Provide feedback: acknowledge correct reasoning, gently correct misconceptions. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Question sequence: progressive difficulty, building on previous answers. diff --git a/agents/role-prompts/kubernetes-operator.md b/agents/role-prompts/kubernetes-operator.md index d77a5465..e8a2f93d 100644 --- a/agents/role-prompts/kubernetes-operator.md +++ b/agents/role-prompts/kubernetes-operator.md @@ -21,8 +21,8 @@ Approach: - Observability: prometheus annotations, structured logging, tracing headers. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Manifests: Deployment, Service, ConfigMap, etc. with inline comments. diff --git a/agents/role-prompts/localization-engineer.md b/agents/role-prompts/localization-engineer.md index 9d574712..57897105 100644 --- a/agents/role-prompts/localization-engineer.md +++ b/agents/role-prompts/localization-engineer.md @@ -22,8 +22,8 @@ Approach: - Plan for text expansion: translations can be 30-50% longer. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Message catalog: extracted strings in ICU format or framework-specific. diff --git a/agents/role-prompts/log-analyst.md b/agents/role-prompts/log-analyst.md index be634e59..d021e17e 100644 --- a/agents/role-prompts/log-analyst.md +++ b/agents/role-prompts/log-analyst.md @@ -21,8 +21,8 @@ Approach: - Context gathering: what happened before the error, related logs. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Timeline: sequence of events with timestamps and service names. diff --git a/agents/role-prompts/message-queue-architect.md b/agents/role-prompts/message-queue-architect.md index f42592f3..671ca47d 100644 --- a/agents/role-prompts/message-queue-architect.md +++ b/agents/role-prompts/message-queue-architect.md @@ -22,8 +22,8 @@ Approach: - Dead letter strategy: capture failures, enable investigation, support replay. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Architecture diagram: producers, topics/queues, consumers, DLQ flow. 
diff --git a/agents/role-prompts/migration-refactoring.md b/agents/role-prompts/migration-refactoring.md index bd0fca7b..f9415c72 100644 --- a/agents/role-prompts/migration-refactoring.md +++ b/agents/role-prompts/migration-refactoring.md @@ -20,11 +20,9 @@ Approach: - Plan data migration (expand–migrate–contract), backfills, reconciliation; clean up flags/shims. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [docs style guide](../reference/style-guides/docs-style-guide.md) (concise specs/ADRs) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (concise specs/ADRs) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Migration spec: target, scope, assumptions, risks, success criteria. diff --git a/agents/role-prompts/mobile-eng-architect.md b/agents/role-prompts/mobile-eng-architect.md index 2fd6c25b..f01922cb 100644 --- a/agents/role-prompts/mobile-eng-architect.md +++ b/agents/role-prompts/mobile-eng-architect.md @@ -18,10 +18,8 @@ Approach: - Release/CI: Fastlane/Gradle, signing, test gates, phased rollout, store checks. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Architecture brief: platforms, framework choice, modules, state/nav. diff --git a/agents/role-prompts/monorepo-architect.md b/agents/role-prompts/monorepo-architect.md index fe526d14..37c434d1 100644 --- a/agents/role-prompts/monorepo-architect.md +++ b/agents/role-prompts/monorepo-architect.md @@ -21,8 +21,8 @@ Approach: - Code sharing: internal packages with proper exports, avoid barrel files at scale. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Structure diagram: folder hierarchy with package boundaries and dependencies. diff --git a/agents/role-prompts/networking-edge-infra.md b/agents/role-prompts/networking-edge-infra.md index d19ad876..11d1d044 100644 --- a/agents/role-prompts/networking-edge-infra.md +++ b/agents/role-prompts/networking-edge-infra.md @@ -21,11 +21,9 @@ Approach: - Plan rollout: canary by path/region, staged shifts; monitoring + rollback. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [docs style guide](../reference/style-guides/docs-style-guide.md) (decision docs) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (decision docs) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: objectives, constraints, SLOs, assumptions. diff --git a/agents/role-prompts/observability.md b/agents/role-prompts/observability.md index 83cf8c1b..2eacd13d 100644 --- a/agents/role-prompts/observability.md +++ b/agents/role-prompts/observability.md @@ -19,10 +19,8 @@ Approach: - For incidents: declare, stabilize, correlate traces/metrics/logs, document timeline and follow‑ups. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Telemetry plan: coverage deltas, propagation, key metrics/log fields. diff --git a/agents/role-prompts/optimization.md b/agents/role-prompts/optimization.md index 81b6ffee..2da74cef 100644 --- a/agents/role-prompts/optimization.md +++ b/agents/role-prompts/optimization.md @@ -18,10 +18,8 @@ Approach: - Re-measure after each change; keep diffs small and reversible. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: baseline vs targets; constraints and risks. diff --git a/agents/role-prompts/orm-optimization-specialist.md b/agents/role-prompts/orm-optimization-specialist.md index ac1126f8..d43d0aee 100644 --- a/agents/role-prompts/orm-optimization-specialist.md +++ b/agents/role-prompts/orm-optimization-specialist.md @@ -22,8 +22,8 @@ Approach: - Test with realistic data: N+1 is invisible with 2 records, catastrophic with 1000. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Query analysis: current queries, N+1 identification, query count per operation. diff --git a/agents/role-prompts/platform-eng.md b/agents/role-prompts/platform-eng.md index 57097ed2..185b752b 100644 --- a/agents/role-prompts/platform-eng.md +++ b/agents/role-prompts/platform-eng.md @@ -19,10 +19,8 @@ Approach: - Document in portal; provide upgrade paths/escape hatches; track feedback. 
Must-read at startup: -- the [compact MCP list](../reference/tools-guide.md) -- the [code search guide](../reference/by-category/code-search.md) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Findings: sprawl hotspots/manual steps with evidence (paths:lines). diff --git a/agents/role-prompts/realtime.md b/agents/role-prompts/realtime.md index 2e3dfb34..cfff4551 100644 --- a/agents/role-prompts/realtime.md +++ b/agents/role-prompts/realtime.md @@ -19,10 +19,8 @@ Approach: - Validate with load/soak/replay/chaos; stage behind flags/canaries; re‑measure. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Context7 deep dive](../reference/deep-dives/context7.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Budget: topology, per‑hop targets, assumptions, and constraints. diff --git a/agents/role-prompts/regex-wizard.md b/agents/role-prompts/regex-wizard.md index ec71946c..eeb9891a 100644 --- a/agents/role-prompts/regex-wizard.md +++ b/agents/role-prompts/regex-wizard.md @@ -21,8 +21,8 @@ Approach: - Document: explain each part of the pattern inline. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Pattern: the regex with inline comments (x-flag style when supported). diff --git a/agents/role-prompts/rust-pro.md b/agents/role-prompts/rust-pro.md index ac2c9782..c283f45f 100644 --- a/agents/role-prompts/rust-pro.md +++ b/agents/role-prompts/rust-pro.md @@ -20,9 +20,8 @@ Approach: - Stage small commits with rationale and rollback paths. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Diff summary with rationale, referenced Rust idioms. diff --git a/agents/role-prompts/scalability-reliability.md b/agents/role-prompts/scalability-reliability.md index 1e767acb..202cb1f0 100644 --- a/agents/role-prompts/scalability-reliability.md +++ b/agents/role-prompts/scalability-reliability.md @@ -19,10 +19,8 @@ Approach: - Capture runbooks, dashboards‑as‑code, and post‑incident actions. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - SLO spec: SLIs, targets, windows; error‑budget policy and alerts. diff --git a/agents/role-prompts/schema-evolution.md b/agents/role-prompts/schema-evolution.md index 74f6fe82..2d34b30d 100644 --- a/agents/role-prompts/schema-evolution.md +++ b/agents/role-prompts/schema-evolution.md @@ -23,10 +23,8 @@ Approach: - API versioning: URL/header versioning, deprecation notices, sunset timelines. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Migration plan: phases (expand/migrate/contract), DDL statements, rollback steps. diff --git a/agents/role-prompts/search-implementation-specialist.md b/agents/role-prompts/search-implementation-specialist.md index b906c00d..8b531322 100644 --- a/agents/role-prompts/search-implementation-specialist.md +++ b/agents/role-prompts/search-implementation-specialist.md @@ -22,8 +22,8 @@ Approach: - Iterate: relevance is never "done"; instrument, measure, improve continuously. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Index mapping: field types, analyzers, multi-fields for different query patterns. diff --git a/agents/role-prompts/searcher.md b/agents/role-prompts/searcher.md index cc6fb82a..cc096efc 100644 --- a/agents/role-prompts/searcher.md +++ b/agents/role-prompts/searcher.md @@ -16,10 +16,8 @@ Approach: - Present the user with a list of relevant code snippets. Must-read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - A list of code snippets, each with the file path and line number. diff --git a/agents/role-prompts/security-compliance.md b/agents/role-prompts/security-compliance.md index e3661338..ca586c96 100644 --- a/agents/role-prompts/security-compliance.md +++ b/agents/role-prompts/security-compliance.md @@ -18,10 +18,8 @@ Approach: - Verify/capture: targeted tests; adherence dashboards; store decisions/evidence. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) (Tier 3 as needed) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Risk summary: top findings, severity, assets, impact. diff --git a/agents/role-prompts/serverless-specialist.md b/agents/role-prompts/serverless-specialist.md index 190789b0..fb13717a 100644 --- a/agents/role-prompts/serverless-specialist.md +++ b/agents/role-prompts/serverless-specialist.md @@ -22,8 +22,8 @@ Approach: - Idempotency: event sources may duplicate; design handlers to be replayable. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Function design: handler structure, initialization, event parsing, response format. diff --git a/agents/role-prompts/shell-scripter.md b/agents/role-prompts/shell-scripter.md index a7621b7e..999be34a 100644 --- a/agents/role-prompts/shell-scripter.md +++ b/agents/role-prompts/shell-scripter.md @@ -21,8 +21,8 @@ Approach: - Clean up: trap EXIT for cleanup, use mktemp for temp files. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Script: complete, commented, with shebang and set flags. diff --git a/agents/role-prompts/state-machine-designer.md b/agents/role-prompts/state-machine-designer.md index 1b02e9dd..b8b0ab1e 100644 --- a/agents/role-prompts/state-machine-designer.md +++ b/agents/role-prompts/state-machine-designer.md @@ -22,8 +22,8 @@ Approach: - Visualize always: state diagrams are documentation and debugging tools. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - State diagram: visual representation (Mermaid, XState Viz, or ASCII). diff --git a/agents/role-prompts/task-decomposer.md b/agents/role-prompts/task-decomposer.md index 8edca017..cbba94ae 100644 --- a/agents/role-prompts/task-decomposer.md +++ b/agents/role-prompts/task-decomposer.md @@ -23,8 +23,8 @@ Approach: - Agent routing: map subtasks to roles (architect, debugger, testing, etc.). Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) (when to delegate to which agent) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) (when to delegate to which agent) Output format: - Summary: problem statement, success criteria, constraints, assumptions. 
diff --git a/agents/role-prompts/tech-debt.md b/agents/role-prompts/tech-debt.md index 69a56a40..4a17d5ab 100644 --- a/agents/role-prompts/tech-debt.md +++ b/agents/role-prompts/tech-debt.md @@ -20,11 +20,9 @@ Approach: - Sequence work into safe iterations; measure deltas after each step. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [code search guide](../reference/by-category/code-search.md) (Tier 2) -- the [Sourcegraph deep dive](../reference/deep-dives/sourcegraph.md) (Tier 3 as needed) -- the [docs style guide](../reference/style-guides/docs-style-guide.md) (for concise decision docs) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [docs style guide](../../protocols/context/static/style-guides/docs-style-guide.md) (for concise decision docs) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Summary: context, objectives, constraints; current risk level. diff --git a/agents/role-prompts/terraform-specialist.md b/agents/role-prompts/terraform-specialist.md index 70f52903..96acfecc 100644 --- a/agents/role-prompts/terraform-specialist.md +++ b/agents/role-prompts/terraform-specialist.md @@ -21,8 +21,8 @@ Approach: - Prefer data sources over hardcoded IDs; use `moved` blocks for refactors. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Plan summary: resources to add/change/destroy with risk assessment (low/medium/high). 
diff --git a/agents/role-prompts/testing.md b/agents/role-prompts/testing.md index 251a09f7..f4762e47 100644 --- a/agents/role-prompts/testing.md +++ b/agents/role-prompts/testing.md @@ -21,10 +21,8 @@ Approach: - Document patterns; prepare minimal, safe diffs and rollout notes. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) -- the [code search guide](../reference/by-category/code-search.md) -- the [Semgrep deep dive](../reference/deep-dives/semgrep.md) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Findings: flake hotspots and gaps with evidence (paths:lines). diff --git a/agents/role-prompts/type-system-expert.md b/agents/role-prompts/type-system-expert.md index 6b32b9b5..2245a36f 100644 --- a/agents/role-prompts/type-system-expert.md +++ b/agents/role-prompts/type-system-expert.md @@ -22,8 +22,8 @@ Approach: - Document complex types: JSDoc, comments, examples. Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Type definitions: with inline comments explaining each part. diff --git a/agents/role-prompts/webhook-integration-specialist.md b/agents/role-prompts/webhook-integration-specialist.md index 316cc99b..0ce99202 100644 --- a/agents/role-prompts/webhook-integration-specialist.md +++ b/agents/role-prompts/webhook-integration-specialist.md @@ -21,8 +21,8 @@ Approach: - **Sending:** provide event logs, retry buttons, and delivery status in dashboard. 
Must‑read at startup: -- the [compact MCP list](../reference/tools-guide.md) (Tier 1: tool selection) -- the [handoff guidelines](../reference/handoff-guide.md) +- the [compact MCP list](../../protocols/context/static/tools-guide.md) (Tier 1: tool selection) +- the [handoff guidelines](../../protocols/context/static/handoff-guide.md) Output format: - Endpoint implementation: signature verification, quick response, async processing. diff --git a/agents/scripts/README.md b/agents/scripts/README.md new file mode 100644 index 00000000..3bcc0953 --- /dev/null +++ b/agents/scripts/README.md @@ -0,0 +1,23 @@ +# Agent setup scripts + +This directory contains scripts that set up Bureau's agent role prompts for use within the supported coding agent CLIs, both as background subagents and direct, main agents. + +## Why role prompt bodies are *embedded* in the launcher scripts generated for Codex & Gemini + +> [!NOTE] +> This section concerns *only* the launcher generator scripts [for Codex](set-up-codex-role-launchers.sh) and [for Gemini](set-up-gemini-role-launchers.sh). + +### Background + +- The launcher generators *embed* the role prompt content into each generated launcher script. +- These, in turn, only write their role prompt to a temp file when they're run (the temp file gets automatically cleaned via `trap`, and backups are used to avoid clobbering). + +### Rationale + +- **Maintains a snapshot of behavior:** launchers are stable even if role prompts are changed later (the role prompts only update upon the next run of the corresponding launcher generator script and/or `open-bureau`) +- **Launchers keep working without the repo** even if it's moved or deleted, avoiding runtime dependency on repo availability. +- **Keeping security/perms consistent** by not requiring launchers to read repo files at runtime. 
+ +> [!TIP] +> +> After updating any role prompts, you **must** run [`open-bureau`](../../bin/open-bureau) to regenerate the launcher scripts to contain the updated prompts. diff --git a/agents/scripts/set-up-agents.sh b/agents/scripts/set-up-agents.sh index 2cc0555d..69f6fc5d 100755 --- a/agents/scripts/set-up-agents.sh +++ b/agents/scripts/set-up-agents.sh @@ -5,12 +5,6 @@ set -euo pipefail -# Color codes for output -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -NC='\033[0m' # No Color - # Find the repo root (where this script's ancestor directory is) SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" AGENTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)" @@ -18,8 +12,9 @@ CLAUDE_AGENTS_DIRNAME="claude-subagents" CLINK_AGENTS_DIRNAME="role-prompts" REPO_ROOT="$(cd "$AGENTS_DIR/.." && pwd)" -# Source agent selection library +# Source internal Bureau libraries source "$REPO_ROOT/bin/lib/agent-selection.sh" +source "$REPO_ROOT/bin/lib/logging.sh" # Detect installed CLIs based on directory existence (exits if none found, logs detected CLIs) discover_agents @@ -27,30 +22,15 @@ discover_agents # Subdirectory name for symlinked agents AGENTS_SUBDIR="bureau-agents" -echo -e "${GREEN}Setting up Bureau agents${NC}" -echo -e "Repo root: $REPO_ROOT" -echo -e "Selected agents: ${AGENTS[*]}" -echo "" - -# Function to print step headers -print_step() { - echo -e "${YELLOW}==>${NC} $1" -} - -# Function to print success -print_success() { - echo -e "${GREEN}✓${NC} $1" -} - -# Function to print error and exit -print_error() { - echo -e "${RED}✗${NC} $1" >&2 - exit 1 -} +log_banner "Setting up Bureau agents" +echo "Repo root: $REPO_ROOT" +echo "Selected agents: ${AGENTS[*]}" +log_empty_line # Check if we're in the right place if [[ ! -d "$AGENTS_DIR/$CLAUDE_AGENTS_DIRNAME" ]] || [[ ! -d "$AGENTS_DIR/$CLINK_AGENTS_DIRNAME" ]]; then - print_error "Cannot find agent directories! 
($CLAUDE_AGENTS_DIRNAME/ and $CLINK_AGENTS_DIRNAME within $AGENTS_DIR)" + log_error "Cannot find agent directories! ($CLAUDE_AGENTS_DIRNAME/ and $CLINK_AGENTS_DIRNAME within $AGENTS_DIR)" + exit 1 fi # ============================================================================ @@ -58,19 +38,19 @@ fi # ============================================================================ # Only set up PAL if any agent is selected (PAL is cross-CLI, used by all) if [[ ${#AGENTS[@]} -gt 0 ]]; then - print_step "Setting up clink subagents for PAL MCP" + log_action "Setting up clink subagents for PAL MCP" # Create directory structure mkdir -p ~/.pal/cli_clients/systemprompts - print_success "Ensured/created ~/.pal/cli_clients/systemprompts" + log_success "Ensured/created ~/.pal/cli_clients/systemprompts" # Symlink role prompts folder if [[ -L ~/.pal/cli_clients/systemprompts/$AGENTS_SUBDIR ]]; then rm ~/.pal/cli_clients/systemprompts/$AGENTS_SUBDIR - print_success "Removed existing Bureau symlink at ~/.pal/cli_clients/systemprompts/$AGENTS_SUBDIR (to ensure consistency after any reconfiguration)" + log_success "Removed existing Bureau symlink at ~/.pal/cli_clients/systemprompts/$AGENTS_SUBDIR (to ensure consistency after any reconfiguration)" fi ln -s "$AGENTS_DIR/$CLINK_AGENTS_DIRNAME" ~/.pal/cli_clients/systemprompts/$AGENTS_SUBDIR - print_success "Symlinked role prompts (for use with clink) to ~/.pal/cli_clients/systemprompts/$AGENTS_SUBDIR" + log_success "Symlinked role prompts (for use with clink) to ~/.pal/cli_clients/systemprompts/$AGENTS_SUBDIR" echo "" fi @@ -79,23 +59,23 @@ fi # Step 2: Set up Claude Code subagents # ============================================================================ if agent_enabled "Claude Code"; then - print_step "Setting up Claude Code subagents" + log_action "Setting up Claude Code subagents" # Symlink Claude subagents folder if [[ -L ~/.claude/agents/$AGENTS_SUBDIR ]]; then rm ~/.claude/agents/$AGENTS_SUBDIR - print_success "Removed 
existing Bureau symlink at ~/.claude/agents/$AGENTS_SUBDIR (to ensure consistency after any reconfiguration)" + log_success "Removed existing Bureau symlink at ~/.claude/agents/$AGENTS_SUBDIR (to ensure consistency after any reconfiguration)" fi mkdir -p ~/.claude/agents - print_success "Ensured/created ~/.claude/agents directory" + log_success "Ensured/created ~/.claude/agents directory" # Symlink all Claude subagent files ln -s "$AGENTS_DIR/$CLAUDE_AGENTS_DIRNAME" ~/.claude/agents/$AGENTS_SUBDIR - print_success "Symlinked Claude subagent templates/role prompts to ~/.claude/agents/$AGENTS_SUBDIR" + log_success "Symlinked Claude subagent templates/role prompts to ~/.claude/agents/$AGENTS_SUBDIR" echo "" else - print_step "Skipping Claude Code subagents (CLI directory not found)" + log_action "Skipping Claude Code subagents (CLI directory not found)" echo "" fi @@ -105,49 +85,67 @@ fi # Claude Code slash commands if agent_enabled "Claude Code"; then - print_step "Setting up Claude Code slash commands" + log_action "Setting up Claude Code slash commands" "$AGENTS_DIR/scripts/set-up-claude-slash-commands.sh" echo "" fi # Codex role launchers if agent_enabled "Codex"; then - print_step "Setting up Codex role launchers" + log_action "Setting up Codex role launchers" "$AGENTS_DIR/scripts/set-up-codex-role-launchers.sh" echo "" fi # Gemini CLI role launchers if agent_enabled "Gemini CLI"; then - print_step "Setting up Gemini CLI role launchers" + log_action "Setting up Gemini CLI role launchers" "$AGENTS_DIR/scripts/set-up-gemini-role-launchers.sh" echo "" fi -# OpenCode agents (symlink for auto-discovery) +# OpenCode agents (filtered symlinks for auto-discovery) if agent_enabled "OpenCode"; then - print_step "Setting up Bureau agents for OpenCode" + log_action "Setting up Bureau agents for OpenCode" + OPENCODE_AGENTS_DIR="$HOME/.config/opencode/agent/$AGENTS_SUBDIR" + + # Get filtered role list + AGENTS_ENABLED_FOR_OPENCODE=$(uv run python -m operations.roles_catalog 
opencode) - # Symlink role prompts - OpenCode auto-discovers agents from this directory - OPEN_AGENT_DIR="$HOME/.config/opencode/agent/$AGENTS_SUBDIR" - mkdir -p "$HOME/.config/opencode/agent" + # Remove old directory (may be symlink or real directory) so config always wins + if [[ -e "$OPENCODE_AGENTS_DIR" ]]; then + rm -rf "$OPENCODE_AGENTS_DIR" + log_success "Removed old agent directory at $OPENCODE_AGENTS_DIR" + fi - if [[ -L "$OPEN_AGENT_DIR" ]]; then - rm "$OPEN_AGENT_DIR" - print_success "Removed old symlink at $OPEN_AGENT_DIR" + if [[ -z "$AGENTS_ENABLED_FOR_OPENCODE" ]]; then + log_warning "No agents enabled for OpenCode. Skipping setup and clearing any previously generated OpenCode Bureau agents." + log_info "To enable agents, update the roles.enabled list in your local.yml" + else + # Create directory and populate with symlinks corresponding to the agents + # enabled for OpenCode + mkdir -p "$OPENCODE_AGENTS_DIR" + + count=0 + for agent_name in $AGENTS_ENABLED_FOR_OPENCODE; do + source_file="$AGENTS_DIR/$CLINK_AGENTS_DIRNAME/${agent_name}.md" + if [[ -f "$source_file" ]]; then + ln -s "$source_file" "$OPENCODE_AGENTS_DIR/${agent_name}.md" + count=$((count + 1)) + else + log_warning "Agent file not found: $source_file (skipping)" + fi + done + + log_success "Created $count filtered agent symlinks in $OPENCODE_AGENTS_DIR" fi - ln -s "$AGENTS_DIR/$CLINK_AGENTS_DIRNAME" "$OPEN_AGENT_DIR" - print_success "Symlinked Bureau role prompts to $OPEN_AGENT_DIR (OpenCode auto-discovers these)" echo "" fi -# ============================================================================ -# Done! -# ============================================================================ -echo -e "${GREEN}✓ Agent setup complete!${NC}" +log_success "Agent setup complete!" echo "" echo "Next steps:" -echo " 1. Run the configs setup script: protocols/scripts/set-up-configs.sh" +echo " 1. Run the configs setup script: protocols/scripts/set-up-protocols.sh" echo " 2. 
Restart PAL MCP server to reload clink configs" echo " 3. Verify Claude Code agents with: claude (then run /agents)" echo " 4. Install claude-mem plugin:" diff --git a/agents/scripts/set-up-claude-slash-commands.sh b/agents/scripts/set-up-claude-slash-commands.sh index f5338b82..0c5d0ab9 100755 --- a/agents/scripts/set-up-claude-slash-commands.sh +++ b/agents/scripts/set-up-claude-slash-commands.sh @@ -1,95 +1,55 @@ #!/usr/bin/env bash + +# Exits on any: +# - error (-e) +# - undefined variable (-u) +# - failed pipe (-o pipefail) set -euo pipefail # Setup script for Claude Code slash commands that inject agent role prompts -# This allows launching agents in the current conversation via /architect, /frontend, etc. - -# Color codes for output -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -NC='\033[0m' # No Color +# This allows launching agents in the current conversation via /architect-bureau, /frontend-bureau, etc. -# Find the repo root +# Locate repo root SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" AGENTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)" REPO_ROOT="$(cd "$AGENTS_DIR/.." && pwd)" CLAUDE_SUBAGENTS_DIR="$AGENTS_DIR/claude-subagents" -# Source agent selection library -source "$REPO_ROOT/bin/lib/agent-selection.sh" +# Source internal Bureau libraries +source "$REPO_ROOT/bin/lib/agent-selection.sh" # reads configs to determine enabled agents +source "$REPO_ROOT/bin/lib/logging.sh" +source "$REPO_ROOT/bin/lib/roles-setup.sh" # handles cross-CLI role prompt setup # Detect installed CLIs (exits if none found, logs detected CLIs) discover_agents - -# Skip entirely if Claude not enabled -if ! agent_enabled "Claude Code"; then - echo -e "${YELLOW}Claude Code not enabled. 
Skipping slash commands setup.${NC}" - echo "To enable Claude Code:" - echo " mkdir -p ~/.claude" - echo " Then re-run this script or agents/scripts/set-up-agents.sh" - exit 0 -fi - -# Target directory for slash commands -COMMANDS_DIR="$HOME/.claude/commands" - -echo -e "${GREEN}Agent slash command setup for Claude Code${NC}" -echo -e "Source: $CLAUDE_SUBAGENTS_DIR" -echo -e "Target: $COMMANDS_DIR" -echo "" - -# Function to print step headers -print_step() { - echo -e "${YELLOW}==>${NC} $1" -} - -# Function to print success -print_success() { - echo -e "${GREEN}✓${NC} $1" -} - -# Function to print info -print_info() { - echo -e "${BLUE}ℹ${NC} $1" -} - -# Function to print error and exit -print_error() { - echo -e "${RED}✗${NC} $1" >&2 - exit 1 -} +echo "Source: $CLAUDE_SUBAGENTS_DIR" +echo "Target: $HOME/.claude/commands" +log_empty_line # Check if source directory exists if [[ ! -d "$CLAUDE_SUBAGENTS_DIR" ]]; then - print_error "Cannot find claude-subagents directory at: $CLAUDE_SUBAGENTS_DIR" + log_error "Cannot find claude-subagents/ at the expected path: $CLAUDE_SUBAGENTS_DIR" + exit 1 fi -# Create commands directory if it doesn't exist -mkdir -p "$COMMANDS_DIR" -print_success "Ensured $COMMANDS_DIR exists" - -# Counter for generated commands -count=0 - -# Process each agent file -print_step "Generating slash commands from agent files" -echo "" - -for agent_file in "$CLAUDE_SUBAGENTS_DIR"/*.md; do - # Get the base name without extension (e.g., "architect" from "architect.md") - agent_name=$(basename "$agent_file" .md) +# Claude-specific processing function +process_claude_command() { + local role_name="$1" + local target_dir="$2" + local role_file="$CLAUDE_SUBAGENTS_DIR/${role_name}.md" - # Target command file - command_file="$COMMANDS_DIR/${agent_name}.md" + # Skip if file doesn't exist (configuration error) + if [[ ! 
-f "$role_file" ]]; then + log_warning "Role file not found: $role_file (skipping)" + return 1 + fi - # Extract the content after the frontmatter (everything after the second ---) - # The frontmatter is between the first --- and second --- - # We want everything after the second --- + # Target command file (suffixed to avoid collisions with user-created commands) + local command_file="$target_dir/${role_name}-bureau.md" - # Using awk to skip frontmatter and get the actual content - agent_content=$(awk ' + # Extract content after frontmatter (everything after the second ---) + local role_content + role_content=$(awk ' BEGIN { in_frontmatter=0; past_frontmatter=0 } /^---$/ { if (!past_frontmatter) { @@ -99,7 +59,7 @@ for agent_file in "$CLAUDE_SUBAGENTS_DIR"/*.md; do } } past_frontmatter { print } - ' "$agent_file") + ' "$role_file") # Create the slash command file with a preamble cat > "$command_file" << EOF @@ -119,18 +79,18 @@ print_success "Generated $count slash commands" # Print usage instructions echo "" -echo -e "${GREEN}Setup complete!${NC}" +log_success "Setup complete!" echo "" echo "Usage:" echo -e " 1. Launch Claude Code: ${BLUE}claude${NC}" echo " 2. Use any agent role via slash command:" echo "" -echo -e " ${BLUE}/architect${NC} - Principal software architect" -echo -e " ${BLUE}/frontend${NC} - Frontend architecture & UX" -echo -e " ${BLUE}/observability${NC} - Monitoring & incident response" -echo -e " ${BLUE}/security-compliance${NC} - Security & privacy architect" -echo -e " ${BLUE}/testing${NC} - Test quality & reliability" -echo " ... and $((count - 5)) more" +echo -e " ${BLUE}/architect${NC} - Principal software architect" +echo -e " ${BLUE}/frontend${NC} - Frontend architecture & UX" +echo -e " ${BLUE}/observability${NC} - Monitoring & incident response" +echo -e " ${BLUE}/security-compliance${NC} - Security & privacy architect" +echo -e " ${BLUE}/testing${NC} - Test quality & reliability" +echo " ... and more" echo "" echo -e " 3. 
List all available commands: ${BLUE}/help${NC}" echo "" diff --git a/agents/scripts/set-up-codex-role-launchers.sh b/agents/scripts/set-up-codex-role-launchers.sh index 5924a01a..36ad9120 100755 --- a/agents/scripts/set-up-codex-role-launchers.sh +++ b/agents/scripts/set-up-codex-role-launchers.sh @@ -1,15 +1,12 @@ #!/usr/bin/env bash set -euo pipefail -# Setup script for Codex role launcher wrappers -# Creates executable scripts in ~/.local/bin/ for launching Codex with specific agent roles - -# Color codes for output -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -NC='\033[0m' # No Color +# Setup script for Codex role launcher wrappers: creates executable scripts +# in ~/.local/bin/ for launching Codex with specific agent roles +# +# Note: to see the rationale for *embedding* role prompts in the launcher scripts +# (e.g. rather than providing a Bureau-internal path to the role prompt file): +# see agents/scripts/README.md # Find the repo root SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" @@ -17,88 +14,52 @@ AGENTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)" REPO_ROOT="$(cd "$AGENTS_DIR/.." && pwd)" CLINK_ROLES_DIR="$AGENTS_DIR/role-prompts" -# Source agent selection library +# Source internal Bureau libraries source "$REPO_ROOT/bin/lib/agent-selection.sh" +source "$REPO_ROOT/bin/lib/logging.sh" +source "$REPO_ROOT/bin/lib/roles-setup.sh" # Detect installed CLIs (exits if none found, logs detected CLIs) discover_agents -# Skip entirely if Codex not enabled -if ! agent_enabled "Codex"; then - echo -e "${YELLOW}Codex not enabled. 
Skipping role launchers setup.${NC}" - echo "To enable Codex:" - echo " mkdir -p ~/.codex" - echo " Then re-run this script or agents/scripts/set-up-agents.sh" - exit 0 -fi - -# Target directory for launcher scripts -LAUNCHERS_DIR="$HOME/.local/bin" - -echo -e "${GREEN}Role launcher setup for Codex${NC}" +log_success "Role launcher setup for Codex" echo -e "Source: $CLINK_ROLES_DIR" -echo -e "Target: $LAUNCHERS_DIR" +echo -e "Target: $HOME/.local/bin" echo "" -# Function to print step headers -print_step() { - echo -e "${YELLOW}==>${NC} $1" -} - -# Function to print success -print_success() { - echo -e "${GREEN}✓${NC} $1" -} - -# Function to print info -print_info() { - echo -e "${BLUE}ℹ${NC} $1" -} - -# Function to print error and exit -print_error() { - echo -e "${RED}✗${NC} $1" >&2 - exit 1 -} - # Check if source directory exists if [[ ! -d "$CLINK_ROLES_DIR" ]]; then - print_error "Cannot find role-prompts directory at: $CLINK_ROLES_DIR" + log_error "Cannot find role-prompts directory at: $CLINK_ROLES_DIR" + exit 1 fi -# Create launchers directory if it doesn't exist -mkdir -p "$LAUNCHERS_DIR" -print_success "Ensured $LAUNCHERS_DIR exists" - # Check if ~/.local/bin is in PATH if [[ ":$PATH:" != *":$HOME/.local/bin:"* ]]; then - print_info "Note: $HOME/.local/bin is not in your PATH" - print_info "Add this to your ~/.zshrc or ~/.bashrc:" + log_info "Note: $HOME/.local/bin is not in your PATH" + log_info "Add this to your ~/.zshrc or ~/.bashrc:" echo "" echo " export PATH=\"\$HOME/.local/bin:\$PATH\"" echo "" fi -# Counter for generated launchers -count=0 - -# Process each role file -print_step "Generating role launchers from clink role prompts" -echo "" - -for role_file in "$CLINK_ROLES_DIR"/*.md; do - # Get the base name without extension (e.g., "architect" from "architect.md") - role_name=$(basename "$role_file" .md) +# Codex-specific processing function +process_codex_launcher() { + local role_name="$1" + local target_dir="$2" + local 
role_file="$CLINK_ROLES_DIR/${role_name}.md" - # Create a launcher script name with "codex-" prefix - launcher_name="codex-${role_name}" - launcher_file="$LAUNCHERS_DIR/$launcher_name" + # Skip if file doesn't exist + if [[ ! -f "$role_file" ]]; then + log_warning "Role file not found: $role_file (skipping)" + return 1 + fi - # Read the role prompt content - role_content=$(cat "$role_file") + # Create launcher script with "codex-" prefix + local launcher_name="codex-${role_name}" + local launcher_file="$target_dir/$launcher_name" # Create the launcher script that: - # 1. Creates a temporary AGENTS.md with the role prompt in the current directory + # 1. Creates a temporary AGENTS.md with the role prompt # 2. Launches codex (which auto-loads ./AGENTS.md) # 3. Cleans up on exit cat > "$launcher_file" << 'EOF_OUTER' @@ -139,16 +100,22 @@ EOF_OUTER # Make it executable chmod +x "$launcher_file" - print_info "Created $launcher_name" - count=$((count + 1)) -done + log_info "Created $launcher_name" + return 0 +} + +# Cleanup: remove all existing Codex launchers before re-creating enabled ones +cleanup_codex_launchers() { + remove_roles_by_pattern "$1" "codex-*" +} + +# Run setup using common workflow +setup_roles_for_cli "Codex" "codex" "$HOME/.local/bin" process_codex_launcher cleanup_codex_launchers -echo "" -print_success "Generated $count role launchers" # Print usage instructions echo "" -echo -e "${GREEN}Setup complete!${NC}" +log_success "Setup complete!" echo "" echo "Usage examples:" echo "" @@ -170,6 +137,6 @@ echo "" # Verify PATH setup if [[ ":$PATH:" != *":$HOME/.local/bin:"* ]]; then - echo -e "${YELLOW}⚠${NC} Remember to add ~/.local/bin to your PATH!" + log_warning "Remember to add ~/.local/bin to your PATH!" 
echo "" fi diff --git a/agents/scripts/set-up-codex-superpowers.sh b/agents/scripts/set-up-codex-superpowers.sh index b9aa4488..518221b7 100755 --- a/agents/scripts/set-up-codex-superpowers.sh +++ b/agents/scripts/set-up-codex-superpowers.sh @@ -4,17 +4,12 @@ set -euo pipefail -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -RED='\033[0;31m' -NC='\033[0m' - # Find repo root SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -AGENTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)" -REPO_ROOT="$(cd "$AGENTS_DIR/.." && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)" -# Source agent selection library +# Source logging + agent selection libraries +source "$REPO_ROOT/bin/lib/logging.sh" source "$REPO_ROOT/bin/lib/agent-selection.sh" # Load agent selection from profile @@ -22,7 +17,7 @@ discover_agents # Skip entirely if Codex not selected if ! agent_enabled "Codex"; then - echo -e "${YELLOW}Codex not in CLI profile. Skipping Superpowers setup.${NC}" + log_warning "Codex not in CLI profile. Skipping Superpowers setup." echo "To enable Codex, run:" echo " tools/scripts/set-up-tools.sh -a x # Codex only" echo " tools/scripts/set-up-tools.sh -a cx # Codex + Claude" @@ -31,59 +26,50 @@ if ! 
agent_enabled "Codex"; then fi REPO_URL="https://github.com/obra/superpowers.git" -TARGET_DIR="${HOME}/.codex/superpowers" -SKILLS_DIR="${HOME}/.codex/skills" -BOOTSTRAP_CMD="${TARGET_DIR}/.codex/superpowers-codex" - -print_step() { - echo -e "${YELLOW}==>${NC} $1" -} - -print_success() { - echo -e "${GREEN}✓${NC} $1" -} - -print_warning() { - echo -e "${YELLOW}⚠${NC} $1" -} +SUPERPOWERS_DIR="${HOME}/.codex/superpowers" +CODEX_SKILLS_DIR="${HOME}/.agents/skills" +SUPERPOWERS_SKILLS_SOURCE="${SUPERPOWERS_DIR}/skills" +SUPERPOWERS_SKILLS_LINK="${CODEX_SKILLS_DIR}/superpowers" -print_error() { - echo -e "${RED}✗${NC} $1" >&2 - exit 1 -} - -print_step "Ensuring Codex directories exist" +log_action "==>" "Ensuring Codex directories exist" mkdir -p "${HOME}/.codex" -mkdir -p "${SKILLS_DIR}" -print_success "Codex directories ready" +mkdir -p "${CODEX_SKILLS_DIR}" +log_success "Codex directories ready" -if [ -d "${TARGET_DIR}/.git" ]; then - print_step "Updating existing Superpowers checkout" - git -C "${TARGET_DIR}" remote set-url origin "${REPO_URL}" - if ! git -C "${TARGET_DIR}" fetch --tags --prune; then - print_warning "Unable to fetch updates for Superpowers. Continuing with existing checkout." +if [ -d "${SUPERPOWERS_DIR}/.git" ]; then + log_action "==>" "Updating existing Superpowers checkout" + git -C "${SUPERPOWERS_DIR}" remote set-url origin "${REPO_URL}" + if ! git -C "${SUPERPOWERS_DIR}" fetch --tags --prune; then + log_warning "Unable to fetch updates for Superpowers. Continuing with existing checkout." else - if ! git -C "${TARGET_DIR}" merge --ff-only origin/main >/dev/null 2>&1; then - print_warning "Could not fast-forward Superpowers repository (local changes?). Leaving as-is." + if ! git -C "${SUPERPOWERS_DIR}" merge --ff-only origin/main >/dev/null 2>&1; then + log_warning "Could not fast-forward Superpowers repository (local changes?). Leaving as-is." 
else - print_success "Superpowers repository updated" + log_success "Superpowers repository updated" fi fi else - print_step "Cloning Superpowers repository" - git clone "${REPO_URL}" "${TARGET_DIR}" - print_success "Superpowers repository cloned" + log_action "==>" "Cloning Superpowers repository" + git clone "${REPO_URL}" "${SUPERPOWERS_DIR}" + log_success "Superpowers repository cloned" +fi + +if [ ! -d "${SUPERPOWERS_SKILLS_SOURCE}" ]; then + log_error "Superpowers skills directory not found at ${SUPERPOWERS_SKILLS_SOURCE}" + exit 1 +fi + +if [ -L "${SUPERPOWERS_SKILLS_LINK}" ]; then + rm -f "${SUPERPOWERS_SKILLS_LINK}" fi -if [ ! -x "${BOOTSTRAP_CMD}" ]; then - print_warning "Bootstrap command ${BOOTSTRAP_CMD} not found or not executable" +if [ -e "${SUPERPOWERS_SKILLS_LINK}" ]; then + log_error "Path exists and is not a symlink: ${SUPERPOWERS_SKILLS_LINK}" + log_error "Move or remove it, then rerun this setup script." + exit 1 else - print_step "Running Superpowers bootstrap (verification)" - if "${BOOTSTRAP_CMD}" bootstrap >/dev/null 2>&1; then - print_success "Superpowers bootstrap completed" - else - print_warning "Bootstrap command exited with a non-zero status. Review output above if any." - fi + ln -s "${SUPERPOWERS_SKILLS_SOURCE}" "${SUPERPOWERS_SKILLS_LINK}" + log_success "Linked ${SUPERPOWERS_SKILLS_LINK} -> ${SUPERPOWERS_SKILLS_SOURCE}" fi -print_success "Superpowers setup for Codex complete!" +log_success "Superpowers setup for Codex complete!" 
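Note on the symlink handling in `set-up-codex-superpowers.sh` above: the new logic replaces a stale symlink but refuses to overwrite a real file or directory at the link path. A minimal sketch of that pattern as a reusable function follows. The name `refresh_symlink` and its argument order are assumptions made for illustration; the helper does not exist in Bureau's `bin/lib/`, and it prints plain messages to stderr instead of calling `log_error`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# refresh_symlink SOURCE LINK  (hypothetical helper, not part of Bureau)
# Points LINK at SOURCE, replacing a stale symlink but never clobbering
# a real file or directory that already lives at LINK.
refresh_symlink() {
  local source_path="$1"
  local link_path="$2"

  # The source must exist, otherwise the new link would dangle.
  if [[ ! -d "$source_path" ]]; then
    echo "source directory not found: $source_path" >&2
    return 1
  fi

  # An existing symlink (even a dangling one) is safe to replace.
  if [[ -L "$link_path" ]]; then
    rm -f "$link_path"
  fi

  # Anything still present at the path is real data: refuse to touch it.
  if [[ -e "$link_path" ]]; then
    echo "path exists and is not a symlink: $link_path" >&2
    return 1
  fi

  # Create the parent directory if needed, then the link itself.
  mkdir -p "$(dirname "$link_path")"
  ln -s "$source_path" "$link_path"
}
```

Under these assumptions, a call such as `refresh_symlink "$HOME/.codex/superpowers/skills" "$HOME/.agents/skills/superpowers"` would mirror what the script does inline. Checking `-L` before `-e` matters because `-e` follows symlinks and reports a dangling link as absent.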
diff --git a/agents/scripts/set-up-gemini-role-launchers.sh b/agents/scripts/set-up-gemini-role-launchers.sh index eab32c3c..2090f65a 100755 --- a/agents/scripts/set-up-gemini-role-launchers.sh +++ b/agents/scripts/set-up-gemini-role-launchers.sh @@ -1,15 +1,12 @@ #!/usr/bin/env bash set -euo pipefail -# Setup script for Gemini CLI role launcher wrappers -# Creates executable scripts in ~/.local/bin/ for launching Gemini with specific agent roles - -# Color codes for output -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -NC='\033[0m' # No Color +# Setup script for Gemini CLI role launcher wrappers: creates executable scripts +# in ~/.local/bin/ for launching Gemini with specific agent roles +# +# Note: to see the rationale for *embedding* role prompts in the launcher scripts +# (e.g. rather than providing a Bureau-internal path to the role prompt file): +# see agents/scripts/README.md # Find the repo root SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" @@ -17,89 +14,53 @@ AGENTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)" REPO_ROOT="$(cd "$AGENTS_DIR/.." && pwd)" CLINK_ROLES_DIR="$AGENTS_DIR/role-prompts" -# Source agent selection library +# Source internal Bureau libraries source "$REPO_ROOT/bin/lib/agent-selection.sh" +source "$REPO_ROOT/bin/lib/logging.sh" +source "$REPO_ROOT/bin/lib/roles-setup.sh" # Detect installed CLIs (exits if none found, logs detected CLIs) discover_agents -# Skip entirely if Gemini not enabled -if ! agent_enabled "Gemini CLI"; then - echo -e "${YELLOW}Gemini CLI not enabled. 
Skipping role launchers setup.${NC}" - echo "To enable Gemini CLI:" - echo " mkdir -p ~/.gemini" - echo " Then re-run this script or agents/scripts/set-up-agents.sh" - exit 0 -fi - -# Target directory for launcher scripts -LAUNCHERS_DIR="$HOME/.local/bin" - -echo -e "${GREEN}Role launcher setup for Gemini${NC}" +log_success "Role launcher setup for Gemini" echo -e "Source: $CLINK_ROLES_DIR" -echo -e "Target: $LAUNCHERS_DIR" +echo -e "Target: $HOME/.local/bin" echo "" -# Function to print step headers -print_step() { - echo -e "${YELLOW}==>${NC} $1" -} - -# Function to print success -print_success() { - echo -e "${GREEN}✓${NC} $1" -} - -# Function to print info -print_info() { - echo -e "${BLUE}ℹ${NC} $1" -} - -# Function to print error and exit -print_error() { - echo -e "${RED}✗${NC} $1" >&2 - exit 1 -} - # Check if source directory exists if [[ ! -d "$CLINK_ROLES_DIR" ]]; then - print_error "Cannot find role-prompts directory at: $CLINK_ROLES_DIR" + log_error "Cannot find role-prompts directory at: $CLINK_ROLES_DIR" + exit 1 fi -# Create launchers directory if it doesn't exist -mkdir -p "$LAUNCHERS_DIR" -print_success "Ensured $LAUNCHERS_DIR exists" - # Check if ~/.local/bin is in PATH if [[ ":$PATH:" != *":$HOME/.local/bin:"* ]]; then - print_info "Note: $HOME/.local/bin is not in your PATH" - print_info "Add this to your ~/.zshrc or ~/.bashrc:" + log_info "Note: $HOME/.local/bin is not in your PATH" + log_info "Add this to your ~/.zshrc or ~/.bashrc:" echo "" echo " export PATH=\"\$HOME/.local/bin:\$PATH\"" echo "" fi -# Counter for generated launchers -count=0 - -# Process each role file -print_step "Generating role launchers from clink role prompts" -echo "" - -for role_file in "$CLINK_ROLES_DIR"/*.md; do - # Get the base name without extension (e.g., "architect" from "architect.md") - role_name=$(basename "$role_file" .md) +# Gemini-specific processing function +process_gemini_launcher() { + local role_name="$1" + local target_dir="$2" + local 
role_file="$CLINK_ROLES_DIR/${role_name}.md" - # Create a launcher script name with "gemini-" prefix - launcher_name="gemini-${role_name}" - launcher_file="$LAUNCHERS_DIR/$launcher_name" + # Skip if file doesn't exist + if [[ ! -f "$role_file" ]]; then + log_warning "Role file not found: $role_file (skipping)" + return 1 + fi - # Read the role prompt content - role_content=$(cat "$role_file") + # Create launcher script with "gemini-" prefix + local launcher_file="gemini-${role_name}" + launcher_file="$target_dir/$launcher_file" # Create the launcher script that: - # 1. Creates a temporary GEMINI.md with the role prompt - # 2. Launches gemini with that config + # 1. Creates a temporary file with the role prompt + # 2. Launches gemini with --prompt-interactive # 3. Cleans up on exit cat > "$launcher_file" << 'EOF_OUTER' #!/usr/bin/env bash @@ -113,7 +74,7 @@ trap "rm -f $ROLE_FILE" EXIT cat > "$ROLE_FILE" << 'EOF_INNER' EOF_OUTER - # Append the actual role content + # Append the actual role content between the EOF_INNER delineators in the launcher cat "$role_file" >> "$launcher_file" # Close the heredoc and add the launch command @@ -121,23 +82,28 @@ EOF_OUTER EOF_INNER # Launch Gemini with the role as a system prompt via --prompt-interactive -# The role content is injected as the first message gemini --prompt-interactive "$(cat "$ROLE_FILE")" "$@" EOF_OUTER # Make it executable chmod +x "$launcher_file" - print_info "Created $launcher_name" - count=$((count + 1)) -done + log_info "Created $launcher_file" + return 0 +} + +# Cleanup: remove all existing Gemini launchers before re-creating enabled ones +cleanup_gemini_launchers() { + remove_roles_by_pattern "$1" "gemini-*" +} + +# Run setup using common workflow +setup_roles_for_cli "Gemini CLI" "gemini" "$HOME/.local/bin" process_gemini_launcher cleanup_gemini_launchers -echo "" -print_success "Generated $count role launchers" # Print usage instructions echo "" -echo -e "${GREEN}Setup complete!${NC}" +log_success 
"Setup complete!" echo "" echo "Usage examples:" echo "" @@ -155,6 +121,6 @@ echo "" # Verify PATH setup if [[ ":$PATH:" != *":$HOME/.local/bin:"* ]]; then - echo -e "${YELLOW}⚠${NC} Remember to add ~/.local/bin to your PATH!" + log_warning "Remember to add ~/.local/bin to your PATH!" echo "" fi diff --git a/bin/check-prereqs b/bin/ensure-prereqs similarity index 58% rename from bin/check-prereqs rename to bin/ensure-prereqs index 269cd0b4..5b526dae 100755 --- a/bin/check-prereqs +++ b/bin/ensure-prereqs @@ -6,8 +6,8 @@ # - Syncs project deps from pyproject.toml # # Usage: -# ./bin/check-prereqs # Check prerequisites and set up Python -# ./bin/check-prereqs --quiet # Exit 0 if all present, 1 if any missing (no output) +# ./bin/ensure-prereqs # Check prerequisites and set up Python +# ./bin/ensure-prereqs --quiet # Exit 0 if all present, 1 if any missing (no output) # # Exit codes: # 0 - All prerequisites installed @@ -15,11 +15,8 @@ set -e -# Colors -GREEN='\033[0;32m' -RED='\033[0;31m' -YELLOW='\033[1;33m' -NC='\033[0m' +REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +source "$REPO_ROOT/bin/lib/logging.sh" QUIET=false MISSING=0 @@ -48,36 +45,37 @@ check_cmd() { if command -v "$cmd" &>/dev/null; then if [[ "$QUIET" == false ]]; then - echo -e "${GREEN}✓${NC} $cmd" + log_success "$cmd" fi return 0 else if [[ "$QUIET" == false ]]; then - echo -e "${RED}✗${NC} $cmd" - echo -e " └─ ${YELLOW}$install_msg${NC}" + log_error "$cmd" + log_warning " └─ $install_msg" fi MISSING=1 return 1 fi } -[[ "$QUIET" == false ]] && echo -e "Checking Bureau prerequisites...\n" +if [[ "$QUIET" == false ]]; then + log_action "Checking" "Bureau prerequisites..." 
+ log_empty_line +fi # check required prerequisites check_cmd "npx" "Install Node.js: https://nodejs.org or use nvm" check_cmd "uvx" "Install uv: curl -LsSf https://astral.sh/uv/install.sh | sh" check_cmd "docker" "Install Docker: https://docs.docker.com/get-docker/" -[[ "$QUIET" == false ]] && echo "" +[[ "$QUIET" == false ]] && log_empty_line if [[ $MISSING -ne 0 ]]; then - [[ "$QUIET" == false ]] && echo -e "${RED}Install missing prerequisites and try again.${NC}" + [[ "$QUIET" == false ]] && log_error "Install missing prerequisites and try again." exit 1 fi -[[ "$QUIET" == false ]] && echo -e "${GREEN}All prerequisites are present.${NC}" - -REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +[[ "$QUIET" == false ]] && log_success "All prerequisites are present." # Reuse Python version in .python-version PYTHON_VERSION=$(cat "$REPO_ROOT/.python-version" | tr -d '[:space:]') @@ -85,21 +83,29 @@ PYTHON_VERSION=$(cat "$REPO_ROOT/.python-version" | tr -d '[:space:]') # Verify .python-version and pyproject.toml's `requires-python` match PYPROJECT_PYTHON=$(grep 'requires-python' "$REPO_ROOT/pyproject.toml" | grep -oE '[0-9]+\.[0-9]+') if [[ "$PYTHON_VERSION" != "$PYPROJECT_PYTHON" ]]; then - echo -e "${YELLOW}⚠${NC} Version mismatch: .python-version ($PYTHON_VERSION) vs pyproject.toml ($PYPROJECT_PYTHON)" - echo -e " └─ ${YELLOW}Update both files to match${NC}" + log_warning "Version mismatch: .python-version ($PYTHON_VERSION) vs pyproject.toml ($PYPROJECT_PYTHON)" + log_warning " └─ Update both files to match" fi # Ensure correct Python version and sync dependencies -[[ "$QUIET" == false ]] && echo -e "\nSetting up Python environment...\n" +if [[ "$QUIET" == false ]]; then + log_empty_line + log_action "Setting up" "Python environment..." 
+ log_empty_line +fi uv python install "$PYTHON_VERSION" --quiet 2>/dev/null || true -[[ "$QUIET" == false ]] && echo -e "${GREEN}✓${NC} Python $PYTHON_VERSION ready" +[[ "$QUIET" == false ]] && log_success "Python $PYTHON_VERSION ready" if ! (cd "$REPO_ROOT" && uv sync); then - echo -e "${RED}✗${NC} Failed to sync Python dependencies" + log_error "Failed to sync Python dependencies" exit 1 fi -[[ "$QUIET" == false ]] && echo -e "${GREEN}✓${NC} Python dependencies installed" +[[ "$QUIET" == false ]] && log_success "Python dependencies installed" -[[ "$QUIET" == false ]] && echo -e "\n${GREEN}Python environment is ready.${NC}\n" -exit 0 \ No newline at end of file +if [[ "$QUIET" == false ]]; then + log_empty_line + log_success "Python environment is ready." + log_empty_line +fi +exit 0 diff --git a/bin/lib/agent-selection.sh b/bin/lib/agent-selection.sh index 705595f1..256490c3 100644 --- a/bin/lib/agent-selection.sh +++ b/bin/lib/agent-selection.sh @@ -51,7 +51,7 @@ _get_repo_root() { } # Internal: call get-config Python module -# (caller must ensure env is setup, i.e. by running check-prereqs) +# (caller must ensure env is setup, i.e. by running ensure-prereqs) # Usage: _get_config _get_config() { local repo_root diff --git a/bin/lib/logging.sh b/bin/lib/logging.sh new file mode 100644 index 00000000..9f605291 --- /dev/null +++ b/bin/lib/logging.sh @@ -0,0 +1,224 @@ +#!/usr/bin/env bash +# +# Logging library: consistent, colorized logger functions for all Bureau scripts. +# +# Usage: +# source "$REPO_ROOT/bin/lib/logging.sh" +# +# Available functions: +# log_info "message" - Blue [INFO] prefix +# log_success "message" - Green ✓ prefix +# log_warning "message" - Yellow ⚠ prefix +# log_error "message" - Red ✗ prefix +# log_debug "message" - Gray [DEBUG] prefix (only if LOG_DEBUG=true) +# log_action "verb" "obj" - Blue verb, white object +# log_header "title" "path" ["note1" "note2" ...] 
- Section headers +# log_empty_line - Blank line +# log_divider - Horizontal rule +# log_separator - Empty + divider + empty +# log_banner "title" - Banner with dividers +# +# Environment variables: +# LOG_COLORS=false - Disable all colors (for CI/logs) +# LOG_DEBUG=true - Enable debug messages +# LOG_QUIET=true - Suppress info messages (warnings/errors only) + +# Color codes (ANSI escape sequences) +if [[ "${LOG_COLORS:-true}" == "true" && -t 1 ]]; then + # Terminal supports colors and user hasn't disabled them + export LOG_GREEN='\033[0;32m' + export LOG_BLUE='\033[0;34m' + export LOG_YELLOW='\033[1;33m' + export LOG_RED='\033[0;31m' + export LOG_GRAY='\033[0;90m' + export LOG_BOLD='\033[1m' + export LOG_NC='\033[0m' # ANSI "no colour" = reset +else + # No color support or disabled + export LOG_GREEN='' + export LOG_BLUE='' + export LOG_YELLOW='' + export LOG_RED='' + export LOG_GRAY='' + export LOG_BOLD='' + export LOG_NC='' +fi + +# For backward compatibility, also export unprefixed versions +export GREEN="$LOG_GREEN" +export BLUE="$LOG_BLUE" +export YELLOW="$LOG_YELLOW" +export RED="$LOG_RED" +export NC="$LOG_NC" + +# Core logging functions +log_info() { + if [[ "${LOG_QUIET:-false}" == "true" ]]; then + return 0 + fi + echo -e "${LOG_BLUE}[INFO]${LOG_NC} $*" +} + +log_success() { + echo -e "${LOG_GREEN}✓${LOG_NC} $*" +} + +log_warning() { + echo -e "${LOG_YELLOW}⚠${LOG_NC} $*" >&2 +} + +log_error() { + echo -e "${LOG_RED}✗${LOG_NC} $*" >&2 +} + +log_debug() { + if [[ "${LOG_DEBUG:-false}" == "true" ]]; then + echo -e "${LOG_GRAY}[DEBUG]${LOG_NC} $*" + fi +} + +# Action logging (verb + object pattern) +log_action() { + local action=$1 + local detail=${2:-} # Optional second argument + echo -e "${LOG_BLUE}${action}${LOG_NC} ${detail}" +} + +# Section headers (with optional notes in yellow) +log_header() { + local label=$1 + local path=$2 + shift 2 + + echo -e "${LOG_BLUE}${label}${LOG_NC} (${path})" + for line in "$@"; do + echo -e " 
${LOG_YELLOW}${line}${LOG_NC}" + done +} + +# Formatting helpers +log_empty_line() { + echo "" +} + +log_divider() { + echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" +} + +log_separator() { + log_empty_line + log_divider + log_empty_line +} + +# Print a styled banner (for script headers) +log_banner() { + local title=$1 + local padding=" " + local divider_char="━" + local line_len=$(( ${#title} + 16 )) + + local divider + divider=$(printf '%*s' "$line_len" "" | tr ' ' "$divider_char") + + echo "$divider" + echo "${padding}${title}${padding}" + echo "$divider" +} + +# Progress indicator (for long-running operations) +log_progress() { + local current=$1 + local total=$2 + local message=${3:-""} + + local percent=$((current * 100 / total)) + local bar_width=30 + local filled=$((bar_width * current / total)) + local empty=$((bar_width - filled)) + + # Build progress bar + local bar="" + for ((i=0; i=0.8.0 + - mcp-server-qdrant + - --transport + - streamable-http + env: + QDRANT_URL: "http://127.0.0.1:${mcp.runtime_services.qdrant_db.host_port}" + COLLECTION_NAME: "${mcp.runtime_services.qdrant_mcp.settings.collection}" + EMBEDDING_PROVIDER: "${mcp.runtime_services.qdrant_mcp.settings.embedding_provider}" + FASTMCP_SERVER_PORT: "${mcp.runtime_services.qdrant_mcp.port}" + FASTMCP_SERVER_STATELESS_HTTP: "true" + healthcheck: + tcp: ${mcp.runtime_services.qdrant_mcp.port} + + sourcegraph_mcp: + enabled: true + kind: http_process + port: 8783 + depends_on: + dependencies: [sourcegraph_repo] + command: + - uv + - --directory + - ${mcp.dependencies.sourcegraph_repo.path} + - run + - sourcegraph-mcp + env: + SRC_ENDPOINT: https://sourcegraph.com + MCP_STREAMABLE_HTTP_PORT: "${mcp.runtime_services.sourcegraph_mcp.port}" + healthcheck: + tcp: ${mcp.runtime_services.sourcegraph_mcp.port} + + semgrep_mcp: + enabled: false + kind: http_process + port: 8784 + command: + - semgrep + - mcp + - -t + - streamable-http + - --port + - ${mcp.runtime_services.semgrep_mcp.port} + 
healthcheck: + tcp: ${mcp.runtime_services.semgrep_mcp.port} + + serena_mcp: + enabled: true + kind: http_process + port: 8785 + command: + - uvx + - --from + - git+https://github.com/oraios/serena + - serena + - start-mcp-server + - --transport + - streamable-http + - --port + - ${mcp.runtime_services.serena_mcp.port} + healthcheck: + tcp: ${mcp.runtime_services.serena_mcp.port} + + client_configs: + qdrant: + enabled: true + clients: + default: + transport: http + url: "http://localhost:${mcp.runtime_services.qdrant_mcp.port}/mcp/" + + sourcegraph: + enabled: true + clients: + default: + transport: http + url: "http://localhost:${mcp.runtime_services.sourcegraph_mcp.port}/sourcegraph/mcp/" + + semgrep: + enabled: false + clients: + default: + transport: http + url: "http://localhost:${mcp.runtime_services.semgrep_mcp.port}/mcp/" + + serena: + enabled: true + clients: + default: + transport: http + url: "http://localhost:${mcp.runtime_services.serena_mcp.port}/mcp/" + + context7: + enabled: true + requires_env: [CONTEXT7_API_KEY] + clients: + default: + transport: http + url: "https://mcp.context7.com/mcp" + headers: + CONTEXT7_API_KEY: "${CONTEXT7_API_KEY}" + Accept: "application/json, text/event-stream" + codex: + transport: stdio + command: + - npx + - -y + - "@upstash/context7-mcp" + - --api-key + - ${CONTEXT7_API_KEY} + + tavily: + enabled: true + requires_env: [TAVILY_API_KEY] + clients: + default: + transport: http + url: "https://mcp.tavily.com/mcp/?tavilyApiKey=${TAVILY_API_KEY}" + + brave: + enabled: true + requires_env: [BRAVE_API_KEY] + clients: + default: + transport: stdio + command: + - npx + - -y + - "@brave/brave-search-mcp-server" + - --transport + - stdio + env: + BRAVE_API_KEY: "${BRAVE_API_KEY}" + + pal: + enabled: true + settings: + disabled_tools: "analyze,apilookup,challenge,chat,codereview,consensus,debug,docgen,planner,precommit,refactor,secaudit,testgen,thinkdeep,tracer" + clients: + default: + transport: stdio + command: &pal_command + 
- sh + - -c + - "for p in $(which uvx 2>/dev/null) $HOME/.local/bin/uvx /opt/homebrew/bin/uvx /usr/local/bin/uvx uvx; do [ -x \"$p\" ] && exec \"$p\" --from git+https://github.com/BeehiveInnovations/pal-mcp-server.git pal-mcp-server; done; echo \"uvx not found\" >&2; exit 1" + env: + PATH: "/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:$HOME/.local/bin:${CLI_BIN_PATHS}" + DISABLED_TOOLS: "${mcp.client_configs.pal.settings.disabled_tools}" + CUSTOM_API_URL: "http://localhost:11434" + claude: + transport: stdio + command: *pal_command + env: + PATH: "/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:$HOME/.local/bin:${CLI_BIN_PATHS}" + DISABLED_TOOLS: "${mcp.client_configs.pal.settings.disabled_tools}" + CUSTOM_API_URL: "http://localhost:11434" + post_config: + claude_settings_env: + MCP_TIMEOUT: "300000" + MCP_TOOL_TIMEOUT: "1200000" + gemini: + transport: stdio + command: *pal_command + env: + PATH: "/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:$HOME/.local/bin:${CLI_BIN_PATHS}" + DISABLED_TOOLS: "${mcp.client_configs.pal.settings.disabled_tools}" + CUSTOM_API_URL: "http://localhost:11434" + timeout_ms: 1200000 + codex: + transport: stdio + command: *pal_command + env: + PATH: "/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:$HOME/.local/bin:${CLI_BIN_PATHS}" + DISABLED_TOOLS: "${mcp.client_configs.pal.settings.disabled_tools}" + CUSTOM_API_URL: "http://localhost:11434" + startup_timeout_sec: 300 + tool_timeout_sec: 1200 + + filesystem: + enabled: true + settings: + whitelist: "${path_to.workspace}" + allowed_methods: + - read_multiple_files + clients: + default: + transport: stdio + command: + - npx + - -y + - mcp-filter + - -s + - "npx -y @modelcontextprotocol/server-filesystem ${mcp.client_configs.filesystem.settings.whitelist}" + - -a + - read_multiple_files + + fetch: + enabled: true + clients: + default: + transport: stdio + command: + - uvx + - mcp-server-fetch + + memory: + enabled: true + storage_path: ~/.memory-mcp/memory.jsonl + clients: + default: + 
transport: stdio + command: + - npx + - -y + - "@modelcontextprotocol/server-memory" + env: + MEMORY_FILE_PATH: "${mcp.client_configs.memory.storage_path}" + + playwright: + enabled: true + clients: + default: + transport: stdio + command: + - npx + - -y + - "@playwright/mcp@latest" + + +# ───────────────────────────────────────────────────────────────────────────── +# Agent configuration +# ───────────────────────────────────────────────────────────────────────────── +# Coding agent CLIs that Bureau should configure +# Available: claude, gemini, codex, opencode + +agents: + - claude + - gemini + - codex + - opencode + + +# ───────────────────────────────────────────────────────────────────────────── +# Roles and skills +# ───────────────────────────────────────────────────────────────────────────── + +# Role prompt availability: choose which agents are available when launching +# CLIs directly (separate from PAL's clink cross-CLI delegation, configured +# via pal.base-roles) +# Options: "all", explicit list, or empty list [] +roles: + enabled: + - architect + - debugger + - code-reviewer + - optimization + - testing + - migration-refactoring + + # roles to explicitly exclude (takes precedence over enabled) + disabled: [] + + # source directories for role discovery + sources: + - path: agents/role-prompts # For Codex, Gemini, OpenCode + cli: [codex, gemini, opencode] + - path: agents/claude-subagents # For Claude Code (has frontmatter) + cli: [claude] + +# Skills configuration (installed via protocols/scripts/set-up-skills.sh) +skills: + enabled: [micro-mode, assess-mode] + disabled: [] + sources: + - path: protocols/context/static/skills + prefix: bureau- + +# PAL clink configuration: controls which models and system prompts are +# available when using PAL MCP's clink tool to spawn subagents across CLIs +pal: + # Baseline roles made available to ALL coding CLIs + # + # - Roles are auto-discovered based on the set in agents/role-prompts + # - Options (also for 
per-CLI `extra-roles` settings below): + # - "all" => include all role prompts (note that when this is set, + # the per-CLI `extra-roles` settings below will have no effect) + # - "none" => include no roles + # - an explicitly-specified list of role prompts, referenced by their + # file's basename, e.g. `architect` for `agents/role-prompts/architect.md` + base-roles: all + + # PAL CLI-specific setting options + # IMPORTANT: After changing any of the settings below: + # 1. Re-run protocols/scripts/set-up-protocols.sh (or bin/open-bureau wrapper) + # 2. Restart coding CLIs (or if possible, use their MCP-related commands + # without restarting) to reconnect to PAL + claude: + # Takes any valid option for Claude's `model` config setting + # (see https://code.claude.com/docs/en/model-config) + model: sonnet + + # Extra role prompts (beyond `base-roles` above) to make available to + # only Claude when called via clink + extra-roles: none + + codex: + # Takes any valid option for Codex's `model` config setting + # (see https://developers.openai.com/codex/local-config#configuration-options) + model: gpt-5.2-codex + + # Takes any valid option for Codex's `model_reasoning_effort` config setting + # (see https://developers.openai.com/codex/local-config#configuration-options) + effort: medium + + # Extra role prompts (beyond `base-roles` above) to make available to + # only Codex when called via clink + extra-roles: none + + gemini: + # Extra role prompts (beyond `base-roles` above) to make available to + # only Gemini when called via clink + extra-roles: none + + +# ───────────────────────────────────────────────────────────────────────────── +# Paths +# ───────────────────────────────────────────────────────────────────────────── + +# File paths for config files, connection settings, etc. 
+# Note: environment variables take precedence over these values +path_to: + # Base workspace for user projects (override using: $BUREAU_WORKSPACE) + # When set, the following paths are automatically derived from it + # (unless explicitly overridden): + # - serena_memories_root (root for recursive scanning of Serena memory + # files during Bureau-run cleanup) + workspace: ~/code + + # MCP server clone location (relative paths resolve from main repo root, + # shared across Bureau worktrees) + mcp_clones: .mcp-servers + + +# ───────────────────────────────────────────────────────────────────────────── +# Auto-approval rules +# ───────────────────────────────────────────────────────────────────────────── + +# Auto-approval settings +# When true, agents won't prompt for permission before using MCP tools +auto_approved: + mcp_tools: false + bash: + enabled: false + ruleset: + allow: + - "ls" + - "rg" + - "git status" + - "git diff" + - "git log" + - "npm run test" + - "python" + - "sed -i" + - "tee" + - "cat" + - "printf" + - "mkdir -p" + # deny: + # - "rm" + # - "git add" + # - "git commit" + # - "git reset" + # - "git checkout" + # - "git merge" + # - "git rebase" + # - "git push" + # - "git pull" + # - "git clean" + # - "pip install" + # - "npm install" + # - "pnpm install" + # - "yarn add" + # - "sudo" + # - "dd" + # - "mkfs" + +prune_disabled_mcps: true + + +# ───────────────────────────────────────────────────────────────────────────── +# Coding standards +# ───────────────────────────────────────────────────────────────────────────── + +# Coding standards documents read by agents at startup and used by assess mode +# Override: provide your own files in local.yml to replace these defaults +# code_standards: +# - protocols/context/static/code-standards.md + +# Assess mode skill configuration +assess_mode: + default_target: git-diff + default_diff: HEAD + + +# ───────────────────────────────────────────────────────────────────────────── +# Retention & cleanup +# 
───────────────────────────────────────────────────────────────────────────── + +# Retention periods (per memory backend), after which memories older than +# this threshold will be moved to trash automatically by the cleanup process +# Formats: never, 24h (hours), 30d (days), 2w (weeks), 3m (months), 1y (years) +retention_period_for: + claude_mem: 30d # Claude-mem SQLite database (Claude Code only) + serena: 90d # Serena project memories (.serena/memories/) + qdrant: 180d # Qdrant vector database (used by all enabled CLIs) + memory_mcp: 365d # Memory MCP JSONL file (used by all enabled CLIs) + +# Cleanup interval (for stale memories) +cleanup: + min_interval: 24h + +# Grace period after stale items are moved to trash before permanent deletion +trash: + grace_period: 30d + + +# ───────────────────────────────────────────────────────────────────────────── +# Startup timeouts +# ───────────────────────────────────────────────────────────────────────────── + +# Timeouts (in seconds) when starting up +# (increase for slower machines) +startup_timeout_for: + mcp_servers: 200 # HTTP MCP servers + docker_daemon: 120 # Docker containers diff --git a/directives.yml b/directives.yml deleted file mode 100644 index a8c78c4f..00000000 --- a/directives.yml +++ /dev/null @@ -1,108 +0,0 @@ -# Bureau user-tunable configuration -# - Changes to this file will directly alter Bureau's behavior. -# - See CONFIGURATION.md for details. -# -# This file is version-controlled and shared across all users. -# For local overrides (not to be tracked by git), create and write to local.yml instead. - -# Coding agent CLIs that Bureau should configure -# Available: claude, gemini, codex, opencode -agents: - - claude - - gemini - - codex - - opencode - -# File paths for config files, connection settings, etc. 
-# Note: environment variables take precedence over these values -path_to: - # Base workspace for user projects (override using: $BUREAU_WORKSPACE) - # When set, the following paths are automatically derived from it (unless explicitly overridden): - # - serena_memories_root (root for recursive scanning of Serena memory files during Bureau-run cleanup) - # - fs_mcp_whitelist (Filesystem MCP security boundary) - workspace: ~/code - - # MCP server clone location (relative paths resolve from main repo root, shared across Bureau worktrees) - mcp_clones: .mcp-servers - -# MCP tool permissions -# When true, agents won't prompt for permission before using MCP tools -mcp: - auto_approve: no - -# PAL clink configuration: controls which models and system prompts are available -# when using PAL MCP's clink tool to spawn subagents across coding CLIs -pal: - # Baseline roles made available to ALL coding CLIs - # - # - Roles are auto-discovered based on the set in agents/role-prompts - # - Options (also for per-CLI `extra-roles` settings below): - # - "all" => include all role prompts (note that when this is set, - # the per-CLI `extra-roles` settings below will have no effect) - # - "none" => include no roles - # - an explicitly-specified list of role prompts, referenced by their file's basename, - # e.g. `architect` for `agents/role-prompts/architect.md` - base-roles: all - - # PAL CLI-specific setting options - # IMPORTANT: After changing any of the settings below: - # 1. Re-run protocols/scripts/set-up-configs.sh (or bin/open-bureau wrapper) - # 2. 
Restart coding CLIs (or if possible, use their MCP-related commands without - # restarting) to reconnect to PAL - claude: - # Takes any valid option for Claude's `model` config setting - # (see https://code.claude.com/docs/en/model-config) - model: sonnet - - # Extra role prompts (beyond `base-roles` above) to make available to - # only Claude when called via clink - extra-roles: none - - codex: - # Takes any valid option for Codex's `model` config setting - # (see https://developers.openai.com/codex/local-config#configuration-options) - model: gpt-5.2-codex - - # Takes any valid option for Codex's `model_reasoning_effort` config setting - # (see https://developers.openai.com/codex/local-config#configuration-options) - effort: medium - - # Extra role prompts (beyond `base-roles` above) to make available to - # only Codex when called via clink - extra-roles: none - - gemini: - # Extra role prompts (beyond `base-roles` above) to make available to - # only Gemini when called via clink - extra-roles: none - -# Retention periods (per memory backend), after which memories older than this threshold -# will be moved to trash automatically by the cleanup process -# Formats: never, 24h (hours), 30d (days), 2w (weeks), 3m (months), 1y (years) -retention_period_for: - claude_mem: 30d # Claude-mem SQLite database (Claude Code only) - serena: 90d # Serena project memories (.serena/memories/) - qdrant: 180d # Qdrant vector database (used by all enabled CLIs) - memory_mcp: 365d # Memory MCP JSONL file (used by all enabled CLIs) - -# Cleanup interval (for stale memories) -cleanup: - min_interval: 24h - -# Grace period after stale items are moved to trash -# before permanent deletion -trash: - grace_period: 30d - -# Timeouts (in seconds) to wait when starting up components (increase for slower machines) -startup_timeout_for: - mcp_servers: 200 # For HTTP MCP servers started by Bureau - docker_daemon: 120 # For Docker daemon startup - -# Ports for locally-run HTTP servers & backing 
DB containers -port_for: - qdrant_db: 8780 - qdrant_mcp: 8782 - sourcegraph_mcp: 8783 - semgrep_mcp: 8784 - serena_mcp: 8785 diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md index 88a6ffe5..9c0e92b1 100644 --- a/docs/CONFIGURATION.md +++ b/docs/CONFIGURATION.md @@ -40,30 +40,30 @@ Configuration is loaded and merged in this order (later sources override earlier | Priority | File | Purpose | Tracked by git? | |:---------|:-----|:--------|:------------------| -| 1 (lowest) | **`charter.yml`** | Fixed system defaults | Yes | -| 2 | **`directives.yml`** | Streamlined collection of team-/user-oriented settings that are often tweaked | Yes | -| 3 | **`local.yml`** | Personal overrides | No | -| 4 (highest) | **Environment variables** | Runtime overrides *(should be used rarely; for more persistent personal overrides, use `local.yml`)* | N/A | +| 1 **(lowest)** | [`defaults.yml`](/defaults.yml) | All git-tracked package defaults (ships with Bureau) | Yes | +| 2 | `.bureau.yml` | Optional project-level config (discovered by walk-up from working directory) | Yes (in *your* project) | +| 3 | `local.yml` | Personal overrides | No | +| 4 **(highest)** | Environment variables | Runtime overrides *(should be used rarely; for more persistent personal overrides, use `local.yml`)* | N/A | ### When to use each file -#### `charter.yml` +#### `defaults.yml` -- Don't edit unless you're changing upstream service endpoints or package conventions. -- These are values that rarely (if ever) need changing. +- This is the **single source of all git-tracked defaults that ship with Bureau.** +- It contains examples of how to set config values (to use as exemplars to follow when overriding them in `.bureau.yml` or `local.yml`). -#### `directives.yml` +#### `.bureau.yml` -- *Read* to see examples of how to set config values (to then override in your `local.yml`). -- *Edit* to change team-wide defaults like retention periods, enabled agents, ports, or paths. 
+- Optional project-level config file discovered by walking up from the current working directory (like ESLint, Prettier, or Ruff configs). +- Can override any setting from `defaults.yml`. +- Shareable via git: commit it to your project repo so all team members using Bureau get the same project-specific settings. +- Typical uses: project-specific workspace paths, retention periods, enabled agents, MCP catalog entries, or custom tool settings. - - Changes here affect everyone using *that* particular Bureau installation. +#### `local.yml` -#### `local.yml` +Create and write to this file for personal overrides that shouldn't be shared, e.g.: -Create and write to this file for personal overrides that shouldn't be shared, e.g.: - -- custom workspace paths +- custom workspace paths - custom retention periods for memories (configured per-MCP) - disabling Bureau configuration for agent CLIs you don't use @@ -85,22 +85,319 @@ agents: Remove an agent from the list to skip configuring it. Note that the CLI's config directory must also exist (e.g., `~/.claude/` for Claude Code). + ### `mcp` -**File:** `directives.yml` +MCP catalog configuration. + +**Cross-cutting conventions:** + +- **Canonical IDs:** map keys serve as *canonical IDs* for both `mcp.runtime_services` and `mcp.client_configs`. + + - Do **not** add `name` fields; these cause needless duplication and drift. + +- **Placeholder expansion:** all string fields support `${...}` expansion. + + - **Expansion order:** OS environment variables first, then config key paths (e.g. `${path_to.workspace}`, `${mcp.runtime_services.qdrant_mcp.port}`). + - Unknown placeholders remain untouched. + - **Environment variables for MCP servers:** + - To pass environment variables to an MCP server process, add them as `env` entries in the server's client config. 
+ - The **values** in these `env` entries can themselves contain `${...}` placeholders that get resolved from your OS environment at config-load time before being passed to the MCP server. + - **Example:** + + ```yaml + mcp: + client_configs: + context7: + clients: + default: + transport: http + url: https://api.context7.dev/mcp + env: + CONTEXT7_API_KEY: "${CONTEXT7_API_KEY}" # ← Expanded from your shell's $CONTEXT7_API_KEY + ``` + +- **Dependency semantics:** + + - `depends_on.services` is a list of service IDs. + - A service is considered ready only after its `healthcheck` succeeds (if present). + - Startup order is topological: servers with `depends_on.services` are skipped if any dependency is disabled or missing. + +- **Unknown keys:** unknown keys are preserved during resolution to allow future/custom extensions. +- **Enabled by default:** dependencies, runtime services, and client configs are all **enabled by default** when `enabled` is omitted. -MCP tool permissions. + - **Adding an entry to any MCP bucket is sufficient to activate it on the next `open-bureau` run.** + - You *must* set `enabled: false` explicitly to define an entry without activating it. + +**Top-level structure:** + +- `auto_approved` (object): + - `mcp_tools` (bool, default: `false`): Whether MCP tools should be auto‑approved by setup scripts. + - `bash` (object): + - `enabled` (bool, default: `false`): Whether Bash allow/deny rules should be applied. + - `ruleset` (object): + - `allow` (list): Literal command prefixes to allow. + - `deny` (list): Literal command prefixes to deny. +- `prune_disabled_mcps` (bool, default: `false`): When `true`, Bureau prunes previously managed MCPs that are no longer desired, using per‑CLI registry fingerprints to avoid removing user‑modified entries. +- `mcp` (object): + - `dependencies` (map): Non-daemon prerequisites (git repos, file storage) prepared before services. 
+ - `runtime_services` (map\): Managed runtime services (containers, processes) that may depend on dependencies. + - `client_configs` (map\): MCP servers exposed to CLIs. + +#### `mcp.dependencies` + +Defines non-daemon prerequisites (git repos, file storage) that are prepared before services. Dependencies cannot depend on other dependencies — they are prepared in sorted order first, then services (which may depend on them) are started. + +**Config schema** for each entry in `mcp.dependencies.`: + +- `enabled` (bool, default: `true`): Skip dependency if `false`. +- `kind` (string, required): One of `git_repo`, `file`. +- All kind-specific fields (see below). + +**Dependency kinds:** + +- `git_repo`: + - `repo_url` (string, required) + - `branch` (string, optional) + - `path` (string, required): Clone destination. + - `post_clone` (`list>`, optional): Commands run in `path` after clone/update. +- `file`: + - `path` (string, required): File path used by cleanup and other tools. + +**Example:** +```yaml +mcp: + dependencies: + sourcegraph_repo: + kind: git_repo + repo_url: https://github.com/user/repo.git + branch: main + path: ${path_to.mcp_clones}/repo + post_clone: + - ["uv", "sync"] + + claude_mem_storage: + kind: file + path: ~/.claude-mem/claude-mem.db +``` +#### `mcp.runtime_services` + +Defines managed runtime services that Bureau starts (containers, local HTTP processes). Services can depend on dependencies via `depends_on.dependencies`. + +**Config schema** for each entry in `mcp.runtime_services.`: + +- `enabled` (bool, default: `true`): Skip service if `false`. +- `kind` (string, required): One of `docker_container`, `http_process`; see kind-specific fields below. +- `depends_on` (object, optional): + - `services` (list\): Service IDs that must be started and pass `healthcheck` first. + - `dependencies` (list\): Dependency IDs that must be prepared first. +- `healthcheck` (object, optional): + - `tcp` (int): Port to probe for readiness. 
+- `env` (map\, optional): Environment vars for process services. +- `command` (list\, optional): Command array (executable + args) for process services. +- `settings` (map\, optional): Service-specific data used for templating. + +**Kind-specific fields:** + +- `docker_container`: + - `container_name` (string, optional): Docker container name. + - `image` (string, required): Docker image ref. + - `host_port` (int, required): Host port bound to container. + - `container_port` (int, required): Container port to expose. + - `mounts` (`list`, optional): + - `host_path` (string, required) + - `container_path` (string, required) +- `http_process`: + - `port` (int, required): Port the process should listen on. + - `command` (`list`, required): Command array to launch server. + - `env` (`map`, optional) + +**Example:** ```yaml mcp: - auto_approve: no # yes/true or no/false + runtime_services: + sourcegraph_mcp: + kind: http_process + port: 8783 + depends_on: + dependencies: [sourcegraph_repo] + command: + - uv + - --directory + - ${mcp.dependencies.sourcegraph_repo.path} + - run + - sourcegraph-mcp ``` -When set to `yes` or `true`, agents won't prompt for permission before using MCP tools and other common functionality (e.g. trivial bash commands). This is convenient for trusted setups but bypasses the safety confirmation dialogs. +#### `mcp.client_configs` + +Defines MCP servers exposed to CLIs, including per‑CLI client overrides. Servers can depend on runtime services and dependencies via `depends_on`. + +**Config schema** for each entry in `mcp.client_configs.`: + +- `enabled` (bool, default: `true`): Skip server if `false`. +- `requires_env` (`list`, optional): If any env var is missing/empty, the server is skipped. +- `depends_on` (object, optional): + - `services` (list\): Service IDs that must be enabled/resolved for the server to be included. + - `dependencies` (list\): Dependency IDs that must be enabled/resolved for the server to be included. 
+- `clients` (`map`, required): Per‑CLI client configs. + - `clients.default` (Client, optional but strongly recommended): Used by all CLIs unless a CLI override exists. + - `clients.` (Client, optional): Overrides for `claude`, `gemini`, `codex`, `opencode`. + - `clients.disabled_for` (`list`, optional): Agent names to exclude from this server. Values should match entries in the top-level `agents` list. When listed, the agent does not receive this server, even if a `clients.` override exists. +- `settings` (`map`, optional): Server-level settings (e.g. PAL disabled tools). Pass‑through; the renderer should not drop unknown keys. +- `storage_path` (string, optional): Server‑specific storage (used by cleanup, e.g. Memory MCP). + +**Client** — each entry in `mcp.client_configs..clients.`: + +- `transport` (string, required): `http` or `stdio`. +- `url` (string, required for `http`): MCP HTTP endpoint. +- `headers` (`map`, optional): HTTP headers (expanded). +- `command` (`list`, required for `stdio`): Command array to launch MCP server. +- `env` (`map`, optional): Environment vars for stdio servers. +- `timeout_ms` (int, optional): Per‑server tool timeout (Gemini). +- `startup_timeout_sec` (int, optional): Startup timeout (Codex). +- `tool_timeout_sec` (int, optional): Tool timeout (Codex). +- `post_config` (object, optional): CLI‑specific side effects, e.g.: + - `claude_settings_env` (`map`): Adds keys to `~/.claude/settings.json` under `.env`. -**Accepted values:** -- `yes` or `true` - Enable auto-approval -- `no` or `false` - Require manual approval (default) +> [!NOTE] +> Codex HTTP does not support custom headers; use `clients.codex` with `stdio` for servers requiring headers (e.g. Context7). + +### `skills` + + +Controls which skills are installed by `protocols/scripts/set-up-skills.sh`. 
+ +```yaml +skills: + enabled: all + disabled: [] + sources: + - path: protocols/context/static/skills + prefix: bureau- +``` + +**Fields:** +- `enabled`: `all` or a list of skill directory names (without prefix) to include. +- `disabled`: list of skill directory names to exclude. +- `sources`: list of directories to scan for skills. + - `path`: absolute path or repo‑relative path. + - `prefix`: prefix applied to installed skill names (e.g. `bureau-`). + +> [!CAUTION] +> - `protocols/scripts/set-up-skills.sh` removes **all** existing skills with the `bureau-` prefix from each CLI's skills directory before reinstalling. +> - **Avoid naming your own custom skills `bureau-*`** unless you expect them to be wiped during setup. + + +### `assess_mode` + +**Files:** `directives.yml` (defaults), `local.yml` (personal overrides) + +Runtime configuration for the [`bureau-assess-mode` skill](../protocols/context/static/skills/assess-mode/SKILL.md). These values are read by the skill at activation time to determine what to review. Standards for audit are configured via the top-level [`code_standards`](#code_standards) setting. + +```yaml +assess_mode: + default_target: git-diff + default_diff: HEAD +``` + +**Fields:** +- `default_target`: how the skill determines what to review when the user doesn't specify explicit files. Currently only `git-diff` is supported. +- `default_diff`: the git ref to diff against when `default_target` is `git-diff`. Common values: `HEAD` (unstaged + untracked vs last commit), `main` (full branch diff), or any commit SHA. + +### `roles` + +**Files:** `directives.yml` (defaults), `local.yml` (personal overrides) + +Controls which agent roles are available when launching CLIs **directly** through their native features (slash commands for Claude Code, launcher scripts for Codex/Gemini, auto-discovery for OpenCode). This is **separate from** PAL's `clink` tool cross-CLI delegation, which is configured via [`pal.base-roles`](#palbase-roles). 
+ +```yaml +roles: + enabled: + - architect + - debugger + - code-reviewer + - optimization + - testing + - migration-refactoring + disabled: [] + sources: + - path: agents/role-prompts # For Codex, Gemini, OpenCode + cli: [codex, gemini, opencode] + - path: agents/claude-subagents # For Claude Code (has frontmatter) + cli: [claude] +``` + +**Fields:** +- `enabled`: `all` or a list of agent role names to include. Agent names correspond to role file stems (e.g., `architect` for `architect.md`). +- `disabled`: List of agent role names to exclude (takes precedence over `enabled`). +- `sources`: List of directories to scan for agent role prompts, with per-CLI mappings. + - `path`: Relative path from repo root to agent role directory. + - `cli`: List of CLIs that should use this source directory. + +**How setup scripts use this:** + +| CLI | Native Feature | Setup Script | Result | +|:----|:---------------|:-------------|:-------| +| Claude Code | Slash commands | `agents/scripts/set-up-claude-slash-commands.sh` | Creates `~/.claude/commands/*.md` files | +| Codex | Launcher scripts | `agents/scripts/set-up-codex-role-launchers.sh` | Creates `~/.local/bin/codex-*` executables | +| Gemini CLI | Launcher scripts | `agents/scripts/set-up-gemini-role-launchers.sh` | Creates `~/.local/bin/gemini-*` executables | +| OpenCode | Auto-discovery | `agents/scripts/set-up-agents.sh` | Creates filtered symlinks in `~/.config/opencode/agent/bureau-agents/` | + +**Default enabled agents:** + +The default configuration enables 6 core agent roles, excluding all others: +- `architect` - Principal software architect for system design +- `debugger` - Deep debugging and root-cause analysis +- `code-reviewer` - Code quality and security audits +- `optimization` - Performance optimization specialist +- `testing` - Test infrastructure and quality engineering +- `migration-refactoring` - Large-scale refactoring strategist + +**Distinction from PAL configuration:** + +| Setting | Scope | Purpose 
| +|:--------|:------|:--------| +| `roles` | **Native CLI usage** | Controls slash commands, launchers, auto-discovery | +| `pal.base-roles` | **PAL's `clink` tool** | Controls cross-CLI subagent delegation | + +These are independent: you can have all agents available for PAL's `clink` while restricting native CLI usage to a smaller set, or vice versa. + +**Example: Enable all agents for native usage** + +```yaml +# local.yml +roles: + enabled: all +``` + +**Example: Enable specific agents with exclusions** + +```yaml +# local.yml +roles: + enabled: all + disabled: + - chaos-engineer + - incident-commander # Exclude specific roles from "all" +``` + +**Example: Custom agent set** + +```yaml +# local.yml +roles: + enabled: + - architect + - frontend + - security-compliance + - distributed-systems + disabled: [] +``` + +> [!NOTE] +> After modifying `roles` configuration, run `./bin/open-bureau` to regenerate slash commands, launchers, and symlinks. For Claude Code, the changes take effect immediately (run `/help` to see updated list). For Codex/Gemini launchers, you may need to restart your shell or run `hash -r` to refresh the command cache. ### `pal` @@ -250,20 +547,15 @@ File and directory paths used by Bureau and its tools. 
|:--------|:--------|:------------| | `workspace` | `~/code` | Base workspace directory; other paths derive from this | | `serena_memories_root` | (= `workspace`) | Root directory for scanning Serena memory files *(used for **Bureau-run cleanup only**)* | -| `fs_mcp_whitelist` | (= `workspace`) | Directory boundary for Filesystem MCP access | | `mcp_clones` | `.mcp-servers/` | Clone location for MCP server source code | -| `storage_for.claude_mem` | `~/.claude-mem/claude-mem.db` | Claude-mem SQLite database path | -| `storage_for.memory_mcp` | `~/.memory-mcp/memory.jsonl` | Memory MCP knowledge graph storage | -| `storage_for.qdrant` | `~/.qdrant/storage` | Qdrant Docker volume mount point | > [!NOTE] > > #### Paths automatically derived from `workspace` > -> When `workspace` is set, the following paths are automatically derived from it (unless explicitly overridden): +> When `workspace` is set, the following path is automatically derived from it (unless explicitly overridden): > > - `serena_memories_root` → same as `workspace` -> - `fs_mcp_whitelist` → same as `workspace` > > This means you only need to configure `workspace` in `local.yml` to change all workspace-related paths at once. @@ -272,43 +564,9 @@ File and directory paths used by Bureau and its tools. ```yaml path_to: # User-tunable paths - workspace: ~/code # Base workspace directory - serena_memories_root: ~/code # Root for scanning Serena memory files (used by Bureau-run cleanup only) - fs_mcp_whitelist: ~/code # Filesystem MCP security boundary - mcp_clones: .mcp-servers/ # MCP server clone location (in repo root) - - # Storage paths for memory backends - storage_for: - claude_mem: ~/.claude-mem/claude-mem.db # Claude-mem SQLite database - memory_mcp: ~/.memory-mcp/memory.jsonl # Memory MCP JSONL storage - qdrant: ~/.qdrant/storage # Qdrant Docker volume mount -``` - -### `endpoint_for` - -**File:** `charter.yml` - -Cloud-hosted MCP service endpoints. 
- -```yaml -endpoint_for: - sourcegraph: https://sourcegraph.com - context7: https://mcp.context7.com/mcp - tavily: https://mcp.tavily.com/mcp/?tavilyApiKey=${TAVILY_API_KEY} -``` - -These rarely need changing unless you're using self-hosted instances. - -### `qdrant` - -**File:** `charter.yml` - -Qdrant vector database settings. - -```yaml -qdrant: - collection: coding-memory # Collection name - embedding_provider: fastembed # Embedding model provider + workspace: ~/code # Base workspace directory + serena_memories_root: ~/code # Root for scanning Serena memory files (used by Bureau-run cleanup only) + mcp_clones: .mcp-servers/ # MCP server clone location (in repo root) ``` ## Environment variable overrides @@ -318,11 +576,14 @@ Some configuration values can be overridden via environment variables: | Environment Variable | Overrides | Description | |:---------------------|:----------|:------------| | `BUREAU_WORKSPACE` | `path_to.serena_memories_root` | Root for scanning Serena memory files | -| `MEMORY_MCP_STORAGE_PATH` | `path_to.storage_for.memory_mcp` | Memory MCP storage path | -| `CLAUDE_MEM_STORAGE_PATH` | `path_to.storage_for.claude_mem` | Claude-mem database path | -| `QDRANT_STORAGE_PATH` | `path_to.storage_for.qdrant` | Qdrant storage directory | -| `QDRANT_COLLECTION_NAME` | `qdrant.collection` | Qdrant collection name | -| `QDRANT_EMBEDDING_PROVIDER` | `qdrant.embedding_provider` | Embedding provider | + +> [!NOTE] +> +> Some remote MCPs require API keys (even for their free versions). 
Set these env vars accordingly to enable them: +> +> - `TAVILY_API_KEY` +> - `BRAVE_API_KEY` +> - `CONTEXT7_API_KEY` ## Examples @@ -377,9 +638,12 @@ For example, if port `8780` (the default Qdrant DB listening port) is already in ```yaml # local.yml -port_for: - qdrant_db: 9780 -``` +## Agent context files + +**Location:** `~/.config/bureau/protocols/` + +> [!WARNING] +> If you customize Bureau's MCP catalog (add, remove, or reconfigure tools), the default `tools-guide.md` may no longer accurately reflect your setup. **You are responsible for updating or replacing protocols files in `~/.config/bureau/protocols/` to match your configuration.** Run `bin/reset-protocols` to restore defaults at any time. ## Security note for subagents spawned via PAL MCP's `clink` diff --git a/docs/SETUP.md b/docs/SETUP.md index 19417ba1..b7c9683c 100644 --- a/docs/SETUP.md +++ b/docs/SETUP.md @@ -17,7 +17,7 @@ Bureau's bootstrap script handles all setup automatically, providing sensible de > - [`bureau/bin/`](../bin/) > - `~/.local/bin` *(only if using **Gemini CLI** and/or **Codex**)* > -> 3. **Create `local.yml` in the repo root (recommended)** or edit [`directives.yml`](../directives.yml) to set `path_to.workspace` to where you keep the repos/projects you want to work on, e.g.: +> 3. 
**Create `local.yml` in the repo root (recommended)** or edit [`defaults.yml`](../defaults.yml) to set `path_to.workspace` to where you keep the repos/projects you want to work on, e.g.: > > ```yml > # local.yml @@ -101,7 +101,7 @@ In your shell profile (`~/.zshrc`, `~/.bashrc`, etc.), add the following paths t ### Set Bureau workspace path (`path_to.workspace`) -**Create `local.yml` in the repo root (recommended)** or edit [`directives.yml`](../directives.yml) to set `path_to.workspace` to where you keep the repos/projects you want to work on, e.g.: +**Create `local.yml` in the repo root (recommended)** or edit [`defaults.yml`](../defaults.yml) to set `path_to.workspace` to where you keep the repos/projects you want to work on, e.g.: ```yml # local.yml @@ -131,7 +131,7 @@ $ open-bureau # in this repo's bin/ > [!IMPORTANT] > -> **`open-bureau` must be re-run after editing any config values** in the Bureau `.yml` files, i.e. [`charter.yml`](../charter.yml), [`directives.yml`](../directives.yml) (and `local.yml` if you've created one). +> **`open-bureau` must be re-run after editing any config values** in the Bureau `.yml` files, i.e. [`defaults.yml`](../defaults.yml), `.bureau.yml` (if you've created one), and `local.yml` (if you've created one). #### What `open-bureau` does *(optional extra info)* @@ -176,8 +176,7 @@ $ claude > [!NOTE] > -> Codex also works with Superpowers: Bureau sets it up **automatically** (via cloning the repo to the user-scoped `~/.codex/`).\ -> No manual setup is needed. +> Codex also works with Superpowers; Bureau sets it up **automatically**. > [!IMPORTANT] > @@ -221,7 +220,7 @@ $ claude 1. Ask: "What must-read files were you given?" 2. Should reference delegation rules and clink -3. Run `~/.codex/superpowers/.codex/superpowers-codex find-skills` to verify Superpowers +3. Run `ls -ld ~/.agents/skills/superpowers`: it should be a symlink to `~/.codex/superpowers/skills` 4. 
Run `codex-explainer` *(custom launch wrapper set up by Bureau)* from the command line to see if it launches Codex with the [`explainer`](../agents/role-prompts/explainer.md) agent active in the *main conversation* > You can test this with any role prompt in [`agents/role-prompts/`](../agents/role-prompts/): the launch wrapper created for each role will have the form `codex-` for each file in the directory. @@ -259,7 +258,7 @@ $ close-bureau # in this repo's bin/ ### How to tweak settings -1. Create a file called `local.yml` in the Bureau repo root **(recommended over changing `directives.yml`)** +1. Create a file called `local.yml` in the Bureau repo root **(recommended over changing `defaults.yml`)** 2. Place your overrides there *(see the [simple power user `local.yml` config below](#simple-power-user-configuration-example) for inspiration)*. 3. Re-run `open-bureau`. @@ -273,16 +272,16 @@ The new `local.yml` will be: | Setting | Default | How to customize | | :--- | :--- | :--- | | Enabled CLI agents | All 4 [supported CLIs](#supported-cli-coding-agents) | Set `agents` list in `local.yml` | -| Bureau workspace path | `~/code` | Set `path_to.fs_mcp_whitelist` in `local.yml` | +| Bureau workspace path | `~/code` | Set `path_to.workspace` in `local.yml` | | Memory retention | 30d–365d, depending on the backend | Set `retention_period_for.*` in `local.yml` | -| [Role prompts](../agents/role-prompts/) and models for PAL `clink` to use with coding CLIs | All role prompts; Sonnet for Claude Code; gpt-5.2-codex with medium reasoning effort for Codex | Set `pal.*` settings in `local.yml` *(see [`directives.yml`](../directives.yml) for quick examples)* | +| [Role prompts](../agents/role-prompts/) and models for PAL `clink` to use with coding CLIs | All role prompts; Sonnet for Claude Code; gpt-5.2-codex with medium reasoning effort for Codex | Set `pal.*` settings in `local.yml` *(see [`defaults.yml`](../defaults.yml) for quick examples)* | ### Simple power user 
configuration example ```yml # bureau/local.yml (create if it doesn't exist) -mcp: - auto_approve: yes +auto_approved: + mcps: true pal: claude: @@ -299,7 +298,7 @@ pal: | MCP server not starting | Check logs in `/tmp/mcp-*-server.log` | | Docker not running | Start Docker Desktop / Rancher Desktop first | | Missing API key warnings | Set the environment variables listed in prerequisites | -| Port conflicts | Override ports in `local.yml` (see [CONFIGURATION.md](CONFIGURATION.md#port_for)) | +| Port conflicts | Override MCP ports in `local.yml` (see [CONFIGURATION.md](CONFIGURATION.md#change-mcp-ports-to-avoid-conflicts)) | > [!NOTE] > For manual setup steps (symlinking, running individual scripts), see the scripts' sources: @@ -308,5 +307,5 @@ pal: > | --- | --- | > | [`bin/open-bureau`](../bin/open-bureau) | Main bootstrap/setup convenience/wrapper script | > | [`tools/scripts/set-up-tools.sh`](../tools/scripts/set-up-tools.sh) | MCP/tooling setup | -> | [`protocols/scripts/set-up-configs.sh`](../protocols/scripts/set-up-configs.sh) | Config/context file setup | +> | [`protocols/scripts/set-up-protocols.sh`](../protocols/scripts/set-up-protocols.sh) | Config/context file setup | > | [`agents/scripts/set-up-agents.sh`](../agents/scripts/set-up-agents.sh) | Agent-/role-related setup | diff --git a/docs/USAGE.md b/docs/USAGE.md index 156b9109..f049853f 100644 --- a/docs/USAGE.md +++ b/docs/USAGE.md @@ -89,7 +89,7 @@ Bureau configures CLIs based on the `agents` config setting: ```yaml -# directives.yml (or local.yml for personal overrides) +# defaults.yml (or .bureau.yml / local.yml for overrides) agents: - claude # Claude Code - gemini # Gemini CLI @@ -226,12 +226,7 @@ Simply explicitly mention the subagent you want to use and it will automatically - **PAL (`clink` tool only)**: enables cross-CLI collaboration/subagent spawning, [as discussed above](#spawning-subagents-using-any-cli-from-any-cli-via-clink) > [!TIP] -> To learn more about the available MCPs and the 
specific guidance agents receive on how to use them, read: -> -> - [`tools-guide.md`](../protocols/context/static/tools-guide.md): quick decision guide for selecting the best tool for a given task. -> - [`deep-dives/`](../protocols/context/static/deep-dives/): -> -> - Collection of in-depth guides for each MCP, detailing their capabilities and advanced usage patterns. Only read by agents when necessary to preserve context +> To learn more about the available MCPs and the specific guidance agents receive on how to use them, read [`tools-guide.md`](../protocols/context/static/tools-guide.md) *(quick decision guide for selecting the best tool for a given task)*. ### Non-MCP CLI tools @@ -366,8 +361,9 @@ A skills library that enforces mandatory workflows for common engineering tasks > > You can check which skills are available to (be automatically loaded by) Codex by running: > ```bash -> ~/.codex/superpowers/.codex/superpowers-codex find-skills +> ls -ld ~/.agents/skills/superpowers > ``` +> The symlink target should be `~/.codex/superpowers/skills`. ### Using `claude-mem` *(Claude Code only)* diff --git a/docs/mcp-schema-eval.md b/docs/mcp-schema-eval.md new file mode 100644 index 00000000..2a7a3e80 --- /dev/null +++ b/docs/mcp-schema-eval.md @@ -0,0 +1,170 @@ +# MCP schema evaluation + +> **Date:** 2026-03-04 +> **Scope:** `mcp` key in Bureau's YAML config — schema design, validation coverage, and doc consistency. 
+> +> **Files reviewed:** +> - [`docs/CONFIGURATION.md`](CONFIGURATION.md) — user-facing schema reference +> - [`defaults.yml`](/defaults.yml) — canonical defaults +> - [`operations/validate_config.py`](/operations/validate_config.py) — validation engine +> - [`operations/mcp_validation_rules.py`](/operations/mcp_validation_rules.py) — declarative rule constants + +--- + +***Contents***: + +- [Architecture overview](#architecture-overview) +- [Strengths](#strengths) +- [Weaknesses](#weaknesses) + - [Missing required-field checks](#missing-required-field-checks) + - [Doc/schema drift](#docschema-drift) + - [Implicit coupling not validated](#implicit-coupling-not-validated) + - [Shallow sub-structure validation](#shallow-sub-structure-validation) + - [Design limitation](#design-limitation) +- [Summary](#summary) + +--- + +## Architecture overview + +The schema has **three buckets** under the `mcp` key, forming a dependency graph: + +``` +dependencies → runtime_services → client_configs + (repos, files) (docker, processes) (what CLIs actually see) +``` + +Each bucket uses a **discriminated union** pattern — a `kind` field determines which fields are required. Validation is split across two files: + +| File | Role | +|:-----|:-----| +| `operations/mcp_validation_rules.py` | Declarative constants (allowed keys, required fields, type rules) | +| `operations/validate_config.py` | Engine that consumes those constants | + +--- + +## Strengths + +### 1. Declarative validation rules + +All field sets, kind enums, and type rules live in `mcp_validation_rules.py` as plain data. Adding a new `kind` or field means editing a dict/set, not writing new validation logic. This is the right separation of concerns. + +### 2. Warnings vs errors distinction + +Unknown keys produce **warnings**, not errors. This means user extension keys (like `settings.collection`) don't break validation, but typos still get flagged. The `ValidationResult` dataclass makes this a first-class concept. 
+ +### 3. Cross-reference validation + +`_validate_cross_references` checks that `depends_on.services` and `depends_on.dependencies` actually point to declared entries. These are warnings (not errors) because references might come from conditionally-loaded config layers — good pragmatic choice for a multi-tier config system. + +### 4. Transport-dependent required fields + +`CLIENT_TRANSPORT_REQUIRED` maps `http → {url}` and `stdio → {command}`, so the validator correctly enforces that HTTP clients have URLs and stdio clients have commands. + +### 5. Placeholder-aware type checking + +`_validate_field_types` skips values containing `${...}` since their final type depends on expansion. Without this, every port reference like `${mcp.runtime_services.qdrant_db.host_port}` would fail the `int` check. + +--- + +## Weaknesses + +### Missing required-field checks + +#### W1. `kind` is not validated as *required* — only its *value* is + +In `_validate_entry_schema`, the `kind_enum` check is: + +```python +if kind_enum is not None and "kind" in entry: +``` + +If `kind` is **missing entirely**, nothing fires — no error about the missing field, no required-field-per-kind check. A `runtime_services` entry without `kind` silently passes schema validation. + +#### W2. `transport` is not validated as required for client entries + +Same pattern: `if transport is not None` means a client entry with no `transport` field passes validation. The transport-required fields check (`if transport:`) also silently skips. A client entry like `{url: "http://..."}` with no `transport` would pass. + +#### W3. `enabled` field type is never validated + +Every bucket supports `enabled: true/false`, but no type rule checks that `enabled` is actually a boolean. Setting `enabled: "yes"` or `enabled: 42` would pass validation. None of `DEPENDENCY_TYPE_RULES`, `RUNTIME_SERVICE_TYPE_RULES`, or `CLIENT_CONFIG_TYPE_RULES` include an `enabled` check. + +--- + +### Doc/schema drift + +#### W4. 
`sse` transport is mentioned in docs but not in the schema + +`CONFIGURATION.md` says: + +> `http/sse` expect `url`; `stdio` expects `command` + +But `CLIENT_TRANSPORT_KINDS` is `{"http", "stdio"}`. There is no `sse` transport in the validator. Either the docs are wrong or the schema is incomplete. + +#### W5. `healthcheck.tcp` type is not validated + +The docs say `healthcheck.tcp` is an `int` (port number). The type rule says `("healthcheck", "dict")`. But the inner validation (`_validate_healthcheck`) only checks **allowed keys**, not the type of `tcp`. A config like `healthcheck: {tcp: "not-a-number"}` would pass. + +--- + +### Implicit coupling not validated + +#### W6. `${...}` URL references create undeclared dependencies + +Many client configs implicitly reference runtime services via `${mcp.runtime_services.X.port}` in their URLs, but the cross-reference validator only checks `depends_on` blocks. If you disable `qdrant_mcp` but keep the `qdrant` client config enabled (without `depends_on`), validation passes but the URL won't resolve at runtime. + +#### W7. `clients.` keys are not checked against the `agents` list + +`disabled_for` values are cross-checked against the top-level `agents` list (producing warnings for unknown agents), but `clients.` keys are **not** checked. You could have `clients.codex` without `codex` in the agents list and get no feedback. + +#### W8. `clients.default` is "strongly recommended" but not enforced + +The docs say `clients.default` is "optional but strongly recommended." The validator checks `must have at least one client` but doesn't warn if `default` is missing. A config with only `clients.claude` would pass silently, then break for Gemini/Codex users who don't have a client override. + +--- + +### Shallow sub-structure validation + +#### W9. No validation for `settings` sub-structure + +`settings` is allowed but completely opaque — any value type passes. 
While this is intentional (extension point), there's no opt-in mechanism for known settings schemas. For example, `qdrant_mcp.settings.collection` should be a string and `settings.embedding_provider` should be a string, but nothing checks this. + +#### W10. Mount validation doesn't check value types + +`_validate_mounts` checks that `host_path` and `container_path` **keys exist** in each mount entry, but doesn't validate their **values are strings**. A mount like `{host_path: 42, container_path: []}` would pass. + +--- + +### Design limitation + +#### W11. Dependencies can't express inter-dependency ordering + +The docs say: + +> Dependencies cannot depend on other dependencies — they are prepared in sorted order first. + +But the schema has no mechanism to enforce or express ordering between dependencies. If `dependency_B` needs `dependency_A`'s `path` to exist first, there's no way to declare that. The current sorted-order convention is implicit and fragile. + +--- + +## Summary + +| Category | Count | Items | +|:---------|------:|:------| +| **Strengths** | 5 | Declarative rules, warning/error split, cross-refs, transport-required, placeholder-aware | +| **Missing required checks** | 3 | W1 (`kind`), W2 (`transport`), W3 (`enabled` type) | +| **Doc/schema drift** | 2 | W4 (`sse` transport), W5 (`healthcheck.tcp` type) | +| **Implicit coupling not validated** | 3 | W6 (`${...}` URL refs), W7 (`clients.` vs agents), W8 (`clients.default`) | +| **Shallow sub-structure validation** | 2 | W9 (`settings` opaque), W10 (mount value types) | +| **Design limitation** | 1 | W11 (no inter-dependency ordering) | + +> **Biggest systemic gap:** `kind` and `transport` are the schema's primary discriminators, yet neither is enforced as *present*. The discriminated union pattern works when configs are correct, but fails silently when they're incomplete. 
+> +> **Potential improvement:** A "strict mode" flag that promotes warnings to errors for CI validation, and enforcing `kind`/`transport` as required fields, would close the largest gaps with minimal changes to the existing declarative structure. + +--- + +## TODO + +- [ ] **Rename `runtime_services` → `services`** across the entire repo (config keys, validators, docs, setup scripts, `depends_on.services` references). The current name is unnecessarily verbose — `services` is sufficient and consistent with `depends_on.services` already using the shorter form. +- [ ] **Rename `depends_on` → `requires`** across the entire repo. Eliminates the stutter between the `depends_on.dependencies` sub-key and the top-level `dependencies` bucket — `requires.dependencies` reads more naturally. diff --git a/docs/prompts.md b/docs/prompts.md new file mode 100644 index 00000000..4ae5b8ab --- /dev/null +++ b/docs/prompts.md @@ -0,0 +1,94 @@ +# prompts to keep saved + +## to pick up + +Read the handoff file at `memory/handoff-mcp-schema-design.md` (in the auto +memory directory). It contains the full state of an in-progress brainstorming +interview for designing MCP schema validation fixes. Your job is to resume this +interview exactly where it left off. + +Before doing anything else: +1. Read the handoff file thoroughly — it has all decisions made so far, pending + questions, file references, and checklist status +2. Read these files to rebuild context: `docs/mcp-schema-eval.md`, + `docs/schema-fix-plan.md`, `operations/mcp_validation_rules.py` +3. Invoke the **brainstorming** skill to load the process, then resume at step 2 + (clarifying questions), picking up from the question about **W6** which the + user hasn't answered yet + +Do NOT re-ask any question that already has a decision in the handoff file. Do +NOT re-explore the codebase. Just re-present the pending W6 question and +continue the interview through W7-W11. 
+ +## add as slash command later + +```md +# Persist current session state for handoff + +You are about to create a **complete context snapshot** of the current +conversation so that a fresh agent with zero context can resume this exact task +without losing any progress. + +## Where to persist + +Write to a file in the **auto memory directory** (the `memory/` directory in +Claude Code's project config, e.g. +`~/.claude/projects//memory/`). This is the ONLY location that +persists across `/clear` and new sessions. + +- **Do NOT use task lists** — they are session-scoped and die on `/clear`. +- **Do NOT pollute repo files** like CLAUDE.md or AGENTS.md. +- Name the file descriptively: `handoff-.md` +- Add or update a pointer in `MEMORY.md` under an `## Active Handoffs` section + so future agents discover it automatically. +- Add a note at the top: `> **Delete this file** once the task is complete.` + +## What to capture + +The handoff file must contain ALL of the following: + +### 1. Task overview +- What we are doing, in 2-3 sentences a stranger could understand +- Which skill/workflow/process is active (if any) and its checklist status +- The key files involved (with absolute or repo-relative paths) + +### 2. Decision log +- Every decision made during this conversation, with: + - The question that was asked + - The option chosen (and its letter/label if applicable) + - Brief rationale if one was given +- Decisions must be listed in chronological order +- Do NOT summarize or compress — each decision is its own entry + +### 3. Pending state +- The exact question or action that was in progress when this snapshot was taken +- Whether the user has answered it yet +- Any subagent results that haven't been presented to the user yet +- Any open threads, unresolved ambiguities, or "we'll come back to this" items + +### 4. 
Changes already made +- List all file edits, renames, config changes, and fixes made during this + session +- Include enough detail that the next agent won't accidentally redo or revert + them + +### 5. Handoff prompt +- Write the exact prompt the user should paste to a fresh agent to resume + seamlessly +- This prompt must tell the new agent: where to find the handoff file, what to + read first, which skill to invoke (if any), and exactly where to pick up + +## Rules + +- **Err on the side of too much detail.** The cost of redundancy is zero; the + cost of a missing decision is re-doing the entire conversation. +- **Do NOT paraphrase the user's decisions.** Quote or preserve their exact + choice. +- **Include subagent outputs verbatim** (or summarized with key facts preserved) + if they contain information the next agent will need. +- **After writing the handoff file, read it back** and verify: "Could an agent + with empty context reconstruct our full mental stack from this alone?" If not, + add what's missing. +- **Update `MEMORY.md`** — if it doesn't exist, create it. If it does, add or + update the entry under `## Active Handoffs`. Never overwrite existing content. +``` diff --git a/docs/schema-fix-plan.md b/docs/schema-fix-plan.md new file mode 100644 index 00000000..c8c5a8db --- /dev/null +++ b/docs/schema-fix-plan.md @@ -0,0 +1,202 @@ +# Schema Fix Plan + +## High Priority (Breaking/Silent Failures) + +### 1. Runtime service dependency enforcement ✅ +Add enforcement for `mcp.runtime_services..depends_on.services` during catalog resolution so services depending on disabled or missing services are skipped before `service-order.py` runs. Current behavior only checks dependency IDs, which can cause ordering to fail at runtime. + +**Additionally:** Add cycle detection for service dependencies to prevent infinite loops/deadlocks. A service depending on itself (directly or transitively) would cause startup to hang. 
Implement using DFS-based cycle detection similar to `validate_placeholder_cycles()`. + +**Example problem case:** +```yaml +mcp: + runtime_services: + service_a: + depends_on: + services: [service_b] + service_b: + depends_on: + services: [service_a] # Circular dependency! +``` + +**Implementation:** Add `validate_service_dependency_cycles()` function in `operations/validate_config.py` and call from `full_validate()`. + +--- + +### 5. ✅ MCP schema validation depth +`operations/validate_config.py` only checks that `mcp.runtime_services` and `mcp.client_configs` exist, not that entries contain required keys or valid values. Add deeper validation for MCP entries (kinds, required fields, and types) to fail fast on misconfiguration. + +**Specific validation gaps to address:** + +#### 5.1 ✅ Kind enum validation +Validate that `kind` fields contain only allowed values: +- `mcp.dependencies..kind`: Must be `git_repo` or `file` +- `mcp.runtime_services..kind`: Must be `docker_container` or `http_process` + +**Problem:** Typos like `kind: http_server` (instead of `http_process`) silently fail later during resolution. + +#### 5.2 ✅ Required field validation per kind +Each `kind` has mandatory fields that should be validated: + +**For `runtime_services`:** +- `docker_container` requires: `image`, `host_port`, `container_port` +- `http_process` requires: `port`, `command` + +**For `dependencies`:** +- `git_repo` requires: `repo_url`, `path` +- `file` requires: `path` + +**For `client_configs`:** +- All entries require: `clients` dict with at least one client +- `http` transport clients require: `url` +- `stdio` transport clients require: `command` + +**Problem:** Missing required fields cause cryptic errors during setup rather than clear validation failures.
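
The kind and required-field rules listed in 5.1 and 5.2 can be expressed as a small declarative table plus one generic check. The following is a minimal sketch only: `REQUIRED_FIELDS_BY_KIND` and `validate_entry` are hypothetical names for illustration and are not the actual constants or the `_validate_entry_schema()` engine in Bureau. Treating a missing `kind` as an error is also an assumption of this sketch.

```python
from typing import Any

# Hypothetical rule table mirroring the lists in 5.1 and 5.2; the real
# constants in mcp_validation_rules.py may be named and shaped differently.
REQUIRED_FIELDS_BY_KIND: dict[str, dict[str, tuple[str, ...]]] = {
    "runtime_services": {
        "docker_container": ("image", "host_port", "container_port"),
        "http_process": ("port", "command"),
    },
    "dependencies": {
        "git_repo": ("repo_url", "path"),
        "file": ("path",),
    },
}


def validate_entry(bucket: str, name: str, entry: dict[str, Any]) -> list[str]:
    """Return error messages for one entry under mcp.<bucket> (illustrative only)."""
    rules = REQUIRED_FIELDS_BY_KIND[bucket]
    kind = entry.get("kind")

    # Enum check: a missing, non-string, or misspelled kind is reported up front
    # (assumption: this sketch treats an absent kind as an error too).
    if not isinstance(kind, str) or kind not in rules:
        allowed = ", ".join(sorted(rules))
        return [f"mcp.{bucket}.{name}.kind: {kind!r} is not one of: {allowed}"]

    # Required-field check: one message per field missing for this kind.
    return [
        f"mcp.{bucket}.{name}: kind {kind!r} requires '{field}'"
        for field in rules[kind]
        if field not in entry
    ]
```

With a table like this, a typo such as `kind: http_server` yields a single enum error, while an `http_process` entry lacking `command` yields one message naming the missing field, so the failure surfaces at validation time instead of during setup.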
+ +#### 5.3 ✅ Field typo detection +Common typos that should be caught: +- `enabeld` instead of `enabled` +- `depends_on.servies` instead of `depends_on.services` +- `commmand` instead of `command` + +**Resolution:** All three sub-issues (5.1-5.3) were implemented in `mcp_validation_rules.py` (declarative schema constants) and `validate_config.py` (generic `_validate_entry_schema()` engine). Additionally, the following deeper validation was added: + +- **Field type validation**: Declarative `(field, type_tag)` rules in `mcp_validation_rules.py` consumed by `_check_type()` / `_validate_field_types()` in `validate_config.py`. Catches misconfigurations like `command: "uvx run server"` (should be a list) or `port: "8080"` (should be an int). Skips placeholder-bearing values (`${...}`) to avoid false positives on fields that resolve after expansion. +- **Transport enum validation**: `transport` values checked against `CLIENT_TRANSPORT_KINDS` (`http`, `stdio`). Unknown transports produce errors. +- **Sub-structure validation**: `mounts` entries checked for required `host_path`/`container_path` keys. `healthcheck` blocks checked against `HEALTHCHECK_ALLOWED_KEYS`. Unknown sub-keys produce warnings. +- **Cross-reference validation**: `depends_on.services` and `depends_on.dependencies` names verified against declared entries. Mismatches produce warnings (not errors) to handle conditional config layers. +- **Test coverage**: 76 tests in `test_mcp_validation_rules.py` (38 new across 5 test classes: `TestFieldTypeValidation`, `TestMountSubStructure`, `TestHealthcheckSubStructure`, `TestTransportEnumValidation`, `TestCrossReferenceValidation`). + +--- + +## Medium Priority (UX/Clarity) + +### 2. ✅ Per-CLI disable semantics +`render-mcp-setup.py` always falls back to `clients.default`, so you cannot disable a server for a specific CLI while keeping it for others. 
Add an explicit per-CLI disable flag (for example `clients..enabled: false`) or a `clients.: null` convention and update render logic accordingly. + +**Resolution:** Added `clients.disabled_for` list setting. Agents in this list are excluded from the server, even if a `clients.` override exists. Validated against top-level `agents` list (warnings for unknown agents). Updated `mcp_validation_rules.py`, `mcp_catalog.py`, `render-mcp-setup.py`, `validate_config.py`, and `CONFIGURATION.md`. + +--- + +### 6. ✅ Naming inconsistencies + +#### 6.1 ✅ Confusing term: `auto_approved.mcps` +The setting `auto_approved.mcps` controls whether **MCP tool calls** are auto-approved, but elsewhere `mcp.*` refers to **MCP servers/infrastructure**. This terminology clash causes confusion. + +**Current:** +```yaml +auto_approved: + mcps: false # Approval for MCP TOOL CALLS + +mcp: + client_configs: # Configuration for MCP SERVERS + qdrant: ... +``` + +**Recommendation:** Rename to `auto_approved.mcp_tools` or `auto_approved.mcp_tool_calls` for clarity. + +#### 6.2 ✅ Redundant prefix: `auto_clean_managed_mcps` +The name contains redundant qualifiers. The registry already tracks "managed" MCPs, and "auto" is implicit in a boolean flag. + +**Current:** `auto_clean_managed_mcps: true` +**Clearer alternatives:** `clean_removed_mcps` or `prune_disabled_mcps` + +**Impact:** Low risk but affects config readability. Consider for next major version or document clearly. + +**Resolution:** Clean-break rename (no backward compat) across all config, code, tests, and docs: +- `auto_approved.mcps` → `auto_approved.mcp_tools` — clarifies that this controls tool call approval, not server infrastructure +- `auto_clean_managed_mcps` → `prune_disabled_mcps` — drops redundant "auto" (implicit) and "managed" (implementation detail) + +--- + +### 7. 
✅ Default `enabled: true` behavior undocumented +The implementation uses `dep.get("enabled", True)` throughout `mcp_catalog.py`, meaning services/dependencies are **enabled by default** if the key is omitted. This is not documented in CONFIGURATION.md schema reference. + +**Problem:** Users may be surprised when adding a service definition without `enabled: false` causes it to start immediately on next `open-bureau` run. + +**Fix:** Update CONFIGURATION.md schema reference to explicitly state: +```markdown +- `enabled` (bool, default: `true`): Skip service if `false`. + **Note:** Services and dependencies are enabled by default. +``` + +**Resolution:** The schema reference (added in Issue #5) already documented `enabled (bool, default: true)` for all three buckets, but the default-true behavior wasn't called out prominently. Added an explicit "Enabled by default" callout note in the schema reference Notes section. Also removed redundant `enabled: true` from examples, which obscured the default rather than clarifying it. + +--- + +### 8. ✅ Missing placeholder escaping mechanism +Users have no way to include a literal `${...}` string in configuration values if they need it for shell commands or other purposes. The placeholder expansion is always applied. + +**Problem case:** +```yaml +mcp: + runtime_services: + my_service: + command: + - sh + - -c + - "echo ${VARIABLE}" # Expanded by Bureau, not by shell! +``` + +**Recommendation:** Document that `${...}` is always expanded by Bureau before passing to commands. If literal expansion needed, use environment variables or document workarounds. + +**Resolution:** No escape hatch needed. Bureau doesn't launch shell commands directly — it writes resolved values into CLI config files. Users who want env vars forwarded to MCP servers should use the `env` block in their client config (e.g. `MY_VAR: "${MY_VAR}"`), which is the existing pattern throughout `charter.yml`. 
Added a doc note to CONFIGURATION.md clarifying this approach. + +--- + +## Low Priority (Documentation/Polish) + +### 3. ✅ Unused `filesystem.settings.allowed_methods` +`charter.yml` defines `mcp.client_configs.filesystem.settings.allowed_methods` but `set-up-tools.sh` hardcodes `read_multiple_files`. Either wire this list into the command generation or remove the setting to avoid misleading configuration. + +**Resolution:** Already resolved during the `mcp_catalog.py` refactoring. The `_apply_allowed_methods()` function reads `settings.allowed_methods` at resolution time, strips any existing `-a` flags from the command, and rebuilds them from the config list. The hardcoded `-a read_multiple_files` in `charter.yml` is just a default template. `set-up-tools.sh` no longer exists. Test: `test_filesystem_allowed_methods_override_command`. + +--- + +### 4. Documentation file location mismatch +`docs/CONFIGURATION.md` states the `mcp` block lives in `directives.yml`, but defaults are in `charter.yml` with overrides in `directives.yml` and `local.yml`. Update the docs to reflect actual precedence and source files. + +**Additionally:** The schema reference section is very dense (150+ lines). Consider splitting into: +- Quick Start (common patterns, examples) +- Schema Reference (complete field listing) +- Advanced Topics (placeholder expansion, dependency resolution, troubleshooting) + +--- + +### 9. Missing extension guides ⭐ +Add to CONFIGURATION.md: +- **"Adding a Custom MCP Server"** section with step-by-step template +- **"Troubleshooting"** section covering common issues: + - "Server not appearing" → Check `enabled`, `requires_env`, `depends_on` + - "Placeholder not expanding" → Check syntax, circular references + - "Validation errors" → Common typos and fixes + +--- + +### 10. 
Inline examples in charter.yml +Add commented-out example blocks showing how to extend the MCP catalog: + +```yaml +# Example: Add your own HTTP MCP server +# mcp: +# client_configs: +# my_server: +# enabled: true +# clients: +# default: +# transport: http +# url: "http://localhost:9000/mcp/" +# +# Example: Add your own stdio MCP server +# mcp: +# client_configs: +# my_stdio_server: +# enabled: true +# clients: +# default: +# transport: stdio +# command: +# - npx +# - -y +# - my-mcp-package +``` diff --git a/lib/__init__.py b/lib/__init__.py deleted file mode 100644 index 0617ed5d..00000000 --- a/lib/__init__.py +++ /dev/null @@ -1 +0,0 @@ -# Bureau library package diff --git a/operations/__init__.py b/operations/__init__.py index 269b4498..21014a90 100644 --- a/operations/__init__.py +++ b/operations/__init__.py @@ -1 +1,12 @@ -# Operations: Bureau's reusable utilities +"""Operations: Bureau's internal package providing reusable utilities.""" + +from .config_templating import expand_placeholders +from .mcp_catalog import resolve_mcp_catalog +from .skills_catalog import resolve_skills_catalog + +# define and expose package's API (for use within Bureau) +__all__ = [ + "expand_placeholders", + "resolve_mcp_catalog", + "resolve_skills_catalog", +] diff --git a/operations/cleanup/README.md b/operations/cleanup/README.md index 488c5249..81671360 100644 --- a/operations/cleanup/README.md +++ b/operations/cleanup/README.md @@ -72,7 +72,7 @@ uv run sweep --wipe claude-mem ## Configuration -Retention periods and cleanup behavior are configured in `directives.yml` (or `local.yml` for personal overrides): +Retention periods and cleanup behavior are configured in `defaults.yml` (or `.bureau.yml` / `local.yml` for overrides): ```yaml retention_period_for: @@ -172,9 +172,10 @@ Each handler is implemented corresponding to its memory storage backend's underl - **Storage model:** - - The backing Qdrant DB service is locally run as a Docker container on the port specified by the 
`port_for.qdrant_db` config setting (default: `8780`) - - Memories are stored to it in the collection specified by the `qdrant.collection` setting (default: `coding-memory`) - - The backing Qdrant DB is persisted to the directory specified by the `path_to.storage_for.qdrant` setting (default: `~/.qdrant/storage/`) + - The backing Qdrant DB runs via `mcp.runtime_services.qdrant_db` on `host_port` (default: `8780`). + - The cleanup handler reads the collection name from `mcp.runtime_services.qdrant_mcp.settings.collection` (default: `coding-memory`). + - The DB persists data under `mcp.runtime_services.qdrant_db.mounts[*].host_path` (default: `~/.qdrant/storage/`). + - The handler connects using `mcp.runtime_services.qdrant_mcp.env.QDRANT_URL` (default: `http://127.0.0.1:${mcp.runtime_services.qdrant_db.host_port}`). - **Implementation:** diff --git a/operations/cleanup/handlers/claude_mem.py b/operations/cleanup/handlers/claude_mem.py index 4b7879cd..66251f86 100644 --- a/operations/cleanup/handlers/claude_mem.py +++ b/operations/cleanup/handlers/claude_mem.py @@ -1,4 +1,5 @@ """Claude-mem SQLite cleanup handler.""" +import logging import json import sqlite3 from datetime import datetime, timezone @@ -8,6 +9,8 @@ from ..trash import get_trash_dir, generate_trash_filename, write_manifest from ...config_loader import get_storage, get_trash_grace_period +logger = logging.getLogger(__name__) + class ClaudeMemHandler(CleanupHandler): """Cleanup handler for claude-mem SQLite database.""" @@ -23,6 +26,9 @@ def _table_name_for_entity_type(self, entity_type : str): def _get_db_connection(self) -> sqlite3.Connection | None: """Get SQLite connection if database exists.""" db_path = get_storage("claude_mem") + if not db_path: + logger.warning(f"Storage path not configured for {self.name}; skipping cleanup.") + return None return sqlite3.connect(db_path) if db_path.exists() else None def get_stale_items(self, cutoff: datetime) -> list[dict[str, Any]]: diff --git 
a/operations/cleanup/handlers/memory_mcp.py b/operations/cleanup/handlers/memory_mcp.py index d2cf47c8..20f90243 100644 --- a/operations/cleanup/handlers/memory_mcp.py +++ b/operations/cleanup/handlers/memory_mcp.py @@ -1,4 +1,5 @@ """Memory MCP JSONL cleanup handler.""" +import logging import json from datetime import datetime, timezone from pathlib import Path @@ -8,15 +9,21 @@ from ..trash import get_trash_dir, generate_trash_filename, write_manifest from ...config_loader import get_storage, get_trash_grace_period +logger = logging.getLogger(__name__) + class MemoryMcpHandler(CleanupHandler): """Cleanup handler for Memory MCP's JSONL file.""" name = "memory-mcp" - def _get_file_path(self) -> Path: + def _get_file_path(self) -> Path | None: """Get the Memory MCP JSONL file path.""" - return get_storage("memory_mcp") + file_path = get_storage("memory_mcp") + if not file_path: + logger.warning("Memory MCP storage path not configured; skipping cleanup.") + return None + return file_path def _read_entities(self) -> list[dict[str, Any]]: """Read all entities from JSONL file. @@ -25,7 +32,10 @@ def _read_entities(self) -> list[dict[str, Any]]: CleanupError: On file I/O errors. """ file_path = self._get_file_path() - if not file_path.exists(): + if not (file_path and file_path.exists()): + # return empty list if either: + # - storage path isn't configured in YMLs + # - file doesn't exist yet return [] try: @@ -49,6 +59,8 @@ def _write_entities(self, entities: list[dict[str, Any]]) -> None: CleanupError: On file I/O errors. """ file_path = self._get_file_path() + if not file_path: + raise CleanupError("Memory MCP storage path not configured") try: file_path.parent.mkdir(parents=True, exist_ok=True) diff --git a/operations/config_cli.py b/operations/config_cli.py index c58735af..7e1ac988 100644 --- a/operations/config_cli.py +++ b/operations/config_cli.py @@ -1,13 +1,13 @@ #!/usr/bin/env -S uv run """CLI tool for shell scripts to read Bureau configuration. -1. 
Merges configs using the order: charter.yml → directives.yml → local.yml → env) +1. Merges configs using the order: defaults.yml → .bureau.yml → local.yml → env) 2. Reads from merged config Usage: get-config agents # Output: claude gemini codex opencode get-config retention_period_for.qdrant # Output: 180d - get-config path_to.qdrant_url # Output: http://127.0.0.1:8780 + get-config mcp.client_configs.memory.storage_path # Output: ~/.memory-mcp/memory.jsonl get-config --check agent claude # Exit 0 if enabled, 1 if not get-config --list agents # List all enabled agents """ diff --git a/operations/config_loader.py b/operations/config_loader.py index ae4c4cfd..68c65b44 100644 --- a/operations/config_loader.py +++ b/operations/config_loader.py @@ -1,26 +1,32 @@ """Bureau configuration loader providing type-safe access to all Bureau settings. 1. Merges configuration from YAML files with the following precedence hierarchy - (later sources override earlier ones): + (later sources override earlier ones): - a. charter.yml: Fixed system config (cloud endpoints, disabled tools) - b. directives.yml: Team defaults (agents, retention, paths) - c. local.yml: Local overrides (gitignored) - d. env vars: Highest-priority overrides + a. defaults.yml: Package defaults (all git-tracked settings) + b. .bureau.yml: Project config (optional, discovered by CWD walk-up) + c. local.yml: Personal overrides (gitignored) + d. env vars: Highest-priority runtime overrides 2. 
Loads configuration """ + import os import re from datetime import timedelta from functools import lru_cache from pathlib import Path -from typing import Any, TypedDict, cast +from typing import Any, TypeAlias, TypedDict, cast import yaml +# runtime invariants are documented in docs/CONFIGURATION.md and are not fully +# enforceable via TypedDict alone; several schemas below are discriminator-based +# (for example, kind/transport) and rely on runtime validation conventions + + # TypedDict schemas corresponding to nested YAML config sections class RetentionPeriodForConfig(TypedDict): claude_mem: str @@ -42,53 +48,131 @@ class StartupTimeoutForConfig(TypedDict): docker_daemon: int -class PortForConfig(TypedDict): - qdrant_db: int - qdrant_mcp: int - sourcegraph_mcp: int - semgrep_mcp: int - serena_mcp: int - - -class StorageForConfig(TypedDict, total=False): - qdrant: str - memory_mcp: str - claude_mem: str - - class PathToConfig(TypedDict, total=False): workspace: str serena_memories_root: str - fs_mcp_whitelist: str mcp_clones: str - storage_for: StorageForConfig -class QdrantConfig(TypedDict, total=False): - collection: str - embedding_provider: str +class AgentSourceConfig(TypedDict): + path: str + cli: list[str] + + +class NativeAgentsConfig(TypedDict, total=False): + enabled: list[str] | str # List of agent names or "all" + disabled: list[str] + sources: list[AgentSourceConfig] + + +class MCPDependencyConfig(TypedDict, total=False): + """Config for an entry in mcp.dependencies.""" + + enabled: bool + kind: str + repo_url: str + branch: str + path: str + post_clone: list[list[str]] + + +class MCPDependsOnConfig(TypedDict, total=False): + """Shared depends_on schema for services and client configs.""" + + services: list[str] + dependencies: list[str] + + +class MCPHealthcheckConfig(TypedDict, total=False): + """Runtime service healthcheck settings.""" + + tcp: int | str -class EndpointForConfig(TypedDict): - sourcegraph: str - context7: str - tavily: str +class 
MCPMountConfig(TypedDict): + """Docker mount entry.""" + host_path: str + container_path: str + +class MCPRuntimeServiceConfig(TypedDict, total=False): + """Config for an entry in mcp.runtime_services.""" + + enabled: bool + kind: str + depends_on: MCPDependsOnConfig + healthcheck: MCPHealthcheckConfig + env: dict[str, str] + command: list[str] + settings: dict[str, Any] + # docker_container fields + container_name: str + image: str + host_port: int | str + container_port: int | str + mounts: list[MCPMountConfig] + # http_process fields + port: int | str + + +class MCPPostConfig(TypedDict, total=False): + """CLI-specific side effects for a client config.""" + + claude_settings_env: dict[str, str] + + +class MCPClientTransportConfig(TypedDict, total=False): + """A single clients. transport configuration.""" + + transport: str + url: str + headers: dict[str, str] + command: list[str] + env: dict[str, str] + timeout_ms: int + startup_timeout_sec: int + tool_timeout_sec: int + post_config: MCPPostConfig + + +# allows mixed `clients` values: either a "default" client object +# OR the "disabled_for" list +MCPClientEntry: TypeAlias = MCPClientTransportConfig | list[str] + + +class MCPClientConfig(TypedDict, total=False): + """Config for an entry in mcp.client_configs.""" + + enabled: bool + requires_env: list[str] + depends_on: MCPDependsOnConfig + clients: dict[str, MCPClientEntry] + settings: dict[str, Any] + storage_path: str + + +class MCPConfig(TypedDict, total=False): + """Top-level mcp configuration.""" + + dependencies: dict[str, MCPDependencyConfig] + runtime_services: dict[str, MCPRuntimeServiceConfig] + client_configs: dict[str, MCPClientConfig] + +# root-level config-modeling object class Config(TypedDict, total=False): agents: list[str] retention_period_for: RetentionPeriodForConfig trash: TrashConfig cleanup: CleanupConfig startup_timeout_for: StartupTimeoutForConfig - port_for: PortForConfig path_to: PathToConfig - qdrant: QdrantConfig - endpoint_for: 
EndpointForConfig + roles: NativeAgentsConfig + mcp: MCPConfig def find_repo_root(start_path: Path | None = None) -> Path: - """Find the repository root by looking for directives.yml or .git directory. + """Find the repository root by looking for defaults.yml or .git directory. Args: start_path: Starting directory for search. Defaults to cwd. @@ -104,18 +188,46 @@ def find_repo_root(start_path: Path | None = None) -> Path: current = start_path.resolve() - while current != current.parent: - if (current / "directives.yml").exists() or (current / ".git").exists(): + while True: + if (current / "defaults.yml").exists() or (current / ".git").exists(): return current + elif current == current.parent: + # reached root dir without finding repo root + raise FileNotFoundError( + f"Could not find repository root (defaults.yml or .git) starting from {start_path}" + ) current = current.parent - # Check root directory - if (current / "directives.yml").exists() or (current / ".git").exists(): - return current - raise FileNotFoundError( - f"Could not find repository root (directives.yml or .git) starting from {start_path}" - ) +def find_project_config() -> Path | None: + """ + Find .bureau.yml by walking up from cwd. + + Searches from the current working directory upward, stopping at + filesystem root. Returns None if no .bureau.yml is found. + + Does NOT search inside the Bureau repo itself + (to avoid confusion if Bureau DOES ever happen to have a .bureau.yml). 
+ """ + try: + repo_root = find_repo_root().resolve() + except FileNotFoundError: + repo_root = None + + current = Path.cwd().resolve() + + while True: + candidate = current / ".bureau.yml" + + # skip the Bureau repo root itself to avoid self-referencing + if (repo_root and current == repo_root) or (not candidate.exists()): + if current == current.parent: + # reached root dir without finding a .bureau.yml + return None + current = current.parent + continue + + return candidate def get_main_repo_root() -> Path: @@ -159,28 +271,33 @@ def get_main_repo_root() -> Path: def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]: - """Deep merge two dicts, with the `override` dict taking precedence. + """ + Recursively deep-merge two dicts with `override` taking precedence. + + Nested values are merged ONLY when both are dictionaries; + otherwise the value from `override` replaces `base`. Args: - base: Base dictionary. - override: Dictionary with override values. + base: Base dictionary + override: Dictionary with override values Returns: Merged dictionary. """ result = base.copy() - for key, value in override.items(): - if key in result and isinstance(result[key], dict) and isinstance(value, dict): - result[key] = deep_merge(result[key], value) + for key, override_value in override.items(): + base_value = result.get(key) + if isinstance(base_value, dict) and isinstance(override_value, dict): + result[key] = deep_merge(base_value, override_value) else: - result[key] = value + result[key] = override_value return result def expand_path(path_str: str) -> Path: - """Expand ~ and environment variables in path string.""" + """Expand `~` and environment variables in path string.""" expanded = os.path.expandvars(os.path.expanduser(path_str)) return Path(expanded) @@ -197,36 +314,46 @@ def _load_yaml_file(path: Path) -> dict[str, Any]: def get_config() -> Config: """Load and merge configs, following this resolution order: - 1. 
charter.yml (base defaults, required) - 2. directives.yml (team config, if exists) - 3. local.yml (local overrides, if exists) + 1. defaults.yml (package defaults, required) + 2. .bureau.yml (project config, optional — discovered by CWD walk-up) + 3. local.yml (personal overrides, optional — gitignored) 4. Environment variables (highest priority) - - For testing: - 1. monkeypatch find_repo_root() to return the temp testing directory path - 2. call clear_config_cache() to clear cache - 3. call get_config() to do a fresh config read, retrieving the test-oriented config - - monkeypatch.setattr("operations.config_loader.find_repo_root", lambda: tmp_path) - clear_config_cache() - config = get_config() - Settings specified at paths LATER in the list OVERRIDE IDENTICAL SETTINGS at paths EARLIER in the list. - > e.g. `mcp.auto_approve: yes` in local.yml overrides `mcp.auto_approve: no` in directives.yml + Settings specified at layers LATER in the list OVERRIDE IDENTICAL SETTINGS + at layers EARLIER in the list. + > e.g. `auto_approved.mcp_tools: true` in local.yml overrides the default in defaults.yml Returns: Merged configuration dictionary. Raises: FileNotFoundError: If repo root cannot be found. + + NOTE: when testing this function: + 1. monkeypatch find_repo_root() to return the temp testing directory path + 2. call clear_config_cache() to clear cache + 3. call get_config() to do a fresh config read, retrieving the test-oriented config + + Example of this testing approach: + + monkeypatch.setattr("operations.config_loader.find_repo_root", lambda: tmp_path) + clear_config_cache() + config = get_config() """ repo_root = find_repo_root() config: dict[str, Any] = {} - # Load configs in precedence order (later overrides earlier) - for filename in ["charter.yml", "directives.yml", "local.yml"]: - config = deep_merge(config, _load_yaml_file(repo_root / filename)) + # 1. 
Package defaults (required) + config = deep_merge(config, _load_yaml_file(repo_root / "defaults.yml")) + + # 2. Project config (optional, discovered by CWD walk-up) + project_config_path = find_project_config() + if project_config_path: + config = deep_merge(config, _load_yaml_file(project_config_path)) + + # 3. Personal overrides (optional, gitignored) + config = deep_merge(config, _load_yaml_file(repo_root / "local.yml")) # Apply environment variable overrides for path_to path_to = config.get("path_to", {}) @@ -238,48 +365,19 @@ def get_config() -> Config: if env_val := os.environ.get(env_var): path_to[path_key] = env_val - # Apply environment variable overrides for path_to.storage_for - storage_for = path_to.get("storage_for", {}) - storage_env_overrides = { - "memory_mcp": "MEMORY_MCP_STORAGE_PATH", - "claude_mem": "CLAUDE_MEM_STORAGE_PATH", - "qdrant": "QDRANT_STORAGE_PATH", - } - - for storage_key, env_var in storage_env_overrides.items(): - if env_val := os.environ.get(env_var): - storage_for[storage_key] = env_val - # Derive paths from workspace if not explicitly set if workspace := path_to.get("workspace"): # Only set these if not already configured if "serena_memories_root" not in path_to: path_to["serena_memories_root"] = workspace - if "fs_mcp_whitelist" not in path_to: - path_to["fs_mcp_whitelist"] = workspace # Resolve mcp_clones: relative paths are resolved from main repo root (shared across worktrees) if mcp_clones := path_to.get("mcp_clones"): if not mcp_clones.startswith("/") and not mcp_clones.startswith("~"): path_to["mcp_clones"] = str(get_main_repo_root() / mcp_clones) - # Derive qdrant_url if not provided: use port_for.qdrant_db - if "qdrant_url" not in path_to: - ports_cfg = config.get("port_for", {}) - port = ports_cfg.get("qdrant_db", 8780) - path_to["qdrant_url"] = f"http://127.0.0.1:{port}" - - path_to["storage_for"] = storage_for config["path_to"] = path_to - # Merge Qdrant section defaults if missing - qdrant_cfg = config.get("qdrant", 
{}) - if "collection" not in qdrant_cfg: - qdrant_cfg["collection"] = "coding-memory" - if "embedding_provider" not in qdrant_cfg: - qdrant_cfg["embedding_provider"] = "fastembed" - config["qdrant"] = qdrant_cfg - return config # type: ignore[return-value] @@ -338,7 +436,7 @@ def get_path(path_name: str) -> Path: """Get a configured file path, expanded. Args: - path_name: Path key (serena_memories_root, fs_mcp_whitelist, mcp_clones). + path_name: Path key (serena_memories_root, mcp_clones). Returns: Expanded Path object. @@ -348,18 +446,62 @@ def get_path(path_name: str) -> Path: return expand_path(path_str) if path_str else Path() -def get_storage(storage_name: str) -> Path: +def get_mcp_dependency(name: str) -> MCPDependencyConfig | None: + """Get MCP dependency config by name.""" + mcp_config = get_config().get("mcp") + if not mcp_config: + return None + dependencies = mcp_config.get("dependencies") + if not dependencies: + return None + return dependencies.get(name) + + +def get_mcp_service(name: str) -> MCPRuntimeServiceConfig | None: + """Get MCP runtime service config by name.""" + mcp_config = get_config().get("mcp") + if not mcp_config: + return None + runtime_services = mcp_config.get("runtime_services") + if not runtime_services: + return None + return runtime_services.get(name) + + +def get_mcp_server(name: str) -> MCPClientConfig | None: + """Get MCP client config by name.""" + mcp_config = get_config().get("mcp") + if not mcp_config: + return None + client_configs = mcp_config.get("client_configs") + if not client_configs: + return None + return client_configs.get(name) + + +def get_storage(storage_name: str) -> Path | None: """Get a configured storage path, expanded. Args: - storage_name: Storage key (qdrant, memory_mcp, claude_mem). + storage_name: Storage key (memory_mcp, claude_mem). Returns: - Expanded Path object. + Expanded Path object or None if not configured. 
""" - config = get_config() - path_str = cast(str, config.get("path_to", {}).get("storage_for", {}).get(storage_name, "")) - return expand_path(path_str) if path_str else Path() + path_str = "" + if storage_name == "memory_mcp": + server = get_mcp_server("memory") or {} + path_str = cast(str, server.get("storage_path", "")) + elif storage_name == "claude_mem": + # Try dependency first, then fall back to service for backwards compatibility + dependency = get_mcp_dependency("claude_mem_storage") + if dependency: + path_str = cast(str, dependency.get("path", "")) + else: + service = get_mcp_service("claude_mem_storage") or {} + path_str = cast(str, service.get("path", "")) + + return expand_path(path_str) if path_str else None # Path constants (computed from config) @@ -388,14 +530,26 @@ def get_trash_dir() -> Path: def get_qdrant_url() -> str: """Get Qdrant server URL.""" - config = get_config() - return cast(str, config.get("path_to", {}).get("qdrant_url", "http://127.0.0.1:8780")) + from .config_templating import expand_placeholders + + service = get_mcp_service("qdrant_mcp") or {} + env = service.get("env", {}) if isinstance(service, dict) else {} + url = env.get("QDRANT_URL") if isinstance(env, dict) else "" + if not url: + return "" + return expand_placeholders(str(url), get_config(), os.environ) def get_qdrant_collection() -> str: """Get Qdrant collection name.""" - config = get_config() - return config.get("qdrant", {}).get("collection", "coding-memory") + from .config_templating import expand_placeholders + + service = get_mcp_service("qdrant_mcp") or {} + settings = service.get("settings", {}) if isinstance(service, dict) else {} + collection = settings.get("collection") if isinstance(settings, dict) else "" + if not collection: + return "" + return expand_placeholders(str(collection), get_config(), os.environ) # Duration parsing (moved from cleanup/config.py) diff --git a/operations/config_templating.py b/operations/config_templating.py new file mode 100644 
index 00000000..5714d64a --- /dev/null +++ b/operations/config_templating.py @@ -0,0 +1,69 @@ +""" +Utilities for resolving and expanding placeholders in configuration strings +using values from: +- env vars +- a config dictionary +""" + +from __future__ import annotations + +import os +import re +from typing import Any, Mapping + +# regex used to extract names of value placeholders (formatted to be within `${...}`) +# note `[^}]+` (match one or more non-`}` chars) is used in the capturing group over `.*` since there may +# be multiple value placeholders in a config string; `.*` would match up to the last `}` in the line, +# capturing the entire substring starting from the beginning of the first placeholder to the end of the last, +# including everything in between +_PLACEHOLDER_REGEX = re.compile(r"\$\{([^}]+)\}") + +# retrieve value stored in config dict (i.e. as formed by merging YML configs) +def _get_config_value(config: Mapping[str, Any], path_to_key: str) -> str | None: + parts = path_to_key.split(".") + + # safely traverse config tree towards key's location + current: Any = config + for part in parts: + if not isinstance(current, dict) or part not in current: + return None + current = current[part] + return None if current is None else str(current) + + +def expand_placeholders( + value: str, config: Mapping[str, Any], env: Mapping[str, str] | None = None +) -> str: + """ + Expands `${...}` placeholders in a string using environment variables and config values. + + Args: + value: The string containing placeholders to expand. + config: Mapping used to resolve placeholders (via dot-notation) if they are not found + in the environment. + env: A mapping of environment variables to check first. + If None, defaults to os.environ. + + Returns: + The string with all resolvable placeholders replaced by their actual values.
+ """ + if env is None: + env = os.environ + + def repl(match: re.Match[str]) -> str: + key = match.group(1) + if key in env: + return env[key] + cfg_val = _get_config_value(config, key) + return cfg_val if cfg_val is not None else match.group(0) + + expanded = value + seen = {expanded} + while True: + # config validation is responsible for catching recursive placeholders (or cycles thereof) + # that would cause infinite looping here (e.g. `val="...${val}..."`) + expanded = _PLACEHOLDER_REGEX.sub(repl, expanded) + if expanded in seen: + break + seen.add(expanded) + return expanded diff --git a/operations/json_config_utils.py b/operations/json_config_utils.py index f39674fe..f5513b85 100755 --- a/operations/json_config_utils.py +++ b/operations/json_config_utils.py @@ -71,13 +71,45 @@ def save_json_config(path: str, config: dict, indent: int = 2) -> None: path: Path to the JSON file (supports ~ expansion) config: Configuration dictionary to save indent: JSON indentation level (default: 2) + + Raises: + SystemExit: If the config dict cannot be serialized as JSON + (i.e. is not a plain dict with serializable data) """ config_path = Path(path).expanduser() ensure_parent_dir(config_path) - with open(config_path, 'w', encoding='utf-8') as f: - json.dump(config, f, indent=indent) - f.write('\n') # Add trailing newline + try: + with open(config_path, 'w', encoding='utf-8') as f: + json.dump(config, f, indent=indent) + f.write('\n') # Add trailing newline + except (TypeError, ValueError) as exc: + raise SystemExit(f"Failed to serialize JSON for {path}: {exc}") from exc + + +def load_json_text(text: str, source: str | None = None) -> dict: + """ + Parse JSON from a string with consistent error handling. 
+ + Args: + text: JSON string to parse + source: Optional description of the source (used in error messages) + + Returns: + Parsed JSON dictionary + + Raises: + SystemExit: If JSON is invalid + """ + try: + result = json.loads(text) + except json.JSONDecodeError as exc: + label = source or "JSON" + raise SystemExit(f"Failed to parse {label}: {exc}") from exc + if not isinstance(result, dict): + label = source or "JSON" + raise SystemExit(f"{label} must be a JSON object") + return result def expand_vars(value: str) -> str: diff --git a/operations/mcp_validation_rules.py b/operations/mcp_validation_rules.py new file mode 100644 index 00000000..98d78743 --- /dev/null +++ b/operations/mcp_validation_rules.py @@ -0,0 +1,117 @@ +"""MCP schema constants for deep validation. + +- All validation rules are declared here as data +- The validation engine in validate_config.py consumes these constants, using them + as rules when checking the MCP-related settings in Bureau's YML configs. +""" + +from __future__ import annotations + +# ── Kind enums ────────────────────────────────────────────────────── + +# Permitted values for mcp.dependencies.*.kind +DEPENDENCY_KINDS: set[str] = {"git_repo", "file"} +# Permitted values for mcp.runtime_services.*.kind +RUNTIME_SERVICE_KINDS: set[str] = {"docker_container", "http_process"} + +# ── Required fields per kind ──────────────────────────────────────── + +# Required fields for each mcp.dependencies kind +# i.e. if mcp.dependencies.*.kind = x, these fields must also be included +DEPENDENCY_REQUIRED: dict[str, set[str]] = { + "git_repo": {"repo_url", "path"}, + "file": {"path"}, +} + +# Required fields for each mcp.runtime_services kind +# i.e. 
if mcp.runtime_services.*.kind = x, these fields must also be included +RUNTIME_SERVICE_REQUIRED: dict[str, set[str]] = { + "docker_container": {"image", "host_port", "container_port"}, + "http_process": {"port", "command"}, +} + +# Fields that must be present for each client_config transport type (http, stdio) +CLIENT_TRANSPORT_REQUIRED: dict[str, set[str]] = { + "http": {"url"}, + "stdio": {"command"}, +} + +# ── Allowed keys per bucket (for typo detection) ─────────────────── +# - These are the KNOWN keys +# - *Unknown* keys produce WARNINGS, NOT ERRORS, so that user extension keys +# (e.g. qdrant_mcp.settings.collection) don't break. + +# Recognized keys for mcp.dependencies.* entries +DEPENDENCY_ALLOWED_KEYS: set[str] = { + "enabled", "kind", "repo_url", "branch", "path", "post_clone", +} + +# Recognized keys for mcp.runtime_services.* entries +RUNTIME_SERVICE_ALLOWED_KEYS: set[str] = { + "enabled", "kind", "depends_on", "healthcheck", "command", + "env", "settings", "port", "container_name", "image", + "host_port", "container_port", "mounts", +} + +# Recognized top-level keys for mcp.client_configs.* entries +CLIENT_CONFIG_ALLOWED_KEYS: set[str] = { + "enabled", "requires_env", "depends_on", "clients", + "settings", "storage_path", +} + +# Recognized keys for individual client entries inside clients.* +CLIENT_ENTRY_ALLOWED_KEYS: set[str] = { + "transport", "url", "headers", "command", "env", + "post_config", "timeout_ms", "startup_timeout_sec", + "tool_timeout_sec", "args", +} + +# Recognized sub-keys inside any depends_on block +DEPENDS_ON_ALLOWED_KEYS: set[str] = {"services", "dependencies"} + +# Reserved keys inside clients.* that are metadata, not client config entries +CLIENTS_RESERVED_KEYS: set[str] = {"disabled_for"} + +# ── Transport enum ───────────────────────────────────────────────── + +# Valid values for client entry transport field +CLIENT_TRANSPORT_KINDS: set[str] = {"http", "stdio"} + +# ── Field type rules 
─────────────────────────────────────────────── +# Declarative (field_name, type_tag) tuples consumed by _validate_field_types(). +# Type tags: "int", "dict", "dict[str,str]", "list[str]", "list[dict]", "list[list[str]]" + +# Type rules for mcp.dependencies.* entries +DEPENDENCY_TYPE_RULES: list[tuple[str, str]] = [ + ("post_clone", "list[list[str]]"), +] + +# Type rules for mcp.runtime_services.* entries +RUNTIME_SERVICE_TYPE_RULES: list[tuple[str, str]] = [ + ("command", "list[str]"), + ("port", "int"), + ("host_port", "int"), + ("container_port", "int"), + ("env", "dict[str,str]"), + ("mounts", "list[dict]"), + ("healthcheck", "dict"), +] + +# Type rules for mcp.client_configs.* top-level entries +CLIENT_CONFIG_TYPE_RULES: list[tuple[str, str]] = [ + ("requires_env", "list[str]"), +] + +# Type rules for individual client entries inside clients.* +CLIENT_ENTRY_TYPE_RULES: list[tuple[str, str]] = [ + ("command", "list[str]"), + ("env", "dict[str,str]"), +] + +# ── Sub-structure key sets ───────────────────────────────────────── + +# Recognized sub-keys inside healthcheck blocks +HEALTHCHECK_ALLOWED_KEYS: set[str] = {"tcp"} + +# Required keys inside mounts list entries +MOUNT_REQUIRED_KEYS: set[str] = {"host_path", "container_path"} diff --git a/operations/validate_config.py b/operations/validate_config.py index 25b961ba..35842b22 100644 --- a/operations/validate_config.py +++ b/operations/validate_config.py @@ -1,10 +1,14 @@ """Config validator for Bureau cleanup module. Validates that all required configuration fields are present before -cleanup operations run. This prevents silent failures from missing keys. +cleanup operations run. This prevents silent failures due to missing keys +or unsupported values. 
""" import sys -from typing import Any, Mapping +from dataclasses import dataclass, field +from typing import Any, Literal, Mapping, overload + +from .config_templating import _PLACEHOLDER_REGEX class ConfigurationError(Exception): @@ -12,6 +16,13 @@ class ConfigurationError(Exception): pass +@dataclass +class ValidationResult: + """Validation output with errors (hard failures) and warnings (soft).""" + errors: list[str] = field(default_factory=list) + warnings: list[str] = field(default_factory=list) + + # Required schema for cleanup operations REQUIRED_SCHEMA: dict[str, Any] = { "agents": list, # At least one agent enabled @@ -34,12 +45,9 @@ class ConfigurationError(Exception): "mcp_servers": int, # Seconds to wait for MCP servers "docker_daemon": int, # Seconds to wait for Docker daemon }, - "port_for": { - "qdrant_db": int, - "qdrant_mcp": int, - "sourcegraph_mcp": int, - "semgrep_mcp": int, - "serena_mcp": int, + "mcp": { + "runtime_services": {}, + "client_configs": {}, }, } @@ -99,7 +107,7 @@ def _validate_node(config: Mapping[str, Any], schema: dict, path: str = "") -> l return errors -def validate_config(config: Mapping[str, Any]) -> list[str]: +def validate_config_schema(config: Mapping[str, Any]) -> list[str]: """Validate config dict against required schema. Args: @@ -123,7 +131,7 @@ def validate_and_raise(config: Mapping[str, Any]) -> None: Raises: ConfigurationError: If configuration is invalid. """ - errors = validate_config(config) + errors = validate_config_schema(config) if errors: error_msg = "Configuration validation failed:\n - " + "\n - ".join(errors) raise ConfigurationError(error_msg) @@ -132,21 +140,22 @@ def validate_and_raise(config: Mapping[str, Any]) -> None: def validate_duration_format(duration: str) -> str | None: """Validate a duration string format. + Delegates to parse_duration() to ensure validation and parsing + always agree on what constitutes a valid duration. + Args: duration: Duration string to validate. 
Returns: Error message if invalid, None if valid. """ - import re + from .config_loader import parse_duration - if duration.lower() == "always": + try: + parse_duration(duration) return None - - if not re.match(r"^\d+[hdwmy]$", duration.lower()): - return f"Invalid duration format: '{duration}'. Use format like '24h', '30d', '2w', '3m', '1y', or 'always'" - - return None + except ValueError as e: + return str(e) def _check_durations(section: Mapping[str, Any], section_name: str, *keys: str) -> list[str]: @@ -195,22 +204,119 @@ def validate_durations(config: Mapping[str, Any]) -> list[str]: return errors -def full_validate(config: Mapping[str, Any]) -> list[str]: +def _collect_placeholder_refs( + node: Any, path: str, graph: dict[str, set[str]] +) -> None: + """ + Recursively collect placeholder references (i.e. `${...}` segments in config strings + that are meant to be replaced by env var values) into a dependency graph. + """ + if isinstance(node, str): + if refs := set(_PLACEHOLDER_REGEX.findall(node)): + graph[path] = refs + elif isinstance(node, dict): + for k, v in node.items(): + _collect_placeholder_refs(v, f"{path}.{k}" if path else k, graph) + elif isinstance(node, list): + for i, v in enumerate(node): + _collect_placeholder_refs(v, f"{path}[{i}]", graph) + + +def _find_graph_cycles(graph: dict[str, set[str]]) -> list[str]: + """ + Detects cycles via DFS in the placeholder dependency graph, returns formatted cycle paths. + + Used as part of placeholder validation to ensure no cycles exist that would cause + infinite placeholder expansion (e.g. 
val="...${val}...") + """ + visited, in_stack, stack, cycles = set(), set(), [], [] + + def dfs(node: str) -> None: + visited.add(node) + in_stack.add(node) + stack.append(node) + + for neighbor in graph.get(node, ()): + if neighbor in in_stack: + cycles.append(" → ".join(stack[stack.index(neighbor) :] + [neighbor])) + elif neighbor not in visited: + dfs(neighbor) + + stack.pop() + in_stack.discard(node) + + for node in graph: + if node not in visited: + dfs(node) + + return cycles + + +def validate_placeholder_cycles(config: Mapping[str, Any]) -> list[str]: + """ + Validate that placeholder references in the YML configs' setting strings don't form cycles, which + would cause infinite expansion + + Cycles cause infinite expansion (e.g., val="...${val}..." would cause `val` to keep getting expanded + forever). + """ + graph: dict[str, set[str]] = {} + _collect_placeholder_refs(config, "", graph) + return [f"Circular placeholder reference: {c}" for c in _find_graph_cycles(graph)] + + + + +def validate_mcp_rules(config: Mapping[str, Any]) -> ValidationResult: + """Validate MCP entry schemas: kinds, required fields, unknown keys, types, cross-references. + + Orchestrates the bucket-specific validators plus cross-reference checks + and merges their results. + """ + result = ValidationResult() + for validator in ( + _validate_mcp_dependencies, + _validate_mcp_runtime_services, + _validate_mcp_client_configs, + _validate_cross_references, + ): + r = validator(config) + result.errors.extend(r.errors) + result.warnings.extend(r.warnings) + return result + + +@overload +def validate_config(config: Mapping[str, Any], add_warnings: Literal[True]) -> ValidationResult: ... +@overload +def validate_config(config: Mapping[str, Any], add_warnings: Literal[False] = ...) -> list[str]: ... +def validate_config(config: Mapping[str, Any], add_warnings: bool = False) -> ValidationResult | list[str]: """Perform full validation including structure and format checks. 
+ Backward-compatible: returns errors only (warnings discarded) by default. + Pass add_warnings=True to also get warnings. + Args: config: Configuration dictionary. Returns: List of all error messages. """ - errors = validate_config(config) + errors = validate_config_schema(config) + + # only run deeper checks if structure is valid + if errors: + return ValidationResult(errors=errors) if add_warnings else errors + + errors.extend(validate_durations(config)) + errors.extend(validate_placeholder_cycles(config)) + errors.extend(validate_service_dependency_cycles(config)) - # Only check duration formats if structure is valid - if not errors: - errors.extend(validate_durations(config)) + # may contain both warnings *and* errors + validation_result = validate_mcp_rules(config) + errors.extend(validation_result.errors) - return errors + return ValidationResult(errors=errors, warnings=validation_result.warnings) if add_warnings else errors def main() -> int: @@ -227,11 +333,15 @@ def main() -> int: print(f"Error: {e}", file=sys.stderr) return 1 - errors = full_validate(config) + result = validate_config(config, add_warnings=True) - if errors: + if result.warnings: + for w in result.warnings: + print(f" \u26a0 {w}", file=sys.stderr) + + if result.errors: print("Configuration validation failed:", file=sys.stderr) - for error in errors: + for error in result.errors: print(f" - {error}", file=sys.stderr) return 1 diff --git a/protocols/config/templates/opencode.json b/protocols/config/templates/opencode.json index bfbafd14..d332ebf4 100644 --- a/protocols/config/templates/opencode.json +++ b/protocols/config/templates/opencode.json @@ -13,99 +13,6 @@ "bash": "ask" }, "tools": {}, - "mcp": { - "context7": { - "type": "remote", - "url": "https://mcp.context7.com/mcp", - "headers": { - "CONTEXT7_API_KEY": "{env:CONTEXT7_API_KEY}" - }, - "enabled": true - }, - "tavily": { - "type": "remote", - "url": "https://mcp.tavily.com/mcp/?tavilyApiKey={env:TAVILY_API_KEY}", - "enabled": 
true - }, - "brave": { - "type": "local", - "command": [
- "env", - "BRAVE_API_KEY={env:BRAVE_API_KEY}", - "npx", - "-y", - "@brave/brave-search-mcp-server", - "--transport", - "stdio" - ], - "enabled": true - }, - "pal": { - "type": "local", - "command": [ - "sh", - "-c", - "export PATH='/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:$HOME/.local/bin:{{CLI_BIN_PATHS}}' DISABLED_TOOLS='{{PAL_DISABLED_TOOLS}}' CUSTOM_API_URL='http://localhost:11434'; for p in $(which uvx 2>/dev/null) $HOME/.local/bin/uvx /opt/homebrew/bin/uvx /usr/local/bin/uvx uvx; do [ -x \"$p\" ] && exec \"$p\" --from git+https://github.com/BeehiveInnovations/pal-mcp-server.git pal-mcp-server; done; echo 'uvx not found' >&2; exit 1" - ], - "timeout": 1200000, - "enabled": true - }, - "sourcegraph": { - "type": "remote", - "url": "http://localhost:8783/sourcegraph/mcp/", - "enabled": true - }, - "semgrep": { - "type": "remote", - "url": "http://localhost:8784/mcp/", - "enabled": true - }, - "serena": { - "type": "remote", - "url": "http://localhost:8785/mcp/", - "enabled": true - }, - "playwright": { - "type": "local", - "command": [ - "npx", - "-y", - "@playwright/mcp@latest" - ], - "enabled": true - }, - "memory": { - "type": "local", - "command": [ - "npx", - "-y", - "@modelcontextprotocol/server-memory" - ], - "enabled": true - }, - "filesystem": { - "type": "local", - "command": [ - "npx", - "-y", - "@modelcontextprotocol/server-filesystem", - "{{FS_MCP_WHITELIST}}" - ], - "enabled": true - }, - "fetch": { - "type": "local", - "command": [ - "uvx", - "mcp-server-fetch" - ], - "enabled": true - }, - "qdrant": { - "type": "remote", - "url": "http://localhost:8782/mcp/", - "enabled": true - } - }, + "mcp": {}, "agent": {} } diff --git a/protocols/context/static/by-category/browser-automation.md b/protocols/context/static/by-category/browser-automation.md deleted file mode 100644 index c2356611..00000000 --- a/protocols/context/static/by-category/browser-automation.md +++ /dev/null @@ -1,58 +0,0 @@ -# MCPs: Browser Automation - -## Overview - -Tools 
for automating browser interactions, testing web applications, and extracting content from dynamic JavaScript-heavy websites. - -## Available MCPs - -### Playwright MCP ⭐ PRIMARY - -**What it does:** Browser automation via accessibility tree - -**Key capabilities:** -- Navigate to URLs and interact with pages -- Click buttons, fill forms, submit data -- Extract structured content from dynamic pages -- Take screenshots for debugging -- Save/restore authentication state -- Multi-browser support (Chrome, Firefox, WebKit) - -**When to use:** -- Testing web applications (E2E tests) -- Interacting with JavaScript-heavy SPAs -- Form automation (login, multi-step flows) -- Extracting data from dynamic content -- Automating repetitive browser tasks - -**When NOT to use:** -- Static HTML content → Use Fetch MCP -- API access available → Use direct API calls -- Large-scale scraping → Use Tavily crawl - -**Rate limits:** None (local execution) - -**Configuration:** stdio transport only - -**Links:** -- [Playwright deep dive](../deep-dives/playwright.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) - -## Decision Tree - -``` -Need to interact with a website? - ↓ -Is content static (no JavaScript)? - ├─ YES → Use Fetch MCP (simpler, faster) - └─ NO → Need to click/type/navigate? - ├─ YES → Use Playwright MCP - └─ NO → Just need final HTML? 
- └─ Try Tavily extract first - └─ Then Playwright if needed -``` - -## Related Categories - -- [Web Research](web-research.md) - For simple content fetching -- [Code Search](code-search.md) - For finding code examples diff --git a/protocols/context/static/by-category/code-search.md b/protocols/context/static/by-category/code-search.md deleted file mode 100644 index deeae793..00000000 --- a/protocols/context/static/by-category/code-search.md +++ /dev/null @@ -1,223 +0,0 @@ -# Code search tools: comparison & usage guides - -## Quick selection table - -| Tool | Best For | Scope | When to Use | Strengths | -|------|----------|-------|-------------|-----------| -| **Sourcegraph** | Public repo search | Public open-source code (sourcegraph.com); private via Sourcegraph instance | Find code examples/patterns | Regex (RE2), language, file path, revision; Deep Search | -| **Serena** | Local semantic ops | Your codebase | Refactor, navigate symbols | LSP-powered, 20+ langs | -| **Grep / ripgrep (rg)** | Fast text search | Local files | Known pattern, simple search | Very fast, glob/type filters (rg), context lines | - -## Tool usage guides - -### Sourcegraph (primary for public code) - -**What it does:** "Google for code" across public open-source repositories on sourcegraph.com; also supports private repositories when you run a Sourcegraph instance (self-hosted or cloud) and connect your repos. 
- -**Strengths:** -- Powerful filters: regex (RE2), language, file path, revision (`repo@rev` or `rev:`) -- Deep Search (agentic, natural language → precise queries) -- Returns exact code snippets with line numbers -- Controls for exhaustive/large searches: `count:` and `timeout:`; use `src-cli` for very large result sets - -**Common Patterns:** -``` -repo:github\.com/facebook/react file:\.tsx$ useState -lang:python "async def.*request" -file:Dockerfile EXPOSE -"func SendMessage" lang:go -``` - -**Deep Search (guided):** -- Ask natural-language questions (e.g., "find all HTTP client 5s timeouts") -- Converts to precise queries and iterates to refine -- Outputs relevant matches and reasoning - -**Examples:** -``` -Find React hooks usage → repo:react file:\.tsx$ use.*Hook -Find Go HTTP servers → lang:go "http.Server{" -Find Dockerfile patterns → file:Dockerfile FROM.*alpine -``` - -**When to use:** -- Learning how libraries/APIs are used in practice -- Finding real-world implementations -- Discovering algorithms/patterns -- Researching best practices - -**Limitations:** -- On sourcegraph.com you search public code; private code search requires running a Sourcegraph instance and connecting your private repos (plan-dependent) -- Interactive searches are subject to time and match limits; the web UI displays up to 500 results; use `count:all`, `timeout:`, and/or `src-cli` for exhaustive results - -**Supports many code hosts for private repos:** GitHub, GitLab, Bitbucket, Azure DevOps, Perforce, and more (when connected to your Sourcegraph instance). - -### Serena (primary for local code) - -**What it does:** Language-server-powered semantic navigation/refactoring - -**Strengths:** -- IDE-grade symbol understanding (functions, classes, methods) -- Symbol-level operations (not whole-file) -- Find references across codebase -- Rename with all references updated -- 20+ languages (Python, TypeScript, Go, Rust, Java, etc.) 
-- Local, no rate limits - -**Key Tools:** -- `find_symbol` - Locate by name/path -- `find_referencing_symbols` - Who calls this? -- `rename_symbol` - Refactor safely -- `insert_after_symbol` / `insert_before_symbol` - Precise insertion -- `replace_symbol_body` - Swap implementation - -**Examples:** -``` -find_symbol("UserService/authenticate") → find method -find_referencing_symbols → see all callers -rename_symbol("getUserData" → "fetchUserProfile") -replace_symbol_body → swap implementation -``` - -**When to use:** -- Refactoring operations -- Understanding code structure -- Finding all usages of symbol -- Safe renames across codebase -- Structural edits (not text replacement) - -**When NOT to use:** Simple text find/replace (use Grep/Edit) - -### CLI text search (ripgrep `rg` and `grep`) - -**What it does:** Fast text/regex search in local files. - -**Strengths:** -- ripgrep (`rg`): very fast; respects `.gitignore`; supports glob filters (`-g/--glob`) and file type filters (`-t/--type`, see `rg --type-list`); context lines (`-A/-B/-C`) -- GNU `grep`: ubiquitous; supports recursive search (`-R`), include/exclude globs, and context lines (`-A/-B/-C`) - -**Regex notes:** -- `grep` uses POSIX BRE/ERE by default; PCRE features (e.g., `\w`, lookarounds, lazy quantifiers) require GNU grep with `-P` and are not available in BSD/macOS `grep` by default. -- `rg` uses Rust regex (no lookaround) by default; many builds also support PCRE2 via `-P` for advanced regex features. - -**ripgrep examples:** -``` -# Find Python service classes -rg -n -t py 'class \w+Service' - -# Find TODOs under src/ -rg -n 'TODO' -g 'src/**' - -# Find API calls like fetch(…api -rg -n 'fetch\(.*api' - -# Find PORT assignments in .env files -rg -n -g '*.env' '^PORT=.*' -``` - -**grep (GNU) equivalents:** -``` -# Find Python service classes -grep -R -n -E 'class [[:alnum:]_]+Service' --include='*.py' . 
- -# Find TODOs under src/ -grep -R -n 'TODO' --include='*' src/ - -# Find API calls like fetch(…api -grep -R -n -E 'fetch\(.*api' . - -# Find PORT assignments in .env files -grep -R -n -E '^PORT=.*' --include='*.env' . -``` - -**When to use:** -- Known pattern, simple search -- Text-based find (not semantic) -- Quick lookups, strings/comments - -**When NOT to use:** Need semantic understanding or refactoring (use Serena) - -## Decision tree - -``` -Need code search? - ↓ -Public repos or local? -├─ PUBLIC → Sourcegraph -│ (unlimited, real-world examples) -└─ LOCAL → Need semantic understanding? - ├─ YES → Serena - │ (symbols, refactoring, references) - └─ NO → Grep - (fast text/regex search) - -Refactoring needed? - └─ Use Serena (safe, aware of references) - -Find usage examples? - └─ Use Sourcegraph (public repos) - -Simple text search? - └─ Use Grep (instant, no overhead) -``` - -## Comparison: when to use each - -**Use Sourcegraph when:** -- Need examples from public repos -- Learning library usage -- Researching patterns/algorithms -- Want to see real-world code - -**Use Serena when:** -- Working in your codebase -- Refactoring operations -- Need symbol-level understanding -- Finding/updating references -- Structural code changes - -**Use Grep when:** -- Simple text/pattern search -- Known string to find -- Quick lookups -- Searching comments/strings -- No semantic analysis needed - -## Best practices - -**Start with appropriate scope:** -- Public examples? → Sourcegraph -- Local refactor? → Serena -- Quick find? 
→ Grep - -**Leverage strengths:** -- Sourcegraph: Use guided search for complex queries -- Serena: Use for any operation involving symbols -- Grep: Use for speed, text patterns - -**Avoid common mistakes:** -- ❌ Using Grep for refactoring (use Serena) -- ❌ Using Serena for simple text search (use Grep) -- ❌ Not using Sourcegraph guided prompts - -## Common use cases - -**Finding implementation patterns:** -→ Sourcegraph (`repo:.*react.* useEffect`) - -**Renaming function across codebase:** -→ Serena (`rename_symbol`) - -**Finding TODOs/FIXMEs:** -→ Grep (`pattern:"TODO|FIXME"`) - -**Understanding symbol relationships:** -→ Serena (`find_referencing_symbols`) - -**Learning API usage:** -→ Sourcegraph (`lang:python requests.post`) - -## Links to deep dives - -- [Sourcegraph deep dive](../deep-dives/sourcegraph.md) -- [Serena deep dive](../deep-dives/serena.md) diff --git a/protocols/context/static/by-category/documentation.md b/protocols/context/static/by-category/documentation.md deleted file mode 100644 index 92949da1..00000000 --- a/protocols/context/static/by-category/documentation.md +++ /dev/null @@ -1,180 +0,0 @@ -# API documentation tools: Context7 usage guide - -## Overview - -**Context7** is the primary tool for retrieving up-to-date, version-specific API documentation and code examples. - -## Quick reference - -| Aspect | Details | -|--------|---------| -| **Best for** | Official docs, library syntax, API examples | -| **Coverage** | Public repos only (free tier) | -| **Rate limits** | Plan-based: Free lower, Pro higher, Enterprise custom | -| **Strengths** | Version-specific, official sources, code examples | - -## What Context7 does - -**Fetches:** -- Up-to-date API documentation -- Version-specific syntax and examples -- Official library/framework docs -- Code snippets from official sources - -**Works with:** -- Public repositories (GitHub, npm, PyPI, etc.) -- Major frameworks (React, Vue, Angular, Express, etc.) 
-- Libraries and SDKs -- API documentation sites - -## When to use - -**Primary use cases:** -- Learning a new library/framework -- Checking current API syntax -- Getting official usage examples -- Understanding library capabilities -- Verifying breaking changes between versions - -**Examples:** -``` -"Get React 18 hooks documentation" -→ Returns official React docs for version 18 hooks - -"Show Express.js router API" -→ Fetches Express router documentation - -"Next.js 14 app router examples" -→ Gets version-specific App Router docs - -"Pandas DataFrame API" -→ Returns pandas DataFrame official docs -``` - -## Workflow - -**1. Resolve library ID (required first step):** -``` -resolve-library-id → Get Context7-compatible ID -``` - -**2. Fetch documentation:** -``` -get-library-docs with: -- context7CompatibleLibraryID (from step 1) -- topic (optional, focuses results) -- tokens (default: 5000, max controls length) -``` - -**Example flow:** -``` -Step 1: resolve-library-id("Next.js") - → Returns: "/vercel/next.js" - -Step 2: get-library-docs("/vercel/next.js", topic="routing") - → Returns routing documentation -``` - -## Parameters - -**resolve-library-id:** -- `libraryName` - Package/library name to search - -**get-library-docs:** -- `context7CompatibleLibraryID` - ID from resolve step (required) -- `topic` - Focus area (e.g., "hooks", "routing") (optional) -- `tokens` - Max tokens to return (default: 5000) (optional) - -## Best practices - -**Always resolve first:** Must call `resolve-library-id` before `get-library-docs` (unless user provides ID in `/org/project` format) - -**Be specific with topics:** Use focused topics to get relevant docs -- ✅ `topic="hooks"` → React hooks docs -- ✅ `topic="middleware"` → Express middleware docs -- ❌ `topic="everything"` → Too broad - -**Adjust token limit wisely:** -- Default 5000 works for most cases -- Increase for comprehensive docs -- Decrease for quick lookups - -**Version-specific queries:** -- Specify version when 
known: "React 18", "Next.js 14" -- Context7 returns version-appropriate docs - -## When *not* to use - -**Use other tools when:** -- Need code examples from real projects → Sourcegraph -- Want community solutions → Web research (Tavily/Brave) -- Looking for blog posts/tutorials → Web research -- Need private repo docs → Not supported (requires paid tier) -- Simple web search → Web research tools - -## Common use cases - -**API syntax lookup:** -→ Context7 (`resolve-library-id` + `get-library-docs`) - -**Real-world usage examples:** -→ Sourcegraph (search public repos) - -**Best practices and patterns:** -→ Web research (Tavily for articles) - -**Version migration guides:** -→ Context7 (official docs) + Web research (community guides) - -## Integration with other tools - -**Typical workflow:** -1. Context7 → Get official API docs -2. Sourcegraph → Find real-world usage examples -3. Web research → Find tutorials/guides if needed - -**Example:** -``` -Task: "Learn how to use React Server Components" - -1. Context7: Get official React 18 server components docs -2. Sourcegraph: Find real implementations in public repos -3. Tavily: Find blog posts explaining concepts -``` - -## Limitations - -**Free tier restrictions:** -- Public repositories only -- Private/org repos require paid plan - -**Not a substitute for:** -- Code search (use Sourcegraph) -- Web tutorials (use Tavily/Brave) -- Community Q&A (use web research) - -## Quick decision tree - -``` -Need documentation? - ↓ -Official API docs? - ├─ YES → Context7 - └─ NO → Community examples? - ├─ YES → Sourcegraph - └─ NO → Tutorials/guides? - └─ YES → Web research (Tavily) -``` - -## Common mistakes to avoid - -❌ Skipping `resolve-library-id` (required first step) -❌ Using for community content (use web research) -❌ Using for private repos (not supported on free tier) -❌ Not specifying topic (get irrelevant broad docs) -❌ Using for real-world repo examples (use Sourcegraph). 
Context7 is for official docs and examples - -## Links to deep dives - -- [Context7: *full guide*](../deep-dives/context7.md) - diff --git a/protocols/context/static/by-category/memory.md b/protocols/context/static/by-category/memory.md deleted file mode 100644 index ead1edea..00000000 --- a/protocols/context/static/by-category/memory.md +++ /dev/null @@ -1,204 +0,0 @@ -# Memory tools: comparison & usage guides - -## Quick selection table - -| Tool | Type | Best For | Access Pattern | Availability | -|------|------|----------|----------------|--------------| -| **Qdrant** | Semantic | Find by meaning | "Similar to X" | MCP-enabled clients (e.g., Claude Desktop, VS Code) | -| **Memory MCP** | Knowledge graph | Track relationships | "X relates to Y" | MCP-enabled clients (configured per client) | -| **claude-mem** | Auto context (example) | Persistent sessions | Automatic (implementation-specific) | Commonly used with Claude clients | - -## Tool usage guides - -### Qdrant (semantic memory) - -**What it does:** Vector-based semantic memory using embeddings - -**Strengths:** -- Find information by *meaning*, not keywords -- Uses FastEmbed for embeddings (default: BAAI/bge-small-en-v1.5; supports sentence-transformers/all-MiniLM-L6-v2) -- Local Docker or remote Qdrant instances -- Optional metadata alongside text -- Storage bounded by local disk/system resources - -**Tools Available:** -- `qdrant-store` - Save information with metadata -- `qdrant-find` - Retrieve by semantic similarity - -**Examples:** -``` -Store: "JWT authentication with refresh tokens in Express.js" -Find: "auth patterns" → returns JWT, OAuth, session-based - -Store: "React useEffect cleanup functions prevent memory leaks" -Find: "preventing React memory issues" → finds it - -Store: "PostgreSQL JSONB indexing for fast queries" -Metadata: {"type": "database", "language": "sql"} -``` - -**When to use:** -- Build personal knowledge base -- Find by concept/similarity -- Store code snippets for reuse -- 
Remember solutions to problems -- Semantic search over your data - -**When NOT to use:** -- Need explicit relationships → Use Memory MCP -- Graph queries needed → Use Memory MCP -- Simple keyword search → Use filesystem/grep -- One-time lookup → Don't store - -**Best Practices:** -- Store atomic concepts (one idea per item) -- Use descriptive text (include context, not just code) -- Add metadata for structure (type, language, project) -- Good for: patterns, solutions, links, learnings - -### Memory MCP (Knowledge Graphs) - -**What it does:** Persistent knowledge graph with entities/relations/observations - -**Strengths:** -- Track *who/what* relates to *how* -- Explicit directed relationships (active voice) -- Structured storage (local JSONL file; configurable path) -- Official Anthropic implementation -- Storage bounded by filesystem limits - -**Tools Available:** -- `create_entities` - Create nodes (people, orgs, concepts) -- `create_relations` - Define relationships (from → to) -- `add_observations` - Add facts to entities -- `delete_entities/observations/relations` - Remove items -- `read_graph` - View entire graph -- `search_nodes` - Search by name/type/observation -- `open_nodes` - Get specific entities - -**Example Structure:** -``` -Entity: John_Smith (type: person) - Observations: ["Speaks Spanish", "Async communication"] - Relations: - John_Smith --works_at--> Anthropic - John_Smith --contributes_to--> ProjectX - -Entity: ProjectX (type: project) - Observations: ["React-based", "Uses TypeScript"] - Relations: - ProjectX --depends_on--> React - ProjectX --deployed_on--> Vercel -``` - -**When to use:** -- Track relationships between concepts -- Build structured knowledge base -- Maintain project context (who, what, how) -- Personal CRM or project memory -- Query relationships ("who works at X?") -- Map dependencies/connections - -**When NOT to use:** -- Need semantic/similarity search → Use Qdrant -- Relationships don't matter → Use Qdrant -- Simple notes → 
Use filesystem -- Temporary (single session) → Keep in context - -**Best Practices:** -- Unique entity names (John_Smith, ProjectX) -- Active voice relations (works_at, depends_on) -- Atomic observations (one fact each) -- Consistent entity types (person, company, project) -- Order matters in relations (directed graph) - -### claude-mem (Example Automatic Context) - -Note: "claude-mem" is used here to describe a common, custom pattern for automatic context/memory built around Claude clients. It is not an official Anthropic product; features vary by implementation. - -**What it does (example):** Implements automatic persistent memory using client/event hooks to capture observations and generate summaries. - -**Potential strengths (implementation-dependent):** -- Fully automatic (minimal manual intervention) -- Captures observations after tool use or key actions -- Generates session summaries and offers progressive disclosure at session start - -**Example lifecycle (varies by client):** -1. Session start → surface recent summaries for quick recall -2. Post tool use → capture observations automatically -3. Stop/end → generate summary (request/completed/learned) -4. 
Session end → persist across session resets - -**Progressive disclosure pattern (recommended):** -- Layer 1: Compact index/summary -- Layer 2: Full details for selected items -- Layer 3: Source code and transcripts when needed - -**Availability:** Implementation-specific; commonly built for Claude Desktop/Code - -## Qdrant vs Memory: Quick Decision - -**Use Qdrant when:** -- "Find things similar to this" -- Semantic search is main pattern -- Relationships don't matter -- Building retrieval system - -**Use Memory when:** -- "Show me what relates to X" -- Explicit relationships critical -- Need graph queries -- Building context management - -**Use both when:** -- Complex knowledge base needs similarity + relationships -- Example: Qdrant for code snippets, Memory for tracking which projects use them - -## Replicating Automatic Context Flows - -If your client lacks automatic hooks, manually implement: - -**Manual observation logging:** -- After reading code → Save to Qdrant + Memory MCP -- After decisions → Create entities with relations -- After fixes → Store in Qdrant with metadata - -**Session summaries:** -- Before ending → Create summary and store it (e.g., in Qdrant) -- Include: request, completed, learned, next_steps - -**Context recovery:** -- Start sessions → Search Qdrant for past work -- Fetch related entities from Memory MCP - -**Key differences:** -- Custom automatic flows: automated capture/surfacing -- Qdrant+Memory: manual capture unless automation is added -- Qdrant: vector search; Memory MCP: graph queries/relations -- Typed observations/tags often require manual conventions - -## Best Practices - -**Choose right tool:** -- Semantic search → Qdrant -- Relationships → Memory MCP -- Both → Use both (complementary) - -**Storage strategy:** -- Qdrant: Searchable content, solutions, patterns -- Memory: Relationships, context, project structure - -**Search efficiently:** -- Custom automatic memory: Start with an index/summary view -- Qdrant: Use descriptive 
queries -- Memory: Search then open specific nodes - -**Common mistakes:** -- ❌ Using Qdrant for relationships (use Memory) -- ❌ Using Memory for similarity search (use Qdrant) -- ❌ Fetching full details without using an index/summary first - -## Quick Reference Links - -- [Full decision guide](../../tools/tools-decision-guide.md#memory) -- [Compact tool list](../tools-guide.md) *(tier 1)* diff --git a/protocols/context/static/by-category/web-research.md b/protocols/context/static/by-category/web-research.md deleted file mode 100644 index 07b13d10..00000000 --- a/protocols/context/static/by-category/web-research.md +++ /dev/null @@ -1,113 +0,0 @@ -# Web research tools: comparison & usage guides - -## Quick selection table - -| Tool | Best For | Limit | When to Use | Avoid When | -|------|----------|-------|-------------|-----------| -| **Tavily** | General web + citations | 1k/mo | Current info, multi-source research | Credits low | -| **Brave** | Privacy search | 2k/mo | Tavily exhausted, basic search | Need advanced features | -| **Fetch** | Simple URL fetch | ∞ | Known URL, simple content | Need search/crawl | - -## Tool usage guides - -### Tavily (primary choice) - -**Strengths:** -- Citations included (critical for credibility) -- Search + extract + map + crawl in one tool -- Generous 1k credits/month (resets 1st) -- Handles news, current events, general info - -**Common Parameters:** -- `search` - Query with citations (1–2 credits; basic=1, advanced=2) -- `extract` - Get content from URLs (varies) -- `map` - Discover site structure -- `crawl` - Multi-page content - -**Examples:** -``` -"Latest React 19 features" → search with citations -"Extract content from example.com/article" → extract -"Map all docs at docs.example.com" → map -``` - -**When NOT to use:** Credits < 100 remaining → switch to Brave - -### Brave (secondary choice) - -**Strengths:** -- Privacy-focused, no tracking -- 2k queries/month -- Multiple search types (web/news/image/video; Local API on 
Pro) -- Good general-purpose fallback - -**Limitations:** -- Free plan includes Web/Images/Videos/News/Discussions/FAQ; Local API requires Pro -- No advanced crawling/extraction - -**Examples:** -``` -"Python asyncio tutorial" → web search -"Restaurants near me" → local search (if Pro) -"AI news today" → news search -``` - -**When to use:** After Tavily credits exhausted, need basic search - -### Fetch (simple fallback) - -**Strengths:** -- Unlimited usage -- HTML → Markdown conversion -- Chunk reading via start_index -- No rate limits - -**Limitations:** -- For raw file contents from GitHub, use raw.githubusercontent.com or gh CLI; GitHub HTML pages are fetchable -- No search/crawl features -- One URL at a time - -**Examples:** -``` -fetch("https://raw.githubusercontent.com/user/repo/main/README.md") -fetch("https://example.com/article.html") → returns markdown -``` - -**When to use:** Simple one-off URL fetch, all other tools overkill - -## Decision tree - -``` -Need web content? - ↓ -Known URL(s)? -├─ YES → Single URL? → Fetch -│ Multiple or complex? → Tavily extract -└─ NO → Need search - ↓ - Tavily credits OK? - ├─ YES → Use Tavily (citations!) - └─ NO → Use Brave (2k/mo) -``` - -## Best practices - -**Always start with:** Tavily (best balance of features + limits) - -**Use Brave when:** Tavily credits low, need basic search - -**Use Fetch when:** Known URL, simple content, other tools overkill - -**Track limits:** Check monthly resets (Tavily: 1st, Brave: varies) - -## Common mistakes to avoid - -❌ Ignoring Tavily's citation feature (always prefer cited sources) -❌ Not checking credit balance before complex Tavily operations -❌ Using Fetch against GitHub when you need raw file contents (use raw.githubusercontent.com or gh CLI). For HTML repo pages, Fetch is fine. 
- -## Links to deep dives - -- [Tavily deep dive](../deep-dives/tavily.md) -- [Brave deep dive](../deep-dives/brave.md) -- [Fetch deep dive](../deep-dives/fetch.md) diff --git a/protocols/context/static/code-standards.md b/protocols/context/static/code-standards.md new file mode 100644 index 00000000..5f2df462 --- /dev/null +++ b/protocols/context/static/code-standards.md @@ -0,0 +1,371 @@ +# Coding style & standards + + + +***Contents:*** + +- [Comments](#comments) + - [Depth (key standard)](#depth-key-standard) + - [Formatting](#formatting) +- [Naming](#naming) +- [Structure and organization](#structure-and-organization) + - [File-level](#file-level) + - [Function-level](#function-level) +- [Error handling](#error-handling) +- [Logging and observability](#logging-and-observability) +- [DRY and abstraction](#dry-and-abstraction) +- [Types and data modeling](#types-and-data-modeling) +- [Correctness and defensiveness](#correctness-and-defensiveness) +- [Testing](#testing) +- [Dependencies and coupling](#dependencies-and-coupling) +- [Pragmatism](#pragmatism) + + +## Comments + +### Depth (key standard) + +#### Tier 1: *always* required + +- **Design rationale block** at the top of any file or major section implementing non-trivial logic + + - What the component does (1-2 sentences) + - Why this approach was chosen (if alternatives exist) + - Key invariants the reader must keep in mind + +> [!NOTE] +> +> The design rationale block is implemented as a **comment block**, *distinct* from any language-level docstrings. 
+ +- **"Why, not what"** on every non-obvious branch, conditional, or control-flow decision + + - Explain the *reasoning* behind the decision, not what the code literally does + +- **Struct field / config value contracts** for any field whose purpose isn't obvious from its name and type alone + + - State what it controls, why it exists, and any constraints on valid values + +#### Tier 2: required when applicable + +- **Rejected-alternatives** documentation when a design decision has non-obvious trade-offs + + - Name the alternative, say why it was rejected + +- **Safety / correctness / starvation** comments when code implements a protective mechanism + + - Describe the threat (what breaks without this code) + - Describe the invariant being enforced + +- **Protocol step narration** for any multi-step algorithm or protocol implementation + + - Number the steps or use a clear sequential narrative + +- **Constant justification** for any magic number, threshold, or tuning parameter + + - State whether the value is empirical or formally derived, where it came from, and whether it can be tuned + +#### Tier 3: should use in complex systems code + +- **Formal spec / standard references** when implementing a protocol, standard, or well-defined algorithm + + - Cite the spec section, RFC, or formal model + +- **Locking / concurrency discipline** comments when multiple locks or atomic operations are involved + + - State what lock is held and what ordering is required + +### Formatting + +#### Inline comments + +- Start with **lowercase** (unless beginning with a proper noun) +- **Omit trailing periods** +- Lead with **action verbs** like "format", "skip", "patch", "retrieve" +- Use a `note` prefix for non-obvious implementation details or external dependencies + +#### What to avoid + +- Do not leave commented-out code +- Do not add type annotations in comments (put them in signatures) +- Do not add author/date stamps +- Avoid trailing comments at the end of code lines (unless 
concise) + +#### Section headers (in long files) + +- Use commented separator lines of many `─` characters to divide logical sections (and occasionally `━` for more differentiation or to delineate major sections, if there are many) + +## Naming + +- Where applicable, use **domain terms** from the project's ubiquitous language in types, functions, variables, and tests + + - If the domain calls it a "replica", don't name it `follower_node` or `secondary` + - If two bounded contexts use the same word differently, disambiguate explicitly (e.g. `billing.Account` vs `auth.Account`) + +- Name functions and methods as **actions** (verbs): `drain_queue`, `elect_leader`, `reconcile_state` + + - *Exception:* pure accessors and predicates read better as nouns/adjectives: `is_quorum`, `leader_id` + +- Name tests like **mini-specs**: `should_reject_expired_token`, `given_partition_when_write_then_timeout` + + - The name alone should tell you what broke when the test fails + +- **Name length should be proportional to scope** + + - Narrow scope (loop variable, short closure) → short names are fine: `i`, `n`, `ch`, `err` + - Wide scope (module-level, exported, config) → precise and unambiguous: `checkpoint_interval_ms`, `max_retry_attempts` + - A longer name that's instantly clear beats a short name that requires context — but `current_iteration_index` in a 3-line loop is noise + +- **Constants and thresholds** should get descriptive names; *never* bare literals or magic numbers + + - e.g. `QUORUM_TIMEOUT_MS = 500`, and not `500` inline + - See the *Constant justification* directive in [tier 2 of the commenting depth standard](#tier-2-required-when-applicable) for the accompanying comment requirement + +## Structure and organization + +### File-level + +- **One concept per file** in most cases + + - A file should have a single, articulable reason to exist + - *Exception:* tightly-coupled small types (e.g. 
a value object and its builder, or an enum and its parser) can coexist if separating them adds navigation overhead without clarity + - In C/C++, the natural unit is a **data structure and its operations** (the `.c`/`.h` pair); this may span several "concepts" in the OOP sense, and that's idiomatic + - In Go, the natural unit is a **package**, which may contain multiple files; keep each file focused on a single type or functional group within the package + +- **Design rationale at the top** (as specified in [Tier 1 of the commenting depth standard](#tier-1-always-required)) + +- **Dependency direction matters** (see [Dependencies and coupling](#dependencies-and-coupling) for the full directive) + +- **Section separators** (`─` / `━`) for files longer than ~200 lines (see *Comments > Section headers*) + +### Function-level + +- **Small surface area**: functions should do *one thing* and accept only the arguments they *actually need* + + - If a function is longer than ~40-50 lines, look for extractable sub-operations + - If a function takes 5+ parameters, it's probably doing too much or needs a config/options object + + - In languages without keyword arguments or builder patterns (C, Go), functions routinely take 5-6 parameters; judge by *conceptual* cohesion, not raw count + +

+ + > **Note:** these are *guidelines*, not laws. A 60-line function that reads linearly and does one coherent thing is better than three 20-line functions with tangled control flow between them. + +- **Early returns** for guard clauses and precondition checks + + - Validate inputs and bail at the top; keep the happy path at the lowest nesting level + - Avoid deep nesting: if you're 4+ levels deep, refactor + +- **Explicit state transitions** for objects with lifecycle states + + - Model workflows *as methods* (`activate()`, `cancel()`, `promote_to_leader()`) and *not* bare field assignments + - Plain data structures (config structs, packet buffers, intermediate results) don't need this; direct field writes are idiomatic and correct there + +## Error handling + +- **Handle errors *close* to where they occur**, with context sufficient for debugging + + - Wrap/annotate errors as they propagate upward so the final message reconstructs the call chain (e.g. Go's `fmt.Errorf("…: %w", err)`, Rust's `anyhow::Context`, Python's exception chaining, Java's cause chains) + - Never swallow errors silently; if intentionally ignoring, comment *why* + +- **Distinguish recoverable errors from fatal ones** in the type system or API (where the language allows) + + - The caller should know from the signature whether an error is retriable, permanent, or a bug + +- **Error paths deserve the same rigor as happy paths** + + - Invalid inputs, network errors, timeouts, permission denials, resource exhaustion, etc. should all be *explicitly* handled + - Seeing 20 "should succeed when..." tests and only 2 "should fail when..." 
tests ⇒ failure modes are undertested ⇒ code smell + +- **Fail fast, fail loud** on invariant violations + + - If a precondition that *must* hold is violated, crash or panic with a clear message — don't attempt "best effort" recovery from a state that should be impossible + - Reserve graceful degradation for *expected* failures (network partitions, timeouts), not *bugs* + +## Logging and observability + +- **Structured logging over unstructured strings** + + - Log entries should be machine-parseable (key-value pairs or JSON), not `printf`-style prose + - Include **context fields** that enable filtering and correlation: request ID, node ID, operation name, relevant entity IDs + +- **Log levels have precise semantics**; don't blur them + + - **`ERROR`**: something is broken and needs human attention (page-worthy in production) + - **`WARN`**: unexpected condition that the system handled, but should be investigated if it recurs + - **`INFO`**: expected lifecycle events (startup, shutdown, config reload, leader election, connection established) + - **`DEBUG`**: development-time detail, never expected in production log volume + +- **Correlation IDs for distributed traces** + + - Every request or operation that crosses a process/service boundary should carry a trace/correlation ID + - Propagate it through all downstream calls and include it in every log entry for that operation + +- **Log at boundaries, not at every function call** + + - Log when: + + - entering/exiting a module boundary + - on errors, and + - on significant state transitions + + - Interior helper functions generally shouldn't log; instead, they should return errors to callers who have enough context to log meaningfully + - If a function is logging *and* returning an error, one of them is redundant. The caller should log the error with more context. 
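The logging directives above can be sketched with Python's standard `logging` module. This is a minimal illustration, not a prescribed implementation: the `JsonFormatter` class, the `build_logger` and `log_leader_elected` helpers, the logger name, and the `context` key passed through `extra=` are all hypothetical names chosen for this example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single machine-parseable JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "event": record.getMessage(),
            # context fields arrive via `extra={"context": {...}}`
            # (a convention assumed for this sketch, not a stdlib feature)
            **getattr(record, "context", {}),
        }
        return json.dumps(entry, sort_keys=True)


def build_logger(stream: io.TextIOBase) -> logging.Logger:
    """Create a logger that writes one JSON entry per line to `stream`."""
    logger = logging.getLogger("structured_logging_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_leader_elected(
    logger: logging.Logger, trace_id: str, node_id: str, term: int
) -> None:
    """Log a significant state transition at INFO, carrying the correlation ID."""
    logger.info(
        "leader_elected",
        extra={"context": {"trace_id": trace_id, "node_id": node_id, "term": term}},
    )
```

Calling `log_leader_elected(build_logger(sys.stderr), "trace-123", "node-a", 7)` would emit one JSON line whose `trace_id` field lets downstream tooling correlate it with other entries for the same operation. Interior helpers would return errors rather than call the logger themselves.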
+ +## DRY and abstraction + +- **Extract only when the duplication is *real* and *likely to co-evolve*** + + - Two code blocks that *look* similar but serve different domains or change for different reasons are *not* duplication — they're coincidence + - Three occurrences is a reasonable threshold; two is often premature + +- **Abstractions must earn their keep** + + - Every layer, interface, or indirection should have a *concrete justification*: testability, swappability, or encapsulation of a genuinely volatile decision + - If a function is called from exactly one place and exists only to "keep things clean", inline it + - A helper with 6 parameters used to avoid 3 lines of repetition is a net loss + +- **Inline is fine when it's clearer** + + - Three similar lines of straightforward code is often better than a premature abstraction + - Code is read far more often than written; optimize for the reader, not the writer's DRY instinct +

+ + > **Critical distinction:** + > + > The enemy is not repetition, per se, but rather **divergence risk**: places where the same invariant is enforced in *multiple* spots and one of them will inevitably fall out of sync with the others (without strenuous and diligent maintenance). *That* is the duplication worth killing. + +- **When you *do* abstract, make the abstraction *obvious*** + + - Name it after *what it does for the caller*, not *how it does it* + - If the caller still needs to understand the internals to use it correctly, the abstraction is leaking + +## Types and data modeling + +- **Make invalid states *unrepresentable*** + + - Use the type system to prevent nonsense: `Money`, `Email`, `NonEmptyList`, `NodeId`. + + - A type system's value is severely diminished if the types are merely raw primitives with validation scattered across callers + + - Constructors/factory methods enforce invariants; if it exists, it's valid + +- **Value objects for data with rules** + + - If a value has units, formatting, validation, or equality semantics beyond raw comparison, give it a type + - Value objects should have **value semantics**: prefer immutability; in languages where full immutability is impractical (C, Go), enforce it by convention and document the contract + +- **Reference other aggregates by ID, not by object** + + - `order_id: OrderId` — not `order: Order` + - This keeps aggregate boundaries crisp and prevents accidental coupling + +- **Prefer narrow types over wide ones** + + - `NodeId` > `string`, `Port` > `int`, `Duration` > `float` + - Why: + + - This is portable across tiny scripts and million-line codebases + - It catches bugs at compile time that would otherwise surface in production + +## Correctness and defensiveness + +- **Encode invariants close to the data they constrain** + + - A constraint documented in a comment three files away is a constraint that *will* be violated + - The best invariant is one the compiler/runtime enforces automatically 
+ +- **Handle concurrency with explicit discipline** + + - **Document protected state at the lock declaration site**: the comment on a mutex should list exactly which fields it guards + - **Establish and document lock ordering** to prevent deadlocks. For example, if lock A must be acquired before lock B, say so *once* at the declaration and enforce it everywhere + - State what the *threat* is if the discipline is violated (see the *Locking / concurrency discipline* directive in [Tier 3 of the commenting depth standard](#tier-3-should-use-in-complex-systems-code)) + - Prefer the **weakest sufficient memory ordering** (`acquire`/`release` over `seq_cst`) when the correctness argument is clear; default to stronger ordering when unsure + - Prefer message-passing or immutable data over shared mutable state when the design allows + +- **No `sleep()` in tests; no real clocks in test assertions** + + - Inject a clock/scheduler/virtual time source + - Timing-sensitive assertions are the *#1* source of flaky tests + +- **Edge cases are first-class citizens, *not* afterthoughts** + + - Partitions, crash loops, empty collections, zero-length inputs, maximum-size inputs, unicode edge cases, concurrent mutations, etc.: consider them during **design**, *not* after the happy path is "done" + +## Testing + +- **Test behavior, not implementation** + + - Assert on *observable outcomes* (return values, state changes, side effects), not on which internal methods were called or in what order + - A refactor that preserves behavior should not break any test; if it does, the test is coupled to implementation, not to correctness + +- **One reason to fail per test** + + - Each test should verify *one logical assertion* so that a failure pinpoints exactly what broke + - Multiple assertions are fine when they verify facets of the *same* behavior (e.g. 
checking both the status code and the response body of an API call) + +- **Use consistent test structure** + + - Follow **Arrange → Act → Assert** (or equivalently, Given → When → Then) in every test + - This makes tests scannable: setup is visually separated from the action and the verification + +- **Prefer real dependencies over mocks when practical** + + - Mocks are appropriate at *module boundaries* (network, disk, external APIs) but not for internal collaborators + - Over-mocking produces tests that pass while the real system is broken: the tests are verifying a fantasy, not the actual wiring + - When you *do* mock, mock *interfaces* (not concrete classes), and assert on *contracts*, not call counts + +- **Property-based tests for invariants** + + - When a function has a well-defined invariant (idempotency, commutativity, encode-then-decode round-trip, sort stability), express it as a *property* and let the framework generate inputs + - Property tests find edge cases that hand-written examples miss, especially around boundary values, empty inputs, and unicode + +- **Test failures must be self-diagnosing** + + - A failed test should tell you *what* failed, *what was expected*, and *what was actually observed*, all *without* requiring you to attach a debugger + - Include descriptive messages in assertions; avoid bare `assert x` without context + - Name tests like mini-specs (see [Naming](#naming)) so the test name alone tells you what broke + +## Dependencies and coupling + +- **Depend on interfaces at module boundaries; depend on implementations inside modules** + + - Within a module, concrete types are fine: over-abstracting internals creates indirection without benefit + - At module boundaries, interfaces allow swapping implementations, fakes for testing, and independent evolution + +- **Dependency direction flows inward**: domain ← application ← infrastructure + + - Domain code has zero external dependencies; in particular, domain models must be 
**persistence-ignorant** — no ORM annotations, no SQL, no serialization concerns leaking into the domain layer + - Application code orchestrates domain + infra but doesn't contain business rules + - Infrastructure adapters are leaf nodes + +- **Anti-corruption layers at integration boundaries** + + - When integrating with external systems, legacy code, or third-party APIs, translate their model into *your* domain's language at the boundary; don't let foreign concepts leak inward + - The translation layer is the *only* place that knows the external system's schema, naming, and quirks + +- **Keep the blast radius of changes small** + + - A change to module A should not force changes in modules B, C, and D unless they *genuinely share* the changed concern + - If a single-line change causes a cascade of 10 file edits, the coupling is too tight + +## Pragmatism + +- **Correctness > velocity**, but **shipping > perfection** + + - Invest in correctness where failures are *costly* (data loss, security, corruption, distributed state) + - Accept "good enough" where failures are *cheap* (formatting, log messages, dev tooling UX) + +- **Write idiomatic code for the language you're in** + + - Don't import patterns from another language wholesale (e.g. 
Java-style class hierarchies in Go, OOP patterns in C, Rust borrow-checker thinking in Python) + - Study how the language's best practitioners write code and follow those conventions: idiomatic code is readable to the community that maintains it + +- **Over-engineering is as bad as under-engineering** + + - Don't add feature flags, backward-compatibility shims, or plugin systems for hypothetical future requirements + - Don't create abstractions for one-time operations + - The right amount of complexity is the *minimum* needed for the current task diff --git a/protocols/context/static/deep-dives/brave.md b/protocols/context/static/deep-dives/brave.md deleted file mode 100644 index 960135aa..00000000 --- a/protocols/context/static/deep-dives/brave.md +++ /dev/null @@ -1,225 +0,0 @@ -# Brave MCP: Deep Dive - -## Overview - -Privacy-focused search engine with 5 search types and generous free tier (2,000 queries/month). - -## Available Tools - -### 1. `brave_web_search` - General Web Search - -**What it does:** Standard web search with rich metadata - -**Parameters:** -- `query` (required, max 400 chars) - Search query -- `count` (1-20, default 10) - Number of results -- `offset` (0-9) - Pagination offset -- `country` - 2-char code (US, GB, etc.) -- `search_lang` - Language preference (en, es, fr, etc.) -- `safesearch` - "off" | "moderate" | "strict" -- `result_filter` - Array (e.g. `["web", "images", "news", "videos", "discussions", "faq"]`; other values like `"infobox"`, `"query"`, `"summarizer"`, or `"locations"` may appear depending on plan/endpoint) -- `freshness` - "pd" | "pw" | "pm" | "py" | "YYYY-MM-DDtoYYYY-MM-DD" -- `spellcheck` (bool, default true) - -**Returns:** JSON list with title, description, URL; may include FAQ, discussions, news, videos - -**Best for:** General web searches, alternative to Tavily - -**Rate limits:** 2,000 queries/month (free tier) - -### 2. 
`brave_local_search` - Location-Based Search - -**What it does:** Search for businesses, places, services - -**Parameters:** Same as web search + location awareness - -**Returns:** Business names, addresses, ratings, hours, phone numbers - -**Best for:** "Near me" queries, local businesses, restaurants - -**Rate limits:** Requires Pro plan (not available on free tier) - -**Fallback:** Automatically falls back to web search if no local results - -### 3. `brave_video_search` - Video Discovery - -**What it does:** Search for videos with metadata - -**Parameters:** -- `query` (required) -- `count` (1-50, default 20) -- `freshness` - Time filters -- `safesearch` - -**Returns:** Videos with title, URL, description, duration, thumbnail - -**Best for:** Finding video content, tutorials, talks - -**Rate limits:** 2,000 queries/month - -### 4. `brave_image_search` - Image Discovery - -**What it does:** Search for images - -**Parameters:** -- `query` (required) -- `count` (1-200, default 50) -- `safesearch` - "off" | "strict" - -**Returns:** Images with URLs, titles, properties - -**Best for:** Finding pictures, design inspiration, visual references - -**Rate limits:** 2,000 queries/month - -### 5. `brave_news_search` - News Articles - -**What it does:** Search recent news - -**Parameters:** -- `query` (required) -- `count` (1-50, default 20) -- `freshness` - "pd" | "pw" | "pm" | "py" | "YYYY-MM-DDtoYYYY-MM-DD" (no documented default) -- `extra_snippets` (bool) - Up to 5 additional excerpts - -**Returns:** News articles with titles, URLs, descriptions, snippets - -**Best for:** Current events, breaking news, recent updates - -**Rate limits:** 2,000 queries/month - -### 6. 
`brave_summarizer` - AI Summary (Pro Only) - -**What it does:** Generate AI summaries of search results - -**Parameters:** -- `key` (required) - Summary key from web search -- `inline_references` (bool) - Add source citations -- `entity_info` (bool) - Include entity details - -**Returns:** Text summary with optional references - -**Best for:** Quick overviews of complex topics - -**Rate limits:** Requires Pro AI subscription - -## Tradeoffs - -### Advantages -✅ Privacy-focused (no tracking/profiling) -✅ Generous free tier (2k/month) -✅ Multiple search types in one MCP -✅ Good fallback when Tavily exhausted -✅ Clean, structured results - -### Disadvantages -❌ Local Search and Summarizer require Pro plan -❌ No advanced crawling/extraction (use Tavily) -❌ Pro features require paid plan -❌ No citations included (unlike Tavily) - -## Common Pitfalls: When NOT to Use - -### ❌ Need Citations -**Problem:** Brave doesn't include source citations -**Alternative:** Tavily (includes citations by default) - -**Example:** -``` -Bad: brave_web_search("climate change facts") -Good: tavily_search("climate change facts") # Returns with citations -``` - -### ❌ Semantic/Conceptual Search -**Problem:** Brave uses keyword matching, not semantic understanding -**Alternative:** Tavily (better semantic understanding) - -**Example:** -``` -Bad: brave_web_search("concepts similar to event sourcing") -Good: tavily_search("concepts similar to event sourcing") -``` - -### ❌ Content Extraction -**Problem:** Brave only returns search results, not full content -**Alternative:** Tavily extract or Fetch - -**Example:** -``` -Bad: brave_web_search + manual extraction -Good: tavily_extract(["url1", "url2"]) -``` - -### ❌ Multi-Page Crawling -**Problem:** Brave doesn't crawl websites -**Alternative:** Tavily crawl - -**Example:** -``` -Bad: brave_web_search + trying to crawl results -Good: tavily_crawl("https://docs.example.com") -``` - -### ❌ Local Search (Free Tier) -**Problem:** Local search 
requires Pro plan -**Alternative:** Use web search with location terms - -**Example:** -``` -Bad: brave_local_search("restaurants") # Fails on free tier -Good: brave_web_search("restaurants in [city]") -``` - -## When Brave IS the Right Choice - -✅ **Tavily credits exhausted** (monthly reset) -✅ **Privacy-focused results** needed -✅ **Basic web search** sufficient -✅ **Video/image discovery** required -✅ **News search** for current events - -**Decision rule:** "Is Tavily available? If not, use Brave." - -## Alternatives Summary - -| Task | Instead of Brave | Use This | -|------|-----------------|----------| -| Search with citations | web_search | Tavily | -| Semantic search | web_search | Exa | -| Content extraction | web_search | Tavily extract / Fetch | -| Multi-page crawl | web_search | Tavily crawl | -| Local businesses | local_search (free) | web_search with location | - -## Best Practices - -**Use Brave as backup:** -- Primary: Tavily (citations, extraction, crawl) -- Secondary: Brave (basic search when Tavily low) -- Tertiary: Exa (semantic search) - -**Optimize queries:** -- Use specific keywords (not semantic) -- Add location terms for local results -- Use `freshness` for time-sensitive queries -- Filter with `result_filter` for specific content types - -**Monitor limits:** -- 2k/month resets monthly -- Track usage to avoid exhaustion -- Switch to Fetch for known URLs - -**Free tier constraints:** -- Web, Image, Video, and News endpoints are available -- Local Search and Summarizer require Pro; calls to these will fail on free tier -- You can use `result_filter` to scope web results; for full metadata use the dedicated endpoints - -## Quick Reference - -**Total budget:** 2,000 queries/month -**Rate limits:** Monthly reset -**Reset:** Monthly (varies by account) -**Cost:** Free tier, Pro for advanced features - -**Links:** -- [Category guide: Web research](../category/web-research.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff 
--git a/protocols/context/static/deep-dives/claude-mem.md b/protocols/context/static/deep-dives/claude-mem.md deleted file mode 100644 index ce27d2c8..00000000 --- a/protocols/context/static/deep-dives/claude-mem.md +++ /dev/null @@ -1,362 +0,0 @@ -# claude-mem: Deep Dive - -## Critical Note - -**Claude Code Only** - Requires Claude Code's hook system. Not available for Codex/Gemini CLIs. - -## Overview - -Automatic persistent memory compression via Claude Code plugin hooks. Zero manual intervention. 7 search tools available. - -## How It Works (Automatic) - -### 5 Lifecycle Hooks (No Manual Calls) - -**1. SessionStart:** -- Injects summaries from last 10 sessions -- Progressive disclosure (token costs visible) -- Layered timeline with color-coded priority - -**2. UserPromptSubmit:** -- Creates session record automatically -- Saves raw user prompts for search -- No agent action required - -**3. PostToolUse:** -- Fires after EVERY tool execution (Read, Write, Edit, Bash, etc.) -- Captures observations automatically -- No agent intervention needed - -**4. Stop:** -- Generates session summaries -- Includes: request, completed, learned, next_steps -- Runs automatically when session pauses - -**5. SessionEnd:** -- Marks sessions complete -- Graceful cleanup -- Preserves work across `/clear` - -### Worker Service (PM2-Managed) - -- Express server on port 37777 -- Processes observations via Claude Agent SDK -- Extracts structured learnings (decisions, bugfixes, features, refactors, discoveries, changes) -- Auto-starts when first session begins - -### SQLite Database - -- Location: `~/.claude-mem/claude-mem.db` -- FTS5 full-text search with SQL injection protection -- Tracks files read/modified, concepts, types, relationships - -## Available Search Tools (7 Manual Tools) - -### 1. 
`search_observations` - Full-Text Observation Search - -**What it does:** Search across observation titles, narratives, facts, concepts - -**Parameters:** -- `query` (required) - FTS5 search query -- `format` (default "index") - "index" | "full" -- `type` - Filter: "decision", "bugfix", "feature", "refactor", "discovery", "change" -- `concepts` - Filter by concept tags -- `files` - Filter by file paths (partial match) -- `project` - Filter by project name -- `dateRange` - Filter by date range -- `limit` (default 20, max 100) -- `offset` (default 0) -- `orderBy` (default "relevance") - "relevance" | "date_desc" | "date_asc" - -**Returns:** Observations with metadata - -**Best for:** Finding past work, decisions, discoveries - -**CRITICAL:** Always start with `format: "index"` (50-100 tokens/result) before `format: "full"` (500-1000 tokens/result) - -### 2. `search_sessions` - Full-Text Session Search - -**What it does:** Search across session summaries (requests, completions, learnings, notes) - -**Parameters:** -- `query` (required) -- `format` (default "index") -- `project`, `dateRange`, `limit`, `offset`, `orderBy` - -**Returns:** Session summaries - -**Best for:** Finding past sessions, understanding project history - -### 3. `search_user_prompts` - Raw User Request Search - -**What it does:** Search what user actually said/requested - -**Parameters:** -- `query` (required) -- `format` (default "index") -- `project`, `dateRange`, `limit`, `offset`, `orderBy` - -**Returns:** User prompts (truncated in index, full in full format) - -**Best for:** Tracing user intent → implementation - -### 4. 
`find_by_concept` - Filter by Concept Tags - -**What it does:** Find observations tagged with specific concept - -**Parameters:** -- `concept` (required) - Concept tag (e.g., "architecture", "security") -- `format` (default "index") -- `project`, `dateRange`, `limit`, `offset`, `orderBy` - -**Returns:** Observations with that concept - -**Best for:** Topic-based discovery - -**CRITICAL:** Start with `limit: 3-5` even in index mode (avoid MCP token limits) - -### 5. `find_by_file` - Find Work on Specific Files - -**What it does:** Find all observations/sessions referencing a file - -**Parameters:** -- `filePath` (required) - File path (supports partial matching) -- `format` (default "index") -- `project`, `dateRange`, `limit`, `offset`, `orderBy` - -**Returns:** Observations and sessions related to file - -**Best for:** File history, understanding changes - -### 6. `find_by_type` - Filter by Observation Type - -**What it does:** Find observations of specific type - -**Parameters:** -- `type` (required) - "decision" | "bugfix" | "feature" | "refactor" | "discovery" | "change" -- `format` (default "index") -- `project`, `dateRange`, `limit`, `offset`, `orderBy` - -**Returns:** Typed observations - -**Best for:** Finding all decisions, bugfixes, etc. - -### 7. `get_recent_context` - Recent Session Context - -**What it does:** Get recent session context for debugging/recovery - -**Parameters:** -- `project` (optional, defaults to current directory basename) -- `limit` (default 3, max 10) - Number of recent sessions - -**Returns:** Recent sessions and observations - -**Best for:** Recovery, understanding recent work - -## Tradeoffs - -### Advantages -✅ **Fully automatic** (zero intervention) -✅ **Captures everything** (all tool executions) -✅ **Progressive disclosure** (index → full) -✅ **FTS5 full-text search** (powerful queries) -✅ **Typed observations** (decisions, bugfixes, etc.) 
-✅ **Claude Agent SDK** (AI extraction) - -### Disadvantages -❌ **Claude Code only** (requires hooks) -❌ **No manual control** (automatic everything) -❌ **Different from Qdrant/Memory** (FTS5 vs vector/graph) -❌ **Token limits** on search results (use index first) - -## Common Pitfalls: When NOT to Use - -### ❌ Not Using Index Format First -**Problem:** Full format consumes 10x tokens -**Alternative:** Always start with format: "index" - -**Example:** -``` -Bad: search_observations("auth", format: "full") - → 500-1000 tokens per result - -Good: search_observations("auth", format: "index") - → 50-100 tokens per result - → Then fetch specific items with full format -``` - -### ❌ High Limit Without Index -**Problem:** Exceeds MCP token limits -**Alternative:** Start with limit: 3-5, even in index mode - -**Example:** -``` -Bad: find_by_concept("architecture", limit: 20) - → May exceed token limits - -Good: find_by_concept("architecture", limit: 5, format: "index") - → Check results, increase if needed -``` - -### ❌ Using on Codex/Gemini -**Problem:** claude-mem requires Claude Code hooks -**Alternative:** Manual Qdrant + Memory MCP workflow - -**Example:** -``` -Bad: search_observations on Codex - → Not available - -Good: Manual qdrant-store + memory.create_entities workflow -``` - -### ❌ Expecting Manual Saves -**Problem:** claude-mem is automatic, no manual save -**Alternative:** Qdrant/Memory for manual control - -**Example:** -``` -Bad: Trying to manually trigger observation capture - → Automatic via hooks - -Good: Use Qdrant for manual, selective saves -``` - -### ❌ Graph Queries -**Problem:** claude-mem doesn't support relationship graphs -**Alternative:** Memory MCP - -**Example:** -``` -Bad: search_observations for "who created what" -Good: memory.search_nodes for relationship queries -``` - -## When claude-mem IS the Right Choice - -✅ **Claude Code users** (required) -✅ **Automatic context** preservation -✅ **Cross-session memory** without manual saves -✅ 
**Finding past work/decisions** -✅ **Tracing user intent** → implementation - -**Decision rule:** "Am I using Claude Code and need automatic context?" - -## Usage Patterns - -**Progressive disclosure workflow:** -``` -1. search_observations("authentication", format: "index", limit: 5) - → See titles, dates, concepts (50-100 tokens each) - -2. Review index results, pick relevant IDs - -3. search_observations("authentication", format: "full") - → Get full details for specific items (500-1000 tokens each) -``` - -**Find recent work:** -``` -get_recent_context(project: "my-app", limit: 3) -→ Last 3 sessions with summaries -``` - -**Trace user requests:** -``` -search_user_prompts("implement JWT auth") -→ Find when user requested JWT implementation -→ Trace to observations/sessions -``` - -**Find by type:** -``` -find_by_type("decision", limit: 5, format: "index") -→ Recent architectural decisions - -find_by_type("bugfix", limit: 5, format: "index") -→ Recent bug fixes -``` - -**File history:** -``` -find_by_file("auth.ts", format: "index") -→ All work on auth.ts file -``` - -**Concept-based:** -``` -find_by_concept("security", limit: 5, format: "index") -→ Security-related observations -``` - -## Codex/Gemini Replication - -**Since claude-mem requires hooks, manually replicate with Qdrant + Memory:** - -**Manual observation logging:** -- After reading code → qdrant-store discoveries -- After decisions → memory.create_entities with relations -- After fixes → qdrant-store bugfix notes - -**Session summaries:** -- Before ending → Create summary, qdrant-store -- Include: request, completed, learned, next_steps - -**Context recovery:** -- Start sessions → qdrant-find for past work -- Fetch relations from Memory MCP - -**Key differences:** -- claude-mem: automatic, zero intervention -- Qdrant+Memory: manual, requires discipline -- claude-mem: FTS5 search, Qdrant: vector search -- claude-mem: typed observations, manual tagging needed - -## Alternatives Summary - -| Task | 
Instead of claude-mem | Use This | -|------|--------------------|----------| -| Manual control | claude-mem | Qdrant / Memory | -| Codex/Gemini | claude-mem | Qdrant + Memory workflow | -| Graph queries | search | Memory MCP | -| Semantic search | search | Qdrant | - -## Best Practices - -**Always use index first:** -- Start with `format: "index"` -- Review results -- Fetch full details only for relevant items - -**Limit wisely:** -- Even in index mode, start with `limit: 3-5` -- Increase if needed -- Watch for token limit warnings - -**Use appropriate search:** -- `search_observations`: General work/decisions/discoveries -- `search_sessions`: Session-level context -- `search_user_prompts`: Trace user requests -- `find_by_concept`: Topic-based -- `find_by_file`: File-specific history -- `find_by_type`: Decision/bugfix/feature/etc. - -**Citations:** -- Results use `claude-mem://` URIs -- Include in outputs for traceability - -## Quick Reference - -**Availability:** Claude Code only (requires hooks) -**Storage:** SQLite (`~/.claude-mem/claude-mem.db`) -**Search:** FTS5 full-text search -**Best for:** Automatic context preservation (Claude Code users) -**Avoid for:** Manual control, Codex/Gemini, graph queries - -**Critical pattern:** Always `format: "index"` first, then `format: "full"` -**Token costs:** Index 50-100/result, Full 500-1000/result - -**Links:** -- [GitHub repo: thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) -- [Category guide: Memory](../category/memory.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/context7.md b/protocols/context/static/deep-dives/context7.md deleted file mode 100644 index ceeaa1a1..00000000 --- a/protocols/context/static/deep-dives/context7.md +++ /dev/null @@ -1,252 +0,0 @@ -# Context7 MCP: Deep Dive - -## Overview - -Fetches up-to-date, version-specific API documentation and code examples from official sources. Public repos only on free tier. 
- -## Available Tools - -### 1. `resolve-library-id` - Get Context7 Library ID - -**What it does:** Converts package/library name to Context7-compatible ID - -**Parameters:** -- `libraryName` (required) - Name to search (e.g., "Next.js", "React", "pandas") - -**Returns:** List of matching libraries with Context7 IDs - -**Best for:** Finding the correct library ID before fetching docs - -**Rate limits:** None apparent - -**Required:** Must call before `get-library-docs` (unless user provides ID in `/org/project` format) - -### 2. `get-library-docs` - Fetch Documentation - -**What it does:** Retrieves official documentation for a library - -**Parameters:** -- `context7CompatibleLibraryID` (required) - ID from resolve step or `/org/project` format -- `topic` (optional) - Focus area (e.g., "hooks", "routing", "authentication") -- `tokens` (optional, default 5000) - Max tokens to return - -**Returns:** Official documentation with code examples - -**Best for:** Learning APIs, checking syntax, getting official examples - -**Rate limits:** None apparent (free tier) - -## Tradeoffs - -### Advantages -✅ **Version-specific docs** (accurate for specific versions) -✅ **Official sources** (authoritative) -✅ **Code examples included** (from official docs) -✅ **No apparent limits** (free tier) -✅ **Focused topics** (narrow results to relevant sections) - -### Disadvantages -❌ **Public repos only** (free tier restriction) -❌ **Not for code search** (use Sourcegraph) -❌ **Not for tutorials** (use web search) -❌ **Requires two-step** process (resolve → fetch) - -## Common Pitfalls: When NOT to Use - -### ❌ Real-World Code Examples -**Problem:** Context7 returns official docs, not real usage -**Alternative:** Sourcegraph - -**Example:** -``` -Bad: get-library-docs for real-world examples -Good: sourcegraph.search("repo:.*react.* useEffect") -``` - -### ❌ Community Tutorials/Guides -**Problem:** Context7 fetches official docs, not blog posts -**Alternative:** Tavily - -**Example:** 
-``` -Bad: resolve-library-id("React tutorial") -Good: tavily_search("React hooks tutorial") -``` - -### ❌ Private/Organization Libraries -**Problem:** Free tier only supports public repos -**Alternative:** Clone docs + Read tool - -**Example:** -``` -Bad: resolve-library-id("internal-company-lib") -Good: Clone docs repo → Read files -``` - -### ❌ Non-Documentation Content -**Problem:** Context7 fetches docs, not code/tests/examples -**Alternative:** Sourcegraph or clone repo - -**Example:** -``` -Bad: get-library-docs for test examples -Good: sourcegraph.search("file:test repo:library") -``` - -### ❌ General Web Search -**Problem:** Context7 is for specific library docs -**Alternative:** Tavily or Brave - -**Example:** -``` -Bad: resolve-library-id("best practices for React") -Good: tavily_search("React best practices 2024") -``` - -### ❌ Skip Resolve Step -**Problem:** get-library-docs requires exact Context7 ID -**Alternative:** Always call resolve-library-id first - -**Example:** -``` -Bad: get-library-docs("React") # Wrong ID format -Good: resolve-library-id("React") → get-library-docs("/facebook/react") -``` - -## When Context7 IS the Right Choice - -✅ **Official API documentation** needed -✅ **Version-specific syntax** required -✅ **Learning new library/framework** -✅ **Checking current API** -✅ **Getting official examples** - -**Decision rule:** "Do I need official, authoritative documentation?" 
- -## Usage Patterns - -**Two-step workflow (required):** -``` -Step 1: resolve-library-id("Next.js") - → Returns: {id: "/vercel/next.js", ...} - -Step 2: get-library-docs("/vercel/next.js", topic="routing") - → Returns: Official Next.js routing documentation -``` - -**Version-specific queries:** -``` -resolve-library-id("React 18") -→ get-library-docs with React 18 docs - -resolve-library-id("Next.js 14") -→ get-library-docs with Next.js 14 docs -``` - -**Focused topic retrieval:** -``` -get-library-docs("/vercel/next.js", topic="app router") -get-library-docs("/facebook/react", topic="hooks") -get-library-docs("/expressjs/express", topic="middleware") -``` - -**Adjust token limit:** -``` -get-library-docs(id, topic, tokens=10000) # More context -get-library-docs(id, topic, tokens=2000) # Quick lookup -``` - -**User-provided ID (skip resolve):** -``` -If user says: "Get docs for /vercel/next.js" -→ Skip resolve-library-id -→ Direct: get-library-docs("/vercel/next.js") -``` - -## Integration Workflow - -**Typical learning workflow:** -``` -1. Context7: Get official API docs -2. Sourcegraph: Find real-world usage examples -3. 
Tavily: Find tutorials/guides if needed -``` - -**Example:** -``` -Task: "Learn React Server Components" - -Step 1: resolve-library-id("React") - → /facebook/react - -Step 2: get-library-docs("/facebook/react", topic="server components") - → Official React docs on Server Components - -Step 3: sourcegraph.search("repo:.*react.* server component") - → Real implementations in public repos - -Step 4: tavily_search("React server components tutorial") - → Blog posts explaining concepts -``` - -## Selection Logic for resolve-library-id - -**When results have multiple matches:** -- Prioritize exact name matches -- Consider description relevance -- Check documentation coverage (higher is better) -- Verify trust score (7-10 more authoritative) - -**Return format:** -- Selected library ID clearly marked -- Brief explanation for choice -- Acknowledge other matches if relevant - -## Alternatives Summary - -| Task | Instead of Context7 | Use This | -|------|-------------------|----------| -| Real-world examples | get-library-docs | Sourcegraph | -| Tutorials/guides | get-library-docs | Tavily | -| Private libraries | get-library-docs | Clone + Read | -| Code search | get-library-docs | Sourcegraph | -| General web search | resolve-library-id | Tavily / Brave | - -## Best Practices - -**Always resolve first:** -- Call `resolve-library-id` before `get-library-docs` -- Exception: User provides `/org/project` format - -**Use focused topics:** -- Narrow results to relevant sections -- Examples: "hooks", "routing", "authentication" -- More specific = more relevant results - -**Adjust token limits:** -- Default 5000 works for most cases -- Increase for comprehensive docs (up to 50000) -- Decrease for quick lookups (down to 1000) - -**Version-specific queries:** -- Include version in library name when known -- "React 18", "Next.js 14", "Vue 3" -- Context7 returns version-appropriate docs - -**Combine with other tools:** -- Context7 → Official docs -- Sourcegraph → Real usage -- Tavily → 
Tutorials/best practices - -## Quick Reference - -**Rate limits:** None apparent (free tier) -**Coverage:** Public repos only -**Best for:** Official API docs, version-specific syntax -**Avoid for:** Real examples, tutorials, private libs - -**Required workflow:** resolve-library-id → get-library-docs - -**Links:** -- [Category guide: Documentation](../category/documentation.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/memory.md b/protocols/context/static/deep-dives/memory.md deleted file mode 100644 index 781d709e..00000000 --- a/protocols/context/static/deep-dives/memory.md +++ /dev/null @@ -1,351 +0,0 @@ -# Memory MCP: Deep Dive - -## Overview - -Knowledge graph with entities, relations, and observations. Tracks explicit relationships between concepts. Unlimited local JSONL storage. - -## Available Tools - -### 1. `create_entities` - Create Graph Nodes - -**What it does:** Creates entities (nodes) in knowledge graph - -**Parameters:** -- `entities` (required) - Array of entities to create - - Each: `{name, entityType, observations[]}` - -**Returns:** Confirmation of creation - -**Best for:** Adding people, projects, concepts, organizations - -**Rate limits:** None - -### 2. `create_relations` - Define Relationships - -**What it does:** Creates directed relationships between entities - -**Parameters:** -- `relations` (required) - Array of relations - - Each: `{from, to, relationType}` (active voice) - -**Returns:** Confirmation of creation - -**Best for:** Linking entities with explicit relationships - -**Rate limits:** None - -### 3. `add_observations` - Add Facts to Entities - -**What it does:** Adds observations (facts) to existing entities - -**Parameters:** -- `observations` (required) - Array of observations - - Each: `{entityName, contents[]}` - -**Returns:** Confirmation of additions - -**Best for:** Adding new facts to existing entities - -**Rate limits:** None - -### 4. 
`delete_entities` - Remove Entities - -**What it does:** Deletes entities and their relations - -**Parameters:** -- `entityNames` (required) - Array of entity names to delete - -**Returns:** Confirmation of deletion - -**Best for:** Cleanup, removing outdated entities - -**Rate limits:** None - -### 5. `delete_observations` - Remove Specific Facts - -**What it does:** Deletes specific observations from entities - -**Parameters:** -- `deletions` (required) - Array of deletions - - Each: `{entityName, observations[]}` - -**Returns:** Confirmation of deletion - -**Best for:** Removing outdated or incorrect facts - -**Rate limits:** None - -### 6. `delete_relations` - Remove Relationships - -**What it does:** Deletes specific relationships - -**Parameters:** -- `relations` (required) - Array of relations to delete - - Each: `{from, to, relationType}` - -**Returns:** Confirmation of deletion - -**Best for:** Removing incorrect or outdated relationships - -**Rate limits:** None - -### 7. `read_graph` - View Entire Graph - -**What it does:** Returns complete knowledge graph - -**Parameters:** None - -**Returns:** All entities, relations, observations - -**Best for:** Full graph visualization, debugging - -**Rate limits:** None - -### 8. `search_nodes` - Search Entities - -**What it does:** Searches entities by name, type, or observation content - -**Parameters:** -- `query` (required) - Search query - -**Returns:** Matching entities with details - -**Best for:** Finding specific entities or concepts - -**Rate limits:** None - -### 9. 
`open_nodes` - Get Specific Entities - -**What it does:** Retrieves specific entities by name - -**Parameters:** -- `names` (required) - Array of entity names - -**Returns:** Requested entities with all details - -**Best for:** Getting known entities - -**Rate limits:** None - -## Tradeoffs - -### Advantages -✅ **Explicit relationships** (track who/what/how) -✅ **Graph queries** (find relationships) -✅ **Structured storage** (entities/relations/observations) -✅ **Unlimited** (local JSONL file) -✅ **Persistent** (cross-session) -✅ **Official Anthropic** implementation - -### Disadvantages -❌ **No semantic search** (use Qdrant for that) -❌ **Manual management** (not automatic like claude-mem) -❌ **Requires discipline** (must create/update manually) -❌ **No similarity** matching (exact names/queries) - -## Common Pitfalls: When NOT to Use - -### ❌ Need Semantic/Similarity Search -**Problem:** Memory uses exact matching, not semantic -**Alternative:** Qdrant - -**Example:** -``` -Bad: memory.search_nodes("authentication patterns") - → Requires exact observation text match - -Good: qdrant-find("authentication patterns") - → Semantic search finds JWT, OAuth, etc. 
-``` - -### ❌ Simple Note-Taking -**Problem:** Graph structure overkill for unstructured notes -**Alternative:** Qdrant or filesystem - -**Example:** -``` -Bad: create_entities for quick notes -Good: qdrant-store or Write to file -``` - -### ❌ Temporary Context -**Problem:** Permanent storage for ephemeral data -**Alternative:** Conversation context - -**Example:** -``` -Bad: create_entities for current session calculations -Good: Keep in conversation context -``` - -### ❌ Relationships Don't Matter -**Problem:** Graph overhead when relationships aren't needed -**Alternative:** Qdrant - -**Example:** -``` -Bad: memory for independent code snippets -Good: qdrant-store for searchable snippets -``` - -### ❌ Using Passive Voice Relations -**Problem:** Confusing relationship direction -**Alternative:** Active voice - -**Example:** -``` -Bad: create_relations({from: "Company", to: "John", relationType: "employs"}) - → Backwards - -Good: create_relations({from: "John", to: "Company", relationType: "works_at"}) - → Active voice, clear direction -``` - -## When Memory MCP IS the Right Choice - -✅ **Track relationships** (who → works_at → where) -✅ **Project context** (components, dependencies, owners) -✅ **Personal CRM** (people, companies, connections) -✅ **Dependency graphs** (X → depends_on → Y) -✅ **Structured knowledge** with clear connections - -**Decision rule:** "Do relationships between items matter?" 
- -## Usage Patterns - -**Create entity with observations:** -``` -create_entities([ - { - name: "John_Smith", - entityType: "person", - observations: ["Speaks Spanish", "Prefers async communication"] - } -]) -``` - -**Create relationships:** -``` -create_relations([ - {from: "John_Smith", to: "Anthropic", relationType: "works_at"}, - {from: "John_Smith", to: "ProjectX", relationType: "contributes_to"} -]) -``` - -**Add new facts:** -``` -add_observations([ - { - entityName: "John_Smith", - contents: ["Expert in TypeScript", "Located in San Francisco"] - } -]) -``` - -**Query relationships:** -``` -search_nodes("works_at Anthropic") -→ Find all people who work at Anthropic - -search_nodes("ProjectX") -→ Find all entities related to ProjectX - -open_nodes(["John_Smith"]) -→ Get John_Smith with all observations and relations -``` - -**Example graph structure:** -``` -Entity: John_Smith (type: person) - Observations: ["Speaks Spanish", "Async communication"] - Relations: - John_Smith --works_at--> Anthropic - John_Smith --contributes_to--> ProjectX - -Entity: ProjectX (type: project) - Observations: ["React-based", "TypeScript", "Open source"] - Relations: - ProjectX --depends_on--> React - ProjectX --deployed_on--> Vercel - -Entity: Anthropic (type: company) - Observations: ["AI safety research", "San Francisco"] -``` - -## Best Practices - -**Entity naming:** -- Use underscores: `John_Smith`, `ProjectX` -- Unique names: `ProjectX_v2` vs `ProjectX_v1` -- Consistent: Don't mix `John Smith` and `John_Smith` - -**Relation types (active voice):** -- ✅ `works_at`, `manages`, `depends_on`, `created_by` -- ❌ `employed_by`, `managed_by`, `is_used_by` - -**Observations (atomic):** -- ✅ ["Speaks Spanish", "Graduated 2019"] -- ❌ ["Speaks Spanish and graduated in 2019"] - -**Entity types (consistent):** -- Use standard types: `person`, `company`, `project`, `concept` -- Be consistent across graph - -**Check before creating:** -``` -search_nodes("John_Smith") -→ Check if exists 
before creating duplicate -``` - -**Relations are directed:** -``` -from: "John" to: "Company" type: "works_at" -≠ from: "Company" to: "John" type: "works_at" -``` - -## Combining with Qdrant - -**Use both for rich knowledge:** -``` -Qdrant: Store searchable content (find by meaning) -Memory: Track relationships (X relates to Y) - -Example: - Qdrant: Store "JWT authentication implementation guide" - Memory: ProjectA --uses--> JWT_pattern --created_by--> John -``` - -**Workflow:** -``` -1. memory: Create entities (people, projects, concepts) -2. memory: Create relations (map dependencies, ownership) -3. qdrant: Store detailed content (code, docs, notes) -4. search: memory for relationships, qdrant for content -``` - -## Alternatives Summary - -| Task | Instead of Memory | Use This | -|------|------------------|----------| -| Semantic search | search_nodes | Qdrant | -| Simple notes | create_entities | Qdrant / Filesystem | -| Temporary data | create_entities | Conversation context | -| No relationships | memory tools | Qdrant | - -## Quick Reference - -**Rate limits:** None (local JSONL) -**Storage:** JSONL file (local filesystem) -**Persistence:** Across sessions (explicit deletion needed) -**Best for:** Relationships, structured knowledge, graph queries -**Avoid for:** Semantic search, temporary data, simple notes - -**Relation direction:** Always active voice (from → to) -**Entity names:** Use underscores, unique identifiers -**Observations:** Atomic facts (one per observation) - -**Storage location:** `MEMORY_MCP_STORAGE_PATH` env var or default `~/.memory-mcp/memory.jsonl` - -**Links:** -- [Category guide: Memory](../category/memory.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/pal.md b/protocols/context/static/deep-dives/pal.md deleted file mode 100644 index 27d4b9bb..00000000 --- a/protocols/context/static/deep-dives/pal.md +++ /dev/null @@ -1,291 +0,0 @@ -# PAL MCP: Deep Dive - -## 
Critical Note - -**`clink`-only setup** — This guide documents our usage of PAL MCP strictly via clink to bridge to external AI CLIs. PAL MCP is accessed by agent clients, and in this configuration we do not configure provider API keys in PAL; we rely solely on clink. - -## Overview - -CLI-to-CLI bridge enabling multi-model orchestration, sub-agent spawning, and context threading across different AI CLIs. - -## Available Tools - -### 1. `clink` - Cross-Model CLI Orchestration - -**What it does:** Links current request to external AI CLI (Gemini, Codex, Claude) through PAL MCP - -**Parameters:** -- `prompt` (required) - User request forwarded to CLI -- `cli_name` (optional; required if multiple CLIs are configured) - Which CLI to use: "claude" | "codex" | "gemini". Defaults to "gemini" if present, otherwise the first configured CLI. -- `role` (optional, default "default") - Role preset for selected CLI -- `continuation_id` (optional) - Thread ID for multi-turn conversations -- `absolute_file_paths` (optional) - Array of absolute file/folder paths (must be full absolute paths; do not shorten) -- `images` (optional) - Array of image paths or base64 blobs - -**Available roles per CLI:** -- **claude**: codereviewer, default, planner -- **codex**: codereviewer, default, planner -- **gemini**: codereviewer, default, planner - -**Returns:** Response from external CLI with context preserved - -**Best for:** Cross-model orchestration, specialized tasks per model - -**Rate limits:** None (depends on target CLI's limits) - -**CRITICAL:** Always reuse last continuation_id to preserve conversation context - -### 2. `listmodels` - Show Available Models - -**What it does:** Lists configured AI model providers, names, aliases, capabilities - -**Parameters:** None - -**Returns:** Model providers, model names, aliases, capabilities - -**Best for:** Understanding available models before using clink - -**Rate limits:** None - -### 3. 
`version` - Server Information - -**What it does:** Shows server version, config, available tools - -**Parameters:** None - -**Returns:** Version info, configuration details, tool list - -**Best for:** Debugging, verifying setup - -**Rate limits:** None - -## Tradeoffs - -### Advantages -✅ **Multi-model orchestration** (use best model for each task) -✅ **Cross-CLI context** (continuation_id preserves full history) -✅ **Role-based delegation** (codereviewer, planner, etc.) -✅ **File/image context** (pass context to other CLIs) -✅ **Seamless handoff** (agent resumes with full context) - -### Disadvantages -❌ **clink bridge required** (invoked from your agent CLI) -❌ **Requires PAL setup** (local server running) -❌ **Accessed via agent clients** (e.g., Claude Code, Gemini CLI) - -## Common Pitfalls: When NOT to Use - -### ❌ Single-Model Tasks -**Problem:** clink adds overhead for same-model operations -**Alternative:** Use current CLI directly - -**Example:** -``` -Bad: clink to same CLI for simple task -Good: Handle directly in current CLI -``` - -### ❌ Not Reusing continuation_id -**Problem:** Loses conversation context across calls -**Alternative:** Always reuse last continuation_id - -**Example:** -``` -Bad: clink without continuation_id (new conversation each time) -Good: clink(continuation_id: "last_id") → preserves full context -``` - -### ❌ Tasks Not Requiring Other Models -**Problem:** Unnecessary cross-CLI call -**Alternative:** Keep work in current CLI - -**Example:** -``` -Bad: clink for task current model handles well -Good: Complete task in current CLI -``` - -### ❌ Using Without Role Specification -**Problem:** Misses specialized role capabilities -**Alternative:** Specify role for focused execution - -**Example:** -``` -Bad: clink(cli_name: "codex") # Default role -Good: clink(cli_name: "codex", role: "codereviewer") -``` - -### ❌ Not Passing Relevant Context -**Problem:** Other CLI lacks necessary files/images -**Alternative:** Include files/images 
parameters - -**Example:** -``` -Bad: clink without files for code review task -Good: clink(files: ["/path/to/code.js"], role: "codereviewer") -``` - -## When PAL MCP IS the Right Choice - -✅ **Multi-model orchestration** needed -✅ **Specialized roles** per model (review, plan, implement) -✅ **Cross-CLI workflows** (research → plan → implement) -✅ **Model-specific strengths** (Gemini for X, Codex for Y) - -**Decision rule:** "Does this task benefit from a different model/role?" - -## Usage Patterns - -**Basic clink call:** -``` -clink( - prompt: "Review this code for security issues", - cli_name: "codex", - role: "codereviewer", - files: ["/path/to/auth.js"] -) -``` - -**Multi-turn conversation (CRITICAL):** -``` -First call: -clink( - prompt: "Plan refactoring for auth system", - cli_name: "claude", - role: "planner" -) -→ Returns: status: continuation_available with continuation_offer.continuation_id: "abc123" - -Second call (REUSE continuation_id): -clink( - prompt: "Now implement step 1 from the plan", - cli_name: "codex", - role: "default", - continuation_id: "abc123" # CRITICAL: Preserves context -) -``` - -**Role-specific delegation:** -``` -Code review: -clink(cli_name: "codex", role: "codereviewer", files: [...]) - -Planning: -clink(cli_name: "claude", role: "planner", files: [...]) - -Implementation: -clink(cli_name: "gemini", role: "default", files: [...]) -``` - -**With images:** -``` -clink( - prompt: "Analyze this diagram", - cli_name: "gemini", - images: ["/path/to/diagram.png"] -) -``` - -**Check available models:** -``` -listmodels() -→ See all configured models, aliases, capabilities -``` - -## Multi-Model Workflow Example - -**Research → Plan → Implement:** -``` -Step 1 (Research): -clink( - prompt: "Research best practices for microservices auth", - cli_name: "claude", - role: "default" -) -→ continuation_id: "id1" - -Step 2 (Plan): -clink( - prompt: "Create implementation plan based on research", - cli_name: "claude", - role: "planner", - 
continuation_id: "id1" -) -→ continuation_id: "id2" - -Step 3 (Implement): -clink( - prompt: "Implement auth service from plan", - cli_name: "codex", - role: "default", - continuation_id: "id2", - files: ["/project/auth/"] -) -→ continuation_id: "id3" - -Step 4 (Review): -clink( - prompt: "Review implemented code for security", - cli_name: "codex", - role: "codereviewer", - continuation_id: "id3", - files: ["/project/auth/service.js"] -) -``` - -## Best Practices - -**Always preserve context:** -- **CRITICAL:** Reuse continuation_id from previous calls -- Full conversation history preserved -- Files, findings, decisions carry forward - -**Choose right model per task:** -- Claude: Planning, architecture, complex reasoning -- Codex: Code implementation, debugging -- Gemini: Visual analysis, broad tasks - -**Use appropriate roles:** -- `codereviewer`: Security, quality, best practices -- `planner`: Architecture, migration, phased plans -- `default`: General implementation - -**Pass necessary context:** -- Files: Absolute paths (don't shorten) -- Images: Paths or base64 blobs -- Prompt: Clear task description - -**Workflow patterns:** -``` -Research (Claude) → Plan (Claude/planner) → -Implement (Codex) → Review (Codex/codereviewer) -``` - -## Alternatives Summary - -| Task | Instead of PAL/clink | Use This | -|------|---------------------|----------| -| Single-model task | clink | Current CLI directly | -| Same CLI, same role | clink | Direct execution | -| No multi-turn needed | clink | Single CLI call | - -## Quick Reference - -**Availability:** clink orchestration layer only -**Rate limits:** Depends on target CLI -**Best for:** Multi-model workflows, specialized roles -**Avoid for:** Single-model tasks, no role specialization needed - -**CLIs available:** claude, codex, gemini -**Roles available:** codereviewer, default, planner (per CLI) - -**CRITICAL:** Always reuse continuation_id for context preservation - -**Typical workflow:** -1. Research (Claude) → -2. 
Plan (Claude/planner) → -3. Implement (Codex) → -4. Review (Codex/codereviewer) - -**Links:** -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/playwright.md b/protocols/context/static/deep-dives/playwright.md deleted file mode 100644 index 7389f782..00000000 --- a/protocols/context/static/deep-dives/playwright.md +++ /dev/null @@ -1,198 +0,0 @@ -# Playwright MCP: Deep Dive - -## Overview - -- Fast, lightweight browser automation using Playwright's accessibility tree. -- No vision models needed; operates on structured data for deterministic, reliable web interactions. - -## Available Tools - -### Core Navigation & Interaction Tools - -1. **`playwright_navigate`** - Navigate to URL -2. **`playwright_click`** - Click elements by selector/text -3. **`playwright_type`** - Type text into input fields -4. **`playwright_evaluate`** - Execute JavaScript in page context -5. **`playwright_screenshot`** - Capture screenshots (optional, for debugging) - -### Content Extraction Tools - -6. **`playwright_extract`** - Extract structured data from page -7. **`playwright_get_accessibility_tree`** - Get full accessibility snapshot - -### Session Management - -8. **`playwright_new_page`** - Create new browser page/tab -9. **`playwright_close_page`** - Close current page -10. 
**`playwright_save_storage_state`** - Save cookies/auth for reuse - -## Key Features - -### Fast & Lightweight -- Uses accessibility tree, not pixel-based input -- No vision models required -- Deterministic tool application - -### Multi-Browser Support -- Chrome/Chromium -- Firefox -- WebKit (Safari) -- Microsoft Edge - -### Configuration Options - -Via CLI args in config (add to `"args"` array): - -**Browser Selection:** -- `--browser chrome|firefox|webkit|msedge` - -**Execution Mode:** -- `--headless` - Run without UI (default: headed) - -**Device Emulation:** -- `--device "iPhone 15"` - Emulate specific devices - -**Session Persistence:** -- `--user-data-dir ` - Persistent browser profile -- `--storage-state ` - Load saved auth state -- `--save-session` - Save session to output directory -- `--isolated` - In-memory profile (no disk writes) - -**Security:** -- `--ignore-https-errors` - Skip certificate validation -- `--no-sandbox` - Disable sandbox (use cautiously) - -**Network:** -- `--proxy-server ` - Use proxy -- `--blocked-origins ` - Block specific origins -- `--allowed-origins ` - Whitelist origins only - -**Debugging:** -- `--save-trace` - Record Playwright trace -- `--save-video ` - Record video (e.g., "800x600") -- `--output-dir ` - Output directory - -**Timeouts:** -- `--timeout-action ` - Action timeout (default: 5000ms) -- `--timeout-navigation ` - Nav timeout (default: 60000ms) - -## Tradeoffs - -### Advantages -✅ No vision models needed (structured data only) -✅ Deterministic, reliable automation -✅ Fast execution via accessibility tree -✅ Multi-browser support -✅ Session persistence and auth reuse -✅ Local execution (privacy-friendly) -✅ Free and unlimited - -### Disadvantages -❌ Requires Node.js runtime (npx) -❌ Only supports stdio transport (per-agent instances) -❌ Can't interact with canvas/WebGL (accessibility-based) -❌ May struggle with highly dynamic shadow DOM -❌ Requires learning Playwright selectors - -## Common Pitfalls: When NOT to Use - 
-### ❌ Simple Static HTML Fetch -**Problem:** Playwright is overkill for static content -**Alternative:** Fetch MCP or Tavily extract - -**Example:** -``` -Bad: playwright_navigate + playwright_extract -Good: fetch("https://example.com/static-page") -``` - -### ❌ Content Already Accessible via API -**Problem:** Direct API calls are faster and more reliable -**Alternative:** Use Fetch MCP with API endpoint - -**Example:** -``` -Bad: Automate website to scrape data -Good: Call REST API directly -``` - -### ❌ Large-Scale Scraping -**Problem:** Running browser instances is resource-intensive -**Alternative:** Use Tavily crawl or Fetch for bulk operations - -**Example:** -``` -Bad: playwright loop over 100 URLs -Good: tavily_crawl or parallel fetch calls -``` - -### ❌ Complex JavaScript Reverse Engineering -**Problem:** Accessibility tree may miss dynamically generated content -**Alternative:** Use browser DevTools or screenshot-based approaches - -## When Playwright IS the Right Choice - -✅ **Form interactions** (login, submit, multi-step flows) -✅ **JavaScript-heavy sites** (SPAs, React/Vue apps) -✅ **Testing web applications** (E2E tests) -✅ **Dynamic content extraction** (infinite scroll, lazy loading) -✅ **Browser automation workflows** (repetitive tasks) -✅ **Authenticated sessions** (persist login state) - -**Decision rule:** "Does this require clicking, typing, or waiting for JS? 
→ Use Playwright" - -## Best Practices - -**Start headed, switch to headless:** -- Debug with `--headless` omitted (see what's happening) -- Production: add `--headless` for faster execution - -**Reuse browser sessions:** -```bash -# Save auth once ---save-session --output-dir ./sessions/ - -# Reuse later ---storage-state ./sessions/storage.json -``` - -**Handle timeouts appropriately:** -- Fast actions: `--timeout-action 3000` -- Slow sites: `--timeout-navigation 120000` - -**Use device emulation for mobile testing:** -```bash ---device "iPhone 15" -``` - -**Enable traces for debugging:** -```bash ---save-trace --output-dir ./traces/ -``` - -**Selector best practices:** -- Prefer text selectors: `"button:has-text('Submit')"` -- Use test IDs: `[data-testid="login-button"]` -- Avoid brittle CSS classes -- Use `--test-id-attribute` for custom test ID attributes - -## Alternatives Summary - -| Task | Instead of Playwright | Use This | -|------|----------------------|----------| -| Static HTML | playwright_navigate | Fetch MCP | -| API access | Browser automation | Direct API call | -| Bulk scraping | Multiple Playwright instances | Tavily crawl | -| Simple extraction | Accessibility tree | Tavily extract | -| Visual testing | Accessibility only | Screenshot-based tools | - -## Quick Reference - -**Transport:** stdio only (per-agent instance) -**Rate limits:** None (local execution) -**Cost:** Free (open source) - -**Links:** -- [Official Playwright MCP README](https://github.com/microsoft/playwright-mcp) -- [Playwright Documentation](https://playwright.dev) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/qdrant.md b/protocols/context/static/deep-dives/qdrant.md deleted file mode 100644 index f4714aa6..00000000 --- a/protocols/context/static/deep-dives/qdrant.md +++ /dev/null @@ -1,267 +0,0 @@ -# Qdrant MCP: Deep Dive - -## Overview - - Semantic memory layer using vector embeddings for meaning-based 
retrieval. Persistent local storage via Docker volume (limited by host disk). - -## Available Tools - -### 1. `qdrant-store` - Save Information - -**What it does:** Stores information with embeddings for semantic search - -**Parameters:** -- `information` (required) - Text to store -- `metadata` (optional) - JSON object with additional context - -**Returns:** Confirmation of storage - -**Best for:** Building personal knowledge base, storing code snippets, saving learnings - -**Rate limits:** None (local Docker or cloud instance) - -### 2. `qdrant-find` - Retrieve by Meaning - -**What it does:** Finds semantically similar information by query - -**Parameters:** -- `query` (required) - What to search for (by meaning) - -**Returns:** Relevant stored information ranked by semantic similarity - -**Best for:** Finding information by concept, discovering related content - -**Rate limits:** None - -## How It Works - -**Vector embeddings:** -- Uses FastEmbed models (default: all-MiniLM-L6-v2) -- Converts text to mathematical vectors -- HNSW index for fast similarity search -- Finds by meaning, not keywords - -**Storage:** -- Local: Docker container with persistent volume -- Cloud: Qdrant cloud instances -- Data persists across sessions - -## Tradeoffs - -### Advantages -✅ **Semantic search** (find by meaning, not keywords) -✅ **Persistent local storage** (Docker volume; limited by host disk) -✅ **Persistent** (survives restarts, cross-session) -✅ **Metadata support** (add structure to searches) -✅ **Fast retrieval** (HNSW index) -✅ **No rate limits** (local or self-hosted) - -### Disadvantages -❌ **No relationship tracking** (use Memory MCP for that) -❌ **Requires manual saving** (not automatic like claude-mem) -❌ **No graph queries** (use Memory MCP) -❌ **Setup required** (Docker or cloud instance) - -## Common Pitfalls: When NOT to Use - -### ❌ Need Explicit Relationships -**Problem:** Qdrant finds similarity, not relationships -**Alternative:** Memory MCP - -**Example:** 
-``` -Bad: qdrant-store("John works at Anthropic") - qdrant-find("who works at Anthropic") - → Might miss due to wording differences - -Good: memory.create_entities + create_relations - memory.search_nodes("works_at") - → Explicit relationship query -``` - -### ❌ One-Time Information -**Problem:** Storing wastes space for temporary data -**Alternative:** Keep in conversation context - -**Example:** -``` -Bad: qdrant-store("Today's meeting notes for one-time review") -Good: Just keep in current conversation -``` - -### ❌ Structured Graph Queries -**Problem:** Qdrant doesn't support "X relates to Y" queries -**Alternative:** Memory MCP - -**Example:** -``` -Bad: qdrant for tracking project dependencies -Good: memory.create_relations for dependency graph -``` - -### ❌ Keyword-Only Search -**Problem:** Simple keyword search doesn't need vector embeddings -**Alternative:** Grep or filesystem search - -**Example:** -``` -Bad: qdrant-store then qdrant-find for exact string match -Good: grep or filesystem search for keywords -``` - -### ❌ Temporary Session Data -**Problem:** Permanent storage for non-persistent data -**Alternative:** Conversation context - -**Example:** -``` -Bad: qdrant-store("Intermediate calculation results") -Good: Keep in conversation, ephemeral -``` - -## When Qdrant IS the Right Choice - -✅ **Find by meaning** ("auth patterns" → finds JWT, OAuth, sessions) -✅ **Cross-session knowledge base** -✅ **Code snippet library** (semantic search) -✅ **Remember solutions** to problems -✅ **Discover related concepts** - -**Decision rule:** "Do I need to find this by meaning later?" 
- -## Usage Patterns - -**Store code snippets:** -``` -qdrant-store( - "React useEffect cleanup prevents memory leaks by returning function", - metadata: {"type": "code", "language": "react", "topic": "hooks"} -) -``` - -**Store learnings:** -``` -qdrant-store( - "JWT tokens for stateless auth, refresh tokens for security", - metadata: {"type": "architecture", "topic": "authentication"} -) -``` - -**Store solutions:** -``` -qdrant-store( - "Fixed CORS by adding Access-Control-Allow-Origin header in Express", - metadata: {"type": "solution", "problem": "CORS", "technology": "express"} -) -``` - -**Find by concept:** -``` -qdrant-find("how to prevent memory leaks in React") -→ Returns useEffect cleanup snippet (semantic match) - -qdrant-find("stateless authentication patterns") -→ Returns JWT/OAuth notes (concept match) - -qdrant-find("fixing cross-origin issues") -→ Returns CORS solution (semantic similarity) -``` - -**With metadata (when filters enabled):** -Requires enabling filters (e.g., set `QDRANT_ALLOW_ARBITRARY_FILTER=true` or configure `filterable_fields`). -``` -qdrant-find("authentication") + filter metadata.type="architecture" -``` - -## Combining with Memory MCP - -**Use both for complex knowledge:** -``` -Qdrant: Store searchable content -Memory: Track relationships - -Example: - Qdrant: Store "JWT authentication implementation details" - Memory: Track project_A --uses--> JWT_pattern -``` - -**Workflow:** -``` -1. qdrant-store: Save code snippets, solutions -2. memory.create_entities: Create projects, concepts -3. memory.create_relations: Link projects to patterns -4. qdrant-find: Discover by meaning -5. 
memory.search_nodes: Query relationships -``` - -## Best Practices - -**What to store:** -- Code patterns and snippets -- Solutions to problems encountered -- Architectural decisions and rationale -- API usage examples -- Learned concepts and facts - -**Storage strategy:** -- **Atomic pieces** (one concept per store) -- **Descriptive text** (include context, not just code) -- **Metadata** for structure (type, language, project, topic) -- **Rich descriptions** help semantic search - -**Search optimization:** -- **Conceptual queries** (describe what you want) -- **Varied wording** (semantic search handles synonyms) -- **Specific enough** to narrow results - -**Common mistakes:** -- ❌ Storing entire files (store key snippets) -- ❌ Minimal text (add context for better retrieval) -- ❌ No metadata (harder to filter) -- ❌ Expecting exact keyword match (it's semantic) - -## Qdrant vs Memory Quick Decision - -| Scenario | Use Qdrant | Use Memory | Use Both | -|----------|-----------|------------|----------| -| Store code snippets for "find similar" | ✅ | ❌ | Optional | -| Track who created what code | ❌ | ✅ | Recommended | -| Personal knowledge base | ✅ | ❌ | Optional | -| Project relationship map | ❌ | ✅ | N/A | -| Searchable docs + author tracking | ✅ | ✅ | ✅ | - -## Alternatives Summary - -| Task | Instead of Qdrant | Use This | -|------|------------------|----------| -| Explicit relationships | qdrant | Memory MCP | -| One-time information | qdrant-store | Conversation context | -| Graph queries | qdrant-find | Memory MCP | -| Keyword search | qdrant-find | Grep / Filesystem | -| Temporary data | qdrant-store | Conversation context | - -## Local Setup Notes - -For the provided setup script (bureau/tools/scripts/set-up-tools.sh): - -- Qdrant DB: `http://127.0.0.1:8780` (Docker maps host 8780 → container 6333) -- Persistence: Docker bind mount to `~/.qdrant/storage/` (by default) -- MCP server (HTTP): `http://localhost:8782/mcp/` (transport: streamable-http) -- Default 
collection: `coding-memory` -- Embeddings: provider `fastembed`; model default `sentence-transformers/all-MiniLM-L6-v2` -- Filters: Metadata filtering is off by default (no `QDRANT_ALLOW_ARBITRARY_FILTER` and no `filterable_fields` configured). Enable one of these to filter by payload fields. - -## Quick Reference - -**Rate limits:** None (local/self-hosted) -**Storage:** Docker volume or cloud -**Persistence:** Across sessions (explicit deletion needed) -**Best for:** Semantic search, find by meaning -**Avoid for:** Relationships, temporary data, keyword-only search - -**Embedding model:** sentence-transformers/all-MiniLM-L6-v2 (default) -**Search method:** HNSW vector similarity - -**Links:** -- [Category guide: Memory](../category/memory.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/semgrep.md b/protocols/context/static/deep-dives/semgrep.md deleted file mode 100644 index dfaf0252..00000000 --- a/protocols/context/static/deep-dives/semgrep.md +++ /dev/null @@ -1,319 +0,0 @@ -# Semgrep MCP: Deep Dive - -## Overview - -AST-aware security/bug/anti-pattern scanning. Local scanning (code never leaves machine). Free community edition with autofix suggestions. - -## Available Tools - -### 1. `semgrep_scan` - Local Code Scanning - -**What it does:** Scans code files using built-in or custom Semgrep rules - -**Parameters:** -- `code_files` (required) - Array of dicts with `{path: "absolute_path"}` - -**Returns:** Structured findings (file/line, rule ID, severity, message, code snippet) - -**Best for:** Security audits, bug detection, code quality checks - -**Rate limits:** None (local scanning, free community edition) - -### 2. 
`semgrep_scan_with_custom_rule` - Custom Rule Scanning - -**What it does:** Scans code with user-defined YAML rule - -**Parameters:** -- `code_files` (required) - Array of dicts with `{path, content}` -- `rule` (required) - Semgrep YAML rule string - -**Returns:** Findings matching custom rule - -**Best for:** Project-specific patterns, custom anti-patterns, organization standards - -**Rate limits:** None - -### 3. `semgrep_rule_schema` - Get Rule Schema - -**What it does:** Returns schema for writing Semgrep rules - -**Parameters:** None - -**Returns:** Rule schema with available fields - -**Best for:** Learning to write custom rules, verifying rule syntax - -**Rate limits:** None - -### 4. `get_supported_languages` - List Supported Languages - -**What it does:** Returns list of languages Semgrep supports - -**Parameters:** None - -**Returns:** Supported languages - -**Best for:** Checking language support before scanning - -**Rate limits:** None - -### 5. `get_abstract_syntax_tree` - View AST - -**What it does:** Returns Abstract Syntax Tree for code - -**Parameters:** -- `code` (required) - Code to parse -- `language` (required) - Programming language - -**Returns:** JSON AST representation - -**Best for:** Understanding code structure, debugging rules, seeing what parser sees - -**Rate limits:** None - - - -## Tradeoffs - -### Advantages -✅ **AST-aware** (understands code structure, not just regex) -✅ **Local scanning** (code never leaves machine) -✅ **Multi-language** (20+ languages) -✅ **Autofix suggestions** (when rules define fixes) -✅ **Custom rules** (code-like pattern syntax) -✅ **Free community edition** - - -### Disadvantages -❌ **Not runtime analysis** (static only) -❌ **Community edition limits** (see Semgrep docs for feature comparison) -❌ **False positives** (tune rules to reduce) -❌ **Requires rule knowledge** for custom patterns - -## Common Pitfalls: When NOT to Use - -### ❌ Runtime Behavior Analysis -**Problem:** Semgrep is static analysis only 
-**Alternative:** Runtime profiling, dynamic analysis tools - -**Example:** -``` -Bad: semgrep for detecting runtime memory leaks -Good: Profiling tools, heap analysis -``` - -### ❌ Code Style/Formatting -**Problem:** Semgrep for code quality, not formatting -**Alternative:** Prettier, ESLint, Black - -**Example:** -``` -Bad: semgrep for indentation issues -Good: prettier / eslint --fix -``` - -### ❌ Comprehensive Type Checking -**Problem:** Semgrep not a full type checker -**Alternative:** Language-native type checkers - -**Example:** -``` -Bad: semgrep for complete type validation -Good: TypeScript compiler, mypy, Go compiler -``` - -### ❌ Performance Optimization -**Problem:** Semgrep finds patterns, not performance issues -**Alternative:** Profiling, benchmarking tools - -**Example:** -``` -Bad: semgrep for finding slow code -Good: Profiler, benchmark tools -``` - - - -## When Semgrep IS the Right Choice - -✅ **Security audits** (find vulnerabilities) -✅ **Bug detection** (common anti-patterns) -✅ **Code quality checks** (enforce standards) -✅ **Custom rule enforcement** (org-specific patterns) -✅ **Supply chain scanning** (dep vulnerabilities) -✅ **Pre-commit checks** (catch issues early) - -**Decision rule:** "Do I need to find security/bug/pattern issues?" - -## Usage Patterns - -**Basic security scan:** -``` -semgrep_scan( - code_files: [ - {path: "/absolute/path/to/auth.js"}, - {path: "/absolute/path/to/api.py"} - ] -) -→ Findings with severity, location, fix suggestions -``` - -**Custom rule scan:** -``` -semgrep_scan_with_custom_rule( - code_files: [{path: "...", content: "..."}], - rule: """ -rules: - - id: check-hardcoded-secrets - pattern: | - password = "..." 
- message: Hardcoded password detected - severity: ERROR - languages: [python] -""" -) -``` - - - -**Get AST for rule writing:** -``` -get_abstract_syntax_tree( - code: "function foo() { return bar(); }", - language: "javascript" -) -→ JSON AST showing parser structure -``` - - - -**Check language support:** -``` -get_supported_languages() -→ List of supported languages -``` - -## Rule Writing Workflow - -**1. Understand target pattern:** -``` -get_abstract_syntax_tree(code, language) -→ See how parser views code -``` - -**2. Get rule schema:** -``` -semgrep_rule_schema() -→ Available fields for rules -``` - -**3. Write custom rule:** -```yaml -rules: - - id: my-custom-check - pattern: | - dangerous_function(...) - message: "Avoid dangerous_function" - severity: WARNING - languages: [python] - fix: safe_function(...) -``` - -**4. Test rule:** -``` -semgrep_scan_with_custom_rule( - code_files: [...], - rule: "..." # Custom YAML -) -``` - -## Best Practices - -**When to scan:** -- **Pre-commit:** Scan changed files -- **Post-dependency-update:** Supply chain scan -- **Security review:** Full repo scan -- **CI/CD:** Automated scanning - -**Severity triage:** -- Critical/High: Address immediately -- Medium: Plan fixes -- Low/Info: Backlog or suppress - -**Custom rules:** -- Start with semgrep registry (existing rules) -- Write org-specific patterns -- Include autofix when possible -- Test on sample code first - -**Supply chain:** -- Scan after any dependency change -- Check lockfile updates -- Review transitive dependencies - - - -## Integration Workflows - -**Security review workflow:** -``` -1. semgrep_scan(all_files) → Find issues -2. Review findings by severity -3. Apply autofix suggestions -4. semgrep_scan(fixed_files) → Verify fixes -``` - -**Custom pattern enforcement:** -``` -1. semgrep_rule_schema() → Learn schema -2. get_abstract_syntax_tree() → Understand patterns -3. Write custom rule (YAML) -4. semgrep_scan_with_custom_rule() → Test -5. 
Add to CI/CD -``` - - - -## Alternatives Summary - -| Task | Instead of Semgrep | Use This | -|------|--------------------|----------| -| Runtime analysis | semgrep | Profiling tools | -| Formatting | semgrep | Prettier, ESLint | -| Type checking | semgrep | Language type checkers | -| Performance | semgrep | Profilers, benchmarks | - - -## Quick Reference - -**Rate limits:** None (local CE scans) -**Best for:** Security, bugs, anti-patterns -**Avoid for:** Runtime, formatting, types, performance - -**Languages:** 20+ (check with get_supported_languages) -**Rule types:** Built-in registry + custom YAML -**Autofix:** Available when rules define fixes - -**Links:** -- [Semgrep Pro vs OSS features](https://semgrep.dev/docs/semgrep-pro-vs-oss) -- [Full decision guide](../../../tools/tools-decision-guide.md) - -## Local Reporting (CE) - -Use the Semgrep CE CLI to export results for local reporting dashboards: - -``` -# JSON output -semgrep scan --json --json-output=semgrep.json - -# SARIF output (for code scanning integrations) -semgrep scan --sarif --sarif-output=semgrep.sarif - -# Plain text output -semgrep scan --text --text-output=semgrep.txt - -# Combine outputs (prints SARIF to stdout, writes JSON to file) -semgrep scan --sarif --json-output=findings.json -``` - -Tip: You can mix formats with --output/---output to produce multiple files in one run. diff --git a/protocols/context/static/deep-dives/serena.md b/protocols/context/static/deep-dives/serena.md deleted file mode 100644 index 657d85cd..00000000 --- a/protocols/context/static/deep-dives/serena.md +++ /dev/null @@ -1,432 +0,0 @@ -# Serena MCP: Deep Dive - -## Overview - -Language-server-powered semantic code navigation, refactoring, and editing. IDE-grade symbol operations across 20+ languages. Complements Filesystem/Git MCP with semantic-level understanding. - -## Available Tools - -### Symbol Navigation - -**1. 
`find_symbol` - Locate Symbols** - -**What it does:** Finds symbols (classes, methods, functions) by name path - -**Parameters:** -- `name_path` (required) - Pattern like "class/method" or "/class/method" -- `relative_path` (optional) - Restrict to file/directory -- `depth` (default 0) - Retrieve descendants (1 for class methods/attributes) -- `include_body` (default false) - Include source code -- `substring_matching` (default false) - Match partial names -- `include_kinds` / `exclude_kinds` - Filter by LSP symbol kinds - -**Name path matching:** -- `"method"` → Matches anywhere (class/method, nested/class/method, etc.) -- `"class/method"` → Matches ancestors (class/method, nested/class/method) -- `"/class/method"` → Absolute (only top-level class/method) - -**Returns:** Symbols with locations (and optionally bodies) - -**Best for:** Finding symbols, understanding structure - -### Symbol References - -**2. `find_referencing_symbols` - Find References** - -**What it does:** Finds all references to a symbol - -**Parameters:** -- `name_path` (required) - Symbol to find references for -- `relative_path` (required) - File containing the symbol -- `include_kinds` / `exclude_kinds` -- `max_answer_chars` - -**Returns:** Symbols referencing the target with code snippets - -**Best for:** Understanding dependencies, impact analysis - -### File Overview - -**3. `get_symbols_overview` - High-Level Structure** - -**What it does:** Returns top-level symbols in a file - -**Parameters:** -- `relative_path` (required) - File to analyze -- `max_answer_chars` (default -1) - -**Returns:** Top-level symbols (classes, functions, etc.) - -**Best for:** First look at new file, quick structure understanding - -### Editing Tools - -**4. 
`replace_symbol_body` - Replace Implementation** - -**What it does:** Replaces symbol's body (implementation) - -**Parameters:** -- `name_path` (required) - Symbol to replace -- `relative_path` (required) - File containing symbol -- `body` (required) - New implementation (includes signature) - -**Returns:** Confirmation - -**Best for:** Swapping implementations, refactoring logic - -**CRITICAL:** Body INCLUDES signature, NOT just function content - -**5. `insert_after_symbol` - Insert After** - -**What it does:** Inserts content after symbol definition - -**Parameters:** -- `name_path` (required) -- `relative_path` (required) -- `body` (required) - Content to insert - -**Returns:** Confirmation - -**Best for:** Adding new methods, functions, classes - -**6. `insert_before_symbol` - Insert Before** - -**What it does:** Inserts content before symbol definition - -**Parameters:** -- `name_path` (required) -- `relative_path` (required) -- `body` (required) - Content to insert - -**Returns:** Confirmation - -**Best for:** Adding imports, new symbols before existing - -**7. `rename_symbol` - Rename Across Codebase** - -**What it does:** Renames symbol and all references - -**Parameters:** -- `name_path` (required) -- `relative_path` (required) -- `new_name` (required) - -**Returns:** Result summary - -**Best for:** Safe refactoring, renaming with reference updates - -**Note:** For overloaded methods (Java), may need signature in name_path - -### Text-Based Editing - -**8. `replace_regex` - Regex-Based Replacement** - -**What it does:** Replaces regex matches in file - -**Parameters:** -- `relative_path` (required) -- `regex` (required) - Python regex (dot matches newlines, multiline enabled) -- `repl` (required) - Replacement string (supports \\1, \\2 backreferences) -- `allow_multiple_occurrences` (default false) - Replace all matches - -**Returns:** Confirmation - -**Best for:** Small edits within symbols, text replacements - -**CRITICAL:** Use wildcards! 
Minimize regex length! - -### Other Tools - -**9. `search_for_pattern` - Flexible Code Search** - -**What it does:** Searches for regex patterns in files - -**Parameters:** -- `substring_pattern` (required) - Regex to search -- `relative_path` (default "") - Restrict to file/directory -- `restrict_search_to_code_files` (default false) -- `paths_include_glob` / `paths_exclude_glob` -- `context_lines_before` / `context_lines_after` -- `max_answer_chars` - -**Returns:** Matches with context lines - -**Best for:** Finding patterns when symbol search insufficient - -**10. `read_file` - Read File Content** - -**What it does:** Reads file (potentially chunked) - -**Parameters:** -- `relative_path` (required) -- `start_line` (optional, 0-based) - First line index to read -- `end_line` (optional, 0-based, inclusive) - Last line index to read -- `max_answer_chars` - -**Returns:** File content text (line numbers not guaranteed) - -**Best for:** Reading after finding symbols - -**11. `create_text_file` - Write New File** - -**What it does:** Creates or overwrites file - -**Parameters:** -- `relative_path` (required) -- `content` (required) - -**Returns:** Confirmation - -**Best for:** Creating new files - -**12. `list_dir` / `find_file` - File System Operations** - -Standard directory listing and file search operations. - -### Memory & Project Tools - -**13-16. Memory operations**: write_memory, read_memory, list_memories, delete_memory - -**17. execute_shell_command** - Run commands safely - -**18-19. activate_project, switch_modes** - Project/mode management - -**20-22. Thinking tools** - think_about_task_adherence, think_about_collected_information, think_about_whether_you_are_done - -## Tradeoffs - -### Advantages -✅ **Semantic understanding** (not just text) -✅ **IDE-grade operations** (symbol-aware) -✅ **Safe refactoring** (rename updates references) -✅ **20+ languages** (Python, TypeScript, Go, Rust, Java, etc.) 
-✅ **LSP-powered** (uses language servers) -✅ **Symbol-level edits** (precise, not whole-file) - -### Disadvantages -❌ **Learning curve** (name paths, symbol concepts) -❌ **Requires language server** (downloads as needed) -❌ **Not for simple text** (use Filesystem MCP or built-ins) -❌ **Complex for small edits** (regex tool better for few lines) - -## Common Pitfalls: When NOT to Use - -### ❌ Simple Text Replacements -**Problem:** Symbol tools overkill for text edits -**Alternative:** Built-in Edit or Filesystem edit_file - -**Example:** -``` -Bad: find_symbol + replace_symbol_body for one-line change -Good: Edit tool (string replacement) -``` - -### ❌ Whole-File Reads -**Problem:** Serena read_file adds overhead -**Alternative:** Built-in Read tool - -**Example:** -``` -Bad: serena.read_file for basic file reading -Good: Read (built-in, optimized) -``` - -### ❌ Non-Code Files -**Problem:** Symbol operations require code structure -**Alternative:** Filesystem or built-in tools - -**Example:** -``` -Bad: find_symbol in JSON/YAML/Markdown -Good: Read or Grep for non-code files -``` - -### ❌ Keyword Search (Not Symbols) -**Problem:** Symbol search requires knowing symbol names -**Alternative:** Grep for content search - -**Example:** -``` -Bad: find_symbol for finding TODO comments -Good: Grep("TODO:", glob="**/*.js") -``` - -### ❌ Without Knowing Structure First -**Problem:** Using symbol tools blindly -**Alternative:** get_symbols_overview first - -**Example:** -``` -Bad: find_symbol without knowing class/method structure -Good: get_symbols_overview → understand structure → find_symbol -``` - -## When Serena IS the Right Choice - -✅ **Semantic refactoring** (rename, restructure) -✅ **Symbol-level operations** (find, edit methods/classes) -✅ **Understanding code structure** -✅ **Finding references** (who calls this?) -✅ **Safe renames** across codebase - -**Decision rule:** "Do I need semantic understanding of code?" 
- -## Usage Patterns - -**Understand new file:** -``` -1. get_symbols_overview("src/auth.ts") - → See top-level classes, functions - -2. find_symbol("UserService", depth=1) - → Get methods of UserService class - -3. find_symbol("UserService/authenticate", include_body=true) - → Read specific method -``` - -**Refactor symbol:** -``` -1. find_symbol("UserService/authenticate", include_body=true) - → Get current implementation - -2. replace_symbol_body( - name_path="UserService/authenticate", - relative_path="src/auth.ts", - body="async authenticate(username, password) { ... }" - ) - → Replace implementation (includes signature!) -``` - -**Add new method:** -``` -insert_after_symbol( - name_path="UserService/authenticate", - relative_path="src/auth.ts", - body="\n async logout(userId) {\n ...\n }\n" -) -→ Inserts inside the UserService class, immediately after authenticate() -``` - -**Rename safely:** -``` -rename_symbol( - name_path="getUserData", - relative_path="src/api.ts", - new_name="fetchUserProfile" -) -→ Renames function and all references codebase-wide -``` - -**Find who uses a function:** -``` -find_referencing_symbols( - name_path="authenticate", - relative_path="src/auth.ts" -) -→ All places that call authenticate() -``` - -**Small edit with regex:** -``` -replace_regex( - relative_path="src/config.ts", - regex="PORT.*=.*5000", - repl="PORT = 3000" -) -→ Change port number -``` - -## Symbol vs Regex Editing - -**Use symbol tools when:** -- Replacing entire method/class/function -- Adding new symbols (methods, functions, classes) -- Renaming symbols codebase-wide - -**Use regex tool when:** -- Editing few lines within symbol -- Text replacements smaller than symbol -- Pattern-based small changes - -**Example comparison:** -``` -Task: Change one line in 50-line method - -Symbol approach: - find_symbol(include_body=true) → Read 50 lines - replace_symbol_body → Write all 50 lines back - -Regex approach (better): - replace_regex("old line", "new line") - → 
Target just the changed line -``` - -## Best Practices - -**Exploration pattern:** -``` -1. get_symbols_overview (high-level) -2. find_symbol with depth=0 (symbol metadata) -3. find_symbol with depth=1 (children) -4. find_symbol with include_body=true (implementation) -``` - -**Name path usage:** -- Simple: `"method"` → Matches anywhere -- Relative: `"class/method"` → Requires ancestors -- Absolute: `"/class/method"` → Top-level only -- Substring: Add `substring_matching=true` - -**Minimize reads:** -- Don't read bodies unless needed -- Use depth wisely (0 for metadata, 1 for children) -- Read structure first, implementations later - -**Editing strategy:** -- Symbol-level: Use symbol tools -- Line-level: Use replace_regex -- Multi-file rename: Use rename_symbol -- Verify with find_referencing_symbols - -**Regex best practices:** -``` -✅ Use wildcards: "start.*?end" -❌ Write long regexes: "line1\nline2\nline3..." -✅ Minimal matches with context -❌ Matching entire sections verbatim -``` - -## Alternatives Summary - -| Task | Instead of Serena | Use This | -|------|------------------|----------| -| Simple text replace | replace_symbol_body | Edit (built-in) | -| Whole-file read | read_file | Read (built-in) | -| Non-code files | symbol tools | Read / Grep | -| Keyword search | find_symbol | Grep | -| Without structure | find_symbol | get_symbols_overview first | - -## Quick Reference - -**Rate limits:** None (local language servers) -**Languages:** 20+ (Python, TS, Go, Rust, Java, C++, etc.) -**Best for:** Semantic refactoring, symbol operations -**Avoid for:** Simple text edits, non-code files, keyword search - -**Symbol kinds (LSP):** -1=file, 2=module, 3=namespace, 4=package, 5=class, 6=method, 7=property, 8=field, 9=constructor, 10=enum, 11=interface, 12=function, 13=variable, 14=constant, etc. 
- -**Name path patterns:** -- `"name"` → anywhere -- `"parent/name"` → with ancestors -- `"/top/name"` → absolute (top-level) - -**Critical reminders:** -- replace_symbol_body: Body INCLUDES signature -- replace_regex: Use wildcards (minimize length) -- Always get_symbols_overview first for new files - -**Links:** -- [Category guide: Code search](../category/code-search.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/sourcegraph.md b/protocols/context/static/deep-dives/sourcegraph.md deleted file mode 100644 index c8443c63..00000000 --- a/protocols/context/static/deep-dives/sourcegraph.md +++ /dev/null @@ -1,264 +0,0 @@ -# Sourcegraph MCP: Deep Dive - -## Overview - -"Google for code" - Search across public open-source repositories (indexed on Sourcegraph Cloud) with powerful filters. Unlimited searches on free tier. - -## Available Tools - -### 1. `search_prompt_guide` - Query Construction Helper - -**What it does:** Generates Sourcegraph-specific query guide - -**Parameters:** -- `objective` (required) - What you're trying to find - -**Returns:** Custom guide for constructing effective searches - -**Best for:** Learning query syntax, complex search construction - -**Rate limits:** None - -**Important:** MUST call this ONCE at beginning before any search or fetch_content - -### 2. 
`search` - Code Search Across Repos - -**What it does:** Search code with powerful filters - -**Parameters:** -- `query` (required) - Search query with filters - -**Query Syntax:** -- Default: Searches file content -- `repo:pattern` - Filter by repo (regex) -- `file:pattern` - Filter by file path -- `lang:language` - Filter by language -- `type:symbol` - Search for symbols (functions/classes) -- Boolean: `AND`, `OR`, `-` (negation) -- Quotes: `"exact phrase"` -- Regex: Full regex support - -**Returns:** Search results with file paths, line numbers, code snippets - -**Best for:** Finding code examples, patterns, implementations - -**Rate limits:** None (free tier for public repos) - -### 3. `fetch_content` - Get File Content - -**What it does:** Fetch file or directory content from repository - -**Parameters:** -- `repo` (required) - Repository path (e.g., "github.com/org/project") -- `path` - File or directory path (empty for root) - -**Returns:** -- If file: File content -- If directory/empty: Directory tree (depth 2) - -**Best for:** Exploring repo structure, reading specific files - -**Rate limits:** None - -**Workflow:** -1. Search to find repos -2. Fetch root ("") to see structure -3. 
Fetch specific files - -## Tradeoffs - -### Advantages -✅ **Unlimited searches** (no rate limits on free tier) -✅ **Powerful regex filters** (repo, file, lang) -✅ **Guided search prompts** (natural → precise queries) -✅ **Returns exact code** with line numbers -✅ **Symbol search** (find functions/classes) -✅ **No credit tracking** needed - -### Disadvantages -❌ **Public repos only** (free tier) -❌ **No web search** (code only) -❌ **No content extraction** from web -❌ **Private repos** require paid plan - -## Common Pitfalls: When NOT to Use - -### ❌ Web Content Search -**Problem:** Sourcegraph searches code, not web pages -**Alternative:** Tavily or Brave - -**Example:** -``` -Bad: search("React hooks tutorial blog") -Good: tavily_search("React hooks tutorial blog") -``` - -### ❌ Official Documentation -**Problem:** Sourcegraph finds code examples, not structured docs -**Alternative:** Context7 - -**Example:** -``` -Bad: search("Next.js API documentation") -Good: context7.get_library_docs("/vercel/next.js") -``` - -### ❌ Private Repository Search -**Problem:** Free tier only searches public repos -**Alternative:** Clone locally + Serena MCP - -**Example:** -``` -Bad: search("repo:private-org/private-repo") -Good: Clone locally → serena.find_symbol -``` - -### ❌ Non-Code Content -**Problem:** Sourcegraph indexes code, not docs/images/data -**Alternative:** Clone repo + Filesystem MCP - -**Example:** -``` -Bad: search("README content") -Good: fetch("https://raw.githubusercontent.com/user/repo/main/README.md") -``` - -### ❌ Semantic/Conceptual Search -**Problem:** Sourcegraph uses regex/text matching, not semantic -**Alternative:** Use natural language in guide or Tavily for broader search - -**Example:** -``` -Bad: search("patterns similar to observer") -Good: Use search_prompt_guide with objective, get specific query -``` - -## When Sourcegraph IS the Right Choice - -✅ **Finding code examples** from public repos -✅ **Learning library usage** in the wild -✅ 
**Discovering implementations** of algorithms -✅ **Researching patterns** across projects -✅ **Symbol search** (functions, classes) -✅ **Unlimited usage** needed (no limits) - -**Decision rule:** "Am I looking for code examples from public repos?" - -## Usage Patterns - -**Always start with guide:** -``` -search_prompt_guide("find React hooks with cleanup functions") -→ Returns query syntax guidance -→ Use returned pattern in search -``` - -**Basic code search:** -``` -search("useState lang:typescript") -→ Finds useState usage in TypeScript -``` - -**Repository filter:** -``` -search("repo:github\\.com/facebook/react file:\\.tsx$ useState") -→ React repo, .tsx files, useState usage -``` - -**Symbol search:** -``` -search("type:symbol func SendMessage lang:go") -→ Find SendMessage function definitions in Go -``` - -**Exclude patterns:** -``` -search("authentication -file:test lang:python") -→ Find auth code, exclude test files -``` - -**Complex query:** -``` -search("\"async def.*request\" lang:python file:api/") -→ Async functions with "request" in name, in api/ directory -``` - -**Workflow with fetch_content:** -``` -1. search("repo:user/project type:repo") - → Find repository - -2. fetch_content(repo="github.com/user/project", path="") - → Get root structure - -3. 
fetch_content(repo="github.com/user/project", path="src/main.py") - → Read specific file -``` - -## Search Operators Reference - -| Operator | Example | Purpose | -|----------|---------|---------| -| `repo:` | `repo:github\\.com/org/project` | Filter by repository | -| `file:` | `file:\\.go$` or `file:src/` | Filter by file path | -| `lang:` | `lang:python` | Filter by language | -| `type:symbol` | `type:symbol` | Search symbols only | -| `-` | `-file:test` | Exclude pattern | -| `AND` / `OR` | `error AND handler` | Boolean logic | -| `""` | `"func SendMessage"` | Exact phrase | - -## Alternatives Summary - -| Task | Instead of Sourcegraph | Use This | -|------|----------------------|----------| -| Web search | search | Tavily / Brave | -| API documentation | search | Context7 | -| Private repos | search | Clone + Serena | -| Non-code files | search | Clone + Filesystem | -| Semantic search | search | Tavily / use guided prompts | - -## Best Practices - -**Leverage guided prompts:** -- Call `search_prompt_guide` first (required once) -- Describe what you want in natural language -- Get precise query syntax -- Iterate with guide if needed - -**Build effective queries:** -- Start broad, narrow with filters -- Use regex for precise patterns -- Combine operators (repo + lang + file) -- Exclude test files with `-file:test` - -**Explore before extracting:** -1. Search repos: `repo:name type:repo` -2. Fetch root to see structure -3. Search within repo -4. 
Fetch specific files - -**Common patterns:** -``` -Find function definitions: - type:symbol "func functionName" lang:language - -Find usage examples: - "libraryName.method" lang:language -file:test - -Find patterns in specific repo: - repo:org/project pattern lang:language -``` - -## Quick Reference - -**Rate limits:** None (unlimited on free tier) -**Coverage:** Public open-source repositories (Sourcegraph Cloud public index) -**Best for:** Code examples, learning usage, finding implementations -**Avoid for:** Web content, docs, private repos, semantic search - -**Required first step:** Call `search_prompt_guide` once at start - -**Links:** -- [Sourcegraph Public Code Search](https://sourcegraph.com/search) -- [Category guide: Code search](../category/code-search.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff --git a/protocols/context/static/deep-dives/tavily.md b/protocols/context/static/deep-dives/tavily.md deleted file mode 100644 index 2a22472b..00000000 --- a/protocols/context/static/deep-dives/tavily.md +++ /dev/null @@ -1,226 +0,0 @@ -# Tavily MCP: Deep Dive - -## Overview - -Primary web research tool with search, extract, map, and crawl capabilities. Includes citations (critical for credibility). 1,000 credits/month, resets on 1st. - -## Available Tools - -### 1. 
`tavily_search` - Web Search with Citations - -**What it does:** Search web and return results with source citations - -**Parameters:** -- `query` (required) - Search query -- `max_results` (default 5) - Number of results -- `search_depth` - "basic" | "advanced" -- `topic` - "general" | "news" | "finance" -- `include_domains` / `exclude_domains` - Filter by domain -- `include_raw_content` (bool | "markdown" | "text") - Include cleaned HTML -- `include_images` / `include_image_descriptions` (bool) -- `time_range` - "day" | "week" | "month" | "year" -- `start_date` / `end_date` - "YYYY-MM-DD" format -- `country` - Boost results from specific country - -**Returns:** Search results with citations, snippets, URLs - -**Best for:** General web research with credible sources - -**Credits:** Basic = 1 credit; Advanced = 2 credits - -### 2. `tavily_extract` - Content Extraction from URLs - -**What it does:** Extract clean content from specific URLs - -**Parameters:** -- `urls` (required) - Array of URLs to extract -- `extract_depth` - "basic" | "advanced" -- `format` - "markdown" | "text" -- `include_images` (bool) - -**Returns:** Extracted content in specified format - -**Best for:** Getting full content from known URLs - -**Credits:** Varies by complexity and depth - -### 3. 
`tavily_crawl` - Multi-Page Website Crawling - -**What it does:** Crawl multiple pages from a website - -**Parameters:** -- `url` (required) - Starting URL -- `limit` (default 50) - Total pages to process -- `max_depth` (default 1) - How far from base URL -- `max_breadth` (default 20) - Links per page level -- `select_domains` / `exclude_domains` - Domain filters (regex) -- `select_paths` / `exclude_paths` - Path filters (regex) -- `instructions` - Natural language guidance for crawler -- `extract_depth` - "basic" | "advanced" -- `format` - "markdown" | "text" - -**Returns:** Content from multiple pages (truncated to 500 chars/page) -**Returns:** Extracted content (raw content) from multiple pages - -**Best for:** Gathering info from related pages (5-20 pages) - -**Credits:** Mapping cost + Extraction cost (sum) - -**Note:** Crawl combines mapping and extraction. For tighter control, use `tavily_map` to discover URLs first, then `tavily_extract` on selected pages. - -### 4. `tavily_map` - Website Structure Discovery - -**What it does:** Discover URLs without extracting content - -**Parameters:** -- `url` (required) - Root URL -- `limit` - Max URLs to discover -- `max_depth` / `max_breadth` - Discovery limits -- `select_domains` / `exclude_domains` - Domain filters -- `instructions` - Crawler guidance - -**Returns:** Array of discovered URLs - -**Best for:** Understanding site structure before extraction - -**Credits:** Lower than crawl (no content extraction) - -## Tradeoffs - -### Advantages -✅ **Includes citations** (critical for credibility) -✅ Generous 1k credits/month (resets 1st) -✅ Multiple capabilities (search/extract/crawl/map) -✅ Handles news, general info, current events -✅ Advanced depth options for complex tasks - -### Disadvantages -❌ Monthly limit (1k credits) -❌ Credits vary by operation complexity -❌ Not a code/repository search engine (use Sourcegraph) - -## Common Pitfalls: When NOT to Use - -### ❌ Simple Known URL Fetch -**Problem:** Wastes 
credits on unlimited-alternative task -**Alternative:** Fetch MCP (unlimited) - -**Example:** -``` -Bad: tavily_extract(["https://example.com/article"]) -Good: fetch("https://example.com/article") -``` - -### ❌ Highly Specific Semantic Queries -**Problem:** While Tavily has good general understanding, extremely specific semantic queries may need refinement -**Alternative:** Refine query with more keywords or use broader search - -**Example:** -``` -Less ideal: tavily_search("concepts similar to CQRS") -Better: tavily_search("CQRS event sourcing saga pattern microservices") -``` - -### ❌ Deep Multi-Page Crawl (50+ pages) -**Problem:** Crawl truncates content to 500 chars/page -**Alternative:** Map first, then extract specific pages for better control - -**Example:** -``` -Bad: tavily_crawl(url, limit=100) # Less control, higher cost -Good: tavily_map(url) → tavily_extract([specific_urls]) -``` - -### ❌ Code Search -**Problem:** Tavily searches web, not code repositories -**Alternative:** Sourcegraph - -**Example:** -``` -Bad: tavily_search("React hooks examples") -Good: sourcegraph.search("repo:.*react.* use.*Hook") -``` - -### ❌ Official API Documentation -**Problem:** Tavily returns web results, not structured docs -**Alternative:** Context7 - -**Example:** -``` -Bad: tavily_search("Next.js App Router API") -Good: context7.get_library_docs("/vercel/next.js", topic="routing") -``` - -### ❌ Credits Near Exhaustion (<100 remaining) -**Problem:** Risk running out mid-month -**Alternative:** Switch to Brave (2k/month) - -**Example:** -``` -Bad: tavily_search (when at 90/1000 credits) -Good: brave_web_search -``` - -## When Tavily IS the Right Choice - -✅ **General web research** needing citations -✅ **Multi-source synthesis** (search + extract) -✅ **Current events** and news -✅ **Site mapping** before selective extraction -✅ **Moderate crawling** (5-20 pages with full extraction workflow) - -**Decision rule:** "Do I need citations and have credits available?" 
- -## Alternatives Summary - -| Task | Instead of Tavily | Use This | -|------|------------------|----------| -| Single URL | extract | Fetch MCP | -| Deep crawl (50+) | crawl | Map + extract for better control | -| Code search | search | Sourcegraph | -| API docs | search | Context7 | -| Credits low | any tool | Brave | - -## Best Practices - -**Optimize credit usage:** -- Use `max_results` wisely (default 5 is often enough) -- Start with "basic" depth, escalate to "advanced" if needed -- Use Fetch for known URLs (save credits) -- Monitor credit balance throughout month - -**Smart crawling workflow:** -1. `tavily_map` → Discover URLs (lower cost) -2. Filter relevant URLs -3. `tavily_extract` → Get full content from specific pages -4. Use `tavily_crawl` to combine mapping+extraction when you want an end-to-end crawl; prefer Map → Extract for selective, full-content control - -**When to use each tool:** -- **Search:** Discovery, research, current info -- **Extract:** Known URLs, full content needed -- **Map:** Site structure, URL discovery -- **Crawl:** Quick overview (accept truncation) - -**Citation handling:** -- Always use Tavily when citations matter -- Include source URLs in outputs -- Verify claims against multiple sources - -## Quick Reference - -**Total budget:** 1,000 credits/month -**Rate limits:** Development 100 RPM; Production 1,000 RPM (Production keys require paid plan or PAYGO) -**Reset:** 1st of every month -**Cost:** Free tier - -**Credit costs:** -- Basic search: 1 credit -- Advanced search: 2 credits -- Extract: Basic 1 credit per 5 successful URLs; Advanced 2 credits per 5 successful URLs -- Map: 1 credit per 10 pages; with `instructions`: 2 credits per 10 pages -- Crawl: Mapping cost + Extraction cost (sum) - -**Links:** -- [Credit costs detail](https://docs.tavily.com/documentation/api-credits#api-credits-costs) -- [Category guide: Web research](../category/web-research.md) -- [Full decision guide](../../../tools/tools-decision-guide.md) diff 
--git a/protocols/context/static/handoff-guide.md b/protocols/context/static/handoff-guide.md index f87e99b8..23ac34ff 100644 --- a/protocols/context/static/handoff-guide.md +++ b/protocols/context/static/handoff-guide.md @@ -1,419 +1,151 @@ # *Handoff guidelines:* how and when to delegate to subagents -> **Purpose**: a concise guide for when to delegate work vs. when to ask the user for guidance. - -## Table of contents - -- [Table of contents](#table-of-contents) -- [Terminology](#terminology) -- [Core delegation strategies](#core-delegation-strategies) - - [Delegation mechanisms available](#delegation-mechanisms-available) - - [Delegation principles](#delegation-principles) - - [When NOT to delegate](#when-not-to-delegate) - - [Integration with *Superpowers* skills *(Claude and Codex only)*](#integration-with-superpowers-skills-claude-and-codex-only) -- [Delegation mechanisms (detailed)](#delegation-mechanisms-detailed) - - [Quick comparison](#quick-comparison) - - [Using `clink`](#using-clink) - - [Using `Task` tool *(Claude Code)* / native subagents *(OpenCode)*](#using-task-tool-claude-code--native-subagents-opencode) - - [Using `AskUserQuestion`](#using-askuserquestion) -- [Parallel delegation strategies](#parallel-delegation-strategies) - - [When to parallelize](#when-to-parallelize) - - [How to parallelize](#how-to-parallelize) - - [Common parallelization patterns](#common-parallelization-patterns) - - [Default mindset](#default-mindset) -- [Merge \& verify results](#merge--verify-results) -- [Subagent context management](#subagent-context-management) - - [Setting your spawned subagents up for success](#setting-your-spawned-subagents-up-for-success) - - [Tool-specific guidance](#tool-specific-guidance) - - [Common context handoff mistakes](#common-context-handoff-mistakes) -- [Choosing *models* when spawning subagents](#choosing-models-when-spawning-subagents) - - [Decision tree (quick reference)](#decision-tree-quick-reference) - - [Task category → 
model/role mapping](#task-category--modelrole-mapping) - - [Thinking level optimization](#thinking-level-optimization) -- [Handoff patterns](#handoff-patterns) -- [When to ask the user (AskUserQuestion)](#when-to-ask-the-user-askuserquestion) - - [When to ask](#when-to-ask) - - [Best practices](#best-practices) - - [What requires explicit approval, always](#what-requires-explicit-approval-always) - -## Terminology - -| Concept | Definition | -| :------ | :--------- | -| **Skill** | *Superpowers* workflow (e.g., `superpowers:test-driven-development`) + any extra Claude/Codex skills files you define | -| **Agent** | Personas used by agents (e.g., `debugger`, `architect`), invoked either directly in your main chat or as subagents | -| **Subagent** | A child **agent**, isolated from the main chat, spawned to complete a particular task by *either* (1) Claude Code or OpenCode native subagents feature or (2) `clink` | -| **MCP** | MCP servers made available for you to use; see `tools-guide.md` (on your list of must-read files) for a full guide | - -## Core delegation strategies - -### Delegation mechanisms available - -1. `clink` (from PAL MCP, available everywhere): Cross-model orchestration to Claude, Codex (GPT‑5.1‑Codex), Gemini -2. `Task` tool (Claude) or native subagents (OpenCode): Spawn specialized subagents -3. `AskUserQuestion`: Stop and obtain explicit user guidance - -### Delegation principles - -- Delegate when another model or role materially improves accuracy, speed, cost, or context handling. -- Ask the user when requirements are ambiguous, multiple valid options exist, or explicit approval is required. -- Handle directly when the task is within your capability and scope is clear. 
- -### When NOT to delegate - -**Handle tasks directly (don't delegate) when:** - -- **Task is simple and well-understood:** 1-2 file edits with clear requirements; faster to do than explain -- **Requires tight iteration loops:** Debugging with frequent hypothesis testing; trial-and-error exploration -- **Context loss would be expensive:** Deeply nested state that's hard to summarize; extensive prior conversation history -- **You already have necessary context loaded:** Files read, relationships understood; delegation = wasteful reloading -- **Explanation overhead > execution time:** If describing the task takes longer than doing it - -**Cost of delegation:** -- Context summarization overhead (lossy compression of your current understanding) -- Potential information loss (nuances don't survive handoff) -- Coordination time (waiting for subagent, reviewing results) -- Risk of misunderstanding requirements (ambiguity in your prompt) - -**Rule of thumb:** If you can complete the task in <2 minutes with context you already have → handle directly. - -### Integration with *Superpowers* skills *(Claude and Codex only)* - -#### Precedence rule when both systems apply - -1. **Process enforcement (*Superpowers*)**: If a *Superpowers* skill mandates a specific workflow (e.g., TDD, systematic debugging), you MUST follow it; this is non-negotiable per the skill's requirements. -2. **Capability selection**: Within that mandated workflow, use handoff guidelines (this file) and model selection guide to choose: - - - Which model/CLI to use (`clink` with appropriate role) - - Which MCP tools to leverage (per `tools-guide.md`) - - When to delegate vs. handle directly - -#### Both systems are complementary - -- *Superpowers* defines ***how** to work* -- `handoff-guide.md` defines ***who** does it* -- `mcp-compact-list.md` defines ***which tools** are used* - -For example, suppose you are performing a debugging task. 
Examples of what could be enforced by each system: - -| System | What it enforces/leads you to do | -| :-- | :-- | -| Superpowers | Use `systematic-debugging` skill (4-phase investigation process) | -| "Handoff guidelines" and "Compact MCP list" guides | Use Codex via `clink` with `debugger` role + Serena MCP for code search | - -## Delegation mechanisms (detailed) - -### Quick comparison - -| Mechanism | CLI Support | Primary Use Cases | When to Use | -| :--- | :--- | :--- | :--- | -| **`clink`** | All CLIs | Cross-model delegation, specialized roles | Different model needed (see [decision tree](#decision-tree-quick-reference)), isolated context required | -| **`Task`** / native subagents | Claude, OpenCode | Specialized subagents, exploration, research | Codebase exploration, parallel searches, multi-step research | -| **`AskUserQuestion`** | All CLIs | User guidance, approval gates | Ambiguity, multiple valid approaches, explicit approval needed | +> **Purpose**: a concise guide for when and how to delegate work vs. when to ask the user for guidance. -### Using `clink` +## Delegation decision flow -**When to use:** -- Cross-model collaboration is beneficial (see [decision tree](#decision-tree-quick-reference) for which model to choose) -- Spawning a specialized role with isolated context -- Task requires capabilities better suited to a different model +Apply these gates in order. Stop at the *first* gate that determines the next action. 
-**CLI-specific rules:** -- **Claude Code**: Use `clink` for delegation to Codex or Gemini (use `Task` for Claude subagents) -- **Codex/Gemini**: Use `clink` for all delegation (no `Task` tool available) +### Gate 1: Ask the user first -**How to use:** -- Provide `cli_name` (`claude`, `codex`, `gemini`), clear `prompt`, and `role` (from clink role prompts directory) -- Always reuse `continuation_id` to preserve conversation context -- Include absolute file paths when relevant +Ask the user when: -**Special: Explore and Plan agents** +- requirements are ambiguous +- multiple valid approaches exist and trade-offs are non-obvious +- explicit approval is required -For `Explore` and `Plan` agents, specify thoroughness in your prompt: +If none apply, continue to Gate 2. -| Thoroughness | Result | -| :--- | :--- | -| `"quick"` | Basic pattern matching, fast lookups | -| `"medium"` | Moderate exploration across multiple files | -| `"very thorough"` | Comprehensive analysis across entire codebase | +### Gate 2: Handle directly or delegate -### Using `Task` tool *(Claude Code)* / native subagents *(OpenCode)* +Handle directly when one or more of these are true: -**When to use:** -- Codebase exploration (Explore agent with configurable thoroughness) -- Multi-step research tasks and token-conserving searches -- Parallelizing independent search/analysis tasks +- task is simple and well-understood (for example, 1-2 clear file edits) +- task needs tight iteration loops (debugging with rapid hypothesis testing) +- explanation overhead is higher than execution time -**When NOT to use:** -- Single known file reads or known symbol queries (use direct tools instead) -- Cross-model delegation (use `clink` instead) +Delegate when a different model or role materially improves accuracy, speed, cost, or context handling. 
-### Using `AskUserQuestion` +Before delegating, account for delegation costs: -See [When to ask the user](#when-to-ask-the-user-askuserquestion) section for complete guidance. +- summarization overhead +- potential loss of nuance +- coordination and review delay +- risk of misunderstood prompt requirements -## Parallel delegation strategies +### Gate 3: If delegating, single or parallel -### When to parallelize +Use parallel delegation only when all are true: -Spawn multiple subagents **concurrently** (not sequentially) when tasks are: -- **Independent**: No data dependencies between them -- **Parallelizable**: Can execute simultaneously without coordination -- **Time-consuming**: Research, analysis, exploration, code search -- **Mergeable**: Results can be combined afterward +- tasks are independent +- tasks can run concurrently without coordination +- tasks are time-consuming enough to justify split execution +- outputs are mergeable and can be verified together -### How to parallelize +If any condition fails, use a single delegate (or return to direct handling). -**Critical:** Use **multiple tool calls in a single response** to launch parallel subagents. +### Default posture -**Example (correct - parallel execution):** -``` -Let me analyze these 4 modules in parallel using clink: -[Launches 4 clink calls in one message] -``` +Ask: "Can this be split into 2+ independent subtasks?" -**Anti-pattern (sequential execution):** -``` -Let me analyze module A first... -[waits for response] -Now let me analyze module B... -[waits for response] -``` +- if yes, parallelize +- if no, use single delegate or direct handling -### Common parallelization patterns +Err toward parallelization only after Gate 2 confirms delegation is worth the overhead. -1. **Multi-module analysis** - - Spawn N subagents, one per module/package/service - - Merge findings after all complete +## Subagent lifecycle checklist -2. 
**Dependency research** - - Research 5 libraries in parallel (features, security, compatibility) - - Compare results in single summary +Use this lifecycle for every delegated task. -3. **Cross-codebase search** - - Search multiple repos/branches concurrently - - Aggregate results +### Phase A: Prompt contract (before spawning) -4. **Independent refactors** - - Refactor 3 unrelated files in parallel - - Review all changes together +Always include in subagent prompts: -5. **Multi-step investigation** - - Investigate 4 potential root causes simultaneously - - Converge on likely culprit +1. **Relevant file paths** (absolute, not relative) +- Provide exact paths to files the subagent will need +- Example: `/Users/you/project/src/module/file.ts` NOT `./file.ts` -### Default mindset +2. **Summarized context** (what you've learned that's relevant) +- Key findings from your investigation so far +- Important constraints or requirements discovered +- Relevant architectural decisions or patterns -**Before starting work:** Ask "Can I break this into 2+ independent subtasks?" -- If **YES** → Spawn multiple subagents in parallel -- If **NO** → Handle directly or spawn single subagent +3. **Clear success criteria** (what "done" looks like) +- Specific deliverable expected from the subagent +- How you'll verify the work is complete +- What format you need the results in -**Err toward parallelization.** Coordination overhead is minimal compared to sequential execution time. +4. **Explicit constraints** (what NOT to do) +- Actions requiring approval (don't commit, don't delete, etc.) +- Areas to avoid modifying +- Specific approaches to reject -## Merge & verify results +### Phase B: Result reconciliation (after execution) -Parallel execution only helps if you consolidate the answers rigorously. After every parallel batch, run this checklist: +After every delegated batch, run this checklist: 1. 
**Collect & normalize:** pull every subagent summary into one workspace (table/list/doc) and record CLI/model/thinking levels plus citations so claims stay traceable. 2. **Compare & detect conflicts:** highlight overlaps, find contradictions or duplicated work, and confirm no component was overlooked. -3. **Validate critical claims:** spot-check referenced files, rerun key commands/tests, and double‑check web/API citations before accepting conclusions. -4. **Decide outcomes:** +3. **Validate critical claims:** spot-check referenced files, rerun key commands/tests, and double-check web/API citations before accepting conclusions. +4. **Decide outcomes:** mark each subtask `Accepted` / `Needs follow-up` / `Rejected`; if blockers remain, spawn a focused follow-up subagent or use the [trigger matrix](#trigger-matrix). - - Mark each subtask `Accepted` / `Needs follow-up` / `Rejected` - - If blockers remain, spawn a focused follow-up subagent or use `AskUserQuestion` +### Phase C: Close the loop -5. **Record & broadcast:** update Memory MCP (relationships) and Qdrant (insights/gotchas) with the reconciled truth (and not raw subagent dump) and summarize decisions back to the main thread/user. -6. **Plan next actions:** turn accepted recommendations into concrete edits/tests/commits and explicitly close the loop on rejected paths so future agents don’t retry them. +1. **Record & broadcast:** update memory tools with the reconciled, distilled truth (and not raw subagent dump) and summarize decisions back to the main thread/user. +2. **Plan next actions:** turn accepted recommendations into concrete edits/tests/commits and explicitly close the loop on rejected paths so future agents don't retry them. -> **Reminder:** Never ship or store memories until merged results are verified. Raw, unvetted subagent output must not flow into persistent systems. 
+### Hard rule on memory writes -## Subagent context management - -### Setting your spawned subagents up for success - -**Always include in subagent prompts:** - -1. **Relevant file paths** (absolute, not relative) - - Provide exact paths to files the subagent will need - - Example: `/Users/you/project/src/module/file.ts` NOT `./file.ts` - -2. **Summarized context** (what you've learned that's relevant) - - Key findings from your investigation so far - - Important constraints or requirements discovered - - Relevant architectural decisions or patterns - -3. **Clear success criteria** (what "done" looks like) - - Specific deliverable expected from the subagent - - How you'll verify the work is complete - - What format you need the results in - -4. **Explicit constraints** (what NOT to do) - - Actions requiring approval (don't commit, don't delete, etc.) - - Areas to avoid modifying - - Specific approaches to reject - -### Tool-specific guidance - -**For `clink`:** -- **Reuse `continuation_id`** to preserve conversation context across multiple turns -- Provide **absolute file paths**, never relative paths -- Specify **role explicitly** (e.g., `debugger`, `architect`); don't assume defaults -- Include **images** parameter if visual context is needed (screenshots, diagrams) - -**For `Task` tool (Claude Code):** -- Choose appropriate **thoroughness level** for Explore/Plan agents (`quick`, `medium`, `very thorough`) -- Use **parallel Task calls** for independent exploration/research tasks -- Provide **clear prompt** with specific research questions or exploration goals - -**For native subagents (OpenCode):** -- Press **Tab** to cycle through registered agents and select the appropriate one -- Use natural language to delegate, or explicitly name the agent (e.g., "Have the debugger agent investigate...") -- OpenCode auto-delegates when agent descriptions match the task context - -### Common context handoff mistakes - -**DON'T:** -- ❌ Say "investigate this bug" without 
providing error messages, stack traces, or relevant files -- ❌ Assume subagent has your conversation history (it doesn't; provide needed context) -- ❌ Use vague success criteria like "make it better" or "fix the issues" -- ❌ Forget to mention approval requirements (subagent might commit/delete without knowing) -- ❌ Provide relative paths when codebase structure is ambiguous - -**DO:** -- ✅ Include specific error messages: "Getting `NullPointerException` at line 42 in `UserService.java`" -- ✅ Summarize findings: "Found 3 similar bugs in `auth/` module, all related to session handling" -- ✅ Set concrete goals: "Refactor `processPayment()` to extract retry logic into separate function" -- ✅ State constraints clearly: "Don't modify database schema; only change application code" -- ✅ Give absolute paths (not relative) - -## Choosing *models* when spawning subagents - -### Decision tree (quick reference) - -Use this flowchart for fast model selection based on **hard constraints** and **proven strengths**: - -1. **Context >400K tokens?** → **Gemini 2.5 Pro** (1M context window; only option) -2. **Math-heavy work** (DB/ML/optimization) **+ context ≤400K?** → **GPT-5.1-Codex** with **High thinking** (superior math/code accuracy) -3. **Mechanical refactors, multi-file changes, testing?** → **GPT-5.1-Codex** (proven strength for repo-wide changes) -4. **Fast iterations, CI fixes, simple tasks?** → **Claude Haiku 4.5** (speed) -5. **Maximum rigour, highest-impact decisions?** → **Claude Opus 4.5** (use sparingly; strict weekly limits) -6. 
**Default for most other tasks?** → **GPT-5.1-Codex** or **Claude Sonnet 4.5** (both capable; choose based on task table below, personal preference, or availability) - -### Task category → model/role mapping - -| Task Category | First Pick | CLI | Role | Why | -| --- | --- | --- | --- | --- | -| Architecture & system design | Sonnet 4.5 | `claude` | `architect` | Trade‑off aware planning; extended thinking | -| Large migrations & refactors | GPT‑5.1‑Codex | `codex` | `migration-refactoring` | Best for repo‑wide mechanical changes | -| API integration | GPT‑5.1‑Codex | `codex` | `api-integration` | Contract‑first design; idempotency/retry patterns | -| Data engineering / ML | GPT‑5.1‑Codex | `codex` | `ai-ml-eng` | Superior code/math accuracy when ≤400K tokens | -| Database optimization | GPT‑5.1‑Codex | `codex` | `db-internals` | Cross‑file SQL + app code analysis | -| Performance optimization | GPT‑5.1‑Codex | `codex` | `optimization` | Finds N+1s, blocking I/O, complexity issues | -| Frontend / design systems | GPT‑5.1‑Codex | `codex` | `frontend` | Component refactors; accessibility | -| Platform / DevEx / Infra | Haiku 4.5 | `claude` | `platform-eng` | CI fix‑ups; batch hygiene; speed | -| Observability / incident | Sonnet 4.5 | `claude` | `observability` | Runbooks; SLO/error budgets; RCAs | -| Reliability / scalability | Sonnet 4.5 | `claude` | `scalability-reliability` | Capacity plans; coordination | -| Security / privacy | Sonnet 4.5 | `claude` | `security-compliance` | Threat modeling; audits | -| Testing / verification | GPT‑5.1‑Codex | `codex` | `testing` | Unit/property/mutation tests | -| Real‑time systems | GPT‑5.1‑Codex | `codex` | `realtime` | Latency budgets; scheduling | -| Systems: C++ | GPT‑5.1‑Codex | `codex` | `cpp-pro` | Concurrency, FFI, safety | -| Systems: Rust | GPT‑5.1‑Codex | `codex` | `rust-pro` | Ownership, unsafe blocks | -| Systems: Go | GPT‑5.1‑Codex | `codex` | `golang-pro` | Goroutines, channels | -| DevOps / IaC | GPT‑5.1‑Codex 
| `codex` | `devops-infra-as-code` | Terraform, Kubernetes | -| Implementation helper | GPT-5.1-Codex or Sonnet 4.5 | `codex` for GPT-5.1-Codex; `claude` for Sonnet 4.5 | `implementation-helper` | General coding assistance | -| Code explanation / understanding | Any with good context | Any | `explainer` | Choose model based on codebase size/complexity | -| Architecture audits | Sonnet 4.5 | `claude` | `architecture-audit` | Deep analysis of existing architecture; identify issues | -| Tech debt analysis | Sonnet 4.5 | `claude` | `tech-debt` | Identify and prioritize technical debt systematically | -| Cost optimization | Sonnet 4.5 | `claude` | `cost-optimization-finops` | Cloud costs; resource optimization; FinOps strategies | -| Distributed systems design | Sonnet 4.5 | `claude` | `distributed-systems` | Consensus; consistency; partition tolerance | -| Networking / edge infrastructure | GPT‑5.1‑Codex | `codex` | `networking-edge-infra` | CDN; load balancing; edge compute patterns | -| Mobile engineering | GPT‑5.1‑Codex | `codex` | `mobile-eng-architect` | iOS/Android architecture; mobile‑specific patterns | -| Whole‑repo analysis >400K tokens | Gemini 2.5 Pro | `gemini` | `explainer` | 1M‑token context | -| API client SDK design | GPT‑5.1‑Codex | `codex` | `api-client-designer` | Idiomatic SDKs; retries/pagination/auth; contract‑driven | -| Authentication engineering | Sonnet 4.5 | `claude` | `auth-specialist` | Secure flows (sessions/OAuth/JWT); OWASP; least privilege | -| Caching strategy & implementation | Gemini 2.5 Pro | `gemini` | `caching-specialist` | Layering, TTL/invalidations, cache keys, metrics | -| Chaos engineering & resilience drills | Sonnet 4.5 | `claude` | `chaos-engineer` | Plan experiments, limit blast radius, SLOs; coordinate runs | -| Code review | GPT‑5.1‑Codex | `codex` | `code-reviewer` | Quality, security, maintainability; actionable diffs | -| Data engineering pipelines | Gemini 2.5 Pro | `gemini` | `data-eng` | ETL/ELT, SQL tuning, 
batch/stream jobs | -| Debugging and triage | GPT‑5.1‑Codex | `codex` | `debugger` | Cross‑file fault isolation; logs/traces; reproduction | -| Event‑driven design/implementation | GPT‑5.1‑Codex | `codex` | `event-driven` | Topics/queues, idempotency, backpressure, ordering | -| Engineering historian | Sonnet 4.5 | `claude` | `historian` | Change timelines, root causes, commit/story synthesis | -| Incident command | Sonnet 4.5 | `claude` | `incident-commander` | Runbooks, comms, decision logs, mitigations | -| Interviewer (requirements) | Sonnet 4.5 | `claude` | `interviewer` | Clarify constraints/trade‑offs; elicit missing info | -| Schema evolution & migrations | GPT‑5.1‑Codex | `codex` | `schema-evolution` | Backward‑compatible changes, migrations, rollout/rollback | -| Repository/code search | Haiku 4.5 | `claude` | `searcher` | Fast exploration; summarize findings; pointers | -| Task decomposition | Sonnet 4.5 | `claude` | `task-decomposer` | Break goals into verifiable steps, deps, risks | - -### Thinking level optimization - -When delegating via `clink`, adjust thinking levels in your prompt for better performance: - -**Codex (effort levels):** -- `"minimal effort"`: formatting, regex, simple text transforms -- `"low effort"`: single-file edits, straightforward changes -- *(default)*: cross-file analysis, moderate complexity -- `"high effort"`: migrations, concurrency, complex refactors, math-heavy work - -**Claude (extended thinking):** - -Add `"extended thinking"` to your prompt for: -- Long-horizon refactors, complex architectural decisions -- Root cause analysis (RCAs), incident investigations -- Multi-service planning, system design -- Complex research requiring deep reasoning +> [!IMPORTANT] +> +> Never ship or store memories until merged results are verified. Raw, unvetted subagent output must not flow into persistent systems. 
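The hard rule on memory writes can be sketched as a small guard. This is only an illustration: `SubtaskResult`, `Outcome`, and `storable_results` are hypothetical names, not part of Bureau, and treating "verified with a settled outcome" as the bar for storage is an assumption drawn from Phases B and C above.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    # The three labels used in Phase B, step 4
    ACCEPTED = "Accepted"
    NEEDS_FOLLOW_UP = "Needs follow-up"
    REJECTED = "Rejected"


@dataclass
class SubtaskResult:
    name: str          # hypothetical label for the delegated subtask
    summary: str       # reconciled finding, not the raw subagent dump
    outcome: Outcome   # decided during result reconciliation
    verified: bool     # True only after critical claims were spot-checked


def storable_results(results: list[SubtaskResult]) -> list[SubtaskResult]:
    """Keep only results that are verified and no longer pending follow-up."""
    return [
        r for r in results
        if r.verified and r.outcome is not Outcome.NEEDS_FOLLOW_UP
    ]
```

Rejected paths are kept in this sketch so that future agents can see they were ruled out; unverified or still-open items never reach persistent storage.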
## Handoff patterns -Follow the sequential flow of phases in the table below (in the order presented): +Use this phase map as a quick navigator. Detailed execution rules live in linked sections. -| Order | Phase name | Objectives | Actions to perform | +| Order | Phase name | Primary objective | Use these sections | | :--- | :--- | :--- | :--- | -| **1** | **Research** | Understand requirements, code, and constraints | Use `Task` Explore (Claude) for codebase exploration; use direct read/glob for known files/symbols; use `clink`→Gemini for long‑context needs (>~400K). If ambiguity remains, see [AskUserQuestion](#when-to-ask-the-user-askuserquestion) | -| **2** | **Planning** | Design the approach, break down tasks, identify risks | Outline steps; note trade‑offs; delegate complex architecture to Claude (`architect`) or refactoring planning to Codex (`migration-refactoring`) via `clink`. For large goals, use `task-decomposer` to produce verifiable substeps and dependencies. Resolve ambiguities via [AskUserQuestion](#when-to-ask-the-user-askuserquestion). See [What requires explicit approval, always](#what-requires-explicit-approval-always) before proceeding. | -| **3** | **Implementation** | Execute the plan and write changes | Track tasks; use Codex for wide refactors/testing; use Claude for coordination; use Gemini for long‑context codebase analysis. Leverage specialist roles as needed (e.g., `schema-evolution`, `auth-specialist`, `caching-specialist`, `event-driven`, `debugger`). Also see [what requires explicit approval, always](#what-requires-explicit-approval-always) before proceeding. | -| **4** | **Review/Verification (optional)** | Verify changes, run tests, prepare for commit | Request code review (use `code-reviewer` for structured guidance); run tests; follow the [Approval pattern](#approval-pattern) as needed. 
| +| **1** | **Research** | Understand requirements, code, and constraints | [Delegation decision flow](#delegation-decision-flow), [Phase A: Prompt contract](#phase-a-prompt-contract-before-spawning), [Trigger matrix](#trigger-matrix) | +| **2** | **Planning** | Design approach, break down tasks, identify risks | [Delegation decision flow](#delegation-decision-flow), [Phase A: Prompt contract](#phase-a-prompt-contract-before-spawning), [Authorization categories](#authorization-categories-explicit-approval-required) | +| **3** | **Implementation** | Execute changes while controlling risk and coordination overhead | [Delegation decision flow](#delegation-decision-flow), [Phase B: Result reconciliation](#phase-b-result-reconciliation-after-execution), [Authorization categories](#authorization-categories-explicit-approval-required) | +| **4** | **Review/Verification** | Validate outcomes, resolve conflicts, and prepare safe closure | [Phase B: Result reconciliation](#phase-b-result-reconciliation-after-execution), [Phase C: Close the loop](#phase-c-close-the-loop), [Approval workflow](#approval-workflow) | ## When to ask the user (AskUserQuestion) -### When to ask - -Ask the user when: -- Requirements are ambiguous, trade‑offs are uncited, or multiple valid approaches exist. -- High‑impact architectural decisions are involved (system design, stack selection, breaking changes). -- Critical information is missing (configs, environment details). -- Security/compliance sensitivity exists (credentials, access control, retention). -- Before destructive operations or any action listed under “Explicit Approval”. - -### Best practices -- Provide 2–4 clear options with concise trade‑offs; allow multi‑select when appropriate. -- Include context about why you’re asking; keep headers short. - -### What requires explicit approval, always - -- **Version control operations**: creating commits (unless explicitly told), pushing, merging, rebasing, force pushes, amending commits. 
- - - Exception: User explicitly says "commit this" or "push these changes". - -- **Destructive operations**: deleting files/dirs, truncating databases, dropping tables, purging caches, removing dependencies. +### Trigger matrix - - *Exception*: User explicitly says "delete X" or removal is part of an explicitly approved refactoring task. +Use this matrix to decide whether to ask and what kind of ask to perform. -- **Production/deployment**: prod deploys, prod config changes, service restarts, env var changes, prod migrations. +| Trigger type | Use when | Required action | +| :--- | :--- | :--- | +| **Clarify** | Requirements are ambiguous, or critical information is missing (for example, configs or environment details). | Ask focused clarification questions before proceeding. | +| **Choose** | Multiple valid approaches exist with non-obvious trade-offs (including high-impact architecture or breaking-change decisions). | Present 2-4 options with concise trade-offs and request a decision. | +| **Authorize** | The action is in an explicit-approval category, including security/compliance-sensitive operations. | Follow the [approval workflow](#approval-workflow) and wait for explicit approval. | - - Never assume: always ask even if user says "deploy". +### Asking format -- **Security/access changes**: authN/Z changes, permission grants, exposing endpoints, disabling security features, handling secrets. -- **Breaking changes**: public API removals, signature changes without backward compatibility, schema changes without migrations, config format changes. +- Provide 2-4 clear options with concise trade-offs. +- Allow multi-select when appropriate. +- Include context about why you are asking. +- Keep headers short. - - *Exception*: Proceed only with explicit user approval for the breaking change. 
+### Authorization categories (explicit approval required) -- **Cost‑impacting changes**: adding cloud resources, increasing instance size, storage tier changes, rate‑limit changes, adding paid third‑party services. +| Category | Includes | Exception | +| :--- | :--- | :--- | +| **Version control operations** | creating commits (unless explicitly told), pushing, merging, rebasing, force pushes, amending commits | User explicitly says "commit this" or "push these changes". | +| **Destructive operations** | deleting files/dirs, truncating databases, dropping tables, purging caches, removing dependencies | User explicitly says "delete X" or removal is part of an explicitly approved refactoring task. | +| **Production/deployment** | prod deploys, prod config changes, service restarts, env var changes, prod migrations | No exception; always ask, even if user says "deploy". | +| **Security/access changes** | authN/Z changes, permission grants, exposing endpoints, disabling security features, handling secrets | No exception. | +| **Breaking changes** | public API removals, signature changes without backward compatibility, schema changes without migrations, config format changes | Proceed only with explicit user approval for the breaking change. | +| **Cost-impacting changes** | adding cloud resources, increasing instance size, storage tier changes, rate-limit changes, adding paid third-party services | No exception. | -#### Approval pattern +### Approval workflow -1. Clearly state what will happen -2. List affected resources/files -3. Explain potential risks/impacts -4. Use AskUserQuestion with clear options -5. Wait for explicit approval -6. Proceed only after approval +1. Clearly state what will happen. +2. List affected resources/files. +3. Explain potential risks/impacts. +4. Use `AskUserQuestion` with clear options. +5. Wait for explicit approval. +6. Proceed only after approval. 
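As a rough illustration of the authorization categories and their exceptions, the table can be read as a lookup. The category keys and the `must_ask` helper below are hypothetical names chosen for this sketch; only the exception rules themselves come from the table.

```python
from typing import Optional

# True means an explicit user instruction (e.g. "commit this", "delete X")
# already counts as approval; False means always ask first.
EXPLICIT_REQUEST_SUFFICES = {
    "version_control": True,
    "destructive": True,
    "production": False,       # always ask, even if the user says "deploy"
    "security": False,
    "breaking_change": False,  # needs approval for the breaking change itself
    "cost": False,
}


def must_ask(category: Optional[str], user_explicitly_requested: bool = False) -> bool:
    """Return True when the approval workflow has to run before acting."""
    if category is None:
        # The action is not in any explicit-approval category
        return False
    # Unknown categories are treated conservatively: always ask
    exception_applies = EXPLICIT_REQUEST_SUFFICES.get(category, False)
    return not (exception_applies and user_explicitly_requested)
```

For example, `must_ask("production", user_explicitly_requested=True)` still returns `True`, mirroring the "no exception" rows.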
diff --git a/protocols/context/static/skills/README.md b/protocols/context/static/skills/README.md new file mode 100644 index 00000000..6b9f1182 --- /dev/null +++ b/protocols/context/static/skills/README.md @@ -0,0 +1,23 @@ +# Static skill templates + +## Important note: *avoid* adding `name` fields to `SKILL.md` files' YML frontmatter + +Most coding agents require the `name` field in a given `SKILL.md` to match the name of the folder they're installed to. Now, when installing skills to each supported coding agent's configs, Bureau adds a prefix to the dirnames in which skills are installed; this is to **avoid any potential naming conflicts with other skills already installed to a given coding agent's config**. + +> [!NOTE] +> The prefix added to skills' install directories is set using `skills.sources[].prefix` in the [YML configs](../../../../docs/CONFIGURATION.md); by default, `bureau-` is used. + +> For example, in each coding agent's config's `skills/` directory, the [*micro mode* skill](micro-mode/SKILL.md) gets installed in a subdirectory called `bureau-micro-mode` (i.e., ``). + +> [!IMPORTANT] +> +> Note this also has the effect of **changing each skill's name, as shown in each coding agent's interface, to be displayed as ``** (e.g. `bureau-micro-mode`). + +Hence, if a `name` property is included in a `SKILL.md`'s YAML frontmatter, its value must contain the `bureau-` prefix to match the name of the parent directory (one per enabled coding agent) it will be installed to. However, the parent dirs of the internal Bureau directories where the source/canonical `SKILL.md` files are stored do *not* contain this prefix in their names. Consequently, in VSCode, the line containing the `name` property gets flagged with warnings due to the `bureau-` prefix not matching the parent dir's name. 
+ +This risks creating confusion for Bureau users who: +- add their own custom skills, and/or +- read the included `SKILL.md` files +and see the aforementioned warning, since they will likely not be aware of the explanation above. + +The solution is to simply **avoid using the `name` property in all `SKILL.md` files** since it's entirely optional: the actual source of truth for a skill's name (as shown in coding agents' CLIs) is the **name of the install directory (i.e., within a coding agent's config's `skills/` directory) containing the `SKILL.md` file.** diff --git a/protocols/context/static/skills/assess-mode/SKILL.md b/protocols/context/static/skills/assess-mode/SKILL.md new file mode 100644 index 00000000..b5261318 --- /dev/null +++ b/protocols/context/static/skills/assess-mode/SKILL.md @@ -0,0 +1,333 @@ +--- +description: Two-phase code assessment workflow (architectural comprehension then quality audit) that adapts output to context. Interactive guided tour when running as a main agent; structured markdown report when running as a subagent. Supports four comprehension styles including hunk-by-hunk inline review (comprehension + audit per diff hunk). Activate when user says "assess my changes", "review my changes", "walk me through this code", "audit these files", "assess my changes hunk by hunk", "detailed review", or "ASSESS MODE ON". Configurable standards sources and git diff targets. +--- + +# Assess mode: *protocol* + +> ***Goal:** understand, then audit* +> +> *Two-phase review of code changes: first build a mental model of what changed and why, then audit every file against configured quality standards. Adapt delivery to context: interactive tour if running as a main agent, and written report if running as an isolated subagent.* + +> [!IMPORTANT] +> +> The directives below are **non-negotiable hard constraints** to be followed **exactly as they are specified**. 
+ +## Formatting + +Read and follow `style.md` (bundled with this skill) for **all** output: interactive messages, written reports, and any files you create or edit. No exceptions. + +## Activation + +When the user says anything like: + +- "assess my changes" +- "assess this branch" +- "review my changes" +- "review this branch" +- "walk me through this code" +- "audit these files against my standards" +- "assess my changes hunk by hunk" +- "detailed review" +- "ASSESS MODE ON" + +*Follow this assess mode protocol.* If you are unsure, confirm unambiguously with the user. + +## Determine inputs + +### Target *(what to review)* + +- If the user provides **explicit files or directories** in their prompt, review those +- Otherwise, use **git diff** against the ref configured in `assess_mode.default_diff` (default: `HEAD`) + + - Include both modified (staged + unstaged) and untracked files + - Run `git diff --name-only ` and `git ls-files --others --exclude-standard` to collect the full changeset + +### Standards sources *(what to check against)* + +- Resolve in this order: + + 1. User provides paths in their activation prompt → use those + 2. `code_standards` in config → use those + 3. `~/.config/bureau/protocols/code-standards.md` exists → use that + 4. None found → the quality audit still runs but only checks internal consistency (DRYness, algorithmic efficiency, codebase pattern consistency); explicitly inform the user that no external standards docs were found and suggest: *"For more targeted reviews, you can configure `code_standards` in your Bureau configuration or edit `~/.config/bureau/protocols/code-standards.md` to provide files detailing what you want agents in Assess Mode to focus on. 
See the documentation for details."* + +- Read all resolved standards documents before beginning the audit phase + +### Mode *(how to deliver results)* + +- **Interactive** → default when running as a main/direct agent in conversation with the user +- **Report** → default when spawned as a subagent; also triggered when the user explicitly requests a report (e.g. "review my changes and write a report") + +## Phase 1: comprehension + +### Internal prep *(do not show to user)* + +1. Collect the full changeset (all files to review) +2. Read every changed/new file +3. Build a dependency graph → which files import, call, or reference which +4. Identify **logical groups** → cluster files by module, directory, or purpose (e.g. "config loading", "MCP setup", "test suite") +5. **Produce a canonical file ordering** → topologically sort files using the dependency graph from step 3 (foundational files first, consumers last) + + - Within each connected component, order by depth: files with no in-edges (pure providers) come first, files that only consume come last + - For files with no dependency relationship to each other (disconnected components), fall back to logical-group clustering from step 4: place foundational groups (utilities, config, data models) before consumer groups (application logic, scripts, tests) + - This ordering is the **single source of truth** for all comprehension styles that present files sequentially (dependency-ordered, hunk-by-hunk) + +6. For each file, extract: purpose, key functions/classes, invariants, design decisions +7. Identify cross-cutting concerns: shared utilities, common patterns, config flow + +### Present style choice *(interactive mode only)* + +> [!NOTE] +> +> In **report mode**, skip this prompt and use the **layered walkthrough** by default (it reads well as a document). The **hunk-by-hunk** style is interactive-mode only and is never used in report mode. 
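Step 5 of the internal prep (the canonical file ordering) could be approximated with Python's standard `graphlib` module. This is a minimal sketch under assumptions: `canonical_order` and the shape of `deps` are hypothetical, ties within a depth level are broken alphabetically rather than by logical group, and the fallback for disconnected components is not modelled.

```python
from graphlib import TopologicalSorter


def canonical_order(deps: dict[str, set[str]]) -> list[str]:
    """Order files so that providers come before the files that consume them.

    `deps` maps each changed file to the set of changed files it imports,
    calls, or references (an assumed representation of the dependency graph).
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    ordered: list[str] = []
    while sorter.is_active():
        # Every file whose dependencies are already placed forms one depth level
        ready = sorted(sorter.get_ready())
        ordered.extend(ready)
        sorter.done(*ready)
    return ordered
```

Processing the graph level by level keeps files with no in-edges first and pure consumers last, which matches the depth rule in step 5; circular imports would need separate handling.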
+ +Present the user with this choice: + +> How would you like me to walk you through these changes? +> +> 1. **Top-down summary** → one cohesive narrative of what the changeset does, then move to quality audit; best when you roughly know what changed and just need confirmation +> 2. **Layered walkthrough** → executive summary, then component map, then per-component deep dive, with pauses between layers; best when you want to build a mental model incrementally +> 3. **Dependency-ordered** → foundational modules first, consumers last, like reading a textbook; best when the code is unfamiliar and you want to understand it in the order it was designed to be understood +> 4. **Hunk-by-hunk** → walk through each diff hunk individually, explaining what changed and why, then audit it inline; best when you want the finest granularity and want to catch issues in context as they appear +> 5. **Skip** → go straight to quality audit + +### Execute comprehension + +#### Observation directive *(applies to all styles)* + +> [!IMPORTANT] +> +> Throughout comprehension, regardless of which style is active, **surface brief observations (1-3 sentences)** when the code reveals something non-obvious: an interesting design trade-off, an architectural decision worth noting, or a genuinely suboptimal pattern that could be improved. +> +> - **Read before you speak:** if your observation depends on code outside the current scope (callers, sibling modules, prior art in the codebase), read that context first. Do not speculate about what surrounding code does +> - **Earn the call-out:** only flag a design as suboptimal if (a) you have ingested enough context to be confident, (b) the improvement is concrete and actionable, and (c) it is not premature abstraction or YAGNI material. A real problem with a real fix — not a hypothetical improvement +> - **Skip silently** when there is nothing worth noting. 
Forced insight on uninteresting code wastes the user's attention +> +> Each style below specifies *when* observations should appear in that style's flow. + +#### Top-down summary + +- Present a single architectural narrative covering the entire changeset +- Weave observations into the narrative where they arise naturally — don't separate them into a distinct section +- One pause for questions, then proceed to Phase 2 + +#### Layered walkthrough + +- **Layer 1:** one-paragraph executive summary of the entire changeset +- **Layer 2:** component map → the 3-5 logical groups of files, what each group does, how they connect +- **Layer 3:** per-component deep dive → design decisions, data flow, invariants for each group. Surface observations during this layer, attached to the component they concern +- **Pause after each layer**; the user can: + + - Ask questions about the current layer + - Say "continue" or "next" to proceed to the next layer + - Say "skip to audit" to jump to Phase 2 + - Say "go deeper on this" to expand a section + +#### Dependency-ordered + +- Walk through files in the **canonical topological ordering from prep step 5** (foundational modules first, consumers last) +- For each file/module, explain what it does and why. Surface observations inline as each module is presented +- **Pause after each module**; same user controls as layered walkthrough + +#### Hunk-by-hunk + +> [!IMPORTANT] +> +> - This style **merges Phase 1 and Phase 2**: comprehension and audit happen *together*, inline per hunk. +> - There is no separate Phase 2 pass: skip directly to [Wrap-up](#wrap-up) after all hunks are processed. 
+ +- **Parse the diff** into individual hunks via `git diff -U3 ` + + - Group adjacent or overlapping hunks within the same file into a single logical unit + - **Order files using the canonical topological ordering from [internal prep](#internal-prep-do-not-show-to-user)'s step 5** (foundational files first, consumers last) — this is mandatory, not a suggestion; the inline audit's cross-hunk awareness depends on having reviewed provider interfaces before encountering the consumer code that calls them + - Within each file, present hunks in **source order** (top-to-bottom, i.e., ascending line number) to preserve the natural reading flow of the file + +- **For each hunk (or hunk group)**, emit this block: + + ``` + Hunk N/M — :- () + ``` + + 1. **Show the diff** → render the hunk as a fenced diff code block (` ```diff `) + 2. **Explain** → what changed and why (1-3 bullets; reference the dependency graph and logical groups from internal prep) + 3. **Observe** *(optional)* → apply the [observation directive](#observation-directive-applies-to-all-styles) to this hunk. Surface observations between the explanation and the audit; skip silently on trivial hunks (renames, import reordering, comment edits) + + 4. **Inline audit** → run all 6 check categories against *this hunk only*; emit findings using the same global sequential numbering (`#1` through `#N`) and severity levels as Phase 2 + + - **Cross-hunk awareness:** check whether this hunk echoes or contradicts findings from previously reviewed hunks. If a pattern recurs, reference the original finding number rather than re-explaining (e.g. `#7 [should-fix] Same unquoted expansion as #2`). If a prior finding is *resolved* by this hunk, note that too + - If no findings: emit `No findings for this hunk.` + - If findings exist: list each as: + + ``` + # [] + <one-line explanation and suggested fix> + ``` + + 5. **Pause** → emit: + + ``` + User: ">" to advance | "." 
to skip rest of file | "deeper" to expand | "fix #N" to fix now | or ask a question + ⏸️ + ``` + + 6. **Wait for user signal** before proceeding: + + | User input | Agent action | + | --- | --- | + | `>` or "next" | Advance to next hunk | + | `.` or "skip" | Skip remaining hunks in current file, advance to next file | + | `deeper` | Expand analysis: show data flow, callers/callees, invariants affected by this hunk | + | `fix #N` | Apply the suggested fix for finding `#N` immediately, re-audit the changed lines, then re-show the hunk with the fix applied (and any new findings numbered sequentially) and re-pause | + | A question | Answer in context of the current hunk, then re-pause on the same hunk | + +## Phase 2: quality audit + +### Check categories + +For each file in the changeset, check against these six categories: + +1. **DRYness** → duplicated logic within the file *and* across the changeset; also flag over-abstraction (functions with too many parameters just to avoid trivial repetition) +2. **Algorithmic efficiency** → wrong data structure, unnecessary complexity class, suboptimal library choice, unnecessary iterations +3. **Consistency with codebase patterns** → does the new code follow conventions already established in the project (naming, structure, idioms, error handling)? +4. **Coding style** → checked against the configured standards docs (docstring format, comment style, naming conventions, file organization) +5. **Correctness concerns** → edge cases, potential race conditions, missing error handling, invariant violations (not a full security audit, but obvious issues) +6. **Design fit** → does this module belong where it is, are the abstractions at the right level, is anything over- or under-engineered? 
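Findings produced by these checks are referenced elsewhere in the protocol by a single review-wide number (for example `fix #N`). One way to keep that numbering consistent is a shared counter, sketched below in Python. The names `Finding`, `FindingLog`, and `SEVERITIES` are illustrative assumptions, and the severity labels follow the `[should-fix]` spelling used in the hunk-by-hunk example.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import count

# Assumed label spellings, matching the `[should-fix]` style shown earlier.
SEVERITIES = ("must-fix", "should-fix", "consider")


@dataclass(frozen=True)
class Finding:
    number: int    # review-wide number, never reset per file or per severity
    path: str
    severity: str
    summary: str


class FindingLog:
    """Collects findings and hands out one sequential number per finding."""

    def __init__(self) -> None:
        self._numbers = count(1)
        self.findings: list[Finding] = []

    def add(self, path: str, severity: str, summary: str) -> Finding:
        if severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {severity}")
        finding = Finding(next(self._numbers), path, severity, summary)
        self.findings.append(finding)
        return finding

    def summary(self) -> str:
        # Tally per severity, reported in a fixed order.
        tally = Counter(f.severity for f in self.findings)
        return ", ".join(f"{tally[s]} {s}" for s in SEVERITIES)
```

Because the counter lives on the log rather than on a file or severity bucket, a follow-up such as "fix #3 and #7" always resolves to exactly one finding each.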
+ +### Severity levels + +- **Must fix** → correctness bug, security issue, broken invariant +- **Should fix** → DRY violation, algorithmic inefficiency, clear style violation +- **Consider** → subjective design opinion, minor style nit, potential improvement + +### Finding numbering + +- Every finding is **numbered sequentially** across the entire review: `#1` through `#N` +- Numbering is **global** (not per-file, not per-severity) +- This enables quick referencing in follow-up prompts (e.g. "fix #3 and #7, ignore #5") + +### Interactive mode delivery + +> [!NOTE] +> +> If the **hunk-by-hunk** comprehension style was used, audit findings were already delivered inline per hunk. **Skip this entire Phase 2 delivery** and proceed directly to [Wrap-up](#wrap-up). + +- Walk through findings **grouped by file**, in the same order used during comprehension +- For each finding, explain: what the issue is, why it matters, and a suggested fix +- After each finding (or group of findings per file), pause for user response: + + - Agree: "fix it" / "noted" / "will fix" + - Disagree: "that's intentional, here's why" + - Ask for more context: "why is this a problem?" + - Batch response: "fix #3 and #7, ignore #5" + +### Report mode delivery + +- Write a structured markdown document following the template below +- Findings appear in a summary table at the top, then in detail sections grouped by severity + +## Wrap-up + +### Interactive mode + +> [!NOTE] +> +> In **hunk-by-hunk** mode, all findings were delivered inline. The wrap-up still aggregates them into a single summary. Include any findings the user addressed via `fix #N` as resolved. + +- **Verdict** → open with a 1-2 sentence overall assessment: is this changeset ready to ship as-is, ready with minor fixes, or does it need significant rework? 
Be direct — the user wants a judgment call, not a hedge +- Summarize: "**N** must-fix, **M** should-fix, **K** consider (**R** already fixed inline)" — omit the "already fixed inline" count if zero +- Ask if the user wants any remaining findings addressed now +- If the user expressed **recurring style or design preferences** during the review (e.g. "I prefer early returns," "don't flag missing docstrings on private methods"), list them back and ask: *"You mentioned these preferences during the review — would you like to add them to your quality standards?"* + +### Report mode + +- Write the report to `docs/reviews/YYYY-MM-DD-<branch-or-topic>.md` +- Return a one-paragraph summary to the calling agent/user with an overall verdict and finding counts + +## Report template + +> [!IMPORTANT] +> +> This template defines the **structure** of the report. Follow the `style.md` formatting guidelines for all content within it. + +```markdown +# Code review: <branch or topic> + +> **Generated:** <date> | **Target:** <git diff spec or file list> +> **Standards:** <list of standards source files used> + +## Executive summary + +- <2-3 bullets: what the changeset does, how many files, overall assessment> + +## Architecture + +### Component map + +- **<Group 1 name>** — <purpose> + + - Files: `file1.py`, `file2.py` + - Connects to: <Group 2> via <mechanism> + +- **<Group 2 name>** — <purpose> + + - Files: `file3.sh`, `file4.sh` + +### Per-component detail + +#### <Group 1 name> + +- **Purpose:** <what this group does> +- **Design decisions:** + + - <decision 1 and why> + - <decision 2 and why> + +- **Data flow:** <how data moves through this component> +- **Invariants:** <what must always be true> + +## Findings + +| # | File | Severity | Summary | +| --- | --- | --- | --- | +| 1 | `operations/mcp_catalog.py` | Should fix | Duplicated validation logic | +| 2 | `tools/scripts/set-up-tools.sh` | Must fix | Unquoted variable expansion on line 247 | + +### Must fix + +#### #2 — 
<title> + +- **File:** `<path>:<line>` +- **Problem:** <explanation> +- **Why it matters:** <impact> +- **Suggested fix:** <what to do> + +### Should fix + +#### #1 — <title> + +- **File:** `<path>:<line>` +- **Problem:** <explanation> +- **Suggested fix:** <what to do> + +### Consider + +#### #3 — <title> + +- **File:** `<path>` +- <explanation> + +## Summary + +**Verdict:** <1-2 sentence overall assessment — ready to ship, ready with minor fixes, or needs significant rework> + +- **Must fix:** N findings +- **Should fix:** M findings +- **Consider:** K findings +``` + +## Compatibility with other Bureau workflows + +- **Micro mode:** review findings can inform DAG construction; fixes become micro edits +- **Systematic debugging:** if a review finding reveals a bug, the user can invoke systematic debugging on that specific finding +- **Handoff guidelines:** in report mode, the skill is designed to run as a subagent; the calling agent can use findings to drive follow-up work +- **TDD skill:** "must fix" correctness findings can become test cases first (red), then fixes (green) diff --git a/protocols/context/static/skills/assess-mode/style.md b/protocols/context/static/skills/assess-mode/style.md new file mode 100644 index 00000000..9bb4f076 --- /dev/null +++ b/protocols/context/static/skills/assess-mode/style.md @@ -0,0 +1,134 @@ +# Style guidance for agents + +> [!IMPORTANT] +> +> - *All* of the following guidelines/directives: +> +> - **must** be read/executed (as appropriate) *before* beginning any task. 
+> - apply to ***any and all* file additions/edits you make.** +> +> - If a Markdown file you're editing has ***any*** portion that does not obey any of the directives, **fix the issue(s) immediately** +> +> - *Exception: **emojis** (see below)* + +## Formatting and content structure + +### General directives + +> [!IMPORTANT] +> +> ### Key directives +> +> Ensure, above all, that your content is: +> +> - **formatted such that it is easy to *quickly read/scan through*** by humans +> - **written *coherently and cohesively*, such that it is *easy to develop a mental model for*** +> +> The directives below are meant to ensure these 2 outcomes based on the user's preferences: follow them well. + +Content you write should: + +- Be structured, in most cases, as a **bulleted list with nesting of bullets** (to as many levels as you desire) to ensure that the content, via its structure, best fits the *"key directives"* above. + + - Bullets should contain at most **1 full sentence** (unless there is a *very* compelling reason to include more in the same bullet). To add more content past this, place it in nested bullets under the main one. + - Do *not* change/overwrite tables of contents to match this format: these are managed by a custom VSCode extension and should remain as-is. + +- Include **rich formatting** *(but don't overdo it; too much of these, inversely, makes a document **harder** to read!)*: + + - bolds + - italics + - underlines (via `<ins> ... </ins>`) + - GitHub-flavoured Markdown alerts (`[!NOTE]`, `[!IMPORTANT]`, `[!CAUTION]`, etc.) + +- Include tables where appropriate. However: + + - Any given cell of a table should **never** contain more than 20-25 words. 
+ - If you have more content to include in a given row of a table than can be fit/expressed/condensed reasonably into 20-25 words without sounding unnaturally/incomprehensibly terse, then either: + + - Convert your content to use the bulleted-list formatting described above + - Add the excess content related to the table item/row to a separate content section (i.e. anchored by a header of the appropriate level) and link to it from within the table row/column/cell as appropriate + + > Instead of a whole linked section, you can use a footer if the excess content is pedantic/low-volume. Use your best judgment w.r.t. this. + +- Any code blocks you include should have one or both of: + + - Extensive comments explaining any non-obvious/obscure code + - Well-structured, accompanying English descriptions that use the bulleted-list format described above to match pseudocode structurally. + +> [!NOTE] +> +> The "bulleted list" directive above can be ignored if: +> +> - the user gives you *direct instructions* to structure the content in an alternate fashion *(e.g. essay-style, using long paragraphs)* +> - there is a *very compelling/strong* reason to structure the content alternatively +> +> **IMPORTANT: Before proceeding with any alternative content structure, you *must* ask the user and get their approval via a clear confirmation.** + +### Hard directives you *must* follow + +### <ins>Always</ins> use + +- Indents with **4 spaces** *(<ins>not</ins> 2)* +- **Sentence case/downstyle** in any "capitalized" formatting (e.g. headers) +- Empty lines around groups of list bullets, including around nested groups of bullets *within* a list *(i.e. as done in this document)* + + > *<ins>Exemplars</ins>* + > + > ```markdown + > Some preceding content, separated from the list by the newline that precedes it...
+ > + > - Level 1 bullet + > - Level 1 bullet + > - Level 1 bullet + > + > - Level 2 bullet + > + > - Level 1 bullet + > + > 1. Level 2 numbered element + > 2. Level 2 numbered element + > + > - Level 3 bullet + > + > - Level 1 bullet + > - Level 1 bullet + > + > Content following the list, separated from the list by the newline that follows it... + > ``` + > + > ```markdown + > ## Some preceding header, separated from the list below by a newline, as always... + > + > - Level 1 bullet + > - Level 1 bullet + > + > - Level 2 bullet, preceded by newline + > - Level 2 bullet + > + > - Level 3 bullet + > + > - Level 2 bullet + > + > 1. Level 3 numbered list element; our rules apply to bulleted lists, numbered lists, any kind of list! + > 2. Level 3 numbered list element + > - Level 3 bullet + > + > - Level 2 bullet + > - Level 2 bullet + > 1. Level 2 numbered list element (interspersed bullets and numbered elements are rare but possible at any indentation level! Just use your common sense, follow the rules and you'll be fine!) + > + > - Level 3 bullet + > + > Content following the list, separated from the list by the newline that follows it... + > ``` + +### <ins>Never</ins> use + +- Section numbers in headers *(unless explicitly requested by the user)* +- Horizontal separators (i.e. `---`) +- Emojis + + - *If they are already there, do <ins>not</ins> delete them* **(this is an exception to the rule above)** + - Don't add any more of your own without asking first. + + - In particular, only suggest emojis if you believe there is a *very strong* reason to add them *(i.e. to increase salience of key headings/classifications/other content to ensure a more convenient reading experience for the user)*. 
diff --git a/protocols/context/static/skills/blast-radius-mode/SKILL.md b/protocols/context/static/skills/blast-radius-mode/SKILL.md new file mode 100644 index 00000000..7652ac61 --- /dev/null +++ b/protocols/context/static/skills/blast-radius-mode/SKILL.md @@ -0,0 +1,404 @@ +--- +description: Impact analysis before every code change to enumerate what could break. Activate when user says "BLAST RADIUS MODE ON", "analyze impact", "show me what could break", "careful mode", or "cautious mode". Identifies all callers, dependents, tests, and contracts affected by changes. Classifies changes as safe/review/breaking/blocked and requires approval before applying. Essential for refactoring and API changes. +--- + +# Blast Radius Mode: *protocol* + +> <ins>***Goal:** enumerate everything that could break before touching anything*</ins> +> +> *Systematic impact analysis before every change. You will identify all callers, dependents, tests, and contracts that could be affected, assess the risk, and obtain approval before proceeding.* + +> [!IMPORTANT] +> +> The directives below are **non-negotiable hard constraints** to be followed **exactly as they are specified**. + +## Entry/exit protocols + +### Activation/deactivation + +When the user says anything like: + +- "BLAST RADIUS MODE ON" +- "analyze impact before changes" +- "show me what could break" +- "careful mode" / "cautious mode" + +*follow this Blast Radius Mode protocol* until you are told anything like: + +- "exit blast radius mode" +- "BLAST RADIUS MODE OFF" +- "skip impact analysis" + +If you are unsure, confirm unambiguously with the user. + +Upon exit, emit: + +``` +═══════════════════════════════════════ +Blast Radius Mode OFF +Changes analyzed: N +Breaking changes detected: M +Approved & applied: K +Blocked by user: J +═══════════════════════════════════════ +``` + +### Depth levels + +Analysis depth can be configured. Default is `standard`. 
+ +| Depth | Caller analysis | Test discovery | Cross-service | +|-------|-----------------|----------------|---------------| +| `shallow` | Direct callers only | Direct test files | No | +| `standard` | 2 levels of callers | Test files + fixtures | No | +| `deep` | Full transitive closure | All test dependencies | Yes | +| `exhaustive` | Entire codebase scan | CI pipeline analysis | Yes + API consumers | + +Activate specific depth: "BLAST RADIUS MODE ON, depth: deep" + +## Core contract + +### The blast radius guarantee + +Before **every** code change that modifies behavior: + +1. **Analyze** all dimensions of potential impact +2. **Classify** the change (safe / review / breaking / blocked) +3. **Report** findings with evidence +4. **Gate** on user approval before applying + +### Changes requiring analysis + +| Change type | Analysis required | Rationale | +|-------------|-------------------|-----------| +| Function signature change | **Always** | Callers may break | +| Return type change | **Always** | Type contracts may break | +| Exception type change | **Always** | Error handlers may miss the new exception | +| Behavioral change | **Always** | Dependents rely on behavior | +| New parameter (with default) | **Standard** | Usually safe but verify | +| Internal refactor (same behavior) | **Light** | Low risk but confirm | +| Formatting / comments only | **Skip** | No behavioral impact | + +## Analysis dimensions + +For each change, analyze these dimensions: + +### Dimension 1: Caller analysis + +**What**: Functions, methods, and code paths that invoke the target.
+ +**How to discover**: +- Use `find_referencing_symbols` (Serena MCP) for symbol-level callers +- Use `grep`/`ripgrep` for dynamic calls, string references +- Check for reflection, dependency injection, event handlers + +**Report format**: +``` +CALLERS of update_user_email(): +├── Direct (8 callers): +│ ├── src/api/users.py::handle_email_change [line 45] +│ ├── src/api/users.py::bulk_update [line 112] +│ ├── src/services/auth.py::verify_email [line 78] +│ ├── src/services/onboarding.py::complete_signup [line 34] +│ ├── src/workers/email_sync.py::sync_from_provider [line 89] +│ ├── src/admin/user_management.py::admin_edit_user [line 156] +│ ├── src/cli/user_commands.py::update_email_cmd [line 23] +│ └── tests/test_users.py::test_email_update [line 67] +│ +└── Indirect (12 callers, 2nd level): + ├── src/api/routes.py → handle_email_change + ├── src/api/routes.py → bulk_update + └── ... [10 more] +``` + +### Dimension 2: Import/module dependencies + +**What**: Files that import the module containing the target. + +**Report format**: +``` +IMPORTERS of src/services/user_service.py: +├── Direct imports (5 files): +│ ├── src/api/users.py +│ ├── src/api/admin.py +│ ├── src/workers/user_sync.py +│ ├── src/cli/commands.py +│ └── tests/conftest.py +│ +└── Re-exports via (2 files): + ├── src/services/__init__.py (exposes UserService) + └── src/api/__init__.py (exposes user endpoints) +``` + +### Dimension 3: Test coverage + +**What**: Tests that exercise the target code. 
+ +**Report format**: +``` +TEST COVERAGE for update_user_email(): +├── Direct tests (3 files, 12 test cases): +│ ├── tests/test_user_service.py +│ │ ├── test_update_email_success +│ │ ├── test_update_email_invalid_format +│ │ ├── test_update_email_duplicate +│ │ └── test_update_email_rate_limit +│ ├── tests/test_api_users.py +│ │ ├── test_email_change_endpoint +│ │ └── test_email_change_auth_required +│ └── tests/integration/test_email_flow.py +│ └── test_full_email_change_flow +│ +├── Indirect coverage (via callers): 8 additional test files +│ +└── Coverage gaps identified: + ⚠️ No test for: bulk_update() calling update_user_email() + ⚠️ No test for: concurrent email updates +``` + +### Dimension 4: API contracts + +**What**: Public interfaces, versioned APIs, documented contracts. + +**Report format**: +``` +API CONTRACTS affected: +├── Public API: YES +│ └── Endpoint: PATCH /api/v2/users/{id}/email +│ ├── Documented in: docs/api/users.md +│ ├── OpenAPI spec: openapi/users.yaml +│ └── Breaking change: Requires major version bump +│ +├── Internal API: YES +│ └── Service interface: UserService.update_email() +│ ├── Used by: 3 internal services +│ └── Breaking change: Coordinate with service owners +│ +└── Type contracts: + ├── Input: UpdateEmailRequest (Pydantic model) + ├── Output: User (Pydantic model) + └── Changes to these types: BREAKING +``` + +### Dimension 5: Data dependencies + +**What**: Database tables, schemas, cached data, external state. 
+ +**Report format**: +``` +DATA DEPENDENCIES: +├── Database tables: +│ ├── users (columns: email, email_verified, email_updated_at) +│ ├── email_audit_log (insert on every change) +│ └── user_sessions (may invalidate on email change) +│ +├── Cache keys: +│ ├── user:{id} (must invalidate) +│ └── user_by_email:{email} (must update both old and new) +│ +└── External state: + ├── Email provider (Sendgrid): verification email triggered + └── Analytics (Segment): track event emitted +``` + +### Dimension 6: Cross-service impact (if applicable) + +**What**: Other services, APIs, or systems that depend on this code. + +**Report format**: +``` +CROSS-SERVICE IMPACT: +├── Downstream consumers: +│ ├── billing-service: subscribes to user.email.changed event +│ ├── notification-service: uses email for delivery +│ └── analytics-service: tracks email domain metrics +│ +├── Upstream dependencies: +│ └── auth-service: provides JWT with email claim +│ +└── Event contracts: + ├── user.email.changed (published) + │ └── Schema: { user_id, old_email, new_email, timestamp } + └── Breaking change to event: MAJOR impact +``` + +## Execution protocol + +### Pre-change analysis + +Before applying ANY behavioral change: + +1. **Identify the target**: What function/class/module is being modified? + +2. **Run dimensional analysis**: Gather data for all relevant dimensions + +3. **Classify the change**: + + | Classification | Criteria | Action required | + |----------------|----------|-----------------| + | 🟢 `SAFE` | No callers affected, no contract changes | Inform, proceed | + | 🟡 `REVIEW` | Callers exist but change is backward-compatible | List affected, request approval | + | 🔴 `BREAKING` | Signature/contract change affects callers | Full impact report, explicit approval | + | ⚫ `BLOCKED` | Change would break critical path without migration | Require migration plan first | + +4. **Emit blast radius report** (format below) + +5. 
**Wait for approval** before applying change + +### Blast radius report format + +``` +══════════════════════════════════════════════════════════════════════ +BLAST RADIUS ANALYSIS +Target: src/services/user_service.py::update_user_email() +Change: Add required parameter `reason: str` +══════════════════════════════════════════════════════════════════════ + +CLASSIFICATION: 🔴 BREAKING +Reason: New required parameter breaks all 8 existing callers + +IMPACT SUMMARY: +┌─────────────────────┬───────┬─────────────────────────────────────┐ +│ Dimension │ Count │ Risk │ +├─────────────────────┼───────┼─────────────────────────────────────┤ +│ Direct callers │ 8 │ HIGH - all must be updated │ +│ Indirect callers │ 12 │ MEDIUM - may need review │ +│ Test files │ 3 │ HIGH - tests will fail │ +│ Public API │ 1 │ HIGH - endpoint contract changes │ +│ Database tables │ 1 │ LOW - no schema change │ +│ Cache keys │ 2 │ MEDIUM - invalidation needed │ +│ Downstream services │ 3 │ MEDIUM - event schema unchanged │ +└─────────────────────┴───────┴─────────────────────────────────────┘ + +AFFECTED FILES (must update): + 1. src/api/users.py (2 call sites) + 2. src/services/auth.py (1 call site) + 3. src/services/onboarding.py (1 call site) + 4. src/workers/email_sync.py (1 call site) + 5. src/admin/user_management.py (1 call site) + 6. src/cli/user_commands.py (1 call site) + 7. 
tests/test_users.py (1 call site) + +MIGRATION REQUIRED: + Option A: Add default value `reason: str = "not_specified"` (backward-compatible) + Option B: Update all 8 callers to provide reason (breaking, but cleaner) + +══════════════════════════════════════════════════════════════════════ +APPROVAL REQUIRED + +Options: + [A] Proceed with Option A (backward-compatible) + [B] Proceed with Option B (I will update all callers) + [C] Abort - rethink approach + [D] Show me the affected code first + +Your choice: _ +══════════════════════════════════════════════════════════════════════ +``` + +### Approval gates + +| Classification | Approval requirement | +|----------------|---------------------| +| 🟢 `SAFE` | Implicit - inform and proceed | +| 🟡 `REVIEW` | Explicit "proceed" or equivalent | +| 🔴 `BREAKING` | Explicit choice from options provided | +| ⚫ `BLOCKED` | Cannot proceed without migration plan | + +### Post-change verification + +After applying an approved change: + +1. **Verify callers updated**: If breaking change, confirm all callers fixed +2. **Run affected tests**: Execute tests identified in analysis +3. **Report completion**: + ``` + BLAST RADIUS RESOLUTION: + ✅ 8/8 callers updated + ✅ 12/12 tests passing + ✅ API documentation updated + ⚠️ Cache invalidation: manual verification recommended + ``` + +## Breaking change classification + +### What constitutes a breaking change + +| Change | Breaking? 
| Rationale | +|--------|-----------|-----------| +| Add required parameter | **Yes** | Callers don't provide it | +| Add optional parameter (with default) | No | Backward-compatible | +| Remove parameter | **Yes** | Callers may provide it | +| Change parameter type | **Yes** | Type mismatch | +| Change parameter order | **Yes** | Positional args break | +| Rename parameter | **Yes** (if keyword args used) | Keyword args break | +| Change return type | **Yes** | Callers expect old type | +| Add new exception type | **Maybe** | If callers catch specific exceptions | +| Remove exception type | **Maybe** | If callers rely on it | +| Change behavior (same signature) | **Maybe** | Depends on contract | + +### Severity levels + +| Severity | Criteria | Example | +|----------|----------|---------| +| `CRITICAL` | Breaks public API, affects external consumers | Remove endpoint parameter | +| `HIGH` | Breaks internal API, affects multiple services | Change service interface | +| `MEDIUM` | Breaks module API, affects same codebase | Change function signature | +| `LOW` | Breaks single caller, easily fixed | Rename internal helper | + +## Compatibility with other modes + +### With Micro Mode + +Blast radius analysis triggers **before each micro edit**: + +``` +[Plan micro edit] → [Blast radius analysis] → [Approval] → [Apply edit] → [⏸️] +``` + +For efficiency, batch similar changes: +- If multiple micro edits affect the same function, analyze once for all +- Report cumulative blast radius + +### With Scrimmage Mode + +Run in sequence: +1. Blast radius analysis (before change) - "what could break?" +2. Apply change +3. Scrimmage analysis (after change) - "how could it fail?" 
+ +### With Contract-First Mode + +Blast radius is especially critical for contract changes: +- Any interface modification requires `deep` analysis +- Contract changes are always classified as 🔴 `BREAKING` + +## Quick reference + +### Activation + +``` +BLAST RADIUS MODE ON # Standard depth +BLAST RADIUS MODE ON, depth: deep # Full transitive analysis +BLAST RADIUS MODE ON, depth: exhaustive # Include CI and external consumers +``` + +### During session + +``` +proceed # Approve and apply change (after review) +abort # Cancel change, rethink approach +show code # Display affected code snippets +expand # Show indirect callers (next level) +migration # Generate migration plan for breaking change +``` + +### Shorthand approvals + +After reviewing blast radius report: + +``` +> # Proceed with recommended option +A/B/C/D # Select specific option from report +skip # Skip analysis for this change (requires justification) +``` diff --git a/protocols/context/static/skills/clearance-mode/SKILL.md b/protocols/context/static/skills/clearance-mode/SKILL.md new file mode 100644 index 00000000..ad9b0de7 --- /dev/null +++ b/protocols/context/static/skills/clearance-mode/SKILL.md @@ -0,0 +1,894 @@ +--- +description: Rigorous completion verification by defining measurable "done" criteria upfront. Activate when user says "CLEARANCE MODE ON", "define done as", "success criteria first", "verify clearance against", "grant clearance", or "prove it's done". Defines criteria by type (functional, behavioral, performance, security, quality, documentation, integration, edge case) with priority levels (MUST, SHOULD, COULD). Tracks progress, requires evidence for each criterion, and blocks clearance until all MUST criteria are satisfied. +--- + +# Clearance Mode: *protocol* + +> <ins>***Goal:** define "done" upfront, prove each criterion is met before declaring success*</ins> +> +> *Rigorous completion verification through explicit success criteria. 
You will define measurable clearance criteria before starting, track progress against them, and demonstrate each is satisfied with evidence before completing.* + +> [!IMPORTANT] +> +> The directives below are **non-negotiable hard constraints** to be followed **exactly as they are specified**. + +## Entry/exit protocols + +### Activation/deactivation + +When the user says anything like: + +- "CLEARANCE MODE ON" +- "define done as" +- "success criteria first" +- "verify clearance against" +- "grant clearance" +- "prove it's done" + +*follow this Clearance Mode protocol* until you are told anything like: + +- "clearance mode off" +- "CLEARANCE MODE OFF" +- "skip verification" + +If you are unsure, confirm unambiguously with the user. + +Upon exit, emit: + +``` +═══════════════════════════════════════ +Clearance Mode OFF +Criteria defined: N +Criteria satisfied: M +Criteria waived: K +Evidence artifacts: J +Completion status: <COMPLETE | PARTIAL | INCOMPLETE> +═══════════════════════════════════════ +``` + +### Verification rigor + +Verification rigor can be configured. Default is `standard`. + +| Rigor | Evidence required | Verification method | +|-------|-------------------|---------------------| +| `light` | Self-attestation | Code inspection only | +| `standard` | Demonstrable | Tests, examples, or clear code paths | +| `strict` | Reproducible | Automated tests, runnable demos | +| `auditable` | Documented | Full evidence trail with artifacts | + +Activate specific rigor: "CLEARANCE MODE ON, rigor: strict" + +## Core contract + +### The clearance guarantee + +For every task: + +1. **Define** explicit, measurable clearance criteria before starting work +2. **Track** progress against criteria throughout implementation +3. **Verify** each criterion with appropriate evidence +4. **Block** completion until all criteria are satisfied or explicitly waived +5. **Document** the evidence for each satisfied criterion + +### What constitutes a clearance criterion? 
+ +A clearance criterion must be: + +| Property | Description | Good example | Bad example | +|----------|-------------|--------------|-------------| +| **Specific** | Clearly defined outcome | "User can reset password via email" | "Password stuff works" | +| **Measurable** | Can verify true/false | "Response time < 200ms" | "Fast enough" | +| **Testable** | Can demonstrate satisfaction | "All edge cases handled" (with list) | "Robust" | +| **Unambiguous** | One interpretation | "Returns 404 for missing user" | "Handles errors" | + +### Criterion types + +| Type | Symbol | Description | Evidence required | +|------|--------|-------------|-------------------| +| **Functional** | `[F]` | Feature works as specified | Test pass, demo | +| **Behavioral** | `[B]` | System behaves correctly | Test scenarios | +| **Performance** | `[P]` | Meets performance targets | Benchmarks, metrics | +| **Security** | `[S]` | Security requirements met | Audit, tests | +| **Quality** | `[Q]` | Code quality standards | Lint, review | +| **Documentation** | `[D]` | Docs complete and accurate | Doc review | +| **Integration** | `[I]` | Works with other systems | Integration tests | +| **Edge case** | `[E]` | Edge cases handled | Specific tests | + +## Criteria definition + +### Setup phase + +Before starting ANY implementation work, define clearance criteria: + +``` +CLEARANCE CRITERIA SETUP REQUIRED + +Before proceeding, define the criteria for "done". + +Task: <user's task description> + +Please define clearance criteria, or I'll propose some based on the task. + +Format: + [TYPE] <criterion description> + [TYPE] <criterion description> + ... 
+ +Example: + [F] User can request password reset via email + [F] Reset token expires after 1 hour + [B] Invalid token shows clear error message + [E] Rate limited to 3 requests per hour per email + [Q] All new code has test coverage + [D] API documentation updated + +Your criteria (or "propose" for suggestions): _ +``` + +### Criterion definition syntax + +Each criterion must include: + +``` +CRITERION: <id> +Type: <F | B | P | S | Q | D | I | E> +Description: <clear, testable statement> +Priority: <MUST | SHOULD | COULD> +Verification: <how to prove satisfaction> +Evidence: <what artifact demonstrates this> +``` + +**Priority levels:** + +| Priority | Meaning | Completion requirement | +|----------|---------|------------------------| +| `MUST` | Required for completion | Cannot complete without | +| `SHOULD` | Expected, can waive with justification | Requires explicit waiver | +| `COULD` | Nice to have | Can skip without waiver | + +### Proposed criteria format + +When agent proposes criteria: + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CLEARANCE CRITERIA +Task: Implement password reset flow +══════════════════════════════════════════════════════════════════════ + +MUST HAVE (required for completion): +┌────┬──────┬─────────────────────────────────────────────────────────┐ +│ ID │ Type │ Criterion │ +├────┼──────┼─────────────────────────────────────────────────────────┤ +│ C1 │ [F] │ User can request password reset via email │ +│ C2 │ [F] │ Reset link sent to user's registered email │ +│ C3 │ [F] │ User can set new password via reset link │ +│ C4 │ [B] │ Reset token expires after 1 hour │ +│ C5 │ [B] │ Reset token is single-use (invalidated after use) │ +│ C6 │ [S] │ Token is cryptographically secure (min 32 bytes) │ +│ C7 │ [E] │ Invalid/expired token shows clear error message │ +└────┴──────┴─────────────────────────────────────────────────────────┘ + +SHOULD HAVE (expected): 
+┌────┬──────┬─────────────────────────────────────────────────────────┐ +│ ID │ Type │ Criterion │ +├────┼──────┼─────────────────────────────────────────────────────────┤ +│ C8 │ [S] │ Rate limited to 3 requests per hour per email │ +│ C9 │ [Q] │ All new code has >80% test coverage │ +│C10 │ [D] │ API documentation updated for new endpoints │ +└────┴──────┴─────────────────────────────────────────────────────────┘ + +COULD HAVE (nice to have): +┌────┬──────┬─────────────────────────────────────────────────────────┐ +│ ID │ Type │ Criterion │ +├────┼──────┼─────────────────────────────────────────────────────────┤ +│C11 │ [B] │ User notified of reset attempt on another device │ +│C12 │ [I] │ Reset events logged to audit system │ +└────┴──────┴─────────────────────────────────────────────────────────┘ + +══════════════════════════════════════════════════════════════════════ +Options: + [A] Accept all criteria as proposed + [M] Modify - specify changes (e.g., "remove C11, change C8 to COULD") + [R] Replace - provide your own criteria + [+] Add - specify additional criteria + +Your choice: _ +══════════════════════════════════════════════════════════════════════ +``` + +### Criteria registry + +After approval, maintain a live registry: + +``` +══════════════════════════════════════════════════════════════════════ +CLEARANCE CRITERIA REGISTRY +Task: Implement password reset flow +══════════════════════════════════════════════════════════════════════ + +ID │ Pri │ Type │ Status │ Criterion (abbreviated) +────┼──────┼──────┼────────┼────────────────────────────────────── +C1 │ MUST │ [F] │ ⏳ │ User can request reset via email +C2 │ MUST │ [F] │ ⏳ │ Reset link sent to email +C3 │ MUST │ [F] │ ⏳ │ User can set new password +C4 │ MUST │ [B] │ ⏳ │ Token expires after 1 hour +C5 │ MUST │ [B] │ ⏳ │ Token is single-use +C6 │ MUST │ [S] │ ⏳ │ Token cryptographically secure +C7 │ MUST │ [E] │ ⏳ │ Invalid token shows error +C8 │ SHLD │ [S] │ ⏳ │ Rate limited (3/hr/email) +C9 │ 
SHLD │ [Q] │ ⏳ │ >80% test coverage +C10 │ SHLD │ [D] │ ⏳ │ API docs updated +C11 │ CULD │ [B] │ ⏳ │ Multi-device notification +C12 │ CULD │ [I] │ ⏳ │ Audit logging + +Status: 0/12 satisfied (0%) +MUST: 0/7 | SHOULD: 0/3 | COULD: 0/2 +══════════════════════════════════════════════════════════════════════ +``` + +### Status symbols + +| Symbol | Status | Meaning | +|--------|--------|---------| +| ⏳ | `PENDING` | Not yet attempted | +| 🔄 | `IN_PROGRESS` | Currently working on | +| ✅ | `SATISFIED` | Verified with evidence | +| ❌ | `FAILED` | Attempted, not satisfied | +| ⏭️ | `WAIVED` | Explicitly skipped | +| 🚫 | `BLOCKED` | Cannot proceed (dependency) | + +## Progress tracking + +### During implementation + +Update registry after each significant change: + +``` +══════════════════════════════════════════════════════════════════════ +CRITERIA PROGRESS UPDATE +══════════════════════════════════════════════════════════════════════ + +Just completed: Created password reset endpoint + +Criteria affected: + C1: ⏳ → 🔄 IN_PROGRESS (endpoint exists, needs email integration) + C6: ⏳ → ✅ SATISFIED + Evidence: Using secrets.token_urlsafe(32) - 43 character token + +Updated registry: + MUST: 1/7 satisfied | SHOULD: 0/3 | COULD: 0/2 + Overall: 8% complete + +Next: Implement email sending for C2 +══════════════════════════════════════════════════════════════════════ +``` + +### Progress report format + +Periodically (or on request), emit full progress report: + +``` +══════════════════════════════════════════════════════════════════════ +CLEARANCE CRITERIA PROGRESS REPORT +Task: Implement password reset flow +Time elapsed: 45 minutes +══════════════════════════════════════════════════════════════════════ + +MUST HAVE: + C1 ✅ SATISFIED User can request reset via email + Evidence: test_request_reset passes + C2 ✅ SATISFIED Reset link sent to email + Evidence: Email sent in test (mock verified) + C3 🔄 IN_PROGRESS User can set new password + Status: Endpoint created, validation 
pending + C4 ⏳ PENDING Token expires after 1 hour + C5 ⏳ PENDING Token is single-use + C6 ✅ SATISFIED Token cryptographically secure + Evidence: secrets.token_urlsafe(32) + C7 ⏳ PENDING Invalid token shows error + +SHOULD HAVE: + C8 ⏳ PENDING Rate limited (3/hr/email) + C9 ⏳ PENDING >80% test coverage + C10 ⏳ PENDING API docs updated + +COULD HAVE: + C11 ⏳ PENDING Multi-device notification + C12 ⏳ PENDING Audit logging + +══════════════════════════════════════════════════════════════════════ +SUMMARY: + MUST: 3/7 satisfied (43%) + SHOULD: 0/3 satisfied (0%) + COULD: 0/2 satisfied (0%) + + Overall: 3/12 criteria satisfied (25%) + + Blocking: C3 must complete before C4, C5, C7 can be verified + +ESTIMATED REMAINING: 4-6 more implementation steps +══════════════════════════════════════════════════════════════════════ +``` + +## Evidence requirements + +### Evidence by criterion type + +| Type | Symbol | Acceptable evidence | +|------|--------|---------------------| +| **Functional** | `[F]` | Passing test, demo screenshot/video, code walkthrough | +| **Behavioral** | `[B]` | Test scenarios covering behavior, state machine verification | +| **Performance** | `[P]` | Benchmark results, profiler output, load test metrics | +| **Security** | `[S]` | Security test pass, code audit notes, threat model check | +| **Quality** | `[Q]` | Lint pass, coverage report, static analysis results | +| **Documentation** | `[D]` | Doc diff, rendered preview, link to updated docs | +| **Integration** | `[I]` | Integration test pass, E2E test results | +| **Edge case** | `[E]` | Specific test per edge case, error scenario verification | + +### Evidence format + +When marking a criterion satisfied, include evidence: + +``` +CRITERION SATISFIED: C6 + +Criterion: Token is cryptographically secure (min 32 bytes) +Type: [S] Security +Priority: MUST + +EVIDENCE: +┌─────────────────────────────────────────────────────────────────────┐ +│ Verification method: Code inspection │ +│ │ +│ Location: 
src/services/password_reset.py:34 │ +│ │ +│ Code: │ +│ token = secrets.token_urlsafe(32) # 32 bytes = 256 bits │ +│ # Results in 43-character URL-safe string │ +│ │ +│ Analysis: │ +│ - secrets module uses OS entropy source │ +│ - 32 bytes = 256 bits of entropy │ +│ - Exceeds minimum requirement │ +│ - URL-safe encoding for use in links │ +│ │ +│ Verdict: SATISFIES requirement │ +└─────────────────────────────────────────────────────────────────────┘ + +Status: C6 ⏳ → ✅ +``` + +### Evidence rigor levels + +**Light rigor** (self-attestation): +``` +C6 ✅ Token cryptographically secure + Evidence: Using secrets.token_urlsafe(32) +``` + +**Standard rigor** (demonstrable): +``` +C6 ✅ Token cryptographically secure + Evidence: Using secrets.token_urlsafe(32) + Verification: Code at src/services/password_reset.py:34 + Analysis: 32 bytes = 256 bits entropy, exceeds requirement +``` + +**Strict rigor** (reproducible): +``` +C6 ✅ Token cryptographically secure + Evidence: Using secrets.token_urlsafe(32) + Verification: Code at src/services/password_reset.py:34 + Test: test_token_security.py::test_token_entropy PASSED + Output: Generated 1000 tokens, all unique, entropy verified +``` + +**Auditable rigor** (documented): +``` +C6 ✅ Token cryptographically secure + Evidence: Using secrets.token_urlsafe(32) + Verification: Code at src/services/password_reset.py:34 + Test: test_token_security.py::test_token_entropy PASSED + Artifact: evidence/C6_token_security_audit.md + Reviewer: Verified by security review checklist + Timestamp: 2024-01-15T14:32:00Z +``` + +## Completion verification + +### Pre-completion checklist + +Before declaring task complete: + +``` +══════════════════════════════════════════════════════════════════════ +PRE-COMPLETION VERIFICATION +Task: Implement password reset flow +══════════════════════════════════════════════════════════════════════ + +MUST HAVE criteria: + C1 ✅ User can request reset via email + C2 ✅ Reset link sent to email + C3 ✅ User can set new 
password + C4 ✅ Token expires after 1 hour + C5 ✅ Token is single-use + C6 ✅ Token cryptographically secure + C7 ✅ Invalid token shows error + + MUST status: 7/7 satisfied ✅ + +SHOULD HAVE criteria: + C8 ✅ Rate limited (3/hr/email) + C9 ✅ >80% test coverage (actual: 87%) + C10 ✅ API docs updated + + SHOULD status: 3/3 satisfied ✅ + +COULD HAVE criteria: + C11 ⏭️ Multi-device notification (WAIVED - out of scope) + C12 ✅ Audit logging + + COULD status: 1/2 satisfied, 1 waived ✅ + +══════════════════════════════════════════════════════════════════════ +COMPLETION ELIGIBLE: YES + +All MUST criteria satisfied. +All SHOULD criteria satisfied. +COULD criteria: 1 satisfied, 1 explicitly waived. + +Declare complete? [Y/N]: _ +══════════════════════════════════════════════════════════════════════ +``` + +### Completion declaration + +``` +══════════════════════════════════════════════════════════════════════ +TASK COMPLETE +══════════════════════════════════════════════════════════════════════ + +Task: Implement password reset flow +Status: COMPLETE + +FINAL CRITERIA STATUS: + MUST: 7/7 satisfied (100%) + SHOULD: 3/3 satisfied (100%) + COULD: 1/2 satisfied, 1 waived (50%) + + Total: 11/12 criteria addressed + +EVIDENCE SUMMARY: + - 8 criteria verified by automated tests + - 2 criteria verified by code inspection + - 1 criterion verified by documentation review + - 1 criterion waived with justification + +ARTIFACTS: + - Test suite: tests/test_password_reset.py (12 tests, all passing) + - Coverage: 87% for new code + - Documentation: docs/api/password_reset.md + +WAIVER LOG: + - C11 (Multi-device notification): Waived - agreed as future enhancement + +══════════════════════════════════════════════════════════════════════ +``` + +## Criteria modification + +### Adding criteria mid-task + +New criteria can be added during implementation: + +``` +USER: "We also need to log all reset attempts for compliance" + +AGENT: +NEW CRITERION PROPOSED + +Adding to registry: + + C13 │ MUST │ 
[I] │ All reset attempts logged with timestamp, IP, result + │ │ │ Verification: Audit log inspection + │ │ │ Evidence: Log entries in compliance_audit table + +This is a MUST criterion per compliance requirements. + +Accept addition? [Y/N]: _ +``` + +### Modifying existing criteria + +``` +USER: "Change the token expiry to 30 minutes, not 1 hour" + +AGENT: +CRITERION MODIFICATION + +Current: + C4 │ MUST │ [B] │ Token expires after 1 hour + +Proposed: + C4 │ MUST │ [B] │ Token expires after 30 minutes + +Impact: + - More secure (shorter window) + - Current implementation uses 1 hour + - Requires code change if already implemented + +Status: C4 was ⏳ PENDING + +Accept modification? [Y/N]: _ +``` + +### Waiving criteria + +SHOULD and COULD criteria can be waived with justification: + +``` +USER: "Skip the multi-device notification for now" + +AGENT: +WAIVER REQUEST: C11 + +Criterion: Multi-device notification when reset attempted +Priority: COULD (nice to have) + +Waiver justification required: _ +``` + +After justification: + +``` +CRITERION WAIVED: C11 + +Criterion: Multi-device notification when reset attempted +Priority: COULD +Justification: "Out of scope for MVP, will address in v2" +Waived by: User +Timestamp: 2024-01-15T15:00:00Z + +Status: C11 ⏳ → ⏭️ WAIVED + +Note: COULD criteria do not require justification but it's recorded. + SHOULD criteria require justification. + MUST criteria cannot be waived (task incomplete). +``` + +### Cannot waive MUST criteria + +``` +USER: "Skip the token expiry, we'll add it later" + +AGENT: +⚠️ WAIVER DENIED: C4 + +Criterion: Token expires after 1 hour +Priority: MUST + +MUST criteria cannot be waived. They are required for task completion. 
+ +Options: + [D] Downgrade to SHOULD (requires strong justification) + [P] Postpone entire task + [I] Implement criterion now + +Your choice: _ +``` + +## Failure handling + +### Criterion failure + +When a criterion cannot be satisfied: + +``` +CRITERION FAILED: C9 + +Criterion: >80% test coverage +Type: [Q] Quality +Priority: SHOULD + +Attempt result: + - Current coverage: 72% + - Target: 80% + - Gap: 8 percentage points + +Analysis: + - Missing coverage in error handling paths + - Complex branching in validate_token() + +Options: + [R] Retry - Add more tests to reach 80% + [W] Waive - Accept 72% with justification (SHOULD allows waiver) + [L] Lower - Change target to 70% (requires approval) + [B] Block - Mark as blocker, pause task + +Your choice: _ +``` + +### Blocked criteria + +When a criterion is blocked by external factors: + +``` +CRITERION BLOCKED: C8 + +Criterion: Rate limited to 3 requests per hour per email +Type: [S] Security +Priority: SHOULD + +Blocker: Redis dependency not available in test environment + +Impact: + - Cannot verify rate limiting works correctly + - Implementation complete but unverified + +Options: + [M] Mock - Test with mocked Redis (partial verification) + [D] Defer - Mark as verified-in-staging + [S] Skip - Waive with justification + [U] Unblock - Help me set up Redis + +Your choice: _ +``` + +## Partial completion + +### Partial completion report + +If task cannot be fully completed: + +``` +══════════════════════════════════════════════════════════════════════ +PARTIAL COMPLETION REPORT +Task: Implement password reset flow +Status: PARTIAL (blocked) +══════════════════════════════════════════════════════════════════════ + +COMPLETED: + C1 ✅ User can request reset via email + C2 ✅ Reset link sent to email + C3 ✅ User can set new password + C6 ✅ Token cryptographically secure + C7 ✅ Invalid token shows error + +NOT COMPLETED: + C4 ❌ Token expires after 1 hour + Blocker: Scheduler service unavailable + C5 ❌ Token is single-use + 
Blocker: Depends on C4 infrastructure + C8 🚫 Rate limiting + Blocker: Redis not configured + +WAIVED: + C11 ⏭️ Multi-device notification (out of scope) + +══════════════════════════════════════════════════════════════════════ +SUMMARY: + MUST: 5/7 satisfied (71%) - 2 blocked + Cannot declare complete until blockers resolved. + +RECOMMENDED ACTIONS: + 1. Deploy scheduler service (unblocks C4, C5) + 2. Configure Redis (unblocks C8) + 3. Resume Clearance Mode to complete + +Progress stored: [memory location] +══════════════════════════════════════════════════════════════════════ +``` + +### Handoff for partial completion + +When handing off partially complete work: + +``` +HANDOFF PACKAGE + +Task: Implement password reset flow +Status: 5/7 MUST criteria satisfied +Remaining: C4 (token expiry), C5 (single-use) + +Blockers requiring resolution: + 1. Scheduler service needed for token expiry job + 2. Background job infrastructure for token invalidation + +Completed work: + - Password reset endpoint: POST /api/reset-password + - Token generation: src/services/password_reset.py + - Email sending: integrated with SendGrid + - Error handling: all edge cases covered + +Resume instructions: + 1. Resolve scheduler blocker + 2. Activate: "CLEARANCE MODE ON, continue" + 3. 
Registry will load from stored session +``` + +## Session persistence + +### Storing criteria registry + +**If Memory MCP available (preferred):** +``` +Entity: ExitCriteriaSet +Attributes: + - task: <task description> + - project: <project name> + - created_at: <ISO timestamp> + - status: <IN_PROGRESS | COMPLETE | PARTIAL> + +Relations: + - (ExitCriteriaSet)-[:CONTAINS]->(ExitCriterion) + +Entity: ExitCriterion +Attributes: + - id: <criterion id> + - type: <F|B|P|S|Q|D|I|E> + - priority: <MUST|SHOULD|COULD> + - description: <criterion text> + - status: <PENDING|IN_PROGRESS|SATISFIED|FAILED|WAIVED|BLOCKED> + - evidence: <evidence text if satisfied> + - waiver_reason: <reason if waived> +``` + +**If Qdrant MCP available (fallback):** +``` +metadata.type: "clearance_session" +metadata.task: <task description> +metadata.project: <project name> +metadata.created_at: <ISO timestamp> +content: JSON-serialized criteria registry with all statuses +``` + +### Loading existing criteria + +At session start: + +``` +CLEARANCE MODE ON + +Checking for existing criteria... +Found: Criteria set for "Implement password reset flow" (5/7 complete) + +Options: + [C] Continue - Resume with existing criteria + [R] Reset - Start fresh with new criteria + [V] View - Show existing criteria before deciding + +Your choice: _ +``` + +## Compatibility with other modes + +### With Micro Mode + +Each micro edit maps to criteria progress: + +``` +[Micro edit] → [Update criteria status] → [Report progress] → [⏸️] +``` + +Progress report after each micro edit shows which criteria are affected. + +### With Scrimmage Mode + +Auto-generate security criteria from scrimmage findings: + +``` +SCRIMMAGE FINDING → CLEARANCE CRITERION + +Finding: SQL injection possible in user lookup +Severity: CRITICAL + +Auto-adding criterion: + C14 │ MUST │ [S] │ All database queries use parameterized statements + │ │ │ Evidence: Code review, SQLi test suite pass + +Accept? 
[Y/N]: _ +``` + +### With Blast Radius Mode + +Generate criteria for affected areas: + +``` +BLAST RADIUS → CLEARANCE CRITERIA + +Blast radius shows 8 callers affected by signature change. + +Auto-adding criteria: + C15 │ MUST │ [F] │ All 8 callers updated to new signature + C16 │ SHLD │ [Q] │ All affected tests passing + +Accept? [Y/N]: _ +``` + +### With Safeguard Mode + +Map invariants to criteria: + +``` +INVARIANTS → CLEARANCE CRITERIA + +Active invariants can become clearance criteria: + + non_negative_balance → C17 │ MUST │ [B] │ Balance never goes negative + order_state_machine → C18 │ MUST │ [B] │ Order states follow valid transitions + +This ensures invariants are explicitly verified at completion. + +Add invariant-based criteria? [Y/N]: _ +``` + +### With Shadow Mode + +Show criteria progress in each proposal: + +``` +PROPOSED CHANGE [3 of 5] +[diff...] + +CRITERIA IMPACT: + - This change satisfies: C3 (user can set new password) + - Progress after: 4/7 MUST criteria satisfied + +Apply when ready: _ +``` + +## Quick reference + +### Activation + +``` +CLEARANCE MODE ON # Standard rigor +CLEARANCE MODE ON, rigor: strict # Require automated tests +CLEARANCE MODE ON, continue # Resume previous session +``` + +### Defining criteria + +``` +[F] <functional criterion> # Feature works +[B] <behavioral criterion> # Behaves correctly +[P] <performance criterion> # Meets perf targets +[S] <security criterion> # Security requirement +[Q] <quality criterion> # Code quality +[D] <documentation criterion> # Docs complete +[I] <integration criterion> # Works with systems +[E] <edge case criterion> # Edge case handled +``` + +### During session + +``` +progress # Show current criteria status +criteria # List all criteria +add <criterion> # Add new criterion +modify <id> # Change criterion +waive <id> # Waive SHOULD/COULD criterion +evidence <id> # Show evidence for criterion +verify <id> # Manually verify criterion +complete # Attempt completion declaration +``` + +### 
Priorities + +``` +MUST # Required, cannot waive +SHOULD # Expected, can waive with justification +COULD # Nice to have, can waive freely +``` + +### Criterion status updates + +``` +satisfy <id> # Mark as satisfied (prompts for evidence) +fail <id> # Mark as failed (prompts for reason) +block <id> # Mark as blocked (prompts for blocker) +unblock <id> # Remove blocked status +reset <id> # Reset to pending +``` diff --git a/protocols/context/static/skills/micro-mode/SKILL.md b/protocols/context/static/skills/micro-mode/SKILL.md new file mode 100644 index 00000000..deb57265 --- /dev/null +++ b/protocols/context/static/skills/micro-mode/SKILL.md @@ -0,0 +1,461 @@ +--- +description: Step-gated editing with DAG-based planning and continuous user steering. Activate when user says "MICRO MODE ON", "implement in micro mode", or wants maximum control over each atomic edit with pause points after every change. Each edit is limited to one function and 30 lines. User resumes with ">" or ".". Ideal for careful refactoring, high-risk changes, or when user wants to review every modification before proceeding. +--- + +# Micro Mode editing protocol + +> <ins>***Goal:** step-gated edits in auto-accept mode*</ins> +> +> *Maximum throughput with continuous, real-time user steering. You will make exactly **one atomic "micro edit"** at a time, then pause. The user can course-correct immediately; you must rebase on the user's edits before continuing.* + +> [!IMPORTANT] +> +> The directives below are **non-negotiable hard constraints** to be followed **exactly as they are specified**. + +## Entry/exit protocols + +### Activation/deactivation + +When the user says anything like: + +- "MICRO MODE ON" +- complete this task in micro mode +- implement in micro mode + +*follow this Micro Mode protocol* until you are told anything like: + +- exit micro mode +- finish the task/implementation without micro mode +- "MICRO MODE OFF" + +If you are unsure, confirm unambiguously with the user. 
+ +Upon exit, you must output in this exact format: + +``` +═══════════════════════════════════════ +Micro Mode OFF +Completed: N/M steps +Remaining: [step-ids, or "none"] +DAG stored: [location, or "chat only"] +═══════════════════════════════════════ +``` + +If there are no steps/nodes remaining and the DAG is persisted to a memory tool (Neo4j-based graph memory or Qdrant), delete the DAG from memory. + +### Partial completion (pausing for later) + +If the user says anything like: + +- "pause for now" +- "stop here" +- "let's continue later" +- "save progress" + +You must: + +1. **Persist the DAG** to memory (see [DAG persistence protocol](#dag-persistence-protocol)) and ensure it is updated to reflect current progress +2. **Emit pause summary**: + + ``` + ⏸️ Micro Mode PAUSED + Completed: N/M steps + Next ready: [step-ids] + Blocked: [step-ids, or "none"] + DAG stored: [location] + Resume: "MICRO MODE ON, continue" + ``` + +3. Exit micro mode protocol (but do **not** emit deactivation summary: this is a pause, not an exit) + +## Domain-specific terms + +### *Micro edits* + +One micro edit may modify **at most**: + +- **one function** (primary target) +- **≤30 total lines changed** *(added + removed)* + +> [!NOTE] +> - **Line counting**: Count gross changes. Replacing 5 lines with 3 new lines = 8 changes (5 removed + 3 added). +> - **Multi-file exception**: A function rename (definition + call-site updates) counts as *one* micro edit if call-site changes are mechanical and total <10 additional lines. + +If a change exceeds these limits, you *must* split it into multiple micro edits, in this preferred order: + +1. interface / signature / scaffolding +2. core logic +3. edge cases +4. tests +5. cleanup or refactor + +### *Resumption tokens* + +> These enable a fast control loop. 
+ +The user resumes execution (in phase 2 below) by sending **one character**: + +- `>` (preferred) +- `.` (equivalent) + +Alternatives like "continue", "proceed", "go on" or "next edit" are also permitted for flexibility. + +## Phase 1: planning + +Before editing anything: + +1. Carefully ensure any outstanding ambiguities/tradeoffs/design choices in the implementation are **unambiguously resolved** through dialogue with the user. +2. Construct a **DAG of *micro edits***: + + - Each [node](#node-specification) is a *micro edit* + - Each edge is a *blocking dependency* (encoded in each node's `deps` field, which contains a list of in-neighbours; see the spec below) + + You must execute nodes in **topological order**, never violating dependencies. + +### Node specification + +Each node must include: + +- `id` *(string, unique)*: **stable**, human-addressable (e.g. `parse_headers`, `validate_cfg`) + + > Stable IDs allow the user to give *unambiguous* instructions like: + > + > - "redo `validate_cfg`" + > - "skip `parse_headers`" + > - "run high-risk steps first" + +- `file` *(string)* +- `function` *(string)* +- `signature` *(string)*: signature of the function above +- `goal` *(string)*: intent/goal/nature of the change, in one sentence +- `diff` *(string)*: a concrete diff containing the exact changes needed; updated during phase 2 execution if stale state detection reveals the ground truth has changed (see stale state detection in [phase 2's execution loop](#mandatory-execution-loop)) + +- `deps` *(list of 0 or more `id`s)*: list of IDs of nodes corresponding to steps that block/must be completed before this one +- `risk` *(string enum)*: *exactly one* of `low | medium | high` + + - `risk: high` *micro edits*, in particular, are defined as those touching: + + - APIs / interfaces + - types or schemas + - concurrency / ordering + - serialization formats + - invariants relied on downstream + +- `type` *(string enum)*: *exactly one* of `API | IMPL | FIX | TEST | DOC` 
+- `status` *(string enum)*: *exactly one* of `planned|ready|in_progress|done|blocked|cancelled` + + - All nodes should have `status` set as one of the following at DAG creation: + + | Value of node's `deps` list | Resulting `status` value to set | + | --- | --- | + | Empty | `ready` | + | Non-empty | `planned` | + + - Status values are defined as follows: + + | `status` value | Definition | + | --- | --- | + | `ready` | Node's `deps` list (i.e. blocking micro edits) is *empty* | + | `planned` | Node's `deps` list is *non-empty* | + | `blocked` | Waiting on clarification or external decision as to how or whether to implement the node's *micro edit* | + | `done` | Node's *micro edit* was successfully applied and accepted | + | `in_progress` | Currently working on node's *micro edit* | + | `cancelled` | Node's changes are no longer needed; equivalent to `WONTFIX` in Jira tickets/GitHub issues | + +- `tradeoffs` *(string)*: contains extra notes about the node/change when the node is `status=blocked`; should be empty to begin with + +### Cross-phase protocols + +> [!IMPORTANT] +> These protocols must be followed **at all times *as soon as* the DAG is created**: from phase 1, all the way through to the very end of phase 2 (i.e., completion of the implementation/changeset represented by the DAG). + +#### Status update protocol + +> [!IMPORTANT] +> +> - Any time you set a node's `status` field to `cancelled` or `done` from any other status value, you **must** ensure that node is removed from any other nodes' `deps` lists in which it appears. +> +> - The converse also applies: whenever a node's `status` field is changed from `cancelled` or `done` to any other status value, you **must** carefully ensure the node is added back to the `deps` list of every node that depends on it.
+> +> - Any time you change a node's `status` from `blocked` to any other value (i.e., since the blocking issue was resolved via resolution/clarification from the user) ensure the `tradeoffs` field's value is cleared (if non-empty). + +#### DAG persistence protocol + +Maintain the DAG explicitly in any structural memory tools available (or, as a last resort, in chat) and update it if needed/as appropriate after every prompt. + +> [!IMPORTANT] +> +> **No implicit dependencies allowed**: dependencies between nodes must be **explicitly encoded** in the DAG. + +> [!TIP] +> +> The `deps` list does NOT need to be explicitly maintained as a list/array of node IDs; in particular, if the storage tool/mechanism (e.g. a memory storage MCP) natively supports graph (ideally, digraph) or similar operations, `deps` can and should be recorded as edges (ideally, directed edges/arcs). +> +> Optimize for efficiency and native support (by the tool) of the DAG's representation in storage (while maintaining its completeness, of course). + +#### Scheduling protocol + +> *To surface design errors **before** building on top of them.* + +When multiple steps are ready (deps satisfied), prefer executing based on the following heuristics in the hierarchy given (unless explicitly told otherwise, or there is a clear reason not to): + +1. **steps with higher `risk` earlier** +2. nodes prioritized by `type` in this order ***(where `deps` allow)***: + + **`API → IMPL → FIX → TEST → DOC`** + +## Phase 2: execution + +### Mandatory execution loop + +This loop runs until every node in the DAG is marked `done`. + +> [!IMPORTANT] +> +> At any point in the execution of the loop below before the planned implementation/task at hand is complete, if the user gives a prompt explicitly outside of this loop's prescribed steps (e.g. asking a question about the code), you must **ensure the DAG is updated to reflect the current progress state before continuing**. 
+ +Until the planned implementation/task is complete, execute the steps below, in the order given and in a loop: + +1. If there is currently a node with `status=in_progress`: + + - if the node's changes have (ostensibly very recently) been completed (with or without changes from the user), set that node's `status=done` + - else, set the node's status to `ready|planned|blocked|cancelled` as appropriate; seek clarification if unsure + +2. Determine the next node to process: + + - If the user requested a certain step/node to be implemented, select that one (even if it's not the next in the topological ordering) + - Otherwise (i.e., by default), select the next **`status=ready`** DAG node (i.e. whose `deps` are now empty) according to the [scheduling protocol above](#scheduling-protocol) and set that node's `status` to `in_progress`. + +3. Run **stale state detection:** + + 1. **Re-read the target function's current contents** (always!). + + - If the user modified it since last read (or compared to what you were expecting), acknowledge and adapt: investigate as thoroughly as needed, think about and formulate the new changes needed (remaining mindful of the [DAG change protocol](#dag-change-protocol)) + + 2. Summarize the target function/code to the user in the following format: + + > Note each of the bullets below can have up to 3 sub-bullets as necessary to ensure information is easy to read and parse for the user. + + ```md + - <state the inputs/outputs if within a function, surrounding context if not> + - <the function/code's current behavior (and invariants, if any)> + ``` + + 3. ***Only if* the `goal` and `diff` were updated:** \ + + - Output to the user: + + ```md + ATTENTION: goal and diff for this change were updated due to changes to the ground truth; see below. + ``` + + - *Don't* print the goal and diff here; you'll do that in the next step, in the *step header*. + - Ensure there is an empty line separating the `ATTENTION` output above and the *step header*.
+ +4. Emit the *step header* in the *exact* format below: + + ```md + - Step <id>: <file>::<function> + - Signature: <exact function signature, as currently in file> + - Type: <API|IMPL|FIX|TEST|DOC> + - Risk: <low|medium|high> + + <diff of changes> + + - Goal: <one sentence> + - Why now: <deps satisfied> + - Summary: <summary of the changes, 6 bullet points maximum> + ``` + +5. Apply the *micro edit* based on the node's diff +6. Update the DAG node accordingly + + > Make sure not to forget the [status update](#status-update-protocol) and [DAG storage](#dag-persistence-protocol) protocols as appropriate. + +7. Emit the *step footer* in the *exact* format below: + + > Note the `check` bullet is only required if there actually *is* a command we could use to verify this particular step's changes were correct. + + ```md + - Changed: <file>::<function> (±<N> lines, starting at line <i> in the updated file) + - Exact diff: + + <output diff of the changes you made here> + + - Check: <command> → <result> + - Next candidates: <ready step ids> + + Press ">" or "." to continue + ``` + +8. Output `⏸️` and **STOP**: do not write the next step's header, do not read the next file, do not begin any further work. Your message **must** end within 1-2 lines after the ⏸️ symbol. + + > **Waiting for a [resumption token](#resumption-tokens) is NOT optional.** + > + > If you find yourself writing a second edit in the same response, you are violating micro mode: stop, delete everything after the first `⏸️`, and end your response. + +9. Upon receipt of a [resumption token](#resumption-tokens), restart this loop at step 1. + +### End-of-phase verification + +After all DAG nodes are marked `done`: + +1. Collect every distinct `Check` command from the step footers emitted during the loop +2. Run each one and record the result +3. If all pass: report success and proceed to exit/wrap-up +4. 
If any fail: present the failures to the user and ask how to proceed: + + - **Fix in micro mode** → plan new `FIX` nodes for each failure, re-enter the execution loop + - **Fix outside micro mode** → exit micro mode, address failures conversationally + - **Ignore** → user accepts the current state as-is + +> [!NOTE] +> +> The `Check` line in each step footer is **informational during the loop**: it documents what *would* verify that step, but the agent does not run it mid-loop. Verification is batched here to preserve the fast edit cadence. +> +> The user may, of course, run checks manually between any two steps. If they report a failure, the [course-correction protocol](#course-correction-protocol-contingent-on-user-input) applies. + +### Phase 2 protocols + +> [!IMPORTANT] +> +> The following protocols are applicable **at all times while in phase 2.** + +#### DAG change protocol + +If, for any reason, you must: + +- roll back in the execution sequence (i.e., the topological ordering) by one or more nodes +- edit/add new nodes (i.e. due to changes in implementation and/or approach) + +You must: + +1. Carefully think through the resulting updates to the DAG that are necessary + + > Note that **multiple (and potentially many) DAG changes may be needed,** including **new and/or updated nodes _and_ edges**. + > + > This is because node additions and/or changes may cause the need for changes/adjustments to ripple through to the rest of the implementation (and hence the rest of the DAG nodes/edges). + +> [!IMPORTANT] +> +> If and when planning changes and additions to the implementation/solution represented by the DAG, there may (and often will) be design decisions/tradeoffs that are substantial enough to require user input. +> +> For all nodes whose implementation is subject to a design tradeoff/choice: +> +> 1. 
Encode the implementation/choice (in the `diff` and `goal` fields) that maximizes the objective function `(2/3 * optimality + 1/3 * likeliness to be approved by the user)` +> 2. Set the node's: +> +> - `status` field to `blocked` +> - `tradeoffs` field to contain a detailed but concise description of the tradeoff/choice to be made/resolved + +2. Update the DAG based on the results of step 1, ensuring `deps` and `status` fields are carefully set + +3. **Log the update to the user** based on the following template + + ```md + *** DAG CHANGED *** + + - What triggered the change: <summarize in 1-2 sentences> + - Changes made to DAG: <summarize in 1-2 sentences OR 1-4 sub-bullet points each containing 1 sentence; make sure these cover both any concrete implementation changes AND changes made to the DAG> + ``` + +4. If there were any implementation changes made **that haven't been discussed with the user yet** (i.e. because you were asked to come up with them independently/agentically), output a final bullet point (as part of the output from step 3) explaining why these changes are needed and optimal: + + ```md + - Why the new implementation is necessary AND optimal: <describe in 1-2 sentences OR 1-4 sub-bullet points each containing 1 sentence> + ``` + +5. **Only if there are any nodes whose `status` is `blocked`**: + + 1. Respond with a numbered list of questions (e.g. via the `AskUserQuestion` tool or equivalent) seeking unambiguous clarification/resolution of each design choice/tradeoff, with each question preceded by a description of the design choice/tradeoff at hand based on (but not a direct reprint of) the node's `tradeoffs` field. + 2. Once clarifications/resolutions have been received, restart at step 1 of this DAG change protocol. + +> [!IMPORTANT] +> +> If any task (i.e., corresponding to an in-progress node that was interrupted) must be left unfinished due to step 4 above: +> +> 1. 
Ensure the DAG is updated (if not already) to: +> +> - change the node's `status` to one of `ready|planned|blocked` +> +> - if set to `blocked` (i.e. due to a newly-surfaced tradeoff/choice), add notes for this to `tradeoffs` +> +> - update `diff` and `goal` to contain the remaining changes needed +> +> 2. ***Only if* `status` was set to `blocked`**: +> +> - Present the design choice/tradeoff to the user and ask for unambiguous clarification +> - Return to step 1 (i.e. of the steps in this callout) and make sure to clear the `tradeoffs` field after recording the chosen solution/change +> +> 3. Briefly note that the current task must be left unfinished, citing: +> +> - the task's ID +> - the remaining changes needed + +#### Course-correction protocol (contingent on user input) + +If the user does *any* of the following: + +- edits your code manually +- says anything like: + + - "I tweaked it" + - "I changed it" + - "I fixed your code" + - "Read my version" + - "Take a look at my edits" + - "Rebase on my edits" + +You **must**: + +1. Re-open and re-read the affected file(s) + + > **Never** assume previous patch state. + +2. Briefly confirm you are now using the current contents +3. Wait for one of the [*resumption tokens* listed above](#resumption-tokens) before proceeding + +> [!IMPORTANT] +> +> #### <ins>*Revert-by-default* rule</ins> (for auto-reverts & redos) +> +> If the user says anything like: +> +> - "no" +> - "wrong direction" +> - "redo" +> - "undo that" +> - or manually reverts your change +> +> You **must:** +> +> 1. revert the change if asked to (or if the user manually reverted, treat the new working tree as authoritative) +> 2. assume the step you just performed is invalid and seek unambiguous clarification as to the correct way to proceed +> 3.
update the DAG as needed +> +> - in particular, ensure the current `status=in_progress` node has its status set to `cancelled|ready|planned` as appropriate +> - if the node's status becomes `cancelled`, remove it from all other nodes' `deps` lists in which it appears +> +> 4. resume the [execution loop](#mandatory-execution-loop) +> +> **Never** defend the rejected implementation. + +## Compatibility with other Bureau-configured workflows + +### Superpowers skills *(Claude Code & Codex)* + +Micro Mode is **compatible** with Superpowers skills: + +- **TDD skill**: Each test-first step and implementation step becomes a micro edit. The TDD cycle (Red → Green → Refactor) maps to the DAG naturally. +- **Systematic debugging**: Investigation steps remain conversational; only actual code fixes become micro edits. +- **Code review**: Review findings can inform DAG construction; fixes are micro edits. + +### Handoff guidelines + +Micro Mode operates **within** a single agent session. If you need to delegate: + +1. **Pause** micro mode (persist DAG) +2. **Delegate** via `clink` or `Task` tool +3. **Resume** micro mode after delegation completes + +Do not attempt to run micro mode across multiple agents simultaneously. diff --git a/protocols/context/static/skills/prompt-engineering/SKILL.md b/protocols/context/static/skills/prompt-engineering/SKILL.md new file mode 100644 index 00000000..4df36119 --- /dev/null +++ b/protocols/context/static/skills/prompt-engineering/SKILL.md @@ -0,0 +1,717 @@ +--- +description: Use when creating, analyzing, or improving prompts for any purpose; whether system prompts, user prompts, agent instructions, or skill definitions. Use when prompts produce vague, inaccurate, generic, or poorly-structured outputs. Use when building new agents, skills, or automated workflows. +--- + +# Prompt engineering + +## Overview + +Transform weak prompts into precision instruments through systematic technique application. 
+ +**Core principle:** Prompts fail in predictable ways; each failure mode has research-proven countermeasures. + +**Violating the letter of this process is violating the spirit of prompt engineering.** + +## When to use + +```dot +digraph when_prompt_eng { + "Creating new prompt?" [shape=diamond]; + "Existing prompt failing?" [shape=diamond]; + "Building agent/skill?" [shape=diamond]; + "Use this skill" [shape=box, style=filled, fillcolor="#ccffcc"]; + "Skip" [shape=box]; + + "Creating new prompt?" -> "Use this skill" [label="yes"]; + "Creating new prompt?" -> "Existing prompt failing?" [label="no"]; + "Existing prompt failing?" -> "Use this skill" [label="yes"]; + "Existing prompt failing?" -> "Building agent/skill?" [label="no"]; + "Building agent/skill?" -> "Use this skill" [label="yes"]; + "Building agent/skill?" -> "Skip" [label="no"]; +} +``` + +**Use for:** + +- System prompts for agents or assistants +- Task prompts producing poor results +- Skill definitions +- User-facing prompt templates +- Automated workflow prompts + +**Don't use for:** + +- One-off simple questions +- Prompts already performing well +- Non-prompt content (use other skills) + +## The iron law + +``` +NO PROMPT SHIPS WITHOUT: +DIAGNOSIS -> TECHNIQUE -> VERIFICATION +``` + +Guessing at improvements without diagnosing the failure mode is not prompt engineering. 
+ +## Failure mode taxonomy + +Before applying any technique, identify which failure mode(s) the prompt exhibits: + +### Category 1: Output quality failures + +| Failure Mode | Symptoms | Primary Techniques | +|--------------|----------|-------------------| +| `VAGUE` | Ambiguous, wishy-washy, hedging | Constraint-first, Role-based | +| `GENERIC` | Template-sounding, could apply to anything | Few-shot with negatives, Directional Stimulus | +| `VERBOSE` | Unnecessary padding, repetition | Skeleton-of-Thought, Constraint-first | +| `WRONG_FORMAT` | Correct content, wrong structure | Structured thinking, Constraint-first | +| `WRONG_TONE` | Content ok, voice/style wrong | Role-based, Directional Stimulus | + +### Category 2: Reasoning failures + +| Failure Mode | Symptoms | Primary Techniques | +|--------------|----------|-------------------| +| `SHALLOW` | Superficial analysis, misses nuance | Structured thinking, Multi-perspective | +| `ILLOGICAL` | Steps don't follow, jumps to conclusions | Zero-Shot CoT, Least-to-Most | +| `SINGLE_PATH` | Misses alternative approaches | Tree of Thoughts, Self-consistency | +| `ARITHMETIC` | Calculation errors | PAL (Program-Aided) | +| `COMPLEX_PROBLEM` | Can't handle multi-step problems | Least-to-Most, Skeleton-of-Thought | + +### Category 3: Accuracy failures + +| Failure Mode | Symptoms | Primary Techniques | +|--------------|----------|-------------------| +| `HALLUCINATION` | Fabricated facts, false citations | Chain-of-verification, Context injection | +| `OUTDATED` | Uses stale information | ReAct (with tools), Generated Knowledge | +| `OVERCONFIDENT` | Wrong but confident | Confidence-weighted, Self-consistency | +| `CONTEXT_LEAK` | Uses outside knowledge when shouldn't | Context injection with boundaries | + +### Category 4: Task execution failures + +| Failure Mode | Symptoms | Primary Techniques | +|--------------|----------|-------------------| +| `INCOMPLETE` | Stops before finishing | Iterative refinement, Exit 
criteria | +| `OFF_TASK` | Answers different question | Constraint-first, Structured thinking | +| `NO_TOOLS` | Needs external data but doesn't seek it | ReAct | +| `STUCK` | Can't proceed, gives up | Reflexion, Tree of Thoughts | + +## Technique selection matrix + +**Quick reference for technique selection:** + +| If prompt needs... | Use technique | Key benefit | +|-------------------|---------------|-------------| +| Precise expertise | Role-based constraint | Domain accuracy | +| Fact verification | Chain-of-verification | Fewer hallucinations | +| Clear boundaries | Few-shot with negatives | Fewer generic outputs | +| Deep analysis | Structured thinking | Layered reasoning | +| Decision confidence | Confidence-weighted | Explicit uncertainty | +| Bounded knowledge | Context injection | Less knowledge bleed | +| Quality iteration | Iterative refinement | Progressive improvement | +| Specific constraints | Constraint-first | Hard requirements met | +| Multiple viewpoints | Multi-perspective | Less bias | +| Self-optimization | Meta-prompting | Optimal structure | +| Consistent answers | Self-consistency | Fewer single-path errors | +| Exploration/backtrack | Tree of Thoughts | Strategic lookahead | +| Tool integration | ReAct | Grounded responses | +| Parallel structure | Skeleton-of-Thought | Speed + organization | +| Progressive complexity | Least-to-Most | Generalization | +| Targeted generation | Directional Stimulus | Specific properties | +| Self-correction | Reflexion | Learn from failures | +| Precise calculation | PAL | Zero arithmetic errors | +| Simple reasoning boost | Zero-Shot CoT | Quick improvement | +| Knowledge priming | Generated Knowledge | Better factual basis | + +## The engineering process + +### Phase 1: Diagnosis + +**Before touching the prompt, identify failure modes:** + +1. **Collect evidence** + + - Run the prompt 3-5 times + - Document specific failures (not general impressions) + - Note which outputs were acceptable vs. 
unacceptable + +2. **Classify failures** + + ``` + DIAGNOSIS REPORT + + Prompt: [brief description] + Test runs: N + + Observed failures: + [1] <FAILURE_MODE>: <specific example> + [2] <FAILURE_MODE>: <specific example> + ... + + Primary failure mode: <most frequent/severe> + Secondary failure modes: <others to address> + ``` + +3. **Identify root cause** + + - Is the failure in the prompt structure? + - Is it a missing technique? + - Is it conflicting instructions? + - Is it asking the impossible? + +### Phase 2: Technique selection + +**Match failure modes to techniques:** + +```dot +digraph technique_selection { + rankdir=TB; + + "Failure mode identified" [shape=ellipse]; + "Accuracy issue?" [shape=diamond]; + "Reasoning issue?" [shape=diamond]; + "Output quality issue?" [shape=diamond]; + "Execution issue?" [shape=diamond]; + + "Chain-of-verification" [shape=box]; + "Context injection" [shape=box]; + "CoT / ToT / L2M" [shape=box]; + "Self-consistency" [shape=box]; + "Role-based" [shape=box]; + "Constraint-first" [shape=box]; + "Few-shot negatives" [shape=box]; + "ReAct / Reflexion" [shape=box]; + + "Failure mode identified" -> "Accuracy issue?"; + "Accuracy issue?" -> "Chain-of-verification" [label="hallucination"]; + "Accuracy issue?" -> "Context injection" [label="knowledge bleed"]; + "Accuracy issue?" -> "Reasoning issue?" [label="no"]; + + "Reasoning issue?" -> "CoT / ToT / L2M" [label="shallow/illogical"]; + "Reasoning issue?" -> "Self-consistency" [label="single-path"]; + "Reasoning issue?" -> "Output quality issue?" [label="no"]; + + "Output quality issue?" -> "Role-based" [label="tone/expertise"]; + "Output quality issue?" -> "Constraint-first" [label="format/bounds"]; + "Output quality issue?" -> "Few-shot negatives" [label="generic"]; + "Output quality issue?" -> "Execution issue?" [label="no"]; + + "Execution issue?" 
-> "ReAct / Reflexion" [label="stuck/incomplete"]; +} +``` + +**Technique stacking rules:** + +- Start with ONE primary technique +- Add secondary techniques only if primary insufficient +- Maximum 3 techniques per prompt (avoid complexity) +- Techniques must be compatible (see compatibility matrix) + +### Phase 3: Application + +**Apply selected technique(s) systematically:** + +1. **Preserve working elements** + + - Don't rewrite from scratch + - Keep what already works + - Add technique elements surgically + +2. **Follow technique templates exactly** + + - Don't improvise on proven patterns + - Use the documented structure + - Modify content, not framework + +3. **Layer techniques carefully** + + - Primary technique: structural changes + - Secondary techniques: augmentations + - Don't let techniques conflict + +### Phase 4: Verification + +**Test the improved prompt:** + +1. **Run same test cases** that revealed original failures + +2. **Compare outputs** quantitatively where possible + +3. **Document improvement**: + + ``` + VERIFICATION REPORT + + Original failure rate: X/N tests + Improved failure rate: Y/N tests + + Failure modes addressed: + [1] <MODE>: FIXED / IMPROVED / UNCHANGED + [2] <MODE>: FIXED / IMPROVED / UNCHANGED + + New issues introduced: [none / list] + + Verdict: SHIP / ITERATE / REVERT + ``` + +4. **Iterate if needed** + + - If UNCHANGED on primary failure: try different technique + - If new issues: revert, analyze, try again + - If IMPROVED but not FIXED: add complementary technique + +## Technique reference + +### Foundation techniques + +**Role-based constraint prompting** + +``` +You are a [specific role] with [X years] experience in [domain]. 
+Your task: [specific task] +Constraints: [list 3-5 specific limitations] +Output format: [exact format needed] +``` + +*Use for: VAGUE, WRONG_TONE, expertise needed* + +**Constraint-first prompting** + +``` +HARD CONSTRAINTS (cannot be violated): +- [constraint 1] +- [constraint 2] + +SOFT PREFERENCES (optimize for these): +- [preference 1] +- [preference 2] + +TASK: [your actual request] +``` + +*Use for: WRONG_FORMAT, OFF_TASK, specific requirements* + +**Context injection with boundaries** + +``` +[CONTEXT] +[paste documentation, code, etc.] + +[FOCUS] +Only use information from CONTEXT. If answer isn't there, say so. + +[TASK] +[your question] + +[CONSTRAINTS] +- Cite specific sections +- Do not use general knowledge +``` + +*Use for: HALLUCINATION, CONTEXT_LEAK, grounded responses* + +### Reasoning techniques + +**Zero-Shot Chain-of-Thought** + +``` +[Your question] + +Let's think step by step. +``` + +*Use for: Quick reasoning boost, SHALLOW analysis* + +**Self-consistency prompting** + +``` +[Your question] + +Generate [N] different reasoning paths. +Show work for each path. +Identify which answer appears most frequently. +``` + +*Use for: SINGLE_PATH, OVERCONFIDENT, consensus needed* + +**Tree of Thoughts** + +``` +Problem: [problem] + +Step 1 - Generate 3 possible first moves +Step 2 - Evaluate each: promising / neutral / unlikely +Step 3 - Expand most promising into 3 next steps +Step 4 - Backtrack if stuck +Step 5 - Continue until solved + +Show exploration tree with evaluations. +``` + +*Use for: COMPLEX_PROBLEM, exploration, strategic problems* + +**Least-to-Most prompting** + +``` +Problem: [complex problem] + +Step 1 - Decomposition: +Break into simpler subproblems, easiest to hardest. + +Step 2 - Sequential solving: +Solve subproblem 1: [solution] +Using that, solve subproblem 2: [solution] +...continue... 
+ +Final answer: [synthesized solution] +``` + +*Use for: COMPLEX_PROBLEM, generalization beyond examples* + +**PAL (Program-Aided Language)** + +``` +Problem: [problem with computation] + +Write Python code to solve step by step. +Include comments explaining reasoning. +End with print(f"Answer: {answer}") + +Execute code for result. +``` + +*Use for: ARITHMETIC errors, precise computation* + +### Verification techniques + +**Chain-of-verification** + +``` +Task: [question] + +Step 1: Provide initial answer +Step 2: Generate 5 verification questions that would expose errors +Step 3: Answer each verification question +Step 4: Provide corrected final answer +``` + +*Use for: HALLUCINATION, OVERCONFIDENT, accuracy-critical* + +**Reflexion** + +``` +Task: [task] + +Attempt 1: [solution] + +Evaluation: +- Success? [Yes/No] +- What went wrong? [issues] + +Self-Reflection: +- Root cause: [why] +- Lesson: [what to change] + +Attempt 2 (with lessons): [improved solution] + +[Repeat until success] +``` + +*Use for: STUCK, iterative improvement, learning from failure* + +**Confidence-weighted prompting** + +``` +Answer: [question] + +Provide: +1. Primary answer +2. Confidence (0-100%) +3. Key assumptions +4. What would change your answer +5. Alternative if <80% confident +``` + +*Use for: OVERCONFIDENT, high-stakes decisions* + +### Output control techniques + +**Few-shot with negative examples** + +``` +I need you to [task]. 
Examples: + +GOOD: [example] +GOOD: [example] + +BAD: [example] +Why bad: [reason] +BAD: [example] +Why bad: [reason] + +Now complete: [task] +``` + +*Use for: GENERIC outputs, style control* + +**Structured thinking protocol** + +``` +Before answering, complete: + +[UNDERSTAND] +- Restate problem +- Identify what's asked + +[ANALYZE] +- Break into sub-components +- Note assumptions/constraints + +[STRATEGIZE] +- 2-3 potential approaches +- Evaluate trade-offs + +[EXECUTE] +- Final answer +- Explain reasoning + +Question: [question] +``` + +*Use for: SHALLOW, ILLOGICAL, complex reasoning* + +**Skeleton-of-Thought** + +``` +Question: [question] + +Stage 1 - Skeleton: +List key points as brief phrases (3-7 points). +Do not elaborate. + +Stage 2 - Expansion: +Expand each point into detailed paragraph. +Each self-contained. +``` + +*Use for: VERBOSE, speed + structure* + +**Directional Stimulus prompting** + +``` +Task: [task] +Input: [content] + +Directional stimulus: +- Keywords to include: [required] +- Tone: [style] +- Focus on: [priorities] +- Avoid: [exclusions] + +Complete task incorporating stimulus above. +``` + +*Use for: GENERIC, specific properties needed* + +### Meta techniques + +**Meta-prompting** + +``` +I need to accomplish: [goal] + +Your task: +1. Analyze what would make the PERFECT prompt +2. Consider: specificity, context, constraints, format, examples +3. Write that perfect prompt +4. Execute it and provide output + +[GOAL]: [objective] +``` + +*Use for: Unknown optimal structure, self-optimization* + +**Multi-perspective prompting** + +``` +Analyze [topic] from these perspectives: + +[PERSPECTIVE 1: Technical Feasibility] +[PERSPECTIVE 2: Business Impact] +[PERSPECTIVE 3: User Experience] +[PERSPECTIVE 4: Risk/Security] + +SYNTHESIS: +Integrate into recommendation with trade-offs. 
+``` + +*Use for: Bias reduction, strategic analysis* + +**Generated Knowledge prompting** + +``` +Question: [question] + +Step 1 - Knowledge Generation: +Generate 3-5 relevant facts that would help answer accurately. + +Step 2 - Knowledge Integration: +Using ONLY generated knowledge, answer the question. +``` + +*Use for: Factual accuracy, self-retrieval* + +**ReAct prompting** + +``` +Task: [question] + +Use this loop: +Thought: [what info needed / what to do] +Action: [Search/Lookup/Calculate][query] +Observation: [result] + +When confident: +Thought: I have enough information. +Final Answer: [answer] +``` + +*Use for: NO_TOOLS, needs external information* + +**Iterative refinement** + +``` +[ITERATION 1] +Create draft of [task] + +[ITERATION 2] +Review output. Identify 3 weaknesses. + +[ITERATION 3] +Rewrite addressing all weaknesses. + +[ITERATION 4] +Final review: Production-ready? If not, what's missing? +``` + +*Use for: INCOMPLETE, quality improvement* + +## Technique compatibility + +Some techniques work well together; others conflict: + +| Technique | Combines well with | Conflicts with | +|-----------|-------------------|----------------| +| Role-based | Constraint-first, Few-shot | - | +| Constraint-first | All | - | +| Chain-of-verification | Self-consistency | - | +| Few-shot negatives | Role-based, Constraint-first | - | +| Structured thinking | Multi-perspective | Skeleton-of-Thought | +| Self-consistency | Chain-of-verification | Skeleton-of-Thought | +| Tree of Thoughts | Reflexion | Skeleton-of-Thought | +| ReAct | Generated Knowledge | PAL | +| Skeleton-of-Thought | Directional Stimulus | Multi-path techniques | +| Least-to-Most | Self-consistency | - | +| Reflexion | Tree of Thoughts | - | +| PAL | Constraint-first | ReAct | +| Zero-Shot CoT | Everything | Nothing | +| Meta-prompting | Use alone first | All (it generates its own) | + +## Anti-patterns + +### What not to do + +| Anti-pattern | Problem | Instead | 
+|--------------|---------|---------| +| "Just make it better" | No diagnosis, no technique | Diagnose failure mode first | +| Technique soup | Too many techniques, conflicts | Max 3, check compatibility | +| Ignoring evidence | Not testing before/after | Always verify improvement | +| Template worship | Copying template without understanding | Understand why technique works | +| Overengineering | Complex prompt for simple task | Match complexity to problem | +| Skipping verification | Assuming improvement worked | Test same failure cases | + +### Common rationalizations + +| Excuse | Reality | +|--------|---------| +| "The prompt is close enough" | If it's failing, it's not close enough | +| "More examples will fix it" | Examples without negatives often fail | +| "Just add more detail" | Verbosity does not equal clarity; structure matters | +| "This technique is overkill" | Proven techniques beat intuition | +| "I'll verify later" | Unverified improvements ship bugs | +| "The model should understand" | Models need explicit structure | + +## Integration with other skills + +### With bureau-scrimmage-mode + +For **security-critical prompts**: + +1. Engineer prompt with this skill +2. Activate scrimmage mode +3. Attack the prompt (injection, jailbreak, edge cases) +4. Fix vulnerabilities found +5. Re-verify + +## Quick reference + +### Diagnosis + +``` +1. Run prompt 3-5 times +2. Document specific failures +3. Classify failure modes +4. Identify primary mode +``` + +### Technique selection + +``` +Accuracy -> Chain-of-verification / Context injection +Reasoning -> CoT / ToT / Least-to-Most / Self-consistency +Output -> Role-based / Constraint-first / Few-shot negatives +Execution -> ReAct / Reflexion +``` + +### Application + +``` +1. Preserve working elements +2. Follow templates exactly +3. Max 3 techniques +4. Check compatibility +``` + +### Verification + +``` +1. Rerun original failure cases +2. Compare quantitatively +3. 
Document: FIXED / IMPROVED / UNCHANGED +4. Verdict: SHIP / ITERATE / REVERT +``` + +## Checklist + +Before shipping any engineered prompt: + +- [ ] Failure modes diagnosed with evidence +- [ ] Technique(s) selected based on diagnosis +- [ ] Applied using documented templates +- [ ] Compatible techniques (max 3) +- [ ] Tested on original failure cases +- [ ] Improvement documented quantitatively +- [ ] No new failure modes introduced +- [ ] Verified by independent run + +## Red flags - stop + +- Applying technique without diagnosis +- Combining incompatible techniques +- Skipping verification +- "It seems better" without evidence +- More than 3 techniques stacked +- Template significantly modified +- Original failure cases not retested + +**All of these mean: Go back to Phase 1.** diff --git a/protocols/context/static/skills/safeguard-mode/SKILL.md b/protocols/context/static/skills/safeguard-mode/SKILL.md new file mode 100644 index 00000000..0176da6f --- /dev/null +++ b/protocols/context/static/skills/safeguard-mode/SKILL.md @@ -0,0 +1,606 @@ +--- +description: Continuous invariant protection throughout implementation. Activate when user says "SAFEGUARD MODE ON", "protect these invariants", "safeguard these rules", "verify invariants", or "guard these rules". Defines system rules that must never break (value constraints, state machines, relationships, uniqueness, temporal, ordering, consistency), verifies after every change, and blocks changes that would violate them. Configurable intensity from light to paranoid. +--- + +# Safeguard Mode: *protocol* + +> <ins>***Goal:** define the rules that must never break, verify after every change*</ins> +> +> *Continuous invariant protection throughout implementation. You will define system invariants upfront, generate verification checks, and block any change that would violate them.* + +> [!IMPORTANT] +> +> The directives below are **non-negotiable hard constraints** to be followed **exactly as they are specified**. 
+ +## Entry/exit protocols + +### Activation/deactivation + +When the user says anything like: + +- "SAFEGUARD MODE ON" +- "protect these invariants" +- "safeguard these rules" +- "verify invariants after changes" +- "guard these rules" + +*follow this Safeguard Mode protocol* until you are told anything like: + +- "safeguard mode off" +- "SAFEGUARD MODE OFF" +- "stop checking invariants" + +If you are unsure, confirm unambiguously with the user. + +Upon exit, emit: + +``` +═══════════════════════════════════════ +Safeguard Mode OFF +Invariants defined: N +Changes verified: M +Violations caught: K +Violations fixed: J +Active invariants stored: [location, or "none"] +═══════════════════════════════════════ +``` + +### Verification intensity + +Verification intensity can be configured. Default is `standard`. + +| Intensity | When to verify | Verification method | +|-----------|----------------|---------------------| +| `light` | After each function change | Static analysis only | +| `standard` | After each edit | Static + generated assertions | +| `strict` | After each edit | Static + runtime checks + tests | +| `paranoid` | After each line change | All methods + formal reasoning | + +Activate specific intensity: "SAFEGUARD MODE ON, intensity: strict" + +## Core contract + +### The invariant guarantee + +For every code change: + +1. **Check** all defined invariants against the change +2. **Verify** no invariant is violated (statically or dynamically) +3. **Block** changes that would break invariants +4. **Report** potential violations with evidence +5. **Require** fix or explicit waiver before proceeding + +### What is an invariant? 
+ +An invariant is a property that must **always** hold true throughout program execution: + +| Invariant type | Description | Example | +|----------------|-------------|---------| +| **Value constraint** | Bounds on numeric/string values | `balance >= 0`, `age > 0 && age < 150` | +| **State machine** | Valid state transitions | `order: pending → confirmed → shipped` | +| **Relationship** | Properties between entities | `child.parent_id == parent.id` | +| **Cardinality** | Count constraints | `user.sessions.length <= 5` | +| **Uniqueness** | No duplicates | `emails are unique across users` | +| **Temporal** | Time-based constraints | `token.expires_at > token.created_at` | +| **Ordering** | Sequence requirements | `events sorted by timestamp` | +| **Consistency** | Cross-field agreement | `if premium then subscription != null` | + +## Invariant definition + +### Setup phase + +Before making changes, define invariants explicitly. Prompt user if none provided: + +``` +INVARIANT SETUP REQUIRED + +Before proceeding, define the invariants to guard. +Format: One invariant per line, with category prefix. 
+ +Example: + [VALUE] user.balance >= 0 + [STATE] Order: pending → confirmed → shipped → delivered + [UNIQUE] User.email must be unique + [TEMPORAL] Session.expires_at > Session.created_at + [CARD] User.active_sessions <= 5 + +Your invariants: _ +``` + +### Invariant syntax + +Each invariant definition must include: + +``` +INVARIANT: <id> +Type: <VALUE | STATE | RELATIONSHIP | CARDINALITY | UNIQUE | TEMPORAL | ORDERING | CONSISTENCY> +Entity: <class/table/module affected> +Rule: <formal expression or natural language description> +Scope: <where this applies: function, module, system-wide> +Severity: <CRITICAL | HIGH | MEDIUM | LOW> +Verify: <how to check: static | runtime | test | manual> +``` + +**Example definitions:** + +``` +INVARIANT: non_negative_balance +Type: VALUE +Entity: Account +Rule: account.balance >= 0 at all times +Scope: system-wide +Severity: CRITICAL +Verify: static + runtime + +INVARIANT: order_state_machine +Type: STATE +Entity: Order +Rule: status transitions only: pending → confirmed → shipped → delivered → completed + OR pending → cancelled (terminal) + No skips, no reversals except via explicit refund flow +Scope: Order module +Severity: CRITICAL +Verify: static + test + +INVARIANT: cache_size_bound +Type: CARDINALITY +Entity: CacheManager +Rule: cache.size <= cache.max_size +Scope: CacheManager class +Severity: HIGH +Verify: runtime assertion + +INVARIANT: user_email_unique +Type: UNIQUE +Entity: User +Rule: No two users share the same email address +Scope: system-wide +Severity: CRITICAL +Verify: database constraint + test +``` + +### Invariant registry format + +Maintain a running registry of all active invariants: + +``` +══════════════════════════════════════════════════════════════════════ +INVARIANT REGISTRY +Task: <current task description> +══════════════════════════════════════════════════════════════════════ + +ID │ Type │ Entity │ Severity │ Status +──────────────────────┼────────┼───────────────┼──────────┼───────── 
+non_negative_balance │ VALUE │ Account │ CRITICAL │ ✅ ACTIVE +order_state_machine │ STATE │ Order │ CRITICAL │ ✅ ACTIVE +cache_size_bound │ CARD │ CacheManager │ HIGH │ ✅ ACTIVE +user_email_unique │ UNIQUE │ User │ CRITICAL │ ✅ ACTIVE + +Total: 4 invariants (4 active, 0 suspended, 0 violated) +══════════════════════════════════════════════════════════════════════ +``` + +## Verification protocol + +### Pre-change analysis + +Before applying any code change: + +1. **Identify affected invariants**: Which invariants touch the code being modified? +2. **Flag high-risk changes**: Changes to code that enforces an invariant +3. **Warn on invariant removal**: Any code deletion that removes invariant enforcement + +``` +PRE-CHANGE INVARIANT ANALYSIS +Change: Modify withdraw() in Account class + +Affected invariants: +├── non_negative_balance (CRITICAL) ⚠️ DIRECTLY AFFECTED +│ └── This function enforces the balance >= 0 constraint +├── audit_trail_complete (HIGH) +│ └── withdraw() creates audit entries +└── No other invariants affected + +Risk level: HIGH - modifying invariant enforcement code +Proceed with caution: YES/NO? +``` + +### Post-change verification + +After every code change, verify all affected invariants: + +1. **Static analysis**: Inspect code paths for possible violations +2. **Assertion generation**: Create runtime assertions if not present +3. **Test verification**: Run relevant tests that exercise invariants +4. 
**Manual confirmation**: For invariants that can't be automated + +### Verification report format + +``` +══════════════════════════════════════════════════════════════════════ +INVARIANT VERIFICATION: <file>::<function> +Change: <brief description> +══════════════════════════════════════════════════════════════════════ + +INVARIANTS CHECKED: + +[1] non_negative_balance (CRITICAL) + Method: Static analysis of all code paths + Result: ✅ PRESERVED + Evidence: All paths through withdraw() check balance before decrement + +[2] order_state_machine (CRITICAL) + Method: State transition analysis + Result: ✅ PRESERVED + Evidence: New code only allows valid transitions per state machine + +[3] cache_size_bound (HIGH) + Method: Runtime assertion exists + Result: ⚠️ WEAKENED + Evidence: New code path bypasses size check on cache.force_insert() + Location: line 45-48 + Recommendation: Add size check or document exception + +[4] audit_trail_complete (HIGH) + Method: Code inspection + Result: ❌ VIOLATED + Evidence: New early return on line 23 skips audit logging + Must fix before proceeding + +══════════════════════════════════════════════════════════════════════ +SUMMARY: 4 checked | 2 preserved | 1 weakened | 1 violated +ACTION REQUIRED: Fix [4] before proceeding +══════════════════════════════════════════════════════════════════════ +``` + +### Result classifications + +| Symbol | Status | Meaning | Action | +|--------|--------|---------|--------| +| ✅ | `PRESERVED` | Invariant still holds | None required | +| ⚠️ | `WEAKENED` | Invariant partially compromised | Warn, recommend fix | +| ❌ | `VIOLATED` | Invariant broken | **Must fix before proceeding** | +| 🔄 | `TRANSFERRED` | Enforcement moved to different location | Document new location | +| ⏸️ | `SUSPENDED` | Temporarily disabled (with approval) | Track, re-enable later | + +## Violation handling + +### On VIOLATED result + +1. **Stop immediately** — do not proceed with additional changes +2. 
**Identify root cause** — what in the change breaks the invariant? +3. **Propose fix** — concrete code to restore invariant +4. **Apply fix** — implement the restoration +5. **Re-verify** — confirm invariant now holds +6. **Continue only after** — all VIOLATED become PRESERVED + +### On WEAKENED result + +1. **Document the weakening** clearly +2. **Assess risk** — could this lead to violation under certain conditions? +3. **Propose strengthening** — how to restore full protection +4. **Ask user**: "Accept weakened invariant, or fix now?" +5. **If accepted**: Mark as acknowledged, continue monitoring +6. **If fix**: Apply strengthening, re-verify + +### Violation report format + +``` +⚠️ INVARIANT VIOLATION DETECTED + +Invariant: non_negative_balance +Severity: CRITICAL +Status: ❌ VIOLATED + +VIOLATION DETAILS: +┌─────────────────────────────────────────────────────────────────────┐ +│ Location: src/accounts/account.py::withdraw() line 34 │ +│ │ +│ Code path that violates: │ +│ 1. User requests withdrawal of $100 │ +│ 2. Concurrent request reduces balance to $50 │ +│ 3. First request proceeds (no re-check after lock acquisition) │ +│ 4. Balance becomes -$50 ← VIOLATION │ +│ │ +│ Root cause: Race condition - balance checked before lock, used │ +│ after lock without re-validation │ +└─────────────────────────────────────────────────────────────────────┘ + +PROPOSED FIX: +~~~python +def withdraw(self, amount): + with self.lock: + # Re-check balance after acquiring lock + if self.balance < amount: + raise InsufficientFunds(f"Balance {self.balance} < {amount}") + self.balance -= amount +~~~ + +APPLY FIX? 
[Y/N]: _ +``` + +### Waiver protocol + +In rare cases, an invariant may need temporary suspension: + +``` +INVARIANT WAIVER REQUEST + +Invariant: cache_size_bound +Reason for waiver: Emergency bulk import requires temporary over-capacity +Duration: Until import completes (estimated 5 minutes) +Risk acknowledged: YES +Compensating control: Manual monitoring of memory usage + +User approval required: _ +``` + +**Waivers require:** +- Explicit user approval +- Documented reason +- Defined duration or condition for restoration +- Compensating controls identified +- Automatic reminder to restore + +## Invariant types deep dive + +### Value constraints + +``` +[VALUE] account.balance >= 0 +[VALUE] 0 < user.age < 150 +[VALUE] password.length >= 8 +[VALUE] retry_count <= max_retries +``` + +**Verification methods:** +- Static: Range analysis, symbolic execution +- Runtime: Assertions, property-based tests +- Database: CHECK constraints + +### State machines + +Define valid states and transitions: + +``` +[STATE] Order lifecycle: + States: {pending, confirmed, processing, shipped, delivered, cancelled, refunded} + + Transitions: + pending → confirmed (on: payment_received) + pending → cancelled (on: user_cancel, timeout) + confirmed → processing (on: start_fulfillment) + processing → shipped (on: carrier_pickup) + shipped → delivered (on: delivery_confirmed) + delivered → refunded (on: refund_approved) + + Terminal: {cancelled, refunded} + + Forbidden: + * → pending (no restart) + cancelled → * (terminal) + shipped → processing (no reversal) +``` + +**Verification methods:** +- Static: State transition analysis, exhaustive path checking +- Runtime: State machine assertions, event sourcing validation +- Test: Property-based tests with random transitions + +### Relationships + +``` +[REL] Every Order has exactly one Customer +[REL] LineItem.quantity * LineItem.unit_price == LineItem.total +[REL] Parent.children contains Child implies Child.parent == Parent +[REL] 
User.role IN Organization.allowed_roles +``` + +**Verification methods:** +- Static: Type checking, reference analysis +- Runtime: Foreign key constraints, computed property checks +- Test: Referential integrity tests + +### Cardinality + +``` +[CARD] User.sessions.count <= 5 +[CARD] Order.line_items.count >= 1 +[CARD] Team.members.count BETWEEN 2 AND 10 +[CARD] Singleton.instances.count == 1 +``` + +**Verification methods:** +- Static: Collection size analysis +- Runtime: Length assertions, database constraints +- Test: Boundary tests + +### Uniqueness + +``` +[UNIQUE] User.email (system-wide) +[UNIQUE] Order.order_number (system-wide) +[UNIQUE] Session.token (system-wide) +[UNIQUE] Product.sku (per Vendor) +``` + +**Verification methods:** +- Static: Unique index verification +- Runtime: Database UNIQUE constraints, set membership checks +- Test: Duplicate insertion tests + +### Temporal + +``` +[TEMPORAL] token.expires_at > token.created_at +[TEMPORAL] order.shipped_at > order.confirmed_at (if both exist) +[TEMPORAL] subscription.end_date >= subscription.start_date +[TEMPORAL] event.processed_at <= NOW() + tolerance +``` + +**Verification methods:** +- Static: Temporal logic analysis +- Runtime: DateTime comparisons, monotonic clock checks +- Test: Time-based property tests + +### Ordering + +``` +[ORDER] events sorted by timestamp ASC +[ORDER] priority_queue maintains heap property +[ORDER] version_history sorted by version DESC +[ORDER] search_results sorted by relevance DESC, then date DESC +``` + +**Verification methods:** +- Static: Sort stability analysis +- Runtime: is_sorted assertions, heap property checks +- Test: Ordering preservation tests + +### Consistency + +``` +[CONSIST] if user.is_premium then user.subscription != null +[CONSIST] if order.status == 'shipped' then order.tracking_number != null +[CONSIST] sum(line_items.total) == order.subtotal +[CONSIST] cache.keys == database.active_records.ids +``` + +**Verification methods:** +- Static: 
Conditional analysis, sum verification +- Runtime: Consistency checks on state changes +- Test: Property-based consistency tests + +## Invariant persistence + +### Storing invariants for future sessions + +When pausing or completing a session, persist invariants: + +**If Memory MCP available (preferred):** +``` +Entity: InvariantSet +Attributes: + - project: <project name> + - task: <task description> + - created_at: <ISO timestamp> + +Relations: + - (InvariantSet)-[:CONTAINS]->(Invariant) + +Entity: Invariant +Attributes: + - id: <invariant id> + - type: <VALUE|STATE|...> + - entity: <affected entity> + - rule: <formal rule> + - severity: <CRITICAL|HIGH|...> + - status: <ACTIVE|SUSPENDED|...> +``` + +**If Qdrant MCP available (fallback):** +``` +metadata.type: "invariant_set" +metadata.project: <project name> +metadata.task: <task description> +metadata.created_at: <ISO timestamp> +content: JSON-serialized invariant registry +``` + +### Loading invariants + +At session start, check for existing invariants: + +``` +SAFEGUARD MODE ON + +Checking for existing invariants... +Found: 4 invariants from previous session (2024-01-15) + +Load existing invariants? [Y/N/Review first]: _ +``` + +## Compatibility with other modes + +### With Micro Mode + +Invariant verification triggers **after each micro edit**: + +``` +[Micro edit] → [Invariant verification] → [Fix if violated] → [⏸️] +``` + +For efficiency: +- Only verify invariants relevant to the changed code +- Batch verification for related micro edits + +### With Scrimmage Mode + +Complementary relationship: +- **Safeguard Mode**: Prevents known violations (defined rules) +- **Scrimmage Mode**: Discovers unknown vulnerabilities (attack vectors) + +Run in parallel: +1. Safeguard Mode checks defined rules +2. Scrimmage Mode attacks beyond defined rules +3. Discovered vulnerabilities can become new invariants + +### With Blast Radius Mode + +Sequence: +1. **Blast Radius**: Analyze what could be affected +2. 
**Safeguard Mode**: Verify no invariants in blast radius are violated +3. **Apply change** only if both pass + +### With Clearance Mode + +Auto-add invariant criteria: +``` +CLEARANCE CRITERIA (auto-added by Safeguard Mode): +□ All defined invariants verified post-implementation +□ No VIOLATED status on any invariant +□ All WEAKENED invariants acknowledged or fixed +□ Invariant tests added for CRITICAL invariants +``` + +## Quick reference + +### Activation + +``` +SAFEGUARD MODE ON # Standard intensity +SAFEGUARD MODE ON, intensity: strict # With runtime checks +SAFEGUARD MODE ON, load: previous # Load from previous session +``` + +### Defining invariants + +``` +define invariant: <natural language description> +add invariant: [TYPE] <rule> +remove invariant: <id> +suspend invariant: <id> (reason: <reason>, until: <condition>) +restore invariant: <id> +``` + +### During session + +``` +list invariants # Show all active invariants +verify all # Run full verification +verify <id> # Verify specific invariant +waiver <id> # Request temporary suspension +strengthen <id> # Propose stronger enforcement +``` + +### Shorthand responses + +After verification report: + +``` +> # Proceed (all invariants preserved) +fix # Apply proposed fix +waiver # Request temporary suspension +strengthen # Add stronger enforcement +skip # Skip verification (requires justification) +``` diff --git a/protocols/context/static/skills/scrimmage-mode/SKILL.md b/protocols/context/static/skills/scrimmage-mode/SKILL.md new file mode 100644 index 00000000..de6a5a6a --- /dev/null +++ b/protocols/context/static/skills/scrimmage-mode/SKILL.md @@ -0,0 +1,352 @@ +--- +description: Systematic self-attack vulnerability testing after every code change. Activate when user says "SCRIMMAGE MODE ON", "attack your own code", "red-team this", "scrimmage this", or wants proactive security testing. 
Generates attack vectors from 5 categories (input validation, state, failure modes, concurrency, security), executes attacks, and blocks progression until vulnerabilities are fixed. Configurable depth levels from light to paranoid. +--- + +# Scrimmage Mode: *protocol* + +> <ins>***Goal:** attack your own code immediately after writing it*</ins> +> +> *Proactive vulnerability discovery through systematic self-attack. After each code change, you will generate attack vectors, attempt to break your own code, and only proceed after surviving or fixing all attacks.* + +> [!IMPORTANT] +> +> The directives below are **non-negotiable hard constraints** to be followed **exactly as they are specified**. + +## Entry/exit protocols + +### Activation/deactivation + +When the user says anything like: + +- "SCRIMMAGE MODE ON" +- "scrimmage this" +- "implement with scrimmage testing" +- "attack your own code" +- "red-team this implementation" + +*follow this Scrimmage Mode protocol* until you are told anything like: + +- "scrimmage mode off" +- "SCRIMMAGE MODE OFF" +- "skip scrimmage testing" + +If you are unsure, confirm unambiguously with the user. + +Upon exit, emit: + +``` +═══════════════════════════════════════ +Scrimmage Mode OFF +Changes tested: N +Attacks generated: M +Vulnerabilities found & fixed: K +Acknowledged risks: [ids, or "none"] +═══════════════════════════════════════ +``` + +### Depth levels + +Scrimmage testing depth can be configured. Default is `standard`. 
+ +| Depth | When to use | Attack vectors per change | +|-------|-------------|---------------------------| +| `light` | Utility code, low-risk changes | 3-5 vectors, input validation only | +| `standard` | Most application code | 5-8 vectors, inputs + state + failures | +| `deep` | Security-critical, financial, auth | 8-12 vectors, full taxonomy | +| `paranoid` | Cryptography, access control, payments | 12+ vectors, including concurrency & timing | + +Activate specific depth: "SCRIMMAGE MODE ON, depth: deep" + +## Core contract + +### The scrimmage guarantee + +For every code change that modifies behavior (not formatting, comments, or renames): + +1. **Generate** targeted attack vectors based on the change +2. **Execute** attacks (mentally simulate or actually run if testable) +3. **Report** results with evidence +4. **Block** progression until all attacks are survived or fixed + +### What constitutes an "attack" + +An attack is a specific input, state, or condition designed to: + +- Cause incorrect behavior +- Trigger an unhandled exception +- Corrupt data or state +- Bypass validation or security controls +- Exhaust resources (memory, CPU, connections) +- Expose sensitive information +- Create race conditions or deadlocks + +## Attack vector taxonomy + +You must draw attacks from these categories based on the code being changed: + +### Category 1: Input validation attacks + +| Vector | Description | Example | +|--------|-------------|---------| +| `null` | Null/None/nil where value expected | `process_user(None)` | +| `empty` | Empty string, list, dict | `parse_csv("")` | +| `boundary` | Min/max values, off-by-one | `withdraw(balance + 1)` | +| `type_confusion` | Wrong type that might coerce | `set_age("25")` vs `set_age(25)` | +| `malformed` | Syntactically invalid input | `parse_json("{invalid")` | +| `oversized` | Extremely large inputs | `process(data_1GB)` | +| `unicode` | Special characters, RTL, emoji | `username = "admin\u0000"` | +| `injection` | SQL, 
command, template injection | `query("; DROP TABLE users;")` | + +### Category 2: State attacks + +| Vector | Description | Example | +|--------|-------------|---------| +| `invalid_state` | Operation in wrong state | `ship_order(order_status="cancelled")` | +| `stale_state` | Concurrent modification | Read-modify-write without locking | +| `duplicate` | Repeated operation that should be idempotent | `charge_payment()` called twice | +| `ordering` | Out-of-order operations | `complete()` before `start()` | +| `partial` | Incomplete initialization | Object used before fully constructed | + +### Category 3: Failure mode attacks + +| Vector | Description | Example | +|--------|-------------|---------| +| `network_fail` | Connection refused, timeout, DNS failure | External API unreachable | +| `disk_full` | Write failures due to storage | Log rotation fails | +| `oom` | Memory exhaustion | Large file loaded entirely | +| `timeout` | Operation exceeds time limit | Database query hangs | +| `dependency_fail` | Downstream service unavailable | Cache server down | + +### Category 4: Concurrency attacks + +| Vector | Description | Example | +|--------|-------------|---------| +| `race_condition` | Timing-dependent correctness | Check-then-act without synchronization | +| `deadlock` | Circular wait on resources | Lock A then B vs Lock B then A | +| `starvation` | Resource never available | Unfair scheduling | +| `lost_update` | Concurrent writes clobber each other | Two threads increment counter | + +### Category 5: Security attacks + +| Vector | Description | Example | +|--------|-------------|---------| +| `authz_bypass` | Access without proper authorization | Direct object reference | +| `authn_bypass` | Circumvent authentication | Token reuse, session fixation | +| `info_leak` | Sensitive data in logs/errors | Stack trace with credentials | +| `privilege_escalation` | Gain elevated access | User becomes admin | +| `timing_attack` | Information via timing differences | 
Password comparison timing | + +## Execution protocol + +### When to attack + +Trigger scrimmage testing after: + +1. **Completing a function or method** (new or modified) +2. **Modifying control flow** (if/else, loops, error handling) +3. **Changing data validation or transformation** +4. **Touching security-related code** (auth, crypto, access control) +5. **Modifying state management** (database, cache, session) + +Do NOT trigger for: + +- Pure formatting changes +- Comment-only changes +- Renaming without logic change +- Import reordering + +### Attack generation process + +For each triggering change: + +1. **Identify attack surface**: What inputs, states, and failure modes does this code interact with? + +2. **Select relevant categories**: Based on what the code does: + - Handles user input → Category 1 (inputs) + - Manages state/database → Category 2 (state) + - Calls external services → Category 3 (failures) + - Uses threads/async → Category 4 (concurrency) + - Controls access → Category 5 (security) + +3. **Generate specific vectors**: Create concrete attack scenarios, not abstract categories + +4. 
**Prioritize by risk**: High-severity vectors first + +### Attack report format + +After generating attacks, emit a report in a similar format to the following exemplar: + +``` +══════════════════════════════════════════════════════════ +SCRIMMAGE ANALYSIS: <file>::<function> +Depth: <light|standard|deep|paranoid> +══════════════════════════════════════════════════════════ + +ATTACK VECTORS: + +[1] null_user_id (Category: input/null) + Attack: Call get_user(user_id=None) + Expected: Should raise ValueError or return None safely + Result: ✅ SURVIVED — raises ValueError("user_id required") + +[2] sql_injection (Category: security/injection) + Attack: get_user(user_id="1; DROP TABLE users;--") + Expected: Should use parameterized query, not string concat + Result: ✅ SURVIVED — using parameterized query + +[3] negative_balance (Category: input/boundary) + Attack: withdraw(amount=-100) + Expected: Should reject negative amounts + Result: ❌ BROKEN — accepts negative, effectively deposits + Fix: Add validation `if amount <= 0: raise ValueError` + +[4] concurrent_withdraw (Category: concurrency/race) + Attack: Two threads withdraw(50) when balance=75 + Expected: One should fail, total withdrawn ≤ 75 + Result: ⚠️ POTENTIAL — no locking visible, needs verification + Mitigation: Add SELECT FOR UPDATE or application-level lock + +══════════════════════════════════════════════════════════ +SUMMARY: 4 vectors | 2 survived | 1 broken | 1 potential +ACTION REQUIRED: Fix [3], verify [4] before proceeding +══════════════════════════════════════════════════════════ +``` + +### Result classifications + +| Symbol | Status | Meaning | Action | +|--------|--------|---------|--------| +| ✅ | `SURVIVED` | Code handles attack correctly | None required | +| ❌ | `BROKEN` | Attack succeeded, vulnerability confirmed | **Must fix before proceeding** | +| ⚠️ | `POTENTIAL` | Cannot verify without runtime test | Document, recommend verification | +| 🛡️ | `MITIGATED` | Vulnerability exists but 
mitigated elsewhere | Document mitigation | +| 📝 | `ACKNOWLEDGED` | User accepted risk explicitly | Log acknowledgment | + +## Failure handling + +### On BROKEN result + +1. **Stop immediately** — do not proceed to next change +2. **Propose fix** — concrete code change to address vulnerability +3. **Apply fix** — implement the mitigation +4. **Re-attack** — verify the fix actually works +5. **Continue only after** — all BROKEN vectors become SURVIVED or MITIGATED + +### On POTENTIAL result + +1. **Document the risk** clearly +2. **Propose verification method** (test to write, manual check needed) +3. **Ask user**: "Acknowledge risk and proceed, or pause to verify?" +4. **If acknowledged**: Mark as 📝 ACKNOWLEDGED with user's response +5. **If verify**: Pause, write verification, confirm result + +### Escalation + +If you discover a vulnerability that: + +- Affects code outside current scope +- Indicates systemic issue (pattern appears elsewhere) +- Has high severity (auth bypass, data corruption, RCE) + +**Escalate immediately**: + +``` +⚠️ ESCALATION: High-severity vulnerability pattern detected + +Finding: SQL injection via string concatenation +Location: src/db/queries.py::get_user (current change) +Pattern also appears in: + - src/db/queries.py::get_orders (line 45) + - src/db/queries.py::search_products (line 89) + - src/api/admin.py::lookup_user (line 23) + +Recommend: Pause current work, address systemic issue first. +Proceed with current fix only? 
[y/n] +``` + +## Attack persistence + +### Storing attacks for regression + +When attacks are generated, optionally persist them as test cases: + +**If test framework is available**: + +```python +# Auto-generated scrimmage test from Scrimmage Mode +# Attack vector: null_user_id (Category: input/null) +def test_scrimmage_get_user_null_input(): + """Scrimmage: get_user should handle None user_id safely.""" + with pytest.raises(ValueError, match="user_id required"): + get_user(user_id=None) +``` + +**Offer to generate**: After each attack session, ask: +"Generate test cases for these attack vectors? [y/n]" + +### Session persistence + +If pausing scrimmage session: + +1. Store pending attack vectors to memory (Qdrant with `metadata.type: "scrimmage_session"`) +2. On resume: reload vectors, continue from last position + +## Compatibility with other modes + +### With Micro Mode + +Scrimmage testing triggers **after each micro edit**, not after each line: + +``` +[Micro Mode Step] → [Apply micro edit] → [Scrimmage analysis] → [Fix if broken] → [⏸️] +``` + +The micro edit is not complete until scrimmage analysis passes. + +### With Contract-First Mode + +Scrimmage testing applies to **implementation phase only**, not contract design: + +1. Contract-First: Design interfaces (no scrimmage testing) +2. Contract-First: User approves interfaces +3. 
Implementation begins → Scrimmage Mode activates + +### With Clearance Mode + +Add scrimmage criteria automatically: + +``` +CLEARANCE CRITERIA (auto-added by Scrimmage Mode): +□ All code changes passed scrimmage analysis +□ No BROKEN vectors remain unfixed +□ All POTENTIAL vectors either verified or acknowledged +□ No unaddressed escalations +``` + +## Quick reference + +### Activation + +``` +SCRIMMAGE MODE ON # Standard depth +SCRIMMAGE MODE ON, depth: deep # Deep testing +SCRIMMAGE MODE ON, depth: paranoid # Maximum scrutiny +``` + +### During session + +``` +> # Proceed (after all attacks pass) +skip # Skip scrimmage testing for this one change (requires justification) +deeper # Increase depth for current analysis +reattack # Re-run attacks after manual changes +``` + +### Key commands + +``` +show vectors # List all attack vectors for current change +add vector X # Add custom attack vector +focus category Y # Prioritize specific attack category +persist tests # Generate test cases from attacks +``` diff --git a/protocols/context/static/skills/shadow-mode/SKILL.md b/protocols/context/static/skills/shadow-mode/SKILL.md new file mode 100644 index 00000000..1be3b8d3 --- /dev/null +++ b/protocols/context/static/skills/shadow-mode/SKILL.md @@ -0,0 +1,891 @@ +--- +description: Propose-only editing with inverted control flow where agent proposes changes as diffs and human applies manually. Activate when user says "SHADOW MODE ON", "propose don't apply", "show me the diffs", "I'll apply manually", or "don't touch my files". Supports 5 output formats (unified diff, side-by-side, contextual, git patch, step-by-step instructions) and 4 granularity levels. Verifies correct application before proceeding. Ideal for learning, maximum transparency, or untrusted environments. +--- + +# Shadow Mode: *protocol* + +> <ins>***Goal:** agent proposes, human applies—inverted control*</ins> +> +> *Maximum transparency and learning through manual application. 
You will output all changes as reviewable diffs or patches, never apply them directly, and verify correct application before proceeding.* + +> [!IMPORTANT] +> +> The directives below are **non-negotiable hard constraints** to be followed **exactly as they are specified**. + +## Entry/exit protocols + +### Activation/deactivation + +When the user says anything like: + +- "SHADOW MODE ON" +- "propose don't apply" +- "show me the diffs" +- "I'll apply manually" +- "don't touch my files" +- "suggest only" + +*follow this Shadow Mode protocol* until you are told anything like: + +- "exit shadow mode" +- "SHADOW MODE OFF" +- "you can apply changes now" +- "auto-apply mode" + +If you are unsure, confirm unambiguously with the user. + +Upon exit, emit: + +``` +═══════════════════════════════════════ +Shadow Mode OFF +Changes proposed: N +Changes applied by user: M +Changes skipped: K +Verification failures: J +═══════════════════════════════════════ +``` + +### Output format preference + +Output format can be configured. Default is `unified`. + +| Format | Description | Best for | +|--------|-------------|----------| +| `unified` | Standard unified diff format | Git users, patch application | +| `side-by-side` | Before/after columns | Visual comparison | +| `contextual` | Change with surrounding code | Understanding context | +| `patch` | Git-applicable patch file | Version control integration | +| `instructions` | Step-by-step natural language | Manual editing, learning | + +Activate specific format: "SHADOW MODE ON, format: side-by-side" + +### Granularity + +Change granularity can be configured. Default is `logical`. 
+ +| Granularity | Description | When to use | +|-------------|-------------|-------------| +| `atomic` | One change at a time, smallest possible | Maximum control, learning | +| `logical` | Logically related changes grouped | Balance of control and efficiency | +| `file` | All changes to a file at once | Faster application | +| `batch` | Multiple files in one proposal | Experienced users, large refactors | + +Activate specific granularity: "SHADOW MODE ON, granularity: atomic" + +## Core contract + +### The shadow guarantee + +For every code modification: + +1. **Propose** the change as a reviewable artifact +2. **Never** apply changes directly to files +3. **Wait** for user confirmation of application +4. **Verify** the change was applied correctly +5. **Proceed** only after verification passes + +### What the agent outputs (never executes) + +| Change type | Output format | +|-------------|---------------| +| Code modification | Diff/patch showing exact changes | +| New file creation | Full file content with clear "CREATE FILE" header | +| File deletion | Clear "DELETE FILE" instruction with confirmation | +| File rename/move | Source and destination paths with any content changes | +| Configuration change | Before/after with explanation of impact | + +### What the user does + +1. **Reviews** the proposed change +2. **Applies** manually (copy-paste, patch command, IDE, etc.) +3. **Signals** completion: "applied", "done", ">", or "." +4. 
**Or rejects**: "skip", "no", "different approach" + +## Proposal format + +### Standard proposal structure + +Every change proposal must follow this format: + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [N of M] +File: <relative path> +Action: <MODIFY | CREATE | DELETE | RENAME> +Purpose: <one-line description> +══════════════════════════════════════════════════════════════════════ + +<diff or content based on format preference> + +══════════════════════════════════════════════════════════════════════ +INSTRUCTIONS: +<step-by-step application guide> + +When applied, respond: "applied" or ">" +To skip: "skip" or "no" +To modify approach: describe your preference +══════════════════════════════════════════════════════════════════════ +``` + +### Unified diff format (default) + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [1 of 3] +File: src/services/user_service.py +Action: MODIFY +Purpose: Add input validation to update_email method +══════════════════════════════════════════════════════════════════════ + +--- a/src/services/user_service.py ++++ b/src/services/user_service.py +@@ -45,4 +45,8 @@ class UserService: + def update_email(self, user_id: str, new_email: str) -> User: ++ # Validate email format ++ if not self._is_valid_email(new_email): ++ raise ValueError(f"Invalid email format: {new_email}") ++ + user = self.repository.get(user_id) + if not user: + raise UserNotFound(user_id) + +══════════════════════════════════════════════════════════════════════ +APPLY WITH: + Option A: Copy the added lines (46-49) into your editor + Option B: Save diff to file, run: git apply change.patch + +When applied, respond: "applied" or ">" +══════════════════════════════════════════════════════════════════════ +``` + +### Side-by-side format + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [1 of 3] +File: src/services/user_service.py
+Action: MODIFY +Purpose: Add input validation to update_email method +══════════════════════════════════════════════════════════════════════ + +BEFORE (lines 45-50) │ AFTER (lines 45-54) +────────────────────────────────────────┼──────────────────────────────────────── +def update_email(self, user_id, email): │ def update_email(self, user_id, email): + │ # Validate email format + │ if not self._is_valid_email(email): + │ raise ValueError(f"Invalid: {email}") + │ + user = self.repository.get(user_id) │ user = self.repository.get(user_id) + if not user: │ if not user: + raise UserNotFound(user_id) │ raise UserNotFound(user_id) + +══════════════════════════════════════════════════════════════════════ +INSTRUCTIONS: +1. Open src/services/user_service.py +2. Find the update_email method (around line 45) +3. Add the 4 new lines after the method signature +4. Save the file + +When applied, respond: "applied" or ">" +══════════════════════════════════════════════════════════════════════ +``` + +### Contextual format + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [1 of 3] +File: src/services/user_service.py +Action: MODIFY +Purpose: Add input validation to update_email method +══════════════════════════════════════════════════════════════════════ + +LOCATION: UserService class, update_email method + +FIND THIS CODE (around line 45): +┌─────────────────────────────────────────────────────────────────────┐ +│ def update_email(self, user_id: str, new_email: str) -> User: │ +│ user = self.repository.get(user_id) │ +│ if not user: │ +└─────────────────────────────────────────────────────────────────────┘ + +INSERT AFTER LINE 45 (after method signature, before user = ...): +┌─────────────────────────────────────────────────────────────────────┐ +│ # Validate email format │ +│ if not self._is_valid_email(new_email): │ +│ raise ValueError(f"Invalid email format: {new_email}") │ +│ │ 
+└─────────────────────────────────────────────────────────────────────┘ + +RESULT SHOULD LOOK LIKE: +┌─────────────────────────────────────────────────────────────────────┐ +│ def update_email(self, user_id: str, new_email: str) -> User: │ +│ # Validate email format │ +│ if not self._is_valid_email(new_email): │ +│ raise ValueError(f"Invalid email format: {new_email}") │ +│ │ +│ user = self.repository.get(user_id) │ +│ if not user: │ +└─────────────────────────────────────────────────────────────────────┘ + +══════════════════════════════════════════════════════════════════════ +When applied, respond: "applied" or ">" +══════════════════════════════════════════════════════════════════════ +``` + +### Patch format (git-applicable) + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [1 of 3] +File: src/services/user_service.py +Action: MODIFY +Purpose: Add input validation to update_email method +══════════════════════════════════════════════════════════════════════ + +Save the following to a file (e.g., change-001.patch): + +───────────────────────────────────────────────────────────────────── +From: Agent <agent@shadow.mode> +Date: Mon, 30 Dec 2024 12:00:00 +0000 +Subject: [PATCH] Add email validation to update_email + +--- + src/services/user_service.py | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/src/services/user_service.py b/src/services/user_service.py +index abc1234..def5678 100644 +--- a/src/services/user_service.py ++++ b/src/services/user_service.py +@@ -45,4 +45,8 @@ class UserService: + def update_email(self, user_id: str, new_email: str) -> User: ++ # Validate email format ++ if not self._is_valid_email(new_email): ++ raise ValueError(f"Invalid email format: {new_email}") ++ + user = self.repository.get(user_id) + if not user: + raise UserNotFound(user_id) +-- +2.34.1 +───────────────────────────────────────────────────────────────────── + +APPLY WITH: + git apply change-001.patch + # or + git
am change-001.patch (to create a commit) + +══════════════════════════════════════════════════════════════════════ +When applied, respond: "applied" or ">" +══════════════════════════════════════════════════════════════════════ +``` + +### Instructions format (natural language) + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [1 of 3] +File: src/services/user_service.py +Action: MODIFY +Purpose: Add input validation to update_email method +══════════════════════════════════════════════════════════════════════ + +STEP-BY-STEP INSTRUCTIONS: + +1. Open the file: src/services/user_service.py + +2. Navigate to the UserService class (around line 30) + +3. Find the update_email method (around line 45). It should look like: + + def update_email(self, user_id: str, new_email: str) -> User: + user = self.repository.get(user_id) + ... + +4. Position your cursor at the end of line 45 (after the colon) + +5. Press Enter to create a new line + +6. Add the following 4 lines (with proper indentation - 8 spaces): + + # Validate email format + if not self._is_valid_email(new_email): + raise ValueError(f"Invalid email format: {new_email}") + + (Note: there's a blank line after the raise statement) + +7. Save the file (Cmd+S / Ctrl+S) + +8. 
Verify: The method should now have validation before fetching the user + +══════════════════════════════════════════════════════════════════════ +When applied, respond: "applied" or ">" +══════════════════════════════════════════════════════════════════════ +``` + +### New file creation + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [2 of 3] +File: src/utils/validators.py +Action: CREATE (new file) +Purpose: Create email validation utility +══════════════════════════════════════════════════════════════════════ + +CREATE NEW FILE: src/utils/validators.py + +───────────────────────────────────────────────────────────────────── +"""Validation utilities for the application.""" +import re +from typing import Optional + + +EMAIL_REGEX = re.compile( + r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' +) + + +def is_valid_email(email: str) -> bool: + """ + Validate email format. + + Args: + email: The email address to validate + + Returns: + True if valid, False otherwise + """ + if not email or not isinstance(email, str): + return False + return EMAIL_REGEX.match(email) is not None + + +def sanitize_email(email: str) -> Optional[str]: + """ + Sanitize and normalize email address. + + Args: + email: The email address to sanitize + + Returns: + Normalized email or None if invalid + """ + if not is_valid_email(email): + return None + return email.lower().strip() +───────────────────────────────────────────────────────────────────── + +INSTRUCTIONS: +1. Create directory if needed: mkdir -p src/utils +2. Create the file: touch src/utils/validators.py +3. Open src/utils/validators.py in your editor +4. Paste the entire content above +5. 
Save the file + +══════════════════════════════════════════════════════════════════════ +When applied, respond: "applied" or ">" +══════════════════════════════════════════════════════════════════════ +``` + +### File deletion + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSED CHANGE [3 of 3] +File: src/utils/old_validators.py +Action: DELETE +Purpose: Remove deprecated validator module +══════════════════════════════════════════════════════════════════════ + +⚠️ DELETE FILE: src/utils/old_validators.py + +SAFETY CHECK: +- No other files import from this module (verified) +- Functionality has been migrated to src/utils/validators.py +- File contains 45 lines of deprecated code + +INSTRUCTIONS: + Option A (keep history): git rm src/utils/old_validators.py + Option B (immediate): rm src/utils/old_validators.py + +══════════════════════════════════════════════════════════════════════ +⚠️ This action is destructive. Confirm with "delete" or skip with "keep" +══════════════════════════════════════════════════════════════════════ +``` + +## Verification protocol + +### Post-application verification + +After user signals "applied": + +1. **Re-read the modified file** completely +2. **Compare** actual content with expected result +3. **Report** verification status + +### Verification report format + +``` +VERIFICATION: src/services/user_service.py + +Checking applied changes... + +✅ VERIFIED - Changes applied correctly + - Lines 46-49: Email validation added as expected + - Indentation: Correct (8 spaces) + - No unintended modifications detected + +Proceeding to next change... +``` + +### Verification failure + +If verification fails: + +``` +VERIFICATION: src/services/user_service.py + +Checking applied changes... 
+ +❌ MISMATCH DETECTED + +Expected (lines 46-49): +┌─────────────────────────────────────────────────────────────────────┐ +│ # Validate email format │ +│ if not self._is_valid_email(new_email): │ +│ raise ValueError(f"Invalid email format: {new_email}") │ +│ │ +└─────────────────────────────────────────────────────────────────────┘ + +Actual (lines 46-49): +┌─────────────────────────────────────────────────────────────────────┐ +│ # Validate email format │ +│ if not self._is_valid_email(new_email): │ +│ raise ValueError("Invalid email format") │ +│ │ +└─────────────────────────────────────────────────────────────────────┘ + +DIFFERENCE: Line 48 - f-string formatting missing, error message + won't include the invalid email value + +OPTIONS: + [F] Fix it - I'll re-propose with correction guidance + [A] Accept as-is - Proceed with your version + [R] Retry - Show the proposal again + +Your choice: _ +``` + +### Partial application + +If user applies only part of a multi-part change: + +``` +VERIFICATION: src/services/user_service.py + +Checking applied changes... 
+ +⚠️ PARTIAL APPLICATION + +Applied: + ✅ Lines 46-47: Comment and if statement + +Missing: + ❌ Lines 48-49: raise statement and blank line + +This may cause: SyntaxError (if block without body) + +OPTIONS: + [C] Complete - Show remaining lines to add + [U] Undo - Show how to remove partial changes + [A] Accept - I'll adapt to work with partial (not recommended) + +Your choice: _ +``` + +## User responses + +### Standard responses + +| User says | Agent action | +|-----------|--------------| +| `applied`, `done`, `>`, `.` | Verify, then proceed to next | +| `skip`, `no`, `pass` | Skip this change, proceed to next | +| `undo`, `revert` | Show undo instructions | +| `retry`, `again` | Re-propose same change | +| `different`, `alternative` | Propose alternative approach | +| `explain` | Explain the change in detail | +| `why` | Explain rationale for the change | +| `batch` | Combine remaining changes into one proposal | +| `pause` | Save progress, exit shadow mode temporarily | + +### Handling "skip" + +When user skips a change: + +``` +SKIPPED: Change [2 of 5] - Add email validation + +Note: Skipping may affect subsequent changes. + Changes [3] and [4] depend on this validation existing. 
+ +Options: + [C] Continue anyway - I'll adapt remaining changes + [D] Show dependencies - See what else is affected + [R] Reconsider - Show the change again + +Your choice: _ +``` + +### Handling "undo" + +When user requests undo after misapplication: + +``` +UNDO GUIDANCE: src/services/user_service.py + +To revert the last change: + +Option A - If you haven't saved yet: + Cmd+Z / Ctrl+Z in your editor + +Option B - If using git: + git checkout -- src/services/user_service.py + +Option C - Manual restoration: + Remove lines 46-49 (the validation block you just added) + + The file should return to: + ┌─────────────────────────────────────────────────────────────────┐ + │ def update_email(self, user_id: str, new_email: str) -> User: │ + │ user = self.repository.get(user_id) │ + │ if not user: │ + └─────────────────────────────────────────────────────────────────┘ + +When reverted, respond: "reverted" or "undone" +``` + +## Multi-change proposals + +### Logical grouping (default) + +Related changes are grouped logically: + +``` +══════════════════════════════════════════════════════════════════════ +PROPOSAL SET: Add email validation feature +Contains: 3 related changes +══════════════════════════════════════════════════════════════════════ + +[1/3] CREATE: src/utils/validators.py + Purpose: Email validation utility functions + +[2/3] MODIFY: src/services/user_service.py + Purpose: Import and use validators + +[3/3] MODIFY: tests/test_user_service.py + Purpose: Add validation tests + +══════════════════════════════════════════════════════════════════════ +Viewing: Change [1/3] +Navigation: "next" / "prev" / "list" / "apply all" +══════════════════════════════════════════════════════════════════════ +``` + +### Batch application + +If user requests batch mode: + +``` +══════════════════════════════════════════════════════════════════════ +BATCH PROPOSAL: All remaining changes +══════════════════════════════════════════════════════════════════════ + +CHANGE 1: 
src/utils/validators.py (CREATE) +───────────────────────────────────────── +[full content...] + +CHANGE 2: src/services/user_service.py (MODIFY) +───────────────────────────────────────── +[diff...] + +CHANGE 3: tests/test_user_service.py (MODIFY) +───────────────────────────────────────── +[diff...] + +══════════════════════════════════════════════════════════════════════ +BATCH APPLY INSTRUCTIONS: + +1. Create src/utils/validators.py with content from Change 1 +2. Modify src/services/user_service.py per Change 2 +3. Modify tests/test_user_service.py per Change 3 + +Or apply as git patches: + cat changes/*.patch | git apply + +When all applied, respond: "applied all" +══════════════════════════════════════════════════════════════════════ +``` + +## Version control integration + +### Git patch generation + +When format is `patch`, generate git-applicable patches: + +```bash +# Apply single patch +git apply change-001.patch + +# Apply with commit +git am change-001.patch + +# Apply all patches in order +for patch in changes/*.patch; do git apply "$patch"; done + +# Preview what would change +git apply --stat change-001.patch +git apply --check change-001.patch # dry run +``` + +### Stash workflow + +For complex changes, suggest stash workflow: + +``` +RECOMMENDED WORKFLOW (complex change set): + +1. Stash current work (if any): + git stash push -m "WIP before shadow mode changes" + +2. Apply changes one-by-one as I propose them + +3. When complete, create a commit: + git add -A && git commit -m "Apply shadow mode changes" + +4. 
Restore stashed work (if any): + git stash pop +``` + +## Session management + +### Progress tracking + +Track progress through change set: + +``` +══════════════════════════════════════════════════════════════════════ +SHADOW MODE PROGRESS +══════════════════════════════════════════════════════════════════════ + +Task: Implement email validation feature + +Changes: + [1] ✅ APPLIED src/utils/validators.py (CREATE) + [2] ✅ APPLIED src/services/user_service.py (MODIFY) + [3] ⏳ CURRENT tests/test_user_service.py (MODIFY) + [4] ⏸️ PENDING src/api/routes.py (MODIFY) + [5] ⏸️ PENDING docs/api.md (MODIFY) + +Progress: 2/5 complete (40%) +══════════════════════════════════════════════════════════════════════ +``` + +### Pause and resume + +When user says "pause": + +``` +⏸️ SHADOW MODE PAUSED + +Progress saved: + - Completed: 2 changes + - Current: Change [3] (not yet applied) + - Remaining: 2 changes + +Resume with: "SHADOW MODE ON, continue" or "resume shadow mode" + +Session stored: [memory location or "chat context only"] +``` + +### Session persistence + +**If Qdrant MCP available:** +``` +metadata.type: "shadow_mode_session" +metadata.task: <task description> +metadata.project: <project name> +metadata.created_at: <ISO timestamp> +content: { + changes: [...], + progress: { completed: [], current: N, pending: [] }, + format: "unified", + granularity: "logical" +} +``` + +## Error handling + +### Application errors + +If user reports an error after applying: + +``` +USER: "applied but getting syntax error" + +AGENT: +ERROR DIAGNOSIS + +Let me re-read the file to understand what happened... + +[Reads file] + +FOUND: Indentation error on line 47 + +The issue: Mixed tabs and spaces. Your editor inserted tabs + but the file uses spaces. + +FIX: +1. Select lines 46-49 +2. Convert indentation to spaces (in VS Code: Cmd+Shift+P → + "Convert Indentation to Spaces") +3. 
Ensure 8 spaces before each line + +Or delete lines 46-49 and I'll re-propose with explicit +indentation markers. + +Choice: [fix manually / re-propose / help]: _ +``` + +### Merge conflicts + +If user has modified the target area: + +``` +USER: "applied" (but file has diverged) + +AGENT: +⚠️ VERIFICATION: Unexpected content detected + +The file has changed since I analyzed it. It appears you (or +another process) modified the area around line 45. + +CURRENT FILE (lines 43-50): +┌─────────────────────────────────────────────────────────────────┐ +│ def update_email(self, user_id: str, new_email: str) -> User: │ +│ # Added by user: logging │ +│ logger.info(f"Updating email for {user_id}") │ +│ # My proposed validation (applied) │ +│ if not self._is_valid_email(new_email): │ +│ raise ValueError(f"Invalid email format: {new_email}") │ +└─────────────────────────────────────────────────────────────────┘ + +This looks correct - you added logging AND my validation. + +Options: + [A] Accept - This merged version is correct + [R] Revert - Show me how to undo to original + [D] Different - I want a different arrangement + +Your choice: _ +``` + +## Compatibility with other modes + +### With Micro Mode + +Shadow mode can wrap micro mode edits: + +``` +[Micro Mode step] → [Generate diff, don't apply] → [User applies] → +[Verify] → [⏸️ micro mode pause] +``` + +Each micro edit becomes a shadow proposal. + +### With Scrimmage Mode + +Run scrimmage analysis on proposed changes before user applies: + +``` +PROPOSED CHANGE [1 of 3] +[diff...] + +SCRIMMAGE ANALYSIS (pre-application): + ✅ No injection vulnerabilities + ✅ Handles null input + ⚠️ Consider: What if email contains unicode? + +Apply with confidence: This change passes scrimmage review. +``` + +### With Blast Radius Mode + +Show blast radius in proposal: + +``` +PROPOSED CHANGE [1 of 3] +[diff...] 
+ +BLAST RADIUS: + - Direct callers: 3 functions + - Test coverage: 2 test files + - No breaking API changes + +Change is low-risk. Apply when ready. +``` + +### With Safeguard Mode + +Verify invariants on proposed (not yet applied) changes: + +``` +PROPOSED CHANGE [1 of 3] +[diff...] + +INVARIANT CHECK (pre-application): + ✅ non_negative_balance: Not affected + ✅ user_email_unique: Validation strengthens this + ✅ audit_trail_complete: Not affected + +All invariants will be preserved. Apply when ready. +``` + +## Quick reference + +### Activation + +``` +SHADOW MODE ON # Default (unified, logical) +SHADOW MODE ON, format: side-by-side # Visual comparison +SHADOW MODE ON, format: instructions # Step-by-step natural language +SHADOW MODE ON, granularity: atomic # One change at a time +SHADOW MODE ON, format: patch # Git-applicable patches +``` + +### During session + +``` +applied / done / > / . # Confirm application, verify, proceed +skip / no / pass # Skip current change +retry / again # Re-show current proposal +undo / revert # Show undo instructions +explain / why # Explain change rationale +batch # Combine remaining into one +next / prev # Navigate multi-change proposals +list # Show all changes in set +pause # Save progress, exit temporarily +``` + +### Navigation (multi-change) + +``` +next # View next change +prev # View previous change +list # Overview of all changes +goto N # Jump to change N +apply all # Batch apply remaining +``` diff --git a/protocols/context/static/tools-guide.md b/protocols/context/static/tools-guide.md index 768b7279..4f58f1ae 100644 --- a/protocols/context/static/tools-guide.md +++ b/protocols/context/static/tools-guide.md @@ -1,39 +1,35 @@ # Tools: quick decision guide > [!NOTE] -> +> > This guide: -> +> > - Provides quick directives/heuristics as to which tools to use per task. 
> - Contains links to: > > | File set | Look at these files <ins>only if</ins> *(to save context)*: | > | --- | --- | -> | Per‑category guides | Your exact, desired use case is not covered here | -> | Per‑MCP deep dives | When you need full guidance on the intricacies of using a particular MCP's toolset | > | Editing mode files | When user explicitly activates | ## Code search -- For going through **public, open‑source code**: use Sourcegraph ([deep dive](deep-dives/sourcegraph.md)) to find examples/patterns (interactive time/result limits) +- For going through **public, open‑source code**: use Sourcegraph to find examples/patterns (interactive time/result limits) - **Within local codebases**: - - For **semantic navigation/symbol-level refactors**: use Serena MCP ([deep dive](deep-dives/serena.md)) + - For **semantic navigation/symbol-level refactors**: use Serena MCP - For **simple text searches**: use ripgrep/grep to find plain text/regex matches fast (respects .gitignore) -> Link: [Full category guide - *code search*](by-category/code-search.md) ## Web research -- For **general web info**: use Tavily (cited results; 1k/mo) ([deep dive](deep-dives/tavily.md)) +- For **general web info**: use Tavily (cited results; 1k/mo) - - Fallback if *Tavily is exhausted*: use Brave (2k/mo) ([deep dive](deep-dives/brave.md)) + - Fallback if *Tavily is exhausted*: use Brave (2k/mo) - Fallback if *Tavily and Brave are exhausted*: use Playwright to use the browser to search the web (unlimited) -- For **simple URL fetches**: use Fetch (unlimited) ([deep dive](deep-dives/fetch.md)) +- For **simple URL fetches**: use Fetch (unlimited) -> Link: [full category guide - *web research*](by-category/web-research.md) ## GitHub access @@ -54,9 +50,8 @@ ## API docs -- For **official documentation**: use Context7 (versioned; public repos only) ([deep dive](deep-dives/context7.md)) +- For **official documentation**: use Context7 (versioned; public repos only) -> Link: [full category guide - 
*API docs*](by-category/documentation.md) ## Memory @@ -134,7 +129,6 @@ > | Serena MCP | *None required; automatically created* | > | claude-mem (Claude Code *only*) | *None required; automatically created* | -> Link: [full category guide - *memory MCPs*](by-category/memory.md) ## Code analysis, editing and Git @@ -159,10 +153,6 @@ - For *all other operations*, default to your **native/built-in Write/Edit tool(s)**. -> [!TIP] -> -> To resolve any remaining ambiguities, see the [Serena deep dive](deep-dives/serena.md) for the *full* decision tree for symbol vs text-based editing. - ### Git operations Use (via Bash): @@ -172,13 +162,11 @@ Use (via Bash): ### Security and quality scans -Use **Semgrep** (local scans; autofix) *(for more info, see the [Semgrep deep dive](deep-dives/semgrep.md))* +Use **Semgrep** (local scans; autofix) ## Browser automation -- For **web automation and testing**: use Playwright (click, type, navigate, extract content) ([deep dive](deep-dives/playwright.md)) - -> Link: [full category guide - *browser automation*](by-category/browser-automation.md) +- For **web automation and testing**: use Playwright (click, type, navigate, extract content) ## Limits @@ -189,20 +177,3 @@ All non-listed MCPs are local and/or have no usage limits. | Tavily | 1,000 credits/month | Resets on 1st of month | | Brave | 2,000 queries/month | Free tier; basic web search | | Sourcegraph | Interactive limits | use count:all to make the search exhaustive, bump timeout if needed; switch to src-cli for very large result sets beyond the UI display limit. | - -## Editing modes - -> [!IMPORTANT] -> -> You must continuously watch for any activation inputs provided by the user as mentioned below in order to read the protocol file for and activate the correct editing mode. -> -> If unsure/unambiguous, confirm clearly. 
- -| Editing mode | Activate when user says *anything like* | Full protocol file (read *only* when activated) | -| --- | --- | --- | -| Micro Mode | "MICRO MODE ON", "complete/implement in micro mode", etc. | [`modes/micro.md`](modes/micro.md) | -| Adversarial Mode | "ADVERSARIAL MODE ON", "attack your own code", "red-team this", etc. | [`modes/adversarial.md`](modes/adversarial.md) | -| Blast Radius Mode | "BLAST RADIUS MODE ON", "analyze impact", "careful mode", etc. | [`modes/blast-radius.md`](modes/blast-radius.md) | -| Invariant Guard Mode | "INVARIANT GUARD MODE ON", "protect these invariants", "verify invariants", etc. | [`modes/invariant-guard.md`](modes/invariant-guard.md) | -| Shadow Mode | "SHADOW MODE ON", "propose don't apply", "show me the diffs", "I'll apply manually", etc. | [`modes/shadow.md`](modes/shadow.md) | -| Exit Criteria Mode | "EXIT CRITERIA MODE ON", "define done as", "success criteria first", "verify completion", etc. | [`modes/exit-criteria.md`](modes/exit-criteria.md) | diff --git a/protocols/context/templates/AGENTS.template.md b/protocols/context/templates/AGENTS.template.md index 37c7476e..0d81aa26 100644 --- a/protocols/context/templates/AGENTS.template.md +++ b/protocols/context/templates/AGENTS.template.md @@ -6,26 +6,7 @@ You ***must* read these essential files** using the appropriate read tool: - before starting any task - at the beginning of any conversation -### [MCP tools available: quick reference]({{REPO_ROOT}}/protocols/context/static/tools-guide.md) - -> **Read**: `@{{REPO_ROOT}}/protocols/context/static/tools-guide.md` - -Contains: - - - Fast tool selection guide (Tier 1) - - ~330 tokens, always worth reading - - Covers: code search, web research, API docs, memory, file operations - -### [Handoff guidelines]({{REPO_ROOT}}/protocols/context/static/handoff-guide.md) - -> **Read**: `@{{REPO_ROOT}}/protocols/context/static/handoff-guide.md` - -Covers: - - - When to delegate work to other agents/models using PAL's `clink` 
MCP tool - - How to select the model/CLI to use when spawning agents (Codex/Gemini/Claude) - - When to ask user vs handle directly - - What requires explicit approval +{{MUST_READ_SECTION}} > [!IMPORTANT] > You must read these files *every time* (even when spawned with a specialized role) since they provide: @@ -102,9 +83,3 @@ You are running via Gemini CLI or Codex, not Claude Code. This means: - You have access to the same MCPs as Claude Code - Role prompts are loaded from `~/.pal/cli_clients/systemprompts/role-prompts/` - Configurations are in `~/.pal/cli_clients/*.json` - -## Superpowers System *(Codex-only)* - -<EXTREMELY_IMPORTANT> -You have superpowers. Superpowers teach you new skills and capabilities. RIGHT NOW run: `~/.codex/superpowers/.codex/superpowers-codex bootstrap` and follow the instructions it returns. -</EXTREMELY_IMPORTANT> \ No newline at end of file diff --git a/protocols/context/templates/CLAUDE.template.md b/protocols/context/templates/CLAUDE.template.md index a61ac9ca..f761226a 100644 --- a/protocols/context/templates/CLAUDE.template.md +++ b/protocols/context/templates/CLAUDE.template.md @@ -6,26 +6,7 @@ You ***must* read these essential files** using the Read tool: - before starting any task - at the beginning of any conversation -### [MCP tools available: quick reference]({{REPO_ROOT}}/protocols/context/static/tools-guide.md) - -> **Read**: `@{{REPO_ROOT}}/protocols/context/static/tools-guide.md` - -Contains: - - - Fast tool selection guide (Tier 1) - - ~330 tokens, always worth reading - - Covers: code search, web research, API docs, memory, file operations - -### [Handoff guidelines]({{REPO_ROOT}}/protocols/context/static/handoff-guide.md) - -> **Read**: `@{{REPO_ROOT}}/protocols/context/static/handoff-guide.md` - -Covers: - - - When to delegate work to subagents (using either PAL's `clink` or the Claude subagent *Task* tool) - - How to select the model/CLI to use with the subagent that you spawn (Codex/Gemini/Claude) - - When to ask 
user vs handle directly - - What requires explicit approval +{{MUST_READ_SECTION}} > [!IMPORTANT] > You must read these files *every time* (even when spawned as a specialized agent) since they provide: @@ -109,4 +90,4 @@ Context auto-compacts as it approaches limits. For large tasks: 3. **Be efficient** - Progressive file reading (offset/limit), reference stored memories 4. **Never truncate** - If incomplete, store progress and delegate remainder with clear handoff -**Task completion > token efficiency. Use delegation + memory to achieve both.** \ No newline at end of file +**Task completion > token efficiency. Use delegation + memory to achieve both.** diff --git a/protocols/pal/settings.yaml b/protocols/pal/settings.yaml index ef0c5575..3ded928a 100644 --- a/protocols/pal/settings.yaml +++ b/protocols/pal/settings.yaml @@ -2,12 +2,12 @@ # # This file defines how to generate PAL-specific configuration file for each CLI. # The generator script protocols/scripts/generate-pal-configs.py script reads this file -# and produces one JSON config per CLI, using model settings from directives.yml ( +# and produces one JSON config per CLI, using model settings from defaults.yml ( # pal.<cli> sections, e.g., pal.claude.model). # # Role discovery is automatic: all .md files in agents/role-prompts/ are # included unless explicitly filtered via YML configs. -# See documentation or directives.yml for details. +# See documentation or defaults.yml for details. 
# # ┌─────────────────────────────────────────────────────────────────────────────┐ # │ SECURITY NOTE │ diff --git a/protocols/scripts/set-up-configs.sh b/protocols/scripts/set-up-protocols.sh similarity index 70% rename from protocols/scripts/set-up-configs.sh rename to protocols/scripts/set-up-protocols.sh index a3d92591..bbac2684 100755 --- a/protocols/scripts/set-up-configs.sh +++ b/protocols/scripts/set-up-protocols.sh @@ -1,16 +1,10 @@ #!/usr/bin/env bash # -# Setup script for global config files -# Generates config files from templates and creates symlinks for portability +# Setup script for global/user-scoped config/context files +# Handles generating config files from templates and creating symlinks for portability set -euo pipefail -# Color codes for output -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -NC='\033[0m' # No Color - CONTEXT_DIRNAME="context" TEMPLATES_DIRNAME="templates" GENERATED_DIRNAME="generated" @@ -22,40 +16,20 @@ CONTEXT_TEMPLATES="$(cd "$CONFIGS_DIR/$CONTEXT_DIRNAME/$TEMPLATES_DIRNAME/" && p # Create generated directory if it doesn't exist mkdir -p "$CONFIGS_DIR/$CONTEXT_DIRNAME/$GENERATED_DIRNAME" -CONTEXT_GENERATED="$(cd "$CONFIGS_DIR/$CONTEXT_DIRNAME/$GENERATED_DIRNAME/" && pwd)" +CONTEXT_GENERATED="$(cd "$CONFIGS_DIR/$CONTEXT_DIRNAME/$GENERATED_DIRNAME/" && pwd)" -# Source agent selection library +# Source internal Bureau libraries source "$REPO_ROOT/bin/lib/agent-selection.sh" +source "$REPO_ROOT/bin/lib/logging.sh" -# Detect installed CLIs based on directory existence +# Detect installed CLIs based on directory existence # (exits if none found, logs detected CLIs) discover_agents -echo -e "${GREEN}User-level agent context files setup${NC}" -echo -e "Repo root: $REPO_ROOT" -echo -e "Selected agents: ${AGENTS[*]}" -echo "" - -# Function to print step headers -print_step() { - echo -e "${YELLOW}==>${NC} $1" -} - -# Function to print success -print_success() { - echo -e "${GREEN}✓${NC} $1" -} - -# Function to print 
warning -print_warning() { - echo -e "${YELLOW}⚠${NC} $1" -} - -# Function to print error and exit -print_error() { - echo -e "${RED}✗${NC} $1" >&2 - exit 1 -} +log_banner "User-level agent context files setup" +echo "Repo root: $REPO_ROOT" +echo "Selected agents: ${AGENTS[*]}" +log_empty_line # Function to safely create symlink # Args: @@ -72,75 +46,75 @@ create_safe_symlink() { local current_link current_link="$(readlink "$target")" if [[ "$current_link" == "$source" ]]; then - print_warning "Symlink already exists: $target -> $source" + log_warning "Symlink already exists: $target -> $source" return 0 else # Points to wrong location - remove it - print_warning "Removing incorrect symlink: $target -> $current_link" + log_warning "Removing incorrect symlink: $target -> $current_link" rm "$target" fi elif [[ -f "$target" ]]; then # It's a regular file - backup before removing local backup backup="${target}.backup.$(date +%Y%m%d_%H%M%S)" - print_warning "Backing up existing file: $target -> $backup" + log_warning "Backing up existing file: $target -> $backup" mv "$target" "$backup" elif [[ -e "$target" ]]; then # Exists but not a file or symlink (directory?) - print_error "Cannot create symlink: $target exists and is not a file or symlink" + log_error "Cannot create symlink: $target exists and is not a file or symlink" fi # Create the symlink ln -s "$source" "$target" - print_success "Created symlink: $target -> $source" + log_success "Created symlink: $target -> $source" } # Check if we're in the right place if [[ ! -f "$CONTEXT_TEMPLATES/AGENTS.template.md" ]] || [[ ! -f "$CONTEXT_TEMPLATES/CLAUDE.template.md" ]]; then - print_error "Cannot find config template files. Please run this script from within the repository." + log_error "Cannot find config template files. Please run this script from within the repository." 
fi # ============================================================================ # Generate config files from templates (in repo) # ============================================================================ -print_step "Generating config files from templates" +log_action "Generating config files from templates" # Generate AGENTS.md in repo (for Gemini CLI & Codex) sed "s|{{REPO_ROOT}}|$REPO_ROOT|g" "$CONTEXT_TEMPLATES/AGENTS.template.md" > "$CONTEXT_GENERATED/AGENTS.md" -print_success "Generated $CONTEXT_GENERATED/AGENTS.md from template" +log_success "Generated $CONTEXT_GENERATED/AGENTS.md from template" # Generate CLAUDE.md in repo (for Claude Code) sed "s|{{REPO_ROOT}}|$REPO_ROOT|g" "$CONTEXT_TEMPLATES/CLAUDE.template.md" > "$CONTEXT_GENERATED/CLAUDE.md" -print_success "Generated $CONTEXT_GENERATED/CLAUDE.md from template" +log_success "Generated $CONTEXT_GENERATED/CLAUDE.md from template" echo "" # ============================================================================ # Create symlinks from CLI config locations to repo files # ============================================================================ -print_step "Creating symlinks to generated config files" +log_action "Creating symlinks to generated config files" # Symlink for Gemini CLI if agent_enabled "Gemini CLI"; then create_safe_symlink "$CONTEXT_GENERATED/AGENTS.md" "$HOME/.gemini/GEMINI.md" else - print_step "Skipping Gemini symlink (user-scoped CLI directory not found)" + log_action "Skipping Gemini symlink (user-scoped CLI directory not found)" fi # Symlink for Codex if agent_enabled "Codex"; then create_safe_symlink "$CONTEXT_GENERATED/AGENTS.md" "$HOME/.codex/AGENTS.md" else - print_step "Skipping Codex symlink (user-scoped CLI directory not found)" + log_action "Skipping Codex symlink (user-scoped CLI directory not found)" fi # Symlink for Claude Code if agent_enabled "Claude Code"; then create_safe_symlink "$CONTEXT_GENERATED/CLAUDE.md" "$HOME/.claude/CLAUDE.md" else - print_step 
"Skipping Claude symlink (user-scoped CLI directory not found)" + log_action "Skipping Claude symlink (user-scoped CLI directory not found)" fi -echo -e "${GREEN}✓ Config files setup complete!${NC}" +log_success "Config files setup complete!" echo "" # Show verification steps only for enabled agents @@ -164,32 +138,33 @@ if agent_enabled "Codex"; then fi echo "Configured CLIs now have access to:" -echo " - Handoff guidelines (delegation rules)" -echo " - Compact MCP list (tool selection guide)" +echo " - All .md files from $BUREAU_PROTOCOLS_DIR" +echo " (default: tools-guide, handoff-guide, code-standards)" echo "" echo "Context files are symlinked from $CONTEXT_GENERATED/" -echo "To update these:" -echo " 1. Edit the templates in $CONTEXT_TEMPLATES" -echo " 2. Re-run this script" +echo "To customize agent context:" +echo " 1. Edit/add/remove .md files in $BUREAU_PROTOCOLS_DIR" +echo " 2. Re-run this script (or bin/open-bureau)" +echo " To restore defaults: bin/reset-protocols" echo "" # ============================================================================ # Generate and symlink PAL CLI configs # ============================================================================ -print_step "Generating PAL per-CLI config files (used when coding CLIs are called via clink)" +log_action "Generating PAL per-CLI config files (used when coding CLIs are called via clink)" PAL_GENERATED_DIR="$REPO_ROOT/protocols/pal/generated" PAL_CLI_CLIENTS_DIR="$HOME/.pal/cli_clients" # Generate PAL CLI configs from settings.yaml (auto-discovers roles from agents/role-prompts/) if uv run python "$REPO_ROOT/protocols/scripts/generate-pal-configs.py"; then - print_success "PAL per-CLI config files generated in $PAL_GENERATED_DIR" + log_success "PAL per-CLI config files generated in $PAL_GENERATED_DIR" else - print_warning "Failed to generate PAL per-CLI config files - using existing files" + log_warning "Failed to generate PAL per-CLI config files - using existing files" fi echo "" 
-print_step "Symlinking PAL per-CLI config files to $PAL_CLI_CLIENTS_DIR" +log_action "Symlinking PAL per-CLI config files to $PAL_CLI_CLIENTS_DIR" mkdir -p "$PAL_CLI_CLIENTS_DIR" @@ -207,9 +182,9 @@ for json_file in "$PAL_GENERATED_DIR/"*.json; do done if [[ $symlink_count -gt 0 ]]; then - print_success "PAL per-CLI config files symlinked ($symlink_count files)" + log_success "PAL per-CLI config files symlinked ($symlink_count files)" else - print_warning "No PAL per-CLI config files found to symlink" + log_warning "No PAL per-CLI config files found to symlink" fi echo "" @@ -220,6 +195,20 @@ echo " 2. Re-run this script (or bin/open-bureau, which calls this script)" echo " 3. Restart your coding CLIs (or use their internal MCP-related commands if possible) to reconnect to PAL" echo "" +# ============================================================================ +# Set up Bureau editing mode skills +# ============================================================================ +log_action "Setting up Bureau editing mode skills" + +SCRIPTS_DIR="$(dirname "${BASH_SOURCE[0]}")" +if [[ -x "$SCRIPTS_DIR/set-up-skills.sh" ]]; then + "$SCRIPTS_DIR/set-up-skills.sh" +else + log_warning "set-up-skills.sh not found or not executable; skipping skill setup" +fi + +echo "" + echo "────────────────────────────────────────────────────────────────────────────────" echo "⚠️ IMPORTANT SECURITY NOTE" echo @@ -229,7 +218,7 @@ echo " • Codex: --dangerously-bypass-approvals-and-sandbox" echo " • Gemini: --yolo" echo "" echo "This is intentional for automation, especially given subagents are only spawned" -echo "by user-run agents that they trust. However, be aware:" +echo "by *users* from within agents that they've launched. 
However, remain aware:" echo "" echo " • Review changes after delegation completes (git diff)" echo " • Don't delegate tasks you wouldn't run yourself" diff --git a/protocols/scripts/set-up-skills.sh b/protocols/scripts/set-up-skills.sh new file mode 100755 index 00000000..ea4c8e64 --- /dev/null +++ b/protocols/scripts/set-up-skills.sh @@ -0,0 +1,217 @@ +#!/usr/bin/env bash +# +# Purpose: +# - Installs Bureau's skills to all supported coding agents +# - Handles all Bureau-supported coding agents using the following user-scoped skills dirs: +# - Claude Code: ~/.claude/skills/ +# - OpenCode: ~/.config/opencode/skill/ +# - Gemini CLI: ~/.gemini/skills/ +# - Codex: ~/.agents/skills/ +# +# Args: +# --dry-run Show what would be done without making changes +# --uninstall Remove all Bureau skill installs + +set -e + +# Constants for functionality +REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" # we are in protocols/scripts/, so move up 2 parents + +# Source internal Bureau libraries +source "$REPO_ROOT/bin/lib/logging.sh" + +DRY_RUN=false +UNINSTALL=false +SKILLS_CONFIG_PATH="$REPO_ROOT/protocols/context/generated/skills-config.generated.json" +SKILLS_CONFIG_GENERATOR="$REPO_ROOT/protocols/scripts/generate-skills-config.py" + +# Skill directory locations for each CLI +CLAUDE_SKILLS_DIR="$HOME/.claude/skills" +OPENCODE_SKILLS_DIR="$HOME/.config/opencode/skill" +CODEX_SKILLS_DIR="$HOME/.agents/skills" +LEGACY_CODEX_SKILLS_DIR="$HOME/.codex/skills" +GEMINI_SKILLS_DIR="$HOME/.gemini/skills" +BUREAU_SKILL_PREFIX="bureau-" + +# Parse arguments +while [[ $# -gt 0 ]]; do + case $1 in + -n|--dry-run) + DRY_RUN=true + shift + ;; + -u|--uninstall) + UNINSTALL=true + shift + ;; + -h|--help) + sed -n '2,/^$/p' "$0" | sed 's/^# //g' + exit 0 + ;; + *) + echo "Unknown option: $1" + exit 1 + ;; + esac +done + +ensure_dir() { + local dir=$1 + + if [[ "$DRY_RUN" == true ]]; then + log_action "Would create:" "$dir" + else + mkdir -p "$dir" + fi +} + 
+set_up_bureau_skill_dirs() { + local skill_conf_dir=$1 + ensure_dir "$skill_conf_dir" + + for ((i = 0; i < ${#SKILL_SOURCE_DIRS[@]}; i++)); do + local skill_name + local skill_install_subdir # install path relative to the CLI's skill config dir + local skill_install_dir # full install path + local skill_prefix + local skill_source_dir + + skill_name="${SKILL_NAMES[$i]}" + skill_prefix="${SKILL_PREFIXES[$i]}" + skill_source_dir="${SKILL_SOURCE_DIRS[$i]}" + + if [[ ! -d "$skill_source_dir" ]]; then + log_warning "Skipped missing skill dir: $skill_source_dir" + continue + fi + + skill_install_subdir="${skill_prefix}${skill_name}" + skill_install_dir="$skill_conf_dir/$skill_install_subdir" + + if [[ $DRY_RUN == true ]]; then + log_action "Would link from:" "$skill_install_dir" + continue + fi + + ln -sfn "$skill_source_dir" "$skill_install_dir" + log_success "$skill_install_dir" + done +} + +remove_bureau_skill_dirs() { + local skill_conf_dir=$1 + + if [[ ! -d "$skill_conf_dir" ]]; then + return + fi + + for entry in "$skill_conf_dir"/${BUREAU_SKILL_PREFIX}*; do + if [[ ! -e "$entry" ]]; then + continue + fi + + if [[ -L "$entry" ]]; then + if [[ "$DRY_RUN" == true ]]; then + log_action "Would remove:" "$entry" + else + rm -f "$entry" + log_success "Removed: $entry" + fi + elif [[ -d "$entry" ]]; then + if [[ "$DRY_RUN" == true ]]; then + log_action "Would remove:" "$entry/" + else + rm -rf "$entry" + log_success "Removed: $entry/" + fi + elif [[ -e "$entry" ]]; then + if [[ "$DRY_RUN" == true ]]; then + log_action "Would remove:" "$entry" + else + rm -f "$entry" + log_success "Removed: $entry" + fi + fi + done +} + +# Ensure jq is available +if ! command -v jq >/dev/null 2>&1; then + log_error "jq is required to parse $SKILLS_CONFIG_PATH" + exit 1 +fi + +# Generate skills config +if ! uv run python "$SKILLS_CONFIG_GENERATOR"; then + log_error "Failed to generate skills config: $SKILLS_CONFIG_GENERATOR" + exit 1 +fi + +if [[ ! 
-f "$SKILLS_CONFIG_PATH" ]]; then + log_error "Skills config not found: $SKILLS_CONFIG_PATH" + exit 1 +fi + +# Get list of skill directories (source dirs don't have bureau- prefix) +SKILL_DIRS=($(find "$BUREAU_SKILLS_DIR" -maxdepth 1 -mindepth 1 -type d | sort)) + +if [[ ${#SKILL_NAMES[@]} -eq 0 ]]; then + log_error "No skills found in $SKILLS_CONFIG_PATH" + exit 1 +fi + +log_empty_line +log_banner "Bureau skill installation" +echo "Config: $SKILLS_CONFIG_PATH" +echo "Skills found: ${#SKILL_NAMES[@]}" +log_empty_line + +if [[ "$DRY_RUN" == true ]]; then + log_warning "Dry run mode: no changes will be made" + log_empty_line +fi + +log_warning "Removing existing Bureau skill installs..." +log_empty_line + +# Always remove Bureau-prefixed skills to keep installs consistent +# and avoid stale entries +remove_bureau_skill_dirs "$CLAUDE_SKILLS_DIR" +remove_bureau_skill_dirs "$OPENCODE_SKILLS_DIR" +remove_bureau_skill_dirs "$CODEX_SKILLS_DIR" +remove_bureau_skill_dirs "$LEGACY_CODEX_SKILLS_DIR" +remove_bureau_skill_dirs "$GEMINI_SKILLS_DIR" + +log_empty_line +log_success "Bureau skills uninstalled." +log_empty_line + +if [[ "$UNINSTALL" == true ]]; then + exit 0 +fi + +log_info "Setting up fresh skill installs..." +log_empty_line + +log_header "Claude Code" "$CLAUDE_SKILLS_DIR" +set_up_bureau_skill_dirs "$CLAUDE_SKILLS_DIR" +echo "" + +log_header "OpenCode" "$OPENCODE_SKILLS_DIR" +set_up_bureau_skill_dirs "$OPENCODE_SKILLS_DIR" +echo "" + +log_header "Codex" "$CODEX_SKILLS_DIR" "Note Codex ignores symlinked directories; copying instead." +set_up_bureau_skill_dirs "$CODEX_SKILLS_DIR" +echo "" + +log_header "Gemini CLI" "$GEMINI_SKILLS_DIR" +set_up_bureau_skill_dirs "$GEMINI_SKILLS_DIR" +echo "" + +log_divider +if [[ "$DRY_RUN" == true ]]; then + log_warning "Dry run complete. Run without --dry-run to apply changes." +else + log_success "Bureau skills setup complete!" 
+fi diff --git a/pyproject.toml b/pyproject.toml index 5a1efefb..556391a1 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -5,6 +5,7 @@ description = "Multi-agent AI orchestration framework" requires-python = ">=3.13" dependencies = [ "pyyaml>=6.0", + "tomlkit>=0.12", ] [project.scripts] @@ -30,6 +31,9 @@ dev = [ "types-requests>=2.31", ] +[tool.mypy] +python_version = "3.13" + [tool.pytest.ini_options] testpaths = ["operations/cleanup/tests"] pythonpath = ["."] diff --git a/tools/README.md b/tools/README.md deleted file mode 100644 index 28d94be3..00000000 --- a/tools/README.md +++ /dev/null @@ -1,213 +0,0 @@ -# *MCP and CLI tools suite*: essential must-read information - -**Contents:** -- [What the setup script does](#what-the-setup-script-does) - - [Side effects](#side-effects) -- [Tools set up by the script](#tools-set-up-by-the-script) - - [Important: ensuring tools work properly](#important-ensuring-tools-work-properly) -- [Using the script](#using-the-script) - - [Prerequisites](#prerequisites) - - [Running the script](#running-the-script) -- [Configuration](#configuration) - - [Key settings](#key-settings) -- [Data retention](#data-retention) - - [Retention configuration](#retention-configuration) - - [Default retention periods](#default-retention-periods) - - [Commands](#commands) - - [Complete wipe](#complete-wipe) - - [Soft delete](#soft-delete) - ---- - -## What the [setup script](./scripts/set-up-tools.sh) does - -1. Sets up a series of essential MCPs and CLI tools (including running some MCPs & Docker containers locally) -2. Configures [these coding agent CLIs](#supported-coding-agents) to use them - -And other useful background stuff. - -### Side effects - -- Uses ports **8780-8785** for the servers/containers it starts. - - - Pulls Qdrant DB Docker image and starts a container (to back the local Qdrant MCP) - -- Clones the Sourcegraph MCP repo locally (required for custom fork with fixes). 
- -## Tools set up by the script - -See [`tools.md`](tools.md) for: - -- **full list of tools** set up/made available by the script -- **how to use them** (e.g. when writing prompts) - -### Important: ensuring tools work properly - - - When using the Serena MCP with agents, you **need to activate the project *first* by providing the prompt `Activate the current dir as project using Serena`**. - - - Best to *always do this* when launching an agent, since Serena makes agents much more reliable & efficient at static analysis and syntax-related stuff - -## Using the script - -### Prerequisites - -Run `./bin/check-prereqs` to verify all prerequisites are installed: - -```bash -./bin/check-prereqs -``` - -If any are missing, install them using the instructions below. - -<details> -<summary><b>npm/Node</b> - for npx-based MCP servers</summary> - -Use [nvm](https://github.com/nvm-sh/nvm) to install/manage Node versions: - -```bash -curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash && \ - nvm install --lts && \ - nvm use --lts -``` -</details> - -<details> -<summary><b>uv</b> - for Python-based MCP servers and Semgrep</summary> - -```bash -curl -LsSf https://astral.sh/uv/install.sh | sh -``` -</details> - -<details> -<summary><b>Docker</b> - for Qdrant container</summary> - -Install [Rancher Desktop](https://rancherdesktop.io/) or [Docker Desktop](https://docs.docker.com/get-docker/). -</details> - -#### API keys - -> - **Free tiers are used in this script for each service/MCP server** that isn't free/open-source in the first place; these API keys aren't used for any payments -> - API keys created here are because either: -> -> - The service's free tier requires an API key -> - The service offers bonus features/usage on top of the regular free tier as a bonus for signing up and using an API key - -1. Create API key at these services' websites - - - [Tavily](https://www.tavily.com/) - - [Brave](https://brave.com/search/api/) - -2. 
Add to as exports from shell config (`.zshrc` or `.bashrc`) **with these variable names**: - - ```bash - export TAVILY_API_KEY=<...> - export BRAVE_API_KEY=<...> - ``` - -3. `source` your shell config to ensure they're available - -### Running the script - -```bash -tools/scripts/set-up-tools.sh -``` - -All configuration is done via YAML files. See [Configuration](#configuration) for details. - -## Configuration - -Bureau uses YAML configuration files at the repo root. Settings are loaded in this order (later sources override earlier): - -1. **`charter.yml`** - System defaults (endpoints, package paths) - **do not edit** -2. **`directives.yml`** - Team/shared settings (agents, retention, ports, paths) -3. **`local.yml`** - Personal overrides (gitignored) -4. **Environment variables** - Highest priority for paths - -For full configuration reference, see [docs/CONFIGURATION.md](../docs/CONFIGURATION.md). - -### Key settings - -```yaml -# directives.yml - customize these as needed - -# Which agents to configure -agents: - enabled: [claude, gemini, codex, opencode] - -# MCP tool auto-approval (skips permission prompts) -mcp: - auto_approve: no # yes/true or no/false - -# Filesystem MCP security boundary -path_to: - fs_mcp_whitelist: ~/code - -# Server ports (change if conflicts occur) -port_for: - qdrant_db: 8780 - # ... see docs/CONFIGURATION.md for full list -``` - -## Data retention - -Bureau automatically cleans up old memories to prevent unbounded storage growth. - -### Retention configuration - -Edit `directives.yml` to customize retention periods: - -```yaml -retention_period_for: - claude_mem: 30d - serena: 90d - qdrant: 180d - memory_mcp: 365d - -trash: - grace_period: 30d -``` - -Duration format: `<number><unit>` where unit is `h` (hours), `d` (days), `w` (weeks), `m` (months), `y` (years). -Set to `"never"` to disable cleanup for a storage. 
- -### Default retention periods - -| Storage | Default | Data Type | -|:--------|:--------|:----------| -| claude-mem | 30 days | Session summaries, observations | -| Serena | 90 days | Project memories | -| Qdrant | 180 days | Solutions, patterns, insights | -| Memory MCP | 365 days | Entity relationships | - -### Commands - -- **Automatic cleanup**: Runs on `./bin/open-bureau` if >24h since last run -- **Manual prune**: `./bin/prune [--force] [--dry-run] [--storage L]` where `L` is a combo of letters `q` (Qdrant), `c` (claude-mem), `s` (Serena), `m` (memory-mcp); e.g., `-s smq` -- **Empty trash**: `./bin/empty-trash` -- **Wipe storage**: `./bin/wipe <storage> [<storage>...] [--no-backup] [--yes]` - -### Complete wipe - -To completely erase all data from one or more storages: - -```bash -# Wipe a single storage (backs up to trash first) -./bin/wipe claude-mem - -# Wipe multiple storages -./bin/wipe claude-mem qdrant - -# Wipe all storages -./bin/wipe all - -# Skip confirmation prompt -./bin/wipe all --yes - -# Permanent deletion (no backup - DANGEROUS) -./bin/wipe claude-mem --no-backup --yes -``` - -### Soft delete - -Deleted items go to `.archives/trash/` with a 30-day grace period before permanent deletion. diff --git a/tools/bg-info.md b/tools/bg-info.md deleted file mode 100644 index c98f3537..00000000 --- a/tools/bg-info.md +++ /dev/null @@ -1,114 +0,0 @@ -# Useful background info - -## MCP server types - -- **Official/reference** servers are developed and maintained by the creators of MCP - - - Are standard, primary implementations that serve as models for how to build new MCPs - -- **Community** servers are developed by the community - -## Ways of running MCP servers - -Most MCP servers are run as **client-managed `stdio` servers**: - -- You configure your coding agent (e.g. 
using `claude mcp add ...`) to use that MCP, telling the command to use to run the server (usually something like `npx -y ...`) -- Then, the coding agent will start its own private instance of the server when launched and stop it when it's exited. - -Some servers support being accessed remotely (being run either as a locally-running process or by an online service that you access with a URL + API key), which allows them to be reused by many agents. - -| Method of talking to remote MCP servers | Use case | How it works | -| --- | --- | --- | -| **HTTP** | Best for MCPs whose toolcalls are quick & synchronous | Run the server once, `mcp add` command provides the server's URL; client then initiates exchanges w/ server via HTTP | -| **SSE *(deprecated in a lot of places, better to just always use HTTP)*** | Best for MCPs whose tools run for a long time, thus making progress updates useful; Agent CLI *and* MCP server **must support SSE** | Similar to `http`, except connection remains open instead of closing after each request. 
Client then stays listening, and server can "push" messages (via events) to the client whenever new data is available | - -## Adding MCPs to coding agents - -> The `/mcp` command in each of the agent CLIs below will **list currently-active servers** *(useful for verifying setup was successful)* - -## Gemini CLI - -> [***Full Gemini MCP guide***](https://geminicli.com/docs/tools/mcp-server/) -> -> → [*Shortcut: guide to `gemini mcp` commands*](https://github.com/google-gemini/gemini-cli/blob/main/docs/tools/mcp-server.md#managing-mcp-servers-with-gemini-mcp) - -- Add servers with `gemini mcp add <name> <commandOrUrl> [args...]` - - - Scope used determines which config file is changed: - - - **Project scope *(default)*** → `~/.gemini/settings.json` - - **User scope** → `~/.gemini/settings.json` - - - Use via `-s user` option - -## Claude Code - -> [***Full Claude Code MCP guide***](https://docs.claude.com/en/docs/claude-code/mcp) - -- **Support for SSE servers is deprecated**; prefer HTTP servers instead -- Can add MCP servers at these scopes (with `--scope <local|project|user>`): - - - **Local (*default*)**: for current repo - - **Project**: changes current repo's `.mcp.json` so collaborators can reuse the same MCP setup - - **User**: for Claude Code anywhere on your device (changes config in `~/.claude`) - -- Listing and using available MCPs: - - - Type `@` to see available resources from all connected MCP servers (alongside files) - - Use the format **`@server:protocol://resource/path`** to reference a resource, for example: - - > `Compare @postgres:schema://users with @docs:file://database/user-model` - -## Codex - -> [***Full Codex MCP guide***](https://developers.openai.com/codex/mcp) - -- Adding MCP servers: - - - Options for `stdio` servers: - - 1. 
**Edit `~/.codex/config.toml` config file** with this format: - - ```toml - [mcp_servers.<server-name>] - command = <server launch command> # required - args = <args for launch command> # optional - env = { "ENV_VAR" = "VALUE" } # optional: env vars for server to use - - # alternate way of adding any env vars for server to use - [mcp_servers.<server-name>.env] - ENV_VAR = "VALUE" - # ... repeat for each variable - ``` - - - Example: - - ```toml - [mcp_servers.context7] - command = "npx" - args = ["-y", "@upstash/context7-mcp"] - - [mcp_servers.context7.env] - SUNRISE_DIRECTION = "EAST" - ``` - - 2. **Use shortcut command** (creates config entry for you): - - ```bash - codex mcp add <server-name> [--env <VAR=VALUE>]... -- <server launch command> - ``` - - - For `http` servers: **must edit `~/.codex/config.toml`** config file with this format: - - ```toml - # optional: add this line if you want to use RMCP client to connect to server - # enables auth via OAuth for HTTP servers - experimental_use_rmcp_client = true - - [mcp_servers.<server-name>] - url = <server URL> # required - bearer_token = <token> # optional: bearer token to use in an `Authorization` header - # (if not using OAuth via RMCP above) - ``` - - - **Doesn't support SSE**; use HTTP servers instead diff --git a/tools/scripts/service-order.py b/tools/scripts/service-order.py new file mode 100644 index 00000000..de7f827b --- /dev/null +++ b/tools/scripts/service-order.py @@ -0,0 +1,70 @@ +#!/usr/bin/env -S uv run +""" +Parses service plan JSON files to determine the startup order of runtime services: + + 1. performs a topological sort based on declared dependencies to ensure + services are started in a valid sequence + 2.
outputs a scripting-friendly list of service names to stdout, which are + then piped directly to orchestration tools + +Usage: + ./service-order.py <path_to_plan.json> + +Assumption: + The input JSON is expected to follow this structure for listing dependencies: + runtime_services -> <name> -> depends_on -> services -> [list of service names] + +Note: + Arguments are mandatory. If no argument is provided, the script exits + silently with code 1 to defer error handling to the invoking shell script. +""" +from __future__ import annotations + +import json +import sys + + +def main() -> int: + if len(sys.argv) < 2: + # Intentionally fail silently on missing args to avoid cluttering CLI + return 1 + + path = sys.argv[1] + with open(path, "r", encoding="utf-8") as handle: + plan = json.load(handle) + + # Extract dependencies into the mapping {service_name: {requirements}} + services: dict = plan.get("runtime_services", {}) + deps = { + name: set(entry.get("depends_on", {}).get("services", [])) + for name, entry in services.items() + } + + startup_seq: list[str] = [] + remaining = set(deps) + + # Topological sort via a variation of Kahn's algorithm: + # repeatedly find services whose dependencies have already been + # satisfied (i.e.
are in `startup_seq`) + while remaining: + ready = [service for service in sorted(remaining) + if deps[service].issubset(startup_seq)] + if not ready: + # note: a service's deps can only consist of other services listed + # in the configuration, so if no remaining service has all of its + # deps satisfied, there must be a cycle in the service dependencies + # (making the config invalid) + print("Detected cycle in service dependencies.", file=sys.stderr) + return 1 + for service in ready: + remaining.remove(service) + startup_seq.append(service) + + for service in startup_seq: + print(service) + + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/tools/scripts/set-up-tools.sh b/tools/scripts/set-up-tools.sh index 88700582..dcaf183c 100755 --- a/tools/scripts/set-up-tools.sh +++ b/tools/scripts/set-up-tools.sh @@ -28,68 +28,14 @@ cfg() { (cd "$REPO_ROOT" && uv run get-config "$key" 2>/dev/null) || true } -# Setting MCP source clone paths -CLONE_DIR="$(cfg path_to.mcp_clones)" -export SOURCEGRAPH_REPO_PATH="$CLONE_DIR/sourcegraph-mcp" - SERVER_START_TIMEOUT="$(cfg startup_timeout_for.mcp_servers)" DOCKER_TIMEOUT="$(cfg startup_timeout_for.docker_daemon)" -# Remote server URLs -export SOURCEGRAPH_ENDPOINT="${SOURCEGRAPH_ENDPOINT:-$(cfg endpoint_for.sourcegraph)}" -export CONTEXT7_URL="${CONTEXT7_URL:-$(cfg endpoint_for.context7)}" -export TAVILY_URL="${TAVILY_URL:-$(cfg endpoint_for.tavily)}" - -# PAL MCP: disable all tools except clink (since they need an API key) -export PAL_DISABLED_TOOLS="${PAL_DISABLED_TOOLS:-$(cfg pal_disabled_tools)}" - -# Add CLI bin paths (resolved directly for portability) to the PATH provided to PAL MCP -# on startup so that each can be called directly via clink (i.e.
to spawn subagents) -CLI_BIN_PATHS="" -for cli in claude gemini codex; do - if cli_path="$(command -v "$cli" 2>/dev/null)"; then - cli_dir="$(dirname "$cli_path")" - # Add to path if not already present (dedup) - if [[ ":$CLI_BIN_PATHS:" != *":$cli_dir:"* ]]; then - CLI_BIN_PATHS="${CLI_BIN_PATHS:+$CLI_BIN_PATHS:}$cli_dir" - fi - else - log_warning "CLI '$cli' not found in PATH - clink won't be able to use it" - fi -done -export CLI_BIN_PATHS - -# Ports for local HTTP servers -export QDRANT_DB_PORT="${QDRANT_DB_PORT:-$(cfg port_for.qdrant_db)}" -export QDRANT_MCP_PORT="${QDRANT_MCP_PORT:-$(cfg port_for.qdrant_mcp)}" -export SOURCEGRAPH_MCP_PORT="${SOURCEGRAPH_MCP_PORT:-$(cfg port_for.sourcegraph_mcp)}" -export SEMGREP_MCP_PORT="${SEMGREP_MCP_PORT:-$(cfg port_for.semgrep_mcp)}" -export SERENA_MCP_PORT="${SERENA_MCP_PORT:-$(cfg port_for.serena_mcp)}" - -# Configure Qdrant MCP (handles semantic memory) -# Derive QDRANT_URL if not provided in env -if [[ -z "${QDRANT_URL:-}" ]]; then - QDRANT_URL="http://127.0.0.1:$QDRANT_DB_PORT" -fi -export QDRANT_URL -export QDRANT_COLLECTION_NAME="${QDRANT_COLLECTION_NAME:-$(cfg qdrant.collection)}" -export QDRANT_EMBEDDING_PROVIDER="${QDRANT_EMBEDDING_PROVIDER:-$(cfg qdrant.embedding_provider)}" - -# Expand ~ to $HOME for: -# - Docker compatibility (QDRANT_STORAGE_PATH) -# - mkdir/Node compatibility (MEMORY_MCP_STORAGE_PATH) -MEMORY_MCP_STORAGE_PATH="${MEMORY_MCP_STORAGE_PATH:-$(cfg path_to.storage_for.memory_mcp)}" -QDRANT_STORAGE_PATH="${QDRANT_STORAGE_PATH:-$(cfg path_to.storage_for.qdrant)}" -export QDRANT_STORAGE_PATH="${QDRANT_STORAGE_PATH/#\~/$HOME}" -export MEMORY_MCP_STORAGE_PATH="${MEMORY_MCP_STORAGE_PATH/#\~/$HOME}" - -# Directories -FS_MCP_WHITELIST="${FS_MCP_WHITELIST:-$(cfg path_to.fs_mcp_whitelist)}" - -# Source agent selection library +# Source shared libraries source "$REPO_ROOT/bin/lib/agent-selection.sh" +source "$REPO_ROOT/bin/lib/logging.sh" -# Supported agents' printable string names +# Supported agents' 
printable string names CLAUDE="Claude Code" CODEX="Codex" GEMINI="Gemini CLI" @@ -100,28 +46,13 @@ CODEX_CONFIG="$HOME/.codex/config.toml" CLAUDE_CONFIG="$HOME/.claude/settings.json" CLAUDE_CLI_STATE="$HOME/.claude.json" -# Contains the list of agents to be configured by this script to use Bureau and its tools; -# Populated by discover_agents(); agentic CLIs above are added if their corresponding -# user-level config dir exists +# Contains the list of agents to be configured by this script to use Bureau and its tools +# Populated later by discover_agents() based on the YML configs AGENTS=() -# Colors -GREEN='\033[0;32m' -BLUE='\033[0;34m' -YELLOW='\033[1;33m' -RED='\033[0;31m' -NC='\033[0m' - # --- CONFIG VALUES --- -# Read auto-approve setting from config (accepts yes/true/no/false) -_auto_approve_cfg="$(cfg mcp.auto_approve)" -case "${_auto_approve_cfg,,}" in - yes|true) AUTO_APPROVE_MCP=true ;; - *) AUTO_APPROVE_MCP=false ;; -esac - -# Detect installed CLIs based on config directory existence (exits if none found, logs detected CLIs) +# Detect enabled agents based on YML configs (exits if none found, logs detected CLIs) discover_agents # --- HELPERS --- diff --git a/tools/scripts/tests/test_add_codex_auto_approvals.py b/tools/scripts/tests/test_add_codex_auto_approvals.py new file mode 100644 index 00000000..31d7fc21 --- /dev/null +++ b/tools/scripts/tests/test_add_codex_auto_approvals.py @@ -0,0 +1,61 @@ +from importlib import util +from pathlib import Path + +module_path = Path(__file__).resolve().parents[1] / "add-codex-auto-approvals.py" +spec = util.spec_from_file_location("add_codex_auto_approvals", module_path) +module = util.module_from_spec(spec) +assert spec and spec.loader +spec.loader.exec_module(module) +update_codex_config = module.update_codex_config + + +def _section_lines(content: str, section: str) -> list[str]: + lines = content.splitlines() + header = f"[mcp_servers.{section}]" + start = None + for idx, line in enumerate(lines): + if 
line.strip() == header: + start = idx + 1 + break + if start is None: + return [] + + end = len(lines) + for idx in range(start, len(lines)): + if lines[idx].strip().startswith("["): + end = idx + break + + return [line.strip() for line in lines[start:end] if line.strip()] + + +def test_sets_enabled_true_for_auto_approved_servers(tmp_path): + config_path = tmp_path / "config.toml" + config_path.write_text( + """ +approval_policy = "on-request" + +[mcp_servers.alpha] +url = "http://example.com" + +[mcp_servers.beta] +url = "http://example.com" +enabled = false + +[mcp_servers.gamma] +url = "http://example.com" +""".lstrip(), + encoding="utf-8", + ) + + update_codex_config(str(config_path), ["alpha", "beta"]) + + content = config_path.read_text(encoding="utf-8") + alpha_lines = _section_lines(content, "alpha") + beta_lines = _section_lines(content, "beta") + gamma_lines = _section_lines(content, "gamma") + + assert "enabled = true" in alpha_lines + assert "enabled = true" in beta_lines + assert "enabled = true" not in gamma_lines + assert "enabled = false" not in gamma_lines diff --git a/tools/tools-decision-guide.md b/tools/tools-decision-guide.md deleted file mode 100644 index 952e845c..00000000 --- a/tools/tools-decision-guide.md +++ /dev/null @@ -1,634 +0,0 @@ -# When to use each MCP tool: decision guide - -> [!NOTE] -> - **Audience:** both humans (documentation) and coding agents (instructions) -> - **Purpose:** optimal tool selection for -> -> - maximizing value -> - avoiding rate limits -> - minimizing waste *(of tokens, tool usage limits, etc.)* - -**<ins>Contents:</ins>** - -- [Quick Reference: Tool Usage Hierarchy](#quick-reference-tool-usage-hierarchy) -- [Detailed Tool Profiles](#detailed-tool-profiles) -- [Decision Trees by Task Type](#decision-trees-by-task-type) -- [Rate Limit Management](#rate-limit-management) -- [Special Cases & Gotchas](#special-cases--gotchas) - ---- - -## Quick reference: tool usage hierarchy - -### Web 
browsing/researching/fetching tools (use following prescribed priorities) - -**Tier 1: Primary Tools (Use First)** - -1. **Sourcegraph MCP** - Code search across public repos -2. **Tavily MCP** - Web research with citations (1000 credits/month) -3. **Context7 MCP** - API documentation and examples - -**Tier 2: Specialized Tools (Conditional Use)** - -4. **Brave MCP** - Privacy-focused search (2000 queries/month) - -**Tier 3: Fallback Tools (Last Resort)** - -5. **Fetch MCP** - Simple URL fetching (no rate limits) - -### Memory & coding tools (use as needed) - -**Memory & Knowledge:** - -- **Qdrant MCP** - Semantic memory layer (vector search, find by meaning, no rate limits) -- **Memory MCP** - Knowledge graph (entities/relations, structured memory, no rate limits) - -**Code Analysis & Manipulation:** - -- **Serena MCP** - Semantic code navigation/refactoring (symbol-level operations) -- **Semgrep MCP** - Security/bug scanning (pattern-based analysis) -- **Filesystem MCP** - File operations (read/write/edit) -- **Git MCP** - Git operations (status/diff/commit/etc.) 
- -**Browser Automation:** - -- **Playwright MCP** - Web automation (navigate, click, type, extract via accessibility tree) - -## Detailed tool profiles - -### Web research & search tools - -#### Sourcegraph MCP ⭐ **[PRIMARY FOR CODE]** - -**What it does:** - -- "Google for code" - searches across public GitHub repos -- Powerful filters: regex, language, file path, branch -- Guided search prompts (natural language → precise queries) -- Returns exact code snippets with line numbers - -**When to use:** - -- Finding code examples/patterns across repos -- Researching how libraries/APIs are used in practice -- Discovering implementations of specific algorithms -- Learning from real-world code - -**Rate limits:** None apparent (free tier for public repos) - -**Why use first:** Purpose-built for code search with no strict limits - -#### Tavily MCP ⭐ **[PRIMARY FOR WEB]** - -**What it does:** - -- Web search, extract, map, and crawl -- **Includes citations** (critical for credibility) -- Handles news, general info, current events - -**When to use:** - -- General web research -- Finding current information -- Getting cited sources for claims -- Extracting content from known URLs -- Mapping site structure - -**Rate limits:** 1000 API credits/month (resets on 1st) - -- Basic search: ~1-5 credits -- Extract: varies by complexity -- See [full credit costs](https://docs.tavily.com/documentation/api-credits#api-credits-costs) - -**Why use second:** Best balance of features and generous limits - -#### Context7 MCP ⭐ **[PRIMARY FOR DOCS]** - -**What it does:** - -- Fetches up-to-date, version-specific API documentation -- Includes code examples from official docs -- Works with public repos only - -**When to use:** - -- Learning a new library/framework -- Checking current API syntax -- Getting official usage examples -- Understanding library capabilities - -**Rate limits:** Free tier, public repos only - -**Why use third:** Specialized for documentation, no apparent hard limits - 
-#### Brave MCP **[SECONDARY SEARCH]** - -**What it does:** - -- Privacy-focused search engine -- Web, local, news, image, video search -- No tracking or profiling - -**When to use:** - -- Tavily credits exhausted -- Need privacy-focused results -- Basic web search without advanced features - -**Rate limits:** 2000 queries/month (basic web search only on free tier) - -**Why use here:** Good fallback when Tavily exhausted, but limited to basic search - -#### Fetch MCP **[SIMPLE FALLBACK]** - -**What it does:** - -- Basic HTTP/HTTPS URL fetching -- Converts HTML to Markdown -- Optional raw HTML -- Chunk reading via start_index - -**When to use:** - -- Simple one-off URL fetch -- Don't need search/crawl/extraction -- All other tools exhausted or overkill - -**Rate limits:** None - -**Limitations:** - -- **Does NOT support fetching directly from github.com** (fetch from `raw.githubusercontent.com` instead, or use `gh` CLI) - -**Why use last:** No advanced features, but reliable and unlimited - -### Memory & knowledge tools - -#### Qdrant MCP ⭐ **[PRIMARY FOR SEMANTIC MEMORY]** - -**What it does:** - -- Vector-based semantic memory layer using Qdrant database -- Stores information with embeddings for semantic (meaning-based) retrieval -- Uses FastEmbed models (default: sentence-transformers/all-MiniLM-L6-v2) -- Can run locally or connect to cloud/remote Qdrant instances -- Supports optional structured metadata alongside text - -**Tools available:** - -1. `qdrant-store` - Store information with optional metadata -2. 
`qdrant-find` - Retrieve semantically similar information by query - -**When to use (MANDATORY for these scenarios):** - -- **After solving ANY problem** - Store the solution, approach, and why it worked -- **After investigating code** - Store patterns discovered, gotchas found, insights gained -- **After making decisions** - Store trade-offs considered, alternatives rejected, rationale -- **After debugging** - Store root cause, symptoms, fix approach, prevention tips -- **After analyzing performance** - Store bottlenecks found, optimizations applied, metrics -- **After discovering undocumented behavior** - Store quirks, edge cases, workarounds -- Storing code snippets, examples, or reusable patterns for later retrieval -- Building a personal knowledge base across sessions -- Need to find information by *meaning* rather than exact keywords -- Storing learned insights from previous conversations - -**When NOT to use:** - -- Need to track explicit relationships between items → Use Memory MCP instead -- Need structured graph queries → Use Memory MCP instead -- Simple keyword search is sufficient → Use grep/filesystem tools -- Truly trivial one-time lookups with zero learning value → Skip (rare) - -**Rate limits:** None (local Docker container or self-hosted) - -**Best practices:** - -- Store atomic pieces of information (one concept per store) -- Use metadata field for structure (e.g., `{"type": "code", "language": "python"}`) -- Descriptive text helps retrieval (include context, not just code) -- Good for: code patterns, solutions to problems, useful links, learned facts -- Works great with Cursor/Windsurf for code snippet libraries - -**Example use cases:** - -- "Store this React hook pattern for reuse later" → retrieves by describing what you need -- Building a personal StackOverflow of solved problems -- Remembering API patterns across different projects -- Semantic code snippet search in your IDE - -**Why use:** Best tool for "find things similar to X" - works 
like a persistent, intelligent search over your saved knowledge - -#### Memory MCP ⭐ **[PRIMARY FOR KNOWLEDGE GRAPHS]** - -**What it does:** - -- Persistent knowledge graph with entities, relations, and observations -- Tracks explicit relationships between concepts/people/things -- Stores structured information in local JSONL file -- Maintains context and facts across sessions -- Official MCP implementation by Anthropic - -**Tools available:** - -1. `create_entities` - Create nodes in the graph (people, orgs, events, concepts) -2. `create_relations` - Define directed relationships between entities (in active voice) -3. `add_observations` - Add facts/notes to existing entities -4. `delete_entities` - Remove entities and their relations -5. `delete_observations` - Remove specific facts from entities -6. `delete_relations` - Remove specific relationships -7. `read_graph` - Read the entire knowledge graph -8. `search_nodes` - Search entities by name/type/observation content -9. `open_nodes` - Retrieve specific entities by name - -**When to use (MANDATORY for these scenarios):** - -- **After working on a project** - Create/update entities for components, modules, dependencies -- **After discovering relationships** - Map how components interact, depend on each other -- **After identifying key people/tools** - Track who owns what, what tools are used where -- **After analyzing architecture** - Store system structure, data flows, integration points -- **When project context emerges** - Capture facts about the codebase, team, processes -- **After making architectural decisions** - Store what components exist and how they relate -- Need to track *who* relates to *what* and *how* -- Building a structured knowledge base with explicit relationships -- Maintaining facts about users/preferences/history -- Need to query relationships (e.g., "which components depend on module X?") - -**When NOT to use:** - -- Need semantic/similarity search → Use Qdrant MCP instead -- Relationships 
aren't important, just storage → Use Qdrant MCP instead -- Simple note-taking without structure → Use filesystem or Qdrant -- Temporary context (single session) → Just keep in conversation context - -**Rate limits:** None (local JSONL file storage) - -**Best practices:** - -- Entities: Use clear, unique names (e.g., "John_Smith", "ProjectX") -- Relations: Always use active voice (e.g., "works_at", "depends_on") -- Observations: Keep atomic (one fact per observation) -- Entity types: Use consistent categorization (e.g., "person", "company", "project") -- Relations are directed: order matters (from → to) - -**Example structure:** - -``` -Entity: John_Smith (type: person) - Observations: ["Speaks Spanish", "Prefers async communication"] - Relations: John_Smith --works_at--> Anthropic - John_Smith --contributes_to--> ProjectX - -Entity: Anthropic (type: company) - Observations: ["AI safety research", "Based in San Francisco"] -``` - -**Example use cases:** - -- Personal memory: Remember user preferences, context, history -- Project documentation: Track components, dependencies, who owns what -- Relationship mapping: Social/professional network graphs -- Learning journal: Connect concepts, topics, resources with explicit links -- Code understanding: Map relationships between modules, functions, data flows - -**Why use:** Best tool for "X relates to Y" - maintains structured knowledge with queryable relationships - -### Qdrant vs memory: quick decision guide - -**Use Qdrant when:** - -- "Find things similar to this concept" -- Semantic search is the main access pattern -- Relationships between items aren't critical -- Building a retrieval/search system - -**Use Memory when:** - -- "Show me what relates to X" -- Explicit relationships matter -- Need structured graph queries -- Building a knowledge/context management system - -**Use both when:** - -- Complex knowledge base needs both similarity search AND relationship tracking -- E.g., Qdrant for code snippets, Memory for 
tracking which projects use which patterns - -### Code analysis & manipulation tools - -#### Serena MCP ⭐ **[PRIMARY FOR CODE EDITING]** - -**What it does:** - -- Language-server-powered semantic code navigation -- IDE-grade symbol search (find_symbol, find_referencing_symbols) -- Structural edits (rename, insert, replace at symbol level) -- 20+ languages supported - -**When to use:** - -- Need semantic understanding of code (not just text) -- Refactoring operations -- Finding all references to a symbol -- IDE-level code intelligence - -**Rate limits:** None (local server) - -**Why use:** Works at semantic level vs. whole-file operations - -#### Semgrep MCP - -**What it does:** - -- AST-aware security/bug/anti-pattern scanning -- Pattern-based rules (built-in or custom) -- Autofix suggestions -- Local scanning (code never leaves machine) - -**When to use:** - -- Security audits -- Finding bugs/anti-patterns -- Code quality checks -- Custom rule enforcement - -**Rate limits:** None (free community edition, local server) - -#### Playwright MCP - -**What it does:** - -- Browser automation using Playwright's accessibility tree -- Fast, deterministic tool application (no vision models) -- Navigate, click, type, extract content from web pages -- Supports Chrome, Firefox, WebKit with device emulation -- Can run headless or headed mode - -**When to use:** - -- Automated web testing and interaction -- Scraping dynamic content requiring JavaScript execution -- Form filling and submission automation -- End-to-end testing workflows -- Extracting data from interactive web applications - -**Rate limits:** None (local execution) - -#### Filesystem MCP - -**What it does:** Bulk file reads (filtered to `read_multiple_files` only) - -**When to use:** -- Batch reading 10+ files (30-60% token savings vs multiple Read calls) - -**Rate limits:** None (local) - -## Decision trees by task type - -### Finding code examples - -``` -START - ↓ -Need code from public repos? 
- ├─ YES → Use Sourcegraph MCP (no rate limits) - └─ NO → Need GitHub-specific? - ├─ YES → Use gh CLI or raw.githubusercontent.com via Fetch - └─ NO → Context7 for docs/examples from official sources -``` - -### Web research & information gathering - -``` -START - ↓ -What type of information? - ├─ API docs/library info → Context7 MCP - ├─ Current events/general web → Tavily MCP - ├─ Basic search (Tavily exhausted) → Brave MCP - └─ Simple URL content → Fetch MCP -``` - -### Website crawling/scraping - -``` -START - ↓ -Single URL or simple extraction? - ├─ YES → Fetch MCP (unlimited) or Tavily extract - └─ NO → Multiple pages/complex? - ↓ - Try Tavily search/extract/map/crawl - ↓ Still need more? - ↓ - Use Fetch iteratively on known URLs -``` - -### Code manipulation - -``` -START - ↓ -Need semantic understanding? - ├─ YES → Serena MCP (symbol-level operations) - └─ NO → Simple file edits? - ├─ YES → Filesystem MCP - └─ NO → Security/bug scan → Semgrep MCP -``` - -### Browser automation & web interaction - -``` -START - ↓ -Need to interact with web pages? - ├─ Static content (no JS) → Fetch MCP or Tavily extract - └─ Dynamic content or user interaction needed? - ├─ YES → Playwright MCP - │ (click, type, navigate, extract) - └─ NO → Just need HTML? → Fetch MCP -``` - -### Memory & knowledge storage - -``` -START - ↓ -Need to store/retrieve information across sessions? - ├─ NO → Keep in conversation context - └─ YES → What's the primary access pattern? - ↓ - Do relationships between items matter? - ├─ NO → Need similarity/semantic search? - │ ├─ YES → Qdrant MCP - │ │ (find by meaning: "auth patterns" → OAuth/JWT/etc.) - │ └─ NO → Simple storage → Filesystem or notes - │ - └─ YES → Need explicit relationships? - ├─ YES → Memory MCP - │ (track X relates to Y: person → works_at → company) - └─ NO → Qdrant MCP sufficient - -Special case: Complex knowledge base? 
- → Use BOTH: - • Qdrant: Store searchable content - • Memory: Track relationships between content - -Example: Code snippet library - → Qdrant: Store snippets, find by description - + Memory: Track which projects/patterns use which snippets -``` - -## Rate limit management - -### Critical limits to track - -| Tool | Limit Type | Amount | Reset | Severity | -|------|-----------|---------|-------|----------| -| Tavily | Monthly | 1000 credits | 1st of month | 🟡 MEDIUM | -| Brave | Monthly | 2000 queries | Monthly | 🟡 MEDIUM | -| Sourcegraph | None | ∞ | N/A | 🟢 SAFE | -| Fetch | None | ∞ | N/A | 🟢 SAFE | -| Playwright | None | ∞ | N/A | 🟢 SAFE | - -### Strategies - -1. **Always exhaust unlimited tools first** (Sourcegraph, Fetch) -2. **Use monthly-reset tools wisely** (Tavily/Brave) - both reset monthly -3. **Front-load Tavily early in month** - will reset on 1st - -### Cost-benefit analysis before using limited tools - -**Before using Tavily (1k/month):** - -- Can Fetch do this for known URLs? (Unlimited) -- Can Brave do this basic search? (2k/month) -- Is this worth using monthly quota? -- Early in month vs. late in month? - -## Special cases & gotchas - -### GitHub content - -❌ **DON'T:** Use Fetch MCP on github.com URLs (not supported) - -✅ **DO:** Use one of these: - -1. **Best:** `raw.githubusercontent.com/<user>/<repo>/<branch>/<file>` via Fetch -2. **Also good:** `gh` CLI locally -3. **For search:** Sourcegraph MCP -4. **For analysis:** Clone locally + Git MCP + Serena MCP - -### Documentation lookup - -**Use this priority:** - -1. Context7 (official docs, version-specific) -2. Tavily search (general web docs, tutorials) -3. Sourcegraph (real-world usage examples) - -### Multi-page content extraction - -**Recommended sequence:** - -1. Tavily search to find relevant pages -2. Tavily extract on specific URLs -3. Tavily map/crawl for site structure -4. 
If still insufficient, Fetch iteratively on known URLs - -### Dynamic web content - -**Use this priority:** - -1. Playwright (for JS-heavy sites, form interactions, dynamic content) -2. Tavily extract (for simpler extractions) -3. Fetch MCP (for static HTML only) - -### Memory & knowledge storage - -**Memory MCP Best Practices:** - -❌ **DON'T:** - -- Use passive voice for relations ("is_managed_by" → use "manages") -- Create duplicate entities (check with `search_nodes` first) -- Store multiple facts in one observation -- Use complex entity names with spaces/special chars - -✅ **DO:** - -- Use active voice relations: "John --works_at--> Company" (not "Company --employs--> John") -- Use underscores in names: "John_Smith", not "John Smith" -- Keep observations atomic: ["Speaks Spanish", "Graduated 2019"] (not ["Speaks Spanish and graduated in 2019"]) -- Use consistent entity types across the graph - -**Qdrant vs Memory Decision:** - -| Scenario | Use Qdrant | Use Memory | Use Both | -|----------|-----------|------------|----------| -| Store code snippets for "find similar" | ✅ | ❌ | Optional | -| Track who created what code | ❌ | ✅ | Recommended | -| Personal knowledge base | ✅ | ❌ | Optional | -| Project relationship map | ❌ | ✅ | N/A | -| Searchable docs + author tracking | ✅ | ✅ | ✅ | - -**Data Persistence:** - -- **Qdrant**: Data in Docker volume (survives restarts) OR cloud (persistent) -- **Memory**: JSONL file (location: `MEMORY_MCP_STORAGE_PATH` or default `~/.memory-mcp/memory.jsonl`) -- Both require explicit deletion - data persists across sessions - -### When multiple tools can work - -**Default to this order:** - -1. Unlimited tools (Sourcegraph, Fetch) -2. Monthly-reset tools (Tavily, Brave) - prefer Tavily for citations - -## Summary: golden rules - -### Search & research - -1. **Sourcegraph first for code**, Tavily first for web -2. **Fetch is unlimited** - use liberally for simple fetches -3. 
**Context7 for official docs**, Sourcegraph for real examples -4. **Tavily for citations**, Brave as fallback when Tavily exhausted -5. **Front-load Tavily early each month** before credits run out - -### Memory & knowledge - -8. **Qdrant for "find similar"**, Memory for "X relates to Y" -9. **Both memory tools have no rate limits** - use freely for persistent storage -10. **Qdrant needs Docker OR cloud**, Memory works out of the box -11. **Memory relations in active voice** - "works_at" not "is_employed_by" -12. **Data persists across sessions** - remember to clean up when done - -## Quick decision flowchart - -``` -Need to accomplish task - ↓ -Is it code-related? - ├─ YES → Finding examples? → Sourcegraph - │ ↓ - │ Need docs? → Context7 - │ ↓ - │ Editing/refactoring? → Serena - │ ↓ - │ Security scan? → Semgrep - │ - └─ NO → Is it web/research? - ├─ YES → General info? → Tavily - │ ↓ - │ Simple URL? → Fetch - │ ↓ - │ Need semantic search? → Try Tavily first, then Exa - │ ↓ - │ Complex crawl? → Try Tavily, then consider Firecrawl - │ - └─ NO → Memory storage? → Qdrant (semantic) or Memory (graph) - ↓ - Files? → Filesystem - ↓ - Git? → Git MCP -``` - diff --git a/tools/tools.md b/tools/tools.md deleted file mode 100644 index ca9c805c..00000000 --- a/tools/tools.md +++ /dev/null @@ -1,147 +0,0 @@ -# Tools set up by [`scripts/set-up-tools.sh`](scripts/set-up-tools.sh) - -> This file is meant for both humans and agents to read - -- [**MCP servers**](#mcp-servers) -- [**Non-MCP tools**](#non-mcp-tools) - -A **full decision guide as to when to use each tool** can be found at [`tools-decision-guide.md`](tools-decision-guide.md). - ---- - -## MCP servers - -### MCP servers for browsing/researching/fetching stuff from the internet - -A few of these servers have redundant/duplicate roles. This is on purpose so we have fallback choices in case we hit rate limits. 
- -| Server | Functionality | How it's run/talked to by agents | Restrictions | -| :----- | :------------ | :------------ | :----------- | -| **Fetch MCP** | fetch HTTP/HTTPS URLs; turns HTML into Markdown for faster/more accurate processing by LLMs (optional raw HTML); chunk reading via start_index; custom UA/robots handling | Stdio with private client-managed instance | None | -| **Context7 MCP** | pull up-to-date, version-specific API/code docs & examples into prompts | **Claude & Gemini**: HTTP using Context7's cloud-hosted server; **Codex**: stdio with local proxy that talks to Context7's cloud server | Free tier used: **only allows accessing *public* repos** | -| **Sourcegraph MCP** | Basically **"Google for code"**: lets your agent search across repos (and branches) using powerful filters (regex, language, and file-path), then open the matching files/line ranges to pull the exact code snippets you're looking for. Also includes **guided search prompts**, so the agent can turn a natural request (e.g., "find all places we construct HttpClient with a 5s timeout") into precise queries and iterate until it finds what you need. 
| Free tier used: search covers **public repos only**; covering private/org repos requires paid plans | HTTP with locally-run server (that talks to [Sourcegraph Public Code Search](https://sourcegraph.com/search)) | -| **Tavily MCP** | Handles searching, extracting, mapping and crawling the web, with citations | HTTP with Tavily's cloud-hosted server | Free version used: gives 1000 API credits/month (resets on 1st of month); *[click to see amount of credits used for each request type](https://docs.tavily.com/documentation/api-credits#api-credits-costs)* | -| [**Brave MCP**](https://github.com/brave/brave-search-mcp-server) | Privacy-focused fallback/alternative search engine with generous free tier | stdio with private client-managed instances | Free tier used: 2000 queries/month, limited to basic web search only | - -> Important: -> - **The *Fetch* MCP does not support fetching directly from the GitHub website** (e.g. to look up API-/code-related info about public repos) -> - Instead, tell the agent to use one of these solutions, depending on what you need from GitHub: -> - **Best / most straightforward:** -> -> - Use Fetch MCP to fetch from `https://raw.githubusercontent.com/<path>`, where `<path>` is usually something like `<user/org>/<repo>/<dir>/<file>` -> - Use `gh` CLI locally -> -> - **Other solutions:** -> -> - Use Sourcegraph MCP -> - Clone repos locally and use git via Bash to navigate them - -### MCP memory servers - -> [!IMPORTANT] -> **MANDATORY USAGE REQUIREMENT** -> -> **ALL agents (Codex, Gemini, Claude Code) MUST store memories after ANY task involving:** -> - Analysis, investigation, thinking, reasoning, derivation -> - Problem-solving, debugging, optimization -> - Discovery of patterns, gotchas, undocumented behavior -> - Architectural decisions, trade-offs, design choices -> -> **This is NOT OPTIONAL. 
Memory storage = part of completing the task.** - -| Server | Functionality | How it's run/talked to by agents | Restrictions | -| :----- | :------------ | :------------ | :----------- | -| **Qdrant MCP** | **[MANDATORY]** *semantic memory* layer: MUST save discoveries, solutions, insights, patterns after EVERY analytical task; retrieves by semantic meaning using FastEmbed HNSW vector search | HTTP with locally-running server, backed by *Qdrant DB instance running in a local Docker container* | None | -| **Memory MCP** | **[MANDATORY]** *structured memory* layer (knowledge graph): MUST track entities/relations when working on projects; stores who/what/how relationships, system architecture, dependencies | Stdio with private client-managed instances | None (completely free, local JSONL storage) | - -#### Claude-only: `claude-mem` context management plugin - -**What it is**: A persistent memory compression system that automatically preserves context across Claude Code sessions. Unlike Qdrant/Memory MCPs which require manual saving, claude-mem operates fully automatically through Claude Code's plugin hook system. 
GitHub repo: [thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) - -**How it works automatically**: -- **5 Lifecycle Hooks** capture events without manual intervention: - - `SessionStart`: Injects summaries from last 10 sessions into context with progressive disclosure (token costs visible) - - `UserPromptSubmit`: Creates session record, saves raw user prompts for search - - `PostToolUse`: Fires after EVERY tool execution (Read, Write, Edit, Bash, etc.), captures observations - - `Stop`: Generates session summaries (request, completed, learned, next_steps) - - `SessionEnd`: Marks sessions complete (graceful cleanup, preserves work across `/clear`) -- **Worker Service** (PM2-managed Express server on port 37777): - - Processes observations via Claude Agent SDK - - Extracts structured learnings: decisions, bugfixes, features, refactors, discoveries, changes - - Auto-starts when first session begins -- **SQLite Database** (`~/.claude-mem/claude-mem.db`): - - Stores sessions, observations, summaries, and user prompts - - FTS5 full-text search with SQL injection protection (332 attack tests) - - Tracks files read/modified, concepts, types, and relationships -- **Progressive Disclosure**: Context appears as layered timeline at session start - - Layer 1 (Index): See what exists with token costs (🔴 critical, 🟤 decision, 🔵 informational) - - Layer 2 (Details): Fetch full narratives on-demand via MCP search - - Layer 3 (Perfect Recall): Access source code and original transcripts - -**How agents can use it manually** (7 MCP search tools available in all Claude sessions): -- `search_observations` - Full-text search across observations (title, narrative, facts, concepts) - - Filter by: type, concepts, files, project, date range - - **Always start with `format: "index"` (50-100 tokens/result) before using `format: "full"` (500-1000 tokens/result)** -- `search_sessions` - Full-text search across session summaries -- `search_user_prompts` - Search raw user requests 
(trace intent → implementation) -- `find_by_concept` - Find observations tagged with specific concepts (e.g., "architecture", "security") -- `find_by_file` - Find all work related to specific files (e.g., "worker-service.ts") -- `find_by_type` - Find by observation type (decision, bugfix, feature, refactor, discovery, change) -- `get_recent_context` - Get recent session context for debugging/recovery -- **Citations**: All results use `claude-mem://` URIs for referencing historical context - -**How Codex and Gemini CLIs can replicate this with Qdrant + Memory MCPs**: - -Since Codex/Gemini lack Claude Code's hook system, they must manually implement similar patterns: - -- **MANDATORY OBSERVATION LOGGING** (supplements automatic hooks): - - **After investigating code**: MUST save discoveries to Qdrant (patterns, gotchas, insights) + Memory MCP (component relationships) - - **After making decisions**: MUST create entities in Memory MCP with relations (e.g., "JWT" -[chosen_for]→ "authentication" -[because]→ "stateless design") - - **After fixing bugs**: MUST store in Qdrant: root cause, symptoms, fix approach, prevention tips, with metadata (file, type, concepts) - - **After solving problems**: MUST store solution + why it worked in Qdrant - - **After optimizing**: MUST store bottleneck found, optimization applied, metrics in Qdrant - - **After analyzing architecture**: MUST create entities/relations in Memory MCP for system structure - - Use Qdrant for semantic retrieval ("find work related to authentication") and Memory MCP for relationship traversal - - **Not storing = incomplete task** - -- **Session summaries** (replaces Stop hook): - - Before ending work, manually create summary and store in Qdrant - - Include: request, completed, learned, next_steps (same structure as claude-mem) - - Tag with project name and date for filtering - -- **Context recovery** (replaces SessionStart hook): - - Start sessions by searching Qdrant for relevant past work (semantic search by 
project/topic)
-  - Fetch related entities from Memory MCP knowledge graph
-  - Build context progressively as needed (similar to progressive disclosure)
-
-- **File tracking** (replaces automatic file_read/file_modified tracking):
-  - Manually create Memory MCP observations when reading/modifying files
-  - Link files to concepts via relations (e.g., "auth.ts" -[implements]→ "JWT authentication")
-
-- **Key differences**:
-  - claude-mem is automatic (zero intervention), Qdrant+Memory requires agent discipline
-  - claude-mem uses FTS5 full-text search, Qdrant uses vector/semantic search (different strengths)
-  - claude-mem has typed observations (decision, bugfix, etc.), must manually tag in Qdrant/Memory
-  - claude-mem uses Claude Agent SDK for AI extraction, Qdrant+Memory requires agent self-extraction
-  - claude-mem tracks tool executions automatically, Qdrant+Memory requires explicit "save this" calls
-
-- **Advantage of manual approach**: Works across all CLI agents (Codex, Gemini, Claude), not just Claude Code
-
-
-### Other MCP servers
-
-| Server | Functionality | How it's run/talked to by agents | Restrictions |
-| :----- | :------------ | :------------ | :----------- |
-| **PAL MCP *(`clink` only)*** | multi-model orchestration; CLI-to-CLI bridge ("clink"); spawn sub-agents; context threading across tools/CLIs | stdio with per-CLI instances | None |
-| **Filesystem MCP** | bulk file reads (filtered to `read_multiple_files` only for 30-60% token savings on 10+ files; use native Read/Write/Edit for other operations) | stdio with private client-managed instances | None |
-| **Semgrep MCP** | lets your agent **(1)** scan source code locally (no code leaves your machine) using AST-aware, pattern-based rules to *catch security issues, bugs, and risky anti-patterns across many languages* **(2)** run targeted scans (a file, dir, or diff) or full-repo checks **(3)** choose rulesets (built-in or custom YAML rules you write in code-like patterns) **(4)** get structured findings back, by file/line, rule ID, severity, message, code snippet **(5)** autofix suggestions when a rule defines a fix | HTTP with locally-running instance of *Semgrep's free "community edition" server* | Free "community edition" used: see *[full list of Semgrep community edition features](https://semgrep.dev/docs/semgrep-pro-vs-oss)* |
-| **Serena MCP** | language-server-powered semantic code navigation, refactoring, and editing; provides IDE-grade symbol search (`find_symbol`, `find_referencing_symbols`), structural edits (`insert_after_symbol`, `rename_symbol`, `replace_symbol_body`), project onboarding/memories, and LSP integration across 20+ languages (Python, TypeScript, Go, Rust, Java, etc.); complements Filesystem MCP by working at the semantic level instead of whole-file operations | HTTP with locally-running server (cloned from GitHub repo), downloads language servers as needed | None |
-| **Playwright MCP** | browser automation; interact with web pages via structured accessibility snapshots (no vision models needed); navigate, click, type, extract content; supports Chrome, Firefox, WebKit, with device emulation, storage state, custom scripts; runs headless or headed; all processing local via Playwright | stdio with private client-managed instances | None (free, local execution) |
-
-## Non-MCP tools
-
-| Tool | Type | Functionality |
-| :--- | :--- | :------------ |
-| **[GitHub SpecKit](https://github.github.io/spec-kit/) *(amazing, your agents will never hallucinate or get distracted again)*** | Command-line tool | Enables **Spec-Driven Development** via `specify` CLI, which allows making detailed constitution/spec/plan/tasks for each project; agent-agnostic templates (Copilot/Claude/Gemini) |
-
-> There's a bit of a learning curve for getting to use GitHub SpecKit, but it's definitely worth it
diff --git a/uv.lock b/uv.lock
index 71615277..62840a74 100644
--- a/uv.lock
+++ b/uv.lock
@@ -8,6 +8,7 @@ version = "0.1.0"
 source = { editable = "." }
 dependencies = [
     { name = "pyyaml" },
+    { name = "tomlkit" },
 ]
 
 [package.dev-dependencies]
@@ -22,7 +23,10 @@ dev = [
 ]
 
 [package.metadata]
-requires-dist = [{ name = "pyyaml", specifier = ">=6.0" }]
+requires-dist = [
+    { name = "pyyaml", specifier = ">=6.0" },
+    { name = "tomlkit", specifier = ">=0.12" },
+]
 
 [package.metadata.requires-dev]
 dev = [
@@ -393,6 +397,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/1c/4c/cc276ce57e572c102d9542d383b2cfd551276581dc60004cb94fe8774c11/responses-0.25.8-py3-none-any.whl", hash = "sha256:0c710af92def29c8352ceadff0c3fe340ace27cf5af1bbe46fb71275bcd2831c", size = 34769, upload-time = "2025-08-08T19:01:45.018Z" },
 ]
 
+[[package]]
+name = "tomlkit"
+version = "0.14.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/c3/af/14b24e41977adb296d6bd1fb59402cf7d60ce364f90c890bd2ec65c43b5a/tomlkit-0.14.0.tar.gz", hash = "sha256:cf00efca415dbd57575befb1f6634c4f42d2d87dbba376128adb42c121b87064", size = 187167, upload-time = "2026-01-13T01:14:53.304Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b5/11/87d6d29fb5d237229d67973a6c9e06e048f01cf4994dee194ab0ea841814/tomlkit-0.14.0-py3-none-any.whl", hash = "sha256:592064ed85b40fa213469f81ac584f67a4f2992509a7c3ea2d632208623a3680", size = 39310, upload-time = "2026-01-13T01:14:51.965Z" },
+]
+
 [[package]]
 name = "types-pyyaml"
 version = "6.0.12.20250915"