Problem
squad.agent.md is ~95k characters and is loaded into every session's system prompt in full. Most of that content is reference material the coordinator only needs sometimes — casting algorithms, ceremony config, PRD intake, issue lifecycle, plugin marketplace, etc.
Impact on any LLM harness:
- Per-turn input cost — the full charter is re-sent on every API call. Prompt caching helps but cache misses (session start, idle gaps beyond cache TTL) pay the full cost.
- Context window consumed — a fixed window means every character in the system prompt is unavailable for conversation, tool output, or file content. Long sessions hit compaction sooner.
- Time-to-first-token — the model must process the full prompt before emitting. Scales with size.
The charter already marks ~11 sections as "on-demand reference" with pointers to template files — but the full body is duplicated inline. That duplication is the root cause.
Proposed approach
For each "on-demand reference" section, remove the full body from squad.agent.md and keep only a trigger stub — enough for the coordinator to know when to read the template, not what it says.
Example stub:
## Ceremonies
Structured team alignment meetings (design reviews, retros, planning).
**Triggers:** user requests a ceremony by name; auto-triggered \`before\`/\`after\` rules in \`.squad/ceremonies.md\`.
**On-demand reference:** Read \`.squad/templates/ceremony-reference.md\` for config format, facilitator spawn template, and execution rules.
Stays inline (always-loaded core, critical path):
- Coordinator identity, refusal rules, mode detection
- Routing table, Response Mode Selection, Eager Execution
- Per-harness spawn templates
- Drop-box pattern, Worktree Awareness, Orchestration Logging
- Reviewer Rejection Protocol, Source of Truth Hierarchy
- Anti-patterns
Moves out (on-demand, loaded only when triggered):
- Casting algorithm + universe table
- Ceremony config schema + facilitator template
- PRD intake flow
- Issue lifecycle (branch/PR/merge spawn prompts)
- Human / coding-agent member details
- Ralph's full work-check cycle + idle-watch
- Plugin marketplace installation flow
- MCP config + per-service fallback rules
- Multi-agent artifact format, constraint budget tracking
Expected impact
Charter drops from ~95k → estimated 35–45k chars. On-demand sections load only when triggered — zero cost on turns that don't touch them. Benefits all supported harnesses, not just ones that surface size warnings.
Open questions
- Stale references — install/upgrade should verify all referenced template files exist.
- Discoverability — readers browsing the charter on GitHub lose the full prose; README should index template files.
- Trigger fidelity — stubs need enough signal for the coordinator to know when to load the template. Too terse and routing breaks.
- Import chains — do the supported harnesses follow transitive includes inside instruction files? If yes, stubs could include templates for single-source behavior. If no, agents must read templates at runtime.
- Version stamping — the HTML version comment must stay in whichever file the install script checks.
Acceptance criteria
Problem
squad.agent.mdis ~95k characters and is loaded into every session's system prompt in full. Most of that content is reference material the coordinator only needs sometimes — casting algorithms, ceremony config, PRD intake, issue lifecycle, plugin marketplace, etc.Impact on any LLM harness:
The charter already marks ~11 sections as "on-demand reference" with pointers to template files — but the full body is duplicated inline. That duplication is the root cause.
Proposed approach
For each "on-demand reference" section, remove the full body from
squad.agent.mdand keep only a trigger stub — enough for the coordinator to know when to read the template, not what it says.Example stub:
Stays inline (always-loaded core, critical path):
Moves out (on-demand, loaded only when triggered):
Expected impact
Charter drops from ~95k → estimated 35–45k chars. On-demand sections load only when triggered — zero cost on turns that don't touch them. Benefits all supported harnesses, not just ones that surface size warnings.
Open questions
Acceptance criteria