A multi-agent system that discovers real-world frictions from the web and turns them into a ranked shortlist of startup-idea candidates — built entirely as orchestrated Claude Code subagents that communicate through structured JSON.
Status: Architecturally complete; each agent is individually built and calibrated. It has not yet been run fully end-to-end, and it produces candidates for a downstream feasibility workflow — not validated business cases. See Status.
Most LLM brainstorming collapses to the same handful of tech-flavored ideas. Friction Engine fights that on purpose: it starts from a deliberately heterogeneous set of personas — spanning professions, life stages, and economic classes, explicitly away from the model's default tech-centric "diverse" centroid — sends an agent into the web to find where each persona actually hurts, then runs a multi-round agent debate to generate and pressure-test solutions to that pain.
The output is grounded in observed friction, not speculation.
The architecture is a direct response to three documented failure modes of single-shot LLM ideation: fixation (early ideas anchor everything after them), homogenization (independent prompts converge on the same "hivemind" answers), and the ideation–execution gap (LLM ideas score well until you actually try to build them). Persona heterogeneity attacks homogenization; the multi-round critique loop attacks fixation. The research grounding and citations live in llm-ideation-methodology.md.
Seven agents and one orchestrator. Each phase is an isolated subagent invocation that consumes and emits JSON — no long-lived conversation carries state across phases.
| # | Agent | Model | Role |
|---|---|---|---|
| 1 | persona-generator | Sonnet | Generates 6–8 deliberately diverse personas; self-audits coverage gaps |
| 2 | discovery-agent | Sonnet + Web | One per persona — searches and fetches public sources for real frictions |
| 3 | aggregator | Sonnet | Semantic dedup/merge across parallel outputs (4 modes, incl. a groupthink check) |
| 4 | vetter | Sonnet | Scores each friction through JTBD + Mom Test lenses (opportunity = √(importance × dissatisfaction)) |
| 5 | ideation-agent | Opus | 3-round debate per friction: propose → cross-critique → revise (4 agents × 3 rounds) |
| 6 | light-filter | Sonnet | Feasibility screen (bypassed by default — see below) |
| 7 | output-synthesizer | Sonnet | Composite-scores, ranks, and emits JSON + a human-readable report |
The orchestrator (/run-pipeline) sequences the phases, pauses at a cost-gated approval checkpoint before the Opus ideation burn, and supports resume and telemetry.
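The opportunity formula in the vetter row can be sketched in a few lines. This is a minimal illustration, not the vetter's actual code: the function name and the 1-10 rating scale are assumptions, and only the formula itself comes from the table above.

```python
import math


def opportunity_score(importance: float, dissatisfaction: float) -> float:
    """Geometric mean of the two ratings: sqrt(importance * dissatisfaction)."""
    if importance < 0 or dissatisfaction < 0:
        raise ValueError("ratings must be non-negative")
    return math.sqrt(importance * dissatisfaction)


# Hypothetical ratings on an assumed 1-10 scale; both pairs sum to 12.
balanced = opportunity_score(6, 6)    # important and poorly served
lopsided = opportunity_score(10, 2)   # important but already well served
```

Because the score is a geometric mean, a friction has to rate high on both importance and dissatisfaction to rank well; the lopsided pair scores lower than the balanced one even though the ratings add up to the same total.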
personas → discovery (×N personas, parallel) → aggregate
→ vet (×N frictions) → [APPROVAL GATE]
→ ideate (4 agents × 3 rounds / friction) ⇄ aggregate each round
→ filter → synthesize → ranked report
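The reason the approval gate sits in front of ideation is visible from the fan-out arithmetic: 4 agents × 3 rounds per friction. A rough sketch of that count follows; the function names and the budget check are hypothetical, and the figure covers only the ideation calls, not the per-round aggregation.

```python
def ideation_calls(n_frictions: int, agents: int = 4, rounds: int = 3) -> int:
    """Number of ideation-agent invocations for a run (aggregation excluded)."""
    if n_frictions < 0:
        raise ValueError("n_frictions must be non-negative")
    return n_frictions * agents * rounds


def within_budget(n_frictions: int, max_calls: int) -> bool:
    """Hypothetical gate check: does the planned ideation fan-out fit the budget?"""
    return ideation_calls(n_frictions) <= max_calls


# Example: 5 vetted frictions mean 5 * 4 * 3 = 60 Opus invocations.
planned = ideation_calls(5)
```

Under these assumptions, each additional friction that passes vetting adds 12 ideation calls, which is presumably why limiting the friction count before the gate is the main cost lever.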
The hard part wasn't the prompts — it was making claude -p subagents behave reliably under orchestration. A few hard-won calibrations, each documented with run data in PIPELINE_STATUS.md:
- Output truncation is silent. claude -p caps output at 32K tokens by default; the heavy aggregator would get cut off mid-array with no error. A 5-run isolation study pinned it down: the fix is CLAUDE_CODE_MAX_OUTPUT_TOKENS=128000.
- Tool permissions don't inherit. claude -p ignores the tools: frontmatter; without an explicit --allowedTools WebSearch,WebFetch flag, discovery agents silently fall back to hallucinating frictions from training data (observed: 5 of 8 personas fabricated, 3 of 8 honestly reported zero). The honest-zero failures are what surfaced the bug.
- Long jobs outlive the Bash ceiling. The heavy aggregator runs 15–25 min, past the 600s Bash-tool limit, so it dispatches in the background and waits on a completion signal rather than blocking.
- Don't filter twice. The light-filter phase is bypassed by default on purpose: a downstream workflow independently re-evaluates every candidate, so internal feasibility filtering is either duplicate work or premature pruning.
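Since truncation arrives with no error, one defensive option is to refuse any agent output that does not parse as complete JSON before handing it to the next phase. The sketch below assumes each phase's output is available as a single JSON string; the function name is hypothetical and this is not necessarily how the orchestrator checks it.

```python
import json


def parse_agent_output(raw: str):
    """Parse one phase's output, failing loudly if the JSON is incomplete."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # An output cut off mid-array lands here instead of flowing downstream.
        raise ValueError(
            f"agent output is not complete JSON (possible truncation): {exc}"
        ) from exc


# A complete array parses; a cut-off one raises ValueError.
frictions = parse_agent_output('[{"id": 1}, {"id": 2}]')
```

A check like this turns a silent mid-array cutoff into an explicit failure at the phase boundary, so a truncated aggregator run cannot be mistaken for a short but valid result.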
Prerequisites: Claude Code CLI, jq, awk, and a Claude plan with Opus headroom (the ideation step is the cost bottleneck).
/run-pipeline # from a Claude Code session in the project root
/run-pipeline --mock-opus # use Sonnet for ideation while testing
/run-pipeline --resume RUN_ID --limit-frictions N --threshold F
Run artifacts land in data/runs/<timestamp>/ (gitignored).
Each agent is built and individually calibrated; the orchestrator implements all phases, the approval gate, resume, and telemetry. What remains is a full end-to-end production run and the larger downstream feasibility workflow these candidates feed into. Treat the output as candidates to investigate, not vetted conclusions.
MIT — see LICENSE.