Open
Labels: enhancement, help wanted
Description
Replace the flat todo list with a two-level Planner → Executor hierarchy. The Planner sets strategic milestones with success criteria, iteration budgets, and contingency plans. The Executor (existing ReAct loop) works within the current milestone's scope. The Planner re-evaluates after each milestone completes or when the Executor gets stuck.
Why this matters
The agent's #1 failure mode on long engagements is losing strategic direction. After 15+ iterations, the flat todo list becomes stale and the agent drifts — repeating scans, trying random exploits, or spending 30 iterations on recon when it should have started exploitation 20 iterations ago.
A Planner solves this by:
- Budget enforcement: each milestone gets a max iteration count. "Reconnaissance" gets 5 iterations — if it's not done by then, the planner forces a decision: proceed with what you have or ask the user.
- Contingency planning: "If SQL injection fails on the web app, pivot to SSH brute force." Without this, the agent perseverates on a failing strategy because it has no pre-planned alternative.
- Progress tracking with success criteria: "Reconnaissance is complete when we have open ports, services, and at least 1 potential vulnerability." The current todo list has no completion criteria — the agent never knows when to stop a phase.
- Replanning on failure: when 3+ iterations pass with no progress, the planner activates a contingency plan automatically instead of looping.
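The budget and stall rules above can be sketched as a single decision function. This is a minimal illustration rather than the project's implementation: `Decision`, `MilestoneProgress`, `decide`, and `STALL_THRESHOLD` are hypothetical names, and treating an exhausted budget the same way as a stall is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"                 # keep executing the current milestone
    ADVANCE = "advance"                   # success criteria met, move on
    ACTIVATE_CONTINGENCY = "contingency"  # switch to the pre-planned alternative
    ASK_USER = "ask_user"                 # no alternative left, escalate


@dataclass
class MilestoneProgress:
    name: str
    max_iterations: int                  # iteration budget for this milestone
    iterations_used: int = 0
    iterations_since_progress: int = 0   # consecutive iterations with no progress
    criteria_met: bool = False
    contingencies: list[str] = field(default_factory=list)


# Assumed value, taken from the "3+ iterations with no progress" rule.
STALL_THRESHOLD = 3


def decide(m: MilestoneProgress) -> Decision:
    """Pick the planner's next move for one milestone."""
    if m.criteria_met:
        return Decision.ADVANCE
    over_budget = m.iterations_used >= m.max_iterations
    stalled = m.iterations_since_progress >= STALL_THRESHOLD
    if over_budget or stalled:
        # Prefer a pre-planned alternative; escalate only when none exists.
        if m.contingencies:
            return Decision.ACTIVATE_CONTINGENCY
        return Decision.ASK_USER
    return Decision.CONTINUE
```

Under these assumptions, a "Reconnaissance" milestone with a budget of 5 that has used all 5 iterations and has no contingencies yields `Decision.ASK_USER`, while a stalled milestone with "SSH brute force" listed as a contingency yields `Decision.ACTIVATE_CONTINGENCY`.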
Architecture (from README.EMPOWER.md Section 3.1)
┌─────────────────┐
│     PLANNER     │  (Strategic: sets milestones, tracks progress)
│  "What to do"   │
└────────┬────────┘
         │ Plan / Replan
┌────────▼────────┐
│    EXECUTOR     │  (Tactical: existing ReAct loop)
│ "How to do it"  │
└────────┬────────┘
         │ Status update
┌────────▼────────┐
│     PLANNER     │  (Evaluate: are we on track?)
└─────────────────┘
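As a rough illustration of the cycle in the diagram, the loop below hands one milestone at a time to an executor callback and evaluates the result after every iteration. `run_plan` and `ExecutorStep` are hypothetical names; the real executor is the existing ReAct loop, represented here only by a callback that reports whether the milestone's success criteria are met.

```python
from typing import Callable

# Hypothetical type: runs one ReAct iteration for the named milestone and
# returns True once that milestone's success criteria are satisfied.
ExecutorStep = Callable[[str], bool]


def run_plan(milestones: list[tuple[str, int]], executor_step: ExecutorStep) -> list[str]:
    """Drive the Planner -> Executor -> Planner cycle and return a status log.

    Each milestone is a (name, iteration_budget) pair.
    """
    log: list[str] = []
    for name, budget in milestones:            # Planner: select the next milestone
        for iteration in range(1, budget + 1):
            done = executor_step(name)         # Executor: one tactical iteration
            if done:                           # Planner: evaluate the status update
                log.append(f"{name}: completed in {iteration} iteration(s)")
                break
        else:
            # Budget exhausted without meeting the criteria: stop so the
            # planner can replan instead of starting later milestones.
            log.append(f"{name}: budget of {budget} exhausted, replan needed")
            break
    return log
```

The sketch stops at the first milestone that runs out of budget, which is where a contingency plan or a question to the user would take over.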
What already exists
- `AgentState.todo_list`: lightweight LLM-managed todo (no budgets, no completion criteria)
- Phase system (informational → exploitation → post_exploitation): coarse-grained
- Basic failure loop detection (3+ consecutive failures triggers warning)
- `conversation_objectives` tracking in state
What needs to be built
- `StrategicPlan` Pydantic model with milestones, success criteria, iteration budgets, contingencies
- `_planner_check` conditional node before `think`: evaluates progress, triggers replan
- Replanning logic: budget exceeded → activate contingency or ask user
- Upgrade `todo_list` to `strategic_plan` in AgentState
- Planner LLM prompt that generates milestone-based plans from user objectives
- Progress evaluation: compare current state against milestone success criteria
- Frontend display of current plan/milestone status (optional)
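One possible shape for the plan model and the check that runs before `think` is sketched below. It uses standard-library dataclasses as a dependency-free stand-in for the Pydantic model the list asks for. All field names, the `criteria_met` helper, and the `"replan"` and `"finish"` route names are assumptions. Success criteria are simplified to keys that must be non-empty in a findings dictionary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Milestone:
    title: str
    success_criteria: list[str]   # finding keys that must be non-empty
    max_iterations: int           # iteration budget
    contingencies: list[str] = field(default_factory=list)
    iterations_used: int = 0


@dataclass
class StrategicPlan:
    objective: str
    milestones: list[Milestone]
    current_index: int = 0

    @property
    def current(self) -> Optional[Milestone]:
        """The active milestone, or None when the plan is finished."""
        if self.current_index < len(self.milestones):
            return self.milestones[self.current_index]
        return None


def criteria_met(milestone: Milestone, findings: dict[str, list]) -> bool:
    """A milestone is complete when every criterion has at least one finding."""
    return all(findings.get(key) for key in milestone.success_criteria)


def planner_check(plan: StrategicPlan, findings: dict[str, list]) -> str:
    """Return the next node name: 'think', 'replan', or 'finish' (assumed names)."""
    milestone = plan.current
    if milestone is None:
        return "finish"
    if criteria_met(milestone, findings):
        plan.current_index += 1   # advance to the next milestone
        return "think" if plan.current is not None else "finish"
    if milestone.iterations_used >= milestone.max_iterations:
        return "replan"           # budget exceeded: contingency or ask user
    return "think"
```

For the reconnaissance example, the criteria would be `["open_ports", "services", "vulnerabilities"]`: the check keeps routing to `think` until all three have findings, and routes to `replan` if the budget runs out first.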
Project status: Up for grabs