Agent Intelligence: Hierarchical Planning (Plan-and-Act) #50

@samugit83

Description

Replace the flat todo list with a two-level Planner → Executor hierarchy. The Planner sets strategic milestones with success criteria, iteration budgets, and contingency plans. The Executor (existing ReAct loop) works within the current milestone's scope. The Planner re-evaluates after each milestone completes or when the Executor gets stuck.

Why this matters

The agent's #1 failure mode on long engagements is losing strategic direction. After 15+ iterations, the flat todo list becomes stale and the agent drifts — repeating scans, trying random exploits, or spending 30 iterations on recon when it should have started exploitation 20 iterations ago.

A Planner solves this by:

  1. Budget enforcement: each milestone gets a max iteration count. "Reconnaissance" gets 5 iterations — if it's not done by then, the planner forces a decision: proceed with what you have or ask the user.
  2. Contingency planning: "If SQL injection fails on the web app, pivot to SSH brute force." Without this, the agent perseverates on a failing strategy because it has no pre-planned alternative.
  3. Progress tracking with success criteria: "Reconnaissance is complete when we have open ports, services, and at least 1 potential vulnerability." The current todo list has no completion criteria — the agent never knows when to stop a phase.
  4. Replanning on failure: when 3+ iterations pass with no progress, the planner activates a contingency plan automatically instead of looping.
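
The budget and replanning rules above can be sketched as a single decision function. This is a minimal illustration, not the project's implementation: the `Milestone` dataclass, `STALL_LIMIT`, `next_action`, and the returned strings are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Milestone:
    name: str
    max_iterations: int  # iteration budget for this milestone
    # Pre-planned fallbacks, tried in order (hypothetical field).
    contingencies: List[str] = field(default_factory=list)


STALL_LIMIT = 3  # "3+ iterations with no progress", taken from point 4


def next_action(m: Milestone, iterations_used: int, stalled_iterations: int) -> str:
    """Return 'continue', 'contingency:<name>', or 'ask_user'."""
    over_budget = iterations_used >= m.max_iterations
    stalled = stalled_iterations >= STALL_LIMIT
    if not (over_budget or stalled):
        return "continue"
    if m.contingencies:
        # A fallback exists, so switch strategy instead of looping.
        return "contingency:" + m.contingencies[0]
    # No fallback left: hand the decision back to the user.
    return "ask_user"


recon = Milestone("Reconnaissance", max_iterations=5)
web = Milestone("Exploit web app", max_iterations=8, contingencies=["SSH brute force"])

print(next_action(recon, 2, 0))  # continue
print(next_action(recon, 5, 0))  # ask_user
print(next_action(web, 4, 3))    # contingency:SSH brute force
```

Treating "over budget" and "stalled" as the same trigger is one possible design; the real planner might prefer to let the user choose to proceed with partial results when only the budget is exhausted.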

Architecture (from README.EMPOWER.md Section 3.1)

                    ┌─────────────────┐
                    │     PLANNER     │  (Strategic: sets milestones, tracks progress)
                    │  "What to do"   │
                    └────────┬────────┘
                             │ Plan / Replan
                    ┌────────▼────────┐
                    │    EXECUTOR     │  (Tactical: existing ReAct loop)
                    │  "How to do it" │
                    └────────┬────────┘
                             │ Status update
                    ┌────────▼────────┐
                    │    PLANNER      │  (Evaluate: are we on track?)
                    └─────────────────┘
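
One way to read the diagram is as a nested loop: the Planner hands down one milestone at a time, the Executor iterates within that milestone's budget, and each iteration reports status back. The sketch below assumes a hypothetical `step` callable standing in for one ReAct iteration; `run_plan` and the log format are illustrative only.

```python
from typing import Callable, List, Tuple

# Hypothetical executor signature: takes a milestone name, runs one ReAct
# iteration, and returns True once that milestone's success criteria are met.
ExecutorStep = Callable[[str], bool]


def run_plan(milestones: List[Tuple[str, int]], step: ExecutorStep) -> List[str]:
    """Drive the Planner -> Executor -> Planner cycle and return an event log."""
    log: List[str] = []
    for name, budget in milestones:        # Planner: hand down the next milestone
        for used in range(1, budget + 1):  # Executor: work within the budget
            if step(name):                 # status update back to the Planner
                log.append(f"{name}: done in {used}")
                break
        else:
            # Budget ran out without success: stop and let the Planner replan.
            log.append(f"{name}: budget exhausted, replan")
            break
    return log


# Fake executor: recon succeeds on its second iteration, exploitation never does.
calls = {}

def fake_step(name: str) -> bool:
    calls[name] = calls.get(name, 0) + 1
    return name == "recon" and calls[name] >= 2


print(run_plan([("recon", 5), ("exploit", 3), ("post", 2)], fake_step))
# ['recon: done in 2', 'exploit: budget exhausted, replan']
```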

What already exists

  • AgentState.todo_list — lightweight LLM-managed todo (no budgets, no completion criteria)
  • Phase system (informational → exploitation → post_exploitation) — coarse-grained
  • Basic failure loop detection (3+ consecutive failures trigger a warning)
  • conversation_objectives tracking in state

What needs to be built

  • StrategicPlan Pydantic model with milestones, success criteria, iteration budgets, contingencies
  • _planner_check conditional node before think — evaluates progress, triggers replan
  • Replanning logic: budget exceeded → activate contingency or ask user
  • Upgrade todo_list to strategic_plan in AgentState
  • Planner LLM prompt that generates milestone-based plans from user objectives
  • Progress evaluation: compare current state against milestone success criteria
  • Frontend display of current plan/milestone status (optional)
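
As a starting point for the first few items, the plan model and the progress check might look like the sketch below. It uses standard-library dataclasses as a stand-in for the Pydantic model so that it runs without dependencies. All field names, the `State` alias, `unmet_criteria`, and the route strings returned by `planner_check` are assumptions for illustration, and success criteria are modelled here as plain Python predicates, whereas the real planner may well store them as text for the LLM to evaluate.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

State = Dict[str, Any]  # stand-in for the relevant slice of AgentState


@dataclass
class Milestone:
    name: str
    success_criteria: Dict[str, Callable[[State], bool]]  # label -> check
    max_iterations: int
    contingencies: List[str] = field(default_factory=list)
    iterations_used: int = 0


@dataclass
class StrategicPlan:
    objective: str
    milestones: List[Milestone]
    current: int = 0  # index of the active milestone


def unmet_criteria(m: Milestone, state: State) -> List[str]:
    """Labels of the success criteria that the current state does not satisfy."""
    return [label for label, check in m.success_criteria.items() if not check(state)]


def planner_check(plan: StrategicPlan, state: State) -> str:
    """Routing decision taken before the think step: 'think', 'replan', or 'done'."""
    if plan.current >= len(plan.milestones):
        return "done"
    m = plan.milestones[plan.current]
    if not unmet_criteria(m, state):
        plan.current += 1  # milestone complete, advance the plan
        return "done" if plan.current >= len(plan.milestones) else "think"
    if m.iterations_used >= m.max_iterations:
        return "replan"  # budget exceeded with criteria still unmet
    return "think"


recon = Milestone(
    name="Reconnaissance",
    success_criteria={
        "open ports": lambda s: bool(s.get("open_ports")),
        "services": lambda s: bool(s.get("services")),
        "potential vulnerability": lambda s: len(s.get("vulnerabilities", [])) >= 1,
    },
    max_iterations=5,
)
plan = StrategicPlan(objective="Assess target host", milestones=[recon])
state = {"open_ports": [22, 80], "services": ["ssh", "http"]}

print(unmet_criteria(recon, state))  # ['potential vulnerability']
print(planner_check(plan, state))    # think
```

In a graph-based agent, the string returned by `planner_check` would be the natural input to a conditional edge, but how it is wired into the existing graph is left open here.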

Metadata

  • Assignees: none
  • Labels: enhancement, help wanted
  • Status: Up for grabs
  • Milestone: none