Plan mode: agent never submits a plan or asks for operator approval before running actions #800

@kylejryan

Summary

While the operator was in plan mode, the agent never called submit_plan, never showed the plan-review UI (Y approve · N reject · or type feedback to refine), and never asked for approval. It just ran a long sequence of read-only recon actions until the operator hit Ctrl-C.

Plan mode's product contract is "agent proposes, operator approves, then we execute." Right now there's no forcing function — the agent can recon indefinitely and never produce a plan, and the operator never gets a chance to approve anything before the agent starts acting.

Screenshot

[Image: plan-mode-bug]

The status bar reads "Plan (shift+tab to cycle)". The agent had been listing responses as HITS (200 OK) versus everything else, with no plan submission, until the operator interrupted manually.

Reproduction

  1. /operator, then Shift+Tab to cycle into Plan mode.
  2. Issue any recon-flavored directive ("crawl this target", "enumerate endpoints", etc.).
  3. Watch the agent rack up http_request / crawl calls without ever calling submit_plan. The review modal never appears.
  4. The only escape is Ctrl-C. Shift+Tab cycling out only opens the review modal if a plan file already exists on disk — otherwise it silently drops into manual with no approval gate.

Expected behavior

Plan mode should be a gated two-phase flow:

  1. Propose — agent performs recon and writes a plan via write_plan.
  2. Approve — agent calls submit_plan, the run halts, the review modal appears. Operator presses Y, N, or types feedback. Only Y transitions the operator into an execution mode (manual / auto) with the approved plan injected into the system prompt.

Concretely, that means:

  • A successful submit_plan is a hard stop. The agent must not call any further tools until the operator decides.
  • The transition plan → manual|auto happens only as a result of an operator approval, not as a silent fallback from cycling.
  • Plan mode should converge. If the agent goes too many steps without submitting, surface a system message and halt the run instead of letting recon drift forever.
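The hard stop and the convergence rule above can both be expressed as stop conditions evaluated after each agent step. The following is a minimal sketch, not the project's actual code: `Step`, `ToolCall`, `calledTool`, `stepBudgetExhausted`, `planModeHalt`, and the `maxReconSteps` budget are all hypothetical stand-ins, and the real step shape in the run loop may differ.

```typescript
// Hypothetical shapes for illustration; the real step/tool-call types may differ.
type ToolCall = { toolName: string };
type Step = { toolCalls: ToolCall[] };
type StopCondition = (steps: Step[]) => boolean;

// Stop as soon as the most recent step called the named tool.
const calledTool = (name: string): StopCondition => (steps) => {
  const last = steps[steps.length - 1];
  return last !== undefined && last.toolCalls.some((c) => c.toolName === name);
};

// Stop once the run has used up its step budget.
const stepBudgetExhausted = (max: number): StopCondition => (steps) =>
  steps.length >= max;

type HaltReason = "plan_submitted" | "no_plan_submitted" | null;

// Decide whether a plan-mode run must halt, and why.
// A submitted plan takes priority over an exhausted budget.
function planModeHalt(steps: Step[], maxReconSteps: number): HaltReason {
  if (calledTool("submit_plan")(steps)) return "plan_submitted";
  if (stepBudgetExhausted(maxReconSteps)(steps)) return "no_plan_submitted";
  return null;
}
```

With a shape like this, the caller would show the review modal on `"plan_submitted"` and surface the system message on `"no_plan_submitted"`; in both cases no further tool call is issued.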

Desired state (what "fixed" looks like)

  • Every transition out of plan into an execution mode is immediately preceded by an operator approval event.
  • After a successful submit_plan, the next event is the review modal — no further tool calls in between.
  • Plan mode runs always reach a terminal state: approved, rejected (back to refine), or halted with a "no plan submitted" message. Never "drifted indefinitely."
  • The plan content the operator approves is the plan content that ends up in the system prompt (no drift between read-time and approve-time).
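The first two properties above are checkable against an event trace, which may be a convenient shape for regression tests. This is a sketch under assumptions: the `PlanEvent` variants and `checkPlanTrace` are hypothetical names, not events the dashboard is known to emit, and the trace is assumed to start in plan mode.

```typescript
// Hypothetical event log entries; names are illustrative only.
type Mode = "plan" | "manual" | "auto";
type PlanEvent =
  | { kind: "tool_call"; toolName: string }
  | { kind: "review_modal_shown" }
  | { kind: "operator_approved" }
  | { kind: "operator_rejected" }
  | { kind: "mode_changed"; to: Mode }
  | { kind: "halted_no_plan" };

// Returns the violations found in a trace (an empty array means none).
function checkPlanTrace(events: PlanEvent[]): string[] {
  const violations: string[] = [];
  let mode: Mode = "plan"; // assumption: the trace starts in plan mode
  for (let i = 0; i < events.length; i++) {
    const ev = events[i];
    if (ev.kind === "mode_changed") {
      // Leaving plan mode must be immediately preceded by an approval.
      const approved = events[i - 1]?.kind === "operator_approved";
      if (mode === "plan" && ev.to !== "plan" && !approved) {
        violations.push(`event ${i}: left plan mode without approval`);
      }
      mode = ev.to;
    }
    if (ev.kind === "tool_call" && ev.toolName === "submit_plan") {
      // The very next event must be the review modal.
      if (events[i + 1]?.kind !== "review_modal_shown") {
        violations.push(`event ${i}: submit_plan not followed by review modal`);
      }
    }
  }
  return violations;
}
```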

Invariants worth enforcing

  • plan → manual|auto only happens via an operator approval tied to a specific plan snapshot.
  • submit_plan only returns success when a non-empty plan exists on disk for the active scope. (Already true; worth a regression test.)
  • A successful submit_plan halts the current agent step before any further tool call. (The swarm path already does this at src/core/workflows/pentest.ts:262 with stopWhen: hasToolCall("submit_plan") — the operator dashboard run path doesn't.)
  • While agentMode === "plan", the active tool list is strictly PLAN_MODE_TOOL_NAMES. (Already enforced at src/core/agents/offSecAgent/offensiveSecurityAgent.ts:319-324 — worth a unit test so it can't regress.)
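One way to tie an approval to a specific plan snapshot is to freeze the plan text at submit time and inject only that frozen copy on approval. The sketch below is illustrative: `PlanSnapshot`, `Approval`, `submitPlan`, and `enterExecution` are hypothetical names and do not mirror the existing tool implementations.

```typescript
// Hypothetical types; field names are illustrative.
type Mode = "plan" | "manual" | "auto";
type PlanSnapshot = { scope: string; content: string };
type Approval = { snapshot: PlanSnapshot };

// Succeeds only for a non-empty plan; the result freezes what the operator reviews.
function submitPlan(scope: string, planOnDisk: string): PlanSnapshot | null {
  return planOnDisk.trim().length > 0 ? { scope, content: planOnDisk } : null;
}

// Leaving plan mode requires an approval. The injected plan is the approved
// snapshot, not whatever is on disk at transition time.
function enterExecution(
  target: Exclude<Mode, "plan">,
  approval: Approval | null,
): { mode: Mode; injectedPlan: string | null } {
  if (approval === null) return { mode: "plan", injectedPlan: null };
  return { mode: target, injectedPlan: approval.snapshot.content };
}
```

Because the approval carries the snapshot itself, a later write to the plan file cannot change what reaches the system prompt without a fresh submit and approval.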

Pointers for whoever picks this up

These are starting points, not a prescription — pick the shape that fits.

  • The minimal fix is one line: add hasToolCall("submit_plan") to commonInput.stopWhen in src/tui/components/operator-dashboard/index.tsx when agentMode === "plan". That alone closes the "agent keeps going after submit_plan" half of the bug and aligns the operator path with the swarm path.
  • The other half — "agent never submits and the cycle silently drops to manual" — likely wants either a recon budget that surfaces a "no plan submitted" halt, or tightening cycleMode so cycling out of plan without a plan file is a louder failure than today.
  • Plan lifecycle state is currently spread across several refs in operator-dashboard/index.tsx (planSubmittedRef, planRejectedRef, planApprovedPendingRunRef, planGateBypassedOnResumeRef, approvedPlanRef, plus operatorMode and showPlanReview). If you're already in there, collapsing them into one source of truth would make the invariants above easier to enforce — but that's a judgment call, not a requirement for this fix.
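If the refs do get collapsed, a single discriminated union driven by a reducer is one possible shape. This is a sketch only: `PlanState`, `PlanAction`, and `planReducer` are hypothetical, the mapping from the existing refs to these phases is not worked out here, and treating a cycle-out without a plan as a halt is just one reading of "a louder failure".

```typescript
// Hypothetical consolidation of the plan lifecycle; names are illustrative.
type ExecMode = "manual" | "auto";
type PlanState =
  | { phase: "drafting" }
  | { phase: "awaiting_review"; plan: string }
  | { phase: "approved"; plan: string; mode: ExecMode }
  | { phase: "halted"; reason: string };

type PlanAction =
  | { type: "submit"; plan: string }
  | { type: "approve"; mode: ExecMode }
  | { type: "reject" }
  | { type: "cycle_out" }
  | { type: "budget_exhausted" };

const NO_PLAN: PlanState = { phase: "halted", reason: "no plan submitted" };

function planReducer(state: PlanState, action: PlanAction): PlanState {
  switch (action.type) {
    case "submit":
      // Only a non-empty plan moves the run into review.
      return state.phase === "drafting" && action.plan.trim() !== ""
        ? { phase: "awaiting_review", plan: action.plan }
        : state;
    case "approve":
      // Approval is the only path into an execution mode.
      return state.phase === "awaiting_review"
        ? { phase: "approved", plan: state.plan, mode: action.mode }
        : state;
    case "reject":
      return state.phase === "awaiting_review" ? { phase: "drafting" } : state;
    case "cycle_out":
    case "budget_exhausted":
      // Neither event grants execution; without a submitted plan the run halts.
      return state.phase === "drafting" ? NO_PLAN : state;
  }
}
```

In this shape, the `approved` phase is reachable only from `awaiting_review`, so the "approval before execution" invariant holds by construction rather than by coordinating several flags.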

Tests in src/tui/components/operator-dashboard/logic.test.ts already cover the cycle / approval logic — extending that file is the natural place for regression coverage.
