Agent Intelligence: Metacognitive Monitoring #52

@samugit83

Description

Add a metacognitive monitoring system that runs alongside the think node — detecting reasoning loops, assessing confidence, identifying knowledge gaps, and automatically recommending strategy changes before the agent wastes iterations.

Why this matters

The agent's current failure detection is primitive: "3+ consecutive failures triggers a warning in the prompt." This misses most stuck patterns:

  1. Oscillation loops: the agent alternates between nmap and nuclei endlessly (A→B→A→B) — each individual tool "succeeds" (returns output) but makes no progress. The current detector doesn't catch this because there are no consecutive failures.
  2. Confidence collapse: after 5 failed exploit attempts, the agent keeps trying variants of the same approach with decreasing likelihood of success. It doesn't recognize that its confidence should be near-zero and it should switch strategies entirely.
  3. Knowledge gaps masquerading as tool failures: the agent runs search CVE-2024-XXXXX in Metasploit, gets "no results," and tries again with slightly different syntax. The real problem is the CVE has no Metasploit module — the agent should web_search for alternative exploit code or try a different approach. Without knowledge gap detection, it just retries.
  4. Progress stalls: the agent has been in the "informational" phase for 20 iterations. It keeps gathering more recon data instead of transitioning to exploitation. Nothing is "failing" — it's just not making meaningful forward progress.
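Pattern 1 is the one the current consecutive-failure counter structurally cannot catch, since every call in the loop "succeeds." A window check over recent tool names is enough to detect it; this is an illustrative sketch (the function name and the `recent_tools` input, assumed to be the last N tool names pulled from the execution trace, are hypothetical):

```python
def detect_oscillation(recent_tools: list[str], window: int = 6) -> bool:
    """Detect A->B->A->B alternation in the last `window` tool calls.

    Each call may individually succeed, so this catches loops that
    consecutive-failure counting misses. Exact repeats (A->A->A) are a
    separate check and intentionally return False here.
    """
    tail = recent_tools[-window:]
    if len(tail) < 4:
        return False
    # Exactly two distinct tools, strictly alternating at every step.
    alternating = all(a != b for a, b in zip(tail, tail[1:]))
    return len(set(tail)) == 2 and alternating
```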

What needs to be built

  • MetacognitiveMonitor class with four detection methods:
    • _detect_loop() — exact repeats AND oscillation patterns in last 6 tool calls
    • _assess_confidence() — success/failure ratio + findings count → 0-1 score
    • _detect_knowledge_gap() — "unknown", "not found", "no results" error patterns
    • _check_progress() — iterations since last meaningful finding or phase change
  • MetacognitiveReport output model (loop_detected, confidence_level, knowledge_gap, progress_stalled, strategy_recommendation)
  • Integration into think node: inject warnings into system prompt when issues detected
  • Strategy recommendations: "CHANGE_STRATEGY", "ESCALATE" (ask user), "RESEARCH" (web_search)
  • Confidence threshold for triggering deeper reasoning (ties into System 1/2 if implemented)

What already exists

  • Basic 3-failure loop detection in prompts/base.py (lines 642-671)
  • execution_trace in AgentState with full tool call history
  • current_iteration counter
  • Phase tracking (informational/exploitation/post_exploitation)

Implementation notes

  • Pure Python, no infrastructure changes needed
  • Runs as a synchronous check before each think node LLM call
  • Warnings are injected as text into the system prompt (no architectural changes)
  • Can be toggled via project settings if needed
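Since warnings are plain text appended to the system prompt, the integration hook can be a single pure function called right before the think node's LLM call. Function name, heading text, and report fields below are hypothetical; the real hook would live wherever the system prompt is assembled:

```python
def inject_metacognitive_warnings(system_prompt: str, report) -> str:
    """Append a warning block to the system prompt when the monitor flags issues.

    `report` is expected to expose loop_detected, knowledge_gap, and
    strategy_recommendation attributes (see the proposed report model).
    """
    warnings = []
    if report.loop_detected:
        warnings.append("You appear to be repeating or alternating the same "
                        "tools without progress. Change your approach.")
    if report.knowledge_gap:
        warnings.append("Recent errors suggest missing knowledge, not tool "
                        "failure. Consider web_search before retrying.")
    if report.strategy_recommendation:
        warnings.append(f"Recommended action: {report.strategy_recommendation}")
    if not warnings:
        return system_prompt  # no issues: prompt passes through untouched
    block = "\n".join(f"- {w}" for w in warnings)
    return f"{system_prompt}\n\n## METACOGNITIVE WARNINGS\n{block}"
```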

Metadata


    Labels

enhancement (New feature or request), help wanted (Extra attention is needed)

    Projects

    Status

    Up for grabs
