Skip to content

v0.8.68 WhaleFlow: context budget management for high-fan-out orchestration #4015

Description

@Hmbown

Problem

When a conductor agent orchestrates 30+ sub-agents, the parent context balloons from agent completion summaries. Each <codewhale:runtime_event kind="subagent_completion"> carries a full self-report (summary, evidence, changes, risks, blockers) — ~1-3KB per agent. At 41 agents, that's 40-120KB of transcript growth from completions alone, before any tool output or model responses.

This causes:

  • Slower model inference (larger prompt)
  • Reduced prefix-cache hit rate (varying completion text breaks the stable prefix)
  • Parent context pressure forcing premature compaction
  • The conductor loses its own working memory to agent noise

Scope

Implement context budget management specifically for orchestrated agent sessions:

  1. Minimized agent briefs — When dispatching an agent from a conductor, send only the context that agent needs (its task + relevant dependency outputs), not the full conductor transcript
  2. Compressed agent results — In the parent transcript, store only: agent_id, status, verdict, files_changed, key_stats. Full agent output goes to artifact files referenced by retrieve_tool_result
  3. Completion batching — When multiple agents complete in rapid succession, batch their results into a single compacted entry
  4. Auto-summarization — After N agent completions (configurable, default 5), auto-compact the parent transcript to summarize completed agent work while preserving active context
  5. Context pressure gating — The conductor should not dispatch more agents if the parent context is above a threshold (e.g., 60%)

Non-goals

  • Not removing agent event data entirely. Full transcripts should be retrievable.
  • Not changing how single-agent sessions work. This is for orchestrated multi-agent sessions.

Acceptance

  • 30 agent completions add <10KB to the parent transcript (vs. ~100KB today)
  • Full agent transcripts are retrievable via artifact references
  • Conductor agent can dispatch new agents without its own context being pushed out
  • Context pressure stays below 60% during a 30-agent orchestration run
  • Tests: measure transcript size after N simulated completions; verify artifact retrieval

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestv0.8.68Targeting v0.8.68whaleflowWhaleFlow branch/leaf workflow runtime and workflow mode

    Projects

    Status
    Backlog

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions