feat: batch subagent spawning #158

Open
hntrl wants to merge 2 commits into main from hunter/batch-task
@hntrl (Member) commented Jan 31, 2026

This PR introduces a declarative mechanism for spawning large batches of subagent tasks (10-1000+) through shell scripts, so models no longer need to make repeated individual tool calls. Instead of relying on the LLM to call the task tool N times (which is unreliable at scale), agents can use a spawn_subagent bash command inside the execute tool to declare tasks, which are then executed in parallel with configurable concurrency.

The Problem

When you have a CSV with 1000 tasks that need to be delegated to subagents, the current approach of having the model call the task tool 1000 times is:

  1. Unreliable - Models struggle to consistently execute repetitive tool calls at scale
  2. Slow - Each tool call requires a round-trip through the model
  3. Context-heavy - 1000 individual ToolMessage responses would blow up the context window
  4. Non-deterministic - The model might skip tasks, repeat tasks, or give up mid-way

Declarative Batch Spawning

This PR introduces a new architecture where:

  1. Declaration happens in shell scripts - The model writes shell code that processes input data (CSV, JSON, etc.) and outputs task markers
  2. Execution is handled by the framework - The agent loop intercepts these markers and triggers parallel execution
  3. Results go to files - To avoid context window explosion, detailed results are written to /batch_results/<timestamp>/

Architecture Overview

sequenceDiagram
    participant Model as LLM
    participant Execute as execute tool
    participant State as Agent State
    participant WrapModel as wrapModelCall
    participant BatchTask as batch_task tool
    participant Subagents as Subagent Pool
    participant FS as Filesystem

    Model->>Execute: execute shell script with spawn_subagent
    Note right of Execute: Script outputs SUBAGENT_TASK markers

    Execute->>Execute: Parse markers from stdout
    Execute->>State: Set pendingBatchTask via Command
    Execute-->>Model: ToolMessage with exit code

    rect rgb(240, 248, 255)
        Note over Model,WrapModel: Interception Phase
        WrapModel->>State: Check pendingBatchTask
        State-->>WrapModel: tasks array found
        WrapModel-->>Model: Inject synthetic batch_task call
    end

    Model->>BatchTask: batch_task with tasks array

    rect rgb(240, 255, 240)
        Note over BatchTask,Subagents: Parallel Execution
        loop Concurrent workers (default 10)
            BatchTask->>Subagents: Execute task
            Subagents-->>BatchTask: Result
        end
    end

    BatchTask->>FS: Write summary.json, results.jsonl
    BatchTask->>State: Clear pendingBatchTask
    BatchTask-->>Model: Summary message

    Note right of Model: Model sees concise summary

Component Interaction

flowchart TB
    subgraph FilesystemMiddleware
        ET[execute tool]
        MP[Marker Parser]
    end

    subgraph SubAgentMiddleware
        WMC[wrapModelCall hook]
        BT[batch_task tool]
        TT[task tool]
    end

    subgraph SharedState
        PBT[(pendingBatchTask)]
    end

    subgraph Execution
        WP[Worker Pool]
        S1[Subagent 1]
        S2[Subagent 2]
        SN[Subagent N]
    end

    ET -->|detects markers| MP
    MP -->|sets| PBT
    WMC -->|reads| PBT
    WMC -->|injects call to| BT
    BT -->|spawns| WP
    WP --> S1
    WP --> S2
    WP --> SN
    BT -->|clears| PBT

    TT -.->|single task| S1

Key Design Decisions

1. Why spawn_subagent Instead of a New Tool?

We inject a bash function rather than creating a separate batch_task_from_file tool because:

  • Flexibility: The model can process ANY input format (CSV, JSON, YAML, line-delimited, API responses) using standard shell tools (jq, awk, cut, etc.)
  • Composability: Shell pipelines are more expressive than tool parameters
  • No Schema Lock-in: We don't need to anticipate every possible input format

# CSV processing (read -r keeps backslashes literal)
while IFS=, read -r id desc priority; do
  spawn_subagent "Process $desc with priority $priority"
done < tasks.csv

# JSON processing
jq -r '.items[] | .task' data.json | while IFS= read -r task; do
  spawn_subagent "$task"
done

# API response processing
curl -s api.example.com/tasks | jq -r '.[]' | while IFS= read -r task; do
  spawn_subagent "$task"
done

2. Why Synthetic Tool Call Injection?

The wrapModelCall hook intercepts the model call and returns a synthetic AIMessage with a batch_task tool call. This approach:

  • Preserves Message History: The batch_task call appears in the conversation like any other tool call
  • Maintains Tool Visibility: The model sees that batch_task was called and can reason about results
  • Non-Blocking: Execution happens in the normal tool node, not nested inside execute
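
As a rough illustration of the interception described above, the sketch below uses simplified stand-in types rather than the real LangChain message and middleware types. The names `PendingTask`, `SyntheticAIMessage`, `InterceptState`, and `interceptModelCall`, as well as the tool-call id scheme, are hypothetical and not taken from this PR. Only the `pendingBatchTask` channel and the `batch_task` tool name come from the description.

```typescript
// Simplified stand-ins for the real message and state types (assumptions).
interface PendingTask {
  description: string;
  type?: string;
}

interface SyntheticToolCall {
  id: string;
  name: string;
  args: { tasks: PendingTask[] };
}

interface SyntheticAIMessage {
  role: "ai";
  content: string;
  toolCalls: SyntheticToolCall[];
}

interface InterceptState {
  pendingBatchTask?: { tasks: PendingTask[] } | null;
}

// If tasks are pending, skip the real model call and return a synthetic
// message that calls batch_task; otherwise defer to the model.
async function interceptModelCall(
  state: InterceptState,
  callModel: () => Promise<SyntheticAIMessage>,
): Promise<SyntheticAIMessage> {
  const pending = state.pendingBatchTask;
  if (!pending || pending.tasks.length === 0) {
    return callModel();
  }
  return {
    role: "ai",
    content: "",
    toolCalls: [
      {
        id: `batch_task_${Date.now()}`, // hypothetical id scheme
        name: "batch_task",
        args: { tasks: pending.tasks },
      },
    ],
  };
}
```

Because the synthetic message has the same shape as a real model response, the normal tool node can pick up the batch_task call without special handling.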

3. Why File-Based Results?

With 1000 tasks, individual results would explode the context window. Instead:

  • Summary to Model: "Executed 1000 tasks. 998 succeeded, 2 failed."
  • Details to Files: Full results in /batch_results/<timestamp>/results.jsonl
  • Failures Highlighted: Separate failures.jsonl for easy inspection
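
The split between a short summary and detailed files could look roughly like the sketch below. The file names `summary.json`, `results.jsonl`, and `failures.jsonl` come from this PR. The `TaskOutcome` record shape and the helper names are assumptions for illustration, and the actual writeBatchResults() may structure its output differently.

```typescript
// Hypothetical per-task record; the real BatchTaskResult type may differ.
interface TaskOutcome {
  description: string;
  success: boolean;
  output?: string;
  error?: string;
}

// Build the short message that is returned to the model.
function summarize(outcomes: readonly TaskOutcome[]): string {
  const failed = outcomes.filter((o) => !o.success).length;
  const succeeded = outcomes.length - failed;
  return `Executed ${outcomes.length} tasks. ${succeeded} succeeded, ${failed} failed.`;
}

// Serialize outcomes as JSON Lines: one JSON object per line.
function toJsonl(outcomes: readonly TaskOutcome[]): string {
  return outcomes.map((o) => JSON.stringify(o)).join("\n");
}

// Map each result file name to its contents, ready to be written by a backend.
function buildResultFiles(outcomes: readonly TaskOutcome[]): Record<string, string> {
  const failures = outcomes.filter((o) => !o.success);
  const summary = {
    total: outcomes.length,
    succeeded: outcomes.length - failures.length,
    failed: failures.length,
  };
  return {
    "summary.json": JSON.stringify(summary, null, 2),
    "results.jsonl": toJsonl(outcomes),
    "failures.jsonl": toJsonl(failures),
  };
}
```

Only the string from `summarize` needs to reach the model. The file contents stay on disk, so the context cost stays roughly constant regardless of batch size.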

4. State Schema for Cross-Middleware Communication

The pendingBatchTask state channel is defined in BOTH FilesystemMiddleware (which sets it) and SubAgentMiddleware (which reads it). This enables:

  • execute tool (in FilesystemMiddleware) to declare pending tasks
  • wrapModelCall (in SubAgentMiddleware) to intercept and trigger batch execution
// Both middleware declare the same state channel
const PendingBatchTaskSchema = z
  .object({
    tasks: z.array(
      z.object({
        description: z.string(),
        type: z.string().optional(),
      }),
    ),
  })
  .nullable()
  .optional();

Changes

libs/deepagents/src/middleware/subagents.ts (+479 lines)

New exports for marker parsing:

  • SUBAGENT_MARKER_PREFIX - Constant for marker detection
  • SubagentTask, ParseSubagentMarkersResult - Types for parsed tasks
  • parseSubagentMarkers(), hasSubagentMarkers() - Parser functions
  • PendingBatchTask, BatchTaskResult, BatchTaskSummary - Batch execution types

New batch execution infrastructure:

  • SubAgentStateSchema - Declares pendingBatchTask state channel
  • executeSingleTask() - Runs one subagent and captures result
  • executeBatchTasks() - Worker pool with configurable concurrency
  • writeBatchResults() - Persists results to filesystem
  • createBatchTaskTool() - The batch_task tool definition
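
For readers unfamiliar with the pattern, a worker pool with bounded concurrency can be sketched as below. This is a generic illustration, not the code in executeBatchTasks(). The names `runWithConcurrency` and `Settled` are hypothetical, and capturing errors per item is an assumed design choice.

```typescript
// Result slot for one item: either a value or a captured error message.
type Settled<R> =
  | { index: number; ok: true; value: R }
  | { index: number; ok: false; error: string };

// Run `worker` over `items` with at most `concurrency` calls in flight.
// Failures are captured per item so one failed task does not abort the batch.
async function runWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<Settled<R>[]> {
  const results: Settled<R>[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function drain(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claimed synchronously, before any await
      try {
        const value = await worker(items[index] as T, index);
        results[index] = { index, ok: true, value };
      } catch (err) {
        results[index] = {
          index,
          ok: false,
          error: err instanceof Error ? err.message : String(err),
        };
      }
    }
  }

  // Start a fixed number of workers that pull items until none remain.
  const workerCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => drain()));
  return results;
}
```

With the default from this PR, a call would look like `runWithConcurrency(tasks, 10, runOneTask)`, where `runOneTask` stands in for whatever function executes a single subagent. Results keep the input order even though tasks finish out of order.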

Modified createSubAgentMiddleware:

  • Accepts new backend option for writing results
  • Accepts new batchConcurrency option (default: 10)
  • Adds wrapModelCall hook for synthetic batch_task injection
  • Conditionally adds batch_task tool when backend is provided

Refactored createTaskTool:

  • Now accepts pre-computed subagent graphs instead of an options object
  • Cleaner separation of concerns

libs/deepagents/src/middleware/fs.ts (+196 lines)

New constants:

  • SPAWN_SUBAGENT_FUNCTION - Bash function injected into every command
  • EXECUTE_TOOL_DESCRIPTION_BASE - Base description without batch docs
  • EXECUTE_BATCH_SPAWNING_DOCS - Documentation for spawn_subagent usage
  • EXECUTE_TOOL_DESCRIPTION - Full description with batch support

New state schema:

  • PendingBatchTaskSchema - Shared schema for pendingBatchTask channel
  • Added pendingBatchTask to FilesystemStateSchema

Modified createExecuteTool:

  • Automatically prepends SPAWN_SUBAGENT_FUNCTION to every command
  • Detects SUBAGENT_TASK: markers in output
  • Parses markers and extracts clean output
  • Returns Command that sets pendingBatchTask in state
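
The marker handling in the execute tool might look something like the following sketch. Only the `SUBAGENT_TASK:` prefix and the "general-purpose" default type come from this PR. The JSON payload after the prefix is an assumption about the marker format, and `parseMarkersSketch`, `ParsedTask`, and `MarkerParseResult` are hypothetical names, not the exported parseSubagentMarkers() API.

```typescript
// Assumption: each marker line is the prefix followed by a JSON object.
const MARKER_PREFIX = "SUBAGENT_TASK:";

interface ParsedTask {
  description: string;
  type: string;
}

interface MarkerParseResult {
  tasks: ParsedTask[];
  cleanOutput: string; // stdout with marker lines removed
  malformed: number; // marker lines that could not be parsed
}

function parseMarkersSketch(stdout: string): MarkerParseResult {
  const tasks: ParsedTask[] = [];
  const kept: string[] = [];
  let malformed = 0;

  for (const line of stdout.split("\n")) {
    if (!line.startsWith(MARKER_PREFIX)) {
      kept.push(line); // ordinary output is passed through untouched
      continue;
    }
    try {
      const payload: unknown = JSON.parse(line.slice(MARKER_PREFIX.length));
      if (
        typeof payload === "object" &&
        payload !== null &&
        typeof (payload as { description?: unknown }).description === "string"
      ) {
        const { description, type } = payload as { description: string; type?: unknown };
        tasks.push({
          description,
          type: typeof type === "string" ? type : "general-purpose",
        });
      } else {
        malformed += 1; // valid JSON, but no usable description
      }
    } catch {
      malformed += 1; // payload is not valid JSON
    }
  }
  return { tasks, cleanOutput: kept.join("\n"), malformed };
}
```

Stripping the marker lines keeps the ToolMessage readable, and counting malformed markers instead of throwing lets the rest of the batch proceed.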

New helper:

  • formatExecuteOutput() - Formats command output with exit code/truncation info

libs/deepagents/src/agent.ts (+3 lines)

  • Passes backend: filesystemBackend to createSubAgentMiddleware
  • Enables batch_task tool in the default agent configuration

libs/deepagents/src/middleware/index.ts (+9 lines)

  • Exports new types and constants for external use

libs/deepagents/src/middleware/subagents.int.test.ts (+211 lines)

New test describe block: "Batch Spawning Integration Tests"

  • should parse spawn_subagent markers from execute output
  • should handle malformed markers gracefully
  • should invoke batch_task tool when spawn_subagent markers are detected with createDeepAgent

libs/deepagents/src/middleware/__fixtures__/subagent_tasks.csv (new)

Test fixture with 1000 sample tasks for batch testing.

Usage Examples

Basic CSV Processing

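import { createDeepAgent } from "deepagents";
import { HumanMessage } from "@langchain/core/messages";

// myBackend is a placeholder for an already-configured backend instance
const agent = createDeepAgent({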
  model: "claude-sonnet-4-5-20250929",
  backend: myBackend,
});

await agent.invoke({
  messages: [
    new HumanMessage(
      "Process the tasks in /tasks.csv - for each row, analyze the requirement",
    ),
  ],
});

The model will generate something like:

cat /tasks.csv | while IFS=, read -r id category desc; do
  spawn_subagent "Analyze requirement: $desc"
done

Custom Concurrency

createSubAgentMiddleware({
  defaultModel: "claude-sonnet-4-5-20250929",
  backend: myBackend,
  batchConcurrency: 50, // Run up to 50 subagents in parallel
});

Specifying Subagent Types

# Use specific subagent types
spawn_subagent "Review code" "code-review"
spawn_subagent "Write docs" "documentation"
spawn_subagent "Generic task"  # defaults to "general-purpose"

Testing

# Run unit tests
pnpm test src/middleware/subagents.test.ts

# Run integration tests (requires API key)
pnpm test:int src/middleware/subagents.int.test.ts -t "Batch Spawning"
