feat: multi-agent support with registry, adapters, and CLI/MCP integration (v0.5.0)#72

Merged
dean0x merged 33 commits into main from feat/67-multi-agent-support
Mar 7, 2026

dean0x (Owner) commented Mar 5, 2026

Summary

Implements pluggable multi-agent support for Backbeat (closes #67). Tasks can now be executed by different AI coding agents instead of only Claude Code.

  • Agent Registry: InMemoryAgentRegistry with AgentAdapter interface for pluggable agent backends
  • 3 Built-in Adapters: Claude Code, Codex CLI, Gemini CLI — each with correct CLI args, auto-accept flags, and env var handling
  • BaseAgentAdapter: Abstract base class eliminates duplication across adapters (Template Method pattern)
  • MCP Tools: DelegateTask accepts agent field, new ListAgents tool, agent support in ScheduleTask and CreatePipeline
  • CLI: --agent/-a flag on beat run, beat schedule create, beat pipeline create; new beat agents list command
  • Database: Migration v7 adds agent TEXT DEFAULT 'claude' column — backward compatible
  • Task Lifecycle: Agent preserved through retry, resume, and schedule execution
  • Error Handling: AGENT_NOT_FOUND and AGENT_MISCONFIGURED error codes with descriptive messages

Architecture

AgentRegistry (interface)
  └─ InMemoryAgentRegistry
       ├─ ClaudeAdapter  (claude --print --dangerously-skip-permissions --output-format json)
       ├─ CodexAdapter   (codex --quiet --full-auto)
       └─ GeminiAdapter  (gemini -sandbox false)

WorkerPool.spawn(task)
  → registry.get(task.agent ?? 'claude')
  → adapter.spawn(prompt, workingDir, taskId)
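The lookup path in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `buildCommand` method, the `Result` shape, and the constructor signature are assumptions made for the example; only the names `AgentProvider`, `AgentAdapter`, `InMemoryAgentRegistry`, and `AGENT_NOT_FOUND` come from the PR text.

```typescript
// Hypothetical shapes: the PR names these types, but the signatures are assumed.
type AgentProvider = 'claude' | 'codex' | 'gemini';

interface AgentAdapter {
  readonly provider: AgentProvider;
  // Stand-in for a real spawn: returns the argv the adapter would launch.
  buildCommand(prompt: string): string[];
}

type Result<T> =
  | { ok: true; value: T }
  | { ok: false; code: 'AGENT_NOT_FOUND'; message: string };

class InMemoryAgentRegistry {
  // Map-based lookup keyed by provider, as the commit message describes.
  private readonly adapters = new Map<AgentProvider, AgentAdapter>();

  constructor(adapters: AgentAdapter[]) {
    for (const adapter of adapters) {
      this.adapters.set(adapter.provider, adapter);
    }
  }

  get(provider: AgentProvider): Result<AgentAdapter> {
    const adapter = this.adapters.get(provider);
    if (!adapter) {
      return {
        ok: false,
        code: 'AGENT_NOT_FOUND',
        message: `No adapter registered for '${provider}'`,
      };
    }
    return { ok: true, value: adapter };
  }

  list(): AgentProvider[] {
    return Array.from(this.adapters.keys());
  }
}

// Only two adapters are registered here so the not-found path is reachable.
const registry = new InMemoryAgentRegistry([
  { provider: 'claude', buildCommand: (p) => ['claude', '--print', p] },
  { provider: 'codex', buildCommand: (p) => ['codex', '--quiet', '--full-auto', p] },
]);
```

Returning a result value instead of throwing keeps the not-found case explicit at the call site, which fits the error-code approach the PR describes.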

Key Design Decisions

  • AgentProvider is a string union type validated at boundaries (Zod + type guard)
  • Agent field is optional everywhere — defaults to 'claude' for full backward compatibility
  • Each adapter encapsulates its own command, args, env filtering, and prompt transformation
  • Only Claude strips nesting env vars (CLAUDECODE, CLAUDE_CODE_*) — other agents preserve their API keys
  • ProcessSpawnerAdapter wraps legacy ProcessSpawner interface for test backward compat
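A rough sketch of the per-adapter env filtering described above, assuming each adapter declares which variables to strip. The `EnvStripRules` shape and the `cleanEnv` name are hypothetical; the exact-match `CLAUDECODE` and prefix `CLAUDE_CODE_` rules are taken from a later commit message in this PR. The sample variable names in the usage note are illustrative only.

```typescript
type Env = Record<string, string | undefined>;

// Assumed configuration shape: names matched exactly, plus prefixes to strip.
interface EnvStripRules {
  exact: string[];
  prefixes: string[];
}

// Claude strips its nesting indicators; other agents strip nothing.
const CLAUDE_RULES: EnvStripRules = { exact: ['CLAUDECODE'], prefixes: ['CLAUDE_CODE_'] };
const NO_RULES: EnvStripRules = { exact: [], prefixes: [] };

function cleanEnv(env: Env, rules: EnvStripRules): Env {
  const cleaned: Env = {};
  for (const key of Object.keys(env)) {
    const stripped =
      rules.exact.includes(key) || rules.prefixes.some((prefix) => key.startsWith(prefix));
    if (!stripped) {
      cleaned[key] = env[key];
    }
  }
  return cleaned;
}
```

With these rules, `CLAUDECODE` and `CLAUDE_CODE_ENTRYPOINT` would be removed for Claude, while a variable such as `CLAUDE_CONFIG_DIR` or `GEMINI_API_KEY` passes through untouched.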

Test plan

  • Build passes (npm run build)
  • Type check passes (npx tsc --noEmit)
  • All 7 test groups pass (1,157+ tests, 0 failures)
  • New test coverage: agent registry, all 3 adapters, domain agent field, MCP agent tools, CLI agent commands
  • Backward compatibility: all pre-existing tests pass without modification
  • Agent preserved through retry/resume/schedule lifecycle
  • Schedule repository correctly parses agent from task_template JSON

Dean Sharon and others added 19 commits March 4, 2026 17:26
Introduce AgentProvider type, AgentAdapter/AgentRegistry interfaces,
AGENT_NOT_FOUND/AGENT_MISCONFIGURED error codes, and agent field on
Task/TaskRequest domain models. Default agent is 'claude' for backward
compatibility.

Co-Authored-By: Claude <noreply@anthropic.com>

Each adapter implements AgentAdapter with provider-specific CLI args,
auto-accept flags, and environment variable stripping to prevent
credential leakage between agent processes.

InMemoryAgentRegistry provides Map-based adapter lookup by provider.
ProcessSpawnerAdapter wraps legacy ProcessSpawner as AgentAdapter
for backward compatibility with existing test infrastructure.

Migration v7 adds 'agent TEXT' column to tasks table, defaulting
existing tasks to 'claude'. Task repository updated with agent field
in all SQL statements, Zod schema, and domain mapping.

Worker pool constructor now accepts AgentRegistry instead of
ProcessSpawner. spawn() resolves the correct adapter via
task.agent field (defaults to 'claude'). Bootstrap wiring
updated to construct InMemoryAgentRegistry.

Ensures task.agent is carried forward when retrying or resuming
tasks, maintaining agent affinity across task lifecycle operations.

Add createAgentRegistryFromSpawner helper for backward-compatible
test setup. Update worker pool unit tests, handler-setup tests,
and integration tests to construct EventDrivenWorkerPool with
AgentRegistry instead of ProcessSpawner.

…e and ListAgents tool

Add multi-agent support to the MCP surface:
- DelegateTask: optional agent field (z.enum) selects which agent runs the task
- ScheduleTask: optional agent field propagated to schedule template
- CreatePipeline: per-step agent override + default agent field
- ListAgents: new MCP tool returns all providers with registration status
- TaskStatus: includes agent in single-task response (defaults to 'claude')
- Domain: add agent field to ScheduleCreateRequest and PipelineStepRequest
…splay

CLI surface for multi-agent support:
- beat run: --agent/-a flag to select agent provider (claude, codex, gemini, aider)
- beat agents list: new command showing all available agents with descriptions
- beat status: shows agent field in task detail output (defaults to 'claude')
- help: updated with agent flag docs, agent commands section, and usage example
…rough schedules

- Bootstrap: register Claude, Codex, Gemini, Aider adapters in production
- Bootstrap: pass AgentRegistry to MCPAdapter constructor
- ScheduleManager: propagate agent field to taskTemplate in createSchedule
- ScheduleManager: propagate per-step agent in createPipeline
New test files:
- agents.test.ts: AGENT_PROVIDERS constant, DEFAULT_AGENT, isAgentProvider guard (10 tests)
- agent-registry.test.ts: InMemoryAgentRegistry get/has/list/dispose (11 tests)
- agent-adapters.test.ts: Claude/Codex/Gemini/Aider spawn args and env stripping (16 tests)

Updated test files:
- mcp-adapter.test.ts: DelegateTask agent field, ListAgents tool (5 tests)
- cli.test.ts: agent flag parsing, agents list command, status display (7 tests)
- domain.test.ts: createTask with agent field (4 tests)

Total new tests: 53 | All suites passing: 1177 tests
- Stop stripping auth/config env vars for non-Claude adapters (P0 fix):
  Gemini adapter was stripping GEMINI_API_KEY, breaking authentication.
  Codex and Aider adapters were unnecessarily stripping config vars.
  Only Claude has documented nesting indicators that need stripping.
- Remove unused ok import from process-spawner-adapter (P2 cleanup)
…tory schema

Three alignment fixes for v0.5.0 multi-agent support:

1. TaskRequestSchema in schedule-repository.ts was missing the `agent`
   field. Zod strips unknown fields by default, so scheduled tasks
   always spawned with default agent regardless of what was specified.

2. CLI `beat schedule create` was missing --agent/-a flag parsing.
   The MCP ScheduleTask tool had agent support but the CLI did not.

3. CLI `beat pipeline` was missing --agent/-a flag parsing.
   The MCP CreatePipeline tool had agent support but the CLI did not.

Co-Authored-By: Claude <noreply@anthropic.com>

Addresses Greptile review feedback: validates agent values from the
database against the known provider enum instead of accepting any string.

…gemini)

Removes aider from AgentProvider type, AGENT_PROVIDERS constant,
all Zod schemas, CLI help/error messages, bootstrap registration,
and test fixtures. Deletes src/implementations/aider-adapter.ts.

Pre-spawn auth check (resolveAuth) validates credentials before spawning
agent processes. Resolution order: env var → config file → CLI in PATH →
AGENT_MISCONFIGURED error with actionable hints.

- Add AGENT_AUTH metadata, checkAgentAuth(), maskApiKey() to agents.ts
- Add loadAgentConfig/saveAgentConfig/resetAgentConfig to configuration.ts
- Add resolveAuth() to BaseAgentAdapter with env injection from config
- Add CLI: beat agents check, beat agents config set/show/reset
- Add MCP: ConfigureAgent tool (check/set/reset), enhance ListAgents
- Add tests: 833 passing across core/implementations/CLI/adapters
- Derive Zod agent enums from AGENT_PROVIDERS_TUPLE (single source of truth)
- Fix envPrefixesToStrip regression: exact match CLAUDECODE, prefix CLAUDE_CODE_
- Remove misleading transformPrompt bash wrapping from ClaudeAdapter
- Change cli-login to cli-installed with "auth not verified" display
- Fix killGracePeriodMs non-null assertion with ?? 5000 fallback
- Fix || to ?? coercion for agent field in task-repository
- Clean unused imports in mock-agent.ts
- Add pipeline-level agent default to PipelineCreateRequest
- Make ProcessSpawnerAdapter provider configurable via constructor
- Replace hardcoded agent strings in CLI error messages
- Use AGENT_PROVIDERS.join() instead of hardcoded agent lists in cli.ts
- Refactor agents list/config show to use ui.note() (matches config show, schedule get)
- Refactor agents check to use ui.step() per row (matches status/schedule list)
- Add Agent field to schedule detail view and creation success info
- Show agent in pipeline visualization title
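
The pre-spawn auth resolution order described in the commits above (env var → config file → CLI in PATH → AGENT_MISCONFIGURED) could look something like the sketch below. It is an illustration under assumptions: the `AuthInputs` shape, the `resolveAuthSketch` name, and the hint wording are invented here, and the inputs are injected so the function stays pure, whereas the real code presumably reads the process environment, the config file, and PATH itself.

```typescript
type AuthMethod = 'env-var' | 'config-file' | 'cli-installed';

type AuthResult =
  | { ok: true; method: AuthMethod; apiKey?: string }
  | { ok: false; code: 'AGENT_MISCONFIGURED'; hint: string };

// Hypothetical input bundle; every field is an assumption for illustration.
interface AuthInputs {
  envVarName: string;
  env: Record<string, string | undefined>;
  configApiKey?: string;
  cliInPath: boolean;
}

function resolveAuthSketch(inputs: AuthInputs): AuthResult {
  // 1. An API key in the environment wins.
  const fromEnv = inputs.env[inputs.envVarName];
  if (fromEnv) {
    return { ok: true, method: 'env-var', apiKey: fromEnv };
  }
  // 2. Otherwise fall back to a key stored in the config file.
  if (inputs.configApiKey) {
    return { ok: true, method: 'config-file', apiKey: inputs.configApiKey };
  }
  // 3. Otherwise assume login-based auth if the CLI binary is installed.
  if (inputs.cliInPath) {
    return { ok: true, method: 'cli-installed' };
  }
  // 4. Nothing usable: fail with an actionable hint.
  return {
    ok: false,
    code: 'AGENT_MISCONFIGURED',
    hint: `Set ${inputs.envVarName} or store a key with: beat agents config set`,
  };
}
```

Note that step 3 only establishes that the CLI exists, not that it is logged in, which matches the "auth not verified" caveat raised later in this thread.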

greptile-apps bot commented Mar 5, 2026

Confidence Score: 4/5

  • Safe to merge with minor documentation/usability improvements recommended; both findings are non-blocking design notes rather than critical bugs.
  • The PR architecture is solid and well-tested (1,157+ passing tests). The two findings are both low-severity: (1) a missing security warning in the MCP tool description (style/documentation), and (2) a design inconsistency where a domain field is never populated by callers but the fallback still functions correctly. Neither finding blocks functionality or introduces runtime risk. Database migration and agent resolution logic are correct. Tests cover the new agent infrastructure thoroughly.
  • src/adapters/mcp-adapter.ts (add security warning to ConfigureAgent.apiKey description), src/services/schedule-manager.ts (clarify or resolve PipelineCreateRequest.agent design pattern)

Comments Outside Diff (2)

  1. src/adapters/mcp-adapter.ts, line 160

    API key exposed in MCP protocol messages

    The ConfigureAgent tool's apiKey field description (here and in the OpenAPI spec around line 663) lacks a security warning. Unlike the CLI implementation (which warns users that keys passed as command-line arguments are stored in shell history), there is no equivalent notice for the MCP case where the API key will be:

    1. Visible as plaintext in MCP protocol-level logs
    2. Exposed to the calling LLM (visible in the tool's request arguments)
    3. Observable in any middleware or proxy monitoring the MCP message stream

    While the response is correctly masked (line 1531), the concern is the request body. Consider updating the description to surface this risk:

    "API key to store. Warning: the key will be visible in plain text in MCP protocol logs and to the calling model."
    
  2. src/services/schedule-manager.ts, line 331

    PipelineCreateRequest.agent field is never populated by callers

    PipelineCreateRequest declares an agent field (in domain.ts line 383) intended as a pipeline-level default for steps that don't specify their own agent. However, neither the CLI nor the MCP adapter populates this field:

    • CLI (pipeline.ts line 49): steps.map((prompt) => ({ prompt, agent })) — the --agent flag is pre-baked into each step
    • MCP (mcp-adapter.ts lines 1375–1380): steps.map((s) => ({ ..., agent: (s.agent ?? data.agent) })) — pipeline-level data.agent is flattened into each step

    As a result, request.agent is always undefined when createPipeline is called in this file (line 331), making the step.agent ?? request.agent fallback unreachable in practice. The two-level hierarchy exists in the domain model but is collapsed at the caller boundaries.

    This isn't a runtime bug (the code works today), but it creates a maintenance hazard: a future caller relying only on request.agent would unknowingly bypass the per-step agents without realizing the fallback isn't guaranteed.

    Consider either:

    1. Propagating the pipeline-level agent to request.agent in callers (instead of pre-baking it into steps), or
    2. Documenting that callers must always set the agent per-step and should not rely on the request-level fallback
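
Option 1 above can be illustrated with a small sketch in which the caller leaves steps untouched and the service applies the `step.agent ?? request.agent` fallback. The interface fields and the `resolveStepAgents` helper are simplified assumptions, not the project's actual definitions.

```typescript
type AgentProvider = 'claude' | 'codex' | 'gemini';

// Simplified, assumed shapes of the domain types named in the review.
interface PipelineStepRequest {
  prompt: string;
  agent?: AgentProvider;
}

interface PipelineCreateRequest {
  steps: PipelineStepRequest[];
  agent?: AgentProvider; // pipeline-level default
}

// Hypothetical helper: per-step agent wins, the pipeline-level agent is the fallback.
function resolveStepAgents(request: PipelineCreateRequest): Array<AgentProvider | undefined> {
  return request.steps.map((step) => step.agent ?? request.agent);
}

// The caller passes the pipeline-level agent through instead of pre-baking it.
const resolved = resolveStepAgents({
  agent: 'gemini',
  steps: [{ prompt: 'lint' }, { prompt: 'test', agent: 'codex' }],
});
```

Here the first step would inherit `gemini` and the second would keep `codex`, so the request-level fallback is actually exercised rather than unreachable.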

Last reviewed commit: 75ff638

exit_code: z.number().nullable(),
dependencies: z.string().nullable(),
continue_from: z.string().nullable(),
agent: z.enum(AGENT_PROVIDERS_TUPLE).nullable(),

z.enum(AGENT_PROVIDERS_TUPLE).nullable() will throw a Zod validation error for any row with an unrecognized agent value (e.g., a task created by a future version of Backbeat with a new provider, or a manually inserted row). This makes those tasks permanently unreadable from the repository.

A more forward-compatible approach gracefully falls back to null:

Suggested change
- agent: z.enum(AGENT_PROVIDERS_TUPLE).nullable(),
+ agent: z.enum(AGENT_PROVIDERS_TUPLE).nullable().catch(null),

This preserves strict typing for known providers while avoiding catastrophic parse failures on unknown values.

Dean Sharon added 2 commits March 5, 2026 22:31
- Remove unused AgentRegistry import in mock-agent fixture
- Remove unnecessary null coalescing on killGracePeriodMs (Zod default guarantees value)
- Use this.command instead of auth.command in resolveAuth() for consistency with spawn()
- Type saveAgentConfig key parameter as 'apiKey' literal, move runtime validation to CLI boundary
- Set file permissions on config writes (0o700 dirs, 0o600 files) to protect API keys
- Add agents check/config subcommands to help text
dean0x (Owner, Author) commented Mar 5, 2026

Greptile Review Response

Addressed in bff47cd + bff518c

  1. Biome formatting — auto-fixed array .join(), filter predicate collapse, trailing blank line
  2. Unused imports — removed AgentRegistry import from mock fixture
  3. Unnecessary null coalescing: killGracePeriodMs ?? 5000 → killGracePeriodMs (Zod default guarantees value)
  4. resolveAuth() binary mismatch — now uses this.command consistently with spawn()
  5. Loose key parameter — typed as 'apiKey' literal; runtime validation moved to CLI boundary
  6. API key file permissions: 0o700 for dirs, 0o600 for files on all config writes
  7. Help text — added agents check, agents config set/show/reset subcommands

Intentionally unchanged

z.enum() for agent column in TaskRowSchema — Strict enum validation is deliberate (commit 663a4f8). It catches data corruption early. A future agent addition would update the enum value alongside the new adapter. Weakening to .string().nullable() would bypass boundary validation.

Deferred to v0.6.0

CLI-in-PATH false positive for Gemini/Codex auth: checkAgentAuth() returns ready: true when CLI binary is found, but CLI installed ≠ authenticated for Gemini/Codex. Display already shows "CLI installed (auth not verified)". Proper per-agent auth verification (checking credential files) belongs in v0.6.0 Agent Failover scope.

retryOf: z.string().optional(),
dependsOn: z.array(z.string()).optional(),
continueFrom: z.string().optional(),
agent: z.enum(AGENT_PROVIDERS_TUPLE).optional(),

Zod enum will fail for unknown future agent values in stored JSON

z.enum(AGENT_PROVIDERS_TUPLE).optional() will throw a Zod validation error if the task_template JSON stored in the database contains an agent value that isn't in the current AGENT_PROVIDERS_TUPLE (e.g., a schedule created by a future version of Backbeat with a new provider). This will make the schedule permanently unreadable from the repository.

A safer approach is to fall back gracefully to undefined for unrecognized values:

Suggested change
- agent: z.enum(AGENT_PROVIDERS_TUPLE).optional(),
+ agent: z.enum(AGENT_PROVIDERS_TUPLE).optional().catch(undefined),

Using .catch(undefined) preserves strict typing for known providers while silently falling back on unrecognized values, preventing catastrophic parse failures for schedules from newer Backbeat versions.

Comment on lines +161 to +163
export function isCommandInPath(command: string): boolean {
  const result = spawnSync('which', [command], { stdio: 'ignore' });
  return result.status === 0;

which command is Unix-only — will silently break on Windows

spawnSync('which', [command]) is only available on Unix/macOS. On Windows, which does not exist — it's where. This means isCommandInPath will always return false on Windows, causing all agent adapters to fail with AGENT_MISCONFIGURED during spawn, even when the CLI binary is properly installed.

The codebase already handles Windows-specific logic elsewhere (database.ts checks process.platform === 'win32'). Consider using the same pattern here:

export function isCommandInPath(command: string): boolean {
  const isWindows = process.platform === 'win32';
  const result = spawnSync(isWindows ? 'where' : 'which', [command], { stdio: 'ignore' });
  return result.status === 0;
}

Dean Sharon added 2 commits March 6, 2026 01:17
Previously `beat agents check` showed "CLI installed (auth not verified)"
with [ready] badge and no guidance. Now shows [check auth] badge with
actionable hints (login command, env var, beat agents config set).
- Remove unused AgentProvider import and redundant type casts in agents.ts
  (TypeScript narrows type after isAgentProvider() guard)
- Add explicit void return type to showHelp() in help.ts
- Extract writeConfigFile() helper and ConfigWriteResult type alias in
  configuration.ts to eliminate 4 duplicated try/catch blocks
- Restore AgentRegistry type import in mock-agent.ts (used as return type)
Comment on lines +38 to +74
const header = ` ${'Agent'.padEnd(10)} ${'CLI'.padEnd(8)} ${'Auth'.padEnd(40)} Status`;
ui.info(header);

for (const provider of AGENT_PROVIDERS) {
  const agentConfig = loadAgentConfig(provider);
  const status = checkAgentAuth(provider, agentConfig.apiKey);

  const cliStatus = status.cliFound ? 'found' : '-';
  let authDesc: string;

  switch (status.method) {
    case 'env-var': {
      const key = status.envVar ? process.env[status.envVar] : undefined;
      authDesc = `${status.envVar} set${key ? ` (${maskApiKey(key)})` : ''}`;
      break;
    }
    case 'config-file':
      authDesc = 'API key stored in config';
      break;
    case 'cli-installed':
      authDesc = 'CLI installed (auth not verified)';
      break;
    default:
      authDesc = 'not configured';
  }

  let badge: string;
  if (status.method === 'cli-installed') {
    badge = ui.yellow('[check auth]');
  } else if (status.ready) {
    badge = ui.cyan('[ready]');
  } else {
    badge = '[action needed]';
  }
  ui.step(`${provider.padEnd(10)} ${cliStatus.padEnd(8)} ${authDesc.padEnd(40)} ${badge}`);

  if (status.hint && (status.method === 'cli-installed' || !status.ready)) {

checkAgents mixes ui.step for header and data rows

ui.step is used both for the section header ('Agent Auth Status') and for each individual row of the table. Depending on how ui.step renders (e.g. with a spinner or special prefix), the table rows will be visually indistinguishable from a step status, which may look odd. Consider using ui.info for data rows and reserving ui.step for progress indicators, consistent with how other commands structure output.

- Defend against prompt flag injection: add '--' separator before
  positional prompt arg in Claude and Codex adapters (Gemini is safe
  via --prompt <value> binding)
- Enforce file permissions on existing config files: chmodSync after
  writeFileSync since mode option only applies on file creation
- Remove redundant isCommandInPath call in resolveAuth: spawn() already
  verified CLI exists before calling resolveAuth
- Disable Gemini --yolo Docker sandbox by default: inject
  GEMINI_SANDBOX=false via new additionalEnv hook to avoid requiring
  Docker/Podman for task execution
dean0x (Owner, Author) commented Mar 6, 2026

Greptile Round 2 — Addressed & Non-Issues

Pushed 8e02cbd with fixes for the 4 valid comments. Here's the full breakdown:

Fixed

  1. Prompt flag injection (-- separator) — Claude and Codex adapters now use -- before the positional prompt arg. Prevents prompts like "--help" from being parsed as flags by the child CLI's argv parser. Gemini was already safe (uses --prompt <value>).

  2. writeFileSync mode only on creation — Added chmodSync(_configFilePath, 0o600) after write to enforce permissions on pre-existing config files that may have been created with looser permissions.

  3. isCommandInPath called twice per spawn — Removed the redundant check in resolveAuth(). spawn() already verifies CLI binary exists at line 94 before calling resolveAuth(), so step 3 now assumes login-based auth directly.

  4. Gemini --yolo Docker sandbox — Added additionalEnv hook to BaseAgentAdapter. GeminiAdapter overrides it to inject GEMINI_SANDBOX=false, so Docker/Podman isn't required. Users who want sandbox can set GEMINI_SANDBOX=true in their environment.
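
The `--` fix in item 1 relies on the common end-of-options convention. The sketch below shows the idea with a toy parser standing in for the child CLI; `buildArgs` and `toyParse` are hypothetical names, and real CLIs may parse arguments differently, so this is only an illustration of the convention rather than of either adapter's actual code.

```typescript
// Hypothetical helper: place '--' between adapter flags and the positional prompt.
function buildArgs(flags: string[], prompt: string): string[] {
  return [...flags, '--', prompt];
}

// Toy argv parser (an assumption, not any real CLI): everything after the
// first '--' is treated as positional, even if it starts with a dash.
function toyParse(argv: string[]): { flags: string[]; positionals: string[] } {
  const flags: string[] = [];
  const positionals: string[] = [];
  let optionsEnded = false;
  for (const arg of argv) {
    if (!optionsEnded && arg === '--') {
      optionsEnded = true;
      continue;
    }
    if (!optionsEnded && arg.startsWith('-')) {
      flags.push(arg);
    } else {
      positionals.push(arg);
    }
  }
  return { flags, positionals };
}

// Without the separator, a prompt of "--help" is read as a flag.
const unsafe = toyParse(['--quiet', '--full-auto', '--help']);
// With the separator, the same prompt survives as a positional argument.
const safe = toyParse(buildArgs(['--quiet', '--full-auto'], '--help'));
```

In this toy model, `unsafe.flags` contains `--help` while `safe.positionals` is `['--help']`, which is the behavior the fix is aiming for.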

Non-Issues (no action taken)

  1. which is Unix-only — Project targets macOS/Linux only. No Windows support planned. Adding where.exe fallback for an unsupported platform isn't warranted.

  2. ProcessSpawnerAdapter kill bypasses SIGTERM→SIGKILL — Intentional backward-compat shim (documented in file header). Delegates to injected ProcessSpawner.kill(). Will be removed when all tests migrate to mock AgentAdapters.

  3. ui.step mixing header/rows: ui.step is a generic structured output line. Using it for both header and data rows is intentional and consistent with CLI output patterns.

  4. agents namespace bypasses schema — Agent configs are intentionally decoupled from ConfigurationSchema. They live under agents.<provider> in config.json with their own load/save/validation. Merging into the Zod schema would couple agent extensibility to the core config system.

- Fix additionalEnv override order: spread adapter defaults before
  cleanEnv so user environment variables take precedence (e.g.,
  GEMINI_SANDBOX=true in user env overrides adapter's default false)
- Fix --agent flag without value: show "requires an agent name" error
  instead of misleading "Unknown flag: --agent" in pipeline and
  schedule commands
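
The override-order fix above comes down to object spread precedence: later spreads win. A minimal sketch, assuming the spawn environment is built by merging two plain objects (the `buildSpawnEnv` name and its parameters are hypothetical):

```typescript
type Env = Record<string, string | undefined>;

// Adapter defaults are spread first so the user's (already cleaned)
// environment overrides them on key collisions.
function buildSpawnEnv(adapterDefaults: Env, cleanedUserEnv: Env): Env {
  return { ...adapterDefaults, ...cleanedUserEnv };
}

const geminiDefaults: Env = { GEMINI_SANDBOX: 'false' };

// User has not set the variable: the adapter default applies.
const defaulted = buildSpawnEnv(geminiDefaults, { PATH: '/usr/bin' });
// User opted in to the sandbox: their value takes precedence.
const overridden = buildSpawnEnv(geminiDefaults, { PATH: '/usr/bin', GEMINI_SANDBOX: 'true' });
```

Reversing the spread order would silently force `GEMINI_SANDBOX=false` even for users who asked for the sandbox, which appears to be the bug the commit describes.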
Dean Sharon added 4 commits March 7, 2026 01:21
Remove hardcoded DEFAULT_AGENT = 'claude' constant and replace with
configurable defaultAgent setting. Agent resolution now follows:
explicit task agent > config default > actionable error.

- Add defaultAgent (optional) to ConfigurationSchema with env var
  BACKBEAT_DEFAULT_AGENT support
- Add resolveDefaultAgent() helper with actionable error messages
- Resolve agent at delegation time in TaskManager.delegate()
- Add defensive guard in worker pool for pre-migration tasks
- Update MCP adapter with dynamic agent descriptions
- Update CLI agents/status commands to use config-based default
- Create GitHub issue #74 for beat init interactive setup

Closes #67
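
The resolution chain in this commit (explicit task agent > config default > actionable error) might be sketched as below. The function name `resolveAgentSketch`, its parameters, the result shape, and the error wording are assumptions for illustration; `AGENT_PROVIDERS`, `isAgentProvider`, and `BACKBEAT_DEFAULT_AGENT` are names that appear in the PR.

```typescript
const AGENT_PROVIDERS = ['claude', 'codex', 'gemini'] as const;
type AgentProvider = (typeof AGENT_PROVIDERS)[number];

// Type guard used at boundaries to narrow arbitrary strings.
function isAgentProvider(value: unknown): value is AgentProvider {
  return typeof value === 'string' && (AGENT_PROVIDERS as readonly string[]).includes(value);
}

type AgentResolution =
  | { ok: true; agent: AgentProvider }
  | { ok: false; message: string };

// Hypothetical resolver: explicit task agent first, then the configured default.
function resolveAgentSketch(
  taskAgent: string | undefined,
  configDefault: string | undefined,
): AgentResolution {
  const candidate = taskAgent ?? configDefault;
  const known = AGENT_PROVIDERS.join(', ');
  if (candidate === undefined) {
    return {
      ok: false,
      message: `No agent specified and no default configured. Pass --agent or set BACKBEAT_DEFAULT_AGENT (one of: ${known}).`,
    };
  }
  if (!isAgentProvider(candidate)) {
    return { ok: false, message: `Unknown agent '${candidate}'. Expected one of: ${known}.` };
  }
  return { ok: true, agent: candidate };
}
```

Resolving once at delegation time, as the commit describes, means the stored task always carries a concrete agent and later stages do not need their own default.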
…gent

- Remove unused canCancel import from task-manager
- Remove redundant defaultAgent: undefined from DEFAULT_CONFIG
- Update stale doc comment in worker pool (no longer defaults to claude)
- Remove unused handleWorkerError private method
- Update outdated comments in domain.ts that claimed agent defaults to
  'claude' -- now correctly documents the resolution chain:
  explicit task agent > config defaultAgent > error
- Remove trailing blank line in event-driven-worker-pool.ts
…y in CLI

- Forward original AGENT_NOT_FOUND/AGENT_MISCONFIGURED errors from
  agent registry instead of wrapping in generic WORKER_SPAWN_FAILED
- Add shell history warning when API keys are passed as CLI arguments
Dean Sharon added 4 commits March 7, 2026 02:35
- Fix missed spawn error propagation (adapter.spawn errors were still
  wrapped in WORKER_SPAWN_FAILED; now propagated directly like registry errors)
- Add tests for both error propagation paths: registry lookup and adapter spawn
- Also propagate adapter.spawn() errors directly (matching the
  agentRegistry.get() fix from previous commit)
- Add 2 tests verifying error code preservation for both paths
- Fix biome formatting on API key warning line
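
The error-propagation change in the commits above can be illustrated with a small result-type sketch: failures from the registry lookup and from the adapter spawn are returned as-is instead of being rewrapped. The `Result` and `AppError` shapes and the `spawnForTask` function are hypothetical stand-ins, not the worker pool's real API.

```typescript
type ErrorCode = 'AGENT_NOT_FOUND' | 'AGENT_MISCONFIGURED' | 'WORKER_SPAWN_FAILED';

interface AppError {
  code: ErrorCode;
  message: string;
}

type Result<T> = { ok: true; value: T } | { ok: false; error: AppError };

// Hypothetical worker-pool step: look up the adapter's command, then spawn it.
// The spawn callback returns a fake process id in this sketch.
function spawnForTask(
  getAdapter: () => Result<string>,
  spawnAdapter: (command: string) => Result<number>,
): Result<number> {
  const adapter = getAdapter();
  if (!adapter.ok) {
    // Forward the registry error unchanged so its code survives.
    return adapter;
  }
  // Likewise, return the spawn result directly rather than wrapping
  // failures in WORKER_SPAWN_FAILED.
  return spawnAdapter(adapter.value);
}
```

Preserving the original code lets callers distinguish "no such agent" from "agent not configured" without parsing message strings.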
Replace slate-blue node-graph design with the standard dark
background + left hero + right terminal card layout used by
all other CLI landing pages.
ScheduleManagerService.createSchedule() now calls resolveDefaultAgent()
so scheduled tasks always have a valid agent stamped at creation time,
matching TaskManager.delegate() behavior. Previously, scheduled tasks
bypassed agent resolution, causing WORKER_SPAWN_FAILED at execution.
@dean0x dean0x merged commit 508eb40 into main Mar 7, 2026
2 checks passed
@dean0x dean0x deleted the feat/67-multi-agent-support branch March 7, 2026 14:27

Development

Successfully merging this pull request may close these issues.

v0.5.0: Multi-Agent Support