feat: add native tool calling and delegate system#88
Conversation
New files: - message-converter.ts: Convert animachat messages to membrane format with cache_control support for Native mode - membrane-inference.ts: Membrane-based inference service with native tool calling for 1-on-1 chats - animachat-delegate/: Delegate system for tool execution - membrane submodule: antra-tess/membrane middleware Modified files: - inference.ts: Add toolOptions parameter - enhanced-inference.ts: Add toolOptions parameter and pass through - .gitignore: Exclude animachat-delegate/dist/ Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add delegate handler and manager for remote tool execution - Add tool registry with per-participant tool configuration - Add DelegateStatusPanel and DelegateIndicator components - Add useDelegates composable for state management - Add /api/tools routes for tool and delegate management - Add delegate API keys for authentication - Add grouped tool selection in ParticipantsSection - Fix allowAllTools toggle (null vs empty array logic) - Remove membrane submodule, use npm from GitHub instead - Remove animachat-delegate (moved to separate repo) Co-Authored-By: Claude <noreply@anthropic.com>
Eliminate silent tool shadowing when multiple delegates expose same-named
tools by prefixing delegate tools as {delegateName}__{toolName}.
- Rewrite tool-registry with flat prefixed lookup + compat shim
- Add strict delegateId validation (regex, length, reserved names, blocks __)
- Add reconnect race guard to prevent stale tool unregistration
- Clean old tools on re-manifest before registering new set
- Single timeout source of truth in executeWithTimeout (no hardcoded 30s)
- Fix clearTimeout leak in Promise.race
- Remove preferredDelegateId/delegateId from ToolConfig and UI
- Add enabledTools migration with safe ambiguity handling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cherry-picked generic fixes that apply independently of the MCPL/delegate feature work, identified during review of KradThed/feature/mcpl-migration: - Fix findIndex || 0 logic error in branch index computation (3 locations) findIndex returns -1 (truthy), so || 0 never triggers - Add generatedBranchIds.length > 0 bounds check before array access in handleRegenerate and handleEdit (prevents undefined propagation) - Whitelist PATCH body fields in conversation update route (prevents arbitrary field injection, removes raw body logging) - Add runtime type validation for duplicate conversation options (replaces unsafe `as` casts with typeof checks) - Fix invite check endpoint info leak (stop exposing amount/currency to unauthenticated callers) - Add UUID format validation on admin conversation-size endpoint - Clear WebSocket onmessage handler on reconnect (prevents stale callbacks) Credit: Originally identified by Asura (KradThed) in commits 988e58e, ca388bc, and 17ec646 of the feature/mcpl-migration branch.
Greptile SummaryThis PR extracts the tool calling foundation from Key changes:
Issues found:
Confidence Score: 3/5The PR introduces significant new infrastructure with two blocking issues (toolUseId correctness and auth-path DoS) that should be addressed before wide production use. The bug fixes are clean and the overall architecture is well-structured. However, the empty
|
| Filename | Overview |
|---|---|
| deprecated-claude-app/backend/src/delegate/delegate-manager.ts | New delegate lifecycle manager: session registration, tool routing via WebSocket, and pending-call bookkeeping — has a latent bug where failPendingCalls resolves with toolUseId: ''. |
| deprecated-claude-app/backend/src/delegate/delegate-handler.ts | WebSocket upgrade handler for delegate connections; validates delegateId format, supports both JWT and API-key auth, and includes a reconnect race guard — well-structured, no critical issues. |
| deprecated-claude-app/backend/src/database/index.ts | Adds in-memory delegate API key storage with event-sourced replay; validateDelegateApiKey has an O(n) linear scan that triggers bcrypt per call — a DoS vector without rate-limiting. |
| deprecated-claude-app/backend/src/services/membrane-inference.ts | New 982-line inference service wrapping the @animalabs/membrane middleware; logic is sound but contains 30+ verbose debug console.log calls (with emoji) that will pollute production logs. |
| deprecated-claude-app/shared/src/types.ts | Adds ToolConfig, DelegateApiKey, ToolUseContentBlock, and ToolResultContentBlock schemas; ToolResultContentBlockSchema.content is typed as z.string() but the protocol and ToolResult interface both allow array content. |
| deprecated-claude-app/backend/src/tools/tool-registry.ts | Solid tool registry with {delegateName}__{toolName} namespacing, per-user scoping, policy enforcement, and a safe compat shim for unprefixed name resolution. |
| deprecated-claude-app/backend/src/delegate/trigger-handler.ts | Handles delegate-triggered inference (webhook → conversation message → streaming response); authorization is properly scoped to the authenticated user's conversations. |
| deprecated-claude-app/backend/src/routes/tools.ts | Clean REST routes for tool listing, delegate status, and API key lifecycle (create/list/revoke); all endpoints are properly auth-gated and inputs are validated with Zod. |
| deprecated-claude-app/backend/src/routes/conversations.ts | Bug-fix: conversation PATCH now whitelists allowed fields instead of passing raw req.body; duplicate options now use runtime type checks instead of unsafe as casts. |
| deprecated-claude-app/frontend/src/composables/useDelegates.ts | Singleton composable for real-time delegate status via WebSocket; contains an unused oldToolNames variable (dead code from a removed comparison). |
| deprecated-claude-app/backend/src/websocket/handler.ts | Switches to MembraneInferenceService and adds delegate WebSocket routing and buildToolOptions helper; the delegate path is reached before the main token auth check, which is intentional but unguarded against connection-rate abuse. |
| deprecated-claude-app/frontend/src/services/websocket.ts | Bug-fix: ws.onmessage = null is now cleared alongside onclose/onerror during reconnect, preventing stale message callbacks from firing on the old WebSocket instance. |
Sequence Diagram
sequenceDiagram
participant FE as Frontend
participant WS as WebSocket Handler
participant DH as DelegateHandler
participant DM as DelegateManager
participant TR as ToolRegistry
participant MI as MembraneInference
participant DE as Delegate App
DE->>WS: WS connect with delegateId + API key auth
WS->>DH: delegateWebsocketHandler()
DH->>DH: validateDelegateId()
DH->>DH: db.validateDelegateApiKey() [bcrypt]
DH->>DM: registerDelegate(ws, userId, delegateId)
DM->>FE: delegate_status_changed connected
DE->>DH: tool_manifest with tools list
DH->>TR: registerDelegateTools(userId, delegateId, tools)
DH->>DE: tool_manifest_ack with prefixed names
FE->>WS: generate message
WS->>MI: streamCompletion with toolOptions
MI->>MI: membrane.stream → tool_use block emitted
MI->>TR: executeTool name prefixed delegateName__toolName
TR->>DM: executeToolOnDelegate(delegateId, call)
DM->>DE: tool_call_request requestId + tool input
DE-->>DM: tool_call_response result
DM-->>TR: ToolResult
TR-->>MI: ToolResult
MI->>MI: continue tool loop → final text response
MI->>FE: stream chunks with tool_use and tool_result blocks
Comments Outside Diff (2)
-
deprecated-claude-app/backend/src/delegate/delegate-manager.ts, line 869-877 (link)Empty
toolUseIdwhen failing pending callsfailPendingCallsresolves each pending call withtoolUseId: ''. ThePendingToolCallinterface doesn't store the originalcall.id, so there is no way to fill in the correct ID here.In practice this is rescued by
ToolRegistry.executeWithTimeout, which wraps the result withpromise.then(r => ({ ...r, toolUseId }))before returning, overwriting the empty string with the correct ID. However, this is a latent correctness hazard: any call path that invokesexecuteToolOnDelegatewithout going throughexecuteWithTimeout(e.g. a future direct call) will silently emittoolUseId: '', causing the Anthropic API to reject the tool_result block.The fix is to store the tool-use ID in
PendingToolCalland use it infailPendingCalls:interface PendingToolCall { resolve: (result: ToolResult) => void; reject: (error: Error) => void; timeout: ReturnType<typeof setTimeout>; delegateId: string; toolName: string; toolUseId: string; // ← add this }Then in
executeToolOnDelegate:this.pendingCalls.set(requestId, { resolve, reject, timeout, delegateId, toolName: call.name, toolUseId: call.id, // ← store it });And in
failPendingCalls:pending.resolve({ toolUseId: pending.toolUseId, // ← use stored ID content: `Delegate "${delegateId}" disconnected during tool execution (tool: ${pending.toolName})`, isError: true, });Prompt To Fix With AI
This is a comment left during a code review. Path: deprecated-claude-app/backend/src/delegate/delegate-manager.ts Line: 869-877 Comment: **Empty `toolUseId` when failing pending calls** `failPendingCalls` resolves each pending call with `toolUseId: ''`. The `PendingToolCall` interface doesn't store the original `call.id`, so there is no way to fill in the correct ID here. In practice this is rescued by `ToolRegistry.executeWithTimeout`, which wraps the result with `promise.then(r => ({ ...r, toolUseId }))` before returning, overwriting the empty string with the correct ID. However, this is a latent correctness hazard: any call path that invokes `executeToolOnDelegate` without going through `executeWithTimeout` (e.g. a future direct call) will silently emit `toolUseId: ''`, causing the Anthropic API to reject the tool_result block. The fix is to store the tool-use ID in `PendingToolCall` and use it in `failPendingCalls`: ``` interface PendingToolCall { resolve: (result: ToolResult) => void; reject: (error: Error) => void; timeout: ReturnType<typeof setTimeout>; delegateId: string; toolName: string; toolUseId: string; // ← add this } ``` Then in `executeToolOnDelegate`: ``` this.pendingCalls.set(requestId, { resolve, reject, timeout, delegateId, toolName: call.name, toolUseId: call.id, // ← store it }); ``` And in `failPendingCalls`: ``` pending.resolve({ toolUseId: pending.toolUseId, // ← use stored ID content: `Delegate "${delegateId}" disconnected during tool execution (tool: ${pending.toolName})`, isError: true, }); ``` How can I resolve this? If you propose a fix, please make it concise.
-
deprecated-claude-app/backend/src/database/index.ts, line 225-263 (link)Linear scan on every authentication is a DoS vector
validateDelegateApiKeyiterates over the entiredelegateApiKeysmap for every incoming WebSocket connection. Because every match triggers abcrypt.compare(intentionally slow, ~100 ms), an attacker who can open many parallel WebSocket connections with arbitraryapiKeyvalues will force the server to run N bcrypt operations in parallel, exhausting CPU.The delegate WebSocket path is also reached before the normal token check (see
websocket/handler.tsline 699) and has no rate limiting.Two complementary mitigations are needed:
- Index by prefix — add
private delegateApiKeysByPrefix: Map<string, string> = new Map()(prefix → keyId) to make the look-up O(1) before the bcrypt step. - Rate-limit the WebSocket upgrade path — track failed attempts per remote IP/userId and add a short back-off (e.g. with
express-rate-limiton the HTTP upgrade handler or a per-IP failure counter).
Prompt To Fix With AI
This is a comment left during a code review. Path: deprecated-claude-app/backend/src/database/index.ts Line: 225-263 Comment: **Linear scan on every authentication is a DoS vector** `validateDelegateApiKey` iterates over the entire `delegateApiKeys` map for every incoming WebSocket connection. Because every match triggers a `bcrypt.compare` (intentionally slow, ~100 ms), an attacker who can open many parallel WebSocket connections with arbitrary `apiKey` values will force the server to run N bcrypt operations in parallel, exhausting CPU. The delegate WebSocket path is also reached before the normal token check (see `websocket/handler.ts` line 699) and has no rate limiting. Two complementary mitigations are needed: 1. **Index by prefix** — add `private delegateApiKeysByPrefix: Map<string, string> = new Map()` (prefix → keyId) to make the look-up O(1) before the bcrypt step. 2. **Rate-limit the WebSocket upgrade path** — track failed attempts per remote IP/userId and add a short back-off (e.g. with `express-rate-limit` on the HTTP upgrade handler or a per-IP failure counter). How can I resolve this? If you propose a fix, please make it concise.
- Index by prefix — add
Prompt To Fix All With AI
This is a comment left during a code review.
Path: deprecated-claude-app/backend/src/delegate/delegate-manager.ts
Line: 869-877
Comment:
**Empty `toolUseId` when failing pending calls**
`failPendingCalls` resolves each pending call with `toolUseId: ''`. The `PendingToolCall` interface doesn't store the original `call.id`, so there is no way to fill in the correct ID here.
In practice this is rescued by `ToolRegistry.executeWithTimeout`, which wraps the result with `promise.then(r => ({ ...r, toolUseId }))` before returning, overwriting the empty string with the correct ID. However, this is a latent correctness hazard: any call path that invokes `executeToolOnDelegate` without going through `executeWithTimeout` (e.g. a future direct call) will silently emit `toolUseId: ''`, causing the Anthropic API to reject the tool_result block.
The fix is to store the tool-use ID in `PendingToolCall` and use it in `failPendingCalls`:
```
interface PendingToolCall {
resolve: (result: ToolResult) => void;
reject: (error: Error) => void;
timeout: ReturnType<typeof setTimeout>;
delegateId: string;
toolName: string;
toolUseId: string; // ← add this
}
```
Then in `executeToolOnDelegate`:
```
this.pendingCalls.set(requestId, {
resolve,
reject,
timeout,
delegateId,
toolName: call.name,
toolUseId: call.id, // ← store it
});
```
And in `failPendingCalls`:
```
pending.resolve({
toolUseId: pending.toolUseId, // ← use stored ID
content: `Delegate "${delegateId}" disconnected during tool execution (tool: ${pending.toolName})`,
isError: true,
});
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: deprecated-claude-app/backend/src/database/index.ts
Line: 225-263
Comment:
**Linear scan on every authentication is a DoS vector**
`validateDelegateApiKey` iterates over the entire `delegateApiKeys` map for every incoming WebSocket connection. Because every match triggers a `bcrypt.compare` (intentionally slow, ~100 ms), an attacker who can open many parallel WebSocket connections with arbitrary `apiKey` values will force the server to run N bcrypt operations in parallel, exhausting CPU.
The delegate WebSocket path is also reached before the normal token check (see `websocket/handler.ts` line 699) and has no rate limiting.
Two complementary mitigations are needed:
1. **Index by prefix** — add `private delegateApiKeysByPrefix: Map<string, string> = new Map()` (prefix → keyId) to make the look-up O(1) before the bcrypt step.
2. **Rate-limit the WebSocket upgrade path** — track failed attempts per remote IP/userId and add a short back-off (e.g. with `express-rate-limit` on the HTTP upgrade handler or a per-IP failure counter).
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: deprecated-claude-app/shared/src/types.ts
Line: 534-542
Comment:
**`ToolResultContentBlockSchema.content` is too narrow**
`content` is declared as `z.string()`, but both the delegate protocol (`ToolCallResponseMessageSchema.result.content`) and the `ToolResult` interface allow `string | Array<{type: string; ...}>`. The Anthropic API also allows tool results with multiple content blocks (e.g. text + image).
When a delegate returns a structured array result, `handleToolCallResponse` in `DelegateManager` faithfully passes the array through, but the moment that content block is validated against `ContentBlockSchema` it will fail to parse (or silently coerce to `''` through the `|| ''` fallback in `convertMembraneContentBlock`).
```suggestion
export const ToolResultContentBlockSchema = z.object({
type: z.literal('tool_result'),
tool_use_id: z.string(), // Matches the tool_use id
content: z.union([z.string(), z.array(z.record(z.unknown()))]), // Result content (string or array of content blocks)
is_error: z.boolean().optional() // Whether the result is an error
});
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: deprecated-claude-app/backend/src/services/membrane-inference.ts
Line: 1870-1873
Comment:
**Verbose debug logging throughout production inference path**
`MembraneInferenceService` and `message-converter.ts` contain 30+ `console.log` calls (several with emoji decorators like `🚀`, `📤`, `📥`, `🎯`, `🧹`, `📦`, `📊`) that fire on every inference request. These will be emitted to production server logs and make them difficult to read. The same issue exists in `prepareMessages` and the cache-control debug block around lines 2113–2158.
Consider gating these behind an environment-variable flag or the existing `Logger.debug()` utility used elsewhere in the codebase:
```
// Replace console.log with Logger.debug or guard:
if (process.env.MEMBRANE_DEBUG) {
console.log(`[MembraneInference] ...`);
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: deprecated-claude-app/frontend/src/composables/useDelegates.ts
Line: 135-137
Comment:
**`oldToolNames` is computed but never used**
`oldToolNames` is assigned on line 135 but never referenced — only `newToolNames` is computed for comparison and it too is unused (the comment `// No need to set refreshing - silent update` implies the comparison was removed during cleanup but the variable wasn't).
```suggestion
toolFetchTimeout = setTimeout(async () => {
await fetchTools();
// Silent update — no loading state needed
toolFetchTimeout = null;
}, 500); // Increased debounce to 500ms
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix: portable bug fixes from KradThed fo..." | Re-trigger Greptile
| transcript: z.string().optional() // Text transcript of the audio | ||
| }); | ||
|
|
||
| // Tool use content block - when model calls a tool | ||
| export const ToolUseContentBlockSchema = z.object({ | ||
| type: z.literal('tool_use'), | ||
| id: z.string(), // Tool use ID for matching with result | ||
| name: z.string(), // Tool name | ||
| input: z.record(z.unknown()) // Tool input parameters |
There was a problem hiding this comment.
ToolResultContentBlockSchema.content is too narrow
content is declared as z.string(), but both the delegate protocol (ToolCallResponseMessageSchema.result.content) and the ToolResult interface allow string | Array<{type: string; ...}>. The Anthropic API also allows tool results with multiple content blocks (e.g. text + image).
When a delegate returns a structured array result, handleToolCallResponse in DelegateManager faithfully passes the array through, but the moment that content block is validated against ContentBlockSchema it will fail to parse (or silently coerce to '' through the || '' fallback in convertMembraneContentBlock).
| transcript: z.string().optional() // Text transcript of the audio | |
| }); | |
| // Tool use content block - when model calls a tool | |
| export const ToolUseContentBlockSchema = z.object({ | |
| type: z.literal('tool_use'), | |
| id: z.string(), // Tool use ID for matching with result | |
| name: z.string(), // Tool name | |
| input: z.record(z.unknown()) // Tool input parameters | |
| export const ToolResultContentBlockSchema = z.object({ | |
| type: z.literal('tool_result'), | |
| tool_use_id: z.string(), // Matches the tool_use id | |
| content: z.union([z.string(), z.array(z.record(z.unknown()))]), // Result content (string or array of content blocks) | |
| is_error: z.boolean().optional() // Whether the result is an error | |
| }); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: deprecated-claude-app/shared/src/types.ts
Line: 534-542
Comment:
**`ToolResultContentBlockSchema.content` is too narrow**
`content` is declared as `z.string()`, but both the delegate protocol (`ToolCallResponseMessageSchema.result.content`) and the `ToolResult` interface allow `string | Array<{type: string; ...}>`. The Anthropic API also allows tool results with multiple content blocks (e.g. text + image).
When a delegate returns a structured array result, `handleToolCallResponse` in `DelegateManager` faithfully passes the array through, but the moment that content block is validated against `ContentBlockSchema` it will fail to parse (or silently coerce to `''` through the `|| ''` fallback in `convertMembraneContentBlock`).
```suggestion
export const ToolResultContentBlockSchema = z.object({
type: z.literal('tool_result'),
tool_use_id: z.string(), // Matches the tool_use id
content: z.union([z.string(), z.array(z.record(z.unknown()))]), // Result content (string or array of content blocks)
is_error: z.boolean().optional() // Whether the result is an error
});
```
How can I resolve this? If you propose a fix, please make it concise.
Summary
Extracts the tool calling foundation from KradThed's
feature/mcpl-migrationfork, which implements a much larger set of features (MCPL protocol, sub-agent orchestration, event sourcing, security hardening). This PR isolates just the tool calling layer — the part that can be cleanly separated — plus a set of portable bug fixes identified during the review.What's included
Tool calling foundation (3 cherry-picked commits from KradThed/feature/mcpl-migration):
MembraneInferenceService,MessageConverter)DelegateHandler,DelegateManager,TriggerHandler)ToolRegistry,ServerTools)delegate__toolNameprefixes for conflict-free multi-delegate supportToolConfig,DelegateApiKey,ToolUseContentBlock,ToolResultContentBlockPortable bug fixes (manually extracted from the fork's review):
findIndex || 0logic error in branch index computation (3 locations) —findIndexreturns-1(truthy), so|| 0never triggersgeneratedBranchIds.length > 0bounds check inhandleRegenerateandhandleEditascasts)onmessagehandler on reconnect (prevents stale callbacks)What's intentionally excluded
The KradThed fork contains ~25 commits / 123 files / ~37k lines of insertions. After dependency analysis, we determined that the following layers are deeply entangled with each other and cannot be cleanly extracted without the full stack:
delegate-handler.ts,protocol.ts, andtool-registry.tsConflict resolution notes
Cherry-picks required conflict resolution against main's recent PRs:
personaContextparameter to inference signatures — resolved by keeping bothpersonaContextandtoolOptionsparametersSec-WebSocket-Protocolheader approachZero new TypeScript errors introduced (verified via
tsc --noEmitdiff against main).Credit
Tool calling implementation by Asura (@KradThed). Bug fixes originally identified in commits
988e58e,ca388bc, and17ec646of thefeature/mcpl-migrationbranch. Dependency analysis and extraction by @AzothCat.Test plan
🤖 Generated with Claude Code