Merged
71 changes: 1 addition & 70 deletions CLAUDE.md
@@ -1,70 +1 @@
-## Current task
-
-Implement `agent` module.
-
-Filesystem tools (local and sandbox)
-
-1. read
-2. write / search-replace
-3. ls / glob
-4. grep
-5. bash
-
-Durability
-
-1. state tracking
-2. durable execution
-
-Agent features
-
-1. skills
-2. web tools / browser
-3. memory management
-4. sub-agents
-
-## References
-
-See `.reference`
-
-### agent-sdk — Vercel Agent SDK (TypeScript, primary inspiration)
-Opinionated sandboxed coding agent framework. LLM loop + sandbox + tools + session persistence. Biased towards Next.js/Vercel, but the tool design and sandbox interface are the reference spec.
-- 7 built-in tools: Read, Write, Edit, List, Grep, Bash, JavaScript (meta-tool that orchestrates other tools via code)
-- Sandbox abstraction with two bindings (local dev, Vercel VM) — exec, writeFiles, lifecycle (start/stop/snapshot)
-- Process manager — persistent CWD per session, background process support
-- Skills — SKILL.md discovery with YAML frontmatter, progressive disclosure
-- Durable sessions with send/stream/interrupt
-- Storage backends (local filesystem, Vercel Postgres, custom HTTP)
-- Prompt caching (Anthropic/OpenAI automatic cache breakpoints)
-- No web/search tools, no sub-agents, no memory, no context management
-
-### deepagents — LangChain Deep Agents (Python, patterns to draw from)
-Ready-to-run agent harness built on LangGraph. Middleware stack architecture where each capability is a composable layer. Most feature-complete of the three.
-- 7 filesystem tools: ls, read_file, write_file, edit_file, glob, grep, execute
-- Backend protocol ABC with pluggable impls (in-memory state, local filesystem, local shell, LangGraph store, composite routing)
-- Sub-agent spawning via `task` tool with isolated context windows
-- Auto-summarization when context hits 85% of window, evicts history to file
-- Large result eviction — results >20k tokens written to file, replaced with preview
-- Memory — AGENTS.md files injected into system prompt, self-modifiable by agent
-- Skills — SKILL.md progressive disclosure (same pattern as Vercel)
-- Patch dangling tool calls from interrupted sessions
-- Web tools (CLI only): web_search (Tavily), fetch_url (HTML to markdown), http_request
-
-### pi — Python Intelligence (terminal coding agent)
-Terminal agent (Rust PTY + Python). Uses pydantic-ai. Local filesystem tools are pure pathlib, trivially portable. Shell tools depend on Pi's Rust binary (not portable). Key portable code:
-- list_files, read_file, read_chunk — pure pathlib
-- search_replace — exact match + rapidfuzz fuzzy fallback (threshold 80%)
-- rewrite — write + mkdir + difflib diff
-- exec (raw) — `asyncio.create_subprocess_exec`
-- `@suppress_errors` decorator
-- edit_lock (`asyncio.Lock` for concurrent write safety)
-
-### riff — Code Generation Agent (web app)
-Same tool patterns as Pi but targeting remote Daytona sandboxes. Uses pydantic-ai. Key portable code beyond what Pi has:
-- grep — builds ripgrep command with flags
-- tree — directory structure with exclude patterns
-- lint() after edit — ruff for Python, biome for TS
-- TodoList/Todo/TodoStatus pydantic models
-- add_todos, mark_todos, todo_status tools
-- `@suppress_errors` with recursive timeout detection (prefer over Pi's)
-- repair_stray_tool_calls — patches dangling tool calls
+1. treat `stream_step` and `stream_loop` as user code. they are convenience functions that could be reimplemented by the user, they *must* stay clean.
1 change: 0 additions & 1 deletion examples/fastapi-vite/backend/agent.py
@@ -1,7 +1,6 @@
"""Agent logic for the chat demo."""

import os
-
from typing import Any

import vercel_ai_sdk as ai
3 changes: 1 addition & 2 deletions examples/fastapi-vite/backend/routes/chat.py
@@ -11,8 +11,7 @@
import vercel_ai_sdk as ai
import vercel_ai_sdk.ai_sdk_ui

-from .. import agent
-from .. import storage
+from .. import agent, storage

router = fastapi.APIRouter()
file_storage = storage.FileStorage()
2 changes: 1 addition & 1 deletion examples/multiagent-textual/client.py
@@ -12,12 +12,12 @@
import asyncio
import json

+import rich.text
import textual
import textual.app
import textual.containers
import textual.widgets
import textual.worker
-import rich.text
import websockets

import vercel_ai_sdk as ai
1 change: 0 additions & 1 deletion examples/multiagent-textual/server.py
@@ -14,7 +14,6 @@
import json
import os
import warnings
-
from typing import Any

import fastapi
3 changes: 1 addition & 2 deletions examples/samples/custom_loop.py
@@ -3,7 +3,6 @@
import asyncio
import os
from collections.abc import AsyncGenerator
-
from typing import Any

import vercel_ai_sdk as ai
@@ -29,7 +28,7 @@ async def custom_stream_step(
     messages: list[ai.Message],
     tools: list[ai.Tool[..., Any]],
     label: str | None = None,
-) -> AsyncGenerator[ai.Message, None]:
+) -> AsyncGenerator[ai.Message]:
     """Wraps llm.stream to inject a label on every message."""
     async for msg in llm.stream(messages=messages, tools=tools):
         msg.label = label
3 changes: 1 addition & 2 deletions examples/samples/mcp.py → examples/samples/mcp_tools.py
@@ -2,11 +2,10 @@

import asyncio
import os
+from typing import Any

import rich

-from typing import Any
-
import vercel_ai_sdk as ai


55 changes: 55 additions & 0 deletions examples/samples/structured_output.py
@@ -0,0 +1,55 @@
+import asyncio
+import os
+
+import pydantic
+
+import vercel_ai_sdk as ai
+
+
+class WeatherForecast(pydantic.BaseModel):
+    city: str
+    temperature: float
+    conditions: str
+    humidity: int
+    wind_speed: float
+
+
+async def main() -> None:
+    # OpenAI-compatible provider
+    # llm = ai.openai.OpenAIModel(
+    #     model="anthropic/claude-opus-4.6",
+    #     base_url="https://ai-gateway.vercel.sh/v1",
+    #     api_key=os.environ.get("AI_GATEWAY_API_KEY"),
+    # )
+
+    # Anthropic provider
+    llm = ai.anthropic.AnthropicModel(
+        model="claude-opus-4-6",
+        api_key=os.environ.get("ANTHROPIC_API_KEY"),
+    )
+
+    messages = ai.make_messages(
+        system="You are a weather assistant. Respond with realistic weather data.",
+        user="What's the weather like in San Francisco right now?",
+    )
+
+    # Streaming: watch the JSON arrive incrementally, get validated output at the end
+    print("--- Streaming ---")
+    async for msg in llm.stream(messages, output_type=WeatherForecast):
+        if msg.text_delta:
+            print(msg.text_delta, end="", flush=True)
+        if msg.output:
+            print(f"\n\nParsed: {msg.output}")
+
+    # Non-streaming: get the validated output directly
+    print("\n--- Buffer ---")
+    msg = await llm.buffer(messages, output_type=WeatherForecast)
+    print(f"City: {msg.output.city}")
+    print(f"Temperature: {msg.output.temperature}")
+    print(f"Conditions: {msg.output.conditions}")
+    print(f"Humidity: {msg.output.humidity}%")
+    print(f"Wind: {msg.output.wind_speed} mph")
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
1 change: 0 additions & 1 deletion examples/temporal-durable/activities.py
@@ -16,7 +16,6 @@
import vercel_ai_sdk as ai
import vercel_ai_sdk.anthropic

-
# ── Tool activities (one per tool, plain functions) ───────────────


3 changes: 1 addition & 2 deletions examples/temporal-durable/main.py
@@ -15,10 +15,9 @@
import sys
import uuid

+import activities
import temporalio.client
import temporalio.worker
-
-import activities
import workflow

TASK_QUEUE = "agent-durable"
8 changes: 5 additions & 3 deletions examples/temporal-durable/workflow.py
@@ -6,14 +6,15 @@
from collections.abc import AsyncGenerator, Awaitable, Callable, Sequence
from typing import override

+import pydantic
import temporalio.common
import temporalio.workflow

+with temporalio.workflow.unsafe.imports_passed_through():
+    import vercel_ai_sdk as ai

import activities
-
-import vercel_ai_sdk as ai


class DurableModel(ai.LanguageModel):
def __init__(
@@ -29,7 +30,8 @@ async def stream(
         self,
         messages: list[ai.Message],
         tools: Sequence[ai.ToolLike] | None = None,
-    ) -> AsyncGenerator[ai.Message, None]:
+        output_type: type[pydantic.BaseModel] | None = None,
+    ) -> AsyncGenerator[ai.Message]:
         result = await self.call_fn(
             activities.LLMCallParams(
                 messages=[m.model_dump() for m in messages],
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -8,7 +8,7 @@ authors = [
 ]
 requires-python = ">=3.12"
 dependencies = [
-    "anthropic>=0.40.0",
+    "anthropic>=0.83.0",
     "httpx>=0.28.1",
     "mcp>=1.18.0",
     "openai>=2.14.0",
2 changes: 2 additions & 0 deletions src/vercel_ai_sdk/__init__.py
@@ -10,6 +10,7 @@
     Part,
     PartState,
     ReasoningPart,
+    StructuredOutputPart,
     TextPart,
     ToolDelta,
     ToolPart,
@@ -47,6 +48,7 @@
     "StreamResult",
     "Hook",
     "HookPart",
+    "StructuredOutputPart",
     "Checkpoint",
     # Functions
     "tool",
31 changes: 27 additions & 4 deletions src/vercel_ai_sdk/anthropic/__init__.py
@@ -6,6 +6,7 @@
from typing import Any, override

import anthropic
+import pydantic

from .. import core

@@ -119,6 +120,7 @@ async def stream_events(
         self,
         messages: list[core.messages.Message],
         tools: Sequence[core.tools.ToolLike] | None = None,
+        output_type: type[pydantic.BaseModel] | None = None,
     ) -> AsyncGenerator[core.llm.StreamEvent]:
         """Yield raw stream events from Anthropic API."""
         system_prompt, anthropic_messages = _messages_to_anthropic(messages)
@@ -140,12 +142,18 @@ async def stream_events(
                 "budget_tokens": self._budget_tokens,
             }

+        # Structured output: SDK handles schema transformation internally
+        if output_type is not None:
+            kwargs["output_format"] = output_type
+
         # Track block types by index to know what End event to emit
         block_types: dict[int, str] = {}  # index -> "text" | "thinking" | "tool_use"
         tool_ids: dict[int, str] = {}  # index -> tool_call_id
         signature_buffer: dict[int, str] = {}  # index -> accumulated signature

-        async with self._client.messages.stream(**kwargs) as stream:
+        stream_cm = self._client.messages.stream(**kwargs)
+
+        async with stream_cm as stream:
             async for event in stream:
                 if event.type == "content_block_start":
                     block = event.content_block
@@ -208,8 +216,23 @@ async def stream(
         self,
         messages: list[core.messages.Message],
         tools: Sequence[core.tools.ToolLike] | None = None,
+        output_type: type[pydantic.BaseModel] | None = None,
     ) -> AsyncGenerator[core.messages.Message]:
-        """Stream Messages (uses StreamProcessor internally)."""
+        """Stream Messages (uses StreamHandler internally)."""
         handler = core.llm.StreamHandler()
-        async for event in self.stream_events(messages, tools):
-            yield handler.handle_event(event)
+        msg: core.messages.Message | None = None
+        async for event in self.stream_events(messages, tools, output_type=output_type):
+            msg = handler.handle_event(event)
+            yield msg
+
+        # After stream completes, validate and attach structured output part
+        if output_type is not None and msg is not None and msg.text:
+            data = json.loads(msg.text)
+            output_type.model_validate(data)  # fail fast on bad data
+            part = core.messages.StructuredOutputPart(
+                data=data,
+                output_type_name=f"{output_type.__module__}.{output_type.__qualname__}",
+            )
+            msg = msg.model_copy()
+            msg.parts = [*msg.parts, part]
+            yield msg
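
The post-stream step in this hunk (parse the accumulated text once, validate it, then attach the raw data plus a fully qualified type name) can be sketched without the SDK. This is an illustration under stated assumptions, not the PR's implementation: a stdlib dataclass and a hand-written `validate` stand in for `pydantic.BaseModel` and `model_validate`, and `finalize` and the returned dict are hypothetical substitutes for building a `StructuredOutputPart`.

```python
import dataclasses
import json
from typing import Any


@dataclasses.dataclass
class WeatherForecast:
    """Hypothetical output type; a dataclass stands in for pydantic.BaseModel."""

    city: str
    temperature: float
    humidity: int


def validate(output_type: type, data: dict[str, Any]) -> None:
    """Simplified stand-in for model_validate: check field names and types."""
    expected = {f.name: f.type for f in dataclasses.fields(output_type)}
    if set(data) != set(expected):
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    for name, field_type in expected.items():
        # JSON has a single number type, so accept ints where a float is declared.
        accepted = (int, float) if field_type is float else field_type
        if not isinstance(data[name], accepted):
            raise ValueError(f"field {name!r} has the wrong type")


def finalize(text_deltas: list[str], output_type: type) -> dict[str, Any]:
    """Join the streamed text, parse it once, validate, and build the extra part."""
    data = json.loads("".join(text_deltas))
    validate(output_type, data)  # fail fast on bad data
    return {
        "data": data,
        "output_type_name": f"{output_type.__module__}.{output_type.__qualname__}",
    }


# The JSON arrives in arbitrary fragments; only the joined text is parseable.
deltas = ['{"city": "San Fran', 'cisco", "temperature": 18.5,', ' "humidity": 72}']
part = finalize(deltas, WeatherForecast)
```

Storing the plain dict and the type's qualified name, rather than the model instance, keeps the part serializable; that reading of the design is an inference from the hunk, not something the PR states.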
6 changes: 5 additions & 1 deletion src/vercel_ai_sdk/core/llm.py
@@ -4,6 +4,8 @@
import dataclasses
from collections.abc import AsyncGenerator, Sequence

+import pydantic
+
from . import messages as messages_
from . import tools as tools_

@@ -216,6 +218,7 @@ async def stream(
         self,
         messages: list[messages_.Message],
         tools: Sequence[tools_.ToolLike] | None = None,
+        output_type: type[pydantic.BaseModel] | None = None,
     ) -> AsyncGenerator[messages_.Message]:
         raise NotImplementedError
         yield
@@ -224,10 +227,11 @@ async def buffer(
         self,
         messages: list[messages_.Message],
         tools: Sequence[tools_.ToolLike] | None = None,
+        output_type: type[pydantic.BaseModel] | None = None,
     ) -> messages_.Message:
         """Drain the stream and return the final message."""
         final = None
-        async for msg in self.stream(messages, tools):
+        async for msg in self.stream(messages, tools, output_type=output_type):
             final = msg
         if final is None:
             raise ValueError("LLM produced no messages")
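
`buffer` is a thin wrapper over `stream`: it drains the generator, keeps only the last item, and raises if nothing was produced. A self-contained sketch of that drain pattern follows. The module-level `stream` here is a hypothetical stand-in whose items are cumulative text snapshots, assumed to behave like the SDK's messages.

```python
import asyncio
from collections.abc import AsyncGenerator


async def stream(n: int) -> AsyncGenerator[str]:
    """Hypothetical stream: each item is the text accumulated so far."""
    text = ""
    for i in range(n):
        text += str(i)
        yield text


async def buffer(n: int) -> str:
    """Drain the stream and return only the final snapshot."""
    final = None
    async for snapshot in stream(n):
        final = snapshot
    if final is None:
        raise ValueError("stream produced no items")
    return final


result = asyncio.run(buffer(3))
```

Forwarding `output_type` through `buffer`, as the hunk does, is what lets the non-streaming path see the extra message yielded after validation, since that message is the last one the stream produces.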