AgentCast extends the Cloudflare Agents SDK with browser automation and live screencasting. Get full visibility into what your agents are doing—watch them navigate, interact, and extract data from web applications in real-time.
- Watch agents work: Stream live video of your browser agent's actions via CDP—see exactly what it's doing as it happens
- Control via natural language: Talk to your agent with plain English instructions (powered by Stagehand)
- Built-in browser environment: No extra setup—browser automation comes out of the box with all Agents SDK features
- Live debugging: Debug agents by watching them, not guessing from logs
- Session tracking: Monitor what your agents are doing via activity timestamps and session status
npm install agentcast agents @cloudflare/containers zodRequirements:
- Cloudflare Workers with Containers binding
- Cloudflare Agents SDK (peer dependency)
- Runs exclusively on Cloudflare infrastructure
import { BrowserAgent } from 'agentcast';
import { z } from 'zod';
export class WebAgent extends BrowserAgent {
protected getContainer() {
return this.env.CONTAINER;
}
// Step 1: Initialize and get the CDP URL for watching
async start() {
await this.initialize({
sessionId: this.name,
viewport: { width: 1280, height: 720 }
});
return {
cdpUrl: this.cdpUrl, // WebSocket URL for CDP connection
viewUrl: this.viewUrl // Browser URL to watch the session
};
}
// Step 2: Perform actions while someone watches
async run() {
await this.act('goto example.com');
await this.act('click the login button');
await this.act('fill in the email field with test@example.com');
await this.act('scroll down and click the submit button');
// Extract structured data
const data = await this.extract(z.object({
title: z.string(),
price: z.number(),
}));
return { data };
}
}- Stagehand handles the natural language → browser actions translation
- CDP screencast streams frames to connected viewers
- Cloudflare Containers run the browser instances
- Workers orchestrate sessions and expose the API
The act() method uses Stagehand's agent, so you can give complex, conversational instructions and it figures out how to execute them.
Abstract base class that extends Agent with browser capabilities.
Methods:
initialize(options): Promise<void>- Initialize browser session (called automatically by other methods, but can be called manually)goto(url: string): Promise<void>- Navigate to URLact(instruction: string): Promise<void>- Execute natural language instructionextract<T>(schema: z.ZodSchema<T>): Promise<T>- Extract structured data from the pageresize(viewport: { width: number; height: number }): Promise<void>- Resize browserreload(): Promise<void>- Reload current pagestop(): Promise<void>- Stop browser sessionconfirm(prompt: string, options?: { timeout?: number }): Promise<boolean>- Pause execution and wait for confirmation (viewer integration coming soon)
Properties:
viewUrl: string- URL to view the browser sessioncdpUrl: string- WebSocket URL for CDP connectionstatus: SessionStatus- Current session status
The embedded viewer (viewUrl) lets you watch and control the browser in real-time:
- Mouse: Click, scroll, drag the browser from the viewer
- Keyboard: Type directly into the browser via the viewer
- Touch: Full touch event support
The viewer iframe exposes a window.agentcastViewer API for programmatic control:
// Insert bulk text (faster than keyboard events)
window.agentcastViewer.insertText('Large amount of text...');
// Send CDP commands directly
window.agentcastViewer.sendCdpCommand('Page.navigate', { url: 'https://example.com' });
// Handle file uploads
window.agentcastViewer.uploadFiles(['/path/to/file']);
// Control zoom level
window.agentcastViewer.setZoom(1.5);
// Toggle metrics overlay (FPS, ping, frame drops)
window.agentcastViewer.toggleMetrics();
// Copy to clipboard
await window.agentcastViewer.copyToClipboard('text to copy');The viewer automatically captures and forwards:
- Console logs: Browser console output appears in viewer logs
- Browser logs: Console API and Log domain events
- Frame metrics: Real-time FPS counter
- Connection metrics: Ping/latency measurements
- Frame drops: Connection quality indicator
Send arbitrary Chrome DevTools Protocol commands from parent window:
// From parent window (demo or orchestration layer)
iframe.contentWindow.postMessage({
type: 'cdp_command',
method: 'Page.captureScreenshot',
params: { format: 'png' }
}, '*');
// Listen for responses
window.addEventListener('message', (event) => {
if (event.data.type === 'cdp_response') {
console.log('Screenshot:', event.data.result);
}
});When a file chooser dialog opens in the browser:
// Intercept and handle file uploads
iframe.contentWindow.postMessage({
type: 'file_upload',
files: ['/local/path/to/file.pdf']
}, '*');Browser logs are automatically forwarded to parent:
// Listen for browser console messages
window.addEventListener('message', (event) => {
if (event.data.type === 'viewer-log') {
console.log('Browser:', event.data.message, event.data.level);
}
});When first frame is received, a metrics overlay displays:
- FPS: Frames per second
- Ping: Round-trip latency to CDP endpoint
- Drops: Frame drop counter
Toggle visibility with window.agentcastViewer.toggleMetrics()
Abstract Method:
protected abstract getContainer(): DurableObjectNamespace- Must be implemented to return container namespace
Types:
SessionState- Internal session state type (exported for advanced use cases)
Each agent session tracks activity and status:
export class WebAgent extends BrowserAgent {
async run() {
await this.act('navigate to example.com');
await this.act('click the button');
const data = await this.extract(z.object({ title: z.string() }));
// Check current status and activity
console.log(this.status); // 'ready' | 'starting' | 'error' | 'stopped'
console.log(this.viewUrl); // URL to watch the session
console.log(this.cdpUrl); // CDP WebSocket endpoint
}
}Properties:
status- Current session status (starting, ready, error, stopped)viewUrl- URL to watch the browser in real-timecdpUrl- WebSocket URL for Chrome DevTools Protocol connection
For client applications (SvelteKit, React, etc.), import only constants and types:
import { CLIENT_STATUS, SERVER_STATUS, type SessionStatus } from 'agentcast';These are safe to use in any environment and don't require Workers runtime.
- Cloudflare Workers with Containers binding
- Cloudflare Agents SDK (peer dependency)
- Zod for schema validation
MIT