Pure Browser Automation CLI for AI Agents - Fast, anti-detection, session persistence, @eN element references

hubo1989/hyper-agent-browser


hyper-agent-browser (hab)

Pure Browser Automation CLI for AI Agents

📖 中文文档 (Chinese Documentation)

✨ Features

  • 🎯 @eN Element References - No manual selectors needed, auto-generates @e1, @e2 references
  • 🔐 Session Persistence - Maintains login state, supports multi-account isolation
  • 🎭 Anti-Detection - Built on Patchright, bypasses automation detection
  • ⚡ Fast Startup - Bun runtime, cold start ~25ms
  • 🤖 AI Agent Friendly - Designed for Claude Code and other AI agents
  • 🔒 Security Hardened - Sandbox isolation, permission control, session protection
  • 📊 Data Extraction - Auto-extract tables/lists/forms/metadata
  • 🌐 Network Monitoring - Intercept XHR/Fetch requests, get API data directly
  • ⏳ Smart Waiting - Network idle + DOM stable dual strategy

🚀 Quick Start

Installation

Using npm (Recommended)

# Global install
npm install -g hyper-agent-browser

# Or use Bun
bun install -g hyper-agent-browser

# Or use npx (no install needed)
npx hyper-agent-browser --version

From Source

git clone https://github.com/hubo1989/hyper-agent-browser.git
cd hyper-agent-browser
bun install
bun run build  # Build binary to dist/hab

Download Pre-built Binary

Visit GitHub Releases to download binaries for your platform.

Basic Usage

# 1. Open a webpage (headed mode to see browser)
hab --headed open https://google.com

# 2. Get interactive elements snapshot
hab snapshot -i

# Output example:
# URL: https://google.com
# Title: Google
#
# Interactive Elements:
# @e1  [textbox]   "Search" (focused)
# @e2  [button]    "Google Search"
# @e3  [button]    "I'm Feeling Lucky"
# @e4  [link]      "Gmail"
# @e5  [link]      "Images"

# 3. Use @eN references to interact
hab fill @e1 "Bun JavaScript runtime"
hab press Enter

# 4. Wait for page load
hab wait 2000

# 5. Take screenshot
hab screenshot -o result.png

Session Management (Multi-Account Isolation)

# Personal Gmail account
hab -s personal-gmail open https://mail.google.com
hab -s personal-gmail snapshot -i

# Work Gmail account
hab -s work-gmail open https://mail.google.com
hab -s work-gmail snapshot -i

# List all sessions
hab sessions

# Close specific session
hab close -s personal-gmail
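
When the same check has to run against several isolated sessions, a small wrapper keeps the -s flag consistent. The sketch below is illustrative only: hab_each_session is a hypothetical helper name, not a hab command, and it assumes nothing beyond the documented -s <name> option and a single-word subcommand such as url or title.

```shell
# Hypothetical helper (not part of hab): run one subcommand in each named session.
# Usage: hab_each_session <subcommand> <session>...
hab_each_session() {
  subcommand="$1"
  shift
  for session in "$@"; do
    # Prefix each result with the session name so outputs stay distinguishable.
    printf '%s\t%s\n' "$session" "$(hab -s "$session" "$subcommand")"
  done
}
```

For example, hab_each_session url personal-gmail work-gmail would print one tab-separated line per session, assuming both sessions are already open.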

Data Extraction

# Extract table data
hab open https://example.com/users
hab extract-table > users.json

# Extract list data (auto-detect product/article lists)
hab extract-list --selector ".product-list" > products.json

# Extract form state
hab extract-form > form_data.json

# Extract page metadata (SEO/OG/Schema.org)
hab extract-meta --include seo,og > metadata.json

Network Monitoring

# Start network listener
LISTENER_ID=$(hab network-start --filter xhr,fetch --url-pattern "*/api/*" | jq -r '.listenerId')

# Perform actions (pagination/clicks)
hab click @e5
hab wait-idle

# Stop listener and get all API data
hab network-stop $LISTENER_ID > api_data.json

Smart Waiting

# Wait for page fully idle (network + DOM)
hab wait-idle --timeout 30000

# Wait for element visible
hab wait-element "css=.data-row" --state visible

# Wait for loading animation to disappear
hab wait-element "css=.loading" --state detached

📖 Command Reference

Navigation Commands

Command      Description           Example
open <url>   Open webpage          hab open https://example.com
reload       Refresh current page  hab reload
back         Go back               hab back
forward      Go forward            hab forward

Action Commands

Command                      Description                       Example
click <selector>             Click element                     hab click @e1
fill <selector> <value>      Fill input field                  hab fill @e1 "hello"
type <text>                  Type text character by character  hab type "password"
press <key>                  Press key                         hab press Enter
scroll <direction> [amount]  Scroll page                       hab scroll down 500
hover <selector>             Hover over element                hab hover @e3
select <selector> <value>    Select dropdown option            hab select @e2 "Option 1"
wait <ms|condition>          Wait for time or condition        hab wait 3000

Info Commands

Command                               Description         Example
snapshot [-i|--interactive]           Get page snapshot   hab snapshot -i
screenshot [-o <file>] [--full-page]  Take screenshot     hab screenshot -o page.png
url                                   Get current URL     hab url
title                                 Get page title      hab title
evaluate <script>                     Execute JavaScript  hab evaluate "document.title"

Session Commands

Command            Description        Example
sessions           List all sessions  hab sessions
close [-s <name>]  Close session      hab close -s gmail

Global Options

Option                     Description                 Default
-s, --session <name>       Session name                default
--headed                   Headed mode (show browser)  false
--channel <chrome|msedge>  Browser type                chrome
--timeout <ms>             Timeout                     30000

🤖 AI Agent Integration (Claude Code)

hyper-agent-browser is designed for AI agents and integrates seamlessly with Claude Code.

Install Skill File

# Method 1: Copy from local repo
mkdir -p ~/.claude/skills/hyper-agent-browser
cp skills/hyper-agent-browser.md ~/.claude/skills/hyper-agent-browser/skill.md

# Method 2: Direct download
mkdir -p ~/.claude/skills/hyper-agent-browser
curl -o ~/.claude/skills/hyper-agent-browser/skill.md \
  https://raw.githubusercontent.com/hubo1989/hyper-agent-browser/main/skills/hyper-agent-browser.md

Usage Examples

After installing the skill, Claude Code will automatically recognize and use hab commands:

"Help me open Google, search for 'Bun runtime' and take a screenshot"
"Log into my Gmail account and find the number of unread emails"
"Visit Twitter and get all tweet titles from the homepage"

Claude will automatically:

  1. Use hab open to open the webpage
  2. Use hab snapshot -i to get element references
  3. Analyze the snapshot to find target elements (e.g., @e5)
  4. Use hab click @e5 and other commands to complete the task
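
Step 3, mapping a visible label to its @eN reference, can also be scripted without an agent. The helper below is a minimal sketch and not part of hab: find_ref is a hypothetical name, and it assumes snapshot lines follow the shape shown in the Basic Usage output example (a reference, a bracketed role, then the label in double quotes).

```shell
# Hypothetical helper: print the first @eN reference whose quoted label equals $1.
# Reads snapshot text on stdin; assumes lines shaped like: @e2  [button]  "Google Search"
find_ref() {
  LABEL="$1" awk '
    # Match only lines that start with a reference and contain the exact quoted label.
    $1 ~ /^@e[0-9]+$/ && index($0, "\"" ENVIRON["LABEL"] "\"") { print $1; found = 1; exit }
    # Exit non-zero when no line matched, so callers can stop early.
    END { exit !found }
  '
}
```

With hab on the PATH, ref=$(hab snapshot -i | find_ref "Google Search") && hab click "$ref" would click the matching element. The non-zero exit status on a miss keeps the click from running with an empty reference.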

📋 Selector Format

Format  Example         Description                        Recommended
@eN     @e1, @e5        Element reference (from snapshot)  ⭐⭐⭐⭐⭐
css=    css=#login      CSS selector                       ⭐⭐⭐
text=   text=Sign in    Text match                         ⭐⭐⭐⭐
xpath=  xpath=//button  XPath selector                     ⭐⭐

Recommended: Use @eN references:

  • No manual selector writing
  • Auto-handles dynamic IDs/Classes
  • AI Agent friendly
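
The prefix decides how a selector string is interpreted. As a rough illustration of that dispatch, a shell case statement can tell the four formats apart. This is not hab's actual parser (which lives in src/utils/selector.ts), selector_kind is a hypothetical name, and treating an unprefixed string as unknown is an assumption of this sketch.

```shell
# Hypothetical sketch: report which selector format a string uses.
selector_kind() {
  case "$1" in
    @e[0-9]*) echo "ref" ;;    # @e followed by a digit (loose check), e.g. @e5
    css=*)    echo "css" ;;    # CSS selector, e.g. css=#login
    text=*)   echo "text" ;;   # text match, e.g. text=Sign in
    xpath=*)  echo "xpath" ;;  # XPath selector, e.g. xpath=//button
    *)        echo "unknown"; return 1 ;;
  esac
}
```

A wrapper script could call selector_kind "$sel" before passing the value to hab click, rejecting malformed input early instead of waiting for a browser-side error.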

🔒 Security Features

  • ✅ evaluate Sandbox - Whitelist mode, blocks dangerous operations
  • ✅ Session File Protection - Permissions set to 0o600
  • ✅ Chrome Extension Verification - Whitelist + dangerous permission filtering
  • ✅ System Keychain Isolation - Isolated password storage by default
  • ✅ Config Key Whitelist - Prevents dangerous browser argument injection
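
A mode of 0o600 means the owner can read and write the file and nobody else has any access. The check below is a generic sketch for confirming that on any file. is_owner_only is a hypothetical helper, and because the session file location is not stated here, the path has to be supplied by the reader.

```shell
# Hypothetical helper: succeed only if the file's mode string is exactly -rw------- (0600).
is_owner_only() {
  # ls -ld prints the file type and permission bits in the first 10 characters.
  mode="$(ls -ld -- "$1" | cut -c1-10)"
  [ "$mode" = "-rw-------" ]
}
```

Calling is_owner_only path/to/session-file returns success for a file with mode 0600 and failure for anything more permissive, such as 0644.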

🏗️ Architecture

src/
├── cli.ts              # CLI entry (Commander.js)
├── browser/
│   └── manager.ts      # Browser lifecycle management
├── daemon/
│   ├── server.ts       # Daemon server
│   ├── client.ts       # Daemon client
│   └── browser-pool.ts # Browser instance pool
├── session/
│   ├── manager.ts      # Session management
│   └── store.ts        # UserData persistence
├── commands/
│   ├── navigation.ts   # open/reload/back/forward
│   ├── actions.ts      # click/fill/type/press/scroll
│   ├── info.ts         # snapshot/screenshot/evaluate
│   ├── extract.ts      # Data extraction commands
│   └── network.ts      # Network monitoring
├── snapshot/
│   ├── accessibility.ts    # Extract from Accessibility Tree
│   ├── dom-extractor.ts    # DOM extractor (fallback)
│   └── reference-store.ts  # @eN mapping storage
└── utils/
    ├── selector.ts     # Selector parsing
    ├── config.ts       # Config management
    └── errors.ts       # Error handling

📊 Tech Stack

  • Bun 1.2.21 - JavaScript runtime
  • Patchright 1.57.0 - Anti-detection Playwright fork
  • Commander.js 12.1.0 - CLI framework
  • Zod 3.25.76 - Data validation
  • Biome 1.9.4 - Code linting

🛠️ Development

# Clone repo
git clone https://github.com/hubo1989/hyper-agent-browser.git
cd hyper-agent-browser

# Install dependencies
bun install

# Development mode
bun dev -- --headed open https://google.com

# Run tests
bun test

# Type check
bun run typecheck

# Lint
bun run lint

# Build
bun run build       # Current platform
bun run build:all   # All platforms

📚 Documentation

🤝 Contributing

Pull Requests welcome! Please ensure:

  • ✅ TypeScript type check passes: bun run typecheck
  • ✅ Tests pass: bun test
  • ✅ Lint passes: bun run lint

📄 License

MIT

🔗 Links

🙏 Acknowledgments

Made with ❤️ for AI Agents
