Condor Eye

Give your AI agents the ability to see what's on screen.

Condor Eye is a lightweight, transparent overlay app that captures screen regions and sends them to the Anthropic API for visual analysis. It exposes an HTTP API and an MCP server so Claude Code (or any agent) can take screenshots, describe UI content, locate windows, and annotate with a drawing pen — all programmatically.

Built with Tauri 2 (Rust + WebView2). Runs natively on Windows.

Features

Transparent overlay — frameless, always-on-top window with a resizable capture frame. Whatever's visible through the frame is what gets captured.
AI-powered vision — sends screenshots to the Anthropic API (Claude) and returns structured descriptions or JSON extractions.
HTTP API — POST /api/capture, POST /api/locate, GET /api/status on port 9050. Any tool that can make HTTP requests can use Condor Eye.
MCP server — exposes condor_eye_capture, condor_eye_locate, condor_eye_windows, and condor_eye_status as Claude Code tools via the Model Context Protocol.
Focus box — draggable/resizable highlight region for focusing the AI's attention on a specific area within the capture frame.
Drawing pen — freehand annotations powered by perfect-freehand. Five color presets, pressure simulation, included in screenshots sent to the API.
Extraction profiles — configurable JSON profiles for different capture scenarios (depth ladders, candlestick charts, quote screens, heatmaps).
Ground truth comparison — optional Redis integration to compare AI extractions against known-good data.
Global hotkey — Ctrl+Shift+C triggers capture from anywhere.

Architecture

┌─────────────────────────────────────────────────────┐
│  WebView2 Frontend (src/)                           │
│  ┌──────────┐ ┌──────────┐ ┌────────┐ ┌──────────┐ │
│  │ Capture  │ │ Focus    │ │ Draw   │ │ Results  │ │
│  │ Frame    │ │ Box      │ │ Canvas │ │ Panel    │ │
│  └──────────┘ └──────────┘ └────────┘ └──────────┘ │
├─────────────────────────────────────────────────────┤
│  Rust Backend (src-tauri/)                          │
│  ┌──────────┐ ┌──────────┐ ┌────────┐ ┌──────────┐ │
│  │ capture  │ │ claude   │ │ http   │ │ compare  │ │
│  │ .rs      │ │ .rs      │ │ _api   │ │ .rs      │ │
│  │ Win32 GDI│ │ Anthropic│ │ axum   │ │ Diff     │ │
│  └──────────┘ └──────────┘ └────────┘ └──────────┘ │
├─────────────────────────────────────────────────────┤
│  MCP Server (mcp/)                                  │
│  Node.js stdio transport — wraps HTTP API           │
└─────────────────────────────────────────────────────┘

Getting Started

Prerequisites

Rust toolchain on Windows (via rustup)
Node.js 18+ (for the MCP server)
Anthropic API key — set ANTHROPIC_API_KEY in your environment or a .env file

Build & Run

# Development mode
cargo tauri dev

# Production build
cargo tauri build

# Run tests
cd src-tauri && cargo test

Environment Variables

Variable	Required	Default	Description
`ANTHROPIC_API_KEY`	Yes	—	Your Anthropic API key
`CLAUDE_MODEL`	No	`claude-haiku-4-5-20251001`	Model for vision calls
`CONDOR_EYE_PORT`	No	`9050`	HTTP API port
`CONDOR_EYE_BIND`	No	`0.0.0.0`	HTTP API bind address
`REDIS_URL`	No	`redis://127.0.0.1:6379`	Redis for ground truth comparison

The app looks for .env files in several locations (first found wins):

Current working directory
Parent directory (for dev mode)
%APPDATA%/Condor Eye/.env (installed app)
Next to the executable

WSL Users

If developing from WSL, use cargo.exe to invoke the Windows Rust toolchain:

cargo.exe tauri dev
cargo.exe tauri build

The HTTP API binds to 0.0.0.0 by default, so you can reach it from WSL at the Windows host IP (usually the WSL gateway).

HTTP API

The app starts an axum HTTP server on port 9050.

`POST /api/capture`

Capture a screen region and get an AI description.

curl -X POST http://localhost:9050/api/capture \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Describe what you see.", "include_image": true}'

Parameters:

prompt — what to ask the AI about the screenshot
region — {x, y, width, height} in pixels (omit to use the app's frame)
hwnd — window handle to bring to foreground before capture
keys — key combos to send before capture (e.g., ["ctrl+3"] to switch tabs)
include_image — return base64 PNG in response
profile — extraction profile name (default: depth)

`POST /api/locate`

Full-screen capture to find a window or UI element.

curl -X POST http://localhost:9050/api/locate \
  -H "Content-Type: application/json" \
  -d '{"query": "the Chrome browser window"}'

`GET /api/status`

Health check and configuration.

curl http://localhost:9050/api/status

MCP Server

The MCP server in mcp/ wraps the HTTP API for Claude Code integration.

Register globally

claude mcp add --scope user condor-eye -- node /path/to/mcp/index.js

Available tools

Tool	Description
`condor_eye_capture`	Capture + AI description of screen content
`condor_eye_locate`	Find a window or UI element on screen
`condor_eye_windows`	List visible windows (free, no API call)
`condor_eye_status`	Health check

`/screen` command

The repo includes a Claude Code custom command at .claude/commands/screen.md. After cloning, just type /screen in any Claude Code session:

Command	Action
`/screen`	Capture the overlay frame region
`/screen firefox`	Find Firefox, bring to front, capture
`/screen chrome tab 3`	Focus Chrome, switch to tab 3, capture
`/screen full`	Capture the entire screen

Extraction Profiles

JSON files in profiles/ define extraction behavior:

Profile	Use case
`depth.json`	L2 depth / DOM ladder
`candle.json`	Candlestick chart OHLC
`quote.json`	Quote screen bid/ask/last
`heatmap.json`	Heatmap intensity
`custom.json`	User-editable template

Profiles specify the extraction prompt, optional truth source for comparison, and comparison configuration.

Cost

Condor Eye uses claude-haiku-4-5-20251001 by default — the fastest and cheapest Claude model. A typical capture costs ~$0.003-0.005 in API usage. You can switch to a more capable model via the CLAUDE_MODEL environment variable.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 96 Commits
.claude/commands		.claude/commands
audio-mini-ui		audio-mini-ui
docs		docs
mcp		mcp
profiles		profiles
src-tauri		src-tauri
src		src
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
build.bat		build.bat

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Condor Eye

Features

Architecture

Getting Started

Prerequisites

Build & Run

Environment Variables

WSL Users

HTTP API

`POST /api/capture`

`POST /api/locate`

`GET /api/status`

MCP Server

Register globally

Available tools

`/screen` command

Extraction Profiles

Cost

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Condor Eye

Features

Architecture

Getting Started

Prerequisites

Build & Run

Environment Variables

WSL Users

HTTP API

POST /api/capture

POST /api/locate

GET /api/status

MCP Server

Register globally

Available tools

/screen command

Extraction Profiles

Cost

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`POST /api/capture`

`POST /api/locate`

`GET /api/status`

`/screen` command

Packages