Modular npm packages for mobile device video streaming and control — iOS simulators, iOS physical devices, and Android devices.
| Package | Description | Platform |
|---|---|---|
| @device-stream/core | Shared types, interfaces, and utilities | All |
| @device-stream/ios-simulator | iOS Simulator streaming via ScreenCaptureKit + polling | macOS |
| @device-stream/ios-device | iOS physical device streaming via WDA MJPEG | macOS |
| @device-stream/android | Android streaming via TangoADB + scrcpy | All |
| @device-stream/android-server | On-device HTTP server for fast Android control | All |
| @device-stream/agent | LLM agent intelligence layer — navigation graphs, skeleton fingerprinting, compact state | All |
| @device-stream/stream-only | Stream device screens to browsers via WebSocket — no control APIs | All |
| Dependency | Version | Install |
|---|---|---|
| Node.js | >= 18.0.0 | nodejs.org or nvm install 18 |
| npm | >= 9 | Included with Node.js |
| TypeScript | >= 5.3 | Installed as devDependency |
| Dependency | Version | Install | Purpose |
|---|---|---|---|
| ADB | Latest | Android SDK Platform-Tools | Device communication |
| Android SDK | API 28+ | Android Studio or sdkmanager |
ANDROID_HOME must be set |
| Java / JDK | 17+ | brew install openjdk@17 or adoptium.net |
Gradle build for android-server |
| scrcpy server | 3.3.1 | Auto-fetched via fetch-scrcpy-server postinstall |
H.264 streaming |
| Dependency | Version | Install | Purpose |
|---|---|---|---|
| Xcode | 15+ | Mac App Store | xcrun simctl, xcodebuild |
| Xcode CLI Tools | Latest | xcode-select --install |
Command-line build tools |
| Appium | Latest | npm install -g appium |
WebDriverAgent host |
| Appium XCUITest | Latest | appium driver install xcuitest |
WDA for simulator control |
| Swift (optional) | 5.9+ | Included with Xcode | Build sim-capture tool |
| Dependency | Version | Install | Purpose |
|---|---|---|---|
| go-ios | Latest | brew install go-ios |
USB communication + port forwarding |
| WebDriverAgent | Latest | Signed & installed on device | Automation + MJPEG stream |
| qvh (optional) | Latest | Build from source | H.264 QuickTime fallback |
cd device-stream
# Install dependencies
npm install
# Build all TypeScript packages
npm run build
# Build sim-capture for iOS Simulator (macOS only, optional)
npm run build:sim-captureThe simplest way to get started — stream device screens with zero control APIs:
import { createStreamServer } from '@device-stream/stream-only';
const server = await createStreamServer({ port: 3456 });
// List available devices (Android + iOS Simulator)
const devices = await server.listDevices();
// Start streaming a device
await server.startStream(devices[0].serial);
// Browsers connect via WebSocket at /stream/{serial}
// REST: GET /devices returns available devices as JSONSupports Android (H.264 via scrcpy) and iOS Simulator (MJPEG via ScreenCaptureKit). See docs/stream-only.md for full API reference.
import { AgentSession } from '@device-stream/agent';
import {
compressTree,
serializeToCompactText,
computeFingerprint,
NavigationGraph,
findElement,
computeCenter,
} from '@device-stream/agent';
// Compress a UI hierarchy tree for LLM consumption
const compressed = compressTree(rawHierarchy);
const compactText = serializeToCompactText(compressed);
// Fingerprint screens for navigation tracking
const fingerprint = computeFingerprint(compressed);
// Build navigation graphs across screens
const graph = new NavigationGraph();
graph.addEdge({ from: 'home', to: 'settings', action: { type: 'tap', target: 'Settings' } });
// Find elements and compute tap coordinates
const element = findElement(compressed, { text: 'Submit' });
const center = computeCenter(element);import { ScrcpyService } from '@device-stream/android';
// 1. Start the on-device HTTP server (run once per device)
// cd packages/android-server && npm run build && npm run start
// 2. Use the device-server HTTP API directly
const res = await fetch('http://localhost:9008/screenshot?quality=80&scale=1');
const jpeg = await res.arrayBuffer();
// Tap
await fetch('http://localhost:9008/tap', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ x: 540, y: 960 }),
});
// Type text
await fetch('http://localhost:9008/type', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text: 'hello world' }),
});
// Get combined device state (screenshot + UI hierarchy + info)
const state = await fetch('http://localhost:9008/state', { method: 'POST' });
const { screenshot, hierarchy, info } = await state.json();import {
createIOSSimulatorManager,
CaptureService,
createCaptureService,
} from '@device-stream/ios-simulator';
// Create and boot a simulator
const manager = createIOSSimulatorManager();
const device = await manager.createDevice({
name: 'Test iPhone',
deviceType: 'iPhone 15',
runtime: 'iOS-17-2',
});
await manager.bootDevice(device.udid);
// Stream via ScreenCaptureKit (high-performance binary protocol)
const capture = createCaptureService();
capture.on('frame', (udid, jpegBuffer) => {
// Send to WebSocket, save to disk, etc.
});
await capture.start(device.udid, { fps: 30, quality: 80 });
// Control via WDA (once WebDriverAgent is running on port 8100)
await fetch('http://localhost:8100/session', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ capabilities: {} }),
});import { DeviceService } from '@device-stream/ios-device';
// go-ios handles USB port forwarding
// WDA MJPEG stream on port 9100, control API on port 8100
const mjpegUrl = 'http://localhost:9100';
const wdaUrl = 'http://localhost:8100';A standalone test app is included to verify streaming and input control for all platforms:
cd device-stream
npx tsx test-app/server.ts
# Open http://localhost:3456The test app auto-detects running device servers and provides a browser UI with:
- Live screen streaming
- Tap (click on screen)
- Swipe (click + drag)
- Hardware keys (Home, Back, etc.)
- Text input
Build and start the on-device HTTP server:
cd packages/android-server
npm run build # Build APKs via Gradle (once)
npm run start # Install + port-forward + start
# Or specify a device: npm run start -- emulator-5554This installs an instrumentation test APK that runs a NanoHTTPD server directly on the device, forwarded to localhost:9008.
| Endpoint | Method | Latency | Description |
|---|---|---|---|
/ping |
GET | <10ms | Health check |
/screenshot |
GET | ~50-100ms | JPEG screenshot (?quality=80&scale=1) |
/tap |
POST | ~20-50ms | Tap at coordinates ({"x":540,"y":960}) |
/swipe |
POST | ~50-100ms | Swipe gesture ({"startX":..,"startY":..,"endX":..,"endY":..,"steps":10}) |
/type |
POST | ~30-100ms | Type text via clipboard paste ({"text":"hello"}) |
/key |
POST | ~20ms | Key press ({"key":"back"} or {"keyCode":4}) |
/longPress |
POST | ~50ms | Long press ({"x":540,"y":960,"durationMs":1000}) |
/hierarchy |
GET | ~100-200ms | UI element tree (?maxElements=50) |
/info |
GET | ~50ms | Device metadata (screen size, package, activity) |
/state |
POST | ~200ms | Combined screenshot + hierarchy + info |
/waitForIdle |
POST | ~0-2000ms | Wait for UI idle ({"timeoutMs":2000}) |
Why this is fast: The server uses UiAutomation and UiDevice APIs directly on the device, bypassing ADB shell overhead. Typical latency is 20-200ms vs 2-4s for adb shell commands.
# 1. Boot a simulator
xcrun simctl boot <UDID>
# 2. Build WDA for the simulator (once, ~30s)
WDA_DIR="$HOME/.appium/node_modules/appium-xcuitest-driver/node_modules/appium-webdriveragent"
xcodebuild build-for-testing \
-project "$WDA_DIR/WebDriverAgent.xcodeproj" \
-scheme WebDriverAgentRunner \
-destination "platform=iOS Simulator,id=<UDID>" \
-derivedDataPath /tmp/wda-build \
COMPILER_INDEX_STORE_ENABLE=NO
# 3. Install WDA on the simulator
xcrun simctl install <UDID> \
"$WDA_DIR/build/Debug-iphonesimulator/WebDriverAgentRunner-Runner.app"
# 4. Start WDA (runs on port 8100)
xcodebuild test-without-building \
-project "$WDA_DIR/WebDriverAgent.xcodeproj" \
-scheme WebDriverAgentRunner \
-destination "platform=iOS Simulator,id=<UDID>" \
-derivedDataPath /tmp/wda-build \
USE_PORT=8100WDA HTTP API at http://localhost:8100:
| Endpoint | Method | Latency | Description |
|---|---|---|---|
/status |
GET | <10ms | Health check |
/screenshot |
GET | ~200-300ms | Base64 PNG screenshot |
/session |
POST | ~50ms | Create automation session |
/session/:id/actions |
POST | ~300-500ms | Tap, swipe, long press (W3C Actions) |
/session/:id/element/:id/value |
POST | ~200ms | Type text into element |
/session/:id/source |
GET | ~2-3s | Full UI hierarchy (XML) |
/wda/homescreen |
POST | ~100ms | Press Home button |
Coordinate system: WDA uses points (not pixels). For a 3x retina display, the screenshot is 1179x2556px but coordinates are 393x852 points.
device-stream/
├── packages/
│ ├── core/ # @device-stream/core
│ │ └── src/
│ │ ├── types.ts # Shared types (Device, FarmDevice, DeviceStatus)
│ │ ├── interfaces.ts # DeviceService interface
│ │ ├── protocol.ts # WebSocket protocol
│ │ └── mutex.ts # Thread-safety utilities
│ │
│ ├── ios-simulator/ # @device-stream/ios-simulator
│ │ └── src/
│ │ ├── simulator-manager.ts # Simulator lifecycle (create, boot, install, etc.)
│ │ ├── stream-service.ts # WebSocket relay
│ │ └── capture-service.ts # ScreenCaptureKit binary protocol
│ │
│ ├── ios-device/ # @device-stream/ios-device
│ │ └── src/
│ │ ├── device-service.ts # Main service
│ │ ├── go-ios-client.ts # go-ios CLI wrapper
│ │ ├── wda-client.ts # WebDriverAgent HTTP client
│ │ ├── wda-hierarchy-parser.ts # WDA UI hierarchy parser
│ │ ├── mjpeg-client.ts # MJPEG stream parser
│ │ └── quicktime-capture.ts # H.264 fallback
│ │
│ ├── android/ # @device-stream/android
│ │ └── src/
│ │ ├── device-service.ts # ADB device management
│ │ ├── scrcpy-service.ts # Scrcpy H.264 streaming
│ │ ├── scrcpy-setup.ts # Server deployment
│ │ └── hierarchy-parser.ts # Android UI hierarchy parser
│ │
│ ├── android-server/ # @device-stream/android-server (private)
│ │ ├── package.json
│ │ ├── scripts/
│ │ │ ├── build.sh # Build APKs via Gradle
│ │ │ ├── setup.sh # Check/build APKs
│ │ │ └── start.sh # Install + port-forward + start
│ │ └── gradle-project/
│ │ ├── app/ # Stub host APK
│ │ └── server/ # Instrumentation test (NanoHTTPD + UiAutomator)
│ │ └── src/androidTest/java/com/fromapptoviral/deviceserver/
│ │ ├── DeviceHttpServer.kt # HTTP router
│ │ ├── DeviceServerInstrumentation.kt # Entry point
│ │ └── handlers/ # Tap, Swipe, Type, Screenshot, etc.
│ │
│ ├── agent/ # @device-stream/agent
│ │ └── src/
│ │ ├── agent-session.ts # High-level agent session management
│ │ ├── types.ts # Agent-specific types
│ │ ├── navigation-graph.ts # Screen-to-screen navigation tracking
│ │ ├── skeleton-fingerprint.ts # Zone classification + screen fingerprinting
│ │ ├── compact-serializer.ts # LLM-optimized state serialization
│ │ ├── element-actions.ts # Element finding + coordinate computation
│ │ └── tree-compressor.ts # UI hierarchy compression
│ │
│ └── stream-only/ # @device-stream/stream-only
│ └── src/
│ ├── stream-server.ts # HTTP + WebSocket server
│ ├── types.ts # Stream-specific types
│ ├── android-streamer.ts # Android H.264 streaming via scrcpy
│ └── simulator-streamer.ts # iOS Simulator MJPEG streaming
│
├── native-servers/
│ ├── android-device-server/ # Kotlin Android instrumentation test (Gradle)
│ └── ios-xctest-server/ # Swift XCTest TCP JSON-RPC server (iOS equivalent of android-device-server)
│
├── tools/
│ └── sim-capture/ # ScreenCaptureKit Swift binary (MJPEG encoder)
│
├── test-app/ # Standalone test viewer
│ ├── server.ts # HTTP + WS server
│ ├── index.html # Browser UI with streaming + controls
│ ├── poc-stream-only.ts # Stream-only mode proof of concept
│ └── poc-viewer.html # POC viewer
│
└── docs/ # Documentation
└── stream-only.md # Stream-only package reference
| Operation | ADB shell | device-server (9008) | simctl | WDA (8100) |
|---|---|---|---|---|
| Screenshot | ~2-4s | ~50-100ms | ~1-2s | ~200-300ms |
| Tap | ~1-2s | ~20-50ms | N/A | ~300-500ms |
| Swipe | ~1-2s | ~50-100ms | N/A | ~300-500ms |
| Type text | ~2-3s | ~30-100ms | N/A | ~200ms |
| UI Hierarchy | ~3-5s | ~100-200ms | N/A | ~2-3s |
| Combined state | ~8-10s | ~200ms | N/A | N/A |
npm run build # Build all packages (core first, then the rest)
npm run build:core # Build core only
npm run build:agent # Build agent package only
npm run build:android # Build android package only
npm run build:ios-simulator # Build ios-simulator package only
npm run build:ios-device # Build ios-device package only
npm run build:android-server # Build on-device server APKs
npm run build:stream-only # Build stream-only package only
npm run build:sim-capture # Build sim-capture Swift binary (macOS)
npm run clean # Clean all dist/ folders
npm run lint # TypeScript type-check all packages
npm run test # Run tests across all packagesMIT