
venkateshamatam/airbag


Airbag

Runtime safety layer for browser-using AI agents. The agent declares an action, Airbag checks it against a deterministic rule set built from real incidents, and a desktop modal asks the human before anything dangerous executes.

Why

Browser AI agents do real things on real websites now. They buy stuff, click checkout, run shell commands, touch inboxes. They also do really dumb stuff. OpenAI's Operator bought eggs nobody asked it to buy. The Replit agent wiped a production database. The HashJack attack hides instructions inside URL fragments and slips them into agent context. Airbag goes between the agent and the click.

What it does

  • The Python SDK wraps browser-use's Tools.act() so every action declared by an agent gets checked first.
  • A local Bun daemon runs the action through 36 deterministic rules covering payment, account and auth, destructive infrastructure, email, social engineering, and DOM-level prompt injection.
  • A matched rule pauses the action, sends a WebSocket event to the desktop app, and waits for approval. If no decision arrives within 60 seconds, the action is rejected.
  • Every check writes to Postgres so the action history is auditable.
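
The pause-and-timeout flow described above can be sketched in a few lines. This is a minimal illustration, not the daemon's code (which is Bun + TypeScript): the `Action` and `Rule` types, the single `checkout-click` rule, and the `check` function are all hypothetical names, and the rule is not one of Airbag's 36.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    action_kind: str
    url: str


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Action], bool]


# Hypothetical rule for illustration only; not taken from Airbag's rule set.
RULES = [
    Rule("checkout-click", lambda a: a.action_kind == "click" and "/checkout" in a.url),
]


def first_match(action: Action) -> Optional[Rule]:
    """Return the first rule that matches the action, or None."""
    for rule in RULES:
        if rule.matches(action):
            return rule
    return None


async def check(action: Action, approval: "asyncio.Future[bool]", timeout_s: float = 60.0) -> str:
    """Allow unmatched actions; otherwise wait for a human decision, rejecting on timeout."""
    if first_match(action) is None:
        return "allow"
    try:
        approved = await asyncio.wait_for(approval, timeout=timeout_s)
    except asyncio.TimeoutError:
        return "reject"  # no decision in time is treated as a rejection
    return "allow" if approved else "reject"
```

Defaulting to reject on timeout means an unattended desktop never lets a risky action through by accident.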

Project layout

src/daemon/        Bun + TypeScript daemon, the hot path
db/                Drizzle ORM schema + Postgres client
drizzle/           Generated SQL migrations
apps/desktop/      Electron app, the always-on-top permission modal
sdk/python/        airbag-sdk-py, the browser-use bridge
sidecar/           Node 22 Tier 2 inference worker, spawned by the daemon
docker-compose.yml Postgres 17, ClickHouse 25.4, Redis 7 for local dev

Quickstart

You need Bun 1.2+, Docker Desktop, Python 3.12+, and uv.

bun install
cp .env.example .env
bun run stack:up        # postgres, clickhouse, redis as docker containers
bun run db:migrate      # apply schema to local postgres
bun run dev             # daemon starts at http://127.0.0.1:9876

Postgres binds to 5433 and Redis to 6380 so the docker stack doesn't fight any Homebrew-installed instances. ClickHouse uses 8123.

Smoke test the daemon

# health
curl http://127.0.0.1:9876/health

# allowed action
curl -X POST http://127.0.0.1:9876/v1/check \
  -H 'content-type: application/json' \
  -d '{
    "userId": "00000000-0000-0000-0000-000000000001",
    "sessionId": "00000000-0000-0000-0000-000000000002",
    "agentKind": "browser-use",
    "actionKind": "navigate",
    "url": "https://example.com/articles/today",
    "payload": {}
  }'

# paused action (Instacart checkout). Waits up to 60s for POST /v1/resolve.
curl -X POST http://127.0.0.1:9876/v1/check \
  -H 'content-type: application/json' \
  -d '{
    "userId": "00000000-0000-0000-0000-000000000001",
    "sessionId": "00000000-0000-0000-0000-000000000002",
    "agentKind": "browser-use",
    "actionKind": "click",
    "url": "https://www.instacart.com/store/checkout",
    "payload": {"button": "place-order"}
  }'

# session ledger
curl http://127.0.0.1:9876/v1/sessions/00000000-0000-0000-0000-000000000002/actions
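
The same check request can be built from Python with only the standard library. This is a sketch that mirrors the curl calls above: the field names and endpoint come from those examples, while `build_check_request` and `send` are hypothetical helpers. The response body is printed raw because its shape is not documented here.

```python
import json
import urllib.request

DAEMON_URL = "http://127.0.0.1:9876"


def build_check_request(user_id, session_id, action_kind, url, payload=None,
                        agent_kind="browser-use"):
    """Build (but do not send) a POST /v1/check request with the fields used in the curl examples."""
    body = {
        "userId": user_id,
        "sessionId": session_id,
        "agentKind": agent_kind,
        "actionKind": action_kind,
        "url": url,
        "payload": payload if payload is not None else {},
    }
    return urllib.request.Request(
        f"{DAEMON_URL}/v1/check",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def send(request, timeout_s=65.0):
    """Send the request and return the raw response text.

    The timeout is set slightly above 60s because a paused action can wait
    that long for a human decision.
    """
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")
```

With the daemon running, `print(send(build_check_request(...)))` should behave like the curl calls above.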

Use it from Python

cd sdk/python
uv sync --extra browser-use

import asyncio

from browser_use import Agent
from airbag import AirbagClient, AirbagTools

async def main() -> None:
    agent = Agent(
        task="buy a vacuum under $200",
        llm=...,                              # bring your own LLM
        tools=AirbagTools(AirbagClient()),
    )
    await agent.run()

asyncio.run(main())

When the agent reaches for a risky action, the SDK blocks on the daemon. The daemon broadcasts a pending event over WebSocket and the desktop app shows the modal.
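
The "check first, then act" behaviour on the SDK side can be pictured as a wrapper around the action call. This is a conceptual sketch only: `guarded`, `ActionRejected`, and the `"allow"` verdict string are assumptions for illustration, not the actual `AirbagTools` implementation or its API.

```python
import asyncio
from typing import Any, Awaitable, Callable


class ActionRejected(Exception):
    """Raised when the check does not allow the action (hypothetical name)."""


def guarded(
    act: Callable[[dict], Awaitable[Any]],
    check: Callable[[dict], Awaitable[str]],
) -> Callable[[dict], Awaitable[Any]]:
    """Wrap `act` so every action is checked before it executes."""

    async def wrapper(action: dict) -> Any:
        # Blocks here until the check resolves (approval, rejection, or timeout).
        verdict = await check(action)
        if verdict != "allow":
            raise ActionRejected(f"action blocked with verdict {verdict!r}")
        return await act(action)

    return wrapper
```

Because the wrapper awaits the check before calling `act`, a rejected action never reaches the browser.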

Enable Tier 2 advisory classifier (optional)

Tier 2 runs Llama Prompt Guard 2 22M as an advisory layer that scores actions for prompt-injection / jailbreak confidence. It never overrides the deterministic rule set; the score is recorded in the action ledger for review (modal rendering of the score is a follow-up).

It runs in a Node 22 sidecar process the daemon spawns when the pinned model is on disk (or AIRBAG_TIER2_ENABLED=1 is set explicitly). The sidecar exists because onnxruntime-node (which @huggingface/transformers depends on for native CPU inference) officially targets Node and Electron, not Bun.

bun install                # installs root + sidecar workspace
bun run model:download     # fetches the pinned ONNX revision into models/
bun run dev                # daemon auto-spawns the sidecar

scripts/download-model.ts pulls a pinned HuggingFace revision and writes a SHA-256 lockfile (models/prompt-guard-2-22m.lock.json). The sidecar refuses to start unless every required file matches a hash in the lockfile, and it refuses to fetch from HuggingFace at runtime unless AIRBAG_TIER2_ALLOW_REMOTE=1 is set. The Llama 4 Community License covering Prompt Guard 2 is reproduced in NOTICE and models/prompt-guard-2-22m/LICENSE.
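
The hash gate can be illustrated with a short standard-library sketch. The lockfile layout assumed here, a JSON object with a `files` map from relative path to hex SHA-256 digest, is a guess for illustration; the real `prompt-guard-2-22m.lock.json` format may differ, and `verify_against_lockfile` is a hypothetical helper rather than the sidecar's code.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large model files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_lockfile(model_dir: Path, lockfile: Path) -> list[str]:
    """Return the files that are missing or whose hash differs; an empty list means all match."""
    # Assumed layout: {"files": {"<relative path>": "<hex sha256>", ...}}
    expected = json.loads(lockfile.read_text())["files"]
    failures = []
    for rel_path, want in expected.items():
        candidate = model_dir / rel_path
        if not candidate.is_file() or sha256_of(candidate) != want:
            failures.append(rel_path)
    return failures
```

A caller would refuse to start when the returned list is non-empty, which matches the fail-closed behaviour described above.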

Tier 2 is dev-only today. The packaged Electron build (bun run build and apps/desktop installers) does not yet bundle the Node 22 runtime, sidecar source, or model files. The sidecar runs against the source tree via bun run dev. Production packaging is a separate follow-up.

Run the desktop app

cd apps/desktop
bun install
bun run dev             # electron-vite dev server
bun run build           # compiled assets in apps/desktop/out/
bun run package         # mac/win/linux installers in apps/desktop/release/

Stack

Layer            Tech
Daemon           Bun + TypeScript + Hono + Drizzle + Zod
Primary DB       Postgres 17 (Neon in production)
Analytics        ClickHouse 25.4 (ClickHouse Cloud in production)
Cache            Redis 7 (Upstash in production)
Desktop          Electron 33 + electron-vite + electron-builder
Object storage   Cloudflare R2
Edge             Cloudflare Workers + Durable Objects + Hyperdrive
Risk classifier  Tier 1 deterministic rules + Llama Prompt Guard 2 22M (advisory, opt-in) in a Node sidecar via @huggingface/transformers

Common commands

bun run dev             # daemon with hot reload
bun run test            # bun test
bun run typecheck       # tsc --noEmit
bun run lint            # biome check
bun run build           # compile single binary to dist/airbagd
bun run db:generate     # generate a new migration after editing db/schema.ts
bun run db:migrate      # apply pending migrations
bun run db:studio       # open Drizzle Studio
bun run stack:up        # docker compose up -d
bun run stack:down      # docker compose down
bun run stack:reset     # nuke local volumes and restart fresh
bun run stack:logs      # tail container logs

For the Python SDK:

cd sdk/python
uv sync --extra dev
uv run pytest
uv run ruff check .
uv run mypy airbag

License

The Airbag SDK and tooling are MIT-licensed (sdk/python/pyproject.toml); the broader project license is not finalized yet. Bundled third-party model weights are governed by their upstream licenses, listed in NOTICE.
