Lightweight AI telemetry, analytics, and cost monitoring for production systems.
- 🤖 AI: Integrates with model usage to capture requests, tokens, and costs.
- 📊 Analytics: Workspace-level dashboards, recent activity, and endpoints.
- 💰 Cost Monitoring: Per-model and per-request cost tracking and summaries.
- ⚡ Realtime Telemetry: SSE streaming for live dashboard updates.
- 🔑 API Keys: Workspace-scoped API keys with rotation guidance.
- 🛰️ Monitoring: Health checks, backups, and retention tooling.
- 🚀 Deployment: Deployment guides and Procfile for hosted deployments.
- 📦 SDK: Lightweight TypeScript SDK for easy instrumentation.
- 🏢 Workspaces: Isolated analytics per workspace for multi-tenant use.
- TypeScript (backend, SDK, frontend)
- Node.js / npm
- Express (backend)
- React + Vite (frontend)
- PostgreSQL (Neon-compatible), SQLite (local development)
- Server-Sent Events (SSE) for realtime streaming
- Vitest for tests
- Heroku and Vercel deployment workflows
- What is TokenWatch?
- How TokenWatch Works
- Feature highlights
- Quick Start
- Installation
- API Key Lifecycle
- Troubleshooting
- Deployment
- Operations & maintenance (short)
- Important directories
- Next reading
- Architecture
- Getting help / community
- Contributing
- Contributors
TokenWatch is a lightweight AI telemetry and cost monitoring platform. It helps teams track token usage, model costs, latency, failures, and endpoint activity through a small SDK and a dashboard.
Many tools require proxying AI traffic through a third-party service. TokenWatch takes a different approach: instrument your application directly, keep provider integrations unchanged, retain control of request flow, and monitor usage through telemetry.
- TokenWatch instruments your app directly instead of forcing traffic through a proxy.
- Your provider SDKs stay unchanged, so you keep the integration patterns you already use.
- You keep control of request flow while still collecting telemetry for analytics and cost monitoring.
- Backend: ingest API, analytics, authentication, and storage.
- Dashboard: workspaces, analytics, and realtime monitoring.
- SDK: telemetry collection, batching, and delivery.
This repository contains three main parts:
- `backend/`: TypeScript Express API, authentication, ingest pipeline, analytics, and SSE streaming backed by PostgreSQL (Neon-compatible).
- `frontend/`: React + Vite dashboard for workspace-level analytics, request logs, and realtime updates.
- `sdk/`: Small, dependency-free TypeScript SDK that batches and delivers telemetry to the ingest API.
A compact architecture diagram showing the main data flows. See the full Architecture Guide for details.
flowchart LR
subgraph Application
SDK["TokenWatch SDK"]
end
Backend["Backend API"]
DB[(PostgreSQL)]
Frontend["Dashboard"]
SDK -->|POST /api/ingest| Backend
Backend --> DB
Backend -->|SSE /api/telemetry/stream| Frontend
Frontend -->|API calls| Backend
The README below is a concise developer guide: quick start, core concepts, and where to look for implementation details.
- Signing up creates a default workspace automatically.
- Telemetry is isolated per workspace.
- Switching workspaces changes the analytics context in the dashboard.
- Deleting or changing workspaces affects what you can see in analytics and recent activity.
- API keys are workspace-scoped.
- API keys authenticate telemetry ingestion.
- Rotating a key invalidates the previous key.
- SDK deployments must be updated after rotation.
Warning: If you rotate an API key, update every deployed SDK instance that uses it before the old key is removed from service.
npm install @zn_/tokenwatch
- Workspace ID
- Dashboard → Sidebar → Copy Workspace ID
- API Key
- Dashboard → Settings → API Keys
- apiUrl
- Local: http://localhost:3001
- Hosted: your deployed backend URL
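One way to keep these three values out of source code is to read them from the environment and fail fast when one is missing. The sketch below is illustrative only: the variable names `TOKENWATCH_API_URL`, `TOKENWATCH_WORKSPACE_ID`, and `TOKENWATCH_API_KEY` and the helper `loadTokenWatchConfig` are hypothetical and not defined by TokenWatch. The fallback URL is the local value listed above.

```typescript
// Shape of the options passed to TokenWatch.init() in this README.
interface TokenWatchConfig {
  apiUrl: string;
  workspaceId: string;
  apiKey: string;
}

// A plain key/value map, for example process.env in Node.js.
type Env = Record<string, string | undefined>;

// Hypothetical helper: the environment variable names are illustrative only.
function loadTokenWatchConfig(env: Env): TokenWatchConfig {
  // Fall back to the local backend URL when no URL is configured.
  const apiUrl = env["TOKENWATCH_API_URL"] ?? "http://localhost:3001";
  const workspaceId = env["TOKENWATCH_WORKSPACE_ID"];
  const apiKey = env["TOKENWATCH_API_KEY"];

  if (!workspaceId || !apiKey) {
    // Report every missing setting at once instead of failing one at a time.
    const missing = [
      workspaceId ? "" : "TOKENWATCH_WORKSPACE_ID",
      apiKey ? "" : "TOKENWATCH_API_KEY",
    ].filter((name) => name !== "");
    throw new Error(`Missing TokenWatch settings: ${missing.join(", ")}`);
  }

  return { apiUrl, workspaceId, apiKey };
}
```

In a Node.js service the result could be passed straight to `TokenWatch.init(loadTokenWatchConfig(process.env))`, so a key rotation becomes a configuration change rather than a code change.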
- Install the package.
npm install @zn_/tokenwatch
- Initialize the SDK.
import { TokenWatch } from "@zn_/tokenwatch";
TokenWatch.init({
  apiUrl: "http://localhost:3001",
  workspaceId: "ws_xxxxxxxx",
  apiKey: "tw_live_xxxxxxxx"
});
- Send telemetry.
await TokenWatch.track(
  "llm.request.completed",
  {
    route: "/api/chat",
    provider: "openai",
    model: "gpt-4o",
    input_tokens: 120,
    output_tokens: 80,
    cost_usd: 0.0042,
    latency_ms: 640
  }
);
- Flush before exit.
await TokenWatch.flush();
- Verify in dashboard.
Look in Overview, Recent Activity, Endpoints, and Models after the first event lands.
- ✓ Overview page updates
- ✓ Recent Activity shows a new row
- ✓ Endpoint appears in analytics
- ✓ Stream status shows connected
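In a real application the latency and cost fields are usually measured rather than hard-coded. The following is a minimal sketch of a wrapper that times a provider call and reports one `llm.request.completed` event with the same field names used above. The names `trackedCall`, `estimateCostUsd`, `Usage`, and `Pricing` are hypothetical, and the prices are placeholders, not real provider rates. The `track` function is injected and is assumed to match the `TokenWatch.track(eventName, payload)` call shape shown in the quick start.

```typescript
// Assumed to match the call shape of TokenWatch.track(eventName, payload).
type TrackFn = (event: string, payload: Record<string, unknown>) => Promise<void>;

// Token counts reported by the provider for one completed call.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical prices in USD per one million tokens.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const inputCost = usage.inputTokens * pricing.inputPerMillion;
  const outputCost = usage.outputTokens * pricing.outputPerMillion;
  return (inputCost + outputCost) / 1e6;
}

// Runs `call`, measures wall-clock latency, and reports one telemetry event.
async function trackedCall<T>(
  track: TrackFn,
  meta: { route: string; provider: string; model: string; pricing: Pricing },
  call: () => Promise<{ result: T; usage: Usage }>
): Promise<T> {
  const startedAt = Date.now();
  const { result, usage } = await call();
  await track("llm.request.completed", {
    route: meta.route,
    provider: meta.provider,
    model: meta.model,
    input_tokens: usage.inputTokens,
    output_tokens: usage.outputTokens,
    cost_usd: estimateCostUsd(usage, meta.pricing),
    latency_ms: Date.now() - startedAt,
  });
  return result;
}
```

The provider call itself stays untouched inside `call`, which fits the direct-instrumentation approach described earlier. This sketch only reports successful calls; failures would need their own event.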
Warning: Telemetry is batched. Short-lived scripts and serverless functions may exit before queued events are delivered.
Always call `await TokenWatch.flush();` before shutdown.
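For short-lived handlers, one way to make the flush hard to forget is to wrap the handler so the flush runs in a `finally` block. This is a sketch under assumptions: `withFlush` is a hypothetical helper, and `flush` is injected (in an app using the SDK it would be `() => TokenWatch.flush()`).

```typescript
// Wraps a handler so queued telemetry is flushed on success and on failure.
function withFlush<Args extends unknown[], R>(
  handler: (...args: Args) => Promise<R>,
  flush: () => Promise<void>
): (...args: Args) => Promise<R> {
  return async (...args: Args): Promise<R> => {
    try {
      return await handler(...args);
    } finally {
      // Design choice in this sketch: a failed flush is ignored so it
      // cannot mask the handler's own result or error.
      await flush().catch(() => undefined);
    }
  };
}
```

Swallowing flush errors keeps the handler's outcome intact but hides delivery failures, so logging inside the `catch` may be preferable in production.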
- No data appearing?
  - Is the backend running?
  - Is `apiUrl` correct?
  - Is `workspaceId` correct?
  - Is the API key valid?
  - Did you call `flush()`?
  - Are filters cleared?
- If events appear in the API but not the dashboard, verify the workspace selection in the sidebar and refresh the page.
- If the realtime stream disconnects, SSE reconnects automatically; localhost restarts can temporarily disconnect the stream.
See Deployment Guide for local development commands, hosted deployment checklists, backups, and retention guidance.
The frontend should point at the backend API URL. The backend stores telemetry in PostgreSQL and exposes the ingest API for dashboard and SDK traffic.
- Local frontend example: use `http://localhost:3001` for the backend.
- Production frontend example: use your deployed backend URL.
- SSE endpoint requirement: `/api/telemetry/stream` must remain reachable for realtime dashboard updates.
- Reverse proxy consideration: avoid buffering SSE responses and preserve long-lived connections.
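To make the reverse proxy point concrete, here is a sketch of the response headers and wire format an SSE endpoint typically uses. It is not taken from the TokenWatch backend: the helper names and the `row` event name in the usage note are hypothetical. The `X-Accel-Buffering: no` header is the per-response way to ask nginx not to buffer; other proxies have their own settings.

```typescript
// Response headers commonly sent for a Server-Sent Events stream.
function sseHeaders(): Record<string, string> {
  return {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
    // Asks nginx not to buffer this response.
    "X-Accel-Buffering": "no",
  };
}

// Serializes one SSE message: optional event name, one JSON data line,
// and the blank line that terminates the message.
function formatSseMessage(data: unknown, event?: string): string {
  const head = event ? `event: ${event}\n` : "";
  return `${head}data: ${JSON.stringify(data)}\n\n`;
}

// A comment line (leading colon) that clients ignore. Sending it
// periodically keeps idle proxies from closing the connection.
function sseHeartbeat(): string {
  return ": keep-alive\n\n";
}
```

For example, `formatSseMessage({ id: 1 }, "row")` produces the text `event: row`, then `data: {"id":1}`, then a blank line.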
- SDK queues events in memory, batches them, and sends them to `POST /api/ingest` with the `X-API-Key` header.
- Backend authenticates the key, normalizes telemetry, writes to the `requests` table in PostgreSQL, emits an event on `telemetryBus`, and invalidates analytics caches.
- Frontend subscribes to `/api/telemetry/stream` (SSE) for workspace-scoped live rows and refreshes analytics views.
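The SDK side of this flow can be pictured as a small in-memory queue that delivers events in batches. The sketch below is not the SDK's actual code: `BatchQueue`, `TelemetryEvent`, and `SendBatch` are hypothetical names, and the batch size is an arbitrary parameter. Delivery is injected, so the HTTP call to the ingest endpoint is left out.

```typescript
interface TelemetryEvent {
  name: string;
  payload: Record<string, unknown>;
}

// Delivers one batch. In the flow described above this would be an
// HTTP POST to /api/ingest carrying the API key.
type SendBatch = (batch: TelemetryEvent[]) => Promise<void>;

class BatchQueue {
  private queue: TelemetryEvent[] = [];
  private readonly send: SendBatch;
  private readonly maxBatchSize: number;

  constructor(send: SendBatch, maxBatchSize: number) {
    this.send = send;
    // Guard against sizes below 1, which would prevent the queue from draining.
    this.maxBatchSize = Math.max(1, Math.floor(maxBatchSize));
  }

  // Queues an event and delivers as soon as the size threshold is reached.
  async enqueue(event: TelemetryEvent): Promise<void> {
    this.queue.push(event);
    if (this.queue.length >= this.maxBatchSize) {
      await this.flush();
    }
  }

  // Drains everything currently queued, in batches of at most maxBatchSize.
  // A failed send drops that batch in this sketch; a real client would retry.
  async flush(): Promise<void> {
    while (this.queue.length > 0) {
      const batch = this.queue.splice(0, this.maxBatchSize);
      await this.send(batch);
    }
  }

  get pending(): number {
    return this.queue.length;
  }
}
```

Events that never reach the size threshold stay queued until `flush()` is called, which is why short-lived processes need an explicit flush before exit.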
- Workspace isolation (API keys).
- SDK batching, retries, and graceful shutdown (`flush()`).
- Realtime SSE updates and cache invalidation for low-latency dashboards.
- Opt-in retention and backup scripts for operational maintenance.
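This README does not describe the SDK's retry policy, so the following is only a generic sketch of retrying a delivery with exponential backoff. The function name, the options, and the doubling schedule are assumptions, not the SDK's behavior. The `sleep` function is injected so the delays can be observed or replaced.

```typescript
interface RetryOptions {
  maxAttempts: number; // assumed to be at least 1
  baseDelayMs: number; // delay before the second attempt
  sleep: (ms: number) => Promise<void>;
}

// Calls `attempt` until it succeeds or maxAttempts is reached,
// doubling the delay after each failure.
async function retryWithBackoff<T>(
  attempt: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < options.maxAttempts; i++) {
    try {
      return await attempt();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (i < options.maxAttempts - 1) {
        await options.sleep(options.baseDelayMs * 2 ** i);
      }
    }
  }
  throw lastError;
}
```

With `baseDelayMs: 100` and `maxAttempts: 3`, the waits between attempts are 100 ms and 200 ms. A production `sleep` could be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`.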
- `backend/src/routes`: API routes and ingestion endpoints.
- `backend/src/services`: analytics, realtime streaming, ingestion, and workspace logic.
- `backend/src/db`: PostgreSQL schema, connection, and migrations.
- `sdk/src`: SDK client, transport, batching, and runtime state.
- `frontend/src/pages`: dashboard pages and analytics views.
- `frontend/src/components`: reusable UI and realtime dashboard components.
- Health endpoint: `GET /api/health` returns database connection status and operational counters (active SSE connections, simulators).
- Backups: `node dist/scripts/backup.js` uses `pg_dump` and saves SQL dumps to `backend/data/backups`.
- Retention: `dist/scripts/retention.js` is dry-run by default. Set `TELEMETRY_RETENTION_APPLY=true` to delete.
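The dry-run default can be illustrated with a small planning function that decides what would be deleted and whether deletion is enabled. This is a sketch, not the logic of `retention.js`: the row shape, the `retentionDays` parameter, and the rule that only the exact string `true` enables deletion are assumptions. Only the `TELEMETRY_RETENTION_APPLY` variable name comes from this README.

```typescript
interface StoredRow {
  id: string;
  createdAt: Date;
}

interface RetentionPlan {
  cutoff: Date; // rows created before this instant are expired
  expiredIds: string[];
  apply: boolean; // false means dry run: report only, delete nothing
}

// Assumption: only the exact string "true" enables deletion.
function shouldApply(env: Record<string, string | undefined>): boolean {
  return env["TELEMETRY_RETENTION_APPLY"] === "true";
}

// Computes which rows fall outside the retention window without deleting anything.
function planRetention(
  rows: StoredRow[],
  retentionDays: number,
  now: Date,
  env: Record<string, string | undefined>
): RetentionPlan {
  const dayMs = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - retentionDays * dayMs);
  const expiredIds = rows
    .filter((row) => row.createdAt.getTime() < cutoff.getTime())
    .map((row) => row.id);
  return { cutoff, expiredIds, apply: shouldApply(env) };
}
```

Separating the plan from the deletion step means a dry run and a real run share the same selection logic, so the dry-run report shows exactly what an applied run would remove.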
- 🏗️ Architecture Guide — runtime flow, ingest pipeline, SSE, and scaling tradeoffs.
- 🚀 Deployment Guide — production setup, environment variables, backups, and retention.
- 🛠️ Operations Guide — monitoring, maintenance, health checks, and operational workflows.
- 📦 SDK Documentation — installation, examples, batching, retries, and production usage.
- Report bugs or request features: Issues
- Ask questions and discuss usage: Discussions
Note: When opening an issue, include reproduction steps and relevant logs to help triage quickly.
- Use `NODE_ENV=production` and a strong `JWT_SECRET` for non-local deployments.
- Keep workspace API keys secret and server-side; do not embed them in code shipped to the browser.
Big thanks to everyone helping make TokenWatch better 🚀
Every contribution, whether code, documentation, testing, or feedback, is greatly appreciated ❤️