Local HTTP proxy + dashboard for AI agent developers.
One command intercepts every LLM API call, saves a full snapshot locally, and opens a "Control Room" dashboard — no cloud, no config, no data leaves your machine.
If this is useful, a star on GitHub goes a long way — it helps other agent developers find it.
When you're building AI agents, you need to answer questions like:
- Which LLM call caused the bad output?
- What was the exact context when the agent went off-track?
- Can I replay this run with a different response at step 3?
Kontex intercepts every call at the proxy layer, so you get full observability with zero changes to your agent code — just point your base URL at localhost:8080.
```
Your agent → localhost:8080 → OpenAI / Anthropic / Ollama / any LLM API
                   │
                   ├── Saves raw prompt + response to .kontex.db (SQLite)
                   ├── Optionally trims context (lossless, toggleable)
                   └── Serves dashboard at GET /
```
| Feature | Description |
|---|---|
| Proxy | Intercepts every POST /* call and forwards to your upstream LLM |
| Snapshots | Saves the full untrimmed prompt and response to SQLite — nothing is lost |
| Context trimmer | Structurally lossless trimming applied before the upstream call — toggleable from the dashboard |
| Session grouping | Groups related agent runs into sessions via a request header |
| Multi-agent graph | Swim-lane view showing every agent's trajectory and cross-agent links |
| Live pause | Pause a request mid-flight, inspect it, then resume with edited messages |
| Fork & replay | Branch from any snapshot with a human-edited response; downstream calls replay deterministically |
| Branch chain | Create a new agent task from any snapshot, staying in the same session |
- Node.js 18+
- npm 9+
```bash
npm install -g kontex-proxy
kontex start
```

Or run from source:

```bash
git clone https://github.com/pankaj-agrawalla/kontex-cli.git
cd kontex-cli
npm install
cd web && npm install && cd ..
npm run build
```

Copy .env.example and edit as needed:
```bash
cp .env.example .env
```

```bash
# .env
KONTEX_PORT=8080                      # Port for the proxy + dashboard (default: 8080)
UPSTREAM_URL=https://api.openai.com   # LLM API to forward requests to
```

To use with Ollama locally:

```bash
UPSTREAM_URL=http://localhost:11434
```

To use with Anthropic:

```bash
UPSTREAM_URL=https://api.anthropic.com
```

Then start the proxy:

```bash
kontex start
```

The browser opens automatically at http://localhost:8080.
Or with a custom port:
```bash
kontex start --port 9000
```

Change your agent's base URL from the LLM provider to the Kontex proxy:

```
http://localhost:8080
```
No other code changes are required. All requests are transparently proxied.
Example — OpenAI SDK:
```ts
import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "http://localhost:8080/v1", // ← point at Kontex
})
```

Example — LangChain:
```ts
import { ChatOpenAI } from "@langchain/openai"

const llm = new ChatOpenAI({
  openAIApiKey: process.env.OPENAI_API_KEY,
  configuration: {
    baseURL: "http://localhost:8080/v1", // ← point at Kontex
  },
})
```

Example — raw fetch:
```ts
await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": `Bearer ${apiKey}` },
  body: JSON.stringify({ model: "gpt-4o", messages }),
})
```

These headers unlock richer dashboard views. They are stripped before forwarding upstream — your LLM never sees them.
| Header | Purpose |
|---|---|
| `X-Kontex-Task-Id` | Groups snapshots into a named agent task (swim lane in the graph). Defaults to "default" if omitted. |
| `X-Kontex-Session-Id` | Groups all tasks from one run into a single session entry in the sidebar. |
| `X-Kontex-Parent-Task-Id` | Records a cross-agent link (draws an amber dashed edge). Send only on the first turn of a child agent. |
| `X-Kontex-Fork-Id` | Enables deterministic replay. Set to the task ID you forked from. |
Without any headers, everything still works — all snapshots land under the "default" task and appear in the dashboard.
With headers (recommended for multi-agent workflows):
```ts
const headers = {
  "X-Kontex-Task-Id": "planner-agent",
  "X-Kontex-Session-Id": "run-2024-001",
  // first turn of a child agent only:
  "X-Kontex-Parent-Task-Id": "planner-agent",
}
```
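`X-Kontex-Fork-Id` is sent the same way when you replay a run against a fork. A minimal sketch, assuming the fork was created from the `planner-agent` task (the other IDs are placeholders):

```ts
// Hedged sketch: replay headers for a run forked from "planner-agent".
// Requests with matching prompt hashes replay the stored (possibly
// human-edited) responses instead of calling the LLM.
const replayHeaders = {
  "X-Kontex-Task-Id": "planner-agent-replay", // placeholder task name
  "X-Kontex-Session-Id": "run-2024-002",      // placeholder session
  "X-Kontex-Fork-Id": "planner-agent",        // task ID you forked from
}
```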
Open http://localhost:8080 in your browser.

- The sessions sidebar lists all sessions, ordered newest-first
- Each entry shows the session ID, timestamp, agent count, and snapshot count
- Click a session to load its graph
- Context trimmer toggle at the bottom — turn trimming on or off in real time
- The graph view shows one swim-lane column per agent task
- Nodes = individual LLM calls (snapshots)
- Gray edges = within the same agent
- Amber dashed animated edges = cross-agent links (parent → child)
- Amber-bordered nodes = human-edited snapshots
- Click any node to open the snapshot drawer
The snapshot drawer opens when you click a node. It shows:
- The full conversation messages sent to the LLM
- Live Pause — pauses the next request from this task mid-flight so you can inspect and edit messages before they reach the LLM
- Fork & Edit — save a human-edited version of the messages; the next replay of this prompt hash will return your edited version instead of calling the LLM
- Branch chain here — create a new agent task (in the same session) branching from this point, with an editable LLM response
Found this useful in your stack? Share it with your team or post it in your AI/agent dev community — this project grows entirely through word of mouth.
The trimmer applies three structurally lossless passes before forwarding to the upstream LLM:
- Tool result truncation — long tool/function responses are sliced to prevent runaway context growth
- Middle-turn compression — older assistant turns in the middle of a long conversation are shortened
- System prompt deduplication — repeated system content across turns is reduced
The raw untrimmed payload is always saved to the database — trimming only affects what is forwarded upstream.
Toggle it on/off live from the sidebar without restarting the server.
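For illustration only (this is not Kontex's actual code), a tool-result truncation pass could look roughly like the sketch below; the character limit and message shape are assumptions:

```ts
// Hedged sketch of a tool-result truncation pass. The real trimmer's
// limits, marker text, and message shape may differ.
const MAX_TOOL_RESULT_CHARS = 2000 // assumed limit, for illustration only

type ChatMessage = { role: string; content: string }

function truncateToolResults(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((m) =>
    m.role === "tool" && m.content.length > MAX_TOOL_RESULT_CHARS
      ? { ...m, content: m.content.slice(0, MAX_TOOL_RESULT_CHARS) + "\n…[truncated]" }
      : m
  )
}
```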
Example — multi-agent session:

```ts
const SESSION_ID = `run-${Date.now()}`

// Agent 1 — Planner
const plannerResponse = await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
    "X-Kontex-Task-Id": "planner",
    "X-Kontex-Session-Id": SESSION_ID,
  },
  body: JSON.stringify({ model: "gpt-4o", messages: plannerMessages }),
})

// Agent 2 — Coder (links back to planner)
const coderResponse = await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
    "X-Kontex-Task-Id": "coder",
    "X-Kontex-Session-Id": SESSION_ID,
    "X-Kontex-Parent-Task-Id": "planner", // ← first turn only
  },
  body: JSON.stringify({ model: "gpt-4o", messages: coderMessages }),
})
```

This produces a dashboard with two swim lanes and an amber edge from Planner → Coder, grouped under one session.
All data is stored in .kontex.db (SQLite) in the project root. The file is created automatically on first run.
To start completely fresh:
```bash
rm .kontex.db
kontex start
```

```sql
CREATE TABLE Snapshots (
  id                  TEXT PRIMARY KEY,    -- cuid
  task_id             TEXT NOT NULL,       -- from X-Kontex-Task-Id header
  parent_id           TEXT,                -- previous snapshot in the same task
  parent_task_id      TEXT,                -- from X-Kontex-Parent-Task-Id header
  session_id          TEXT,                -- from X-Kontex-Session-Id header
  prompt_hash         TEXT NOT NULL,       -- MD5 of messages array (for replay lookup)
  raw_prompt_payload  TEXT NOT NULL,       -- original untrimmed JSON body
  llm_response        TEXT,                -- raw response from upstream
  is_human_edited     INTEGER DEFAULT 0,   -- 1 if created via fork
  created_at          INTEGER NOT NULL     -- Unix ms
);
```
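Because the store is plain SQLite, you can inspect snapshots with any SQLite client. A minimal sketch using better-sqlite3 (a separate install, not bundled with Kontex):

```ts
// Hedged sketch: read the latest snapshots straight from .kontex.db.
import Database from "better-sqlite3"

const db = new Database(".kontex.db", { readonly: true })

const rows = db
  .prepare(
    "SELECT id, task_id, session_id, is_human_edited, created_at FROM Snapshots ORDER BY created_at DESC LIMIT 10"
  )
  .all() as Array<{
    id: string
    task_id: string
    session_id: string | null
    is_human_edited: number
    created_at: number
  }>

for (const row of rows) {
  // created_at is Unix ms per the schema above
  console.log(row.task_id, row.is_human_edited ? "(edited)" : "", new Date(row.created_at).toISOString())
}
```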
These endpoints power the dashboard. You can also call them directly.

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Health check |
| GET | `/api/sessions` | List all sessions |
| GET | `/api/tasks` | List all task IDs |
| GET | `/api/graph?session=<id>` | Combined graph (nodes + edges) for a session |
| GET | `/api/tasks/:id/graph` | Graph for a single task |
| GET | `/api/snapshots/:id` | Full snapshot detail |
| POST | `/api/snapshots/:id/pause` | Pause the next request on this snapshot |
| POST | `/api/snapshots/:id/resolve` | Resume a paused request with edited messages |
| POST | `/api/snapshots/:id/fork` | Create a human-edited snapshot (same task) |
| POST | `/api/snapshots/:id/fork-chain` | Create a new task branching from this snapshot |
| GET | `/api/trimmer` | Get trimmer state `{ enabled: boolean }` |
| POST | `/api/trimmer/toggle` | Toggle trimmer on/off |
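A minimal sketch of calling a few of these endpoints directly; apart from the documented `{ enabled: boolean }`, the response shapes are assumptions, so inspect them in your own setup:

```ts
// Hedged sketch: poke the dashboard API directly.
const BASE = "http://localhost:8080"

// List sessions (response shape not documented above, so just log it)
const sessions = await fetch(`${BASE}/api/sessions`).then((r) => r.json())
console.log(sessions)

// Read trimmer state ({ enabled: boolean } per the table), then toggle it
const { enabled } = await fetch(`${BASE}/api/trimmer`).then((r) => r.json())
console.log("trimmer enabled:", enabled)

await fetch(`${BASE}/api/trimmer/toggle`, { method: "POST" })
```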
Run the backend and frontend separately with hot reload:
```bash
# Terminal 1 — backend
npm run dev

# Terminal 2 — frontend
cd web && npm run dev
```

The Vite dev server runs on port 5173 and proxies /api to localhost:8080.
Requires Ollama running locally with llama3.2:1b:
```bash
ollama pull llama3.2:1b
npm run build
npm run e2e
```

The suite simulates a 3-agent pipeline (Planner → Coder → Reviewer), verifies snapshots, cross-agent edges, session grouping, fork/replay, and edge cases, and exits 0 on a full pass.
We're building something bigger around Kontex CLI — team dashboards, session sharing, and deeper agent observability are on the roadmap.
- Watch this repo (GitHub Watch) to get notified on releases
- Star it (GitHub Star) to show support and help others discover it
- Open an issue to share what you're building — it directly shapes what gets built next
Issues and PRs are welcome. Please open an issue first for significant changes.
MIT — see LICENSE.
