65 commits
9473d09
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
c608efc
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
046ac3b
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
7b4cbbe
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
a0551ff
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
a36312e
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
d70e2c4
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
489a28a
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
53a7035
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
3b078d0
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
4b4d249
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
70ed186
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
e03dfcc
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
b538176
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
f021fd6
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
3b54d3f
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
e1dd56d
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
800c4f9
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
f13e641
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
fbcf12d
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
a93b377
docs(readme): add quickstart pointer, update project layout, fix benc…
levleontiev Mar 17, 2026
a4ad21c
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
411c6c7
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
288c9c7
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
80365c9
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
e778599
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
399cd93
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
e327a0c
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
ee2ab17
fix(quickstart): fix review issues — reverse_proxy mode, correct conf…
levleontiev Mar 18, 2026
348789e
refactor: replace provider-failover recipe with circuit-breaker
Mar 18, 2026
58fa21e
Merge branch 'main' into feature/issue-32-quickstart
levleontiev Mar 19, 2026
7a13c7f
docs: add wrapper mode to README + integration links in comparison se…
levleontiev Mar 19, 2026
480a409
docs: rewrite LLM token budget section to showcase wrapper mode
levleontiev Mar 19, 2026
c297873
docs: wrapper mode selector pathPrefix "/" covers all providers
levleontiev Mar 19, 2026
989cc04
docs: replace ASCII architecture diagrams with Mermaid sequence diagrams
levleontiev Mar 19, 2026
a2bb9da
docs: fix JWT wording — Fairvisor parses claims, does not validate si…
levleontiev Mar 19, 2026
4759b72
trim README: remove benchmark methodology, Contributing section; fix …
levleontiev Mar 19, 2026
5fac6e2
README: rework tagline, add hook + mode selector, drop Policy as code…
levleontiev Mar 22, 2026
b4e2b4d
README: remove hero latency/RPS line
levleontiev Mar 22, 2026
c75e2a5
README: broaden hook paragraph — not LLM-only
levleontiev Mar 22, 2026
51d45ae
docs: add 'Why we built this' section and ToC entry
levleontiev Mar 22, 2026
0a9d1ed
docs: README final polish — tagline, quickstart restructure, badge fi…
levleontiev Mar 22, 2026
c7d1345
fix(recipes): replace unsupported loop_detector rule with spec-level …
levleontiev Mar 22, 2026
a7f935d
fix(recipes): change cost_based period "30d" → "7d" (only 5m/1h/1d/7d…
levleontiev Mar 22, 2026
8566c15
docs: rewrite 'no external state' bullet to focus on request latency
levleontiev Mar 22, 2026
1e9ffc6
docs: reword 'does not replace' — Fairvisor can run standalone or alo…
levleontiev Mar 22, 2026
96de2a1
docs(readme): fix quickstart clone path
codex Mar 22, 2026
a4b1663
fix(quickstart): build local image instead of ghcr
codex Mar 22, 2026
4a6a52c
docs(quickstart): rename compose service to fairvisor
codex Mar 22, 2026
a739b5b
chore(quickstart): reduce readyz healthcheck frequency
codex Mar 22, 2026
3ba11ce
chore(quickstart): remove mock_llm healthcheck
codex Mar 22, 2026
78dc693
chore(quickstart): restore mock_llm healthcheck at low frequency
codex Mar 22, 2026
cfd558e
fix(docker): add maxminddb dev lib and tune map hashes
codex Mar 22, 2026
111d14d
fix(nginx): increase map hash size for quickstart
codex Mar 22, 2026
a033204
fix(nginx): set map_hash_max_size 131072 to cover all 85k ASN entries
levleontiev Mar 22, 2026
7772219
fix(nginx): map_hash_max_size 262144 — need ~3x entries for collision…
levleontiev Mar 22, 2026
8a73777
fix(cli): add bin/fairvisor-cli with corrected -I path for cli.* modules
levleontiev Mar 22, 2026
6092703
fix(cli): Dockerfile.cli → bin/fairvisor-cli, ENTRYPOINT updated
levleontiev Mar 22, 2026
fac3632
docs(cli): rename fairvisor → fairvisor-cli, fix resty -I path in docs
levleontiev Mar 22, 2026
f2059fc
docs(readme): fairvisor → fairvisor-cli in CLI section
levleontiev Mar 22, 2026
72150be
Merge branch 'main' into feature/issue-32-quickstart
levleontiev Mar 23, 2026
ade005a
feat(limiter): improve token count accuracy and max_completion_tokens…
codex Mar 23, 2026
96fbdc1
fix(limiter): fix luacheck warnings — trailing whitespace and unused …
codex Mar 23, 2026
843a26c
test(limiter): add scenarios for max_completion_tokens and improved J…
codex Mar 23, 2026
64b3b58
test(limiter): cover max_tokens body field and default fallback paths
codex Mar 23, 2026
299 changes: 172 additions & 127 deletions README.md

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions bin/fairvisor-cli
@@ -0,0 +1,7 @@
#!/bin/bash
# Requires: OpenResty 'resty' in PATH (e.g. openresty package or OPENRESTY_HOME)

SCRIPT_DIR="$(cd "$(dirname "$0")/.." && pwd)"

exec resty -I "${SCRIPT_DIR}/src" -I "${SCRIPT_DIR}" \
"${SCRIPT_DIR}/cli/main.lua" "$@"
36 changes: 18 additions & 18 deletions cli/README.md
@@ -12,40 +12,40 @@ Command-line tool for scaffolding policies, validating configs, dry-run testing,
From the repo root:

```bash
-./bin/fairvisor <command> [options]
+./bin/fairvisor-cli <command> [options]
```

Or with `resty` directly (e.g. from another directory, adjusting `-I` paths):

```bash
-resty -I /path/to/fv-oss/src -I /path/to/fv-oss/cli /path/to/fv-oss/cli/main.lua <command> [options]
+resty -I /path/to/fv-oss/src -I /path/to/fv-oss /path/to/fv-oss/cli/main.lua <command> [options]
```

-`bin/fairvisor` sets `-I` to the repo's `src` and `cli` so that `require("cli.commands.init")` and `require("fairvisor.bundle_loader")` resolve correctly.
+`bin/fairvisor-cli` sets `-I` to the repo's `src` and root (for `cli.*` modules) so that `require("cli.commands.init")` and `require("fairvisor.bundle_loader")` resolve correctly.
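
To see why the second `-I` now points at the repo root rather than `cli/`, it can help to expand the module names by hand. The sketch below assumes the common `?.lua` search pattern, where a module's dots become path separators under each `-I` directory. `module_candidate` is a hypothetical helper for illustration, not part of the CLI.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of fairvisor-cli): print the file that a
# "?.lua" search pattern would try for a module under one -I directory.
module_candidate() {
  local include_dir="$1" module="$2"
  printf '%s/%s.lua\n' "$include_dir" "$(printf '%s' "$module" | tr . /)"
}

# New layout: -I <repo> resolves cli.* modules.
module_candidate /path/to/fv-oss cli.commands.init
# -> /path/to/fv-oss/cli/commands/init.lua

# Old layout: -I <repo>/cli doubles the "cli" segment.
module_candidate /path/to/fv-oss/cli cli.commands.init
# -> /path/to/fv-oss/cli/cli/commands/init.lua

# -I <repo>/src resolves fairvisor.* modules.
module_candidate /path/to/fv-oss/src fairvisor.bundle_loader
# -> /path/to/fv-oss/src/fairvisor/bundle_loader.lua
```

Under that assumption, the old `-I .../cli` entry could not satisfy `require("cli.commands.init")`, which matches the "corrected -I path" wording in the commit titles.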

## Commands

| Command | Description |
|--------|-------------|
-| `fairvisor init [--template=api\|llm\|webhook]` | Generate `policy.json` and `edge.env.example` in the current directory. |
-| `fairvisor validate <file\|->` | Validate policy JSON; exit 0 if valid, non-zero with errors otherwise. |
-| `fairvisor test <file> [--requests=<file>] [--format=table\|json]` | Dry-run mock requests through the rule engine. |
-| `fairvisor connect [--token=TOKEN] [--url=URL] [--output=PATH]` | Write credentials, verify SaaS connection, optionally download initial bundle. |
-| `fairvisor status [--edge-url=URL] [--format=table\|json]` | Show policy version, SaaS connection, counters. |
-| `fairvisor logs [--action=ACTION] [--reason=REASON]` | Stream structured logs with optional filters. |
-| `fairvisor version` | Print CLI version. |
-| `fairvisor help` | Print command list and usage. |
+| `fairvisor-cli init [--template=api\|llm\|webhook]` | Generate `policy.json` and `edge.env.example` in the current directory. |
+| `fairvisor-cli validate <file\|->` | Validate policy JSON; exit 0 if valid, non-zero with errors otherwise. |
+| `fairvisor-cli test <file> [--requests=<file>] [--format=table\|json]` | Dry-run mock requests through the rule engine. |
+| `fairvisor-cli connect [--token=TOKEN] [--url=URL] [--output=PATH]` | Write credentials, verify SaaS connection, optionally download initial bundle. |
+| `fairvisor-cli status [--edge-url=URL] [--format=table\|json]` | Show policy version, SaaS connection, counters. |
+| `fairvisor-cli logs [--action=ACTION] [--reason=REASON]` | Stream structured logs with optional filters. |
+| `fairvisor-cli version` | Print CLI version. |
+| `fairvisor-cli help` | Print command list and usage. |

## Examples

```bash
-fairvisor init
-fairvisor init --template=llm
-fairvisor validate policy.json
-fairvisor test policy.json
-fairvisor connect --token=eyJ...
-fairvisor version
-fairvisor help
+fairvisor-cli init
+fairvisor-cli init --template=llm
+fairvisor-cli validate policy.json
+fairvisor-cli test policy.json
+fairvisor-cli connect --token=eyJ...
+fairvisor-cli version
+fairvisor-cli help
```

## Tests
1 change: 1 addition & 0 deletions docker/Dockerfile
@@ -5,6 +5,7 @@ RUN apt-get update && apt-get upgrade -y --no-install-recommends \
gettext-base \
python3 \
libmaxminddb0 \
+libmaxminddb-dev \
mmdb-bin \
&& rm -rf /var/lib/apt/lists/*

6 changes: 3 additions & 3 deletions docker/Dockerfile.cli
@@ -9,8 +9,8 @@ WORKDIR /opt/fairvisor

COPY src /opt/fairvisor/src
COPY cli /opt/fairvisor/cli
-COPY bin/fairvisor /opt/fairvisor/bin/fairvisor
+COPY bin/fairvisor-cli /opt/fairvisor/bin/fairvisor-cli

-RUN chmod +x /opt/fairvisor/bin/fairvisor
+RUN chmod +x /opt/fairvisor/bin/fairvisor-cli

-ENTRYPOINT ["/opt/fairvisor/bin/fairvisor"]
+ENTRYPOINT ["/opt/fairvisor/bin/fairvisor-cli"]
8 changes: 6 additions & 2 deletions docker/nginx.conf.template
@@ -25,6 +25,8 @@ worker_shutdown_timeout 35s;
http {
resolver 127.0.0.11 ipv6=off valid=30s;
resolver_timeout 2s;
+map_hash_max_size 262144;
+map_hash_bucket_size 64;

geo $is_tor_exit {
default 0;
@@ -51,7 +53,8 @@ http {

location = /livez {
default_type text/plain;
-return 200 "ok\n";
+return 200 "ok
+";
}

location = /readyz {
@@ -102,7 +105,8 @@ http {
}

default_type text/plain;
-return 404 "not found\n";
+return 404 "not found
+";
}
}
}
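
The `map_hash_max_size 262144` value in the nginx hunk above follows from the commit titles: about 85k ASN entries, and a target of roughly three times that many slots. A small sketch of the arithmetic, assuming exactly 85000 entries and rounding up to a power of two. The rounding is an assumption that mirrors the two values tried in the commits, not an nginx requirement stated in the source.

```bash
#!/usr/bin/env bash
# Hypothetical helper: smallest power of two that is >= the given number.
next_pow2() {
  local target="$1" n=1
  while [ "$n" -lt "$target" ]; do
    n=$((n * 2))
  done
  echo "$n"
}

entries=85000                  # assumption: "85k ASN entries" taken as 85000
next_pow2 "$entries"           # -> 131072 (the earlier value)
next_pow2 $((entries * 3))     # -> 262144 (the value set in this change)
```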
111 changes: 111 additions & 0 deletions examples/quickstart/README.md
@@ -0,0 +1,111 @@
# Fairvisor Edge — Quickstart

Go from `git clone` to working policy enforcement in one step.

## Prerequisites

- Docker with Compose V2 (`docker compose version`)
- Port 8080 free on localhost

## Start

```bash
docker compose up -d
```

The first run builds the `fairvisor` image locally from `docker/Dockerfile`, so no
GHCR login is required.

Wait for the `fairvisor` service to report healthy:

```bash
docker compose ps
# fairvisor should show "healthy"
```

## Verify enforcement

This quickstart runs in `FAIRVISOR_MODE=reverse_proxy`. Requests to `/v1/*`
are enforced by the TPM policy and forwarded to a local mock LLM backend.
No real API keys are required.

**Allowed request** — should return `200`:

```bash
curl -s -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d @../../fixtures/normal_request.json
```

Expected response body shape matches `../../fixtures/allow_response.json`.

**Over-limit request** — should return `429`:

```bash
curl -s -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d @../../fixtures/over_limit_request.json
```

Expected response body shape: `../../fixtures/reject_tpm_exceeded.json`.
The response will also include:
- `X-Fairvisor-Reason: tpm_exceeded`
- `Retry-After: 60`
- `RateLimit-Limit: 100` (matches the quickstart policy `tokens_per_minute`)
- `RateLimit-Remaining: 0`
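
A client can act on these headers instead of retrying blindly. The sketch below shows one way to pull a header out of a raw response header dump. `header_value` is a hypothetical helper, and the sample headers are the ones listed above rather than output captured from a live run.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of one header (case-insensitive name)
# from raw HTTP response headers read on stdin.
header_value() {
  awk -v name="$1" '
    BEGIN { want = tolower(name) }
    {
      sub(/\r$/, "")                      # drop the CR of CRLF line endings
      sep = index($0, ":")
      if (sep > 0 && tolower(substr($0, 1, sep - 1)) == want) {
        value = substr($0, sep + 1)
        sub(/^[ \t]+/, "", value)         # trim leading whitespace
        print value
        exit
      }
    }'
}

# Sample headers copied from the list above (not captured from a live run).
sample='X-Fairvisor-Reason: tpm_exceeded
Retry-After: 60
RateLimit-Limit: 100
RateLimit-Remaining: 0'

printf '%s\n' "$sample" | header_value retry-after          # -> 60
printf '%s\n' "$sample" | header_value X-Fairvisor-Reason   # -> tpm_exceeded
```

Against the running stack, the same function could read the output of `curl -s -o /dev/null -D - ...`, and a caller would then wait for the `Retry-After` value before retrying.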

## How the policy works

The quickstart policy (`policy.json`) enforces a TPM limit keyed on `ip:address`:

- `tokens_per_minute: 100` — allows roughly 2 small requests per minute
- `tokens_per_day: 1000` — daily cap
- `default_max_completion: 50` — pessimistic reservation per request when `max_tokens` is not set

Sending `over_limit_request.json` (which sets `max_tokens: 200000`) immediately
exceeds the 100-token per-minute budget and triggers a `429`.
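
The budget check can be reasoned about with simple arithmetic. The sketch below assumes the reservation is the prompt estimate plus `max_tokens` (or `default_max_completion` when `max_tokens` is absent). The exact accounting inside the limiter is not shown in this quickstart, and the 10-token prompt estimate is an illustrative value.

```bash
#!/usr/bin/env bash
# Hypothetical model of the per-request reservation; not the limiter's code.
# $1 = prompt token estimate, $2 = max_tokens from the body ("" if unset),
# $3 = default_max_completion from the policy.
reserved_tokens() {
  local prompt="$1" max_tokens="$2" default_completion="$3"
  echo $((prompt + ${max_tokens:-$default_completion}))
}

# Succeeds (exit 0) when the reservation fits in the per-minute budget ($4).
fits_budget() {
  [ "$(reserved_tokens "$1" "$2" "$3")" -le "$4" ]
}

reserved_tokens 10 "" 50                        # -> 60
reserved_tokens 10 200000 50                    # -> 200010
fits_budget 10 "" 50 100 && echo "allow"        # prints "allow"
fits_budget 10 200000 50 100 || echo "reject"   # prints "reject"
```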

## Wrapper mode (real provider routing)

Wrapper mode routes requests to real upstream providers using provider-prefixed paths
and a composite Bearer token. It requires real provider API keys and cannot be
demonstrated with this mock stack.

**Path and auth format:**

```
POST /openai/v1/chat/completions
Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY
```

Where:
- `CLIENT_JWT` — signed JWT identifying the calling client/tenant (used for policy enforcement)
- `UPSTREAM_KEY` — real upstream API key forwarded to the provider (e.g. `sk-...` for OpenAI)

Fairvisor strips the composite header, injects the correct provider auth before forwarding,
and **never returns upstream auth headers to the caller**
(see `../../fixtures/allow_response.json`).

**Provider-prefixed paths:**

| Path prefix | Upstream | Auth header injected |
|---|---|---|
| `/openai/v1/...` | `https://api.openai.com/v1/...` | `Authorization: Bearer UPSTREAM_KEY` |
| `/anthropic/v1/...` | `https://api.anthropic.com/v1/...` | `x-api-key: UPSTREAM_KEY` |
| `/gemini/v1beta/...` | `https://generativelanguage.googleapis.com/v1beta/...` | `x-goog-api-key: UPSTREAM_KEY` |

To run in wrapper mode, change the compose env to `FAIRVISOR_MODE: wrapper` and
supply real credentials in the `Authorization` header.
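
The routing table above can be read as a small mapping from path prefix to upstream URL and auth header. The sketch below is an illustration only: it assumes the composite token is split at the first colon (a JWT itself contains no colon), and the function names are hypothetical rather than Fairvisor internals.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating wrapper-mode routing; not Fairvisor code.

# Split "Bearer CLIENT_JWT:UPSTREAM_KEY" at the first colon (assumption).
client_jwt()   { local t="${1#Bearer }"; printf '%s\n' "${t%%:*}"; }
upstream_key() { local t="${1#Bearer }"; printf '%s\n' "${t#*:}"; }

# Map a provider-prefixed path to the upstream URL from the table.
upstream_url() {
  case "$1" in
    /openai/*)    printf 'https://api.openai.com%s\n' "${1#/openai}" ;;
    /anthropic/*) printf 'https://api.anthropic.com%s\n' "${1#/anthropic}" ;;
    /gemini/*)    printf 'https://generativelanguage.googleapis.com%s\n' "${1#/gemini}" ;;
    *)            return 1 ;;
  esac
}

# Build the provider-specific auth header from the table.
auth_header() {
  case "$1" in
    /openai/*)    printf 'Authorization: Bearer %s\n' "$2" ;;
    /anthropic/*) printf 'x-api-key: %s\n' "$2" ;;
    /gemini/*)    printf 'x-goog-api-key: %s\n' "$2" ;;
    *)            return 1 ;;
  esac
}

header='Bearer CLIENT_JWT:UPSTREAM_KEY'   # placeholders, as in the format above
path='/anthropic/v1/messages'             # example path under the /anthropic prefix
upstream_url "$path"                              # -> https://api.anthropic.com/v1/messages
auth_header "$path" "$(upstream_key "$header")"   # -> x-api-key: UPSTREAM_KEY
```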

## Teardown

```bash
docker compose down
```

## Next steps

- See `../recipes/` for team budgets, runaway agent guard, and circuit breaker scenarios
- See `../../fixtures/` for all sample request/response artifacts
- See [fairvisor/benchmark](https://github.com/fairvisor/benchmark) for performance benchmarks
- See [docs/install/](../../docs/install/) for Kubernetes, VM, and SaaS deployment options
59 changes: 59 additions & 0 deletions examples/quickstart/docker-compose.yml
@@ -0,0 +1,59 @@
# Fairvisor Edge — Quickstart stack (standalone + reverse proxy mode)
#
# Usage:
# docker compose up -d
# curl -s http://localhost:8080/readyz # health check
# curl -s -X POST http://localhost:8080/v1/chat/completions \
# -H "Content-Type: application/json" \
# -d @../../fixtures/normal_request.json # expect 200
# curl -s -X POST http://localhost:8080/v1/chat/completions \
# -H "Content-Type: application/json" \
# -d @../../fixtures/over_limit_request.json # expect 429
#
# This stack runs in FAIRVISOR_MODE=reverse_proxy — requests to /v1/* are
# enforced by policy then forwarded to the local mock LLM backend.
# No real API keys required.
#
# Wrapper mode (routing by provider prefix, real upstream keys) is documented
# in README.md under "Wrapper mode". It requires real provider credentials and
# cannot be demonstrated with this mock stack.
#
# This file is also the base for the e2e-smoke CI check.
# CI expects the same port and volume contract; update CI too if those change.

services:
fairvisor:
build:
context: ../..
dockerfile: docker/Dockerfile
ports:
- "8080:8080"
environment:
FAIRVISOR_CONFIG_FILE: /etc/fairvisor/policy.json
FAIRVISOR_MODE: reverse_proxy
FAIRVISOR_BACKEND_URL: http://mock_llm:80
FAIRVISOR_SHARED_DICT_SIZE: 32m
FAIRVISOR_LOG_LEVEL: info
FAIRVISOR_WORKER_PROCESSES: "1"
volumes:
- ./policy.json:/etc/fairvisor/policy.json:ro
depends_on:
mock_llm:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://127.0.0.1:8080/readyz"]
interval: 2m
timeout: 2s
retries: 15
start_period: 5s

mock_llm:
image: nginx:1.27-alpine
volumes:
- ./mock-llm.conf:/etc/nginx/nginx.conf:ro
healthcheck:
test: ["CMD", "wget", "-q", "-O", "-", "http://127.0.0.1:80/"]
interval: 2m
timeout: 2s
retries: 10
start_period: 5s
10 changes: 10 additions & 0 deletions examples/quickstart/mock-llm.conf
@@ -0,0 +1,10 @@
events {}
http {
server {
listen 80;
location / {
default_type application/json;
return 200 '{"id":"chatcmpl-qs","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","content":"Hello from the mock backend!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":8,"total_tokens":18}}';
}
}
}
31 changes: 31 additions & 0 deletions examples/quickstart/policy.json
@@ -0,0 +1,31 @@
{
"bundle_version": 1,
"issued_at": "2026-01-01T00:00:00Z",
"expires_at": "2030-01-01T00:00:00Z",
"policies": [
{
"id": "quickstart-tpm-policy",
"spec": {
"selector": {
"pathPrefix": "/v1/",
"methods": ["POST"]
},
"mode": "enforce",
"rules": [
{
"name": "tpm-limit",
"limit_keys": ["ip:address"],
"algorithm": "token_bucket_llm",
"algorithm_config": {
"tokens_per_minute": 100,
"tokens_per_day": 1000,
"burst_tokens": 100,
"default_max_completion": 50
}
}
]
}
}
],
"kill_switches": []
}
43 changes: 43 additions & 0 deletions examples/recipes/circuit-breaker/README.md
@@ -0,0 +1,43 @@
# Recipe: Circuit Breaker — Cost Spike Auto-Shutdown

Automatically block all LLM traffic when the aggregate token spend rate
exceeds a budget threshold, then self-reset after a cooldown period.

## How it works

- Normal traffic: per-org TPM limit enforced (`100 000 tokens/min`)
- Spike detection: if the rolling spend rate hits `500 000 tokens/min`
the circuit breaker opens and **all requests return `429`** with
`X-Fairvisor-Reason: circuit_breaker_open`
- Auto-reset: after 10 minutes without breaker-triggering load, the
circuit resets automatically — no manual intervention needed
- `alert: true` logs the trip event to the Fairvisor audit log
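
The open/reset behaviour described above can be sketched as a pure function. This is a simplified model under stated assumptions: the breaker opens when the rate reaches the threshold (`>=`), the cooldown is counted from the last time the threshold was hit, and `breaker_state` is a hypothetical name rather than the engine's API.

```bash
#!/usr/bin/env bash
# Hypothetical model of the breaker; not Fairvisor's implementation.
# $1 = rolling spend rate (tokens/min)
# $2 = spend_rate_threshold_per_minute
# $3 = minutes since the threshold was last hit ("" if it never was)
# $4 = auto_reset_after_minutes (0 = manual reset only)
breaker_state() {
  local rate="$1" threshold="$2" since_trip="$3" reset_after="$4"
  if [ "$rate" -ge "$threshold" ]; then
    echo open                       # spike in progress
  elif [ -z "$since_trip" ]; then
    echo closed                     # never tripped
  elif [ "$reset_after" -eq 0 ] || [ "$since_trip" -lt "$reset_after" ]; then
    echo open                       # still cooling down, or manual reset required
  else
    echo closed                     # cooldown elapsed, auto-reset
  fi
}

breaker_state 100000 500000 "" 10   # -> closed
breaker_state 500000 500000 0 10    # -> open
breaker_state 100000 500000 3 10    # -> open
breaker_state 100000 500000 10 10   # -> closed
```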

## Deploy

```bash
cp policy.json /etc/fairvisor/policy.json
```

## Expected behaviour

```bash
# Normal request — passes
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer <jwt>:<upstream-key>" \
-H "Content-Type: application/json" \
http://localhost:8080/v1/chat/completions \
-d '{"model":"gpt-4o","messages":[{"role":"user","content":"hi"}]}'
# → 200

# After spend spike trips the breaker:
# → 429 X-Fairvisor-Reason: circuit_breaker_open
# Retry-After: 600
```

## Tuning

| Field | Description |
|---|---|
| `spend_rate_threshold_per_minute` | Tokens/min rolling spend that opens the breaker |
| `auto_reset_after_minutes` | Cooldown before automatic reset (0 = manual only) |
| `tokens_per_minute` | Per-org steady-state limit (independent of breaker) |