-
-
Notifications
You must be signed in to change notification settings - Fork 0
API Endpoints
All /v1/* and /proxy/* endpoints require Authorization: Bearer <client_token> or Bearer <api_key_token>.
API keys support RBAC — roles (admin/user/read-only), agent scoping, and endpoint restrictions. See Secrets for full RBAC configuration.
| Endpoint | Auth | Description |
|---|---|---|
POST /v1/chat/completions |
Bearer | OpenAI-compatible chat with SSE streaming, agent fallback, MCP multi-turn tool loop |
GET /v1/models |
Bearer | Live model catalog from discovered providers + static registry (includes context window, max tokens, capabilities, pricing) |
POST /v1/embeddings |
Bearer | Passthrough proxy with token counting and rate limiting |
POST /v1/responses |
Bearer | Passthrough proxy with URL resolution and retry logic |
POST /v1/images/generations |
Bearer | Image generation (OpenAI-compatible) |
POST /v1/audio/transcriptions |
Bearer | Audio transcription (multipart form-data support) |
POST /v1/audio/speech |
Bearer | Text-to-speech synthesis |
POST /v1/moderations |
Bearer | Content moderation |
POST /v1/rerank |
Bearer | Re-ranking (Cohere/Jina-compatible) |
POST /v1/a2a |
Bearer | Agent-to-Agent protocol (Google A2A) |
GET /v1/files |
Bearer | File listing, upload, retrieval, deletion |
POST /v1/batches |
Bearer | Batch API operations |
POST /proxy/{provider}/* |
Bearer | Arbitrary provider endpoint passthrough (all HTTP methods, SSE streaming auto-detect) |
GET /healthz |
None | Engine health probe (returns OK if gateway is running) |
GET /statsz |
None | Token usage per model, circuit breaker states, MCP server status |
GET /metrics |
None | Prometheus-compatible metrics |
GET /debug/pprof |
Bearer | Go pprof profiling (CPU, memory, goroutines). Requires debug.pprof_enabled: true in config |
POST /v1/chat/completions supports both stream: true (default) and stream: false.
When stream: false, the upstream SSE is buffered into a complete JSON response before returning. All pipeline features (redaction, routing, circuit breaker, MCP loop) work the same way.
The /proxy/{provider}/* endpoint routes to any provider endpoint:
POST /proxy/anthropic/v1/messages
GET /proxy/gemini/v1/models
POST /proxy/openai/v1/files
See Passthrough Proxy for details.
Simple health check for load balancers and orchestration:
curl http://localhost:8080/healthz
# OKUsage statistics and system state:
curl http://localhost:8080/statszReturns: per-model request/error/token counters, circuit breaker states, MCP server status, latency data.
Prometheus-compatible metrics for monitoring:
curl http://localhost:8080/metricsIncludes: request counts, token usage, latency histograms, circuit breaker states, rate limiter status, overflow guard triggers, MCP active goroutines.
Go pprof profiling endpoints for performance analysis:
curl http://localhost:8080/debug/pprof/Available profiles:
-
heap- Memory heap sampling -
goroutine- Goroutine stack traces -
profile- CPU profiling (30s by default) -
block- Blocking operations -
mutex- Contention analysis
Enable in config:
{
"debug": {
"pprof_enabled": true
}
}Then use go tool pprof:
go tool pprof http://localhost:8080/debug/pprof/profileSecurity: Requires Authorization: Bearer <token> header like all /v1/* endpoints. Only enable in production with proper access controls.
- Quick Start — First run
- Passthrough Proxy — Passthrough endpoint reference
- Configuration — Config sections
Getting Started
- Home — Project overview
- Quick Start — Install and run in 5 minutes
- Client Setup — OpenCode, Cursor, and other clients
- Deployment — Bare metal, container, Kubernetes
Core Concepts
- Configuration — Config reference and examples
- Providers — 22 providers, capabilities, special behaviors
- Routing — Latency-aware routing and fallback chains
- Architecture — Package overview and request lifecycle
- MCP Integration — MCP server integration
Reference
- Passthrough Proxy — Raw provider endpoint proxying
- Secrets — Systemd credentials and container secrets
- Model Discovery — Dynamic model catalog fetching
- API Endpoints — Endpoint reference
Operations
- Demo — Test all pipeline tiers
- Troubleshooting — Common issues and solutions
- FAQ — Frequently asked questions
- Security — Security policy and vulnerability reporting
Project
- Roadmap — Planned features
- Disclaimer — Legal disclaimer