A blazing-fast, zero-dependency LLM proxy gateway. Drop it in front of any LLM provider and get caching, metrics, retries, and observability, all without changing your application code.
```bash
# 1. Clone
git clone https://github.com/llmsocket/llmsocket-gateway && cd llmsocket-gateway

# 2. Configure (copy template, add your keys)
cp .env.example .env
# Edit .env and add at least one provider key

# 3. Run
go run ./cmd/gateway

# 4. Test
curl http://localhost:4000/health
curl http://localhost:4000/ready
```

The banner shows exactly which providers were detected:
```text
   [ASCII art logo]

   AI Gateway · vdev · development
   ─────────────────────────────────────
   Gateway   → http://localhost:4000
   Metrics   → http://localhost:4000/metrics
   Dashboard → http://localhost:4000/ui
   Config    → http://localhost:4000/config
   Providers → anthropic, groq, openai
   Cache     → true (TTL: 1h0m0s, max: 10000)
   Retry     → true (max 2 retries on [429 502 503 504])
```
All config lives in .env (or real environment variables; environment variables win over .env).
```bash
cp .env.example .env
# Edit .env
```

| Variable | Default | Description |
|---|---|---|
| `GATEWAY_PORT` | `4000` | HTTP listen port |
| `GATEWAY_ENV` | `development` | `development` or `production` (JSON logs) |
| `GATEWAY_DEBUG` | `false` | Verbose debug logging |
| `GATEWAY_TIMEOUT_MS` | `120000` | Upstream timeout in milliseconds |
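The precedence rule (real environment variables win over `.env`, which wins over the built-in default) can be sketched in a few lines. This is an illustrative Python model of the lookup order, not the gateway's actual Go loader:

```python
import os

def load_dotenv_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

def get_setting(name: str, dotenv: dict, default: str) -> str:
    """Real environment variables win over .env, which wins over the default."""
    return os.environ.get(name) or dotenv.get(name) or default

dotenv = load_dotenv_text("# local overrides\nGATEWAY_PORT=5000\n")
print(get_setting("GATEWAY_PORT", dotenv, "4000"))  # .env value wins over the default
```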
Add whichever providers you use. Unset providers are simply not registered.
```bash
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...
GROQ_API_KEY=gsk_...
MISTRAL_API_KEY=...
TOGETHER_API_KEY=...
DEEPSEEK_API_KEY=...
XAI_API_KEY=xai-...
FIREWORKS_API_KEY=...
PERPLEXITY_API_KEY=pplx-...
COHERE_API_KEY=...
OPENROUTER_API_KEY=sk-or-...

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://YOUR_RESOURCE.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o

# Ollama (local, no key needed)
OLLAMA_BASE_URL=http://localhost:11434
```

Override the base URL for any provider (useful for proxies, private deployments):

```bash
GATEWAY_OPENAI_BASE_URL=https://my-proxy.example.com
GATEWAY_ANTHROPIC_BASE_URL=https://anthropic.corp.internal
```

Caching, rate limiting, and retries:

```bash
GATEWAY_CACHE_ENABLED=true
GATEWAY_CACHE_TTL_SECONDS=3600
GATEWAY_CACHE_MAX_SIZE=10000
GATEWAY_RATELIMIT_ENABLED=false
GATEWAY_RATELIMIT_RPM=1000
GATEWAY_RETRY_ENABLED=true
GATEWAY_RETRY_MAX=2
```

Each provider gets a dedicated path prefix. Just set `base_url` in your client:
| Provider | Gateway path | Example |
|---|---|---|
| OpenAI | `/openai/v1` | `http://localhost:4000/openai/v1` |
| Anthropic | `/anthropic/v1` | `http://localhost:4000/anthropic/v1` |
| Groq | `/groq/openai/v1` | `http://localhost:4000/groq/openai/v1` |
| Mistral | `/mistral/v1` | `http://localhost:4000/mistral/v1` |
| DeepSeek | `/deepseek/v1` | `http://localhost:4000/deepseek/v1` |
| Gemini | `/gemini/...` | `http://localhost:4000/gemini` |
| Ollama | `/ollama/v1` | `http://localhost:4000/ollama/v1` |
| OpenRouter | `/openrouter/api/v1` | `http://localhost:4000/openrouter/api/v1` |
Route to any provider using a header; your app uses a single base URL:
```bash
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-gateway-provider: groq" \
  -d '{"model":"llama3-70b-8192","messages":[...]}'
```

Override the server-configured key for a single request:
```bash
curl http://localhost:4000/openai/v1/chat/completions \
  -H "x-gateway-api-key: sk-SPECIFIC-KEY" \
  -d '...'
```

```python
# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After – everything else stays identical
from openai import OpenAI
client = OpenAI(base_url="http://localhost:4000/openai")
```

```typescript
// TypeScript
const client = new OpenAI({ baseURL: "http://localhost:4000/openai" });
```

```bash
# curl
curl http://localhost:4000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}]}'
```

| Endpoint | Description |
|---|---|
| `GET /health` | Always returns 200 if the server is running |
| `GET /ready` | 200 if ≥ 1 provider configured, 503 otherwise |
| `GET /config` | Active config (no secrets) |
| `GET /metrics` | Prometheus text format |
| `GET /metrics/json` | Full metrics as JSON |
| `GET /ui` | Live dashboard (HTML) |
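The single-base-URL routing shown earlier (`x-gateway-provider` against `/v1/chat/completions`) can be exercised with nothing but Python's standard library. A minimal sketch mirroring the curl example; building the request does not hit the network, so the gateway need not be running:

```python
import json
import urllib.request

# The x-gateway-provider header picks the upstream; the base URL is the same
# for every provider.
payload = {
    "model": "llama3-70b-8192",
    "messages": [{"role": "user", "content": "hello"}],
}
req = urllib.request.Request(
    "http://localhost:4000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "x-gateway-provider": "groq",
    },
    method="POST",
)
# With the gateway running, send it:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```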
```yaml
# prometheus.yml
scrape_configs:
  - job_name: gateway
    static_configs:
      - targets: ['localhost:4000']
    metrics_path: /metrics
```

Key metrics exposed:
- `gateway_requests_total` – total proxied requests
- `gateway_errors_total` – total 4xx/5xx responses
- `gateway_cache_hits_total` – cache hit count
- `gateway_tokens_total{provider, model, type}` – token usage
- `gateway_cost_usd_total{provider, model}` – estimated USD cost
- `gateway_upstream_latency_p99_ms{provider, model}` – P99 upstream latency
- `gateway_gateway_overhead_ms{provider, model}` – gateway-added latency
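Because the exposition format is plain text, a scrape can be sanity-checked by hand without Prometheus. A minimal parsing sketch; the sample lines are illustrative, not real gateway output:

```python
import re

# Illustrative sample in Prometheus text format; real values come from
# GET /metrics on the running gateway.
SAMPLE = """\
gateway_requests_total 1042
gateway_errors_total 7
gateway_cache_hits_total 313
gateway_cost_usd_total{provider="openai",model="gpt-4o-mini"} 0.42
"""

METRIC_RE = re.compile(r'^(\w+)(\{[^}]*\})?\s+([0-9.eE+-]+)$')

def parse_metrics(text: str) -> dict:
    """Map 'name{labels}' -> float value, skipping comments and blank lines."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = METRIC_RE.match(line)
        if m:
            name, labels, value = m.groups()
            out[name + (labels or "")] = float(value)
    return out

metrics = parse_metrics(SAMPLE)
hit_rate = metrics["gateway_cache_hits_total"] / metrics["gateway_requests_total"]
print(f"cache hit rate: {hit_rate:.1%}")
```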
Each request is appended to `requests.jsonl` (configurable):

```json
{"ts":"2025-01-15T10:23:45Z","request_id":"abc123","provider":"openai","model":"gpt-4o-mini","status":200,"latency_ms":342,"tokens_in":120,"tokens_out":80,"cache_hit":false}
```

All gateway errors follow a consistent structure:
```json
{
  "error": {
    "code": "missing_api_key",
    "message": "no API key configured for provider \"groq\"",
    "provider": "groq",
    "request_id": "req_abc123",
    "hint": "set GROQ_API_KEY in your .env file or environment, or pass the x-gateway-api-key header"
  }
}
```

| Error code | Cause |
|---|---|
| `missing_api_key` | No key for the requested provider |
| `unknown_provider` | Provider name not recognized |
| `upstream_error` | Network/timeout error calling the provider |
| `invalid_provider_url` | Malformed base URL in config |
| `request_body_read_error` | Could not read the request body |
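Because the `code` field is stable, a client can branch on it and surface the `hint` to the user. A minimal sketch using the envelope shown above; which codes are worth retrying is an assumption here, not gateway policy:

```python
import json

# Sample payload in the error envelope format shown above.
RESPONSE_BODY = json.dumps({
    "error": {
        "code": "missing_api_key",
        "message": 'no API key configured for provider "groq"',
        "provider": "groq",
        "request_id": "req_abc123",
        "hint": "set GROQ_API_KEY in your .env file or environment, or pass the x-gateway-api-key header",
    }
})

# Assumption: only network/timeout failures are worth retrying.
RETRYABLE = {"upstream_error"}

def describe_error(body: str) -> str:
    """Turn a gateway error response into a one-line report plus hint."""
    err = json.loads(body)["error"]
    action = "retry later" if err["code"] in RETRYABLE else "fix configuration"
    msg = f'[{err["request_id"]}] {err["code"]}: {err["message"]} ({action})'
    if hint := err.get("hint"):
        msg += f"\n  hint: {hint}"
    return msg

print(describe_error(RESPONSE_BODY))
```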
Config errors (startup) include `field`, `got`, and `hint`:

```text
┌─────────────────────────────────────────┐
│  Configuration Error                    │
└─────────────────────────────────────────┘
 ✗ must be a valid integer (field: GATEWAY_PORT, got: "abc")
 💡 example: GATEWAY_PORT=4000
```
See the `examples/` directory:

- `curl_examples.sh` – bash/curl quick tests
- `python_openai.py` – Python (OpenAI SDK, streaming, per-key override, Ollama)
- `node_example.js` – Node.js (OpenAI SDK)
```bash
# Python
pip install openai
python examples/python_openai.py openai
python examples/python_openai.py stream
python examples/python_openai.py health

# curl
bash examples/curl_examples.sh
```

Run with Docker:

```bash
docker-compose up

# Or with your .env file
docker run --env-file .env -p 4000:4000 gateway:latest
```

```text
Client → GATEWAY → Provider (OpenAI / Anthropic / Groq / ...)
            │
            ├── .env loader (with line-level error reporting)
            ├── LRU cache (in-memory, zero deps)
            ├── Retry with exponential backoff
            ├── Token extraction (all provider formats)
            ├── Cost estimation
            ├── Prometheus metrics
            └── JSONL request log
```
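The in-memory cache in the diagram can be pictured as a bounded LRU map with per-entry expiry. A zero-dependency Python sketch of the idea; the gateway's actual implementation is in Go, and the eviction details here are illustrative:

```python
import time
from collections import OrderedDict

class LRUCache:
    """Bounded LRU cache with TTL expiry, mirroring GATEWAY_CACHE_MAX_SIZE
    and GATEWAY_CACHE_TTL_SECONDS. Illustrative sketch, not the gateway's code."""

    def __init__(self, max_size=10000, ttl_seconds=3600.0):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._items[key]          # expired: drop it
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return value

    def set(self, key, value):
        self._items[key] = (time.monotonic() + self.ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)   # evict least recently used

cache = LRUCache(max_size=2, ttl_seconds=60)
cache.set("a", 1); cache.set("b", 2)
cache.get("a")            # touch "a" so "b" becomes least recently used
cache.set("c", 3)         # over capacity: evicts "b"
print(cache.get("b"))     # None – evicted
```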
| Provider | Path | Key env var |
|---|---|---|
| OpenAI | `/openai` | `OPENAI_API_KEY` |
| Anthropic | `/anthropic` | `ANTHROPIC_API_KEY` |
| Google Gemini | `/gemini` | `GEMINI_API_KEY` |
| Groq | `/groq` | `GROQ_API_KEY` |
| Mistral | `/mistral` | `MISTRAL_API_KEY` |
| Together AI | `/together` | `TOGETHER_API_KEY` |
| DeepSeek | `/deepseek` | `DEEPSEEK_API_KEY` |
| xAI / Grok | `/xai` | `XAI_API_KEY` |
| Fireworks | `/fireworks` | `FIREWORKS_API_KEY` |
| Perplexity | `/perplexity` | `PERPLEXITY_API_KEY` |
| Cohere | `/cohere` | `COHERE_API_KEY` |
| OpenRouter | `/openrouter` | `OPENROUTER_API_KEY` |
| Azure OpenAI | `/azure` | `AZURE_OPENAI_API_KEY` |
| AWS Bedrock | `/bedrock` | `AWS_ACCESS_KEY_ID` |
| Anyscale | `/anyscale` | `ANYSCALE_API_KEY` |
| Ollama (local) | `/ollama` | `OLLAMA_BASE_URL` |



