FerroGate is an open-source Rust API gateway and AI gateway built on Cloudflare Pingora. It gives teams a self-hostable control point for LLM traffic: routing, virtual API keys, provider adapters, policy checks, token usage accounting, observability, admin APIs, and automatic HTTPS.
The project is developed as the open-source gateway foundation behind Token4AI Cloud.
- Pingora gateway runtime for HTTP reverse proxying, route matching, upstream pools, path/header rewrites, request IDs, tracing IDs, streaming responses, graceful shutdown, and listener-level graceful upgrade.
- OpenAI-compatible AI API with GET /v1/models, POST /v1/chat/completions, and POST /v1/responses, including non-streaming and streaming SSE forwarding.
- Provider adapters for OpenAI-compatible APIs, OpenAI, Anthropic, Gemini, Grok/xAI, and Azure OpenAI.
- Model registry and fallback routing with logical model names, provider model mapping, priority fallback, weighted fallback, tenant visibility, and provider allow/deny controls.
- Caddy-style config file compatibility through Ferrogate/Caddyfile parsing for familiar reverse-proxy routes, matchers, TLS, logging, and gateway settings, alongside structured TOML configuration.
- Virtual API keys and policy checks with hashed keys, tenant context, scopes, disabled/expired keys, model/provider allowlists and denylists, minimal deny-rule policy evaluation, request rate limits, and token budgets.
- Token usage and billing events using provider-reported usage when available, gateway estimates when needed, and a request reservation / settlement flow inspired by production AI gateways.
- Observability with structured request logs, billing events, configurable in-memory retention, usage aggregates, Prometheus metrics, request/trace ID propagation, and OTLP/HTTP metrics/logs/traces export.
- Admin API and dashboard for gateway status, providers, models, API keys, tenants, policies, request logs, billing events, usage aggregates, audit events, provider health, config validation, and process-local reload.
- Automatic HTTPS with manual TLS, ACME HTTP-01, and ACME DNS-01 through a built-in Cloudflare provider. ACME provider credentials are read from the configuration file, not environment variables or Python scripts.
- Supply-chain and security gates with formatting, clippy, locked metadata, high-confidence secret scanning, cargo-deny, cargo-audit, and GitHub Actions.
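The request reservation / settlement flow mentioned in the feature list can be pictured with a small sketch. This is not FerroGate's actual implementation: the `TokenBudget` and `Reservation` types and their methods are hypothetical names chosen for illustration. The idea is that an estimate is held against the monthly budget before dispatch, then replaced by the provider-reported usage when available, or kept as the gateway estimate otherwise.

```rust
/// Hypothetical monthly token budget for one virtual API key (illustrative only).
#[derive(Debug)]
pub struct TokenBudget {
    limit: u64,
    settled: u64,  // tokens charged by completed requests
    reserved: u64, // tokens held by in-flight requests
}

/// Hypothetical handle for tokens held while a request is in flight.
#[derive(Debug, PartialEq, Eq)]
pub struct Reservation {
    tokens: u64,
}

impl TokenBudget {
    pub fn new(limit: u64) -> Self {
        Self { limit, settled: 0, reserved: 0 }
    }

    /// Tokens still available after settled and in-flight usage.
    pub fn remaining(&self) -> u64 {
        self.limit
            .saturating_sub(self.settled.saturating_add(self.reserved))
    }

    /// Hold `estimate` tokens before dispatch; `None` means the budget is exhausted.
    pub fn reserve(&mut self, estimate: u64) -> Option<Reservation> {
        if estimate > self.remaining() {
            return None;
        }
        self.reserved += estimate;
        Some(Reservation { tokens: estimate })
    }

    /// Release the hold and charge provider-reported usage when present,
    /// falling back to the original estimate. Returns the tokens charged.
    pub fn settle(&mut self, reservation: Reservation, provider_usage: Option<u64>) -> u64 {
        self.reserved = self.reserved.saturating_sub(reservation.tokens);
        let charged = provider_usage.unwrap_or(reservation.tokens);
        self.settled = self.settled.saturating_add(charged);
        charged
    }
}
```

With a budget of 1000 tokens, reserving 300 leaves 700 available for concurrent requests. Settling that reservation with a provider-reported 120 tokens returns the unused 180 to the budget.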
The planned MVP and production-readiness implementation slice is complete for the open-source gateway codebase.
Validated end-to-end:
- HTTP reverse proxy runtime on Pingora.
- OpenAI-compatible AI gateway paths.
- Provider adapters and fallback routing.
- Virtual API key auth, policy checks, rate limits, and token budget handling.
- Request logs, billing events, usage aggregates, metrics, and OTLP planning.
- Admin API, API key and policy CRUD, static dashboard, config validation, and process-local reload.
- Manual TLS, ACME HTTP-01, and ACME DNS-01.
- Real Let's Encrypt staging and production issuance for both HTTP-01 and Cloudflare DNS-01 during live validation.
Still intentionally scoped as next-stage production work:
- Persistent storage implementations for API keys, tenants, policy, billing, request logs, and audit logs. Current runtime state is primarily config and in-memory repository driven.
- Full Admin API CRUD control plane beyond the current API key and policy resources.
- Background ACME renewal and hot certificate reload. Current ACME behavior is startup-time issuance/reuse.
- Expanded DNS provider set beyond the built-in Cloudflare provider and the generic external hook boundary.
crates/
  ferrogate-cli            CLI, Pingora runtime wiring, gateway handlers
  ferrogate-config         Caddyfile/TOML config model and parser
  ferrogate-providers      AI provider adapters and model registry
  ferrogate-auth           Tenant and RBAC domain models
  ferrogate-policy         Policy decision models and engine
  ferrogate-storage        Repository traits and in-memory storage
  ferrogate-billing        Token usage, cost, and billing event models
  ferrogate-observability  Metrics, spans, exporter contracts
  ferrogate-runtime        Reload and runtime lifecycle state machine
config/                    Example TOML configuration
Ferrogate/Caddyfile        Default Caddyfile-style development config
scripts/security-check.sh  Local security and supply-chain gate
Prerequisites:
- Rust toolchain compatible with the workspace rust-version.
- cmake, g++, make, and pkg-config for Pingora's native compression dependency chain.
Run the default development gateway:
cargo run -- run --config Ferrogate/Caddyfile
Validate configuration:
cargo run -- validate --config Ferrogate/Caddyfile
cargo run -- validate --config config/ferrogate.example.toml
Probe the gateway:
curl http://127.0.0.1:8080/healthz
curl http://127.0.0.1:8080/proxy/httpbin/get
curl -H 'Authorization: Bearer dev-secret' http://127.0.0.1:8080/v1/models
Send an OpenAI-compatible chat request:
curl -X POST http://127.0.0.1:8080/v1/chat/completions \
-H 'Authorization: Bearer dev-secret' \
-H 'Content-Type: application/json' \
-d '{"model":"fast-chat","messages":[{"role":"user","content":"hello"}]}'
Send an OpenAI-compatible Responses API request:
curl -X POST http://127.0.0.1:8080/v1/responses \
-H 'Authorization: Bearer dev-secret' \
-H 'Content-Type: application/json' \
-d '{"model":"fast-chat","input":"hello"}'
Open the admin dashboard:
http://127.0.0.1:8080/admin
FerroGate loads Ferrogate/Caddyfile by default. TOML is also supported for
structured self-hosting and tests.
ferrogate run --config Ferrogate/Caddyfile
ferrogate run --config /etc/ferrogate/ferrogate.toml
:8080 {
    log
    respond /healthz "ok" 200
    ai_gateway {
        provider openai {
            kind openai-compatible
            base_url https://api.openai.com/v1
            api_key {env.OPENAI_API_KEY}
        }
        model fast-chat -> openai:gpt-4o-mini {
            capabilities chat streaming
            input_price_per_1m 0.15
            output_price_per_1m 0.60
        }
        api_key key_dev {
            key {$FERROGATE_DEV_KEY}
            scopes models.read chat.completions admin.read
            allowed_models fast-chat
            allowed_providers openai
            request_limit_per_minute 60
            monthly_token_budget 1000000
        }
    }
    route /v1/* {
        reverse_proxy https://api.openai.com {
            header_up Authorization "Bearer {env.OPENAI_API_KEY}"
        }
    }
}
listen = "0.0.0.0:8080"
[admin]
listen = "127.0.0.1:2019"
[telemetry]
access_log = "error"
access_log_sample_rate = 100
access_log_error_rate_limit_per_sec = 100
[storage]
request_log_retention_records = 10000
audit_event_retention_records = 10000
billing_event_retention_records = 10000
admin_list_default_limit = 100
admin_list_max_limit = 1000
[reliability]
provider_circuit_breaker_failure_threshold = 3
provider_circuit_breaker_cooldown_secs = 30
provider_dispatch_timeout_secs = 10
provider_dispatch_max_retries = 1
provider_response_body_max_bytes = 16777216
graceful_shutdown_grace_period_secs = 3
graceful_shutdown_timeout_secs = 15
[[providers]]
name = "openai"
kind = "openai-compatible"
base_url = "https://api.openai.com/v1"
api_key_env = "OPENAI_API_KEY"
[[models]]
name = "fast-chat"
provider = "openai"
provider_model = "gpt-4o-mini"
capabilities = ["chat", "streaming"]
input_price_per_1m = "0.15"
output_price_per_1m = "0.60"
[[api_keys]]
id = "dev"
key = "dev-secret"
scopes = ["models.read", "chat.completions", "admin.read"]
allowed_models = ["fast-chat"]
allowed_providers = ["openai"]
request_limit_per_minute = 60
monthly_token_budget = 1000000
For production, prefer key_hash generated by ferrogate hash-key over plain
development keys.
ferrogate hash-key --secret 'your-client-secret'
FerroGate supports manual TLS certificates and startup-time ACME issuance.
[tls]
enabled = true
cert_path = "/etc/ferrogate/certs/fullchain.pem"
key_path = "/etc/ferrogate/certs/privkey.pem"
http2 = true
HTTP-01 requires public inbound access to port 80 for the challenge and port 443 for HTTPS service.
listen = "0.0.0.0:443"
[tls]
enabled = true
http2 = true
[tls.acme]
enabled = true
domains = ["api.example.com"]
email = "ops@example.com"
directory_url = "https://acme-v02.api.letsencrypt.org/directory"
terms_agreed = true
challenge = "http-01"
http_challenge_listen = "0.0.0.0:80"
storage_dir = "/var/lib/ferrogate/acme-http"
DNS-01 does not require public port 80 and is required for wildcard certificates. Cloudflare credentials are configured in the FerroGate config file.
listen = "0.0.0.0:443"
[tls]
enabled = true
http2 = true
[tls.acme]
enabled = true
domains = ["api.example.com"]
email = "ops@example.com"
directory_url = "https://acme-v02.api.letsencrypt.org/directory"
terms_agreed = true
challenge = "dns-01"
storage_dir = "/var/lib/ferrogate/acme-dns"
dns_provider = "cloudflare"
dns_config = { api_token = "cf-token", zone_name = "example.com" }
dns_propagation_delay_secs = 30
Caddyfile-style DNS-01:
api.example.com {
    tls {
        issuer acme {
            email ops@example.com
        }
        storage /var/lib/ferrogate/acme-dns
        dns cloudflare {
            api_token cf-token
            zone_name example.com
        }
    }
}
FerroGate also keeps a provider-neutral external hook boundary for DNS providers that are not built in. Hooks receive the path to a JSON payload file with 0600 permissions and are invoked as:
<hook> <set|cleanup> <payload-json-path>
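A hook can be any executable that accepts those two arguments. As a rough sketch only, the argument handling of a hook written in Rust might look like the following. The `HookAction`, `HookCall`, and `parse_hook_args` names are hypothetical, and the payload's JSON schema is not described here, so the sketch stops at locating the file.

```rust
use std::path::PathBuf;

/// The two actions named in the invocation contract above.
#[derive(Debug, PartialEq, Eq)]
pub enum HookAction {
    Set,
    Cleanup,
}

/// Hypothetical parsed form of one hook invocation (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub struct HookCall {
    pub action: HookAction,
    pub payload_path: PathBuf,
}

/// Parse the arguments that follow the hook's own program name.
pub fn parse_hook_args<I, S>(args: I) -> Result<HookCall, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let args: Vec<String> = args.into_iter().map(|s| s.as_ref().to_owned()).collect();
    match args.as_slice() {
        [action, path] => {
            let action = match action.as_str() {
                "set" => HookAction::Set,
                "cleanup" => HookAction::Cleanup,
                other => return Err(format!("unknown action: {other}")),
            };
            Ok(HookCall {
                action,
                payload_path: PathBuf::from(path),
            })
        }
        _ => Err("usage: <hook> <set|cleanup> <payload-json-path>".to_owned()),
    }
}
```

A real hook would call this with `std::env::args().skip(1)`, read the payload with `std::fs::read_to_string`, and then create or remove the DNS TXT record through its provider's API.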
Validate-only reload report:
ferrogate reload --config Ferrogate/Caddyfile
Process-local reload through a running Admin API:
ferrogate reload \
--config Ferrogate/Caddyfile \
--admin-url http://127.0.0.1:8080 \
--admin-token "$FERROGATE_ADMIN_TOKEN"
Listener-level reload through Pingora graceful upgrade:
ferrogate reload --config Ferrogate/Caddyfile --graceful-upgrade
Process-local reload is used only when the listen socket and TLS listener fingerprint do not change. Listener/TLS changes require graceful upgrade.
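The rule above amounts to comparing the listener-facing parts of the old and new configuration. The sketch below illustrates that decision under stated assumptions: `ListenerFingerprint`, `ReloadMode`, and `choose_reload_mode` are hypothetical names, and the TLS fingerprint is treated as an opaque string because its real contents are not described here.

```rust
/// Hypothetical summary of what a listener is bound to (illustrative only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ListenerFingerprint {
    /// Listen socket address, e.g. "0.0.0.0:443".
    pub listen_addr: String,
    /// Opaque digest of the TLS listener settings; `None` when TLS is disabled.
    pub tls_fingerprint: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReloadMode {
    /// Apply the new configuration inside the running process.
    ProcessLocal,
    /// Hand listeners over to a new process via graceful upgrade.
    GracefulUpgrade,
}

/// Pick a reload mode: any change to the socket or TLS fingerprint
/// rules out a process-local reload.
pub fn choose_reload_mode(
    current: &ListenerFingerprint,
    next: &ListenerFingerprint,
) -> ReloadMode {
    if current == next {
        ReloadMode::ProcessLocal
    } else {
        ReloadMode::GracefulUpgrade
    }
}
```

Under this reading, editing a model or API key leaves the fingerprint unchanged and qualifies for process-local reload, while moving from port 8080 to 443 or enabling TLS does not.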
Common endpoints:
GET /v1/models
POST /v1/chat/completions
POST /v1/responses
GET /admin/v1/status
GET /admin/v1/providers
GET /admin/v1/provider-health
GET /admin/v1/models
GET /admin/v1/api-keys
GET /admin/v1/tenants
GET /admin/v1/policies
GET /admin/v1/request-logs
GET /admin/v1/billing-events
GET /admin/v1/usage-aggregates
GET /admin/v1/audit-events
POST /admin/v1/config/validate
POST /admin/v1/config/reload
GET /metrics
GET /admin
Read endpoints require admin.read when API keys are configured. Chat
completions require chat.completions, Responses API requests require
responses.create, and config validation and reload require admin.write.
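The scope rules above can be read as a small lookup from request to required scope. The following sketch is one interpretation, not FerroGate's code: `required_scope` and `key_allows` are hypothetical helpers, and treating every GET under /admin/v1/ as a read endpoint is an assumption. Endpoints whose scope is not stated above map to `None`.

```rust
/// Scope a request needs, according to the rules stated above.
/// Returns `None` where this document does not name a scope.
pub fn required_scope(method: &str, path: &str) -> Option<&'static str> {
    match (method, path) {
        ("POST", "/v1/chat/completions") => Some("chat.completions"),
        ("POST", "/v1/responses") => Some("responses.create"),
        ("POST", "/admin/v1/config/validate") | ("POST", "/admin/v1/config/reload") => {
            Some("admin.write")
        }
        // Assumption: "read endpoints" means GET requests under /admin/v1/.
        ("GET", p) if p.starts_with("/admin/v1/") => Some("admin.read"),
        _ => None,
    }
}

/// Check a key's granted scopes against the scope a request needs.
/// Requests with no scope listed in this sketch are allowed through.
pub fn key_allows(granted: &[&str], method: &str, path: &str) -> bool {
    match required_scope(method, path) {
        Some(scope) => granted.contains(&scope),
        None => true,
    }
}
```

Under this reading, a key granted models.read, chat.completions, and admin.read passes the chat completions check but not the Responses API check, which needs responses.create.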
Stable releases use date-based tags such as v2026.05.05.
Pull the published GitHub Packages image and run it with a mounted config:
docker pull ghcr.io/lianluo-esign/ferrogate:v2026.05.05
docker run --rm \
-p 8080:8080 \
-v "$PWD/config/ferrogate.example.toml:/etc/ferrogate/ferrogate.toml:ro" \
-e FERROGATE_CONFIG=/etc/ferrogate/ferrogate.toml \
ghcr.io/lianluo-esign/ferrogate:v2026.05.05
Build a local image when developing Docker changes:
docker build -t ferrogate .
For automatic HTTPS, publish the relevant ports and mount ACME storage:
docker run --rm \
-p 80:80 \
-p 443:443 \
-v /etc/ferrogate/ferrogate.toml:/etc/ferrogate/ferrogate.toml:ro \
-v /var/lib/ferrogate/acme:/var/lib/ferrogate/acme \
-e FERROGATE_CONFIG=/etc/ferrogate/ferrogate.toml \
ghcr.io/lianluo-esign/ferrogate:v2026.05.05
Run the local gate before committing:
./scripts/security-check.sh
Strict mode requires cargo-deny and cargo-audit:
FERROGATE_SECURITY_REQUIRE_TOOLS=1 ./scripts/security-check.sh
Install the supply-chain tools:
cargo install cargo-deny --version 0.19.4 --locked
cargo install cargo-audit --version 0.22.1 --locked
The security gate runs:
- cargo fmt --check
- cargo clippy --workspace --all-targets --all-features -- -D warnings
- cargo metadata --locked
- high-confidence secret scanning
- cargo deny check licenses bans sources
- cargo audit
Known residual audit warnings are documented in .cargo/audit.toml and the
development plan. They currently come from Pingora transitive dependencies and
are monitored separately from direct FerroGate code.
- Project roadmap: docs/roadmap.md
- Performance testing guide: docs/performance-testing.md
- Example TOML configuration: config/ferrogate.example.toml
- Default Caddyfile-style configuration: Ferrogate/Caddyfile
Internal development planning notes are maintained outside this product repository.
- Keep changes small and reviewable.
- Follow the existing Rust module boundaries and Caddyfile adapter style.
- Run ./scripts/security-check.sh before opening a PR.
- Update public product documentation when behavior, configuration, operations, or architecture changes.
- Do not commit provider secrets, ACME tokens, private keys, or generated certificates.
Licensed under the Apache License, Version 2.0. See LICENSE.