Claude Messages API proxy for OpenAI-compatible backends.
It lets Claude Code or other Claude clients talk to a backend that exposes /v1/chat/completions, while preserving Claude-style streaming, tool calls, token counting, and model-friendly errors.
If you want the hosted Chutes setup, use the installer:
./install_claude_code.sh
That script:
- installs or updates Node.js and @anthropic-ai/claude-code
- fetches models from https://llm.chutes.ai/v1/models
- writes ~/.claude/settings.json
- points Claude Code at https://claude.chutes.ai
This is the easiest way to get Claude Code working against the hosted proxy.
BACKEND_URL=http://127.0.0.1:8000/v1/chat/completions \
HOST_PORT=8080 \
cargo run --release
The binary listens on HOST_PORT, which defaults to 8080.
For a simple local setup without TLS:
HOST_PORT=8181 \
CADDY_PORT=8180 \
CADDY_TLS=false \
BACKEND_URL=https://llm.chutes.ai/v1/chat/completions \
docker compose up -d --build
With Compose:
- the proxy listens directly on HOST_PORT
- Caddy fronts it on CADDY_PORT
- .env.example is a starting point, not a one-size-fits-all local config
If you are not using the installer, configure Claude Code the same way the bootstrap script does:
{
"model": "zai-org/GLM-4.5-Air",
"alwaysThinkingEnabled": true,
"env": {
"ANTHROPIC_BASE_URL": "http://127.0.0.1:8080",
"ANTHROPIC_AUTH_TOKEN": "cpk_your_backend_key",
"API_TIMEOUT_MS": "6000000"
}
}
Put that in ~/.claude/settings.json and adjust the URL, token, and model for your setup.
Important:
- the proxy forwards the client bearer token to the backend
- use a backend-compatible token such as cpk_*
- Anthropic OAuth tokens like sk-ant-* are rejected
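The token rules above can be illustrated with a small check. This is a minimal sketch, not the proxy's actual code: the function name `backend_token`, the error text, and the handling of the `Bearer ` prefix are all assumptions made for illustration. Only the `sk-ant-` rejection and the idea of forwarding the client token come from the notes above.

```rust
/// Hypothetical helper: extract the token that would be forwarded to the
/// backend, or explain why the request should be rejected.
fn backend_token(header_value: &str) -> Result<&str, &'static str> {
    // Accept either a raw token or an "Authorization: Bearer <token>" value.
    let token = header_value
        .strip_prefix("Bearer ")
        .unwrap_or(header_value)
        .trim();

    if token.starts_with("sk-ant-") {
        // Anthropic tokens mean nothing to an OpenAI-compatible backend.
        Err("Anthropic tokens (sk-ant-*) are not accepted; use a backend key")
    } else {
        // Anything else (for example cpk_*) is passed through unchanged.
        Ok(token)
    }
}
```

With this sketch, `backend_token("Bearer cpk_your_backend_key")` yields the bare key for forwarding, while any value starting with `sk-ant-` produces an error before the backend is contacted.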
- POST /v1/messages
- POST /v1/messages/count_tokens
- GET /health
- Claude-style SSE streaming and stream: false
- text, image, system, tool use, and tool result blocks
- multi-turn conversations
- model discovery and case correction when /v1/models is available
- thinking blocks when the backend exposes reasoning output
- only inline base64 document inputs are translated; URL/file-backed documents are degraded instead of forwarded
- prompt caching, citations, server tools, and audio are not implemented
- best results come from backends that expose both /v1/chat/completions and /v1/models
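To make "model discovery and case correction" concrete, here is one way such a lookup could work once the model IDs from /v1/models are in hand. It is a sketch under assumptions: the function name `correct_model_case` is hypothetical, and the real proxy may match models differently (for example with fuzzy matching or caching).

```rust
/// Hypothetical helper: return the backend's spelling of `requested`.
/// An exact match wins; otherwise fall back to an ASCII case-insensitive
/// match against the IDs the backend reported.
fn correct_model_case<'a>(requested: &str, available: &[&'a str]) -> Option<&'a str> {
    available
        .iter()
        .copied()
        .find(|id| *id == requested)
        .or_else(|| {
            available
                .iter()
                .copied()
                .find(|id| id.eq_ignore_ascii_case(requested))
        })
}
```

Given a list containing `zai-org/GLM-4.5-Air`, a request for `zai-org/glm-4.5-air` would resolve to the backend's spelling, and an unknown name would return `None` so the caller can decide how to report it.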
- BACKEND_URL: OpenAI-compatible chat completions endpoint. Default: http://127.0.0.1:8000/v1/chat/completions
- HOST_PORT: Listen port for the Rust proxy. Default: 8080
- BACKEND_TIMEOUT_SECS: Backend request timeout. Default: 600
- ENABLE_CIRCUIT_BREAKER: Optional backend failure protection. Default: false
- RUST_LOG: Log level. Default: info
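The settings and defaults listed above could be read along these lines. This is an illustrative sketch only: `ProxyConfig`, `from_lookup`, and `from_env` are hypothetical names, and falling back to the default when a value fails to parse is an assumption rather than documented behavior. RUST_LOG is left out because it is typically consumed by the logging setup rather than by application config.

```rust
use std::time::Duration;

/// Hypothetical config struct mirroring the documented variables.
#[derive(Debug, PartialEq)]
struct ProxyConfig {
    backend_url: String,
    host_port: u16,
    backend_timeout: Duration,
    enable_circuit_breaker: bool,
}

impl ProxyConfig {
    /// Build a config from any key lookup, which keeps the logic testable.
    /// Unset or unparsable values fall back to the documented defaults
    /// (the parse-error fallback is an assumption of this sketch).
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Self {
        ProxyConfig {
            backend_url: get("BACKEND_URL")
                .unwrap_or_else(|| "http://127.0.0.1:8000/v1/chat/completions".to_string()),
            host_port: get("HOST_PORT")
                .and_then(|v| v.parse().ok())
                .unwrap_or(8080),
            backend_timeout: Duration::from_secs(
                get("BACKEND_TIMEOUT_SECS")
                    .and_then(|v| v.parse().ok())
                    .unwrap_or(600),
            ),
            enable_circuit_breaker: get("ENABLE_CIRCUIT_BREAKER")
                .map(|v| v.eq_ignore_ascii_case("true"))
                .unwrap_or(false),
        }
    }

    /// Read the same keys from the process environment.
    fn from_env() -> Self {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Taking a lookup closure instead of reading the environment directly means the defaults can be checked without mutating process-wide state; `from_env` is the thin wrapper a binary would call at startup.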
cargo fmt --all --check
cargo clippy --all-targets -- -D warnings
cargo test --all-targets
./test.sh --ci --all
More test details live in tests/README.md.