diff --git a/README.md b/README.md index 19473a5..3d19471 100644 --- a/README.md +++ b/README.md @@ -23,6 +23,7 @@ Runs locally on Linux, macOS, and Windows, with first-class workstation guidance - [Quickstart](#quickstart) - [Why fusionAIze Gate](#why-fusionaize-gate) - [How It Works](#how-it-works) +- [Anthropic Bridge](#anthropic-bridge-optional) - [API Surface](#api-surface) - [How fusionAIze Gate Compares](#how-fusionaize-gate-compares) - [Deployment](#deployment) @@ -211,23 +212,55 @@ which -a faigate ## How It Works ```text -Client (OpenClaw, n8n, CLI, custom app) - | - v -http://127.0.0.1:8090/v1 - | - +--> policy rules - +--> static rules - +--> heuristic rules - +--> optional request hooks - +--> optional routing modes (auto / eco / premium / free / custom) - +--> optional client profile defaults - +--> optional LLM classifier - | - +--> provider selection and fallback - |- cloud APIs - |- proxy providers - `- local workers + fusionAIze Gate + + +---------------------+ +--------------------+ +--------------------+ + | Claude-native | | OpenAI-native | | Automation / CLI | + | clients | | clients | | clients | + | | | | | | + | Claude Code | | OpenClaw | | n8n | + | Claude Desktop | | opencode | | curl / scripts | + | Anthropic SDK tools | | OpenAI SDK apps | | custom apps | + +---------------------+ +--------------------+ +--------------------+ + \ | / + v v v + +-------------------------------------------------------------+ + | One local endpoint | + | | + | http://127.0.0.1:8090 | + | OpenAI-compatible + Anthropic-compatible bridge | + +-------------------------------------------------------------+ + | + v + +-------------------------------------------------------------+ + | Routing core - Chooses the best route for the job | + | | + | - quality / cost / speed / heuristics / policies | + | - client profiles / routing modes / hooks | + | - health / readiness / fallback | + +-------------------------------------------------------------+ + | + 
+------------------------+------------------------+ + | | | + v v v ++------------------------+ +------------------------+ +------------------------+ +| Direct providers | | Aggregators / mirrors | | Local workers / models | +| Anthropic | | Kilo | | Ollama | +| OpenAI | | BLACKBOX | | vLLM | +| Google | | OpenRouter | | LM Studio | +| DeepSeek | | | | LAN GPU workers | ++------------------------+ +------------------------+ +------------------------+ + + +----------------------------------------------------------------------+ + | Stable session continuity | + | | + | Keep one local endpoint across Claude-native, OpenAI-native, and | + | automation-driven workflows. When Anthropic quota, one provider | + | account, or one route path is exhausted, Gate can continue through | + | another healthy direct route, aggregator route, or local worker | + | without retooling clients. Hooks, health checks, readiness, and | + | fallback stay in one gateway core. | + +----------------------------------------------------------------------+ ``` Routing is layered on purpose: @@ -241,6 +274,58 @@ Routing is layered on purpose: For OpenClaw specifically, both one-agent and many-agent traffic can use the same endpoint. fusionAIze Gate can distinguish delegated traffic through request headers such as `x-openclaw-source` when they are present. +## Anthropic Bridge (Optional) + +fusionAIze Gate can also expose a small Anthropic-/Claude-compatible bridge surface for clients that speak `POST /v1/messages` instead of OpenAI chat completions. 
+ +The bridge stays intentionally narrow: + +- it validates and normalizes Anthropic-style requests +- it maps them into Gate's internal canonical request model +- the existing Gate core still owns hooks, policies, routing, health checks, and fallback +- responses are mapped back into Anthropic-compatible message envelopes + +That makes the bridge useful when a Claude-oriented client should keep one stable local endpoint while Gate decides whether the request should stay on a direct Anthropic route, move to an Anthropic-capable aggregator, or step sideways to a similar coding-capable route or local worker. + +Operationally, this helps in two common cases: + +- Anthropic subscription or account limits are exhausted, but you still want the session to continue through another route with similar coding or context characteristics. +- Anthropic-capable aggregator routes such as Kilo or BLACKBOX are available, but you want Gate health checks and fallback rules to decide whether they are actually usable. + +Do not assume every aggregator route escapes Anthropic limits. Some routes may still rely on a BYOK Anthropic key from the same account. Keep those paths probeable, degradable, and out of the top fallback position if they share the same exhausted quota domain. + +The same pattern also helps when the best fallback is not Anthropic at all. Gate can route the same Claude-oriented session toward a coding-capable OpenAI-, Gemini-, DeepSeek-, or local-worker lane when that is the healthiest path with acceptable context and tool fit. 
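The quota-domain caveat above can be sketched as a small ordering rule. This is an illustration only: the `Route` record, its `quota_domain` label, and `order_fallbacks` are hypothetical names and are not part of Gate's config or API. The idea is simply that routes sharing an exhausted quota domain drop behind routes that do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical route record; Gate's real provider model may differ."""

    name: str
    quota_domain: str  # assumed label for the account whose limits the route consumes
    healthy: bool = True


def order_fallbacks(routes: list[Route], exhausted_domains: set[str]) -> list[Route]:
    """Drop unhealthy routes, then move routes on an exhausted quota domain to the back."""
    usable = [route for route in routes if route.healthy]
    # sorted() is stable, so routes keep their configured order within each group.
    return sorted(usable, key=lambda route: route.quota_domain in exhausted_domains)


routes = [
    Route("anthropic-direct", quota_domain="anthropic-account-a"),
    Route("aggregator-byok", quota_domain="anthropic-account-a"),
    Route("local-worker", quota_domain="local"),
]
ordered = order_fallbacks(routes, exhausted_domains={"anthropic-account-a"})
print([route.name for route in ordered])
```

In this sketch a BYOK aggregator that shares the exhausted account is treated exactly like the direct route, which is the behavior the paragraph above recommends.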
+ +Minimal bridge config: + +```yaml +api_surfaces: + anthropic_messages: true + +anthropic_bridge: + enabled: true + allow_claude_code_hints: true + model_aliases: + claude-code: auto + claude-code-fast: eco + claude-code-premium: premium +``` + +Known v1 limits: + +- non-streaming only +- text content blocks only +- `count_tokens` is a local estimate, not provider-exact accounting +- the optional `claude-code-router` hook only adds routing hints; it is not the protocol bridge + +Local smoke test: + +```bash +./docs/examples/anthropic-bridge-smoke.sh +``` + +For a fuller operator view, see [docs/anthropic-bridge.md](./docs/anthropic-bridge.md) and [docs/API.md](./docs/API.md). + ## API Surface fusionAIze Gate keeps the primary surface compact and OpenAI-compatible. The full endpoint reference lives in [docs/API.md](./docs/API.md). @@ -250,6 +335,8 @@ fusionAIze Gate keeps the primary surface compact and OpenAI-compatible. The ful | `GET /health` | Service health, provider status, and capability coverage | | `GET /v1/models` | OpenAI-compatible model list | | `POST /v1/chat/completions` | OpenAI-compatible chat routing | +| `POST /v1/messages` | Optional Anthropic-/Claude-compatible bridge route | +| `POST /v1/messages/count_tokens` | Optional Anthropic-compatible token estimate | | `POST /v1/images/generations` | OpenAI-compatible image generation | | `POST /v1/images/edits` | OpenAI-compatible image editing | | `POST /api/route` | Chat routing dry-run with decision details | diff --git a/config.yaml b/config.yaml index 14440d0..b663d73 100644 --- a/config.yaml +++ b/config.yaml @@ -920,6 +920,15 @@ request_hooks: - profile-override-header - mode-override-header on_error: continue +api_surfaces: + anthropic_messages: false +anthropic_bridge: + enabled: false + allow_claude_code_hints: true + model_aliases: + claude-code: auto + claude-code-fast: eco + claude-code-premium: premium routing_modes: default: auto enabled: true diff --git a/docs/API.md b/docs/API.md 
index 9344d7e..f70d4d8 100644 --- a/docs/API.md +++ b/docs/API.md @@ -36,6 +36,85 @@ curl -fsS http://127.0.0.1:8090/v1/chat/completions \ }' ``` +## Optional Anthropic-Compatible Bridge + +The Anthropic bridge stays optional and v1 is intentionally narrow. It exists to let Claude-oriented clients keep one stable local endpoint while Gate still owns routing, health checks, fallback, and provider selection. + +Enable it with: + +```yaml +api_surfaces: + anthropic_messages: true + +anthropic_bridge: + enabled: true +``` + +### `POST /v1/messages` + +Routes Anthropic-/Claude-style message requests through the same internal Gate routing path used by the OpenAI-compatible surface. + +- validates a small v1 subset of Anthropic `messages` +- supports a simple `system` prompt +- supports text content blocks +- non-streaming only in v1 +- optional `anthropic_bridge.model_aliases` can map Claude-facing model ids onto Gate routing modes or provider ids + +```bash +curl -fsS http://127.0.0.1:8090/v1/messages \ + -H 'Content-Type: application/json' \ + -H 'anthropic-client: claude-code' \ + -d '{ + "model": "claude-code", + "system": "Prefer concise technical explanations.", + "messages": [ + {"role": "user", "content": "Summarize the current fallback path."} + ] + }' +``` + +Response shape is Anthropic-compatible and still carries the normal Gate response headers such as: + +- `X-faigate-Provider` +- `X-faigate-Profile` +- `X-faigate-Layer` +- `X-faigate-Rule` + +### `POST /v1/messages/count_tokens` + +Returns a minimal Anthropic-compatible token-count response for the same request structure as `/v1/messages`. + +v1 uses a deterministic local estimate instead of provider-exact token accounting. 
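As a rough illustration of what a deterministic, character-based estimate can look like, here is a minimal sketch. The four-characters-per-token ratio and the `estimate_input_tokens` helper are assumptions made for this example; the actual `estimated-char-v1` method may use a different formula, so these numbers will not necessarily match Gate's responses.

```python
import math
from typing import Any


def estimate_input_tokens(payload: dict[str, Any], chars_per_token: int = 4) -> int:
    """Estimate input tokens from the character count of system and message text."""
    parts: list[str] = []
    system = payload.get("system")
    if isinstance(system, str):
        parts.append(system)
    for message in payload.get("messages", []):
        content = message.get("content", "")
        if isinstance(content, str):
            parts.append(content)
        else:
            # Anthropic-style content may also be a list of typed blocks;
            # only text blocks are counted here.
            parts.extend(
                block.get("text", "") for block in content if block.get("type") == "text"
            )
    total_chars = sum(len(part) for part in parts)
    # Same input always yields the same estimate; never report zero tokens.
    return max(1, math.ceil(total_chars / chars_per_token))
```

Because the estimate depends only on the request text, repeated calls with the same payload return the same value, which is what makes an explicit "not exact" response header meaningful to clients.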
+ +```bash +curl -fsS http://127.0.0.1:8090/v1/messages/count_tokens \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "claude-code", + "messages": [ + {"role": "user", "content": "Count these tokens."} + ] + }' +``` + +Response body: + +```json +{"input_tokens": 11} +``` + +Response headers make the approximation explicit: + +- `X-faigate-Token-Count-Exact: false` +- `X-faigate-Token-Count-Method: estimated-char-v1` + +Known v1 bridge limits: + +- non-streaming only +- text content blocks only +- image or binary content blocks are rejected +- `count_tokens` is an estimate, not provider-exact accounting + ### `POST /v1/images/generations` Routes image-generation requests to providers with `capabilities.image_generation: true`. diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md index 4dea59d..59acb62 100644 --- a/docs/CONFIGURATION.md +++ b/docs/CONFIGURATION.md @@ -298,6 +298,69 @@ fusionAIze Gate supports two lightweight extension seams: - bounded pre-routing hint injection - can fail closed depending on `request_hooks.on_error` +For optional Claude-Code-/Anthropic-bridge routing refinement, you can load the +community hook shipped in this repo: + +```yaml +request_hooks: + enabled: true + community_hooks_dir: "./hooks/community" + hooks: + - claude-code-router +``` + +The hook stays optional and only adds routing hints for bridge traffic. It does +not perform protocol translation. By default it treats Anthropic bridge traffic +as `coding-default`, and it also understands bridge metadata such as +`claude_code_profile: premium` or `claude_code_profile: fast`. + +## Anthropic Bridge Surface + +The Anthropic bridge is an optional additional API surface inside Gate. It does +not replace the OpenAI-compatible surface and it does not add a second routing +stack. 
+ +Relevant config blocks: + +- `api_surfaces` +- `anthropic_bridge` +- optional `request_hooks` when you want Claude-Code-specific routing hints + +Minimal example: + +```yaml +api_surfaces: + anthropic_messages: true + +anthropic_bridge: + enabled: true + allow_claude_code_hints: true + model_aliases: + claude-code: auto + claude-code-fast: eco + claude-code-premium: premium +``` + +What this means: + +- `api_surfaces.anthropic_messages` + - exposes `POST /v1/messages` and `POST /v1/messages/count_tokens` +- `anthropic_bridge.enabled` + - enables request parsing, canonical mapping, and response remapping +- `anthropic_bridge.model_aliases` + - lets you keep stable Claude-facing model ids while Gate routes internally +- `anthropic_bridge.allow_claude_code_hints` + - preserves bridge metadata for optional request hooks such as `claude-code-router` + +Recommended operational pattern: + +- use stable logical aliases like `claude-code`, `claude-code-fast`, and `claude-code-premium` +- keep direct Anthropic routes, Anthropic-capable aggregators, and local workers all probeable +- be careful with aggregator routes that may still use a BYOK Anthropic key from the same quota domain +- prefer health checks and fallback ordering over assuming every Anthropic-shaped route is independent + +For the end-to-end flow and local smoke example, see [Anthropic Bridge](./anthropic-bridge.md). + Use the onboarding docs and starter examples when introducing a new client instead of hand-authoring these sections from scratch. ## Config Wizard diff --git a/docs/anthropic-bridge-plan.md b/docs/anthropic-bridge-plan.md new file mode 100644 index 0000000..1b3e101 --- /dev/null +++ b/docs/anthropic-bridge-plan.md @@ -0,0 +1,434 @@ +# Anthropic Bridge Plan + +## Goal + +Add an internal, optional Anthropic-/Claude-compatible bridge layer to +`faigate` without creating a separate repo or a sidecar gateway. 
+ +The bridge should: + +- add an Anthropic-compatible ingress surface +- normalize Anthropic Messages requests into one internal canonical request shape +- reuse the existing routing, policies, hooks, fallbacks, and provider execution +- translate selected responses back into Anthropic-compatible output + +The bridge should **not**: + +- create a second routing engine +- move provider selection logic out of the current core +- turn `faigate` into a generic protocol-conversion platform + +## Current Repo Shape + +Today the runtime is still intentionally flat: + +- `faigate/main.py` + - owns FastAPI route registration + - validates request bodies + - applies request hooks + - calls the router + - executes providers and fallback order +- `faigate/router.py` + - owns layered route selection via `Router.route(...)` + - uses policy, static, heuristic, hook, profile, optional LLM classify, fallback +- `faigate/providers.py` + - owns backend execution through `ProviderBackend.complete(...)` + - currently assumes OpenAI-style chat payloads plus the Google-specific path +- `faigate/hooks.py` + - owns request hook contracts and hook application +- `faigate/config.py` + - owns config normalization and defaulting + +That means the current OpenAI ingress path is effectively: + +1. HTTP request enters `POST /v1/chat/completions` in `faigate/main.py` +2. JSON body is validated and normalized enough for the OpenAI path +3. `_apply_request_hooks(...)` runs through `faigate/hooks.py` +4. `_resolve_route_preview(...)` prepares routing inputs +5. `Router.route(...)` in `faigate/router.py` selects a provider +6. `ProviderBackend.complete(...)` in `faigate/providers.py` sends the upstream request +7. `faigate/main.py` wraps the response and emits metrics/headers + +This is the seam the Anthropic bridge should reuse, not replace. + +## Existing Integration Points + +These are the main places where API routes, routing, providers, and hooks meet +today. 
+ +### API ingress and request handling + +- `faigate/main.py` + - `_read_json_body(...)` + - `_apply_request_hooks(...)` + - `_resolve_route_preview(...)` + - `chat_completions(...)` + +### Routing engine + +- `faigate/router.py` + - `Router.route(...)` + - `Router.route_capability_request(...)` + +### Provider execution + +- `faigate/providers.py` + - `ProviderBackend.complete(...)` + - `ProviderBackend._stream_response(...)` + +### Hook seam + +- `faigate/hooks.py` + - `RequestHookContext` + - `RequestHookResult` + - `apply_request_hooks(...)` + +## Recommended Minimal Internal Module Structure + +Keep the new structure deliberately small and close to the current repo style. + +Recommended first cut: + +```text +faigate/ + bridges/ + __init__.py + anthropic.py + api/ + __init__.py + anthropic.py +``` + +### Why this is the minimal good cut + +- `faigate/api/anthropic.py` + - contains the FastAPI-facing route handlers and payload validation helpers + - keeps Anthropic-specific HTTP semantics out of `main.py` +- `faigate/bridges/anthropic.py` + - contains the protocol mapping logic + - translates Anthropic request/response shapes to and from the canonical shape + +This is enough for v1. + +Do **not** start with a large tree such as: + +- `faigate/api/anthropic/routes.py` +- `faigate/api/anthropic/validation.py` +- `faigate/bridges/anthropic/mapper_in.py` +- `faigate/bridges/anthropic/mapper_out.py` +- `faigate/bridges/anthropic/errors.py` + +That decomposition may become useful later, but it is not necessary for the +first bridge slice and would add abstraction before we know where the real +complexity lands. + +## Recommended Internal Canonical Request Shape + +The bridge should not hand Anthropic payloads directly to the router or to +providers. Introduce one small canonical request/response mapping boundary. + +For v1, the canonical request can remain a plain Python dict instead of a new +Pydantic model layer. 
+ +Suggested v1 shape: + +```python +{ + "surface": "anthropic", + "client": "claude-code", + "model_requested": "claude-sonnet-4-5", + "messages": [...], # OpenAI-like normalized messages + "system": "...", # folded into messages if needed before provider call + "tools": [...], + "tool_choice": ..., + "max_tokens": 4096, + "temperature": 0.2, + "stream": False, + "metadata": { + "bridge": "anthropic", + "source_client": "claude-code", + "anthropic_model": "claude-sonnet-4-5", + }, +} +``` + +Important point: + +- this canonical shape should be **close enough to the existing OpenAI path** + that we can reuse the current router and provider execution with minimal glue + +## Bridge Behavior by Layer + +### Bridge responsibilities + +- validate incoming Anthropic request body +- normalize `system`, `messages`, `tools`, `tool_choice`, `max_tokens`, `stream` +- map model aliases into `model_requested` +- preserve enough client metadata for hooks and observability +- translate selected response fields back into Anthropic-compatible output + +### Core responsibilities that stay where they are + +- hooks +- client profile resolution +- route scoring and selection +- fallback order +- provider health and runtime penalties +- metrics and traces +- adaptive routing state + +### Hook role + +Hooks stay optional routing refinement, not protocol translation. + +A Claude-/Anthropic-specific hook may later: + +- detect Claude Code traffic +- prefer coding-capable routes +- prefer tool-capable routes +- set `routing_mode` +- add provider preferences + +But the hook should operate on already normalized request metadata, not parse +Anthropic wire protocol directly. 
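A hook that respects this boundary could look roughly like the sketch below. It reads only canonical fields (`surface`, `metadata.source_client`, `tools`) and never touches the Anthropic wire payload. The function name, the returned dict shape, and the `"tools"` capability label are assumptions for illustration; the real hook contract lives in `faigate/hooks.py` and may differ.

```python
from typing import Any


def claude_code_routing_hints(request: dict[str, Any]) -> dict[str, Any]:
    """Return routing hints derived only from normalized (canonical) request fields."""
    hints: dict[str, Any] = {}
    if request.get("surface") != "anthropic":
        return hints  # leave non-bridge traffic untouched
    metadata = request.get("metadata", {})
    if metadata.get("source_client") == "claude-code":
        hints["routing_mode"] = "auto"
    if request.get("tools"):
        # Hypothetical capability name; only set when the request declares tools.
        hints["require_capabilities"] = ["tools"]
    return hints
```

Keeping the hook a pure function of the canonical request means it can be tested without any HTTP layer and stays unaffected by later changes to the wire-level parsing.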
+ +## Suggested v1 API Scope + +### In scope + +- `POST /v1/messages` +- `POST /v1/messages/count_tokens` +- non-streaming text responses +- simple tool definitions +- model alias handling +- error mapping for the common client-visible cases + +### Explicitly out of scope for v1 + +- full Anthropic SSE streaming parity +- multimodal attachments +- advanced tool-use round-tripping +- every Anthropic-specific content block type +- every Claude Desktop / Claude Code nuance + +That narrower scope keeps the bridge realistic and testable. + +## Concrete Implementation Plan + +### Phase 1: extract a reusable chat execution seam + +Before the Anthropic route is added, create one small internal helper in +`faigate/main.py` for the already-existing OpenAI path. + +Goal: + +- avoid duplicating routing + provider execution logic between + `POST /v1/chat/completions` and `POST /v1/messages` + +Suggested helper shape: + +```python +async def _execute_chat_request( + *, + body: dict[str, Any], + headers: dict[str, str], + surface: str, +) -> Response: + ... +``` + +This helper should: + +- call hooks +- resolve the route +- execute fallback/provider completion +- emit metrics and headers + +Then: + +- existing OpenAI route becomes a thin wrapper around it +- Anthropic route can reuse it + +This is the most important enabling refactor. + +### Phase 2: add the bridge module + +Create: + +- `faigate/bridges/anthropic.py` + +Recommended contents: + +- `normalize_anthropic_messages_request(...)` +- `anthropic_to_openai_messages(...)` +- `anthropic_tools_to_openai(...)` +- `openai_response_to_anthropic(...)` +- `count_tokens_estimate_for_anthropic(...)` + +Keep these as plain functions first. 
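To make the "plain functions" idea concrete, here is a minimal sketch of two of the functions named above. The bodies are assumptions written for this plan, not the shipped implementation: they cover text blocks only, fold a string `system` prompt into the message list, and map OpenAI-style `finish_reason` and `usage` fields onto their Anthropic-style counterparts.

```python
from typing import Any, Optional


def anthropic_to_openai_messages(
    system: Optional[str], messages: list[dict[str, Any]]
) -> list[dict[str, str]]:
    """Fold the system prompt and text blocks into OpenAI-style chat messages."""
    converted: list[dict[str, str]] = []
    if system:
        converted.append({"role": "system", "content": system})
    for message in messages:
        content = message["content"]
        if isinstance(content, list):
            # v1 scope: keep text blocks only and join them with newlines.
            content = "\n".join(
                block["text"] for block in content if block.get("type") == "text"
            )
        converted.append({"role": message["role"], "content": content})
    return converted


def openai_response_to_anthropic(response: dict[str, Any], model: str) -> dict[str, Any]:
    """Wrap the first OpenAI-style choice in an Anthropic-style message envelope."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    # OpenAI reports truncation as "length"; Anthropic calls it "max_tokens".
    stop_reason = "max_tokens" if choice.get("finish_reason") == "length" else "end_turn"
    return {
        "id": response.get("id", ""),
        "type": "message",
        "role": "assistant",
        "model": model,
        "content": [{"type": "text", "text": choice["message"]["content"]}],
        "stop_reason": stop_reason,
        "usage": {
            "input_tokens": usage.get("prompt_tokens", 0),
            "output_tokens": usage.get("completion_tokens", 0),
        },
    }
```

Both functions take and return plain dicts, which matches the plan's preference for avoiding a new model layer until the real complexity is known.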
+ +### Phase 3: add Anthropic route module + +Create: + +- `faigate/api/anthropic.py` + +Recommended contents: + +- route registration helper, for example `register_anthropic_routes(app, ...)` +- request validation helpers for `/v1/messages` +- request validation helpers for `/v1/messages/count_tokens` + +This keeps Anthropic HTTP behavior out of the already-large `main.py` while +still remaining lightweight. + +### Phase 4: wire config and startup + +Add config gating in `faigate/config.py`. + +Suggested keys: + +```yaml +api_surfaces: + openai_compatible: true + anthropic_messages: false + +anthropic_bridge: + enabled: false + allow_clients: [] + model_aliases: {} +``` + +Guidance: + +- default `anthropic_messages` to `false` +- do not enable this bridge accidentally for existing users +- keep startup behavior unchanged when disabled + +### Phase 5: register the Anthropic routes + +In `faigate/main.py` startup or module initialization: + +- keep existing OpenAI routes untouched +- conditionally register Anthropic routes when enabled + +Do not duplicate app startup or provider loading. + +### Phase 6: add a Claude-specific routing hint hook + +Only after the bridge path works end to end, add an optional community or core +hook for Claude-Code-specific routing hints. 
+ +This hook may look at: + +- `surface=anthropic` +- `metadata.source_client` +- tool presence +- message shape indicating coding work + +And then set: + +- `routing_mode` +- `prefer_providers` +- `require_capabilities` + +## Testing Plan + +### Unit tests + +- Anthropic request normalization +- Anthropic response mapping +- tool mapping +- error mapping +- token-count mapping + +### API tests + +- `/v1/messages` enabled and disabled +- `/v1/messages/count_tokens` +- route preview through the same router path as OpenAI requests +- fallback still works + +### Regression tests + +- existing `/v1/chat/completions` remains unchanged +- existing hooks still apply +- existing provider execution path still behaves the same + +## Risks + +### 1. `main.py` is already large + +Risk: + +- adding Anthropic routes directly into `main.py` would make an existing hotspot + harder to maintain + +Mitigation: + +- extract only the minimal shared chat execution helper +- move Anthropic-specific HTTP handling into `faigate/api/anthropic.py` + +### 2. Internal canonical format can sprawl + +Risk: + +- trying to support every Anthropic content block and tool nuance in v1 will + create a second internal protocol model too early + +Mitigation: + +- keep canonical shape small and chat-focused +- treat advanced content blocks as later follow-up work + +### 3. Hook misuse + +Risk: + +- pushing bridge behavior into hooks would blur protocol translation and routing + +Mitigation: + +- keep protocol translation entirely in the bridge module +- use hooks only for post-normalization routing hints + +### 4. Streaming complexity + +Risk: + +- Anthropic-style streaming parity will widen the surface quickly + +Mitigation: + +- keep v1 non-streaming first +- add streaming only after request/response translation is stable + +## Open Questions + +- How much Anthropic tool-use parity is actually required for the first intended + Claude Code / Claude Desktop workflows? 
+- Should `/v1/messages/count_tokens` remain an estimate in v1, or should it be + backed by the same token heuristics already used elsewhere in Gate? +- Should the bridge expose Anthropic-specific response headers, or is body-level + compatibility enough for v1? +- Which Anthropic model aliases should be supported from day one, and should + those aliases resolve into lane shortcuts or literal upstream model names? + +## Assumptions + +- The first consumer is primarily Claude Code / Claude-compatible tooling, not + a broad external Anthropic ecosystem migration. +- Reusing the existing OpenAI-shaped provider execution path is acceptable for + v1 if the bridge normalizes requests carefully enough. +- We want an internal extension of `faigate`, not a separately marketed generic + Anthropic adapter. + +## Recommended Next Step + +Do one small implementation slice first: + +1. extract the shared chat execution helper from `faigate/main.py` +2. add `faigate/bridges/anthropic.py` with request/response mapping functions +3. add disabled-by-default `POST /v1/messages` + +That is the smallest meaningful end-to-end bridge slice. diff --git a/docs/anthropic-bridge.md b/docs/anthropic-bridge.md new file mode 100644 index 0000000..b7fa2c5 --- /dev/null +++ b/docs/anthropic-bridge.md @@ -0,0 +1,140 @@ +# Anthropic Bridge + +`fusionAIze Gate` can optionally expose an Anthropic-/Claude-compatible bridge surface on top of the existing gateway core. + +This is not a second gateway and not a sidecar. It is one extra ingress surface inside Gate. + +## Purpose + +Use the bridge when a client wants Anthropic-style `messages` requests, but you still want Gate to keep control over: + +- provider selection +- policy and hook handling +- health-aware fallback +- route scoring +- operator visibility + +That is especially useful for Claude-oriented workflows where a direct Anthropic account or subscription can hit daily or weekly limits. 
In that case, Gate can continue the session through: + +- another Anthropic-capable route with available balance +- a coding-capable non-Anthropic route with similar context or tool fit +- a local worker when you want to stay operational without depending on one cloud account + +## Architecture Overview + +The bridge stays intentionally thin: + +1. the Anthropic surface accepts a Claude-compatible request +2. the bridge validates and normalizes the request +3. the request is mapped into Gate's internal canonical model +4. the existing Gate core applies hooks, routing, health checks, and fallback +5. the result is mapped back into an Anthropic-compatible response + +The important split is: + +- bridge: protocol normalization only +- core gateway: routing and execution +- optional hook: Claude-Code-specific routing hints + +## Activation + +Minimal config: + +```yaml +api_surfaces: + anthropic_messages: true + +anthropic_bridge: + enabled: true + allow_claude_code_hints: true + model_aliases: + claude-code: auto + claude-code-fast: eco + claude-code-premium: premium + +request_hooks: + enabled: true + community_hooks_dir: "./hooks/community" + hooks: + - claude-code-router +``` + +What the keys do: + +- `api_surfaces.anthropic_messages` + - exposes the Anthropic-compatible HTTP surface +- `anthropic_bridge.enabled` + - enables bridge parsing and response mapping +- `anthropic_bridge.model_aliases` + - maps Claude-facing model ids to Gate routing modes or explicit provider ids +- `anthropic_bridge.allow_claude_code_hints` + - keeps Claude-Code-specific bridge metadata available for optional hooks + +## Model Alias Strategy + +Keep aliases stable and operational, not provider-specific by default. + +Good first aliases: + +- `claude-code -> auto` +- `claude-code-fast -> eco` +- `claude-code-premium -> premium` + +That keeps Claude-oriented clients on stable logical targets while Gate can still adapt the real route underneath. 
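The alias strategy can be sketched as a single lookup. This is a hedged illustration, not Gate's resolver: `resolve_model_alias` and the `ROUTING_MODES` set are hypothetical, and passing unknown model ids through unchanged is an assumption about desirable behavior rather than documented behavior.

```python
# Assumed set of routing mode names, taken from the modes mentioned in the README.
ROUTING_MODES = {"auto", "eco", "premium", "free"}


def resolve_model_alias(model: str, aliases: dict[str, str]) -> tuple[str, str]:
    """Map a Claude-facing model id to a routing mode or an explicit target."""
    target = aliases.get(model, model)  # unknown ids pass through unchanged
    kind = "routing_mode" if target in ROUTING_MODES else "explicit"
    return kind, target


aliases = {
    "claude-code": "auto",
    "claude-code-fast": "eco",
    "claude-code-premium": "premium",
}
print(resolve_model_alias("claude-code-fast", aliases))
```

The client keeps sending `claude-code-fast` regardless of which provider ends up serving the request, so route changes stay in Gate's config.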
+ +## Limits And Fallback Design + +If your main Claude usage comes from one Anthropic subscription or account, be careful with aggregator routes that still depend on a BYOK Anthropic key from the same quota domain. + +Recommended pattern: + +- keep direct Anthropic routes probeable and clearly named +- keep Anthropic-capable aggregators as explicit mirrors or secondary routes +- do not assume a premium Anthropic mirror is independent if it uses the same exhausted account +- use `faigate-doctor`, `faigate-provider-probe`, `/health`, and `/api/providers` to validate which routes are actually request-ready + +## Claude Code / Claude Desktop + +Client support for custom Anthropic endpoints varies by version and integration style. The safe pattern is: + +1. point the client at the local Gate base URL when it supports overriding the Anthropic API endpoint +2. use one stable bridge-facing model alias such as `claude-code` +3. keep route changes inside Gate, not inside the client config + +If a client cannot override the Anthropic base URL directly, use the OpenAI-compatible Gate surface instead or place a thin local wrapper in front of the client. 
+ +Practical operator guidance: + +- start with one alias such as `claude-code -> auto` +- add `claude-code-fast -> eco` and `claude-code-premium -> premium` only when the client can switch models cleanly +- keep Anthropic-capable aggregator routes out of the top priority slot if they may still consume the same Anthropic account quota through BYOK +- keep at least one non-Anthropic coding-capable route or local worker available for continuity + +Illustrative endpoint pattern for Claude-oriented clients that allow endpoint overrides: + +```text +Base URL: http://127.0.0.1:8090 +Messages path: /v1/messages +Model: claude-code +``` + +## Local Smoke Test + +Use the bundled example: + +```bash +./docs/examples/anthropic-bridge-smoke.sh +``` + +This covers: + +- `POST /v1/messages` +- `POST /v1/messages/count_tokens` + +## Known v1 Limits + +- non-streaming only +- text content blocks only +- `count_tokens` returns a deterministic local estimate +- image or binary content blocks are not bridged yet +- the optional `claude-code-router` hook only adds routing hints diff --git a/docs/examples/anthropic-bridge-smoke.sh b/docs/examples/anthropic-bridge-smoke.sh new file mode 100755 index 0000000..aa8ebfc --- /dev/null +++ b/docs/examples/anthropic-bridge-smoke.sh @@ -0,0 +1,33 @@ +#!/usr/bin/env bash +set -euo pipefail + +BASE_URL="${FAIGATE_BASE_URL:-http://127.0.0.1:8090}" +MODEL_ALIAS="${FAIGATE_ANTHROPIC_MODEL_ALIAS:-claude-code}" + +echo "==> Health" +curl -fsS "${BASE_URL}/health" +printf '\n\n' + +echo "==> Anthropic messages" +curl -fsS "${BASE_URL}/v1/messages" \ + -H 'Content-Type: application/json' \ + -H 'anthropic-client: claude-code' \ + -d "{ + \"model\": \"${MODEL_ALIAS}\", + \"system\": \"Respond as a concise operator helper.\", + \"messages\": [ + {\"role\": \"user\", \"content\": \"Summarize why one local gateway endpoint helps with Anthropic quota limits.\"} + ] + }" +printf '\n\n' + +echo "==> Anthropic count_tokens" +curl -i -fsS 
"${BASE_URL}/v1/messages/count_tokens" \ + -H 'Content-Type: application/json' \ + -d "{ + \"model\": \"${MODEL_ALIAS}\", + \"messages\": [ + {\"role\": \"user\", \"content\": \"Count the bridge tokens for this request.\"} + ] + }" +printf '\n' diff --git a/faigate/api/__init__.py b/faigate/api/__init__.py new file mode 100644 index 0000000..ea9de30 --- /dev/null +++ b/faigate/api/__init__.py @@ -0,0 +1 @@ +"""HTTP API surface modules for optional ingress adapters.""" diff --git a/faigate/api/anthropic/__init__.py b/faigate/api/anthropic/__init__.py new file mode 100644 index 0000000..57577d3 --- /dev/null +++ b/faigate/api/anthropic/__init__.py @@ -0,0 +1,29 @@ +"""Anthropic-compatible wire models and route builders.""" + +from .models import ( + AnthropicBridgeError, + AnthropicContentBlock, + AnthropicMessage, + AnthropicMessagesRequest, + AnthropicMessagesResponse, + AnthropicTokenCountRequest, + AnthropicTokenCountResponse, + AnthropicToolDefinition, + parse_anthropic_messages_request, + parse_anthropic_token_count_request, +) +from .routes import build_anthropic_router + +__all__ = [ + "AnthropicBridgeError", + "AnthropicContentBlock", + "AnthropicMessage", + "AnthropicMessagesRequest", + "AnthropicMessagesResponse", + "AnthropicTokenCountResponse", + "AnthropicTokenCountRequest", + "AnthropicToolDefinition", + "build_anthropic_router", + "parse_anthropic_token_count_request", + "parse_anthropic_messages_request", +] diff --git a/faigate/api/anthropic/models.py b/faigate/api/anthropic/models.py new file mode 100644 index 0000000..b03b858 --- /dev/null +++ b/faigate/api/anthropic/models.py @@ -0,0 +1,223 @@ +"""Anthropic-compatible wire models for the internal bridge layer.""" + +from __future__ import annotations + +from collections.abc import Mapping +from dataclasses import dataclass, field +from typing import Any + + +class AnthropicBridgeError(ValueError): + """Raised when an Anthropic wire payload cannot be normalized.""" + + +@dataclass(frozen=True) 
+class AnthropicContentBlock: + """One Anthropic content block.""" + + type: str + text: str | None = None + tool_use_id: str | None = None + name: str | None = None + input: dict[str, Any] = field(default_factory=dict) + metadata: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class AnthropicMessage: + """One Anthropic message turn.""" + + role: str + content: list[AnthropicContentBlock] = field(default_factory=list) + + +@dataclass(frozen=True) +class AnthropicToolDefinition: + """One Anthropic tool declaration.""" + + name: str + description: str = "" + input_schema: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class AnthropicMessagesRequest: + """Minimal request model for ``POST /v1/messages``.""" + + model: str + system: str | list[str] | None = None + messages: list[AnthropicMessage] = field(default_factory=list) + tools: list[AnthropicToolDefinition] = field(default_factory=list) + stream: bool = False + metadata: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class AnthropicTokenCountRequest: + """Minimal request model for ``POST /v1/messages/count_tokens``.""" + + model: str + system: str | list[str] | None = None + messages: list[AnthropicMessage] = field(default_factory=list) + tools: list[AnthropicToolDefinition] = field(default_factory=list) + metadata: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class AnthropicTokenCountResponse: + """Minimal Anthropic-compatible token-count response.""" + + input_tokens: int + + +@dataclass(frozen=True) +class AnthropicMessagesResponse: + """Minimal response model for the Anthropic bridge.""" + + id: str + type: str = "message" + role: str = "assistant" + model: str | None = None + content: list[AnthropicContentBlock] = field(default_factory=list) + stop_reason: str | None = None + stop_sequence: str | None = None + usage: dict[str, Any] = field(default_factory=dict) + metadata: dict[str, Any] = 
field(default_factory=dict) + + +def parse_anthropic_messages_request(payload: Mapping[str, Any]) -> AnthropicMessagesRequest: + """Parse the smallest Anthropic messages payload we need for bridge setup.""" + + if not isinstance(payload, Mapping): + raise AnthropicBridgeError("Anthropic messages payload must be a mapping") + + model = str(payload.get("model", "") or "").strip() + if not model: + raise AnthropicBridgeError("Anthropic messages payload requires a model") + + raw_system = payload.get("system") + system: str | list[str] | None + if raw_system is None: + system = None + elif isinstance(raw_system, str): + system = raw_system + elif isinstance(raw_system, list) and all(isinstance(item, str) for item in raw_system): + system = list(raw_system) + else: + raise AnthropicBridgeError("'system' must be a string, a list of strings, or null") + + raw_messages = payload.get("messages", []) + if not isinstance(raw_messages, list): + raise AnthropicBridgeError("'messages' must be a list") + messages = [_parse_message(item) for item in raw_messages] + + raw_tools = payload.get("tools", []) + if not isinstance(raw_tools, list): + raise AnthropicBridgeError("'tools' must be a list") + tools = [_parse_tool(item) for item in raw_tools] + + stream = payload.get("stream", False) + if not isinstance(stream, bool): + raise AnthropicBridgeError("'stream' must be a boolean") + + metadata = payload.get("metadata", {}) + if metadata is None: + metadata = {} + if not isinstance(metadata, Mapping): + raise AnthropicBridgeError("'metadata' must be a mapping") + + return AnthropicMessagesRequest( + model=model, + system=system, + messages=messages, + tools=tools, + stream=stream, + metadata=dict(metadata), + ) + + +def parse_anthropic_token_count_request(payload: Mapping[str, Any]) -> AnthropicTokenCountRequest: + """Parse the v1 count_tokens payload using the same basic request shape.""" + + request = parse_anthropic_messages_request(payload) + return AnthropicTokenCountRequest( + 
model=request.model, + system=request.system, + messages=request.messages, + tools=request.tools, + metadata=dict(request.metadata), + ) + + +def _parse_message(raw: Any) -> AnthropicMessage: + if not isinstance(raw, Mapping): + raise AnthropicBridgeError("Anthropic message entries must be mappings") + + role = str(raw.get("role", "") or "").strip() + if not role: + raise AnthropicBridgeError("Anthropic message entries require a role") + + return AnthropicMessage(role=role, content=_parse_content_blocks(raw.get("content", []))) + + +def _parse_content_blocks(raw: Any) -> list[AnthropicContentBlock]: + if isinstance(raw, str): + return [AnthropicContentBlock(type="text", text=raw)] + if not isinstance(raw, list): + raise AnthropicBridgeError("'content' must be a string or a list of blocks") + + blocks: list[AnthropicContentBlock] = [] + for item in raw: + if isinstance(item, str): + blocks.append(AnthropicContentBlock(type="text", text=item)) + continue + if not isinstance(item, Mapping): + raise AnthropicBridgeError("Anthropic content blocks must be strings or mappings") + + block_type = str(item.get("type", "") or "").strip() + if not block_type: + raise AnthropicBridgeError("Anthropic content blocks require a type") + + raw_input = item.get("input", {}) + if raw_input is None: + raw_input = {} + if not isinstance(raw_input, Mapping): + raise AnthropicBridgeError("Anthropic tool content block 'input' must be a mapping") + + block_metadata = { + key: value + for key, value in item.items() + if key not in {"type", "text", "id", "tool_use_id", "name", "input"} + } + blocks.append( + AnthropicContentBlock( + type=block_type, + text=item.get("text"), + tool_use_id=str(item.get("tool_use_id") or item.get("id") or "").strip() or None, + name=str(item.get("name", "") or "").strip() or None, + input=dict(raw_input), + metadata=block_metadata, + ) + ) + return blocks + + +def _parse_tool(raw: Any) -> AnthropicToolDefinition: + if not isinstance(raw, Mapping): + raise 
AnthropicBridgeError("Anthropic tool definitions must be mappings") + + name = str(raw.get("name", "") or "").strip() + if not name: + raise AnthropicBridgeError("Anthropic tool definitions require a name") + + input_schema = raw.get("input_schema", {}) + if input_schema is None: + input_schema = {} + if not isinstance(input_schema, Mapping): + raise AnthropicBridgeError("'input_schema' must be a mapping") + + return AnthropicToolDefinition( + name=name, + description=str(raw.get("description", "") or "").strip(), + input_schema=dict(input_schema), + ) diff --git a/faigate/api/anthropic/routes.py b/faigate/api/anthropic/routes.py new file mode 100644 index 0000000..5e78814 --- /dev/null +++ b/faigate/api/anthropic/routes.py @@ -0,0 +1,43 @@ +"""FastAPI route builders for the optional Anthropic bridge surface.""" + +from __future__ import annotations + +from dataclasses import asdict + +from fastapi import APIRouter, Request +from fastapi.responses import JSONResponse + +from ...bridges.anthropic import dispatch_anthropic_count_tokens, dispatch_anthropic_messages +from ...canonical import CanonicalChatExecutor + + +def build_anthropic_router(*, executor: CanonicalChatExecutor) -> APIRouter: + """Return a detached Anthropic-compatible router. + + The router is intentionally not mounted by default. This keeps the current + OpenAI-compatible runtime unchanged while making the future bridge ingress + explicit and testable. 
+ """ + + router = APIRouter(tags=["anthropic-bridge"]) + + @router.post("/v1/messages") + async def anthropic_messages(request: Request) -> JSONResponse: + payload = await request.json() + response = await dispatch_anthropic_messages( + payload=payload, + headers={key.lower(): value for key, value in request.headers.items()}, + executor=executor, + ) + return JSONResponse(asdict(response)) + + @router.post("/v1/messages/count_tokens") + async def anthropic_count_tokens(request: Request) -> JSONResponse: + payload = await request.json() + response, extra_headers = dispatch_anthropic_count_tokens( + payload=payload, + headers={key.lower(): value for key, value in request.headers.items()}, + ) + return JSONResponse(asdict(response), headers=extra_headers) + + return router diff --git a/faigate/bridges/__init__.py b/faigate/bridges/__init__.py new file mode 100644 index 0000000..99b3ec9 --- /dev/null +++ b/faigate/bridges/__init__.py @@ -0,0 +1 @@ +"""Internal protocol bridge modules.""" diff --git a/faigate/bridges/anthropic/__init__.py b/faigate/bridges/anthropic/__init__.py new file mode 100644 index 0000000..57bb9dd --- /dev/null +++ b/faigate/bridges/anthropic/__init__.py @@ -0,0 +1,21 @@ +"""Anthropic bridge helpers.""" + +from .adapter import ( + anthropic_count_tokens_request_to_canonical, + anthropic_request_to_canonical, + approximate_anthropic_input_tokens, + canonical_response_to_anthropic, + canonical_to_openai_body, + dispatch_anthropic_count_tokens, + dispatch_anthropic_messages, +) + +__all__ = [ + "anthropic_count_tokens_request_to_canonical", + "anthropic_request_to_canonical", + "approximate_anthropic_input_tokens", + "canonical_response_to_anthropic", + "canonical_to_openai_body", + "dispatch_anthropic_count_tokens", + "dispatch_anthropic_messages", +] diff --git a/faigate/bridges/anthropic/adapter.py b/faigate/bridges/anthropic/adapter.py new file mode 100644 index 0000000..be3b5af --- /dev/null +++ b/faigate/bridges/anthropic/adapter.py @@ -0,0 
+1,298 @@ +"""Anthropic <-> canonical model adapters. + +This module intentionally contains only normalization logic. Routing, policy +application, hook execution, and provider selection stay in the existing gate +core and are addressed through the ``CanonicalChatExecutor`` contract. +""" + +from __future__ import annotations + +import json +from typing import Any +from uuid import uuid4 + +from ...api.anthropic.models import ( + AnthropicBridgeError, + AnthropicContentBlock, + AnthropicMessage, + AnthropicMessagesRequest, + AnthropicMessagesResponse, + AnthropicTokenCountRequest, + AnthropicTokenCountResponse, + parse_anthropic_messages_request, + parse_anthropic_token_count_request, +) +from ...canonical import ( + CanonicalChatExecutor, + CanonicalChatRequest, + CanonicalChatResponse, + CanonicalMessage, + CanonicalResponseMessage, + CanonicalTool, +) + + +def anthropic_request_to_canonical( + request: AnthropicMessagesRequest, + *, + headers: dict[str, str] | None = None, +) -> CanonicalChatRequest: + """Map an Anthropic messages request to the internal gateway model.""" + + normalized_headers = {str(key): str(value) for key, value in (headers or {}).items()} + source = ( + normalized_headers.get("x-faigate-client") + or normalized_headers.get("anthropic-client") + or "claude-code" + ) + client = source + metadata = dict(request.metadata) + metadata.setdefault("source", source) + metadata.setdefault("bridge_surface", "anthropic-messages") + if normalized_headers: + metadata["bridge_headers"] = normalized_headers + + return CanonicalChatRequest( + client=client, + surface="anthropic-messages", + requested_model=request.model, + system=request.system, + messages=[_message_to_canonical(message) for message in request.messages], + tools=[ + CanonicalTool( + name=tool.name, + description=tool.description, + input_schema=dict(tool.input_schema), + ) + for tool in request.tools + ], + stream=request.stream, + metadata=metadata, + ) + + +def 
canonical_to_openai_body(request: CanonicalChatRequest) -> dict[str, Any]: + """Build the current internal handoff shape for the gateway core.""" + + return request.to_openai_body() + + +def anthropic_count_tokens_request_to_canonical( + request: AnthropicTokenCountRequest, + *, + headers: dict[str, str] | None = None, +) -> CanonicalChatRequest: + """Map a count_tokens request to the same canonical request model.""" + + return anthropic_request_to_canonical( + AnthropicMessagesRequest( + model=request.model, + system=request.system, + messages=request.messages, + tools=request.tools, + stream=False, + metadata=dict(request.metadata), + ), + headers=headers, + ) + + +def canonical_response_to_anthropic( + response: CanonicalChatResponse, + *, + requested_model: str, +) -> AnthropicMessagesResponse: + """Map the canonical response model back to Anthropic wire format.""" + + return AnthropicMessagesResponse( + id=response.response_id or f"msg_{uuid4().hex}", + model=response.model or requested_model, + content=_canonical_content_to_anthropic_blocks(response.message), + stop_reason=response.stop_reason or response.message.stop_reason, + usage=dict(response.usage), + metadata={ + **dict(response.metadata), + **({"provider": response.provider} if response.provider else {}), + }, + ) + + +async def dispatch_anthropic_messages( + *, + payload: dict[str, Any], + headers: dict[str, str], + executor: CanonicalChatExecutor, +) -> AnthropicMessagesResponse: + """Run the full bridge flow for one Anthropic messages request.""" + + wire_request = parse_anthropic_messages_request(payload) + canonical_request = anthropic_request_to_canonical(wire_request, headers=headers) + canonical_response = await executor.execute_canonical_chat(canonical_request) + return canonical_response_to_anthropic( + canonical_response, + requested_model=wire_request.model, + ) + + +def dispatch_anthropic_count_tokens( + *, + payload: dict[str, Any], + headers: dict[str, str], +) -> 
tuple[AnthropicTokenCountResponse, dict[str, str]]: + """Run the bridge flow for a local v1 token-count estimate. + + v1 deliberately favors a stable local estimate over provider-specific token + accounting. The response remains Anthropic-compatible while the headers make + the approximation explicit for operators and advanced clients. + """ + + wire_request = parse_anthropic_token_count_request(payload) + canonical_request = anthropic_count_tokens_request_to_canonical( + wire_request, + headers=headers, + ) + input_tokens, method = approximate_anthropic_input_tokens(canonical_request) + return ( + AnthropicTokenCountResponse(input_tokens=input_tokens), + { + "X-faigate-Token-Count-Exact": "false", + "X-faigate-Token-Count-Method": method, + }, + ) + + +def approximate_anthropic_input_tokens(request: CanonicalChatRequest) -> tuple[int, str]: + """Return a lightweight token estimate for Anthropic bridge requests. + + The gateway does not yet maintain provider-specific tokenizers or a stable + upstream counting path for every routed provider. For v1 we therefore use a + deterministic character-byte heuristic with small structural overheads. 
+ """ + + total = 3 + if isinstance(request.system, str): + total += 4 + _estimate_text_tokens(request.system) + elif isinstance(request.system, list): + for item in request.system: + if isinstance(item, str): + total += 4 + _estimate_text_tokens(item) + + for message in request.messages: + total += 4 + total += _estimate_text_tokens(message.role) + total += _estimate_message_content_tokens(message.content) + + for tool in request.tools: + total += 12 + total += _estimate_text_tokens(tool.name) + total += _estimate_text_tokens(tool.description) + total += _estimate_text_tokens( + json.dumps(tool.input_schema, sort_keys=True, separators=(",", ":")) + ) + + return max(total, 1), "estimated-char-v1" + + +def _message_to_canonical(message: AnthropicMessage) -> CanonicalMessage: + if any(block.type != "text" for block in message.content): + raise AnthropicBridgeError( + "Anthropic bridge v1 currently supports only text content blocks in messages" + ) + if len(message.content) == 1 and message.content[0].type == "text": + content: Any = message.content[0].text or "" + else: + content = [_anthropic_block_to_payload(block) for block in message.content] + return CanonicalMessage(role=message.role, content=content) + + +def _anthropic_block_to_payload(block: AnthropicContentBlock) -> dict[str, Any]: + payload: dict[str, Any] = {"type": block.type} + if block.text is not None: + payload["text"] = block.text + if block.tool_use_id: + payload["tool_use_id"] = block.tool_use_id + if block.name: + payload["name"] = block.name + if block.input: + payload["input"] = dict(block.input) + if block.metadata: + payload["metadata"] = dict(block.metadata) + return payload + + +def _estimate_message_content_tokens(content: Any) -> int: + if isinstance(content, str): + return _estimate_text_tokens(content) + if isinstance(content, list): + total = 0 + for item in content: + if isinstance(item, str): + total += _estimate_text_tokens(item) + elif isinstance(item, dict): + total += 
_estimate_text_tokens(json.dumps(item, sort_keys=True)) + else: + total += _estimate_text_tokens(str(item)) + return total + return _estimate_text_tokens(str(content or "")) + + +def _estimate_text_tokens(text: str) -> int: + cleaned = str(text or "") + if not cleaned: + return 0 + byte_count = len(cleaned.encode("utf-8")) + return max(1, (byte_count + 3) // 4) + + +def _canonical_content_to_anthropic_blocks( + message: CanonicalResponseMessage, +) -> list[AnthropicContentBlock]: + content = message.content + blocks: list[AnthropicContentBlock] + if isinstance(content, str): + blocks = [AnthropicContentBlock(type="text", text=content)] + elif isinstance(content, list): + blocks = [] + for item in content: + if isinstance(item, str): + blocks.append(AnthropicContentBlock(type="text", text=item)) + continue + if not isinstance(item, dict): + blocks.append(AnthropicContentBlock(type="text", text=str(item))) + continue + blocks.append( + AnthropicContentBlock( + type=str(item.get("type", "text") or "text"), + text=item.get("text"), + tool_use_id=str(item.get("tool_use_id", "") or "").strip() or None, + name=str(item.get("name", "") or "").strip() or None, + input=dict(item.get("input", {}) or {}), + metadata=dict(item.get("metadata", {}) or {}), + ) + ) + else: + blocks = [AnthropicContentBlock(type="text", text=str(content or ""))] + + for tool_call in message.tool_calls: + if not isinstance(tool_call, dict): + continue + function = tool_call.get("function", {}) or {} + raw_arguments = str(function.get("arguments", "") or "").strip() + parsed_arguments: dict[str, Any] + if raw_arguments: + try: + loaded = json.loads(raw_arguments) + parsed_arguments = loaded if isinstance(loaded, dict) else {"arguments": loaded} + except json.JSONDecodeError: + parsed_arguments = {"raw_arguments": raw_arguments} + else: + parsed_arguments = {} + blocks.append( + AnthropicContentBlock( + type="tool_use", + tool_use_id=str(tool_call.get("id", "") or "").strip() or None, + 
name=str(function.get("name", "") or "").strip() or None, + input=parsed_arguments, + ) + ) + return blocks diff --git a/faigate/canonical.py b/faigate/canonical.py new file mode 100644 index 0000000..1ad558b --- /dev/null +++ b/faigate/canonical.py @@ -0,0 +1,135 @@ +"""Canonical request/response models shared by protocol bridge layers. + +The gateway currently exposes an OpenAI-compatible ingress surface. Additional +surfaces such as Anthropic messages should normalize into one internal shape so +that routing, hooks, and provider execution remain centralized. +""" + +from __future__ import annotations + +from dataclasses import dataclass, field +from typing import Any, Protocol + + +@dataclass(frozen=True) +class CanonicalTool: + """One tool definition in the gateway-internal request model.""" + + name: str + description: str = "" + input_schema: dict[str, Any] = field(default_factory=dict) + metadata: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class CanonicalMessage: + """One normalized conversational turn. + + ``content`` intentionally stays flexible for the first bridge slice. The + existing routing path mainly reasons about message lists and roles, while + bridge adapters may still need to preserve provider-specific content blocks. + """ + + role: str + content: Any + name: str | None = None + tool_call_id: str | None = None + metadata: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class CanonicalChatRequest: + """Ingress-independent chat request passed into the gateway core.""" + + client: str + surface: str + requested_model: str + system: str | list[str] | None = None + messages: list[CanonicalMessage] = field(default_factory=list) + tools: list[CanonicalTool] = field(default_factory=list) + stream: bool = False + metadata: dict[str, Any] = field(default_factory=dict) + + def to_openai_body(self) -> dict[str, Any]: + """Build the existing OpenAI-compatible request shape. 
+ + This helper is the narrow handoff point for the first bridge iteration. + The rest of the runtime can keep using the established + ``/v1/chat/completions`` payload contract until a shared execution + helper is extracted. + """ + + messages: list[dict[str, Any]] = [] + if isinstance(self.system, str) and self.system.strip(): + messages.append({"role": "system", "content": self.system}) + elif isinstance(self.system, list): + for item in self.system: + if isinstance(item, str) and item.strip(): + messages.append({"role": "system", "content": item}) + + for message in self.messages: + payload: dict[str, Any] = { + "role": message.role, + "content": message.content, + } + if message.name: + payload["name"] = message.name + if message.tool_call_id: + payload["tool_call_id"] = message.tool_call_id + if message.metadata: + payload["metadata"] = dict(message.metadata) + messages.append(payload) + + body: dict[str, Any] = { + "model": self.requested_model, + "messages": messages, + "stream": self.stream, + } + if self.tools: + body["tools"] = [ + { + "type": "function", + "function": { + "name": tool.name, + "description": tool.description, + "parameters": dict(tool.input_schema), + }, + "metadata": dict(tool.metadata), + } + for tool in self.tools + ] + if self.metadata: + body["metadata"] = dict(self.metadata) + return body + + +@dataclass(frozen=True) +class CanonicalResponseMessage: + """Normalized assistant response returned from the gateway core.""" + + role: str = "assistant" + content: Any = "" + tool_calls: list[dict[str, Any]] = field(default_factory=list) + stop_reason: str | None = None + metadata: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class CanonicalChatResponse: + """Ingress-independent chat response.""" + + response_id: str | None = None + model: str | None = None + provider: str | None = None + message: CanonicalResponseMessage = field(default_factory=CanonicalResponseMessage) + stop_reason: str | None = None + usage: 
dict[str, Any] = field(default_factory=dict) + metadata: dict[str, Any] = field(default_factory=dict) + raw: dict[str, Any] = field(default_factory=dict) + + +class CanonicalChatExecutor(Protocol): + """Small execution contract for future bridge surfaces.""" + + async def execute_canonical_chat(self, request: CanonicalChatRequest) -> CanonicalChatResponse: + """Run one canonical chat request through the gateway core.""" diff --git a/faigate/config.py b/faigate/config.py index b0c4ac1..c32184e 100644 --- a/faigate/config.py +++ b/faigate/config.py @@ -1723,6 +1723,79 @@ def _normalize_provider_source_refresh(data: dict[str, Any]) -> dict[str, Any]: return normalized +def _normalize_api_surfaces(data: dict[str, Any]) -> dict[str, Any]: + """Validate top-level API-surface toggles. + + Anthropic bridge activation remains two-step on purpose: + the bridge logic must be enabled and the surface must be exposed. To avoid + breaking earlier bridge configs, the Anthropic surface defaults to the + bridge-enabled state when the operator does not set it explicitly. 
+ """ + + raw = data.get("api_surfaces") or {} + if not isinstance(raw, dict): + raise ConfigError("'api_surfaces' must be a mapping") + + openai_compatible = raw.get("openai_compatible", True) + if not isinstance(openai_compatible, bool): + raise ConfigError("'api_surfaces.openai_compatible' must be a boolean") + + anthropic_default = bool((data.get("anthropic_bridge") or {}).get("enabled", False)) + anthropic_messages = raw.get("anthropic_messages", anthropic_default) + if not isinstance(anthropic_messages, bool): + raise ConfigError("'api_surfaces.anthropic_messages' must be a boolean") + + normalized = dict(data) + normalized["api_surfaces"] = { + "openai_compatible": openai_compatible, + "anthropic_messages": anthropic_messages, + } + return normalized + + +def _normalize_anthropic_bridge(data: dict[str, Any]) -> dict[str, Any]: + """Validate the optional Anthropic-compatible bridge surface.""" + + raw = data.get("anthropic_bridge") or {} + if not isinstance(raw, dict): + raise ConfigError("'anthropic_bridge' must be a mapping") + + enabled = raw.get("enabled", False) + if not isinstance(enabled, bool): + raise ConfigError("'anthropic_bridge.enabled' must be a boolean") + + route_prefix = str(raw.get("route_prefix", "/v1") or "").strip() + if not route_prefix.startswith("/"): + raise ConfigError("'anthropic_bridge.route_prefix' must start with '/'") + + allow_claude_code_hints = raw.get("allow_claude_code_hints", True) + if not isinstance(allow_claude_code_hints, bool): + raise ConfigError("'anthropic_bridge.allow_claude_code_hints' must be a boolean") + + model_aliases = raw.get("model_aliases", {}) + if model_aliases is None: + model_aliases = {} + if not isinstance(model_aliases, dict): + raise ConfigError("'anthropic_bridge.model_aliases' must be a mapping") + + normalized_aliases: dict[str, str] = {} + for key, value in model_aliases.items(): + alias = str(key or "").strip() + target = str(value or "").strip() + if not alias or not target: + raise 
ConfigError("'anthropic_bridge.model_aliases' keys and values must be non-empty") + normalized_aliases[alias] = target + + normalized = dict(data) + normalized["anthropic_bridge"] = { + "enabled": enabled, + "route_prefix": route_prefix.rstrip("/") or "/v1", + "allow_claude_code_hints": allow_claude_code_hints, + "model_aliases": normalized_aliases, + } + return normalized + + class Config: """Holds the parsed and expanded configuration.""" @@ -1769,6 +1842,13 @@ def request_hooks(self) -> dict: {"enabled": False, "hooks": [], "on_error": "continue"}, ) + @property + def api_surfaces(self) -> dict: + return self._data.get( + "api_surfaces", + {"openai_compatible": True, "anthropic_messages": False}, + ) + @property def routing_modes(self) -> dict: return self._data.get( @@ -1885,6 +1965,18 @@ def provider_source_refresh(self) -> dict: }, ) + @property + def anthropic_bridge(self) -> dict: + return self._data.get( + "anthropic_bridge", + { + "enabled": False, + "route_prefix": "/v1", + "allow_claude_code_hints": True, + "model_aliases": {}, + }, + ) + def provider(self, name: str) -> dict | None: return self.providers.get(name) @@ -1935,17 +2027,21 @@ def load_config(path: str | Path | None = None) -> Config: raw = yaml.safe_load(f) expanded = _normalize_provider_source_refresh( - _normalize_provider_catalog_check( - _normalize_security( - _normalize_auto_update( - _normalize_update_check( - _normalize_request_hooks( - _validate_routing_mode_references( - _normalize_model_shortcuts( - _normalize_routing_modes( - _normalize_client_profiles( - _normalize_routing_policies( - _normalize_providers(_walk_expand(raw)) + _normalize_api_surfaces( + _normalize_anthropic_bridge( + _normalize_provider_catalog_check( + _normalize_security( + _normalize_auto_update( + _normalize_update_check( + _normalize_request_hooks( + _validate_routing_mode_references( + _normalize_model_shortcuts( + _normalize_routing_modes( + _normalize_client_profiles( + _normalize_routing_policies( + 
_normalize_providers(_walk_expand(raw)) + ) + ) ) ) ) diff --git a/faigate/main.py b/faigate/main.py index 730b741..0e16a1c 100644 --- a/faigate/main.py +++ b/faigate/main.py @@ -17,7 +17,9 @@ import time import uuid from base64 import b64encode +from collections.abc import AsyncIterator from contextlib import asynccontextmanager, suppress +from dataclasses import asdict, dataclass from hashlib import sha256 from typing import Any @@ -27,6 +29,13 @@ from . import __version__ from .adaptation import AdaptiveRouteState +from .api.anthropic.models import AnthropicBridgeError, parse_anthropic_messages_request +from .bridges.anthropic import ( + anthropic_request_to_canonical, + canonical_response_to_anthropic, + dispatch_anthropic_count_tokens, +) +from .canonical import CanonicalChatRequest, CanonicalChatResponse, CanonicalResponseMessage from .config import Config, load_config from .hooks import ( AppliedHooks, @@ -78,6 +87,31 @@ class PayloadTooLargeError(ValueError): """Raised when one request or upload exceeds configured size limits.""" +@dataclass +class _ChatExecutionSuccess: + """One successful internal chat execution.""" + + result: dict[str, Any] | AsyncIterator[bytes] + provider_name: str + client_profile: str + client_tag: str + decision: RoutingDecision + model_requested: str + resolved_mode: str | None + resolved_shortcut: str | None + hook_state: AppliedHooks + trace_id: str | None + stream: bool + + +@dataclass +class _ChatExecutionFailure: + """One structured chat execution failure.""" + + status_code: int + body: dict[str, Any] + + def _client_error_response(message: str, *, error_type: str, status_code: int) -> JSONResponse: """Return a client-facing JSON error without exposing internal exception details.""" return JSONResponse({"error": message, "type": error_type}, status_code=status_code) @@ -93,6 +127,21 @@ def _request_hook_error_response(exc: Exception) -> JSONResponse: ) +def _anthropic_error_response(message: str, *, error_type: str, 
status_code: int) -> JSONResponse: + """Return an Anthropic-compatible error envelope.""" + + return JSONResponse( + { + "type": "error", + "error": { + "type": error_type, + "message": message, + }, + }, + status_code=status_code, + ) + + def _invalid_request_response(message: str, *, exc: Exception | None = None) -> JSONResponse: """Return a sanitized invalid-request response.""" if exc is not None: @@ -247,6 +296,56 @@ def _collect_routing_headers(request: Request) -> dict[str, str]: } +def _collect_anthropic_bridge_headers(request: Request) -> dict[str, str]: + """Return routing headers plus bridge-specific client/source hints.""" + + headers = _collect_routing_headers(request) + max_chars = int((_config.security or {}).get("max_header_value_chars", 160)) + bridge_source = _sanitize_token( + request.headers.get("anthropic-client") + or request.headers.get("x-faigate-client") + or request.headers.get("x-claude-code-client") + or "claude-code", + default="claude-code", + max_chars=max_chars, + ) + headers.setdefault("x-faigate-client", bridge_source) + headers.setdefault("x-faigate-surface", "anthropic-messages") + return headers + + +def _anthropic_bridge_surface_enabled() -> bool: + """Return whether the Anthropic-compatible surface should be exposed.""" + + if "_config" not in globals(): + return False + bridge = _config.anthropic_bridge + surfaces = _config.api_surfaces + return bool(bridge.get("enabled", False) and surfaces.get("anthropic_messages", False)) + + +def _resolve_anthropic_requested_model(request: CanonicalChatRequest) -> CanonicalChatRequest: + """Apply configured Anthropic bridge aliases without changing wire parsing.""" + + alias_map = _config.anthropic_bridge.get("model_aliases", {}) + requested_model = str(alias_map.get(request.requested_model, request.requested_model)) + if requested_model == request.requested_model: + return request + metadata = dict(request.metadata) + metadata.setdefault("requested_model_original", 
request.requested_model) + metadata["requested_model_resolved"] = requested_model + return CanonicalChatRequest( + client=request.client, + surface=request.surface, + requested_model=requested_model, + system=request.system, + messages=list(request.messages), + tools=list(request.tools), + stream=request.stream, + metadata=metadata, + ) + + def _collect_operator_context(headers: dict[str, str]) -> tuple[str, str]: """Return operator action and client tag hints from request headers.""" max_chars = int((_config.security or {}).get("max_header_value_chars", 160)) @@ -1443,6 +1542,221 @@ async def _resolve_route_preview( ) +def _completion_extra_body(body: dict[str, Any]) -> dict[str, Any] | None: + """Return a narrow passthrough set for upstream completion calls.""" + + passthrough: dict[str, Any] = {} + for key in ("metadata", "response_format", "tool_choice", "user", "stop"): + value = body.get(key) + if value in (None, "", [], {}): + continue + passthrough[key] = value + return passthrough or None + + +async def _execute_chat_completion_body( + body: dict[str, Any], + headers: dict[str, str], +) -> _ChatExecutionSuccess | _ChatExecutionFailure: + """Run one normalized chat request through the existing provider path.""" + + ( + decision, + client_profile, + client_tag, + attempt_order, + model_requested, + resolved_mode, + resolved_shortcut, + hook_state, + effective_body, + ) = await _resolve_route_preview(body, headers) + messages = effective_body.get("messages", []) + stream = effective_body.get("stream", False) + temperature = effective_body.get("temperature") + max_tokens = effective_body.get("max_tokens") + tools = effective_body.get("tools") + extra_body = _completion_extra_body(effective_body) + + logger.info( + "Route: %s [%s/%s] %.1fms", + decision.provider_name, + decision.layer, + decision.rule_name, + decision.elapsed_ms, + ) + + errors: list[dict[str, Any]] = [] + + for provider_name in attempt_order: + provider = _providers.get(provider_name) + if not 
provider: + continue + if not provider.health.healthy and provider_name != attempt_order[0]: + continue + + try: + result = await provider.complete( + messages, + stream=stream, + temperature=temperature, + max_tokens=max_tokens, + tools=tools, + extra_body=extra_body, + ) + _adaptive_state.record_success( + provider_name, + latency_ms=(result.get("_faigate") or {}).get("latency_ms", 0) + if isinstance(result, dict) + else 0.0, + ) + + trace_id: str | None = None + if _config.metrics.get("enabled") and isinstance(result, dict): + usage = result.get("usage", {}) + cg = result.get("_faigate", {}) + pt = usage.get("prompt_tokens", 0) + ct = usage.get("completion_tokens", 0) + ch = cg.get("cache_hit_tokens", 0) + cm = cg.get("cache_miss_tokens", 0) + provider_cfg = _config.provider(provider_name) + pricing = provider_cfg.get("pricing", {}) if provider_cfg else {} + cost = calc_cost(pt, ct, pricing, cache_hit=ch, cache_miss=cm) + row_id = _metrics.log_request( + provider=provider_name, + model=provider.model, + layer=decision.layer, + rule_name=decision.rule_name, + prompt_tokens=pt, + completion_tokens=ct, + cache_hit=ch, + cache_miss=cm, + cost_usd=cost, + latency_ms=cg.get("latency_ms", 0), + requested_model=model_requested, + modality="chat", + client_profile=client_profile, + client_tag=client_tag, + decision_reason=decision.reason, + confidence=decision.confidence, + **_attempt_metric_fields( + decision, + provider_name, + attempt_order=attempt_order, + ), + attempt_order=attempt_order, + ) + trace_id = str(row_id) if row_id is not None else str(uuid.uuid4()) + + return _ChatExecutionSuccess( + result=result, + provider_name=provider_name, + client_profile=client_profile, + client_tag=client_tag, + decision=decision, + model_requested=model_requested, + resolved_mode=resolved_mode, + resolved_shortcut=resolved_shortcut, + hook_state=hook_state, + trace_id=trace_id, + stream=bool(stream), + ) + except ProviderError as e: + 
_adaptive_state.record_failure(provider_name, error=e.detail[:500]) + errors.append(_serialize_provider_attempt_error(provider_name, e)) + logger.warning("Provider %s failed: %s, trying next...", provider_name, e.detail[:200]) + if _config.metrics.get("enabled"): + _metrics.log_request( + provider=provider_name, + model=provider.model, + layer=decision.layer, + rule_name=decision.rule_name, + success=False, + error=e.detail[:500], + requested_model=model_requested, + modality="chat", + client_profile=client_profile, + client_tag=client_tag, + decision_reason=decision.reason, + confidence=decision.confidence, + **_attempt_metric_fields( + decision, + provider_name, + attempt_order=attempt_order, + ), + attempt_order=attempt_order, + ) + continue + + return _ChatExecutionFailure( + status_code=502, + body={ + "error": { + "message": "All providers failed", + "type": "provider_error", + "attempts": errors, + } + }, + ) + + +def _openai_result_to_canonical_response(result: dict[str, Any]) -> CanonicalChatResponse: + """Normalize one OpenAI-style completion response into the canonical model.""" + + choices = result.get("choices") or [] + first_choice = choices[0] if choices else {} + message = first_choice.get("message") or {} + usage = result.get("usage") or {} + provider_meta = result.get("_faigate") or {} + return CanonicalChatResponse( + response_id=str(result.get("id") or ""), + model=str(result.get("model") or ""), + provider=str(provider_meta.get("provider") or ""), + message=CanonicalResponseMessage( + role=str(message.get("role") or "assistant"), + content=message.get("content") or "", + tool_calls=list(message.get("tool_calls") or []), + stop_reason=str(first_choice.get("finish_reason") or "") or None, + ), + stop_reason=str(first_choice.get("finish_reason") or "") or None, + usage={ + "input_tokens": int(usage.get("prompt_tokens") or 0), + "output_tokens": int(usage.get("completion_tokens") or 0), + "total_tokens": int(usage.get("total_tokens") or 0), + }, + 
metadata={"raw_usage": dict(usage)}, + raw=dict(result), + ) + + +class _AnthropicBridgeExecutor: + """Route canonical Anthropic requests through the existing chat path.""" + + async def execute_canonical_chat(self, request: CanonicalChatRequest) -> CanonicalChatResponse: + alias_map = _config.anthropic_bridge.get("model_aliases", {}) + requested_model = str(alias_map.get(request.requested_model, request.requested_model)) + effective_request = CanonicalChatRequest( + client=request.client, + surface=request.surface, + requested_model=requested_model, + system=request.system, + messages=list(request.messages), + tools=list(request.tools), + stream=request.stream, + metadata=dict(request.metadata), + ) + body = effective_request.to_openai_body() + headers = dict(effective_request.metadata.get("bridge_headers") or {}) + execution = await _execute_chat_completion_body(body, headers) + if isinstance(execution, _ChatExecutionFailure): + raise AnthropicBridgeError( + execution.body.get("error", {}).get("message", "Anthropic bridge request failed") + ) + if execution.stream or not isinstance(execution.result, dict): + raise AnthropicBridgeError("Anthropic bridge v1 does not support streaming responses") + return _openai_result_to_canonical_response(execution.result) + + def _collect_image_request_fields(body: dict[str, Any]) -> dict[str, Any]: """Return a narrow, validated subset of image-generation request fields.""" fields: dict[str, Any] = {} @@ -2610,163 +2924,172 @@ async def chat_completions(request: Request): headers = _collect_routing_headers(request) try: - ( - decision, - client_profile, - client_tag, - attempt_order, - model_requested, - resolved_mode, - resolved_shortcut, - hook_state, - effective_body, - ) = await _resolve_route_preview(body, headers) + execution = await _execute_chat_completion_body(body, headers) except HookExecutionError as exc: return _request_hook_error_response(exc) - messages = effective_body.get("messages", []) - stream = 
effective_body.get("stream", False) - temperature = effective_body.get("temperature") - max_tokens = effective_body.get("max_tokens") - tools = effective_body.get("tools") - - logger.info( - "Route: %s [%s/%s] %.1fms", - decision.provider_name, - decision.layer, - decision.rule_name, - decision.elapsed_ms, - ) - # ── Execute with fallback ────────────────────────────── + if isinstance(execution, _ChatExecutionFailure): + return JSONResponse(execution.body, status_code=execution.status_code) + + if execution.stream: + return StreamingResponse( + execution.result, + media_type="text/event-stream", + headers={ + "X-faigate-Provider": execution.provider_name, + "X-faigate-Profile": execution.client_profile, + "X-faigate-Hooks": ",".join(execution.hook_state.applied_hooks), + "X-faigate-Hook-Errors": str(len(execution.hook_state.errors)), + "x-faigate-trace-id": execution.trace_id or str(uuid.uuid4()), + }, + ) - errors: list[dict[str, Any]] = [] + resp = JSONResponse(execution.result) + resp.headers["X-faigate-Provider"] = execution.provider_name + resp.headers["X-faigate-Profile"] = execution.client_profile + if execution.resolved_mode: + resp.headers["X-faigate-Mode"] = execution.resolved_mode + if execution.resolved_shortcut: + resp.headers["X-faigate-Shortcut"] = execution.resolved_shortcut + resp.headers["X-faigate-Layer"] = execution.decision.layer + resp.headers["X-faigate-Rule"] = execution.decision.rule_name + resp.headers["X-faigate-Hooks"] = ",".join(execution.hook_state.applied_hooks) + resp.headers["X-faigate-Hook-Errors"] = str(len(execution.hook_state.errors)) + resp.headers["x-faigate-trace-id"] = execution.trace_id or str(uuid.uuid4()) + return resp + + +@app.post("/v1/messages") +async def anthropic_messages(request: Request): + """Anthropic-compatible messages endpoint, kept intentionally small for v1.""" + + if not _anthropic_bridge_surface_enabled(): + return _anthropic_error_response( + "Anthropic bridge is disabled", + 
error_type="not_found_error", + status_code=404, + ) - for provider_name in attempt_order: - provider = _providers.get(provider_name) - if not provider: - continue - if not provider.health.healthy and provider_name != attempt_order[0]: - continue # Skip known-unhealthy fallbacks (but always try the chosen one) + try: + body = await _read_json_body(request, operation="Anthropic messages") + except PayloadTooLargeError: + return _anthropic_error_response( + "Anthropic messages request is too large", + error_type="request_too_large", + status_code=413, + ) + except ValueError: + return _anthropic_error_response( + "Invalid Anthropic messages request", + error_type="invalid_request_error", + status_code=400, + ) - try: - result = await provider.complete( - messages, - stream=stream, - temperature=temperature, - max_tokens=max_tokens, - tools=tools, + headers = _collect_anthropic_bridge_headers(request) + try: + wire_request = parse_anthropic_messages_request(body) + if wire_request.stream: + return _anthropic_error_response( + "Anthropic bridge v1 does not support streaming yet", + error_type="not_supported_error", + status_code=501, ) - _adaptive_state.record_success( - provider_name, - latency_ms=(result.get("_faigate") or {}).get("latency_ms", 0) - if isinstance(result, dict) - else 0.0, + canonical_request = anthropic_request_to_canonical(wire_request, headers=headers) + canonical_request = _resolve_anthropic_requested_model(canonical_request) + execution = await _execute_chat_completion_body(canonical_request.to_openai_body(), headers) + except AnthropicBridgeError as exc: + return _anthropic_error_response( + str(exc), + error_type="invalid_request_error", + status_code=400, + ) + except HookExecutionError as exc: + logger.warning("Anthropic bridge request hook processing failed: %s", exc) + return _anthropic_error_response( + "Request hook processing failed", + error_type="request_hook_error", + status_code=500, + ) + + if isinstance(execution, 
_ChatExecutionFailure): + message = str( + execution.body.get("error", {}).get("message", "Anthropic bridge request failed") + ) + error_type = str(execution.body.get("error", {}).get("type", "api_error")) + return _anthropic_error_response( + message, + error_type=error_type, + status_code=execution.status_code, + ) + + if execution.stream or not isinstance(execution.result, dict): + return _anthropic_error_response( + "Anthropic bridge v1 does not support streaming responses", + error_type="not_supported_error", + status_code=501, + ) + + canonical_response = _openai_result_to_canonical_response(execution.result) + response = JSONResponse( + asdict( + canonical_response_to_anthropic( + canonical_response, + requested_model=canonical_request.requested_model, ) + ) + ) + response.headers["X-faigate-Provider"] = execution.provider_name + response.headers["X-faigate-Profile"] = execution.client_profile + response.headers["X-faigate-Layer"] = execution.decision.layer + response.headers["X-faigate-Rule"] = execution.decision.rule_name + response.headers["X-faigate-Hooks"] = ",".join(execution.hook_state.applied_hooks) + response.headers["X-faigate-Hook-Errors"] = str(len(execution.hook_state.errors)) + response.headers["x-faigate-trace-id"] = execution.trace_id or str(uuid.uuid4()) + return response - # Log metrics with cost (cache-aware) - trace_id: str | None = None - if _config.metrics.get("enabled") and isinstance(result, dict): - usage = result.get("usage", {}) - cg = result.get("_faigate", {}) - pt = usage.get("prompt_tokens", 0) - ct = usage.get("completion_tokens", 0) - ch = cg.get("cache_hit_tokens", 0) - cm = cg.get("cache_miss_tokens", 0) - provider_cfg = _config.provider(provider_name) - pricing = provider_cfg.get("pricing", {}) if provider_cfg else {} - cost = calc_cost(pt, ct, pricing, cache_hit=ch, cache_miss=cm) - row_id = _metrics.log_request( - provider=provider_name, - model=provider.model, - layer=decision.layer, - rule_name=decision.rule_name, - 
prompt_tokens=pt, - completion_tokens=ct, - cache_hit=ch, - cache_miss=cm, - cost_usd=cost, - latency_ms=cg.get("latency_ms", 0), - requested_model=model_requested, - modality="chat", - client_profile=client_profile, - client_tag=client_tag, - decision_reason=decision.reason, - confidence=decision.confidence, - **_attempt_metric_fields( - decision, - provider_name, - attempt_order=attempt_order, - ), - attempt_order=attempt_order, - ) - trace_id = str(row_id) if row_id is not None else str(uuid.uuid4()) - if stream: - return StreamingResponse( - result, - media_type="text/event-stream", - headers={ - "X-faigate-Provider": provider_name, - "X-faigate-Profile": client_profile, - "X-faigate-Hooks": ",".join(hook_state.applied_hooks), - "X-faigate-Hook-Errors": str(len(hook_state.errors)), - "x-faigate-trace-id": trace_id or str(uuid.uuid4()), - }, - ) +@app.post("/v1/messages/count_tokens") +async def anthropic_count_tokens(request: Request): + """Anthropic-compatible token counting endpoint. - # Add routing info to response headers (non-streaming) - resp = JSONResponse(result) - resp.headers["X-faigate-Provider"] = provider_name - resp.headers["X-faigate-Profile"] = client_profile - if resolved_mode: - resp.headers["X-faigate-Mode"] = resolved_mode - if resolved_shortcut: - resp.headers["X-faigate-Shortcut"] = resolved_shortcut - resp.headers["X-faigate-Layer"] = decision.layer - resp.headers["X-faigate-Rule"] = decision.rule_name - resp.headers["X-faigate-Hooks"] = ",".join(hook_state.applied_hooks) - resp.headers["X-faigate-Hook-Errors"] = str(len(hook_state.errors)) - resp.headers["x-faigate-trace-id"] = trace_id or str(uuid.uuid4()) - return resp + v1 uses a deterministic local estimate. The JSON body stays compatible with + Anthropic's minimal response shape, while headers make the approximation + explicit. 
+ """ - except ProviderError as e: - _adaptive_state.record_failure(provider_name, error=e.detail[:500]) - errors.append(_serialize_provider_attempt_error(provider_name, e)) - logger.warning("Provider %s failed: %s, trying next...", provider_name, e.detail[:200]) - if _config.metrics.get("enabled"): - _metrics.log_request( - provider=provider_name, - model=provider.model, - layer=decision.layer, - rule_name=decision.rule_name, - success=False, - error=e.detail[:500], - requested_model=model_requested, - modality="chat", - client_profile=client_profile, - client_tag=client_tag, - decision_reason=decision.reason, - confidence=decision.confidence, - **_attempt_metric_fields( - decision, - provider_name, - attempt_order=attempt_order, - ), - attempt_order=attempt_order, - ) - continue + if not _anthropic_bridge_surface_enabled(): + return _anthropic_error_response( + "Anthropic bridge is disabled", + error_type="not_found_error", + status_code=404, + ) - # All providers failed - return JSONResponse( - { - "error": { - "message": "All providers failed", - "type": "provider_error", - "attempts": errors, - } - }, - status_code=502, - ) + try: + body = await _read_json_body(request, operation="Anthropic count_tokens") + except PayloadTooLargeError: + return _anthropic_error_response( + "Anthropic count_tokens request is too large", + error_type="request_too_large", + status_code=413, + ) + except ValueError: + return _anthropic_error_response( + "Invalid Anthropic count_tokens request", + error_type="invalid_request_error", + status_code=400, + ) + + headers = _collect_anthropic_bridge_headers(request) + try: + result, extra_headers = dispatch_anthropic_count_tokens(payload=body, headers=headers) + except AnthropicBridgeError as exc: + return _anthropic_error_response( + str(exc), + error_type="invalid_request_error", + status_code=400, + ) + + return JSONResponse(asdict(result), headers=extra_headers) # ── CLI entry point ──────────────────────────────────────────── diff 
--git a/hooks/community/claude_code_router.py b/hooks/community/claude_code_router.py new file mode 100644 index 0000000..1bd6a2f --- /dev/null +++ b/hooks/community/claude_code_router.py @@ -0,0 +1,113 @@ +"""Optional request hook for Claude Code / Anthropic bridge traffic. + +This hook is intentionally bounded: it only derives routing hints from bridge +metadata and headers. It does not perform protocol translation or provider +execution, and it stays optional via ``request_hooks.community_hooks_dir``. +""" + +from __future__ import annotations + +from typing import Any + +from faigate.hooks import RequestHookContext, RequestHookResult + +_DEFAULT_PROFILE = "coding-default" +_SUPPORTED_PROFILES = {"coding-default", "fast", "premium"} +_CLAUDE_SOURCES = {"claude", "claude-code", "anthropic"} + + +def register(register_hook, _register_provider=None) -> None: + """Register the Claude Code routing hint hook.""" + + register_hook("claude-code-router", _claude_code_router_hook) + + +def _claude_code_router_hook(context: RequestHookContext) -> RequestHookResult | None: + metadata = _metadata(context.body) + source = _normalized_source(metadata, context.headers) + surface = _normalized_surface(metadata, context.headers) + + if source not in _CLAUDE_SOURCES and surface != "anthropic-messages": + return None + + profile = _resolve_profile(metadata, context.headers) + routing_hints = _profile_hints(profile) + notes = [ + f"Claude Code router hook applied profile: {profile}", + f"Bridge source: {source or 'unknown'}", + ] + if surface: + notes.append(f"Bridge surface: {surface}") + + return RequestHookResult(routing_hints=routing_hints, notes=notes) + + +def _profile_hints(profile: str) -> dict[str, Any]: + base = { + "require_capabilities": ["tools"], + "capability_values": { + "tools": [True], + "long_context": [True], + }, + } + + if profile == "premium": + return { + **base, + "prefer_tiers": ["reasoning", "default"], + "routing_mode": "premium", + } + if profile == 
"fast": + return { + "require_capabilities": ["tools"], + "capability_values": {"tools": [True]}, + "prefer_tiers": ["default", "cheap"], + "routing_mode": "auto", + } + return { + **base, + "prefer_tiers": ["default", "reasoning"], + } + + +def _resolve_profile(metadata: dict[str, Any], headers: dict[str, str]) -> str: + for candidate in ( + metadata.get("claude_code_profile"), + metadata.get("routing_profile"), + headers.get("x-faigate-bridge-profile"), + ): + normalized = str(candidate or "").strip().lower() + if normalized in _SUPPORTED_PROFILES: + return normalized + return _DEFAULT_PROFILE + + +def _normalized_source(metadata: dict[str, Any], headers: dict[str, str]) -> str: + return ( + str( + metadata.get("source") + or headers.get("x-faigate-client") + or headers.get("anthropic-client") + or "" + ) + .strip() + .lower() + ) + + +def _normalized_surface(metadata: dict[str, Any], headers: dict[str, str]) -> str: + return ( + str( + metadata.get("bridge_surface") + or metadata.get("surface") + or headers.get("x-faigate-surface") + or "" + ) + .strip() + .lower() + ) + + +def _metadata(body: dict[str, Any]) -> dict[str, Any]: + value = body.get("metadata", {}) + return dict(value) if isinstance(value, dict) else {} diff --git a/tests/test_anthropic_api.py b/tests/test_anthropic_api.py new file mode 100644 index 0000000..ba7d2ed --- /dev/null +++ b/tests/test_anthropic_api.py @@ -0,0 +1,300 @@ +"""Functional tests for the live Anthropic-compatible messages endpoint.""" + +from __future__ import annotations + +import importlib +import sys +import types +from contextlib import asynccontextmanager +from pathlib import Path + +import pytest + +sys.modules.pop("httpx", None) +import httpx # noqa: E402 +from fastapi.testclient import TestClient # noqa: E402 + +sys.modules["httpx"] = httpx + +sys.modules.pop("faigate.providers", None) +sys.modules.pop("faigate.updates", None) +sys.modules.pop("faigate.main", None) + +import faigate.main as main_module # noqa: E402 
+from faigate.config import load_config # noqa: E402 +from faigate.router import Router # noqa: E402 + +importlib.reload(main_module) + + +def _write_config(tmp_path: Path, body: str) -> Path: + path = tmp_path / "config.yaml" + path.write_text(body) + return path + + +class _CapturingProviderStub: + def __init__(self): + self.name = "cloud-default" + self.model = "chat-model" + self.backend_type = "openai-compat" + self.contract = "generic" + self.tier = "default" + self.capabilities = {"chat": True, "local": False, "cloud": True, "network_zone": "public"} + self.context_window = 128000 + self.limits = {"max_input_tokens": 128000, "max_output_tokens": 4096} + self.cache = {"mode": "none", "read_discount": False} + self.image = {} + self.calls: list[dict[str, object]] = [] + self.health = types.SimpleNamespace( + healthy=True, + last_check=1.0, + avg_latency_ms=12.0, + last_error="", + to_dict=lambda: { + "name": "cloud-default", + "healthy": True, + "consecutive_failures": 0, + "avg_latency_ms": 12.0, + "last_error": "", + }, + ) + + async def close(self): + return None + + async def complete(self, messages, **kwargs): + self.calls.append({"messages": messages, **kwargs}) + return { + "id": "chatcmpl-bridge", + "object": "chat.completion", + "model": "chat-model", + "choices": [ + { + "index": 0, + "finish_reason": "stop", + "message": {"role": "assistant", "content": "anthropic ok"}, + } + ], + "usage": {"prompt_tokens": 12, "completion_tokens": 6, "total_tokens": 18}, + "_faigate": {"latency_ms": 12, "provider": "cloud-default"}, + } + + +class _MetricsStub: + def log_request(self, **_kwargs): + return None + + +@pytest.fixture +def anthropic_api_client(tmp_path, monkeypatch): + cfg = load_config( + _write_config( + tmp_path, + """ +server: + host: "127.0.0.1" + port: 8090 + log_level: "info" +security: + max_json_body_bytes: 4096 + max_upload_bytes: 8 + max_header_value_chars: 64 +providers: + cloud-default: + backend: openai-compat + base_url: 
"https://api.example.com/v1" + api_key: "secret" + model: "chat-model" +anthropic_bridge: + enabled: true + model_aliases: + claude-code-premium: premium +fallback_chain: + - cloud-default +metrics: + enabled: false +""", + ) + ) + provider = _CapturingProviderStub() + + @asynccontextmanager + async def _noop_lifespan(_app): + yield + + monkeypatch.setattr(main_module, "_config", cfg, raising=False) + monkeypatch.setattr(main_module, "_router", Router(cfg), raising=False) + monkeypatch.setattr(main_module, "_providers", {"cloud-default": provider}, raising=False) + monkeypatch.setattr(main_module, "_metrics", _MetricsStub(), raising=False) + monkeypatch.setattr(main_module.app.router, "lifespan_context", _noop_lifespan, raising=False) + + with TestClient(main_module.app) as client: + yield client, provider + + +def test_anthropic_messages_returns_bridge_response(anthropic_api_client): + client, provider = anthropic_api_client + + response = client.post( + "/v1/messages", + json={ + "model": "claude-sonnet", + "system": "Use markdown", + "messages": [{"role": "user", "content": "Summarize this"}], + }, + ) + + assert response.status_code == 200 + body = response.json() + assert body["type"] == "message" + assert body["content"][0]["type"] == "text" + assert body["content"][0]["text"] == "anthropic ok" + assert provider.calls[0]["extra_body"]["metadata"]["source"] == "claude-code" + assert provider.calls[0]["messages"][0] == {"role": "system", "content": "Use markdown"} + + +def test_anthropic_messages_applies_model_aliases(anthropic_api_client): + client, provider = anthropic_api_client + + response = client.post( + "/v1/messages", + json={ + "model": "claude-code-premium", + "messages": [ + { + "role": "user", + "content": "Route this like a premium coding request", + } + ], + }, + ) + + assert response.status_code == 200 + metadata = provider.calls[0]["extra_body"]["metadata"] + assert metadata["requested_model_original"] == "claude-code-premium" + assert 
metadata["requested_model_resolved"] == "premium" + + +def test_anthropic_messages_rejects_non_text_blocks(anthropic_api_client): + client, _provider = anthropic_api_client + + response = client.post( + "/v1/messages", + json={ + "model": "claude-sonnet", + "messages": [ + { + "role": "user", + "content": [{"type": "image", "source": {"type": "base64"}}], + } + ], + }, + ) + + assert response.status_code == 400 + body = response.json() + assert body["type"] == "error" + assert body["error"]["type"] == "invalid_request_error" + assert "text content blocks" in body["error"]["message"] + + +def test_anthropic_count_tokens_returns_estimate_with_headers(anthropic_api_client): + client, _provider = anthropic_api_client + + response = client.post( + "/v1/messages/count_tokens", + json={ + "model": "claude-sonnet", + "system": "Be concise", + "messages": [{"role": "user", "content": "Count these tokens please"}], + "tools": [ + { + "name": "lookup_doc", + "description": "Load one doc", + "input_schema": {"type": "object", "properties": {"id": {"type": "string"}}}, + } + ], + }, + ) + + assert response.status_code == 200 + body = response.json() + assert isinstance(body["input_tokens"], int) + assert body["input_tokens"] > 0 + assert response.headers["x-faigate-token-count-exact"] == "false" + assert response.headers["x-faigate-token-count-method"] == "estimated-char-v1" + + +def test_anthropic_count_tokens_rejects_invalid_payload(anthropic_api_client): + client, _provider = anthropic_api_client + + response = client.post( + "/v1/messages/count_tokens", + json={ + "model": "claude-sonnet", + "messages": "not-a-list", + }, + ) + + assert response.status_code == 400 + body = response.json() + assert body["type"] == "error" + assert body["error"]["type"] == "invalid_request_error" + assert "messages" in body["error"]["message"] + + +def test_anthropic_messages_can_be_disabled_by_surface_toggle(tmp_path, monkeypatch): + cfg = load_config( + _write_config( + tmp_path, + """ 
+server: + host: "127.0.0.1" + port: 8090 +providers: + cloud-default: + backend: openai-compat + base_url: "https://api.example.com/v1" + api_key: "secret" + model: "chat-model" +api_surfaces: + anthropic_messages: false +anthropic_bridge: + enabled: true +fallback_chain: + - cloud-default +metrics: + enabled: false +""", + ) + ) + + @asynccontextmanager + async def _noop_lifespan(_app): + yield + + monkeypatch.setattr(main_module, "_config", cfg, raising=False) + monkeypatch.setattr(main_module, "_router", Router(cfg), raising=False) + monkeypatch.setattr( + main_module, + "_providers", + {"cloud-default": _CapturingProviderStub()}, + raising=False, + ) + monkeypatch.setattr(main_module, "_metrics", _MetricsStub(), raising=False) + monkeypatch.setattr(main_module.app.router, "lifespan_context", _noop_lifespan, raising=False) + + with TestClient(main_module.app) as client: + response = client.post( + "/v1/messages", + json={ + "model": "claude-sonnet", + "messages": [{"role": "user", "content": "hello"}], + }, + ) + + assert response.status_code == 404 + body = response.json() + assert body["type"] == "error" + assert body["error"]["type"] == "not_found_error" diff --git a/tests/test_anthropic_bridge.py b/tests/test_anthropic_bridge.py new file mode 100644 index 0000000..08c324a --- /dev/null +++ b/tests/test_anthropic_bridge.py @@ -0,0 +1,125 @@ +"""Tests for the Anthropic bridge scaffolding.""" + +from dataclasses import asdict + +from fastapi import FastAPI +from fastapi.testclient import TestClient + +from faigate.api.anthropic import build_anthropic_router +from faigate.api.anthropic.models import ( + AnthropicMessagesRequest, + parse_anthropic_messages_request, +) +from faigate.bridges.anthropic import ( + anthropic_request_to_canonical, + canonical_response_to_anthropic, +) +from faigate.canonical import CanonicalChatResponse, CanonicalResponseMessage + + +class _FakeExecutor: + def __init__(self): + self.last_request = None + + async def 
execute_canonical_chat(self, request): + self.last_request = request + return CanonicalChatResponse( + response_id="msg_test", + model="anthropic/claude-sonnet-4.6", + provider="anthropic-direct", + message=CanonicalResponseMessage(content="bridge ok"), + stop_reason="end_turn", + usage={"input_tokens": 10, "output_tokens": 4}, + ) + + +def test_parse_anthropic_messages_request_accepts_string_content(): + request = parse_anthropic_messages_request( + { + "model": "claude-sonnet", + "system": "Stay concise", + "messages": [{"role": "user", "content": "hello"}], + "stream": False, + } + ) + + assert isinstance(request, AnthropicMessagesRequest) + assert request.messages[0].content[0].type == "text" + assert request.messages[0].content[0].text == "hello" + + +def test_anthropic_request_maps_to_canonical_and_openai_body(): + wire_request = parse_anthropic_messages_request( + { + "model": "claude-sonnet", + "system": "Use markdown", + "messages": [{"role": "user", "content": "Explain the diff"}], + "tools": [ + { + "name": "lookup_doc", + "description": "Load one doc", + "input_schema": {"type": "object"}, + } + ], + "metadata": {"source": "claude-code"}, + } + ) + + canonical = anthropic_request_to_canonical( + wire_request, + headers={"x-faigate-client": "claude-code"}, + ) + openai_body = canonical.to_openai_body() + + assert canonical.client == "claude-code" + assert canonical.surface == "anthropic-messages" + assert canonical.requested_model == "claude-sonnet" + assert canonical.tools[0].name == "lookup_doc" + assert openai_body["messages"][0] == {"role": "system", "content": "Use markdown"} + assert openai_body["messages"][1]["content"] == "Explain the diff" + + +def test_detached_router_runs_bridge_dispatch(): + executor = _FakeExecutor() + response = TestClient(_build_test_app(executor)).post( + "/v1/messages", + json={ + "model": "claude-opus", + "messages": [{"role": "user", "content": "hi"}], + }, + headers={"x-faigate-client": "claude-code"}, + ) + + assert 
response.status_code == 200 + payload = response.json() + assert payload["type"] == "message" + assert payload["content"][0]["text"] == "bridge ok" + assert executor.last_request is not None + assert executor.last_request.client == "claude-code" + assert executor.last_request.surface == "anthropic-messages" + + +def test_canonical_response_maps_back_to_anthropic_blocks(): + response = canonical_response_to_anthropic( + CanonicalChatResponse( + response_id="msg_back", + model="anthropic/claude-opus-4.6", + provider="kilo-opus", + message=CanonicalResponseMessage( + content=[{"type": "text", "text": "done"}], + ), + stop_reason="end_turn", + ), + requested_model="claude-opus", + ) + + payload = asdict(response) + assert payload["id"] == "msg_back" + assert payload["content"][0]["text"] == "done" + assert payload["metadata"]["provider"] == "kilo-opus" + + +def _build_test_app(executor: _FakeExecutor) -> FastAPI: + app = FastAPI() + app.include_router(build_anthropic_router(executor=executor)) + return app diff --git a/tests/test_config.py b/tests/test_config.py index 2c8aa88..b22069c 100644 --- a/tests/test_config.py +++ b/tests/test_config.py @@ -363,6 +363,103 @@ def test_provider_source_refresh_rejects_invalid_interval(tmp_path): load_config(path) +def test_anthropic_bridge_defaults_are_exposed(): + cfg = load_config(Path(__file__).parent.parent / "config.yaml") + assert cfg.api_surfaces == { + "openai_compatible": True, + "anthropic_messages": False, + } + assert cfg.anthropic_bridge == { + "enabled": False, + "route_prefix": "/v1", + "allow_claude_code_hints": True, + "model_aliases": { + "claude-code": "auto", + "claude-code-fast": "eco", + "claude-code-premium": "premium", + }, + } + + +def test_anthropic_bridge_rejects_invalid_route_prefix(tmp_path): + path = tmp_path / "config.yaml" + path.write_text( + """ +server: + host: "127.0.0.1" + port: 8090 +providers: + cloud-default: + backend: openai-compat + base_url: "https://api.example.com/v1" + api_key: 
"secret" + model: "chat-model" +anthropic_bridge: + enabled: true + route_prefix: v1 +fallback_chain: [] +metrics: + enabled: false +""" + ) + + with pytest.raises(ConfigError, match="anthropic_bridge.route_prefix"): + load_config(path) + + +def test_api_surfaces_follow_bridge_enablement_when_not_set_explicitly(tmp_path): + path = tmp_path / "config.yaml" + path.write_text( + """ +server: + host: "127.0.0.1" + port: 8090 +providers: + cloud-default: + backend: openai-compat + base_url: "https://api.example.com/v1" + api_key: "secret" + model: "chat-model" +anthropic_bridge: + enabled: true +fallback_chain: [] +metrics: + enabled: false +""" + ) + + cfg = load_config(path) + assert cfg.api_surfaces == { + "openai_compatible": True, + "anthropic_messages": True, + } + + +def test_api_surfaces_rejects_invalid_anthropic_messages_value(tmp_path): + path = tmp_path / "config.yaml" + path.write_text( + """ +server: + host: "127.0.0.1" + port: 8090 +providers: + cloud-default: + backend: openai-compat + base_url: "https://api.example.com/v1" + api_key: "secret" + model: "chat-model" +api_surfaces: + anthropic_messages: "yes" +fallback_chain: [] +metrics: + enabled: false +""" + ) + + with pytest.raises(ConfigError, match="api_surfaces.anthropic_messages"): + load_config(path) + + def test_security_rejects_invalid_limit_values(tmp_path): path = tmp_path / "config.yaml" path.write_text( diff --git a/tests/test_request_hooks.py b/tests/test_request_hooks.py index ecb1631..9019979 100644 --- a/tests/test_request_hooks.py +++ b/tests/test_request_hooks.py @@ -116,6 +116,33 @@ def test_rejects_unknown_request_hook_name(self, tmp_path): with pytest.raises(ConfigError, match="unknown hook"): load_config(path) + def test_accepts_claude_code_community_hook(self, tmp_path): + community_dir = Path(__file__).parent.parent / "hooks" / "community" + path = _write_config( + tmp_path, + f""" +server: + host: "127.0.0.1" + port: 8090 +providers: + default-provider: + backend: openai-compat 
+    base_url: "https://api.example.com/v1"
+    api_key: "secret"
+    model: "chat-model"
+request_hooks:
+  enabled: true
+  community_hooks_dir: "{community_dir}"
+  hooks: ["claude-code-router"]
+fallback_chain: []
+metrics:
+  enabled: false
+""",
+        )
+
+        cfg = load_config(path)
+        assert cfg.request_hooks["hooks"] == ["claude-code-router"]
+
 
 @pytest.fixture
 def hook_config(tmp_path, monkeypatch):
@@ -217,6 +244,202 @@ async def test_prefer_provider_header_selects_requested_provider(self, hook_conf
         assert hook_state.applied_hooks == ["prefer-provider-header"]
         assert effective_body["model"] == "auto"
 
+    @pytest.mark.asyncio
+    async def test_claude_code_router_prefers_coding_ready_routes(self, tmp_path, monkeypatch):
+        community_dir = Path(__file__).parent.parent / "hooks" / "community"
+        cfg = load_config(
+            _write_config(
+                tmp_path,
+                f"""
+server:
+  host: "127.0.0.1"
+  port: 8090
+providers:
+  cheap-basic:
+    backend: openai-compat
+    base_url: "https://api.example.com/v1"
+    api_key: "secret"
+    model: "cheap-chat"
+    tier: cheap
+  coding-default-provider:
+    backend: openai-compat
+    base_url: "https://api.example.com/v1"
+    api_key: "secret"
+    model: "coding-chat"
+    tier: default
+  premium-coder:
+    backend: openai-compat
+    base_url: "https://api.example.com/v1"
+    api_key: "secret"
+    model: "premium-chat"
+    tier: reasoning
+request_hooks:
+  enabled: true
+  community_hooks_dir: "{community_dir}"
+  hooks: ["claude-code-router"]
+fallback_chain:
+  - cheap-basic
+  - coding-default-provider
+  - premium-coder
+metrics:
+  enabled: false
+""",
+            )
+        )
+        monkeypatch.setattr(main_module, "_config", cfg, raising=False)
+        monkeypatch.setattr(main_module, "_router", Router(cfg), raising=False)
+        monkeypatch.setattr(
+            main_module,
+            "_providers",
+            {
+                "cheap-basic": _ProviderStub(
+                    name="cheap-basic",
+                    model="cheap-chat",
+                    tier="cheap",
+                    capabilities={"tools": False, "long_context": False, "cloud": True},
+                ),
+                "coding-default-provider": _ProviderStub(
+                    name="coding-default-provider",
+                    model="coding-chat",
+                    tier="default",
+                    capabilities={"tools": True, "long_context": True, "cloud": True},
+                ),
+                "premium-coder": _ProviderStub(
+                    name="premium-coder",
+                    model="premium-chat",
+                    tier="reasoning",
+                    capabilities={"tools": True, "long_context": True, "cloud": True},
+                ),
+            },
+            raising=False,
+        )
+
+        (
+            decision,
+            _profile_name,
+            _client_tag,
+            _attempt_order,
+            _model_requested,
+            _resolved_mode,
+            _resolved_shortcut,
+            hook_state,
+            _effective_body,
+        ) = await _resolve_route_preview(
+            {
+                "model": "auto",
+                "messages": [{"role": "user", "content": "help with this refactor"}],
+                "metadata": {
+                    "source": "claude-code",
+                    "bridge_surface": "anthropic-messages",
+                },
+            },
+            {"x-faigate-client": "claude-code", "x-faigate-surface": "anthropic-messages"},
+        )
+
+        assert hook_state.applied_hooks == ["claude-code-router"]
+        assert hook_state.routing_hints["require_capabilities"] == ["tools"]
+        assert hook_state.routing_hints["capability_values"]["tools"] == [True]
+        assert hook_state.routing_hints["capability_values"]["long_context"] == [True]
+        assert hook_state.routing_hints["prefer_tiers"] == ["default", "reasoning"]
+        assert any(
+            "Claude Code router hook applied profile: coding-default" in note
+            for note in hook_state.notes
+        )
+
+    @pytest.mark.asyncio
+    async def test_claude_code_router_supports_premium_profile(self, tmp_path, monkeypatch):
+        community_dir = Path(__file__).parent.parent / "hooks" / "community"
+        cfg = load_config(
+            _write_config(
+                tmp_path,
+                f"""
+server:
+  host: "127.0.0.1"
+  port: 8090
+providers:
+  coding-default-provider:
+    backend: openai-compat
+    base_url: "https://api.example.com/v1"
+    api_key: "secret"
+    model: "coding-chat"
+    tier: default
+  premium-coder:
+    backend: openai-compat
+    base_url: "https://api.example.com/v1"
+    api_key: "secret"
+    model: "premium-chat"
+    tier: reasoning
+request_hooks:
+  enabled: true
+  community_hooks_dir: "{community_dir}"
+  hooks: ["claude-code-router"]
+routing_modes:
+  enabled: true
+  default: auto
+  modes:
+    premium:
+      select:
+        prefer_tiers: ["reasoning"]
+fallback_chain:
+  - coding-default-provider
+  - premium-coder
+metrics:
+  enabled: false
+""",
+            )
+        )
+        monkeypatch.setattr(main_module, "_config", cfg, raising=False)
+        monkeypatch.setattr(main_module, "_router", Router(cfg), raising=False)
+        monkeypatch.setattr(
+            main_module,
+            "_providers",
+            {
+                "coding-default-provider": _ProviderStub(
+                    name="coding-default-provider",
+                    model="coding-chat",
+                    tier="default",
+                    capabilities={"tools": True, "long_context": True, "cloud": True},
+                ),
+                "premium-coder": _ProviderStub(
+                    name="premium-coder",
+                    model="premium-chat",
+                    tier="reasoning",
+                    capabilities={"tools": True, "long_context": True, "cloud": True},
+                ),
+            },
+            raising=False,
+        )
+
+        (
+            decision,
+            _profile_name,
+            _client_tag,
+            _attempt_order,
+            _model_requested,
+            _resolved_mode,
+            _resolved_shortcut,
+            hook_state,
+            _effective_body,
+        ) = await _resolve_route_preview(
+            {
+                "model": "auto",
+                "messages": [{"role": "user", "content": "deep architectural review"}],
+                "metadata": {
+                    "source": "claude-code",
+                    "bridge_surface": "anthropic-messages",
+                    "claude_code_profile": "premium",
+                },
+            },
+            {"x-faigate-client": "claude-code"},
+        )
+
+        assert hook_state.applied_hooks == ["claude-code-router"]
+        assert hook_state.routing_hints["routing_mode"] == "premium"
+        assert hook_state.routing_hints["prefer_tiers"] == ["reasoning", "default"]
+        assert any(
+            "Claude Code router hook applied profile: premium" in note for note in hook_state.notes
+        )
+
     @pytest.mark.asyncio
     async def test_locality_and_profile_hooks_shape_one_request(self, hook_config):
         (