Implementation Plan

Goal

Turn Gate's existing routing intelligence into the default daily-use behavior for Claude Code, opencode, openclaw, and similar clients, then expose that intelligence through a stronger standalone product surface.

Scope

This document turns the roadmap into the next concrete release lines.

It stays biased toward the biggest product levers:

Claude-native daily usability
routing trust and operator explainability
adaptive behavior under real pressure

It is not a parking lot for every possible feature.

Principles

keep fusionAIze Gate as the gateway, not a second platform
prefer one coherent release theme over many mini-features
do protocol hardening before bigger marketing claims
do routing explainability before stronger live adaptation
keep v2.x work behind clean product boundaries

Parity Definitions

Full Anthropic parity

Protocol parity for Gate's Anthropic-compatible surface.

Working definition:

messages request and response compatibility
SSE streaming compatibility
content-block compatibility
header/version/beta compatibility
compatible error envelopes and stop reasons
trustworthy token counting semantics

Full Claude Code parity

Daily-workflow parity for Claude Code against local Gate.

Working definition:

iterative coding sessions feel normal
streaming works
tool flows work
aliases and fallback do not constantly disrupt the session
operator routing behavior stays behind the gateway, not in the client

Full Claude Desktop parity

Daily-use parity for Claude Desktop against local Gate where endpoint override is supported.

Working definition:

stable local endpoint configuration
session behavior is acceptable for the feature set Claude Desktop actually uses
no recurring compatibility papercuts that make the setup feel experimental

Release Sequence

`v1.14.x` - coding auto modes and Claude daily-use trust

Primary outcome:

the cheapest capable route becomes the default for coding traffic instead of hardwiring Sonnet or Opus too early

Implementation slices:

map Claude-native model ids to routing intent instead of direct frontier providers
- claude-sonnet-* -> auto
- claude-opus-* -> premium
- claude-haiku-* -> eco
add clear coding routing modes
- coding-auto
- coding-fast
- coding-premium
align default client profiles
- claude
- opencode
- openclaw
- codex
harden Anthropic streaming parity
- SSE streaming parity for /v1/messages
- mid-stream failure handling
- stop-reason correctness
- stronger tool_use / tool_result continuity across longer sessions
validate against real workflows
- Claude Code
- opencode
- openclaw

Success bar:

Claude Code can be pointed at local Gate and used for normal iterative coding flows with acceptable behavior
streaming and tool-oriented workflows do not immediately fall off the happy path
coding clients enter through clear auto modes instead of muddled provider-first defaults

Guardrails:

do not hide premium escalations
do not bypass the scoring engine with provider aliases unless the operator asked for an explicit concrete provider
keep bridge routing inside the same core, not as a parallel router

`v1.15.x` - product surface and operator trust

Primary outcome:

Gate becomes legible as a standalone product, not just a strong core hidden behind config files
the dashboard answers operator jobs in a sane order instead of dumping one long admin page

Implementation slices:

overview dashboard
- request-readiness first
- provider health
- spend and token trend
- top alerts
- priority-next actions
providers surface
- route type
- lane family
- quota group
- billing mode
- readiness
clients surface
- cost by client
- latency by client
- profile recommendations
- premium-escalation hotspots
routes surface
- chosen lane
- chosen execution route
- same-lane fallback vs downgrade
- why selected / why not selected
analytics surface
- cost by client
- cost by stack
- cost by lane family
- routing posture distribution
- downgrade and fallback visibility
request-log and route drilldowns
- recent request stream
- trace-first debugging
- provider, client, and lane pivots
integrations and troubleshooting surface
- Claude Code
- opencode
- openclaw
- Codex / Cursor / Continue / automation setups
- common symptom-to-fix views

Design constraints:

keep the web surface read-heavy and operationally safe first
do not hide YAML, traces, or helper CLIs behind opaque UI abstractions
borrow the clarity of LLM AIRouter's docs and page structure, not its hosted-router assumptions
make Gate feel more intentional and polished than a default admin panel

Reference:

Dashboard IA

Immediate operator-trust slices after dashboard v1

These are the next high-signal follow-ups now that the first dashboard surface exists and exposes the real gaps.

They should be treated as short operator-trust slices, not as a second broad UI redesign.

Cluster A - cost truth and catalog freshness

Observed gap:

cost data is not yet trustworthy enough to explain spend posture per provider
several providers still show as untracked in the catalog layer

Challenge:

provider "price" is not one thing
direct-provider list pricing, aggregator marketplace pricing, free-tier offers, and effective billed usage can diverge
the product should not claim false precision where only a stale public price table exists

Recommendation:

introduce a versioned provider-metadata source of truth that can live as JSON in a repo rather than as a hosted database
design that metadata source as a shared fusionAIze boundary from day one so Gate is only the first consumer, not the only one
keep that scope explicitly limited to fusionAIze products rather than turning it into a generic cross-repo metadata service
keep cost provenance explicit per field:
- source price
- source timestamp
- freshness status
- source type (provider-docs, aggregator-offer, manual-review, observed-usage)
add a small refresh job outside Gate that updates that metadata on a fixed rhythm
pull that metadata into Gate through a conservative update path tied to normal catalog refresh and restart flows

Working shape:

versioned JSON documents
statically hostable if desired, but not dependent on a dedicated hosted database
reusable later for fusionAIze Grid, Lens, Fabric, and similar products
not intended for unrelated repositories outside the fusionAIze product line
reviewable in Git with clear operator override paths

Immediate slices:

catalog schema for price provenance and freshness
tracked assumptions for anthropic-haiku, anthropic-sonnet, and gemini-pro
dashboard surfacing for tracked, stale, untracked, and source age
post-update metadata refresh hook tied to Gate's normal update cadence

Cluster B - route and lane explainability

Observed gap:

operators can see routes and lanes, but not yet in a way that feels obvious at a glance

Challenge:

a graphic by itself will not fix this if the underlying route explanation is still too implicit
visual route maps should follow clearer route-decision semantics, not replace them

Recommendation:

make route choice legible in layers:
- requested intent
- chosen lane
- chosen execution route
- same-lane fallback candidates
- downgrade path if fallback crossed clusters
then add a light visual route map once the textual explanation is already operator-trustworthy

Immediate slices:

"why this lane / why this route" drilldown in Routes and Request Log
explicit same-lane fallback vs downgrade markers
lane-family summary cards in Overview and Routes
lightweight visual route map once route trace semantics are stable

Cluster C - intelligent command bar and shell parity

Observed gap:

the command bar filters well enough, but it is not yet intelligent
the dashboard does not yet move in lockstep with shell capabilities and config workflows

Challenge:

"intelligent" should not become another black box
if the dashboard can suggest scopes or edits, the same logic must stay inspectable and reproducible from CLI and YAML

Recommendation:

keep the command bar operator-first:
- saved scopes
- recommended pivots
- next useful drilldowns
build shell and dashboard against the same capability layer rather than inventing separate UX-only semantics
add config actions only through safe preview/diff/apply flows, not direct opaque mutation

Immediate slices:

shell-backed scope suggestions (high spend, fallback active, premium drift, untracked catalog)
deep links from dashboard panels to equivalent shell or API views
dashboard config actions with preview, diff, backup, and explicit apply
parity review so dashboard filters, shell helpers, and YAML names stay aligned

Recommended near-term order

cost truth and tracked-catalog freshness
route and lane explainability
command bar intelligence and shell/config parity

This order keeps the next product gains grounded in trust:

first make the cost and catalog story believable
then make route choice more legible
then add smarter operator controls on top of that clearer model

`v1.16.x` - adaptive orchestration trust

Primary outcome:

richer route decisions without turning Gate into a black box

Implementation slices:

benchmark and cost cluster refinement
live pressure adaptation under quota, latency, and failure
stronger operator explainability per routing decision
same-lane-first reactions before weaker-cluster degrade
richer traces that show attempted route order and downgrade reasons

Success bar:

operators can look at a route decision and understand it without reading source code
route switching under pressure is visible, understandable, and mostly unsurprising to operators

Concrete Next Actions

Immediate

finish the v1.14.x validation matrix
close remaining Claude-native daily-use gaps under real workflows
keep product-surface work operator-first and trace-friendly

`v1.14.x` validation matrix

Minimum:

POST /v1/messages non-streaming
POST /v1/messages streaming
tool_use / tool_result
count_tokens
version/beta header handling
one real Claude Code local workflow

Stretch:

Claude Desktop local override flow
route fallback under Anthropic quota pressure

Claude parity matrix

Use this matrix when deciding whether a release truly moved parity forward:

Capability	Anthropic parity	Claude Code parity	Claude Desktop parity
Non-streaming `messages`	Required	Required	Required
SSE streaming	Required	Required	Likely required
`tool_use` / `tool_result`	Required	Required	Maybe, depending on product flow
Header/version/beta tolerance	Required	Required	Required
Stable model aliasing	Helpful	Required	Required
Session continuity under fallback	Helpful	Required	Required
Exact token counting	Strongly preferred	Helpful	Helpful
Real client workflow validation	Not sufficient alone	Required	Required

Deferred Lines

Exact provider-side token counting

Recommendation:

defer until after v1.14.x
implement first for providers that expose reliable count endpoints or deterministic usage feedback
keep the current bridge estimate until a route-specific exact path exists

Virtual keys and per-key budgets

Recommendation:

likely next after the v1.16.x trust line if operator-scale controls become the top demand

OTEL trace-context forwarding

Recommendation:

can move earlier than other v2.x items
good candidate for a narrower cross-cutting observability release if demand is high

Open Questions

which Claude Code workflows are still meaningfully blocked after streaming lands?
which bridge gaps are protocol gaps versus route-selection gaps?
which exact providers should support exact token counting first?
do operators want OTEL before virtual keys, or vice versa?

Anti-Goals

no second gateway runtime for Anthropic traffic
no opaque routing magic that cannot be explained later
no multi-instance infrastructure inside Gate just to claim clustering
no semantic-cache detour before the simpler operator wins land

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementation Plan

Goal

Scope

Principles

Parity Definitions

Full Anthropic parity

Full Claude Code parity

Full Claude Desktop parity

Release Sequence

`v1.14.x` - coding auto modes and Claude daily-use trust

`v1.15.x` - product surface and operator trust

Immediate operator-trust slices after dashboard v1

Cluster A - cost truth and catalog freshness

Cluster B - route and lane explainability

Cluster C - intelligent command bar and shell parity

Recommended near-term order

`v1.16.x` - adaptive orchestration trust

Concrete Next Actions

Immediate

`v1.14.x` validation matrix

Claude parity matrix

Deferred Lines

Exact provider-side token counting

Virtual keys and per-key budgets

OTEL trace-context forwarding

Open Questions

Anti-Goals

FilesExpand file tree

IMPLEMENTATION-PLAN.md

Latest commit

History

IMPLEMENTATION-PLAN.md

File metadata and controls

Implementation Plan

Goal

Scope

Principles

Parity Definitions

Full Anthropic parity

Full Claude Code parity

Full Claude Desktop parity

Release Sequence

v1.14.x - coding auto modes and Claude daily-use trust

v1.15.x - product surface and operator trust

Immediate operator-trust slices after dashboard v1

Cluster A - cost truth and catalog freshness

Cluster B - route and lane explainability

Cluster C - intelligent command bar and shell parity

Recommended near-term order

v1.16.x - adaptive orchestration trust

Concrete Next Actions

Immediate

v1.14.x validation matrix

Claude parity matrix

Deferred Lines

Exact provider-side token counting

Virtual keys and per-key budgets

OTEL trace-context forwarding

Open Questions

Anti-Goals

`v1.14.x` - coding auto modes and Claude daily-use trust

`v1.15.x` - product surface and operator trust

`v1.16.x` - adaptive orchestration trust

`v1.14.x` validation matrix