Problem Statement
Context
SMG needs a way to route a request to an explicit provider endpoint supplied by a trusted upstream service.
This is useful when the upstream service has already resolved and authorized a dedicated endpoint, and SMG should proxy the request directly to that endpoint instead of relying on worker self-discovery.
Proposed Solution
When a request includes a trusted provider endpoint override:
- Validate x-provider-endpoint.
- Route the request to the explicit provider endpoint.
- Strip internal routing headers before forwarding upstream:
  - x-provider-endpoint
  - x-model-provider
- Keep existing worker selection behavior when no override is present.
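The steps above could be sketched roughly as follows. This is only an illustration, not the model-gateway implementation: the two header names come from this proposal, while `plan_request`, `validate_endpoint`, `InvalidOverride`, the `select_worker` callback, and the validation policy (absolute https URL, no embedded credentials) are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Header names are taken from the proposal; everything else is hypothetical.
ENDPOINT_HEADER = "x-provider-endpoint"
INTERNAL_HEADERS = {"x-provider-endpoint", "x-model-provider"}


class InvalidOverride(ValueError):
    """Raised when the override header is present but unusable."""


def validate_endpoint(value):
    # Assumed policy: absolute https URL with a host and no embedded credentials.
    try:
        parts = urlsplit(value.strip())
    except ValueError as err:
        raise InvalidOverride(str(err)) from err
    if parts.scheme != "https" or not parts.hostname:
        raise InvalidOverride("override must be an absolute https URL")
    if parts.username or parts.password:
        raise InvalidOverride("override must not embed credentials")
    return parts.geturl()


def plan_request(headers, select_worker):
    """Return (target_url, headers_to_forward) for one request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    override = lowered.get(ENDPOINT_HEADER)
    if override is None:
        # No override: existing worker selection and headers are left untouched.
        return select_worker(), dict(headers)
    target = validate_endpoint(override)
    # Override path: drop the internal routing headers before forwarding.
    forwarded = {k: v for k, v in lowered.items() if k not in INTERNAL_HEADERS}
    return target, forwarded
```

A request carrying a valid `x-provider-endpoint` is planned against that URL with both internal headers removed; a request without the header goes through `select_worker()` unchanged. Whether the real validation should also enforce an allowlist of hosts is left open here.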
Alternatives Considered
Worker self-discovery is not always safe for dedicated endpoint routing. If the same model is available on shared workers and multiple dedicated endpoints, routing by model alone can send traffic to the wrong capacity pool.
Using an explicit endpoint override lets the upstream service make the routing and authorization decision, while SMG handles request proxying.
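To make the ambiguity concrete, here is a small sketch with a made-up registry. The worker entries, pool names, and URLs are hypothetical; the point is only that a lookup keyed on model name cannot tell the pools apart.

```python
# Hypothetical registry: one model served by a shared pool and two dedicated endpoints.
WORKERS = [
    {"model": "model-a", "pool": "shared", "url": "https://shared.example"},
    {"model": "model-a", "pool": "dedicated-1", "url": "https://dac-1.example"},
    {"model": "model-a", "pool": "dedicated-2", "url": "https://dac-2.example"},
]


def candidates_by_model(model):
    # Routing by model alone returns every pool that serves the model.
    return [worker["url"] for worker in WORKERS if worker["model"] == model]
```

Here `candidates_by_model("model-a")` yields all three URLs, so any load-balancing policy applied on top may pick a pool the caller is not entitled to. An explicit override removes that choice from SMG.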
Feature Area
Routing & Load Balancing
Affected Component(s)
model-gateway
Use Case
- Routing requests to private or custom inference endpoints, such as customer-created DACs in OCI.
- Routing requests to Gemini Vertex AI endpoints, whose URLs contain path parameters and are therefore not fixed.
Priority to You
Critical / Blocking
Contribution
Additional Context
Acceptance criteria
- Valid endpoint overrides route to the explicit endpoint.
- Invalid or unsupported overrides are rejected.
- Internal routing headers are stripped before forwarding.
- No fallback to shared workers after override routing failure.
- Existing behavior is unchanged when no override is provided.
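The no-fallback criterion could look something like the sketch below. The names `forward`, `send`, and `UpstreamError` are hypothetical, and the loop over shared workers is only a stand-in for whatever the existing selection and retry behavior actually is.

```python
class UpstreamError(Exception):
    """Hypothetical error raised when a proxied call fails."""


def forward(send, override_url=None, shared_workers=()):
    if override_url is not None:
        # Override path: a failure propagates; shared workers are never tried.
        return send(override_url)
    # Stand-in for existing behavior: try shared workers in order.
    last_error = UpstreamError("no workers available")
    for url in shared_workers:
        try:
            return send(url)
        except UpstreamError as err:
            last_error = err
    raise last_error
```

With an override set, a failing `send` raises immediately even if `shared_workers` is non-empty, which matches the criterion that dedicated traffic must not spill onto shared capacity.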