[Feature]: Support provider endpoint override for dedicated endpoint routing #1404

@s-borah

Problem Statement

Context

SMG needs a way to route a request to an explicit provider endpoint supplied by a trusted upstream service.

This is useful when the upstream service has already resolved and authorized a dedicated endpoint, and SMG should proxy the request directly to that endpoint instead of relying on worker self-discovery.

Proposed Solution

When a request includes a trusted provider endpoint override:

  • Validate x-provider-endpoint.
  • Route the request to the explicit provider endpoint.
  • Strip internal routing headers before forwarding upstream:
    • x-provider-endpoint
    • x-model-provider
  • Keep existing worker selection behavior when no override is present.
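
The steps above could be sketched roughly as follows. This is an illustration only, not SMG's implementation: the issue does not define what "validate" means, so the rules here (HTTPS only, no embedded credentials, host must match an allowlist) are assumptions, as are the names `validate_endpoint`, `plan_route`, `InvalidOverride`, and the `example.com` hosts. How SMG decides that the upstream caller is trusted is out of scope for this sketch.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Internal routing headers named in the proposal; stripped on the override path.
INTERNAL_HEADERS = frozenset({"x-provider-endpoint", "x-model-provider"})


class InvalidOverride(ValueError):
    """Hypothetical error: x-provider-endpoint is present but not acceptable."""


def validate_endpoint(raw: str, allowed_host_suffixes: tuple[str, ...]) -> str:
    """Return the endpoint URL if it passes the (assumed) validation rules."""
    try:
        parts = urlsplit(raw.strip())
    except ValueError as exc:  # e.g. malformed IPv6 literal
        raise InvalidOverride(f"unparseable endpoint: {exc}") from exc
    if parts.scheme != "https":
        raise InvalidOverride("endpoint must use https")
    host = parts.hostname  # lowercased by urlsplit, None if missing
    if not host:
        raise InvalidOverride("endpoint must include a host")
    if parts.username or parts.password:
        raise InvalidOverride("endpoint must not embed credentials")
    # Assumed policy: exact match or a subdomain of an allowlisted suffix.
    if not any(host == s or host.endswith("." + s) for s in allowed_host_suffixes):
        raise InvalidOverride(f"host {host!r} is not allowlisted")
    return parts.geturl()


def plan_route(headers: dict[str, str], allowed_host_suffixes: tuple[str, ...]):
    """Return (override_url_or_None, headers_to_forward)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("x-provider-endpoint")
    if raw is None:
        # No override: leave the request untouched so existing worker
        # selection behaves exactly as before.
        return None, dict(headers)
    url = validate_endpoint(raw, allowed_host_suffixes)
    forwarded = {k: v for k, v in lowered.items() if k not in INTERNAL_HEADERS}
    return url, forwarded
```

For example, a request carrying `x-provider-endpoint: https://ep1.example.com/v1/chat` with `("example.com",)` as the allowlist would yield that URL plus the remaining headers with both internal headers removed, while an `http://` endpoint or a host such as `evilexample.com` would raise `InvalidOverride`.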

Alternatives Considered

Worker self-discovery is not always safe for dedicated endpoint routing. If the same model is available on shared workers and multiple dedicated endpoints, routing by model alone can send traffic to the wrong capacity pool.

Using an explicit endpoint override lets the upstream service make the routing and authorization decision, while SMG handles request proxying.

Feature Area

Routing & Load Balancing

Affected Component(s)

model-gateway

Use Case

  • Routing requests to private or custom inference endpoints, such as customer-created DACs in OCI.
  • Routing requests to Gemini Vertex AI endpoints, whose URLs contain path parameters and are therefore not fixed.

Priority to You

Critical / Blocking

Contribution

  • I am willing to contribute this feature (with guidance)
  • I am willing to help test this feature
  • I can provide more detailed requirements if needed

Additional Context

Acceptance criteria

  • Valid endpoint overrides route to the explicit endpoint.
  • Invalid or unsupported overrides are rejected.
  • Internal routing headers are stripped before forwarding.
  • No fallback to shared workers after override routing failure.
  • Existing behavior is unchanged when no override is provided.
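
The "no fallback" criterion is the easiest one to get wrong, so here is a minimal sketch of the intended control flow. All names (`dispatch`, `send`, `select_worker`, `UpstreamError`) are hypothetical and stand in for whatever SMG uses internally; the point is only that a failure on the override path propagates to the caller and the shared-worker selector is never consulted.

```python
from typing import Callable, Optional


class UpstreamError(Exception):
    """Hypothetical error raised when an upstream call fails."""


def dispatch(
    override_url: Optional[str],
    send: Callable[[str], str],
    select_worker: Callable[[], str],
) -> str:
    """Send the request to the override endpoint, or to a selected worker."""
    if override_url is not None:
        # Override path: any exception from send() propagates unchanged.
        # select_worker() is deliberately not called here, so a failed
        # dedicated endpoint can never spill onto shared capacity.
        return send(override_url)
    # No override: existing worker selection applies.
    return send(select_worker())
```

A test for this criterion could pass a `send` that raises `UpstreamError` for the dedicated endpoint and assert both that the error surfaces and that `select_worker` was not invoked.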

Labels: enhancement