feat(router): support provider endpoint url override#1403
Conversation
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
📝 WalkthroughWalkthroughAdds per-request upstream endpoint and provider-hint overrides for OpenAI-compatible /v1/responses via Changes
Sequence DiagramsequenceDiagram
participant Client
participant RouterManager
participant ResponsesRouter
participant ExternalEndpoint
participant WorkerSelector
Client->>RouterManager: POST /v1/responses (with headers)
RouterManager->>ResponsesRouter: select_router_for_responses(headers, model_id)
alt x-provider-endpoint present
ResponsesRouter->>ResponsesRouter: extract & validate endpoint, map provider hint
ResponsesRouter->>ResponsesRouter: build per-request WorkerSpec (no health checks)
ResponsesRouter->>ExternalEndpoint: Forward request (strip override headers)
ExternalEndpoint-->>ResponsesRouter: Response / SSE stream
ResponsesRouter-->>Client: Return response
else no override
ResponsesRouter->>WorkerSelector: select worker for model
WorkerSelector-->>ResponsesRouter: Selected worker URL
ResponsesRouter->>WorkerSelector: Forward request
WorkerSelector-->>ResponsesRouter: Response / SSE stream
ResponsesRouter-->>Client: Return response
end
alt invalid override
ResponsesRouter-->>Client: 400 invalid_request (no fallback)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a "Direct Endpoint Override" feature for OpenAI-compatible response endpoints, allowing requests to target specific provider URLs via x-provider-endpoint and x-model-provider headers. The implementation includes URL normalization, provider mapping, and the creation of request-scoped workers. Feedback suggests refining the URL path validation to remove redundant checks, dynamically generating error messages for supported providers to improve maintainability, and ensuring that the per-request worker creation does not lead to inefficient resource usage, such as redundant reqwest::Client instantiation.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ac06e7bd58
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/reference/api/responses.md`:
- Around line 93-97: Update the API docs to list the exact set of accepted
x-model-provider header values and their behavior: enumerate the closed set
accepted by provider_from_header (openai, gpt-oss, xai, google, anthropic), note
that "gpt-oss" is treated as OpenAI-compatible, emphasize that unknown values
produce a 400 error, and clarify that provider-native endpoint shapes may not
work via the override; reference provider_from_header in
model_gateway/src/routers/openai/responses/route.rs when adding this
clarification.
In `@model_gateway/src/routers/openai/responses/route.rs`:
- Around line 52-66: normalize_provider_endpoint currently allows userinfo and
private/loopback hosts which can leak credentials and enable SSRF; update it to
reject any URL that contains userinfo by checking parsed.username() and
parsed.password() (return None if username non-empty or password present) and
also reject hosts that are loopback, link-local, or private by parsing
parsed.host_str() as IpAddr (if parse succeeds, return None for
is_loopback/is_private/is_link_local) and additionally reject the literal
"localhost"; if you must allow private/loopback for trusted callers, gate that
behavior behind a config flag (e.g., allow_private_endpoints) and document the
trust assumption, while keeping references to target_url, worker_base_url, and
parsed in the checks so the changes are located inside
normalize_provider_endpoint.
- Around line 88-125: Currently endpoint_override_from_headers returns Ok(None)
when extract_provider_endpoint is missing, which causes any x-model-provider
value to be ignored; change endpoint_override_from_headers so it still calls
provider_from_header when extract_model_provider(headers) returns Some(_) to
validate the provider even if extract_provider_endpoint(headers) is None,
returning Err(EndpointOverrideError::InvalidProvider) on invalid provider values
(use provider_from_header to perform validation) and only skip endpoint parsing
when no endpoint header exists; reference provider_from_header,
extract_model_provider, extract_provider_endpoint, and
normalize_provider_endpoint to locate and update the logic.
- Around line 197-211: Extracted override headers (x-provider-endpoint,
x-model-provider) must be explicitly removed before any upstream request is
built so they cannot be accidentally forwarded; update the code paths around
endpoint_override handling and the request builders used by
build_request_scoped_worker, handle_non_streaming_response, and
handle_streaming_response to explicitly filter these header names from the
header map (or clone of the incoming headers) before applying
provider.apply_headers() or copying other headers upstream, ensuring the
override keys are never present in the forwarded request.
In `@model_gateway/src/routers/router_manager.rs`:
- Around line 320-340: In select_router_for_responses, when
extract_provider_endpoint(headers) is Some() and the lookup
self.routers.get(&router_ids::HTTP_OPENAI) returns None, log a warn! with clear
text like "Endpoint override requested but OpenAI-compatible router
(HTTP_OPENAI) is not registered; endpoint overrides are unsupported in
single-router mode" including model_id and the header presence, then return None
as before; reference select_router_for_responses, extract_provider_endpoint,
router_ids::HTTP_OPENAI and self.routers so reviewers can locate and, if desired
later, consider changing the function signature to return a Result to surface a
specific error to callers.
In `@model_gateway/tests/routing/test_openai_routing.rs`:
- Around line 439-462: The test
test_responses_endpoint_override_rejects_unknown_provider_without_fallback uses
the unlikely typo "gemiin"; change the invalid provider string passed to
endpoint_override_headers to "gemini" (or parameterize the test to include
realistic invalid values like "gemini", "openaii", "") so the test better
reflects real-world mistakes and validates OpenAIRouter::route_responses
behavior; update the call site that builds headers (endpoint_override_headers)
in that test to use the new value(s) and keep the same assertions on response
status and request counts.
- Around line 215-225: The test uses a fixed sleep in
start_capture_responses_server (sleep(Duration::from_millis(10))) to wait for
axum::serve to start, which is race-prone and unnecessary because the
TcpListener is bound synchronously before spawning; remove the sleep call and
its await so the function returns immediately after spawning the tokio task that
runs axum::serve(listener, app), ensuring the listener created via
TcpListener::bind handles incoming connections without the artificial delay.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9b88062b-358c-4a3c-a756-8d43882b9b5b
📒 Files selected for processing (5)
docs/reference/api/responses.mdmodel_gateway/src/routers/common/header_utils.rsmodel_gateway/src/routers/openai/responses/route.rsmodel_gateway/src/routers/router_manager.rsmodel_gateway/tests/routing/test_openai_routing.rs
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
model_gateway/src/routers/openai/responses/route.rs (1)
102-115:⚠️ Potential issue | 🟠 MajorBlock all non-global IP literals, not just private/loopback/link-local.
This predicate still accepts non-public targets like
0.0.0.0,::,100.64.0.0/10, and multicast/reserved ranges because it only checksis_loopback/is_private/is_link_local. That leaves a bypass around the “public URL” guarantee and can still steer requests at local or otherwise non-routable addresses. Please switch this to a positive globally-routable check, or extend the denylist and add regression tests for the remaining non-global ranges.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/openai/responses/route.rs` around lines 102 - 115, The current ip check (using ip_host.parse::<IpAddr>() and the blocked boolean computed from IpAddr::V4/V6 with is_loopback/is_private/is_link_local) still permits non-global addresses; change the predicate to a positive “globally routable” test by rejecting any ip where ip.is_unspecified() || ip.is_loopback() || ip.is_private() || ip.is_link_local() || ip.is_multicast() || (for IPv6) ip.is_unique_local() || (for IPv4) special-ranges like 100.64.0.0/10, and other reserved/documentation ranges; implement this either as a single helper (e.g. fn is_globally_routable(ip: &IpAddr) -> bool) and use it instead of the current blocked logic (replace use of blocked variable), or extend the denylist to include those ranges, and add regression tests for 0.0.0.0, ::, 100.64.0.0/10, multicast and reserved examples to validate the behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@model_gateway/src/routers/openai/responses/route.rs`:
- Around line 102-115: The current ip check (using ip_host.parse::<IpAddr>() and
the blocked boolean computed from IpAddr::V4/V6 with
is_loopback/is_private/is_link_local) still permits non-global addresses; change
the predicate to a positive “globally routable” test by rejecting any ip where
ip.is_unspecified() || ip.is_loopback() || ip.is_private() || ip.is_link_local()
|| ip.is_multicast() || (for IPv6) ip.is_unique_local() || (for IPv4)
special-ranges like 100.64.0.0/10, and other reserved/documentation ranges;
implement this either as a single helper (e.g. fn is_globally_routable(ip:
&IpAddr) -> bool) and use it instead of the current blocked logic (replace use
of blocked variable), or extend the denylist to include those ranges, and add
regression tests for 0.0.0.0, ::, 100.64.0.0/10, multicast and reserved examples
to validate the behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: da2c8bc8-3f07-47f8-aca5-565bf06476d6
📒 Files selected for processing (3)
model_gateway/src/routers/openai/responses/route.rsmodel_gateway/src/routers/router_manager.rsmodel_gateway/tests/routing/test_openai_routing.rs
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4f8ec395e7
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/openai/responses/route.rs`:
- Around line 242-260: The code currently calls provider_from_headers and
enforces/records errors unconditionally via requested_provider, which causes
non-override requests to change behavior; update the logic so
provider_from_headers is called and validated only when an endpoint override
header (x-provider-endpoint) is present, move the
debug/metrics/endpoint_override_error_response into that conditional branch, and
ensure any worker-selection call (the site that currently consumes
requested_provider) only receives the provider hint when the override header
exists; reference provider_from_headers, requested_provider, and
endpoint_override_error_response to locate and limit changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 53469f96-65f9-4b43-8ee4-221782455780
📒 Files selected for processing (3)
model_gateway/src/routers/openai/provider/provider_trait.rsmodel_gateway/src/routers/openai/responses/route.rsmodel_gateway/tests/routing/test_openai_routing.rs
Signed-off-by: Sudipta Borah <sudipta.borah@oracle.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0d71796231
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| let endpoint_override = if extract_provider_endpoint(headers).is_some() { | ||
| let requested_provider = match provider_from_headers(headers) { | ||
| Ok(provider) => provider, | ||
| Err(err) => { |
There was a problem hiding this comment.
Honor x-model-provider when choosing upstream auth headers
This override path parses x-model-provider here, but the actual outbound auth strategy is still chosen later from ApiProvider::from_url (see responses/non_streaming.rs and responses/streaming.rs), which only inspects the endpoint hostname. For endpoint overrides that use custom domains (the primary use case for dedicated endpoints), Anthropic/Gemini requests can be misclassified as Generic, so x-api-key / x-goog-api-key are not forwarded and upstream returns auth failures. This means non-OpenAI endpoint overrides break unless the URL happens to match the hardcoded provider host patterns.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/openai/responses/route.rs`:
- Around line 106-119: The literal-IP guard accepts IPv4-mapped IPv6 addresses
(e.g. [::ffff:127.0.0.1]) because the current branch for IpAddr::V6 only checks
Ipv6Addr properties; update the IpAddr::V6 arm to detect and extract embedded
IPv4 addresses and apply the same loopback/private/link-local checks to that
embedded IPv4 (e.g., call Ipv6Addr::to_ipv4() or otherwise detect the
::ffff:0:0/96 mapping and test the resulting Ipv4Addr), and only if there is no
mapped IPv4 fall back to the existing
ipv6.is_loopback()/is_unique_local()/is_unicast_link_local() checks; also add
regression tests for "[::ffff:127.0.0.1]" and "[::ffff:10.0.0.1]" alongside the
existing normalization tests referenced in the diff (ip_host, parse::<IpAddr>(),
and the match on IpAddr::V6).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 399b5111-03d9-48ad-aaa1-5d8ca4390e9e
📒 Files selected for processing (1)
model_gateway/src/routers/openai/responses/route.rs
| let ip_host = host | ||
| .strip_prefix('[') | ||
| .and_then(|host| host.strip_suffix(']')) | ||
| .unwrap_or(host); | ||
| if let Ok(ip) = ip_host.parse::<IpAddr>() { | ||
| let blocked = match ip { | ||
| IpAddr::V4(ip) => ip.is_loopback() || ip.is_private() || ip.is_link_local(), | ||
| IpAddr::V6(ip) => { | ||
| ip.is_loopback() || ip.is_unique_local() || ip.is_unicast_link_local() | ||
| } | ||
| }; | ||
| if blocked { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
Block IPv4-mapped IPv6 loopback/private hosts.
The literal-IP guard still accepts endpoints like http://[::ffff:127.0.0.1]/v1/responses and http://[::ffff:10.0.0.1]/v1/responses. Those parse as Ipv6Addr, so the current is_loopback / is_unique_local / is_unicast_link_local checks miss the embedded private IPv4 and the override SSRF guard can be bypassed.
🔒 Suggested hardening
if let Ok(ip) = ip_host.parse::<IpAddr>() {
let blocked = match ip {
IpAddr::V4(ip) => ip.is_loopback() || ip.is_private() || ip.is_link_local(),
IpAddr::V6(ip) => {
- ip.is_loopback() || ip.is_unique_local() || ip.is_unicast_link_local()
+ ip.is_loopback()
+ || ip.is_unique_local()
+ || ip.is_unicast_link_local()
+ || ip.to_ipv4_mapped()
+ .is_some_and(|ip| ip.is_loopback() || ip.is_private() || ip.is_link_local())
}
};
if blocked {
return None;
}Add regression cases for [::ffff:127.0.0.1] and [::ffff:10.0.0.1] alongside the existing normalization tests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/openai/responses/route.rs` around lines 106 - 119,
The literal-IP guard accepts IPv4-mapped IPv6 addresses (e.g.
[::ffff:127.0.0.1]) because the current branch for IpAddr::V6 only checks
Ipv6Addr properties; update the IpAddr::V6 arm to detect and extract embedded
IPv4 addresses and apply the same loopback/private/link-local checks to that
embedded IPv4 (e.g., call Ipv6Addr::to_ipv4() or otherwise detect the
::ffff:0:0/96 mapping and test the resulting Ipv4Addr), and only if there is no
mapped IPv4 fall back to the existing
ipv6.is_loopback()/is_unique_local()/is_unicast_link_local() checks; also add
regression tests for "[::ffff:127.0.0.1]" and "[::ffff:10.0.0.1]" alongside the
existing normalization tests referenced in the diff (ip_host, parse::<IpAddr>(),
and the match on IpAddr::V6).
|
This pull request has been automatically marked as stale because it has not had any activity within 14 days. It will be automatically closed if no further activity occurs within 16 days. Leave a comment if you feel this pull request should remain open. Thank you! |
|
Hi @s-borah, this PR has merge conflicts that must be resolved before it can be merged. Please rebase your branch: git fetch origin main
git rebase origin/main
# resolve any conflicts, then:
git push --force-with-lease |
Description
Resolves #1404
Problem
Responses API requests could only route through registered workers. Dedicated endpoint requests need to target a provider endpoint directly for a single request without registering that endpoint as a worker first.
Solution
Add a Responses endpoint override path using
x-provider-endpointandx-model-provider. When present, SMG routes the request through the OpenAI-compatible Responses router, builds a request-scoped external worker, sends the request to the exact upstream URL, and strips override headers before forwarding.Changes
x-provider-endpointandx-model-providerheader extraction.gpt-oss, and streaming.Test Plan
cargo +nightly fmt --allcargo clippy --all-targets --all-features -- -D warningscargo test -p smg --test routing_tests test_responses_endpoint_override -- --nocapturecargo test -p smg --test routing_testsKnown: full
cargo testcurrently fails in unrelatedotel_tracing_test::test_router_with_tracing, where the collector receives 0 spans instead of 2.Checklist
cargo +nightly fmtpassescargo clippy --all-targets --all-features -- -D warningspassesSummary by CodeRabbit
New Features
Behavior
Bug Fixes / Validation
Tests
Documentation