feat(gateway): Add Messages API to HTTP router#1521
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughImplements GenerationRequest for CreateMessageRequest (stream/model/routing text), wires RouterTrait::route_messages to forward typed requests to /v1/messages, adds a mock upstream messages handler, and introduces integration and unit tests for streaming, non-streaming, and error cases. ChangesMessages API Routing Support
Sequence Diagram(s)sequenceDiagram
participant Client
participant Gateway as HTTP_Gateway_Router
participant Worker as Upstream_Worker_v1_messages
Client->>Gateway: POST /v1/messages (CreateMessageRequest, stream?, model, messages)
Gateway->>Gateway: GenerationRequest::extract_text_for_routing()
Gateway->>Worker: Forward typed request to /v1/messages
Worker->>Gateway: SSE event stream OR JSON response OR error
Gateway->>Client: Proxy SSE events or JSON response or propagate error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @ekzhang, the DCO sign-off check has failed. All commits must include a To fix existing commits: # Sign off the last N commits (replace N with the number of unsigned commits)
git rebase HEAD~N --signoff
git push --force-with-leaseTo sign off future commits automatically:
|
There was a problem hiding this comment.
Code Review
This pull request implements the GenerationRequest trait for CreateMessageRequest and adds a route_messages method to the HTTP router. The reviewer suggested enhancing the extract_text_for_routing logic to include ToolResult blocks while excluding control-plane items and internal reasoning blocks to provide better context for routing decisions without introducing noise.
| for msg in &self.messages { | ||
| match &msg.content { | ||
| InputContent::String(s) => push(s, &mut has_content, &mut buffer), | ||
| InputContent::Blocks(blocks) => { | ||
| for block in blocks { | ||
| if let InputContentBlock::Text(text_block) = block { | ||
| push(&text_block.text, &mut has_content, &mut buffer); | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The current implementation of extract_text_for_routing only extracts text from InputContentBlock::Text. We should extend this to include ToolResult blocks to provide more context for routing decisions. However, we must ensure that control-plane items (such as McpApprovalRequest and McpApprovalResponse) are excluded to avoid noise and instability in routing. Additionally, Thinking blocks should be excluded as they contain internal reasoning rather than user-facing content, keeping the analysis channel distinct from routing logic.
for msg in &self.messages {
match &msg.content {
InputContent::String(s) => push(s, &mut has_content, &mut buffer),
InputContent::Blocks(blocks) => {
for block in blocks {
match block {
InputContentBlock::Text(text_block) => {
push(&text_block.text, &mut has_content, &mut buffer);
}
InputContentBlock::ToolResult(tool_result) => {
if tool_result.is_control_plane() {
continue;
}
if let Some(content) = &tool_result.content {
match content {
ToolResultContent::String(s) => {
push(s, &mut has_content, &mut buffer);
}
ToolResultContent::Blocks(blocks) => {
for b in blocks {
if let ToolResultContentBlock::Text(t) = b {
push(&t.text, &mut has_content, &mut buffer);
}
}
}
}
}
}
_ => {}
}
}
}
}
}References
- To avoid noise and instability in routing decisions, do not extract text from control-plane items such as McpApprovalRequest and McpApprovalResponse.
- Mixing user-facing content with internal chain-of-thought (CoT) in the analysis channel conflates the two distinct purposes; routing should focus on user-facing or environment state.
| body: &CreateMessageRequest, | ||
| model_id: &str, | ||
| ) -> Response { | ||
| self.route_typed_request(headers, body, "/v1/messages", model_id) |
There was a problem hiding this comment.
🔴 Important: route_to_endpoint() in grpc/utils/metrics.rs has no match arm for "/v1/messages", so it falls through to "other". All messages-API metrics (request counts, durations, errors, retries) from the HTTP router will be bucketed under the "other" endpoint label instead of the existing ENDPOINT_MESSAGES ("messages").
The gRPC routers already use metrics_labels::ENDPOINT_MESSAGES directly, but the HTTP router relies on route_to_endpoint(route) to derive the label from the path string.
Fix: add "/v1/messages" => metrics_labels::ENDPOINT_MESSAGES to the match in route_to_endpoint.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1e60a18714
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if let InputContentBlock::Text(text_block) = block { | ||
| push(&text_block.text, &mut has_content, &mut buffer); |
There was a problem hiding this comment.
Include tool_result content when extracting routing text
The new CreateMessageRequest::extract_text_for_routing only appends InputContentBlock::Text, so requests whose latest user turn is a tool_result (common in tool-calling loops) can produce an empty or incomplete routing key even though they contain substantial text. This degrades text-based worker selection and can misroute /v1/messages traffic compared with chat routing, which includes tool message content; consider extracting text from ToolResult payloads (string and text blocks) as well.
Useful? React with 👍 / 👎.
Some backends like SGLang recently implement the `/v1/messages` API - sgl-project/sglang#18630 This PR adds support for routing to the messages API over HTTP, similar to how `/v1/chat/completions` is handled currently via a `GenerationRequest` trait impl. Signed-off-by: Eric Zhang <eric@thinkingmachines.ai>
Adds integration tests for the new HTTP `/v1/messages` proxy: - `/v1/messages` mock handler in the shared MockWorker (non-streaming + SSE streaming, Anthropic event sequence). - End-to-end tests via `AppTestContext` for success, streaming, and upstream-error propagation. - Unit tests for the new `GenerationRequest` impl on `CreateMessageRequest` (is_stream, routing-text extraction for string content, text blocks, and image-only / no-text edge case). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eric Zhang <eric@thinkingmachines.ai>
1e60a18 to
e18412e
Compare
CatherineSue
left a comment
There was a problem hiding this comment.
Seems lint is failing. @ekzhang can you take a look?
|
Hi @ekzhang, the DCO sign-off check has failed. All commits must include a To fix existing commits: # Sign off the last N commits (replace N with the number of unsigned commits)
git rebase HEAD~N --signoff
git push --force-with-leaseTo sign off future commits automatically:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72d525ce93
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .unwrap(); | ||
|
|
||
| assert!(req.is_stream()); | ||
| assert_eq!(req.get_model(), Some("claude-sonnet-4-5-20250929")); |
There was a problem hiding this comment.
Disambiguate trait call for get_model in test
CreateMessageRequest has an inherent get_model(&self) -> &str and also implements GenerationRequest::get_model(&self) -> Option<&str>. The assertion here calls req.get_model() and compares it to Some(...), which resolves to the inherent method and produces a type mismatch (&str vs Option<&str>), so this test target will not compile when dependencies are available. Use fully-qualified syntax (e.g. GenerationRequest::get_model(&req)) or compare against the inherent return type.
Useful? React with 👍 / 👎.
Signed-off-by: zhyncs <46627482+zhyncs@users.noreply.github.com>
e630d01 to
cf1f2c3
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cf1f2c3cd2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if let InputContentBlock::Text(text_block) = block { | ||
| push(&text_block.text, &mut has_content, &mut buffer); | ||
| } |
There was a problem hiding this comment.
Extract document text when building routing key
CreateMessageRequest::extract_text_for_routing currently appends only InputContentBlock::Text, so requests whose prompt is carried in document blocks (for example DocumentSource::Text or document content blocks) produce an empty/partial routing key even though they contain substantial text. Because HTTP routing uses this extracted text for worker selection (route_typed_request), document-heavy /v1/messages traffic can be misrouted compared with equivalent chat payloads; include textual fields from document blocks when constructing the routing text.
Useful? React with 👍 / 👎.
Some backends like SGLang recently implement the
/v1/messagesAPI - sgl-project/sglang#18630This PR adds support for routing to the messages API over HTTP, similar to how
/v1/chat/completionsis handled currently via aGenerationRequesttrait impl.Description
Problem
When hitting /v1/messages in SMG:
Solution
Forwards the API call directly for HTTP backends, with a text-based routing policy.
Changes
Changed router for
/v1/messagesidentical to/v1/chat/completions.Test Plan
Add test for /v1/messages
Checklist
cargo +nightly fmtpassescargo clippy --all-targets --all-features -- -D warningspassesSummary by CodeRabbit
New Features
Tests
Mocks