fix(passthrough): track usage for non-streaming responses and extract…#332
fix(passthrough): track usage for non-streaming responses and extract…#332vfeitoza wants to merge 1 commit into
Conversation
… model from opaque bodies Non-streaming passthrough responses were not logging usage data because the response body was copied directly to the client without parsing. Additionally, when the request body was not fully captured (chunked transfer), the model field was not extracted for opaque passthrough routes because the peek logic required both model and provider to apply selector hints. - Read non-streaming response body and extract usage via ExtractFromCachedResponseBody before forwarding to the client - Apply body selector hints for BodyModeOpaque when only model is found, ensuring passthrough routes populate PassthroughRouteInfo.Model even without a provider field in the request body
📝 WalkthroughWalkthroughThis PR enhances server-side request and response processing through two independent improvements: buffering non-SSE passthrough responses to enable usage logging with pricing resolution, and refining the conditional logic for applying request body selector hints based on parsing state. ChangesServer Request/Response Processing Enhancements
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/server/passthrough_support.go`:
- Around line 327-329: The code currently passes the incoming request URL path
(usagePath) into logPassthroughNonStreamUsage which later uses it as both
requestPath and endpoint in passthroughStreamAuditPath, preventing canonical
endpoint mapping; change logPassthroughNonStreamUsage calls (and the values
passed into passthroughStreamAuditPath) to pass two distinct values: the request
path (strings.TrimSpace(c.Request().URL.Path)) as requestPath and the provider's
configured endpoint (the provider endpoint obtained from the provider config
used for the outbound call) as endpoint; update calls around
logPassthroughNonStreamUsage and passthroughStreamAuditPath (also in the 341-347
region) to use providerEndpoint for endpoint so canonical mapping (e.g.,
/v1/chat/completions) is preserved.
- Around line 314-317: The code currently does io.ReadAll(resp.Body) which can
unboundedly buffer large passthrough responses; replace this with a bounded-read
or streaming approach: for non-SSE passthrough responses use an io.LimitedReader
(e.g., io.LimitReader(resp.Body, maxPassthroughBytes)) or stream directly to the
client with io.Copy (copying from resp.Body to the framework response writer)
and return a ProviderError if the limit is exceeded. Update the code around
resp.Body read (the block that calls io.ReadAll and returns handleError(...)) to
enforce a configurable maxPassthroughBytes constant and avoid loading the entire
body into memory. Ensure you still close resp.Body and preserve existing error
handling via handleError/providerType.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 41cbbf6a-a806-4732-9b92-4a968dd1b4d4
📒 Files selected for processing (2)
internal/server/passthrough_support.gointernal/server/request_selector_peek.go
| body, err := io.ReadAll(resp.Body) | ||
| if err != nil { | ||
| return handleError(c, core.NewProviderError(providerType, http.StatusBadGateway, "failed to read provider passthrough response body", err)) | ||
| } |
There was a problem hiding this comment.
Avoid unbounded buffering of non-SSE passthrough responses.
Line 314 reads the full upstream body into memory with io.ReadAll(resp.Body) and no size guard. Large passthrough payloads can cause high memory pressure or OOM under concurrency.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/server/passthrough_support.go` around lines 314 - 317, The code
currently does io.ReadAll(resp.Body) which can unboundedly buffer large
passthrough responses; replace this with a bounded-read or streaming approach:
for non-SSE passthrough responses use an io.LimitedReader (e.g.,
io.LimitReader(resp.Body, maxPassthroughBytes)) or stream directly to the client
with io.Copy (copying from resp.Body to the framework response writer) and
return a ProviderError if the limit is exceeded. Update the code around
resp.Body read (the block that calls io.ReadAll and returns handleError(...)) to
enforce a configurable maxPassthroughBytes constant and avoid loading the entire
body into memory. Ensure you still close resp.Body and preserve existing error
handling via handleError/providerType.
| usagePath := strings.TrimSpace(c.Request().URL.Path) | ||
| s.logPassthroughNonStreamUsage(body, model, providerType, providerName, requestID, usagePath, c.Request().Context()) | ||
| } |
There was a problem hiding this comment.
Fix non-stream audit path derivation to use provider endpoint, not request URL path.
At Line 328, the caller passes request.URL.Path into logPassthroughNonStreamUsage as endpoint. At Line 346, that value is used as both requestPath and endpoint in passthroughStreamAuditPath(...), which prevents canonical mapping (e.g., /chat/completions → /v1/chat/completions) and mislabels usage entries.
Suggested minimal fix
- usagePath := strings.TrimSpace(c.Request().URL.Path)
- s.logPassthroughNonStreamUsage(body, model, providerType, providerName, requestID, usagePath, c.Request().Context())
+ requestPath := strings.TrimSpace(c.Request().URL.Path)
+ s.logPassthroughNonStreamUsage(body, model, providerType, providerName, requestID, requestPath, endpoint, c.Request().Context())
@@
-func (s *passthroughService) logPassthroughNonStreamUsage(body []byte, model, providerType, providerName, requestID, endpoint string, ctx context.Context) {
+func (s *passthroughService) logPassthroughNonStreamUsage(body []byte, model, providerType, providerName, requestID, requestPath, providerEndpoint string, ctx context.Context) {
@@
- auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint)
+ auditPath := passthroughStreamAuditPath(requestPath, providerType, providerEndpoint)Also applies to: 341-347
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/server/passthrough_support.go` around lines 327 - 329, The code
currently passes the incoming request URL path (usagePath) into
logPassthroughNonStreamUsage which later uses it as both requestPath and
endpoint in passthroughStreamAuditPath, preventing canonical endpoint mapping;
change logPassthroughNonStreamUsage calls (and the values passed into
passthroughStreamAuditPath) to pass two distinct values: the request path
(strings.TrimSpace(c.Request().URL.Path)) as requestPath and the provider's
configured endpoint (the provider endpoint obtained from the provider config
used for the outbound call) as endpoint; update calls around
logPassthroughNonStreamUsage and passthroughStreamAuditPath (also in the 341-347
region) to use providerEndpoint for endpoint so canonical mapping (e.g.,
/v1/chat/completions) is preserved.
Greptile SummaryThis PR fixes two gaps in passthrough request handling: non-streaming responses now buffer the body and call
Confidence Score: 3/5The non-streaming usage logging path computes an incorrect audit path for every request to a known provider endpoint, causing usage entries to be written with the raw URL instead of the canonical normalised path. The audit path passed to
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[proxyPassthroughResponse] --> B{resp.StatusCode >= 400?}
B -- Yes --> C[ReadAll + ParseProviderError]
B -- No --> D{isSSEContentType?}
D -- Yes --> E[Streaming path\nObservedSSEStream\nusage observer]
D -- No --> F[ReadAll body]
F --> G{usageLogger enabled?}
G -- Yes --> H[logPassthroughNonStreamUsage]
H --> I["passthroughStreamAuditPath(endpoint, providerType, endpoint)\n⚠️ endpoint = full URL path\nnormalisation never matches"]
I --> J[ExtractFromCachedResponseBody]
J --> K[usageLogger.Write]
G -- No --> L[WriteHeader + Write body]
K --> L
Reviews (1): Last reviewed commit: "fix(passthrough): track usage for non-st..." | Re-trigger Greptile |
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
| return | ||
| } | ||
|
|
||
| auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint) |
There was a problem hiding this comment.
The
endpoint received by logPassthroughNonStreamUsage is c.Request().URL.Path (e.g., /p/openai/v1/chat/completions), not the passthrough endpoint segment (e.g., chat/completions). Passing it as both requestPath and endpoint means passthroughStreamAuditPath's switch cases (/chat/completions, /responses, /messages) never match, so auditPath always equals the full URL path. The streaming path avoids this by calling passthroughAuditPath(c, providerType, endpoint, info) with the real endpoint segment, which correctly returns /v1/chat/completions for OpenAI chat. The non-streaming path should derive the audit path the same way before calling the helper, rather than relying on the URL path for normalisation.
| auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint) | |
| auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint) // TODO: pass the actual passthrough endpoint segment, not the full URL path |
| if !hints.parsed && !hints.complete { | ||
| if bodyMode == core.BodyModeOpaque && hints.model != "" { | ||
| core.ApplyBodySelectorHints(env, hints.model, hints.provider, hints.stream) | ||
| } | ||
| return | ||
| } |
There was a problem hiding this comment.
|| → && broadens the fall-through path beyond the described fix
The change from !hints.parsed || !hints.complete to !hints.parsed && !hints.complete also silently changes behaviour for a second state: when the decoder finds both model and provider and exits early, it returns parsed=true, complete=false. The old guard returned without applying hints in that case (a pre-existing bug); the new guard falls through to core.ApplyBodySelectorHints, which is correct. This second fix is beneficial, but it is not mentioned in the PR description — worth calling out explicitly so the change is clearly intentional.
… model from opaque bodies
Non-streaming passthrough responses were not logging usage data because the response body was copied directly to the client without parsing. Additionally, when the request body was not fully captured (chunked transfer), the model field was not extracted for opaque passthrough routes because the peek logic required both model and provider to apply selector hints.
Summary by CodeRabbit