
fix(passthrough): track usage for non-streaming responses and extract…#332

Open
vfeitoza wants to merge 1 commit into
ENTERPILOT:mainfrom
vfeitoza:fix/passthrough-usage-tracking
Conversation

@vfeitoza
Contributor

@vfeitoza vfeitoza commented May 15, 2026

… model from opaque bodies

Non-streaming passthrough responses were not logging usage data because the response body was copied directly to the client without parsing. Additionally, when the request body was not fully captured (chunked transfer), the model field was not extracted for opaque passthrough routes because the peek logic required both model and provider to apply selector hints.

  • Read non-streaming response body and extract usage via ExtractFromCachedResponseBody before forwarding to the client
  • Apply body selector hints for BodyModeOpaque when only model is found, ensuring passthrough routes populate PassthroughRouteInfo.Model even without a provider field in the request body
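
The first bullet can be pictured with a minimal sketch: buffer the upstream body, pull the usage counters out of it, then forward the same bytes unchanged. Everything named here is an assumption for illustration: `extractUsage` is a hypothetical stand-in for ExtractFromCachedResponseBody, and the `usage` field names follow the common OpenAI-style response shape rather than anything confirmed by this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// usage holds token counters in an assumed OpenAI-style shape.
type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// extractUsage is a hypothetical stand-in for ExtractFromCachedResponseBody.
// It reports false when the body is not JSON or carries no usage object.
func extractUsage(body []byte) (usage, bool) {
	var envelope struct {
		Usage *usage `json:"usage"`
	}
	if err := json.Unmarshal(body, &envelope); err != nil || envelope.Usage == nil {
		return usage{}, false
	}
	return *envelope.Usage, true
}

// forwardWithUsage buffers the upstream body, extracts usage from it, and
// then writes the untouched bytes to the client.
func forwardWithUsage(client io.Writer, upstream io.Reader) (usage, bool, error) {
	body, err := io.ReadAll(upstream)
	if err != nil {
		return usage{}, false, err
	}
	u, ok := extractUsage(body)
	if _, err := client.Write(body); err != nil {
		return usage{}, false, err
	}
	return u, ok, nil
}

func main() {
	upstream := bytes.NewBufferString(
		`{"id":"x","usage":{"prompt_tokens":3,"completion_tokens":5,"total_tokens":8}}`)
	var client bytes.Buffer
	u, ok, err := forwardWithUsage(&client, upstream)
	fmt.Println(u.TotalTokens, ok, err, client.Len())
}
```

Usage extraction failing is deliberately not an error in this sketch: the client still receives the body, and only the log entry is skipped.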

Summary by CodeRabbit

  • Bug Fixes
    • Improved usage tracking and logging for non-streaming provider responses, including accurate model/provider resolution and pricing calculation.
    • Enhanced request body parsing with more robust selector hint application for increased reliability.

@coderabbitai
Contributor

coderabbitai Bot commented May 15, 2026

Walkthrough

This PR enhances server-side request and response processing through two independent improvements: buffering non-SSE passthrough responses to enable usage logging with pricing resolution, and refining the conditional logic for applying request body selector hints based on parsing state.

Changes

Server Request/Response Processing Enhancements

| Layer / File(s) | Summary |
| --- | --- |
| Non-SSE Passthrough Response Buffering and Usage Logging: internal/server/passthrough_support.go | Non-SSE passthrough responses are read into memory instead of streamed directly to the client, enabling conditional usage logging with provider/model resolution, audit path derivation, and pricing lookup before writing the buffered body to the response. |
| Request Body Selector Hints Deferred Application: internal/server/request_selector_peek.go | Request body selector hints are applied conditionally when parsed and complete flags are both false: hints are applied only for opaque body mode with non-empty model, otherwise deferred until the final fall-through path. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

release:internal

Poem

🐰 A rabbit hops through buffered streams,
Where passthrough logs fulfill their dreams,
With pricing hints and selectors wise,
The server's path now optimized!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: tracking usage for non-streaming passthrough responses and extracting model from opaque bodies, which directly aligns with the PR's primary objectives. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/server/passthrough_support.go`:
- Around line 327-329: The code currently passes the incoming request URL path
(usagePath) into logPassthroughNonStreamUsage which later uses it as both
requestPath and endpoint in passthroughStreamAuditPath, preventing canonical
endpoint mapping; change logPassthroughNonStreamUsage calls (and the values
passed into passthroughStreamAuditPath) to pass two distinct values: the request
path (strings.TrimSpace(c.Request().URL.Path)) as requestPath and the provider's
configured endpoint (the provider endpoint obtained from the provider config
used for the outbound call) as endpoint; update calls around
logPassthroughNonStreamUsage and passthroughStreamAuditPath (also in the 341-347
region) to use providerEndpoint for endpoint so canonical mapping (e.g.,
/v1/chat/completions) is preserved.
- Around line 314-317: The code currently does io.ReadAll(resp.Body) which can
unboundedly buffer large passthrough responses; replace this with a bounded-read
or streaming approach: for non-SSE passthrough responses use an io.LimitedReader
(e.g., io.LimitReader(resp.Body, maxPassthroughBytes)) or stream directly to the
client with io.Copy (copying from resp.Body to the framework response writer)
and return a ProviderError if the limit is exceeded. Update the code around
resp.Body read (the block that calls io.ReadAll and returns handleError(...)) to
enforce a configurable maxPassthroughBytes constant and avoid loading the entire
body into memory. Ensure you still close resp.Body and preserve existing error
handling via handleError/providerType.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 41cbbf6a-a806-4732-9b92-4a968dd1b4d4

📥 Commits

Reviewing files that changed from the base of the PR and between ddd80ae and b051dc8.

📒 Files selected for processing (2)
  • internal/server/passthrough_support.go
  • internal/server/request_selector_peek.go

Comment on lines +314 to +317
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return handleError(c, core.NewProviderError(providerType, http.StatusBadGateway, "failed to read provider passthrough response body", err))
    }

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Avoid unbounded buffering of non-SSE passthrough responses.

Line 314 reads the full upstream body into memory with io.ReadAll(resp.Body) and no size guard. Large passthrough payloads can cause high memory pressure or OOM under concurrency.
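
One way to bound the read, as a minimal sketch: wrap the body in `io.LimitReader` with a budget of one byte more than the limit, so a body that is exactly at the limit can be told apart from one that exceeds it. The names `readBounded`, `readBoundedString`, and `errBodyTooLarge` are hypothetical and not part of this PR; the real code would wrap the failure in its own ProviderError.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errBodyTooLarge is a hypothetical sentinel for oversized bodies.
var errBodyTooLarge = errors.New("passthrough response body exceeds limit")

// readBounded reads at most maxBytes from r. It asks for maxBytes+1 bytes so
// that reading more than maxBytes proves the body is over the limit.
func readBounded(r io.Reader, maxBytes int64) ([]byte, error) {
	body, err := io.ReadAll(io.LimitReader(r, maxBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(body)) > maxBytes {
		return nil, errBodyTooLarge
	}
	return body, nil
}

// readBoundedString is a small convenience wrapper for the demo below.
func readBoundedString(s string, maxBytes int64) (string, error) {
	body, err := readBounded(strings.NewReader(s), maxBytes)
	return string(body), err
}

func main() {
	_, err := readBoundedString("0123456789", 4)
	fmt.Println(err)

	body, err := readBoundedString("0123", 4)
	fmt.Println(body, err)
}
```

Memory use per request is capped at maxBytes+1 regardless of what the upstream sends. The trade-off is that an oversized response fails outright instead of being forwarded without usage logging, which may or may not be the behaviour the maintainers want.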


Comment on lines +327 to +329
        usagePath := strings.TrimSpace(c.Request().URL.Path)
        s.logPassthroughNonStreamUsage(body, model, providerType, providerName, requestID, usagePath, c.Request().Context())
    }

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix non-stream audit path derivation to use provider endpoint, not request URL path.

At Line 328, the caller passes request.URL.Path into logPassthroughNonStreamUsage as endpoint. At Line 346, that value is used as both requestPath and endpoint in passthroughStreamAuditPath(...), which prevents canonical mapping (e.g., /chat/completions → /v1/chat/completions) and mislabels usage entries.

Suggested minimal fix
-        usagePath := strings.TrimSpace(c.Request().URL.Path)
-        s.logPassthroughNonStreamUsage(body, model, providerType, providerName, requestID, usagePath, c.Request().Context())
+        requestPath := strings.TrimSpace(c.Request().URL.Path)
+        s.logPassthroughNonStreamUsage(body, model, providerType, providerName, requestID, requestPath, endpoint, c.Request().Context())
@@
-func (s *passthroughService) logPassthroughNonStreamUsage(body []byte, model, providerType, providerName, requestID, endpoint string, ctx context.Context) {
+func (s *passthroughService) logPassthroughNonStreamUsage(body []byte, model, providerType, providerName, requestID, requestPath, providerEndpoint string, ctx context.Context) {
@@
-    auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint)
+    auditPath := passthroughStreamAuditPath(requestPath, providerType, providerEndpoint)

Also applies to: 341-347


@greptile-apps

greptile-apps Bot commented May 15, 2026

Greptile Summary

This PR fixes two gaps in passthrough request handling: non-streaming responses now buffer the body and call ExtractFromCachedResponseBody for usage logging, and opaque-body passthrough routes now populate PassthroughRouteInfo.Model even when no provider field is present in the request body.

  • passthrough_support.go: Adds logPassthroughNonStreamUsage to extract and log token usage from buffered non-streaming response bodies, mirroring what the streaming observer already does. The endpoint value passed to passthroughStreamAuditPath is the full URL path rather than the passthrough endpoint segment, so audit-path normalisation (e.g. chat/completions → /v1/chat/completions) never fires for non-streaming logs.
  • request_selector_peek.go: Tightens the early-return guard from !parsed || !complete to !parsed && !complete, so selector hints are applied when parsing succeeds but the JSON object wasn't fully consumed. As a side effect this also fixes a pre-existing case where hints were silently dropped when both model and provider were found via early exit (parsed=true, complete=false).
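
The parsed/complete states mentioned above can be illustrated with a minimal sketch of a peek decoder. This is not the project's implementation: `selectorHints`, `peekSelectorHints`, and the exact meaning of the two flags are assumptions inferred from the review comments (parsed means the peek ended without a decode error, complete means the closing brace was consumed).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// selectorHints is a hypothetical mirror of the hints discussed in the review.
type selectorHints struct {
	model, provider  string
	parsed, complete bool
}

// peekSelectorHints walks the top-level keys of a possibly truncated JSON
// object and records model/provider as it goes.
func peekSelectorHints(prefix string) selectorHints {
	var h selectorHints
	dec := json.NewDecoder(strings.NewReader(prefix))
	if tok, err := dec.Token(); err != nil || tok != json.Delim('{') {
		return h
	}
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return h
		}
		key, _ := keyTok.(string)
		switch key {
		case "model":
			err = dec.Decode(&h.model)
		case "provider":
			err = dec.Decode(&h.provider)
		default:
			var skipped json.RawMessage
			err = dec.Decode(&skipped)
		}
		if err != nil {
			// Truncated or malformed value: keep whatever was found so far.
			return h
		}
		if h.model != "" && h.provider != "" {
			// Early exit: parsing succeeded, but the object was not consumed.
			h.parsed = true
			return h
		}
	}
	if _, err := dec.Token(); err != nil {
		return h // closing brace missing, e.g. a truncated body
	}
	h.parsed, h.complete = true, true
	return h
}

func main() {
	inputs := []string{
		`{"model":"gpt-4o","messages":[{"role":"us`,
		`{"model":"m","provider":"p","stream":true}`,
		`{"model":"m"}`,
	}
	for _, in := range inputs {
		fmt.Printf("%+v\n", peekSelectorHints(in))
	}
}
```

Under these assumptions, a truncated body can still yield a model with both flags false, which is the state the opaque-mode branch in this PR targets, while the early exit yields parsed=true, complete=false.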

Confidence Score: 3/5

The non-streaming usage logging path computes an incorrect audit path for every request to a known provider endpoint, causing usage entries to be written with the raw URL instead of the canonical normalised path.

The audit path passed to ExtractFromCachedResponseBody will always be the full request URL (e.g. /p/openai/v1/chat/completions) rather than the canonical path (e.g. /v1/chat/completions) because passthroughStreamAuditPath is called with the full URL as both its requestPath and endpoint arguments. This means every non-streaming usage log entry for OpenAI chat, OpenAI responses, and Anthropic messages will be filed under the wrong path — a silent data quality regression that could skew usage reports and any downstream logic that filters by audit path.

internal/server/passthrough_support.go — the logPassthroughNonStreamUsage function and the call site in proxyPassthroughResponse need a second look to pass the actual passthrough endpoint segment rather than the full URL path.

Important Files Changed

Filename Overview
internal/server/passthrough_support.go Adds non-streaming usage logging by buffering the full response body; contains a bug where passthroughStreamAuditPath is called with the full URL path as both arguments, causing audit path normalisation to always fall through to the raw URL instead of the canonical path (e.g. /v1/chat/completions).
internal/server/request_selector_peek.go Changes early-return guard from `

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[proxyPassthroughResponse] --> B{resp.StatusCode >= 400?}
    B -- Yes --> C[ReadAll + ParseProviderError]
    B -- No --> D{isSSEContentType?}
    D -- Yes --> E[Streaming path\nObservedSSEStream\nusage observer]
    D -- No --> F[ReadAll body]
    F --> G{usageLogger enabled?}
    G -- Yes --> H[logPassthroughNonStreamUsage]
    H --> I["passthroughStreamAuditPath(endpoint, providerType, endpoint)\n⚠️ endpoint = full URL path\nnormalisation never matches"]
    I --> J[ExtractFromCachedResponseBody]
    J --> K[usageLogger.Write]
    G -- No --> L[WriteHeader + Write body]
    K --> L

Reviews (1): Last reviewed commit: "fix(passthrough): track usage for non-st..." | Re-trigger Greptile

@codecov-commenter

Codecov Report

❌ Patch coverage is 9.09091% with 30 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/server/passthrough_support.go | 6.66% | 25 Missing and 3 partials ⚠️ |
| internal/server/request_selector_peek.go | 33.33% | 1 Missing and 1 partial ⚠️ |


return
}

auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint)

P1 The endpoint received by logPassthroughNonStreamUsage is c.Request().URL.Path (e.g., /p/openai/v1/chat/completions), not the passthrough endpoint segment (e.g., chat/completions). Passing it as both requestPath and endpoint means passthroughStreamAuditPath's switch cases (/chat/completions, /responses, /messages) never match, so auditPath always equals the full URL path. The streaming path avoids this by calling passthroughAuditPath(c, providerType, endpoint, info) with the real endpoint segment, which correctly returns /v1/chat/completions for OpenAI chat. The non-streaming path should derive the audit path the same way before calling the helper, rather than relying on the URL path for normalisation.

Suggested change
-    auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint)
+    auditPath := passthroughStreamAuditPath(endpoint, providerType, endpoint) // TODO: pass the actual passthrough endpoint segment, not the full URL path
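
To make the mismatch concrete, here is a minimal sketch of an audit-path normaliser. `canonicalAuditPath` is a hypothetical stand-in for passthroughStreamAuditPath; only the OpenAI chat mapping named in this thread is modelled, and the /responses and /messages cases are left out because their canonical targets are not stated here.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalAuditPath is a hypothetical stand-in for passthroughStreamAuditPath.
// It maps a known provider endpoint segment to its canonical path and falls
// back to the raw request path when nothing matches.
func canonicalAuditPath(requestPath, providerType, endpoint string) string {
	// Normalise "chat/completions" and "/chat/completions/" to one form.
	segment := "/" + strings.Trim(strings.TrimSpace(endpoint), "/")
	if providerType == "openai" && segment == "/chat/completions" {
		return "/v1/chat/completions"
	}
	// Other cases (/responses, /messages) are omitted from this sketch.
	return requestPath
}

func main() {
	requestPath := "/p/openai/v1/chat/completions"

	// Endpoint segment passed separately: the canonical mapping applies.
	fmt.Println(canonicalAuditPath(requestPath, "openai", "chat/completions"))

	// Full URL path passed as the endpoint: no case matches, raw path is kept.
	fmt.Println(canonicalAuditPath(requestPath, "openai", requestPath))
}
```

The second call reproduces the reported symptom: when the same full URL path fills both arguments, the switch has nothing to match and the fallback is always taken.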

Comment on lines +29 to 34
    if !hints.parsed && !hints.complete {
        if bodyMode == core.BodyModeOpaque && hints.model != "" {
            core.ApplyBodySelectorHints(env, hints.model, hints.provider, hints.stream)
        }
        return
    }

P2 || → && broadens the fall-through path beyond the described fix

The change from !hints.parsed || !hints.complete to !hints.parsed && !hints.complete also silently changes behaviour for a second state: when the decoder finds both model and provider and exits early, it returns parsed=true, complete=false. The old guard returned without applying hints in that case (a pre-existing bug); the new guard falls through to core.ApplyBodySelectorHints, which is correct. This second fix is beneficial, but it is not mentioned in the PR description — worth calling out explicitly so the change is clearly intentional.
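
The effect of the operator change is easiest to see by enumerating all four flag combinations. The helper names below are hypothetical; the two boolean expressions are the old and new guards quoted in this comment.

```go
package main

import "fmt"

// oldGuardReturnsEarly is the guard before this PR: !parsed || !complete.
func oldGuardReturnsEarly(parsed, complete bool) bool {
	return !parsed || !complete
}

// newGuardReturnsEarly is the guard after this PR: !parsed && !complete.
func newGuardReturnsEarly(parsed, complete bool) bool {
	return !parsed && !complete
}

func main() {
	for _, parsed := range []bool{false, true} {
		for _, complete := range []bool{false, true} {
			fmt.Printf("parsed=%-5t complete=%-5t old returns early=%-5t new returns early=%t\n",
				parsed, complete,
				oldGuardReturnsEarly(parsed, complete),
				newGuardReturnsEarly(parsed, complete))
		}
	}
}
```

Two states change behaviour: parsed=true, complete=false (the early-exit case described above) and parsed=false, complete=true. Whether the decoder can ever produce the second state is not established in this thread, so it may be worth confirming that it is unreachable or handled.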


2 participants