Skip to content

fix: rewrote provider for openai, handling schema and response#472

Open
hanneshapke wants to merge 2 commits into
mainfrom
fix/schema-swap-for-reasoning-models
Open

fix: rewrote provider for openai, handling schema and response#472
hanneshapke wants to merge 2 commits into
mainfrom
fix/schema-swap-for-reasoning-models

Conversation

@hanneshapke
Copy link
Copy Markdown
Collaborator

@hanneshapke hanneshapke commented May 27, 2026

Closes #461

Summary

  • Reasoning OpenAI models (gpt-5*, o1*, o3*, o4*) sent to
    /v1/chat/completions were returning 400 from upstream — those models
    require the Responses API. The proxy now converts the request schema and
    rewrites the path after PII masking, so masked content carries over.
  • Response-side restoration now walks the Responses-API shape
    (output[].content[].text + top-level output_text) in addition to
    Chat-Completions choices[], so masked placeholders no longer leak back
    to the client when a gpt-5 request flows through /v1/responses.
  • Native /v1/responses traffic (any model) is now masked on the way out:
    CreateMaskedRequest walks input (string or array, including content
    parts) and instructions when messages is absent.

The split is based on reasoning vs non-reasoning model family, not
GPT-4 vs GPT-5. Schema detection happens by inspecting payload fields
(messages vs input/instructions, choices vs output/output_text)
so the proxy doesn't bake in a version allow-list.

What changed

  • src/backend/providers/openai.go
    • MaybeConvertOpenAIRequest: chat→responses for reasoning models,
      responses→chat for non-reasoning models. Folds system/developer
      messages into instructions, renames max_tokens
      max_output_tokens, strips Chat-Completions-only fields
      (frequency_penalty, presence_penalty, logit_bias, logprobs,
      top_logprobs, n, stop).
    • isReasoningModel: matches gpt-5* and the o-series o[1-9]….
    • CreateMaskedRequest / RestoreMaskedResponse /
      ExtractRequestText / ExtractResponseText: dispatch on payload
      shape; new walkers for the Responses-API request and response
      shapes. Proxy notice is appended only to message-type output
      items so it doesn't get pinned to reasoning items.
  • src/backend/providers/provider.go
    • Registers /v1/responses as an OpenAI subpath in
      GetProviderFromPath.
  • src/backend/proxy/handler.go
    • createAndSendProxyRequest calls MaybeConvertOpenAIRequest after
      PII masking and threads the rewritten path through
      buildTargetURL. buildTargetURL takes the path as a parameter
      so callers can pass a rewritten value.

Out of scope

  • Streaming (text/event-stream) — tracked separately.
  • Tool-call and structured-output field differences beyond the top-level
    schema mapping.
  • Converting the Responses-API response shape back to Chat-Completions
    for clients that called /v1/chat/completions. Clients see the
    Responses shape on the wire.

Test plan

  • go test ./src/backend/... passes
  • New unit tests:
    • TestIsReasoningModel (table covering gpt-5/o-series/gpt-4/etc.)
    • TestMaybeConvertOpenAIRequest_ChatToResponses (system→instructions,
      max_tokens rename, stripped fields, multi-system join)
    • TestMaybeConvertOpenAIRequest_ResponsesToChat
    • TestMaybeConvertOpenAIRequest_NoOps (invalid JSON, missing model,
      unrelated path)
    • TestOpenAIProvider_CreateMaskedRequest_ResponsesAPI (string
      input, instructions, array with string content, array with
      content parts, instructions-only)
    • TestOpenAIProvider_RestoreMaskedResponse_ResponsesAPI (output +
      output_text, ignoring reasoning items, proxy notice placement,
      output_text-only)
    • TestOpenAIProvider_ShapeDetection
  • Manual: send the failing request from Support OpenAI Responses API (GPT-5) end-to-end #461 (gpt-5 with max_tokens
    on /v1/chat/completions) and confirm 200 + correctly restored PII.

@franko-c
Copy link
Copy Markdown

read through the full diff — clean approach, a few notes:

reasoning model list overlap: o1-mini, o1-preview, o1-pro already match via the o1 entry + HasPrefix(model, "o1-"). same for o3-mini/o3-pro vs o3. trimming to {"gpt-5", "o1", "o3", "o4-mini"} simplifies maintenance when new variants ship. no bug, just redundant matching.

conversion flow: masking on Chat Completions shape → converting to Responses shape → upstream → restoring on Responses shape tracks correctly. maskedToOriginal tokens are content-level so they survive the structural reshuffling. buildTargetURL signature change keeping requestPath explicit is the right call over mutating r.URL.Path.

one question: convertChatToResponses copies all fields then sets out["input"] from the converted messages — if a malformed request has both messages and input, the original input content gets overwritten without having been masked. hasChatCompletionsShape gates on messages presence so this is the intended path, just confirming that's by design.

tests cover the edge cases well — shape-detection table and NoOps suite are solid.

Copy link
Copy Markdown

@franko-c franko-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from what i can see this handles the main path cleanly — schema detection based on payload fields rather than a model allowlist is the right call, and the test matrix covers the edges well.

couple things i noticed:

reasoningModelFamilies has redundant entries — isReasoningModel matches on family + "-" prefix, so "o1" already catches "o1-mini", "o1-preview", "o1-pro" (and "o3" catches its sub-families). the specific entries never fire because the shorter prefix matches first. not a bug, just slightly confusing to read.

bigger question: when a client sends to /v1/chat/completions with gpt-5, the request converts but the response comes back in Responses API shape (output[] instead of choices[]). PR description flags this as out of scope — is there a tracking issue? any SDK or middleware downstream that reads choices[0].message.content will break on the converted path.

minor: mergeMask in createMaskedResponsesRequest takes masked string as first arg but immediately discards it (_ = masked).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support OpenAI Responses API (GPT-5) end-to-end

2 participants