fix(realtime): drop OpenAI-Beta header rejected by GA Realtime API#1516

Merged: key4ng merged 3 commits into main from fix/realtime-drop-beta-header on May 21, 2026

Conversation

@key4ng
Collaborator

@key4ng key4ng commented May 21, 2026

Description

Problem

After #1504 bumped the realtime test to the GA model alias gpt-realtime, the openai-realtime job is still failing on main with the same 1005 (no status received) / ConnectionResetError pattern (run 26212745181).

Direct upstream reproduction (forcing HTTP/1.1 so the WS upgrade isn't masked by HTTP/2 framing) shows what's actually happening:

HTTP/1.1 101 Switching Protocols
…
{"type":"error","error":{
  "type":"invalid_request_error",
  "code":"beta_api_shape_disabled",
  "message":"The Realtime Beta API is no longer supported. Please use /v1/realtime for the GA API."
}}

OpenAI accepts the WebSocket upgrade (HTTP 101), then immediately sends an error event and closes the connection because the request carries OpenAI-Beta: realtime=v1. SMG forwards that error frame, the upstream tears down, the upgrade closure ends, and the already-upgraded client socket gets dropped → client sees 1005 / TCP RST.

Repeating the same handshake without the beta header succeeds and returns a proper GA session.created:

{"type":"session.created","session":{"id":"sess_…","model":"gpt-realtime","object":"realtime.session", }}

So bumping the model in #1504 was necessary (the old preview snapshot 404s) but not sufficient (SMG's hardcoded beta header still triggers GA-shape rejection).
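To make the two outcomes above easy to tell apart in a debugging script, here is a minimal sketch that classifies the first text frame received after the upgrade. The function name `classify_first_frame` and its return labels are hypothetical and not part of SMG; the frames are trimmed copies of the payloads quoted above.

```python
import json


def classify_first_frame(raw: str) -> str:
    """Classify the first text frame received after the WebSocket upgrade.

    Hypothetical helper for illustration only; the labels are made up.
    """
    event = json.loads(raw)
    event_type = event.get("type")
    if event_type == "session.created":
        return "ga_session"
    if event_type == "error":
        code = (event.get("error") or {}).get("code")
        if code == "beta_api_shape_disabled":
            return "beta_rejected"
        return "other_error"
    return "unexpected"


# Frame sent when the request carries OpenAI-Beta: realtime=v1.
beta_frame = json.dumps({
    "type": "error",
    "error": {
        "type": "invalid_request_error",
        "code": "beta_api_shape_disabled",
        "message": "The Realtime Beta API is no longer supported. "
                   "Please use /v1/realtime for the GA API.",
    },
})

# Frame sent when the beta header is absent (session object trimmed).
ga_frame = json.dumps({
    "type": "session.created",
    "session": {"model": "gpt-realtime", "object": "realtime.session"},
})

print(classify_first_frame(beta_frame))
print(classify_first_frame(ga_frame))
```

A check like this only inspects the first frame; the HTTP 101 response looks the same in both cases, which is why the failure surfaced as a dropped socket rather than a failed handshake.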

Solution

Drop the hardcoded OpenAI-Beta: realtime=v1 header from SMG's upstream WebSocket request. The GA Realtime API doesn't want it; the beta API that did want it has been retired.

Changes

  • model_gateway/src/routers/openai/realtime/proxy.rs — remove the .insert("OpenAI-Beta", "realtime=v1") on the upstream tokio_tungstenite request. Leave a short comment documenting why, so nobody re-adds it.
  • e2e_test/realtime/test_realtime_ws.py — drop OpenAI-Beta from the test's ws_headers fixture for cleanliness (the test sends it client→SMG, but SMG never proxies client headers upstream, so it was already inert).
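
The header behavior described in the two bullets above can be sketched in Python as follows. This is an illustration of the intent, not the Rust code in proxy.rs; `build_upstream_headers` and the example key are hypothetical.

```python
from typing import Dict, Mapping


def build_upstream_headers(api_key: str, client_headers: Mapping[str, str]) -> Dict[str, str]:
    """Build the headers for the upstream Realtime WebSocket request.

    Hypothetical sketch: client headers are accepted only to show that
    they are never copied upstream, as described for SMG in this PR.
    """
    del client_headers  # deliberately unused: nothing from the client is proxied
    # Only the bearer token goes upstream. "OpenAI-Beta: realtime=v1" is
    # intentionally absent because the GA Realtime API rejects it.
    return {"Authorization": f"Bearer {api_key}"}


headers = build_upstream_headers("sk-example", {"OpenAI-Beta": "realtime=v1"})
print(headers)
```

Because the client's own headers never reach the upstream, removing OpenAI-Beta from the test fixture changes nothing on the wire; only the proxy change affects what OpenAI sees.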

Test Plan

  • CI openai-realtime job goes green.
  • Before/after: the 10 previously-failing tests pass; the 2 always-passing tests (test_missing_model_returns_error, test_missing_auth_returns_error) remain passing.

Direct verification with the upstream (already done locally):

  • With OpenAI-Beta: realtime=v1 → code: beta_api_shape_disabled, then close.
  • Without it → session.created frame with the full GA session object (model gpt-realtime, output_modalities ["audio"], server_vad turn detection, etc.).

Checklist
  • cargo +nightly fmt --all -- --check passes
  • cargo check -p smg passes
  • cargo clippy --all-targets --all-features -- -D warnings (relying on CI)
  • (Optional) Documentation updated
  • (Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Summary by CodeRabbit

  • Bug Fixes

    • Removed the incompatible realtime header from WebSocket requests and fixtures to ensure compatibility with GA Realtime endpoints.
  • Tests

    • Updated end-to-end realtime WebSocket tests to match revised event and payload schemas: streaming now uses response.output_text.delta, session and response use output_modalities/text typing, response.done requires output_text content.type, conversation.item.added replaces conversation.item.created, and related assertions/logging adjusted.

@github-actions github-actions Bot added tests Test changes model-gateway Model gateway crate changes realtime-api Realtime API related changes openai OpenAI router changes labels May 21, 2026
@coderabbitai

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2049644c-2ba0-4698-a225-66057d209a0f

📥 Commits

Reviewing files that changed from the base of the PR and between 5694628 and cab68e7.

📒 Files selected for processing (2)
  • e2e_test/realtime/test_realtime_ws.py
  • model_gateway/src/routers/openai/realtime/proxy.rs

📝 Walkthrough

The upstream WebSocket proxy no longer sends the OpenAI-Beta: realtime=v1 header. E2E realtime tests are updated to the GA event/payload schema: output_modalities, response.output_text.delta, session.audio.*, and content.type == "output_text" are validated across session, response, streaming, and cancellation flows.

Changes

Realtime WebSocket tests & proxy updates

| Layer | File(s) | Summary |
| --- | --- | --- |
| Remove OpenAI-Beta header from proxy and test fixture | model_gateway/src/routers/openai/realtime/proxy.rs, e2e_test/realtime/test_realtime_ws.py | Proxy upstream request and test fixture both remove the OpenAI-Beta: realtime=v1 header; only the Authorization bearer token is forwarded to upstream. |
| Session setup and schema assertions | e2e_test/realtime/test_realtime_ws.py | Session setup messages use type: "realtime" and output_modalities: ["text"]; session.created/session.updated assertions validate session.audio.input and session.audio.output fields and expect output_modalities. |
| response.create and response.done checks | e2e_test/realtime/test_realtime_ws.py | All response.create payloads use output_modalities; response.done assertions expect output content items with content.type == "output_text". |
| Streaming deltas and cancellation sync | e2e_test/realtime/test_realtime_ws.py | Streaming collection and schema validation now target response.output_text.delta; tests wait for deltas to start streaming before issuing response.cancel and assert delta counts and logs for the new delta type. |
| Multi-turn conversation event update | e2e_test/realtime/test_realtime_ws.py | Multi-turn conversation test uses response.create with output_modalities and now asserts conversation.item.added events instead of conversation.item.created. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • lightseekorg/smg#660: Both PRs are centered on the Realtime WebSocket E2E test/proxy behavior—main PR adjusts e2e_test/realtime/test_realtime_ws.py to the GA WebSocket event/payload schema and removes the OpenAI-Beta: realtime=v1 header in proxy.rs, aligning with the integration test added in PR #660.
  • lightseekorg/smg#725: The main PR changes the websocket proxy behavior by removing the OpenAI-Beta: realtime=v1 header inside proxy::run_ws_proxy, while the retrieved PR refactors the WS handler to wrap that proxy call and record worker outcomes—both touch the realtime WS proxy execution path.

Suggested reviewers

  • slin1237
  • XinyueZhang369

Poem

🐰 I hopped through headers, clipped the trace,
Deltas now dance in a GA-shaped place.
Sessions sing with audio aligned,
Output_text blooms where text once chimed,
Tests and proxy snug in pace.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 70.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly summarizes the main change: removing the OpenAI-Beta header that was causing GA Realtime API to reject connections. It aligns perfectly with the primary objective of fixing the realtime WebSocket connection issue. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@claude claude Bot left a comment

Looks good — clean removal of the beta header from both the proxy and e2e test, with a clear comment explaining the upstream API change. No remaining references to OpenAI-Beta in the codebase.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request removes the "OpenAI-Beta: realtime=v1" header from the WebSocket connection headers in both the end-to-end tests and the proxy implementation. This change is necessary because OpenAI's GA Realtime API now rejects the beta header with a "beta_api_shape_disabled" error. I have no feedback to provide as there were no review comments.

Comment on lines +303 to +304
input_cfg = audio.get("input") or {}
assert isinstance(input_cfg.get("turn_detection"), (dict, type(None)))

🟡 Nit: Unlike audio.output (where the isinstance(voice, str) assertion catches a missing key), a missing audio.input passes silently here — None or {} gives {}, then {}.get("turn_detection") is None, and isinstance(None, (dict, type(None))) is True.

Consider asserting input exists as a dict, mirroring the audio assertion on line 300:

Suggested change
- input_cfg = audio.get("input") or {}
- assert isinstance(input_cfg.get("turn_detection"), (dict, type(None)))
+ input_cfg = audio.get("input") or {}
+ assert isinstance(input_cfg, dict) and input_cfg, f"Expected session.audio.input dict, got: {input_cfg!r}"
+ assert isinstance(input_cfg.get("turn_detection"), (dict, type(None)))
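
The difference between the two versions can be seen in a small standalone sketch. The wrapper names (`lenient_check`, `strict_check`, `passes`) and the sample payloads are hypothetical; the assertion bodies are the ones quoted in this review comment.

```python
def lenient_check(audio: dict) -> None:
    # Original assertion: a missing "input" collapses to {} and passes.
    input_cfg = audio.get("input") or {}
    assert isinstance(input_cfg.get("turn_detection"), (dict, type(None)))


def strict_check(audio: dict) -> None:
    # Suggested assertion: "input" must be a non-empty dict.
    input_cfg = audio.get("input") or {}
    assert isinstance(input_cfg, dict) and input_cfg, (
        f"Expected session.audio.input dict, got: {input_cfg!r}"
    )
    assert isinstance(input_cfg.get("turn_detection"), (dict, type(None)))


def passes(check, audio: dict) -> bool:
    """Return True if the check raises no AssertionError."""
    try:
        check(audio)
    except AssertionError:
        return False
    return True


# Sample payloads (made up for illustration).
missing_input = {"output": {"voice": "alloy"}}
with_input = {
    "input": {"turn_detection": {"type": "server_vad"}},
    "output": {"voice": "alloy"},
}

print(passes(lenient_check, missing_input))  # missing input slips through
print(passes(strict_check, missing_input))   # missing input is caught
print(passes(strict_check, with_input))
```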

@@ -333,12 +355,14 @@ async def _run():
asyncio.run(_run())

def test_response_text_delta_format(self, ws_url, ws_headers):

🟡 Nit: Method name still says text_delta but the event migrated to output_text.delta. Consider renaming for grep-ability:

Suggested change
- def test_response_text_delta_format(self, ws_url, ws_headers):
+ def test_response_output_text_delta_format(self, ws_url, ws_headers):

key4ng added 3 commits May 21, 2026 12:50
OpenAI's GA Realtime API now rejects `OpenAI-Beta: realtime=v1` with
`beta_api_shape_disabled` ("The Realtime Beta API is no longer
supported. Please use /v1/realtime for the GA API."). The upstream
accepts the WebSocket upgrade (HTTP 101), then immediately sends an
error frame and closes — which the test sees as a 1005 close / TCP RST.

This is the actual cause of the openai-realtime E2E failures that #1504
only partially addressed: bumping the model to the GA alias was
necessary but not sufficient, because the hardcoded beta header on
the upstream connection still triggers GA-shape rejection.

Remove the beta header from the upstream request in proxy.rs and from
the test's ws_headers fixture (the test header was never proxied
upstream anyway, but is no longer meaningful).

Signed-off-by: key4ng <rukeyang@gmail.com>

After dropping the `OpenAI-Beta: realtime=v1` header (so the upstream
WS now negotiates the GA Realtime API), the post-connect tests started
timing out / asserting wrong fields because they were still using beta
event shapes that GA renamed or relocated.

Confirmed from the GA `session.created` payload OpenAI now returns
(captured in the failing CI log):

  {
    "type": "realtime", "object": "realtime.session",
    "model": "gpt-realtime",
    "output_modalities": ["audio"],
    "audio": {
      "input":  {"turn_detection": {...}, ...},
      "output": {"voice": "alloy", ...}
    },
    ...
  }

Migrations applied:

- `session.update` payload: send `{"type": "realtime",
  "output_modalities": ["text"]}` instead of `{"modalities": ["text"]}`
- `response.create` params: `output_modalities` instead of `modalities`
- Streaming delta event: `response.output_text.delta` instead of the
  beta `response.text.delta`
- `response.done` output content type: `output_text` instead of `text`
- `session.created` schema check: `output_modalities` at the top, plus
  `audio.input.turn_detection` and `audio.output.voice` (formerly
  `session.turn_detection` and `session.voice`)
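
As a rough illustration of the payload renames listed above, the sketch below rewrites beta-shaped client events into the GA shape. The helper names are hypothetical, and the outer envelopes (a `session` object inside `session.update`, a `response` object inside `response.create`) are an assumption about the wire format rather than something stated in this commit.

```python
import copy

# Beta streaming event name and its GA replacement, per the list above.
BETA_TEXT_DELTA = "response.text.delta"
GA_TEXT_DELTA = "response.output_text.delta"


def to_ga_session_update(beta_event: dict) -> dict:
    """Return a GA-shaped copy of a beta session.update event (sketch)."""
    event = copy.deepcopy(beta_event)
    session = event.setdefault("session", {})
    session.setdefault("type", "realtime")
    if "modalities" in session:
        session["output_modalities"] = session.pop("modalities")
    return event


def to_ga_response_create(beta_event: dict) -> dict:
    """Return a GA-shaped copy of a beta response.create event (sketch)."""
    event = copy.deepcopy(beta_event)
    response = event.get("response")
    if isinstance(response, dict) and "modalities" in response:
        response["output_modalities"] = response.pop("modalities")
    return event


def is_text_delta(event: dict) -> bool:
    """True for the GA streaming text delta event."""
    return event.get("type") == GA_TEXT_DELTA


ga_update = to_ga_session_update(
    {"type": "session.update", "session": {"modalities": ["text"]}}
)
ga_create = to_ga_response_create(
    {"type": "response.create", "response": {"modalities": ["text"]}}
)
print(ga_update)
print(ga_create)
```

The tests in this commit send the GA shapes directly; a translation layer like this would only make sense if beta-shaped payloads had to be kept around.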

The 4 tests that did not use the realtime shape (basic connect,
invalid-event, missing-model, missing-auth) were already passing and
are untouched. The `OpenAI-Beta` header removal in proxy.rs is what
unblocked the connection; this commit makes the rest of the suite
match the GA wire format.

Signed-off-by: key4ng <rukeyang@gmail.com>

GA renamed `conversation.item.created` to `conversation.item.added`
(emitted when an item is added to the default conversation; the old
name is now a legacy event the server no longer sends for plain
`conversation.item.create` requests).

The previous run timed out 30s waiting for the old event name. Switch
to the new name to match the GA wire format.

Signed-off-by: key4ng <rukeyang@gmail.com>
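
The timeout described in this commit can be reproduced without a network connection. The sketch below is a hypothetical stand-in for the test's receive loop: events arrive on an `asyncio.Queue`, and `wait_for_event` (a made-up helper, not the one in test_realtime_ws.py) scans for a given event type until a deadline.

```python
import asyncio


async def wait_for_event(queue: asyncio.Queue, event_type: str, timeout: float) -> dict:
    """Return the first queued event of the given type, or time out."""

    async def _scan() -> dict:
        while True:
            event = await queue.get()
            if event.get("type") == event_type:
                return event

    return await asyncio.wait_for(_scan(), timeout)


async def _demo():
    queue: asyncio.Queue = asyncio.Queue()
    # Simulated GA server output: only the new event name is emitted.
    queue.put_nowait({"type": "session.updated"})
    queue.put_nowait({"type": "conversation.item.added"})

    found = await wait_for_event(queue, "conversation.item.added", timeout=1.0)

    # Waiting for the beta name never completes, so the wait times out.
    try:
        await wait_for_event(queue, "conversation.item.created", timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return found["type"], timed_out


result = asyncio.run(_demo())
print(result)
```

The short 0.05 s timeout stands in for the 30 s wait mentioned above.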
@key4ng key4ng force-pushed the fix/realtime-drop-beta-header branch from 5694628 to cab68e7 Compare May 21, 2026 19:50
@key4ng
Collaborator Author

key4ng commented May 21, 2026

@key4ng key4ng merged commit bdfe76e into main May 21, 2026
37 checks passed
@key4ng key4ng deleted the fix/realtime-drop-beta-header branch May 21, 2026 20:08