fix(0.6.1): Realtime firstMessage interruption on adopted path by nicolotognoni · Pull Request #95 · PatterAI/Patter

nicolotognoni · 2026-05-12T16:11:46Z

Summary

With agent.prewarm=true (default) the OpenAI Realtime WebSocket is opened, primed, and adopted at call pickup with source=adopted ms=0. The audio bridge between Twilio/Telnyx and OpenAI is live the instant the callee answers, so the caller's "Hi" / "Hello?" reliably reaches OpenAI in the ~250-450 ms before the firstMessage audio starts streaming back. OpenAI's server-VAD treats that early caller audio as a barge-in and silently cancels the in-flight response.create — the configured first_message is never delivered and the caller hears the agent respond to their hello instead of the scripted opening. The cold connect() path masked this because the WS handshake naturally buffered ~300 ms of caller silence.
Fix: send_first_message / sendFirstMessage now arm a one-shot server-VAD lockout by sending session.update with turn_detection: null immediately before response.create, then restore the original turn_detection block on the firstMessage response.done. Subsequent turns barge in normally. This complements the client-side firstAudioSentAt / _first_audio_sent_at guard from PR #92 (fix(0.6.1): dashboard live merge + firstMessage barge-in + drain marks, a re-base of #89): that guard prevents the local audio bridge from clearing the playout buffer, while this change prevents the server from cancelling the response.
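To make the wire sequence concrete, here is a minimal sketch of the three frames involved. The helper name `first_message_frames` and the `response.create` payload (how the greeting text is passed to the model) are assumptions for illustration and may differ from the adapter's real code; only the `session.update` / `turn_detection` shape follows the description above.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def first_message_frames(
    first_message: str,
    turn_detection: Optional[Dict[str, Any]],
) -> Tuple[List[str], str]:
    """Return (frames sent at greeting time, frame sent on the greeting's response.done).

    Hypothetical helper, not part of the Patter codebase.
    """
    # 1. Lockout: disable server-VAD so early caller audio cannot cancel the greeting.
    lockout = {"type": "session.update", "session": {"turn_detection": None}}
    # 2. Ask the model to speak the greeting (payload shape is an assumption).
    create = {
        "type": "response.create",
        "response": {"instructions": f"Say exactly: {first_message}"},
    }
    # 3. Restore: put the snapshotted turn_detection block back once the greeting is done.
    restore = {"type": "session.update", "session": {"turn_detection": turn_detection}}
    return [json.dumps(lockout), json.dumps(create)], json.dumps(restore)


before, after = first_message_frames(
    "Hello! Can you hear me?",
    {"type": "server_vad", "silence_duration_ms": 500},
)
```

`before` holds the lockout and the `response.create` in send order; `after` is held back until the firstMessage `response.done` arrives.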

Implementation

Why turn_detection: null and not a temporary high silence_duration_ms? turn_detection: null is fully OpenAI-documented and disables server-VAD entirely with zero edge cases. A high-silence fallback relies on a server-side timer that is sensitive to delivery jitter across the response.done window (variable 1-3 s on long greetings). Null is byte-cheap, documented, and deterministic.
New adapter state: _first_message_protection_pending / firstMessageProtectionPending and _saved_turn_detection / savedTurnDetection snapshot. Set on send_first_message, consumed on the next response.done inside the existing receive loop / message listener. Strictly one-shot — a later response.done does not re-trigger the restore.
Best-effort failure handling: a failed lockout send clears the pending flag so we don't try to restore a turn_detection we never disabled. A failed restore leaves the session VAD-disabled (barge-in is degraded, but the call still completes); the next session.update that touches configuration would restore server-VAD.
Parity respected: behaviour and wire shape identical across Python and TypeScript.
Files touched: libraries/python/getpatter/providers/openai_realtime.py, libraries/typescript/src/providers/openai-realtime.ts, the matching _unit.py / unit/*.test.ts test files, and CHANGELOG.md.
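The one-shot state and the best-effort failure handling described above can be sketched roughly as follows. This is not the adapter's real code: `FirstMessageLockout`, `RecordingSocket`, `on_event`, and `run_demo` are hypothetical names, the socket is anything with an async `send(str)`, and still attempting `response.create` after a failed lockout send is an assumption.

```python
import asyncio
import json
from typing import Any, Dict, List, Optional


class FirstMessageLockout:
    """Hypothetical stand-in for the adapter's first-message state."""

    def __init__(self, ws: Any, turn_detection: Dict[str, Any]) -> None:
        self._ws = ws
        self._turn_detection = turn_detection
        self._first_message_protection_pending = False
        self._saved_turn_detection: Optional[Dict[str, Any]] = None

    async def _send(self, event: Dict[str, Any]) -> None:
        await self._ws.send(json.dumps(event))

    async def send_first_message(self, text: str) -> None:
        if self._ws is None:
            return  # no socket: nothing to arm, nothing to send
        self._saved_turn_detection = dict(self._turn_detection)
        self._first_message_protection_pending = True
        try:
            await self._send({"type": "session.update", "session": {"turn_detection": None}})
        except Exception:
            # The lockout never reached the server, so there is nothing to restore.
            self._first_message_protection_pending = False
            self._saved_turn_detection = None
        # Assumption: the greeting is still attempted after a failed lockout.
        await self._send({"type": "response.create", "response": {"instructions": text}})

    async def on_event(self, event: Dict[str, Any]) -> None:
        if event.get("type") != "response.done" or not self._first_message_protection_pending:
            return
        # One-shot: clear the flag first so a later response.done cannot restore again.
        self._first_message_protection_pending = False
        saved, self._saved_turn_detection = self._saved_turn_detection, None
        try:
            await self._send({"type": "session.update", "session": {"turn_detection": saved}})
        except Exception:
            pass  # best effort: session stays VAD-disabled, the call still completes


class RecordingSocket:
    """Test double that records sent events and can fail one send by index."""

    def __init__(self, fail_on: int = -1) -> None:
        self.sent: List[Dict[str, Any]] = []
        self._fail_on = fail_on
        self._calls = 0

    async def send(self, data: str) -> None:
        index = self._calls
        self._calls += 1
        if index == self._fail_on:
            raise ConnectionError("simulated send failure")
        self.sent.append(json.loads(data))


def run_demo(fail_on: int = -1) -> List[str]:
    ws = RecordingSocket(fail_on)
    adapter = FirstMessageLockout(ws, {"type": "server_vad", "silence_duration_ms": 700})

    async def scenario() -> None:
        await adapter.send_first_message("Hello! Can you hear me?")
        await adapter.on_event({"type": "response.done"})  # greeting finished: restore
        await adapter.on_event({"type": "response.done"})  # later turn: no second restore

    asyncio.run(scenario())
    return [event["type"] for event in ws.sent]
```

Calling `run_demo()` exercises the normal path (lockout, greeting, a single restore), and `run_demo(fail_on=0)` simulates a failed lockout send, after which no restore is attempted.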

Breaking change?

No. send_first_message / sendFirstMessage keep the same signature and external contract. The only observable difference is two extra session.update frames on the wire during the firstMessage turn — both within the documented OpenAI Realtime schema and billing-safe (session.update does not invoke the model).

Test plan

Python: pytest tests/unit/test_providers_io_unit.py::TestOpenAIRealtimeAdapterIO -x — 21 passed (was 18, +3 new)
Python: full pytest tests/ — 1844 passed, 7 skipped
TypeScript: npm test -- --run — 1520 passed across 85 files (was 1516, +4 new)
TypeScript: npm run lint — clean
TypeScript: npm run build — clean
Manual smoke: outbound call on Twilio with agent.prewarm=true, OpenAI Realtime provider, first_message="Hello! Can you hear me?". Verify that the call-log transcript starts with the agent's firstMessage, not the caller's "Hi."

Docs updates

N/A — no public surface change, fix is entirely internal to the adapter.

With ``agent.prewarm=true`` (default) the OpenAI Realtime WebSocket is parked, primed, and adopted at call pickup with ``source=adopted ms=0``. The audio bridge is live the instant the callee answers, and the caller's "Hi" / "Hello?" reliably reaches OpenAI in the ~250-450 ms before the firstMessage audio starts streaming back. OpenAI's server-VAD treats that early caller audio as a barge-in and silently cancels the in-flight ``response.create``, so the configured ``first_message`` is never delivered. The cold ``connect()`` path masked the bug because the WS handshake naturally buffered ~300 ms of caller silence.

Fix: ``send_first_message`` / ``sendFirstMessage`` now arm a one-shot server-VAD lockout. A ``session.update`` with ``turn_detection: null`` (OpenAI-documented: disables server-VAD entirely, no audio-driven response cancellation) is sent immediately before ``response.create``, then the receive loop / message listener restores the original ``turn_detection`` block (snapshotted from the configured ``vad_type`` / ``silence_duration_ms`` / ``threshold`` / ``prefix_padding_ms``) on the firstMessage ``response.done`` so barge-in works normally for every subsequent turn. The lockout is strictly one-shot.

``turn_detection: null`` was chosen over a temporary high ``silence_duration_ms`` because it is fully OpenAI-documented and guarantees zero server-side cancellation (timer-based fallbacks remain sensitive to clock skew on multi-second ``response.done`` windows). Complements the client-side ``firstAudioSentAt`` guard from PR #92, which prevents the local audio bridge from clearing the playout buffer on caller speech; this closes the same gap on the *server* side.

Coverage: 3 new Python tests + 4 new TypeScript tests in the ``OpenAIRealtimeAdapter`` IO suites, covering lockout sequence, custom ``silence_duration_ms`` / ``vad_type`` restore, one-shot semantics, and no-ws no-op.
Files: libraries/python/getpatter/providers/openai_realtime.py, libraries/typescript/src/providers/openai-realtime.ts, libraries/python/tests/unit/test_providers_io_unit.py, libraries/typescript/tests/unit/openai-realtime.test.ts, CHANGELOG.md.

nicolotognoni mentioned this pull request May 12, 2026

diag(0.6.1): Realtime firstMessage lockout — instrumentation + finding #99

Open

4 tasks

nicolotognoni wants to merge 1 commit into feat/observability-otel-attrs-0.6.1 from fix/0.6.1-realtime-firstmessage-adopted-race
