Partial transcription, in-place history rewrite, live UI polish by etiennechabert · Pull Request #14 · etiennechabert/polyglot

etiennechabert · 2026-04-21T12:10:32Z

Summary

Streaming-style transcription: while a speaker keeps talking, Polyglot re-runs Whisper + translation on the growing buffer every 10 s and emits a payload with the same utterance_id. The frontend upserts by utterance_id, so the same row updates in place; when the final batch flushes (speaker switch or 60 s cap) it replaces the partial with the polished version.
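The timing described above can be sketched as a small decision function. This is an illustration only, not the code in the PR: `next_action`, `simulate`, and the one-second tick are hypothetical, and the constants simply reuse the 10 s interval and 60 s cap quoted in the summary.

```python
PARTIAL_INTERVAL_SEC = 10.0  # interval quoted in the summary
MAX_BATCH_SEC = 60.0         # cap quoted in the summary


def next_action(batch_age_sec, since_last_emit_sec, speaker_changed):
    """Decide what to emit for the current batch (hypothetical helper)."""
    # A speaker switch or the batch cap always wins and flushes a final.
    if speaker_changed or batch_age_sec >= MAX_BATCH_SEC:
        return "final"
    # Otherwise re-transcribe the growing buffer once per interval.
    if since_last_emit_sec >= PARTIAL_INTERVAL_SEC:
        return "partial"
    return "wait"


def simulate(total_sec, switch_at=None):
    """Tick once per second and record (second, action) until the final."""
    events = []
    last_emit = 0
    for t in range(1, total_sec + 1):
        action = next_action(t, t - last_emit, switch_at == t)
        if action == "partial":
            events.append((t, "partial"))
            last_emit = t
        elif action == "final":
            events.append((t, "final"))
            break
    return events


uninterrupted = simulate(120)
with_switch = simulate(120, switch_at=25)
```

With no speaker switch the sketch yields partials at 10, 20, 30, 40 and 50 s and a final at the 60 s cap; a switch at 25 s ends the batch early with a final.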

What ships

Backend

partial_transcribe_and_emit() runs Whisper + translations on a buffer snapshot, uses _current_bot_speaker as the speaker label, skips diarization / transcript-file write / summarization accumulation.
Shares transcription_lock with the final so partial + final Whisper calls serialise on the GPU.
Mints a fresh utterance_id on the first chunk of every batch (in both the is_processing=True and =False branches of process_audio) and resets it when the final flushes, so every partial + final for one turn shares the same key.
Final WS payloads now carry utterance_id + is_partial:false; partials carry utterance_id + is_partial:true.
Admin auth replays the current bot_status (including the meeting URL) + meet_roster so a late-arriving admin tab reflects the active bot.
Cache-Control: no-store headers on the viewer + admin routes so browsers always fetch fresh templates.
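
The utterance_id lifecycle and the shared lock described above might look roughly like the sketch below. Only `transcription_lock`, `utterance_id`, and `is_partial` are names from the PR; `UtteranceTracker`, `make_payload`, `transcribe_serialised`, the `text` field, and the fake model call are assumptions made for illustration.

```python
import threading
import uuid

# Stand-in for the shared lock named in the PR: partial and final
# transcriptions both take it, so only one Whisper call runs at a time.
transcription_lock = threading.Lock()


class UtteranceTracker:
    """Hypothetical helper that owns the utterance_id of the current batch."""

    def __init__(self):
        self._id = None

    def on_chunk(self):
        # Mint an id on the first chunk of a batch, then reuse it.
        if self._id is None:
            self._id = uuid.uuid4().hex
        return self._id

    def on_final_flush(self):
        # Return the id for the final payload and reset for the next turn.
        finished, self._id = self._id, None
        return finished


def make_payload(utterance_id, text, is_partial):
    # "text" is an assumed field name, not taken from the PR.
    return {"utterance_id": utterance_id, "text": text, "is_partial": is_partial}


def transcribe_serialised(transcribe, audio):
    # Hold the lock for the whole call so partial and final runs never overlap.
    with transcription_lock:
        return transcribe(audio)


def fake_whisper(audio):
    # Placeholder for the real model call.
    return audio.upper()


tracker = UtteranceTracker()
partial = make_payload(tracker.on_chunk(), transcribe_serialised(fake_whisper, "hello"), True)
final = make_payload(tracker.on_final_flush(), transcribe_serialised(fake_whisper, "hello world"), False)
next_turn_id = tracker.on_chunk()
```

The partial and the final carry the same key, and the first chunk after the flush gets a fresh one.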

Bot

Emits a bot_info event on connect with the meeting URL so the admin panel shows which call the bot is attached to, even when the bot was started from the CLI.
nameFromTile() now strips the "'s Presentation" suffix so a screen-sharing participant's tile doesn't label them "X's Presentation".
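
The suffix handling can be illustrated with a short Python rendition. The real `nameFromTile()` lives in the bot and is not shown in this PR page, so `strip_presentation_suffix` is a hypothetical equivalent; accepting a typographic apostrophe as well as a straight one is an extra assumption.

```python
# Suffix variants to strip; the curly-apostrophe form is an assumption.
PRESENTATION_SUFFIXES = ("'s Presentation", "\u2019s Presentation")


def strip_presentation_suffix(tile_name):
    """Return the participant name without a trailing presentation suffix."""
    name = tile_name.strip()
    for suffix in PRESENTATION_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)].rstrip()
    return name


shared_screen = strip_presentation_suffix("Etienne Chabert's Presentation")
camera_tile = strip_presentation_suffix("Etienne Chabert")
```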

Frontend (admin + viewer)

Transcript is keyed by utterance_id. Partial arrivals upsert the matching row; finals replace it. Timestamps are preserved across the swap so rows don't reshuffle.
Partials get a dashed left border + "● live" tag + 0.75 opacity; finals get a solid border + full opacity.
Viewer row segments render in a vertical body column instead of the flex-row that was making multi-segment utterances render horizontally.
Viewer keeps the ?p=<password> query param across refreshes so the user doesn't have to re-enter the passphrase on every reload.
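
The upsert-by-key behaviour can be modelled in a few lines. The actual frontend is browser code, so this Python version is only a sketch of the logic: `upsert_row`, `row_style`, and the `timestamp` and `text` field names are assumptions, while the dashed/solid border, the "● live" tag, and the 0.75 opacity come from the description above.

```python
def upsert_row(rows, payload):
    """Insert or replace the row for payload['utterance_id'] in place.

    rows is a dict keyed by utterance_id; dicts keep insertion order, so
    re-assigning an existing key leaves the row where it first appeared.
    """
    uid = payload["utterance_id"]
    row = dict(payload)
    if uid in rows:
        # Keep the first-seen timestamp so rows do not reshuffle.
        row["timestamp"] = rows[uid]["timestamp"]
    rows[uid] = row
    return row


def row_style(row):
    """Map the partial/final state to the visual treatment described above."""
    if row["is_partial"]:
        return {"border_left": "dashed", "opacity": 0.75, "tag": "● live"}
    return {"border_left": "solid", "opacity": 1.0, "tag": None}


rows = {}
upsert_row(rows, {"utterance_id": "u1", "timestamp": 100, "text": "hello", "is_partial": True})
upsert_row(rows, {"utterance_id": "u2", "timestamp": 105, "text": "hi", "is_partial": True})
upsert_row(rows, {"utterance_id": "u1", "timestamp": 160, "text": "Hello, world.", "is_partial": False})
```

After the final for `u1` arrives, its row still sits before `u2` and keeps the timestamp of the first partial; only the text and the styling change.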

Test plan

Validated live against a real Meet call: partials fire every 10 s, carry the same utterance_id, final at 60 s cap flips the bubble from dashed to solid and mints a new id for the next turn.
The log line "[BOT] Resolved SPEAKER_XX → Etienne Chabert's Presentation" proves Phase 5 still works end-to-end on finals.
Diagnostic socket.io client confirms partials reach admin + lang_en rooms when viewers are joined.
Admin bot-status + URL replay on auth verified with a manual refresh after bot was already running.

🤖 Generated with Claude Code

Streaming-style transcription: while a speaker keeps talking, Polyglot re-runs Whisper + translation on the growing buffer every 10 seconds and emits a payload with the same utterance_id. The frontend upserts by utterance_id so the same row updates in place; when the final batch flushes (speaker switch or 60 s cap) it replaces the partial with the polished version.

Backend
- New partial_transcribe_and_emit() runs Whisper + translations on a buffer snapshot, uses _current_bot_speaker as the speaker label, skips diarization, transcript-file write, and summarization accumulation.
- Shares the existing transcription_lock so partial + final Whisper calls serialise on the GPU.
- Mints a fresh utterance_id on the first chunk of every batch (both the is_processing=True and =False branches of process_audio) and resets it when the final flushes, so every partial + final for one turn shares the same key.
- Final WS payloads now carry utterance_id + is_partial=false; partials carry utterance_id + is_partial=true.
- Admin auth replays the current bot_status (including the meeting URL) + meet_roster so a late-arriving admin tab reflects the active bot.
- Cache-Control: no-store headers on the viewer + admin routes so browsers always fetch fresh templates.

Bot
- Emits a `bot_info` event on connect with the meeting URL so the admin panel shows which call the bot is attached to, even when the bot was started from the CLI.
- nameFromTile() now strips the "'s Presentation" suffix so a screen-sharing participant's tile doesn't label them "X's Presentation".

Frontend (admin + viewer)
- Transcript is now keyed by utterance_id. Partial arrivals upsert the matching row; finals replace it. Timestamps are preserved across the swap so rows don't reshuffle.
- In-place styling distinguishes the two states: partials get a dashed left border + "● live" tag + 0.75 opacity, finals get a solid border + full opacity.
- Viewer row segments render in a vertical body column instead of the flex-row that was making multi-segment utterances render horizontally.
- Viewer keeps the ?p=<password> query param across refreshes so the user doesn't have to re-enter the passphrase on every reload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Partial transcription interval is now a runtime-mutable threshold (audio_thresholds["partial_interval_sec"]), default 5 s (was 10 s). Settable via /api/thresholds, overridable via PARTIAL_INTERVAL_SEC env var. New admin Settings slider: "Partial Transcription Interval (seconds)" (2..30 s, step 1).
- Admin auth replays the current bot_status (with meeting URL) and meet_roster so a late-joining admin tab immediately reflects the active bot instead of showing Start-bot while a bot is already live.
- Admin + viewer transcript renderers now group consecutive same-speaker segments: speaker name shown once at the top, each sentence rendered as a bulleted line with its own :SS timestamp (the second of the minute when the sentence started). Cuts the repetition when one speaker says several sentences in one turn.
- Viewer keeps the ?p=<password> query param across refreshes so the passphrase survives reloads.
- Remove the "Speaking: X" banner that toggled on/off every ~1.5 s as captions paused. Partial bubbles already carry the speaker name inline, so the banner was redundant and visually noisy.
- Cache-Control: no-store on / /viewer /admin so browsers always pull fresh templates (no more hard-refresh needed during development).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
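
The same-speaker grouping from this commit can be sketched as follows. The real renderers run in the browser, so this Python version only models the logic: `group_by_speaker` and the `speaker`, `start`, and `text` field names are assumptions, and `start` is assumed to be a `datetime`.

```python
from datetime import datetime
from itertools import groupby


def group_by_speaker(segments):
    """Collapse consecutive same-speaker segments into one group each.

    Each group shows the speaker once; every sentence keeps its own
    ":SS" label, the second of the minute at which it started.
    """
    groups = []
    for speaker, run in groupby(segments, key=lambda s: s["speaker"]):
        lines = [(f":{s['start'].second:02d}", s["text"]) for s in run]
        groups.append({"speaker": speaker, "lines": lines})
    return groups


segments = [
    {"speaker": "A", "start": datetime(2026, 4, 21, 12, 10, 5), "text": "Hello."},
    {"speaker": "A", "start": datetime(2026, 4, 21, 12, 10, 9), "text": "How are you?"},
    {"speaker": "B", "start": datetime(2026, 4, 21, 12, 10, 14), "text": "Fine."},
    {"speaker": "A", "start": datetime(2026, 4, 21, 12, 11, 2), "text": "Good."},
]
groups = group_by_speaker(segments)
```

`groupby` only merges adjacent items, so speaker A appears twice here: once for the opening two sentences and once again after B has spoken.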

etiennechabert and others added 2 commits April 21, 2026 14:10

etiennechabert merged commit 2e54eb6 into main Apr 21, 2026
2 of 6 checks passed

etiennechabert deleted the claude/streaming-partial-transcription branch April 21, 2026 12:37


Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

etiennechabert commented Apr 21, 2026

Summary

What ships

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant