feat: anti-repetition protection for whisper transcription #552
Open
egsok wants to merge 1 commit into OpenWhispr:main
Three layers against hallucination loops:
- --max-context 128 limits cross-segment context to break feedback loops
- --entropy-thold 2.8 discards low-confidence (hallucinated) segments
- Regex dedup removes consecutive repeated phrases (10+ chars) from output
New "Suppress repeated phrases" toggle in Settings → Transcription (default: on).
i18n keys added for all 10 locales.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Problem
Whisper (especially large-v3) is prone to hallucination loops — repeating the same phrase indefinitely, particularly during silence or at audio segment boundaries.
The root cause: Whisper uses the previous segment's output as context for the next. A repetition in segment N propagates to N+1, N+2, and so on, creating a self-reinforcing feedback loop. This is one of the most widely reported Whisper issues:
- -mc + large-v2
- --max-context 64 or 32, --beam-size 5, --entropy-thold ~2.8
- -mc 0 = condition_on_previous_text=False
Solution
Three layers of protection:
1. --max-context 128: limits cross-segment context to 128 tokens, breaking the feedback loop. Does NOT affect initial_prompt (custom prompt and dictionary words are always sent in full). Maintainer recommends 64 (#1507); we use a more conservative 128 to preserve cross-segment continuity, relying on the regex dedup layer to catch any loops that slip through.
2. --entropy-thold 2.8: discards segments where model confidence is very low (high entropy = silence/noise/hallucination). Default is 2.4; raised to 2.8 per maintainer recommendation in #1507. Zero performance overhead.
3. Regex dedup: post-processing safety net that removes consecutive repeated phrases (10+ chars) from final output. The 10-char minimum avoids false positives on common short phrases.
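The third layer can be sketched roughly as follows. This is a minimal illustration and not necessarily the code in this PR: only the removeRepetitions() name and the 10-character minimum come from the description above, while the exact regex is an assumption.

```javascript
// Sketch only: the regex is an assumed implementation of the "10+ chars" rule.
function removeRepetitions(text) {
  // (.{10,}?)   lazily captures a phrase of at least 10 characters
  // (?:\s*\1)+  matches one or more immediate repeats of that same phrase,
  //             optionally separated by whitespace
  const repeatedPhrase = /(.{10,}?)(?:\s*\1)+/g;
  // Keep a single copy of each run of repeats.
  return text.replace(repeatedPhrase, "$1");
}

const looped =
  "Thank you for watching. Thank you for watching. Thank you for watching.";
console.log(removeRepetitions(looped)); // prints "Thank you for watching."

// Short repeats fall under the 10-character minimum and are left alone.
console.log(removeRepetitions("yes, yes, yes")); // prints "yes, yes, yes"
```

A backreference pattern like this can backtrack heavily on very long inputs, so a real implementation may want to bound the phrase length or run it per segment.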
How Whisper context works
UI
New "Suppress repeated phrases" toggle in Settings → Transcription (default: on). When disabled, only regex dedup is skipped — whisper.cpp flags remain active as they improve quality regardless.
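The split between the always-on flags and the optional dedup could look roughly like this. It is a sketch under assumptions: buildServerArgs() and postProcess() are hypothetical names, and the dedup function is passed in as a parameter purely to keep the example self-contained. Only the suppressRepetition setting and the two flag values come from this PR.

```javascript
// Hypothetical helper: the whisper.cpp flags are appended unconditionally,
// so they stay active whether or not the toggle is on.
function buildServerArgs(baseArgs) {
  return [...baseArgs, "--max-context", "128", "--entropy-thold", "2.8"];
}

// Hypothetical post-processing step: only the regex dedup is gated
// by the suppressRepetition setting.
function postProcess(rawText, settings, dedupe) {
  const text = rawText.trim();
  return settings.suppressRepetition ? dedupe(text) : text;
}

// Usage with a stand-in dedupe function (not the real regex):
const markDeduped = (t) => `[deduped] ${t}`;
console.log(buildServerArgs(["--port", "8080"]).join(" "));
console.log(postProcess(" hello ", { suppressRepetition: false }, markDeduped));
console.log(postProcess(" hello ", { suppressRepetition: true }, markDeduped));
```

With the toggle off, the text passes through unchanged apart from trimming, while the server arguments are identical in both cases.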
Files changed
- src/helpers/whisperServer.js: --max-context 128 and --entropy-thold 2.8 flags
- src/helpers/audioManager.js: removeRepetitions() regex dedup + wiring in processTranscription()
- src/components/SettingsPage.tsx
- src/hooks/useSettings.ts + src/stores/settingsStore.ts: suppressRepetition boolean setting
- src/locales/*/translation.json
Test plan
- --max-context 128
- --max-context 128 and --entropy-thold 2.8 in server startup args