feat: add custom transcription prompt with language presets by egsok · Pull Request #537 · OpenWhispr/openwhispr

egsok · 2026-04-01T07:51:51Z

Problem

Whisper often drops punctuation entirely, producing a wall of unformatted text — especially for non-English languages. This is a well-known issue in the community:

openai/whisper#557, "No Punctuation" (punctuation loss discussion)
openai/whisper#194, "No punctuation for the first 75 minutes of the video. What could be the error / bug?" (punctuation loss discussion)
OpenAI Whisper Prompting Guide — official guidance on using initial_prompt to steer style

The documented solution: pass a well-punctuated paragraph as initial_prompt. Whisper doesn't follow instructions — it copies the style of the prompt. A paragraph full of commas, question marks, and em-dashes nudges the decoder to keep producing punctuation.

What I tested

I tested this in my fork with a hardcoded Russian punctuation prompt, and it works remarkably well. Russian transcriptions went from completely unpunctuated to properly formatted output with commas, periods, question marks, em-dashes, and quotation marks — consistently, across different recording lengths.

Why not just hardcode a default prompt

The prompt also acts as a language hint. When I had an English prompt hardcoded and the language was set to auto, Whisper started transcribing Russian speech as English. This is expected behavior — the prompt biases the language detector. So pre-filling a default prompt for all users would break the experience for anyone using auto with a non-English language.

Worth noting: a prompt in one language doesn't prevent Whisper from recognizing words in another. For example, a Russian prompt with language set to "Auto" works fine for mixed Russian/English speech — English words are still transcribed correctly.
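
As a rough illustration of why the prompt stays empty by default, here is a minimal sketch of how request fields could be assembled so that neither a language nor a prompt is sent unless the user chose one. The `TranscriptionSettings` shape and `buildRequestFields` helper are hypothetical and not taken from the OpenWhispr code; the `model`, `language`, and `prompt` field names follow OpenAI's transcription endpoint.

```typescript
// Hypothetical settings shape; not the actual OpenWhispr store.
interface TranscriptionSettings {
  language: string; // "auto" or a language code such as "ru"
  customPrompt: string; // empty by default
}

// Build the text fields of a transcription request.
// Nothing language-specific is sent unless the user opted in.
function buildRequestFields(settings: TranscriptionSettings): Record<string, string> {
  const fields: Record<string, string> = { model: "whisper-1" };

  // With "auto", omit the language field so detection stays unconstrained.
  if (settings.language !== "auto") {
    fields.language = settings.language;
  }

  // An empty prompt is omitted entirely, so it cannot bias language detection.
  const prompt = settings.customPrompt.trim();
  if (prompt.length > 0) {
    fields.prompt = prompt;
  }

  return fields;
}
```

With `language` set to "auto" and an empty prompt, only `model` is sent, which matches the behavior existing users have today.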

Solution

This PR adds a Transcription Prompt textarea in Settings → Transcription — empty by default, with an "Insert preset" dropdown offering punctuation prompts for 10 languages. Users pick a preset matching their language (or write their own), and Whisper starts producing punctuated output.

Implementation details:

Empty by default — no language bias for existing users
Presets for: English, Spanish, French, German, Portuguese, Italian, Russian, Japanese, Chinese Simplified/Traditional
Each preset uses native punctuation conventions (Russian «ёлочки», German „Gänsefüßchen“, French « guillemets », Japanese 「括弧」, Chinese “引号”)
Custom prompt placed after dictionary words in the combined prompt — survives Whisper's left-truncation of the 224-token initial_prompt window (dictionary words are lower priority and get truncated first). Truncation direction explicitly documented in code.
Token-aware budget with estimateTokens() heuristic (CJK ×2.2, Cyrillic ×0.5, Latin ×0.25) instead of a flat character limit — prevents CJK users from unknowingly exceeding the token window
Visual progress bar (gray → yellow at 80% → red at 95%) capped at ~112 tokens (~half the window, leaving room for Custom Dictionary)
Description explains how the prompt shares a token budget with Custom Dictionary
i18n: all 10 locales updated
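
The token heuristic described in the list above could look roughly like the sketch below. Only the three weights come from this PR; the Unicode ranges used to classify characters, the rounding, and the treatment of everything non-CJK and non-Cyrillic as "Latin" are assumptions, and the real estimateTokens() may differ.

```typescript
// Weights quoted in the PR description.
const CJK_WEIGHT = 2.2;
const CYRILLIC_WEIGHT = 0.5;
const DEFAULT_WEIGHT = 0.25; // Latin; also used for any other character (assumption)

// Assumed ranges: CJK punctuation, kana, ideographs, full-width forms.
function isCjk(code: number): boolean {
  return (
    (code >= 0x3000 && code <= 0x30ff) || // CJK punctuation, hiragana, katakana
    (code >= 0x3400 && code <= 0x4dbf) || // CJK Extension A
    (code >= 0x4e00 && code <= 0x9fff) || // CJK Unified Ideographs
    (code >= 0xff00 && code <= 0xffef) // half-width and full-width forms
  );
}

function isCyrillic(code: number): boolean {
  return code >= 0x0400 && code <= 0x04ff;
}

// Approximate the Whisper token count of a prompt. Not a real tokenizer.
// Works on UTF-16 code units, so characters outside the BMP count as two
// default-weight units.
function estimateTokens(text: string): number {
  let total = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (isCjk(code)) {
      total += CJK_WEIGHT;
    } else if (isCyrillic(code)) {
      total += CYRILLIC_WEIGHT;
    } else {
      total += DEFAULT_WEIGHT;
    }
  }
  return Math.ceil(total);
}
```

At these weights, a 112-token budget corresponds to roughly 50 CJK characters (112 / 2.2), 224 Cyrillic characters (112 / 0.5), or 448 Latin characters (112 / 0.25).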

Works with both local whisper.cpp and cloud Whisper API. Does not affect Custom Dictionary behavior.

In the future, we could consider auto-populating the prompt based on the selected transcription language — but I chose not to do that now to avoid breaking the experience for existing users.

Test plan

egsok · 2026-04-01T08:13:24Z

hey @gabrielste1n, I've been using this in my fork for a few weeks — the punctuation improvement is dramatic, especially for Russian. Happy to iterate quickly if anything needs adjusting.

JiwaniZakir

The character-based limit in SettingsPage.tsx (maxLength={900} and the 800/900 warning threshold) doesn't account for token density differences across scripts. For the included Japanese (ja) and Chinese (zh-CN, zh-TW) presets, each character typically maps to 2–3 Whisper tokens, meaning a 900-character CJK prompt could easily consume 1800+ tokens — far exceeding Whisper's 224-token initial_prompt window. This undermines the whole premise of the priority ordering in buildTranscriptionPrompt(), where the custom prompt is placed last specifically to survive truncation.

The comment in audioManager.js — "Custom prompt LAST — survives truncation (higher priority)" — is only correct if Whisper truncates from the left, which should be explicitly documented or linked to the Whisper source/docs, since this is a non-obvious assumption that future maintainers could easily break.

A more robust approach would be to enforce a token-based limit (or at least a much lower character limit for CJK locales) rather than a flat 900-character cap that gives a false sense of safety for non-Latin scripts.

…n prompt

The 900-character limit was misleading for CJK scripts where each character maps to ~2.2 Whisper tokens, easily exceeding the 224-token initial_prompt window. Replace with estimateTokens() that weights characters by script (CJK ×2.2, Cyrillic ×0.5, Latin ×0.25) and a visual progress bar capped at 112 tokens (~half the window, leaving room for Custom Dictionary).

- Shorten all presets to fit within token budget
- Add progress bar with color thresholds (gray → yellow → red)
- Enforce token budget in onChange instead of maxLength
- Update description in all 10 locales to explain shared token budget
- Document left-truncation assumption in audioManager.js

Addresses review feedback on PR OpenWhispr#537.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

egsok · 2026-04-05T08:18:55Z

Thanks for the thorough review @JiwaniZakir — all valid points, now addressed in the latest push:

Token-aware budget instead of character limit:

Added estimateTokens() heuristic that approximates Whisper token count by weighting characters by script: CJK ×2.2, Cyrillic ×0.5, Latin ×0.25. This is a lightweight approximation (not a real tokenizer) but sufficient to prevent CJK users from unknowingly blowing past the 224-token window.
Replaced maxLength={900} with an approximate budget of ~112 tokens (~half of Whisper's 224-token window, leaving room for Custom Dictionary)
Visual progress bar with color thresholds (gray → yellow at 80% → red at 95%) instead of the raw character counter
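
For illustration, the progress bar thresholds and the onChange budget check might be sketched as follows. The 112-token budget and the 80% / 95% thresholds are from this PR; the function names, the inclusive comparisons, and the rule that shrinking edits are always accepted are assumptions rather than the actual SettingsPage.tsx code.

```typescript
type BarColor = "gray" | "yellow" | "red";

// About half of Whisper's 224-token window, as stated in the PR.
const PROMPT_TOKEN_BUDGET = 112;

// Map token usage to the bar color: gray, yellow at 80%, red at 95%.
// Whether the thresholds are inclusive is an assumption.
function barColor(usedTokens: number, budget: number = PROMPT_TOKEN_BUDGET): BarColor {
  const ratio = usedTokens / budget;
  if (ratio >= 0.95) return "red";
  if (ratio >= 0.8) return "yellow";
  return "gray";
}

// Decide which value an onChange handler should keep.
// `estimate` stands in for a token estimator such as estimateTokens().
function acceptEdit(
  previous: string,
  next: string,
  estimate: (text: string) => number,
  budget: number = PROMPT_TOKEN_BUDGET
): string {
  const nextCost = estimate(next);
  const withinBudget = nextCost <= budget;
  // Always allow deletions, so an over-budget prompt can be trimmed down.
  const shrinking = nextCost < estimate(previous);
  return withinBudget || shrinking ? next : previous;
}
```

Rejecting the edit in the handler, rather than relying on maxLength, lets the limit depend on the script of the text instead of its raw length.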

Shortened presets:

All presets now fit within the token budget
CJK presets trimmed significantly while preserving full punctuation variety (！？「」""。，——)

Truncation direction documented:

Updated the comment in audioManager.js to explicitly state that Whisper truncates initial_prompt from the left (keeping rightmost tokens), with a reference to whisper.cpp's tokenize logic
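
To make the ordering argument concrete, here is a toy sketch of why the custom prompt goes last. The function name buildTranscriptionPrompt matches the one mentioned in the review, but this body, the separators, and the word-per-token truncation model are all assumptions for illustration; real Whisper truncation operates on tokenizer output, not on words.

```typescript
// Dictionary words first (lower priority), custom prompt last (higher priority).
// The ", " and " " separators are assumptions.
function buildTranscriptionPrompt(dictionaryWords: string[], customPrompt: string): string {
  const parts: string[] = [];
  if (dictionaryWords.length > 0) {
    parts.push(dictionaryWords.join(", "));
  }
  const trimmed = customPrompt.trim();
  if (trimmed.length > 0) {
    parts.push(trimmed);
  }
  return parts.join(" ");
}

// Toy model of left-truncation: one whitespace-separated word counts as one
// token, and only the rightmost `maxTokens` words are kept.
function simulateLeftTruncation(prompt: string, maxTokens: number): string {
  const words = prompt.split(/\s+/).filter((word) => word.length > 0);
  return words.slice(Math.max(0, words.length - maxTokens)).join(" ");
}
```

In this model, a combined prompt of two dictionary words followed by a five-word punctuated sentence, truncated to five tokens, loses the dictionary words and keeps the sentence intact. If the order were reversed, the punctuation example would be cut first.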

Updated description in all 10 locales to explain the shared token budget with Custom Dictionary.

Add user-editable "Transcription Prompt" textarea in Settings → Transcription with dropdown presets for 10 languages. Whisper copies the formatting style of this prompt, so a well-punctuated paragraph nudges it to produce punctuated output.

- Empty by default (avoids language bias in auto-detect mode)
- "Insert preset" dropdown: en, es, fr, de, pt, it, ru, ja, zh-CN, zh-TW
- Each preset uses native punctuation (Russian «ёлочки», German „Gänsefüßchen", etc.)
- Token-aware budget with estimateTokens() heuristic (CJK ×2.2, Cyrillic ×0.5, Latin ×0.25); progress bar replaces flat character limit
- Budget capped at ~112 tokens (~half of Whisper's 224-token window), leaving room for Custom Dictionary
- Dictionary words prepended automatically (truncated first by Whisper's 224-token window; left-truncation documented in code)
- i18n: all 10 locales updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

egsok force-pushed the feat/custom-transcription-prompt branch from f3d048d to eb0a350 · April 3, 2026 07:23

JiwaniZakir reviewed Apr 4, 2026


egsok force-pushed the feat/custom-transcription-prompt branch from bbfd3fe to 5382f8e · April 5, 2026 09:11

egsok wants to merge 1 commit into OpenWhispr:main from egsok:feat/custom-transcription-prompt
