v1.17.3 — TTS: scope hai-filler to Qwen3 + fullwidth-ify JA halfwidth digits#56
Merged
…JA text
Two independent corrections to the shared text pipeline that affect how
Kokoro (the default engine) reads Japanese utterances.
1. The "はい、" leading filler was added to shared_text to nudge Qwen3
away from Mandarin pronunciation when a Japanese sentence opens with
a halfwidth ASCII or numeric token. Kokoro+misaki does not have that
drift, so receiving the filler added a spurious "Yes," before
sentences such as "execplanを作成しました。". Move
apply_japanese_leading_numeric_filler and
apply_japanese_leading_unknown_ascii_filler out of
normalize_japanese_tts_text and into the Japanese branch of
prepare_qwen3_text. Qwen3 behavior is unchanged; Kokoro stops getting
the prefix.
2. Halfwidth digits embedded in Japanese text used to fall through to
misaki's English G2P, so "今日は5月23日です。" rendered as
"ファイブ月とウェンティースリー日". Convert all halfwidth digits to
fullwidth in normalize_japanese_tts_text so misaki keeps them on the
Japanese G2P path ("今日は5月23日です。"). Pure English utterances
are routed through normalize_english_tts_text and are not touched,
so "The build runs at 5:30 on port 8080." still reads as English.
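A minimal sketch of the digit conversion described in item 2, assuming it amounts to a per-character mapping of ASCII 0-9 onto the fullwidth block U+FF10..U+FF19. The helper name `fullwidth_digits` is hypothetical; the real normalize_japanese_tts_text is not shown in this PR and does more than this.

```python
# Hypothetical helper, not the project's actual code.
# Maps ASCII digits 0-9 to fullwidth digits U+FF10..U+FF19.
_DIGIT_TABLE = {ord(str(i)): chr(0xFF10 + i) for i in range(10)}


def fullwidth_digits(text: str) -> str:
    """Return text with every halfwidth digit replaced by its fullwidth form.

    Everything else, including ASCII punctuation such as "." and ":",
    is left unchanged.
    """
    return text.translate(_DIGIT_TABLE)


if __name__ == "__main__":
    print(fullwidth_digits("今日は5月23日です。"))
    print(fullwidth_digits("現在のバージョンは1.2.3です。"))
```

The mapping is unconditional, so a helper like this would also rewrite "5:30" in an English sentence. That is why it only belongs on the Japanese path, with English utterances routed elsewhere as described above.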
Tests:
- shared_text: 6 new cases covering the JA-digit conversion (date
phrase, dotted version, mixed sentence with thousands separator,
pure-English routing, decoration). Updated 3 existing cases whose
asserted output no longer includes the moved-out "はい、" prefix or
the previously-untouched halfwidth digits.
- qwen3_text: 3 new cases asserting that Qwen3 in Japanese mode still
receives the leading filler for halfwidth-start sentences while
known-token sentences (GitHub...) remain unaffected.
- Full python suite: 42/42 pass. Full node suite: 372/372 pass.
Verified end-to-end through the live Kokoro pipeline:
- "今日は5月23日です。" → JA reading
- "execplanを作成しました。" → no spurious leading "はい、"
- "23日までに完了します。" → JA digit reading, no filler
- "The build runs at 5:30 on port 8080." → English reading preserved
- "現在のバージョンは1.2.3です。" → JA reading of 1.2.3
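The filler-related checks above follow from the scoping change in item 1. A rough sketch of that split, under stated assumptions: `shared_normalize_ja` and `prepare_for_qwen3` are hypothetical stand-ins for normalize_japanese_tts_text and prepare_qwen3_text (whose real signatures are not shown here), "halfwidth-leading" is approximated as "starts with an ASCII letter or digit", and the known-token allowlist is reduced to a single illustrative entry.

```python
import re

FILLER = "はい、"

# Assumption: a sentence is "halfwidth-leading" if it starts with an
# ASCII letter or digit. The real detection logic may differ.
_LEADING_ASCII = re.compile(r"^[A-Za-z0-9]")

# Assumption: illustrative allowlist of tokens that do not need the filler.
KNOWN_TOKENS = ("GitHub",)


def shared_normalize_ja(text: str) -> str:
    """Engine-neutral stand-in: cleans up text but never adds the filler."""
    return text.strip()


def prepare_for_qwen3(text: str, language: str) -> str:
    """Qwen3-only stand-in: the filler is applied in the Japanese branch."""
    normalized = shared_normalize_ja(text)
    if language != "ja":
        return normalized
    if normalized.startswith(KNOWN_TOKENS):
        return normalized
    if _LEADING_ASCII.match(normalized):
        return FILLER + normalized
    return normalized


if __name__ == "__main__":
    sentence = "execplanを作成しました。"
    print(shared_normalize_ja(sentence))      # what a Kokoro-style caller would see
    print(prepare_for_qwen3(sentence, "ja"))  # what a Qwen3-style caller would see
```

Keeping the filler in the engine-specific step means the shared normalizer stays free of workarounds that only one engine needs.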
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Patch release for the Kokoro-side TTS text-pipeline cleanup. Six sites move together: package.json, mcp-server/dist/index.js SERVER_VERSION, tts-worker/pyproject.toml, asr-worker/pyproject.toml, and both uv.lock files (refreshed via `uv lock`).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
- Move the `apply_japanese_leading_*_filler` calls out of the shared text normalizer and into the Japanese branch of `prepare_qwen3_text`. The leading "はい、" was a Qwen3-only Mandarin-drift mitigation; Kokoro+misaki has no such drift and was getting an unwanted prefix on every halfwidth-leading sentence (e.g. `execplanを作成しました。` → `はい、execplanを…`).
- Convert halfwidth digits to fullwidth in `normalize_japanese_tts_text` so misaki/Kokoro stays on the Japanese G2P path. Fixes the primary motivating case: `今日は5月23日です。` was rendering as `ファイブ月とウェンティースリー日です`. Pure-English utterances are unaffected because they go through `normalize_english_tts_text` instead.
Test plan
- `cd tts-worker && uv run python -m unittest discover -s tests` → 42/42 pass
- `npm test` → 372/372 pass
- `今日は5月23日です。` → Japanese date reading
- `execplanを作成しました。` → no leading "はい、"
- `23日までに完了します。` → Japanese digit reading, no filler
- `The build runs at 5:30 on port 8080.` → English reading preserved
- `現在のバージョンは1.2.3です。` → Japanese reading of 1.2.3
- `API 2.0 を試した` → `API 2.0 を試した` (acceptable since misaki is already in JA mode when JA chars are present)
🤖 Generated with Claude Code