v1.17.3 — TTS: scope hai-filler to Qwen3 + fullwidth-ify JA halfwidth digits #56

Merged
amariichi merged 2 commits into main from release/v1.17.3
May 23, 2026

@amariichi
Owner

Summary

  • Move the apply_japanese_leading_*_filler calls out of the shared text normalizer and into the Japanese branch of prepare_qwen3_text. The leading "はい、" was a Qwen3-only Mandarin-drift mitigation; Kokoro+misaki has no such drift and was getting an unwanted prefix on every sentence that starts with a halfwidth token (e.g. execplanを作成しました。 → はい、execplanを…).
  • Convert halfwidth digits to fullwidth inside normalize_japanese_tts_text so misaki/Kokoro stays on the Japanese G2P path. Fixes the primary motivating case: 今日は5月23日です。 was rendering as ファイブ月とウェンティースリー日です. Pure-English utterances are unaffected because they go through normalize_english_tts_text instead.
  • 6 new shared_text tests (date phrase, dotted version, mixed sentence with thousands separator, pure-English routing, decoration); 3 existing tests updated to match the new behavior; 3 new qwen3_text tests asserting Qwen3-Japanese-mode still receives the filler.
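
The digit conversion in the second bullet can be illustrated with a minimal sketch. This is not the code from the PR: `fullwidth_digits` and `normalize_for_kokoro_sketch` are hypothetical names, the Japanese-character check is an assumed stand-in for however the project routes text between its Japanese and English normalizers, and only the ten ASCII digits are mapped (separators such as `.` and `,` are left alone, which is an assumption).

```python
import re

# Map ASCII digits 0-9 to their fullwidth forms U+FF10..U+FF19.
_HALF_TO_FULL_DIGITS = str.maketrans("0123456789", "０１２３４５６７８９")

# Assumed heuristic: text counts as Japanese if it contains hiragana,
# katakana, or a CJK unified ideograph. The real routing rule is not
# shown in this PR.
_JA_CHARS = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")


def fullwidth_digits(text: str) -> str:
    """Replace halfwidth digits with fullwidth digits; leave everything else as-is."""
    return text.translate(_HALF_TO_FULL_DIGITS)


def normalize_for_kokoro_sketch(text: str) -> str:
    """Hypothetical stand-in: convert digits only when the text looks Japanese."""
    if _JA_CHARS.search(text):
        return fullwidth_digits(text)
    # Pure-English text is left untouched in this sketch.
    return text


if __name__ == "__main__":
    print(normalize_for_kokoro_sketch("今日は5月23日です。"))
    print(normalize_for_kokoro_sketch("The build runs at 5:30 on port 8080."))
```

Using `str.translate` with a precomputed table keeps the conversion to a single pass and touches nothing but the ten digit code points.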

Test plan

  • cd tts-worker && uv run python -m unittest discover -s tests → 42/42 pass
  • npm test → 372/372 pass
  • End-to-end through the live Kokoro pipeline (operator stack restarted in-place):
    • 今日は5月23日です。 → Japanese date reading
    • execplanを作成しました。 → no leading "はい、"
    • 23日までに完了します。 → Japanese digit reading, no filler
    • The build runs at 5:30 on port 8080. → English reading preserved
    • 現在のバージョンは1.2.3です。 → Japanese reading of 1.2.3
  • Reviewer: spot-check that the fullwidth conversion does not introduce surprising readings on mixed JA/English sentences (e.g. API 2.0 を試した → API ２.０ を試した; acceptable since misaki is already in JA mode when JA characters are present)

🤖 Generated with Claude Code

amariichi and others added 2 commits May 23, 2026 21:32
…JA text

Two independent corrections to the shared text pipeline that affect how
Kokoro (the default engine) reads Japanese utterances.

1. The "はい、" leading filler was added to shared_text to nudge Qwen3
   away from Mandarin pronunciation when a Japanese sentence opens with
   a halfwidth ASCII or numeric token. Kokoro+misaki does not have that
   drift, so receiving the filler added a spurious "Yes," before
   sentences such as "execplanを作成しました。". Move
   apply_japanese_leading_numeric_filler and
   apply_japanese_leading_unknown_ascii_filler out of
   normalize_japanese_tts_text and into the Japanese branch of
   prepare_qwen3_text. Qwen3 behavior is unchanged; Kokoro stops getting
   the prefix.

2. Halfwidth digits embedded in Japanese text used to fall through to
   misaki's English G2P, so "今日は5月23日です。" rendered as
   "ファイブ月とウェンティースリー日". Convert all halfwidth digits to
   fullwidth in normalize_japanese_tts_text so misaki keeps them on the
   Japanese G2P path ("今日は５月２３日です。"). Pure English utterances
   are routed through normalize_english_tts_text and are not touched,
   so "The build runs at 5:30 on port 8080." still reads as English.
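
The scoping change in item 1 can be sketched as follows. Everything here is illustrative: the function names, the `engine` and `language` parameters, and the known-token tuple are hypothetical stand-ins, and the real `apply_japanese_leading_numeric_filler` and `apply_japanese_leading_unknown_ascii_filler` are not reproduced. The sketch only shows the shape of the fix, where the shared normalizer no longer adds the filler and the Qwen3 Japanese path adds it itself.

```python
import re

FILLER = "はい、"

# Sentence opens with a halfwidth ASCII letter or digit.
_LEADING_HALFWIDTH = re.compile(r"^[0-9A-Za-z]")

# Hypothetical allowlist standing in for the project's notion of
# "known" leading tokens that do not need the filler.
_KNOWN_LEADING_TOKENS = ("GitHub",)


def normalize_shared_sketch(text: str) -> str:
    """Shared normalization used by every engine; adds no filler."""
    return text.strip()


def add_leading_filler_sketch(text: str) -> str:
    """Prefix the filler when the sentence opens with an unknown halfwidth token."""
    if text.startswith(_KNOWN_LEADING_TOKENS):
        return text
    if _LEADING_HALFWIDTH.match(text):
        return FILLER + text
    return text


def prepare_for_engine_sketch(text: str, engine: str, language: str = "ja") -> str:
    """Apply shared normalization, then the filler only on the Qwen3 Japanese path."""
    normalized = normalize_shared_sketch(text)
    if engine == "qwen3" and language == "ja":
        return add_leading_filler_sketch(normalized)
    return normalized


if __name__ == "__main__":
    for engine in ("kokoro", "qwen3"):
        print(engine, prepare_for_engine_sketch("execplanを作成しました。", engine))
```

Keeping the mitigation next to the engine that needs it means a new engine gets clean shared output by default and has to opt in to any such workaround.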

Tests:

- shared_text: 6 new cases covering the JA-digit conversion (date
  phrase, dotted version, mixed sentence with thousands separator,
  pure-English routing, decoration). Updated 3 existing cases whose
  asserted output no longer includes the moved-out "はい、" prefix or
  the previously-untouched halfwidth digits.
- qwen3_text: 3 new cases asserting that Qwen3 in Japanese mode still
  receives the leading filler for halfwidth-start sentences while
  known-token sentences (GitHub...) remain unaffected.
- Full python suite: 42/42 pass. Full node suite: 372/372 pass.

Verified end-to-end through the live Kokoro pipeline:

- "今日は5月23日です。" → JA reading
- "execplanを作成しました。" → no spurious leading "はい、"
- "23日までに完了します。" → JA digit reading, no filler
- "The build runs at 5:30 on port 8080." → English reading preserved
- "現在のバージョンは1.2.3です。" → JA reading of 1.2.3

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Patch release for the Kokoro-side TTS text-pipeline cleanup. Six sites
move together: package.json, mcp-server/dist/index.js SERVER_VERSION,
tts-worker/pyproject.toml, asr-worker/pyproject.toml, and both uv.lock
files (refreshed via `uv lock`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
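
Because the release commit above bumps the version in several files at once, a small consistency check can catch a missed site. The sketch below is an assumption-laden illustration, not part of this repository: `find_version_mismatches` is a hypothetical helper, it assumes `SERVER_VERSION` is assigned a quoted string literal in `mcp-server/dist/index.js`, it takes the first line-initial `version = "..."` in each `pyproject.toml`, and it leaves the `uv.lock` files to `uv lock`.

```python
import json
import re

# First `version = "..."` at the start of a line (assumed to be the project version).
_PYPROJECT_VERSION = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)

# Assumed shape: SERVER_VERSION = "x.y.z" (single or double quotes).
_SERVER_VERSION = re.compile(r"SERVER_VERSION\s*=\s*[\"']([^\"']+)[\"']")


def find_version_mismatches(expected, package_json, index_js, pyprojects):
    """Return {path: found_version} for every site that does not match `expected`.

    `package_json` and `index_js` are file contents as strings;
    `pyprojects` maps a path to that pyproject.toml's contents.
    A site where no version could be found is reported with None.
    """
    found = {"package.json": json.loads(package_json).get("version")}

    match = _SERVER_VERSION.search(index_js)
    found["mcp-server/dist/index.js"] = match.group(1) if match else None

    for path, text in pyprojects.items():
        match = _PYPROJECT_VERSION.search(text)
        found[path] = match.group(1) if match else None

    return {path: version for path, version in found.items() if version != expected}
```

An empty result means every checked site agrees with the expected release version.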
amariichi merged commit dd0dec1 into main on May 23, 2026
1 check passed
amariichi deleted the release/v1.17.3 branch on May 23, 2026 12:34