v1.18.1 — TTS: strip JA full-stops before Kokoro misaki phonemizer by amariichi · Pull Request #59 · amariichi/MinimumHeadroom

amariichi · 2026-05-24T11:12:55Z

Summary

Fixes an audible artifact at the end of Japanese sentences spoken through Kokoro: misaki's pyopenjtalk-backed G2P maps the JA full-stop 「。」 (and its fullwidth ASCII twin 「．」) to an actual phoneme rather than silence, so Kokoro rendered chunk endings as a short "ye"-like sound.

Diagnosis

Confirmed on hardware:

あ。 → user heard あ plus an extra sound
あ。。。。。 → user heard あ followed by five repeated イヤ-like sounds (one per period), demonstrating each 。 reaches misaki and produces a phoneme

Fix

New module tts-worker/src/tts_worker/kokoro_text.py exposes strip_japanese_silent_punctuation (regex [。．]+ → empty).
KokoroEngine._to_ja_phonemes strips the input before calling misaki, and returns '' if the chunk becomes empty after stripping.
KokoroEngine.synthesize_chunks now skips chunks that come back empty from _to_ja_phonemes, which avoids the "misaki ja g2p returned empty phoneme output" error previously raised when empty text reached misaki.
Other JA punctuation (「、」「！」「？」「・」「…」) is intentionally left in place — those either drive prosodic pausing or have not been observed to produce artifacts.
Shared text normalization (tts_worker.shared_text) is untouched, so the existing "preserve JA punctuation" contract there still holds; the strip is engine-local to Kokoro, parallel to how qwen3_text.py houses Qwen3-specific text prep.
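
The strip step described above can be sketched as follows. This is a minimal illustration built from the regex quoted in this PR, not the actual contents of kokoro_text.py. The helper `to_ja_phonemes` and its `g2p` parameter are hypothetical stand-ins for `KokoroEngine._to_ja_phonemes` and the misaki phonemizer.

```python
import re
from typing import Callable

# Pattern quoted in the PR: one or more JA full-stops (U+3002) or
# fullwidth full-stops (U+FF0E).
_SILENT_PUNCTUATION = re.compile("[。．]+")


def strip_japanese_silent_punctuation(text: str) -> str:
    """Remove 「。」 and 「．」; leave every other character untouched."""
    return _SILENT_PUNCTUATION.sub("", text)


def to_ja_phonemes(text: str, g2p: Callable[[str], str]) -> str:
    """Hypothetical stand-in for KokoroEngine._to_ja_phonemes.

    `g2p` is an assumed callable wrapping the misaki phonemizer.
    """
    stripped = strip_japanese_silent_punctuation(text)
    if not stripped:
        # Nothing left to speak: return '' so the caller can skip the chunk
        # instead of handing empty text to the phonemizer.
        return ""
    return g2p(stripped)
```

Keeping the strip in an engine-local helper, rather than in the shared normalizer, means other engines still receive the original punctuation.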

Tests

tts-worker/tests/test_kokoro_text.py (8 tests): trailing 「。」, repeated 「。」 runs, internal+trailing, fullwidth 「．」, preservation of other JA punctuation, punctuation-only input → empty, empty input passthrough, ASCII passthrough.
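
The eight cases listed above might look like the table below. The concrete inputs are illustrative assumptions, since the contents of test_kokoro_text.py are not shown in this PR, and the function is re-declared only so the snippet runs standalone.

```python
import re


def strip_japanese_silent_punctuation(text: str) -> str:
    # Re-declared so the snippet is self-contained; mirrors the regex in the PR.
    return re.sub("[。．]+", "", text)


# (description, input, expected). Inputs are illustrative, not copied
# from the real test file.
CASES = [
    ("trailing full-stop", "こんにちは。", "こんにちは"),
    ("repeated run", "あ。。。。。", "あ"),
    ("internal and trailing", "はい。そうです。", "はいそうです"),
    ("fullwidth full-stop", "あ．", "あ"),
    ("other JA punctuation preserved", "え、本当！？・…", "え、本当！？・…"),
    ("punctuation-only input", "。．。", ""),
    ("empty input", "", ""),
    ("ASCII passthrough", "Hello, world.", "Hello, world."),
]


def run_cases() -> int:
    """Check every case and return how many were checked."""
    for description, text, expected in CASES:
        actual = strip_japanese_silent_punctuation(text)
        assert actual == expected, f"{description}: {actual!r} != {expected!r}"
    return len(CASES)
```

The ASCII case matters because the half-width period (U+002E) is a different code point from 「．」 (U+FF0E) and should pass through unchanged.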

Test plan

Ran new + existing tts-worker tests locally — 25/25 pass.
Live hardware test on AtomS3R after a fresh Kokoro restart: あ。 now sounds like あ only, no trailing artifact.
Reviewer: confirm synthesize_chunks still finalizes correctly when every chunk happens to be punctuation-only (rare; the chunk loop simply produces no audio and returns the zero-sample fallback).
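
For the reviewer question above, the skip behaviour can be pictured with a simplified chunk loop. This is a sketch under stated assumptions, not the real `KokoroEngine.synthesize_chunks`: the `to_phonemes` and `render` callables are hypothetical, and the zero-sample fallback is assumed here to be an empty sample buffer.

```python
from typing import Callable, Iterable, List


def synthesize_chunks(
    chunks: Iterable[str],
    to_phonemes: Callable[[str], str],
    render: Callable[[str], List[float]],
) -> List[float]:
    """Hypothetical chunk loop that skips chunks with no phonemes."""
    samples: List[float] = []
    for chunk in chunks:
        phonemes = to_phonemes(chunk)
        if not phonemes:
            # Punctuation-only chunk: nothing to render, move on.
            continue
        samples.extend(render(phonemes))
    # If every chunk was skipped, this is the assumed zero-sample fallback:
    # an empty buffer rather than an exception.
    return samples
```

With stand-in callables, an all-punctuation input such as `["。", "。。"]` yields an empty buffer, while `["あ。", "。"]` renders only the first chunk.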

🤖 Generated with Claude Code

Misaki's pyopenjtalk-backed Japanese G2P maps the JA full-stop 「。」 (and its fullwidth ASCII twin 「．」) to an audible phoneme rather than silence, so Kokoro rendered chunk endings as a short "ye"-like artifact. Confirmed on hardware with `あ。` (one artifact) and `あ。。。。。` (five artifacts).

Fix: new `tts_worker.kokoro_text.strip_japanese_silent_punctuation` runs in `KokoroEngine._to_ja_phonemes` immediately before misaki sees the chunk text. Chunk boundaries already separate sentences, so dropping these characters removes the artifact without losing meaningful prosody. Other JA punctuation (「、」「！」「？」「・」「…」) is left in place because it either drives prosodic pausing or has not been observed to produce an artifact.

If a chunk becomes empty after stripping, the engine now skips it instead of feeding empty text to misaki (which raised "misaki ja g2p returned empty phoneme output").

Tests: 8 new unit tests in `tts-worker/tests/test_kokoro_text.py` cover trailing, repeated, internal, fullwidth, ASCII passthrough, empty input, and preservation of other JA punctuation. Verified end-to-end on hardware after a fresh Kokoro restart.

Version bumped across all six sites.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

amariichi merged commit 0ea6511 into main May 24, 2026
1 check passed

amariichi deleted the fix/kokoro-ja-period-artifact branch May 24, 2026 11:13
