Feat/supertonic tts engine by tantara · Pull Request #1067 · heygen-com/hyperframes

tantara · 2026-05-24T21:16:57Z

What

Adds Supertonic 3 as a second --engine option, refactors the TTS command into an engine abstraction, adds engine-routing guidance to the media skill, and hardens model downloads. Usage examples are included.

Why

Kokoro covers 8 languages and requires Python plus espeak-ng, while Supertonic covers 31 languages and runs in-process. The routing is deliberate: English or Chinese goes to Kokoro and everything else goes to Supertonic, because Chinese is Kokoro-only.

How

A per-file breakdown plus three notable decisions: zero new dependencies, sample-rate handling for 24 kHz and 44.1 kHz engine output (the producer resamples to 48 kHz), and abstraction over replacement.
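To illustrate why differing engine sample rates are harmless once the producer resamples to 48 kHz, here is a minimal linear-interpolation resampler. It is a sketch only: `resampleLinear` is a hypothetical name, and the producer's actual resampling method is not described in this PR and is likely more sophisticated.

```typescript
// Hypothetical helper, not code from this PR: linear-interpolation resampling.
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number,
): Float32Array {
  if (fromRate === toRate) return input.slice();

  // Duration is preserved, so the sample count scales by toRate / fromRate.
  const outLength = Math.round((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample

  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1); // clamp at the last sample
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}

// One second of audio at either engine rate becomes 48,000 samples.
const fromSupertonic = resampleLinear(new Float32Array(44_100), 44_100, 48_000);
const fromKokoro = resampleLinear(new Float32Array(24_000), 24_000, 48_000);
```

Linear interpolation is chosen here only for brevity; a production mixer would normally apply a band-limited filter to avoid aliasing.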

Examples

[kokoro:english]

scope-persona-intro-english-voiceover-under10mb.mp4

[supertonic:korean]

scope-persona-intro-korean-supertonic-under10mb.mp4

Test plan

How was this tested?

Unit tests added/updated
Manual testing performed
Documentation updated (if applicable)

Refactor the Kokoro-only TTS command into a pluggable engine system and add Supertonic 3 as a second backend. Unlike Kokoro (which shells out to Python via kokoro-onnx), Supertonic runs the full 4-model pipeline in-process through onnxruntime-node -- no Python, no new dependencies.

- engine.ts: TtsEngine interface + lazy getEngine() registry (kokoro|supertonic)
- engines/kokoro.ts: adapter over the existing kokoro-onnx path (unchanged behavior)
- engines/supertonic/manager.ts: auto-downloads models + voice styles from huggingface.co/Supertone/supertonic-3 to ~/.cache/hyperframes/tts/supertonic/
- engines/supertonic/runtime.ts: inference pipeline ported from upstream helper.js
- engines/supertonic/index.ts: SupertonicEngine implementation
- commands/tts.ts: add --engine and --steps flags; route voices/langs per engine

44.1 kHz Supertonic output is compatible with the producer audio mixer, which resamples all inputs to 48 kHz.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
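As a rough illustration of the "TtsEngine interface + lazy getEngine() registry" idea, the sketch below shows one way such a registry could look. The method names, fields, and stub engines are assumptions made for illustration; the actual definitions in engine.ts may differ.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the PR's engine.ts.
interface TtsEngine {
  readonly name: string;
  readonly sampleRate: number; // output sample rate in Hz
  synthesize(text: string): Promise<Float32Array>;
}

type EngineName = "kokoro" | "supertonic";

// Stub backend that returns one second of silence; stands in for a real engine.
function makeStubEngine(name: EngineName, sampleRate: number): TtsEngine {
  return {
    name,
    sampleRate,
    synthesize: async () => new Float32Array(sampleRate),
  };
}

// Factories run only on first request, so an unused engine is never constructed.
const factories = new Map<EngineName, () => TtsEngine>([
  ["kokoro", () => makeStubEngine("kokoro", 24_000)],
  ["supertonic", () => makeStubEngine("supertonic", 44_100)],
]);

const instances = new Map<EngineName, TtsEngine>();

function getEngine(name: string): TtsEngine {
  const key = name as EngineName;
  const factory = factories.get(key);
  if (!factory) {
    const known = Array.from(factories.keys()).join("|");
    throw new Error(`Unknown TTS engine "${name}" (expected ${known})`);
  }
  let engine = instances.get(key);
  if (!engine) {
    engine = factory(); // lazy construction on first use
    instances.set(key, engine);
  }
  return engine;
}
```

Keeping construction behind a factory means a user who only ever runs one engine never pays the load cost of the other.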

Add engine-selection guidance to the media skill now that two TTS engines exist. Rule of thumb: English or Chinese -> Kokoro; everything else -> Supertonic.

- Kokoro is the only engine with Chinese (zh); Supertonic has no Chinese.
- Supertonic covers 31 languages and needs no Python/espeak-ng, so it is the preferred path for Korean, German, Russian, Arabic, and the other non-English languages Kokoro cannot synthesize.
- Document --engine/--steps usage, Supertonic voices (F1-F5, M1-M5), the full 31-code --lang list, and per-engine requirements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
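The rule of thumb above can be expressed as a small function. This is a sketch under assumptions: `pickEngine` is a hypothetical helper, the codes `en`, `ko`, and `de` are assumed to follow the same convention as the `zh` code mentioned above, and the function does not check that a language is actually in Supertonic's 31-code list.

```typescript
type EngineName = "kokoro" | "supertonic";

// Hypothetical helper: English or Chinese -> Kokoro, everything else -> Supertonic.
function pickEngine(lang: string): EngineName {
  // Reduce a tag such as "zh-CN" or "en_US" to its primary subtag.
  const primary = lang.trim().toLowerCase().split(/[-_]/)[0];
  return primary === "en" || primary === "zh" ? "kokoro" : "supertonic";
}

const forEnglish = pickEngine("en"); // "kokoro"
const forChinese = pickEngine("zh-CN"); // "kokoro"
const forKorean = pickEngine("ko"); // "supertonic"
```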

Follow 303/307/308 in addition to 301/302, resolve relative Location headers against the request URL, and cap redirects at 10 to avoid infinite loops. Improves reliability of model downloads that bounce through CDN redirects (e.g. Hugging Face for the Supertonic TTS assets).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
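The redirect rules in this commit can be sketched as follows. The function names and the `Hop` type are hypothetical, and the `respond` callback stands in for a real HTTP request so the logic stays self-contained; the PR's actual downloader code is not shown here.

```typescript
const REDIRECT_STATUSES = new Set([301, 302, 303, 307, 308]);
const MAX_REDIRECTS = 10;

// Hypothetical shape for the parts of a response the redirect logic needs.
interface Hop {
  status: number;
  location?: string;
}

// Returns the next URL to request, or null when the response is not a redirect.
function nextRedirectUrl(
  requestUrl: string,
  hop: Hop,
  redirectsSoFar: number,
): string | null {
  if (!REDIRECT_STATUSES.has(hop.status)) return null;
  if (!hop.location) {
    throw new Error(`Redirect ${hop.status} without a Location header`);
  }
  if (redirectsSoFar >= MAX_REDIRECTS) {
    throw new Error(`Too many redirects (limit ${MAX_REDIRECTS})`);
  }
  // The URL constructor resolves relative Location values against the request URL.
  return new URL(hop.location, requestUrl).toString();
}

// Walks a redirect chain; `respond` is a stand-in for issuing the request.
function resolveFinalUrl(startUrl: string, respond: (url: string) => Hop): string {
  let url = startUrl;
  for (let redirects = 0; ; redirects++) {
    const next = nextRedirectUrl(url, respond(url), redirects);
    if (next === null) return url;
    url = next;
  }
}
```

Resolving against the current request URL, rather than the original one, matters when a chain crosses hosts and a later hop returns a relative path.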

tantara and others added 3 commits May 24, 2026 13:57

tantara wants to merge 3 commits into heygen-com:main from tantara:feat/supertonic-tts-engine
