feat: add MiniMax Cloud TTS engine (speech-2.8-hd / speech-2.8-turbo) #53

Open: octo-patch wants to merge 1 commit into DrewThomasson:main from octo-patch:feature/add-minimax-tts

Summary

This PR adds MiniMax Cloud TTS as a first-class TTS engine in VoxNovel, alongside the existing local Coqui / StyleTTS2 models.

Changes

  • minimax_tts.py (new): Lightweight client for the MiniMax T2A v2 API. It accepts text and a voice ID, decodes the hex-encoded MP3 response, and converts it to WAV via ffmpeg. Supports the speech-2.8-hd and speech-2.8-turbo models.
  • 2GPU_Audio_generation.py: Adds "MiniMax TTS" to the TTS model combobox, adds 12 verified English voice IDs to the character voice-actor dropdowns, and makes generate_audio() dispatch to the MiniMax API when that engine is selected.
  • headless_voxnovel.py: Same three changes for the headless / Colab path.
  • tests/test_minimax_tts.py (new): 24 unit tests (HTTP + subprocess mocked) + 3 integration tests; all 27 pass.
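
The decode-and-convert step described for minimax_tts.py can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names are hypothetical, and the assumption that the hex string lives at `data.audio` in the response JSON should be checked against the actual client.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def decode_hex_audio(payload: dict) -> bytes:
    """Return raw audio bytes from a response whose audio is hex-encoded.

    Assumption: the hex string sits at payload["data"]["audio"]; the real
    minimax_tts.py may read a different field.
    """
    hex_audio = (payload.get("data") or {}).get("audio")
    if not hex_audio:
        raise ValueError("response contains no audio data")
    return bytes.fromhex(hex_audio)


def ffmpeg_wav_command(mp3_path: Path, wav_path: Path) -> list[str]:
    """Build the ffmpeg argv that converts an MP3 file to WAV."""
    # -y overwrites an existing output; ffmpeg infers WAV from the extension.
    return ["ffmpeg", "-y", "-i", str(mp3_path), str(wav_path)]


def save_as_wav(payload: dict, wav_path: Path) -> Path:
    """Decode the response, write a temporary MP3, and convert it to WAV."""
    mp3_path = wav_path.with_suffix(".mp3")
    mp3_path.write_bytes(decode_hex_audio(payload))
    try:
        subprocess.run(
            ffmpeg_wav_command(mp3_path, wav_path),
            check=True,
            capture_output=True,
        )
    finally:
        # Remove the intermediate MP3 whether or not ffmpeg succeeded.
        mp3_path.unlink(missing_ok=True)
    return wav_path
```

Keeping the command construction separate from the subprocess call makes the conversion easy to unit-test without ffmpeg installed.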

How to use

  1. Set your API key: export MINIMAX_API_KEY=your_key_here
  2. Launch VoxNovel and select MiniMax TTS in the TTS model dropdown.
  3. Each character voice-actor dropdown now includes minimax:English_Graceful_Lady, minimax:Deep_Voice_Man, etc.
  4. Assign a MiniMax voice to each character and click Generate.
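
As an illustration of how the `minimax:` prefix and the `MINIMAX_API_KEY` variable from the steps above might be handled, here is a small sketch. The helper names are hypothetical and the routing shown is an assumption; the PR's generate_audio() may dispatch differently.

```python
from __future__ import annotations

import os
from typing import Mapping

MINIMAX_PREFIX = "minimax:"


def split_voice_actor(voice_actor: str) -> tuple[str, str]:
    """Split a dropdown entry into (engine, voice_id).

    Entries such as "minimax:Deep_Voice_Man" route to the cloud engine;
    anything else is treated as a local voice (an assumption of this sketch).
    """
    if voice_actor.startswith(MINIMAX_PREFIX):
        return "minimax", voice_actor[len(MINIMAX_PREFIX):]
    return "local", voice_actor


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read MINIMAX_API_KEY and fail early with a clear message if it is unset."""
    key = env.get("MINIMAX_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MINIMAX_API_KEY is not set")
    return key
```

Checking for the key before any request is sent lets the application report a missing key up front instead of surfacing an authentication failure from the API.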

Available voices

English_Graceful_Lady, English_Insightful_Speaker, English_radiant_girl, English_Persuasive_Man, English_Lucky_Robot, Wise_Woman, cute_boy, lovely_girl, Friendly_Person, Inspirational_girl, Deep_Voice_Man, sweet_girl

Why MiniMax TTS?

  • No GPU required, so it works on machines without CUDA
  • High-quality output from the speech-2.8-hd model
  • Faster speech-2.8-turbo variant for lower latency
  • No local model download, which suits Colab

Test plan

  • 24 unit tests pass without network (python -m pytest tests/test_minimax_tts.py -k "not Integration")
  • 3 live integration tests pass with MINIMAX_API_KEY set
  • All 27 tests pass
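
For readers unfamiliar with the split between mocked and live tests, the pattern might look like the sketch below. It is not taken from tests/test_minimax_tts.py: `convert_to_wav` is a hypothetical stand-in for the code under test, and it uses the standard-library unittest module so the snippet has no third-party dependencies (pytest collects `unittest.TestCase` classes, and `-k "not Integration"` filters on names such as the class name).

```python
import os
import subprocess
import unittest
from unittest import mock


def convert_to_wav(mp3_path: str, wav_path: str) -> str:
    """Hypothetical stand-in for the conversion step under test."""
    subprocess.run(["ffmpeg", "-y", "-i", mp3_path, wav_path], check=True)
    return wav_path


class TestConvertToWav(unittest.TestCase):
    def test_invokes_ffmpeg_without_running_it(self):
        # Patch subprocess.run so no real ffmpeg process is started.
        with mock.patch("subprocess.run") as fake_run:
            result = convert_to_wav("in.mp3", "out.wav")
        fake_run.assert_called_once_with(
            ["ffmpeg", "-y", "-i", "in.mp3", "out.wav"], check=True
        )
        self.assertEqual(result, "out.wav")


@unittest.skipUnless(os.environ.get("MINIMAX_API_KEY"), "MINIMAX_API_KEY not set")
class TestLiveIntegration(unittest.TestCase):
    def test_key_is_present(self):
        # A real integration test would call the API here; this sketch only
        # shows how the class is skipped when no key is configured.
        self.assertTrue(os.environ["MINIMAX_API_KEY"])
```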

Add MiniMax TTS (speech-2.8-hd / speech-2.8-turbo) as a selectable TTS
engine alongside the existing local Coqui/StyleTTS2 models.

Changes:
- minimax_tts.py: new MiniMaxTTS client
- 2GPU_Audio_generation.py / headless_voxnovel.py: MiniMax TTS in dropdown + voice actor list + generate_audio() dispatch
- tests/test_minimax_tts.py: 24 unit + 3 integration tests (all 27 pass)