Releases · HAKORADev/VODER · GitHub

07 Apr 18:36

HAKORADev

Major Update v2 - SE Mode, SFX, Script Directives

Latest

04/08/2026

Status: Stable, all features work, still developing
Major Update - Two New Modes, Script Directives, SFX Integration, and Speech Enhancement

Added

SE (Speech Enhancement) Mode - Audio quality improvement using UniSE model
SFX (Sound Effects Generation) Mode - Custom sound effects from text using TangoFlux
Script Directives - Per-line control over timing, volume, and duration
SFX Character in Dialogue - Embed sound effects directly in dialogue scripts
Enhanced Dialogue Assembly - Complete rewrite of dialogue generation pipeline
Cross-use Feature - Mix generated and cloned voices in the same dialogue
Music Volume Level Control - Fine-grained control over background music volume

Changed

Expanded Processing Modes - VODER now has 9 processing modes (up from 7)

Enhancement - Music Chunking and Duration Fix

04/03/2026

Status: Stable, all features work, still developing

Added

Background Music Chunking for Long Dialogues - Generate multiple music chunks for dialogues longer than 250 seconds

Fixed

Music Generation Minimum Duration - Enforced 10-second minimum for music generation to match ACE-Step model requirements
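
The two duration rules in this release can be sketched together. This is a minimal illustration, not VODER's actual code: the function name `plan_music_chunks` is hypothetical, and it assumes that 250 seconds is also the maximum length of a single chunk and that a final chunk shorter than 10 seconds is raised to the minimum (with the surplus trimmed later).

```python
MAX_CHUNK_SECONDS = 250.0  # threshold from the release note; assumed to be the per-chunk cap
MIN_MUSIC_SECONDS = 10.0   # minimum generation length from the release note


def plan_music_chunks(dialogue_seconds):
    """Return the music chunk durations (in seconds) needed to cover a dialogue.

    Hypothetical helper: splits the dialogue into chunks of at most
    MAX_CHUNK_SECONDS and raises any chunk below MIN_MUSIC_SECONDS
    up to that minimum.
    """
    if dialogue_seconds <= 0:
        return []
    chunks = []
    remaining = float(dialogue_seconds)
    while remaining > 0:
        length = min(remaining, MAX_CHUNK_SECONDS)
        # A short tail is still requested at the minimum supported duration.
        chunks.append(max(length, MIN_MUSIC_SECONDS))
        remaining -= length
    return chunks
```

Under these assumptions, a 600-second dialogue needs three chunks (250, 250, and 100 seconds), while a 5-second dialogue still requests a single 10-second chunk.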

Major Update v1 - STT, Diarization, OCR, YouTube

04/02/2026

Status: Stable, all features work, still developing

Added

STT (Speech-to-Text) Standalone Mode - Transcribe audio, video, image, or YouTube URL to text
Speaker Diarization (Pyannote Integration) - Identify and separate speakers in audio
Image Text Extraction (EasyOCR) - Extract text from images
YouTube and Video Platform Download (yt-dlp) - Download audio from YouTube, Bilibili, TikTok
Automatic Voice Clip Extraction - Extract individual speaker voice clips automatically

Changed

Centralized Model Management System - Complete overhaul of model storage and caching

Major Update - MSTS and Memory Offloading

02/24/2026

Status: Stable, all features work, still developing

Added

MSTS (Music-STS) in STS mode - STS now supports musical inputs via the Seed-VC v1 model
TTS+VC dialogue voice cloning stability - Voice characteristics extracted once per character
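
Extracting voice characteristics once per character amounts to caching. The sketch below is illustrative only: `VoiceProfileCache` and the `extract_fn` callable are hypothetical names, not part of VODER, and the real extraction step is assumed to be an expensive function of the reference audio.

```python
class VoiceProfileCache:
    """Caches one voice profile per character (hypothetical helper)."""

    def __init__(self, extract_fn):
        # extract_fn is a stand-in for the real, expensive extraction step.
        self._extract = extract_fn
        self._profiles = {}

    def get(self, character, reference_audio):
        # Extract only on the first request for this character;
        # every later line spoken by the same character reuses the result.
        if character not in self._profiles:
            self._profiles[character] = self._extract(reference_audio)
        return self._profiles[character]
```

Because every line for a character reuses the same profile, the cloned voice stays consistent across a dialogue and the extraction cost is paid once per character rather than once per line.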

Optimized

Memory offloading after processing - Models explicitly unloaded from memory/VRAM
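
One common way to unload a model explicitly is to drop the last reference to it, run the garbage collector, and ask PyTorch to release cached GPU memory. The sketch below shows that pattern under stated assumptions: `ModelSlot` and its `loader` argument are hypothetical, and whether VODER uses this exact approach is not stated in the release notes.

```python
import gc


class ModelSlot:
    """Loads a model on entry and releases it on exit (hypothetical helper)."""

    def __init__(self, loader):
        self._loader = loader  # callable that returns a loaded model
        self.model = None

    def __enter__(self):
        self.model = self._loader()
        return self.model

    def __exit__(self, exc_type, exc, tb):
        # Drop our reference so the model can be freed.
        self.model = None
        gc.collect()
        try:
            import torch  # optional: only relevant when PyTorch is installed
            if torch.cuda.is_available():
                # Return cached, unused GPU memory to the driver.
                torch.cuda.empty_cache()
        except ImportError:
            pass
        return False  # never swallow exceptions from the with-block
```

The caller must also drop its own reference (for example with `del model` after the `with` block); memory is only freed once no references to the model remain.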

Major Update - Dialogue Support

02/12/2026

Status: Stable, all features work, under aggressive testing, still developing

Added

Full dialogue support in CLI - Both interactive and one-liner modes now support multi-speaker scripts
Optional background music for dialogue scripts - Available in TTS and TTS+VC modes
Row-based dialogue editor in GUI - Replaced free-text script box with per-row Character/Dialogue fields

Fixed

Memory optimisation for TTM+VC

Seed-VC v2 Fix

02/10/2026

Status: Stable, all features work, under aggressive testing, still developing

Fixed

Seed-VC v2 tensor mismatch error that caused both STS and TTM+VC to fail. STS now works correctly; TTM+VC will receive further optimisations.

Initial Release - Unstable Development Build

Pre-release

02/09/2026

Status: unstable, untested, under development

First public release of VODER.
