Releases: HAKORADev/VODER
Major Update v2 - SE Mode, SFX, Script Directives
04/08/2026
- Status: Stable, all features work, still developing
- Major Update: Two New Modes, Script Directives, SFX Integration, and Speech Enhancement
Added
- SE (Speech Enhancement) Mode - Audio quality improvement using UniSE model
- SFX (Sound Effects Generation) Mode - Custom sound effects from text using TangoFlux
- Script Directives - Per-line control over timing, volume, and duration
- SFX Character in Dialogue - Embed sound effects directly in dialogue scripts
- Enhanced Dialogue Assembly - Complete rewrite of dialogue generation pipeline
- Cross-use Feature - Mix generated and cloned voices in the same dialogue
- Music Volume Level Control - Fine-grained control over background music volume
Changed
- Expanded Processing Modes - VODER now has 9 processing modes (up from 7)
Enhancement - Music Chunking and Duration Fix
04/03/2026
- Status: Stable, all features work, still developing
Added
- Background Music Chunking for Long Dialogues - Generate multiple music chunks for dialogues longer than 250 seconds
Fixed
- Music Generation Minimum Duration - Enforced 10-second minimum for music generation to match ACE-Step model requirements
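The two duration rules above (chunking past 250 seconds and a 10-second minimum) can be illustrated with a small planning helper. This is only a sketch under assumptions: `plan_music_chunks` is a hypothetical name, treating 250 seconds as the per-chunk cap is a guess based on the threshold in the note, and splitting into equal-length chunks is one possible policy, not necessarily what VODER does.

```python
import math

# Threshold from the release note; using it as the per-chunk cap is an assumption.
MAX_CHUNK_SECONDS = 250.0
# Minimum music duration stated in the release note.
MIN_MUSIC_SECONDS = 10.0


def plan_music_chunks(dialogue_seconds: float) -> list[float]:
    """Return hypothetical music chunk durations that cover the dialogue."""
    if dialogue_seconds <= 0:
        raise ValueError("dialogue_seconds must be positive")
    # Dialogues shorter than the minimum are padded up to it.
    total = max(dialogue_seconds, MIN_MUSIC_SECONDS)
    # Number of chunks needed so that none exceeds the cap.
    count = math.ceil(total / MAX_CHUNK_SECONDS)
    # Equal-length chunks avoid a tiny trailing chunk below the minimum.
    return [total / count] * count
```

With this policy a 600-second dialogue is planned as three 200-second chunks, and a 4-second dialogue as a single 10-second chunk.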
Major Update v1 - STT, Diarization, OCR, YouTube
04/02/2026
- Status: Stable, all features work, still developing
Added
- STT (Speech-to-Text) Standalone Mode - Transcribe audio, video, image, or YouTube URL to text
- Speaker Diarization (Pyannote Integration) - Identify and separate speakers in audio
- Image Text Extraction (EasyOCR) - Extract text from images
- YouTube and Video Platform Download (yt-dlp) - Download audio from YouTube, Bilibili, TikTok
- Automatic Voice Clip Extraction - Extract individual speaker voice clips automatically
Changed
- Centralized Model Management System - Complete overhaul of model storage and caching
Major Update - MSTS and Memory Offloading
02/24/2026
- Status: Stable, all features work, still developing
Added
- MSTS (Music-STS) in STS mode - STS now supports musical inputs via the Seed-VC v1 model
- TTS+VC dialogue voice cloning stability - Voice characteristics extracted once per character
Optimized
- Memory offloading after processing - Models explicitly unloaded from memory/VRAM
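Explicit unloading of this kind is commonly done by dropping the last reference to a model, forcing garbage collection, and releasing cached GPU memory. The sketch below shows that general pattern, assuming a PyTorch-based model; `ModelSlot` and its methods are hypothetical names for illustration and are not taken from the VODER codebase.

```python
import gc

try:
    import torch  # optional; only needed to release cached GPU memory
except ImportError:
    torch = None


class ModelSlot:
    """Hypothetical holder that owns one loaded model at a time."""

    def __init__(self):
        self.model = None

    def load(self, factory):
        # factory is any zero-argument callable that builds the model.
        self.model = factory()
        return self.model

    def unload(self):
        # Drop the slot's strong reference so the object can be collected.
        self.model = None
        gc.collect()
        # Release cached CUDA memory when PyTorch with CUDA is present.
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
```

Memory is only reclaimed if no other reference to the model survives, so callers should not keep the object returned by `load` after calling `unload`.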
Major Update - Dialogue Support
02/12/2026
- Status: Stable, all features work, under aggressive testing, still developing
Added
- Full dialogue support in CLI - Both interactive and one-liner modes now support multi-speaker scripts
- Optional background music for dialogue scripts - Available in TTS and TTS+VC modes
- Row-based dialogue editor in GUI - Replaced free-text script box with per-row Character/Dialogue fields
Fixed
- Memory optimisation for TTM+VC
Seed-VC v2 Fix
02/10/2026
- Status: Stable, all features work, under aggressive testing, still developing
Fixed
- Seed-VC v2 tensor mismatch error that caused both STS and TTM+VC to fail. STS now works as expected; TTM+VC will receive further optimisations.
Initial Release - Unstable Development Build
02/09/2026
- Status: unstable, untested, under development
First public release of VODER.