A comprehensive Gradio-based Web User Interface for managing, training, and running inference on the Index-TTS text-to-speech model. This interface streamlines the entire ML pipeline from data preparation to final audio generation.
Original repository: Index-TTS Official Repository
Note on Language Support: This project is designed primarily for training Turkish models, but it can also be used to train other languages written in the Latin script. Languages that use a non-Latin script may require modifications to the code.
This WebUI provides a modular, tabbed workflow:
- Home: Project management and real-time hardware monitoring (CPU, RAM, VRAM, Temps).
- Models: Model checkpoint selection and management.
- Corpus & Dataset: Audio and text data ingestion, formatting, and dataset compilation.
- Tokenizer & Preprocessor: Text tokenization and audio preprocessing pipelines for model ingestion.
- Trainer: Interface to configure and monitor Index-TTS model training/fine-tuning.
- Inference: Generate high-fidelity audio from text using trained checkpoints.
- TTS: A standalone inference engine that bypasses project settings for direct model loading, zero-shot controls, and rapid generation.
- Voice Conversion (RVC): Integrated Applio RVC architecture for zero-shot voice conversion and highly accurate timbre modification.
- NVIDIA GPU (Highly recommended for Training and Inference)
- CUDA Toolkit compatible with your PyTorch installation
- Windows 10+
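
One way to confirm that the GPU and CUDA requirements above are met is a small check run from the project's Python environment. This is a minimal sketch, not part of the repository: the function name `describe_gpu` is hypothetical, and it only reports what PyTorch itself sees.

```python
def describe_gpu() -> str:
    """Return a one-line summary of the PyTorch/CUDA setup (hypothetical helper)."""
    try:
        # Imported lazily so the check also works before PyTorch is installed.
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment."

    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} found, but no CUDA device is available."

    # torch.version.cuda is the CUDA version this PyTorch build was compiled against.
    name = torch.cuda.get_device_name(0)
    return f"PyTorch {torch.__version__} (CUDA {torch.version.cuda}) on {name}"


if __name__ == "__main__":
    print(describe_gpu())
```

If the summary reports that no CUDA device is available, training and inference will likely fall back to the CPU or fail, depending on the tab.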
To set up the ITTS-TR environment properly, please follow these steps:
- Get the Repository: Clone or download this repository to your local machine.
- Run the Installer: Navigate to the bat folder containing the setup scripts and double-click the install.bat file.
- Follow the On-Screen Prompts: The batch script will guide you through the following automated setup phases:
- Git Installation: You will be prompted to install a portable version of Git if you do not already have it.
- Python Setup: Enter Python version 3.11.9 when prompted. The script will download, extract, and configure an isolated Python environment, including necessary C++ headers and libraries via NuGet.
- Base Dependencies: The script installs modern build backends (uv and setuptools) and automatically installs the core Python requirements defined in requirments.txt.
- PyTorch & CUDA Configuration: The script will automatically detect the recommended Torch version (e.g., 2.8.0). You will be asked if you want to install it with CUDA support. If you proceed, you can select your preferred CUDA version (12.6, 12.8, or 13.0) to ensure proper GPU acceleration. Version 12.8 is highly recommended.
- FFmpeg Installation: You will be prompted to install FFmpeg, with options to choose either the Stable (v7.1.1) or the Latest Release.
- yt-dlp: You can optionally choose to install the yt-dlp executable for media downloading.
- Core Model Cloning: You will be asked to clone the sparse index-tts repository.
- RVC Integration: You will be prompted to clone the sparse Applio repository to integrate RVC features into the pipeline.
- Finalization & Patching: Finally, the script will automatically initialize the WebUI workspace folders (uix and wui) and apply mandatory dependency fixes to the Index-TTS, SpeechBrain, and RVC codebases.
The paths.bat configuration file contains an HF_TOKEN environment variable. This token is strictly required to authenticate and download certain gated models and weights from the Hugging Face Hub.
If you do not already have an HF_TOKEN configured as a global environment variable on your Windows system, you must open paths.bat in a text editor and manually insert your Hugging Face access token before attempting to download models in the WebUI.
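
Before downloading gated models, it can help to confirm that the token is actually visible to the process. The following is a minimal sketch under the assumption that the WebUI reads `HF_TOKEN` from the process environment. The helper name `has_hf_token` is hypothetical and does not exist in the repository.

```python
import os


def has_hf_token(env=None) -> bool:
    """Return True if a non-empty HF_TOKEN is present (hypothetical helper).

    `env` defaults to the real process environment; passing a dict
    makes the function easy to test.
    """
    env = os.environ if env is None else env
    # Treat a whitespace-only value the same as a missing token.
    return bool(env.get("HF_TOKEN", "").strip())


if __name__ == "__main__":
    if has_hf_token():
        print("HF_TOKEN is set.")
    else:
        print("HF_TOKEN is missing: add it to paths.bat or your Windows environment.")
```

This only checks that a value is present. It does not verify that the token is valid or that it has access to a particular gated model.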
To launch the interface, run the webui batch script from the root directory:
webui.bat
The application will generate a projects/ directory to store your workspace data and a wui.json file for your global UI preferences (like language settings). Open the local URL provided in your terminal (typically http://127.0.0.1:7860) in your browser.
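
If the page does not load, a quick way to tell whether the server is listening is to probe the port directly. This is an illustrative sketch only: the helper name `is_webui_up` is hypothetical, and the default host and port are simply the typical values mentioned above, which your terminal output may override.

```python
import socket


def is_webui_up(host: str = "127.0.0.1", port: int = 7860, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is listening on the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if is_webui_up() else "not reachable"
    print(f"WebUI at http://127.0.0.1:7860 is {state}.")
```

A successful connection only shows that some process is listening on that port. It does not confirm that the process is the WebUI.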
To launch TensorBoard, run the tensorboard batch script located inside the bat folder:
bat\tensorboard.bat
To achieve maximum training and inference speed by utilizing dynamically compiled GPU kernels, you can enable OpenAI's Triton. Since Triton compiles kernels natively at runtime, Windows users must configure a strict build environment.
System Requirements for Triton:
1. Visual Studio C++ Build Tools: Download the Visual Studio Installer and install the "Desktop development with C++" workload. This provisions the essential MSVC compiler (cl.exe).
2. NVIDIA CUDA Toolkit: Install the official standalone CUDA Toolkit. The version must exactly match the CUDA version you selected for PyTorch during the install.bat phase (e.g., 12.6, 12.8, or 13.0).
3. Strict Path Configuration: The dynamic compiler relies on hardcoded system paths. Ensure that your paths.bat file is configured so that its directory variables strictly match your local MSVC and CUDA Toolkit installation paths. If nvcc or cl.exe cannot be located by the batch script's internal routing, Triton will fail to compile the kernels.
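
To see whether the compiler and CUDA tools are discoverable, you can look them up on the current `PATH`. This is a minimal sketch with a hypothetical helper name, `locate_build_tools`. It inspects only the `PATH` of the process it runs in, so it reflects the paths.bat configuration only if it is launched from a shell where that script has already been applied.

```python
import shutil


def locate_build_tools(tools=("cl", "nvcc")) -> dict:
    """Map each tool name to its resolved path, or None if it is not on PATH.

    On Windows, shutil.which also tries the extensions in PATHEXT,
    so "cl" resolves to cl.exe when it is available.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in locate_build_tools().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

Any tool reported as not found is a likely cause of Triton kernel compilation failures.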
This repository utilizes a dual-licensing structure:
1. User Interface & Wrapper Code (Apache 2.0)
The overarching Gradio interface, project management logic, and utility scripts located in the root directory are licensed under the Apache License 2.0. See the LICENSE file in the root directory for full details.
2. Index-TTS Core Model (Bilibili Model Use License Agreement)
The core text-to-speech model, model weights, and specific training code located within the indextts/ directory are owned by Bilibili and are strictly governed by the Bilibili Model Use License Agreement. You can find this agreement in the official GitHub repository of index-tts. By using this software, you agree to comply with its terms, including prohibitions on high-risk deployment.
Any modifications made to the original model in this Derivative Work are not endorsed, warranted, or guaranteed by the original right-holder of the original model, and the original right-holder disclaims all liability related to this Derivative Work.