I created a fork that is fully compatible with both Linux and Windows:
https://github.com/mytait/Zonos
Easy Installation
Prerequisite: make sure the CUDA toolkit and drivers, version 12.4 or later, are installed.
All you need to do to install (on either system) is:
- create a Python 3.10 environment
- clone the repo (download it)
- install eSpeak NG:
  Linux (Ubuntu): apt install -y espeak-ng
  Windows: use the official installer from https://github.com/espeak-ng/espeak-ng/releases, or more easily the command I provide in the README
- run one command:
pip install -r requirements1.txt -r requirements2.txt
Nothing else. Then it should run!
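Before running the pip command, a quick sanity check of the steps above can save some time. This is only a minimal sketch, not part of the fork: the function name `preflight` is made up for illustration, and on Windows the eSpeak NG installer may not add the program to PATH, so treat a "not found" message there as a hint rather than an error.

```python
import shutil
import sys


def preflight(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    problems = []
    # The instructions ask for a Python 3.10 environment.
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"Python 3.10 expected, found {version_info[0]}.{version_info[1]}"
        )
    # espeak-ng is looked up on PATH; this is a hint only (see note on Windows above).
    if which("espeak-ng") is None:
        problems.append("espeak-ng not found on PATH")
    return problems


if __name__ == "__main__":
    for message in preflight() or ["basic checks passed"]:
        print(message)
```

The two parameters exist only so the check can be exercised without touching the real interpreter or PATH.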
IMPORTANT: Gradio will tell you to use the address 0.0.0.0:7860. That does not work; open http://127.0.0.1:7860/ instead.
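The reason is that 0.0.0.0 is a bind address meaning "listen on all interfaces", not an address a browser can reliably connect to. A small sketch of the mapping, with a hypothetical helper name `browser_url` that does not exist in Gradio or in the fork:

```python
def browser_url(host: str, port: int) -> str:
    """Turn the address a server binds to into a URL you can open locally."""
    # Wildcard bind addresses are replaced by the IPv4 loopback address.
    if host in ("0.0.0.0", "::", ""):
        host = "127.0.0.1"
    return f"http://{host}:{port}/"


if __name__ == "__main__":
    print(browser_url("0.0.0.0", 7860))
```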
Easy peasy, or what?
I tested this on Windows 11 and Ubuntu 24.04, both with the latest updates.
Constraints
- I only tested CLI inference with locally downloaded models.
- Gradio on Windows seems to display the address 0.0.0.0:7860; use http://127.0.0.1:7860/ instead.
- I did not test macOS.
Could you guys try it on Windows/Linux? Then we could create a pull request, if that is OK with gabrielclark3330 and the others.
Changes
The full code is unchanged apart from:
- repackaged the requirements into 2 requirements files
- added 4 lines for Windows to conditioning.py
No need for:
- uv
- pyproject.toml (I kept it, but I think you don't need it anymore for normal inference)
HUGE CREDIT to sdbds! I took his solution from #47 and the missing libraries he compiled, and repackaged them in a Python-friendly, OS-independent manner, removing the need for the PowerShell script.
I also got hints from Dao-AILab/causal-conv1d#46
Details:
- Linux: the installation happens in 2 phases: first PyTorch, then the other libraries. Otherwise conv1d fails with an error that it cannot find torch. (Windows actually only needs requirements2.txt, but it is nice to have a single command.)
- Windows: the espeak-ng package is not seen by Python, so I added an environment variable to the code so that it is found (these are the 4 lines).
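For readers curious what such a Windows fix can look like, here is a hedged sketch. It is not the actual 4 lines from the fork. It assumes that eSpeak NG is loaded through the phonemizer package (which reads the `PHONEMIZER_ESPEAK_LIBRARY` environment variable) and that the official installer used its default location; both the path and the helper name `espeak_env` are assumptions.

```python
import os
import sys
from pathlib import Path

# Assumption: default install location of the official eSpeak NG Windows installer.
DEFAULT_DLL = Path(r"C:\Program Files\eSpeak NG\libespeak-ng.dll")


def espeak_env(platform: str = sys.platform, dll: Path = DEFAULT_DLL, exists=Path.exists) -> dict:
    """Return the environment variables to set so eSpeak NG is found; empty off Windows."""
    if platform != "win32" or not exists(dll):
        return {}
    return {"PHONEMIZER_ESPEAK_LIBRARY": str(dll)}


# This has to run before the phonemizer backend is first created.
os.environ.update(espeak_env())
```

If eSpeak NG was installed somewhere else, pass that DLL path as `dll` instead of relying on the default.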
If, and only if, you run into problems with MSVC (the Visual Studio compiler), it might be because you don't have it installed or you have the wrong version. You can run this command in a command shell with admin rights to install MSVC 2022:
Windows 11:
winget install --id=Microsoft.VisualStudio.2022.BuildTools --force --override "--wait --passive --add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 --add Microsoft.VisualStudio.Component.Windows11SDK.22621" -e --silent --accept-package-agreements --accept-source-agreements
Windows 10:
winget install --id=Microsoft.VisualStudio.2022.BuildTools --force --override "--wait --passive --add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 --add Microsoft.VisualStudio.Component.Windows10SDK" -e --silent --accept-package-agreements --accept-source-agreements
Or you can get the official installer from Microsoft; make sure to install the compiler components as well as the Windows SDK.
I have it installed anyway, but I read in other forums that you might need it for Triton.