A collection of vocal tract parameter manipulation tools for converting audio to articulatory parameters and applying various transformations.
| Package | Installation |
|---|---|
| python-flucoma | https://github.com/jamesb93/python-flucoma (follow instructions there) |
| tensortract2 | pip install tensortract2 |
| vocaltractlab-cython | pip install vocaltractlab-cython |
| soundfile | pip install soundfile |
| scipy | pip install scipy (for cubic interpolation) |
Create the output folder structure from the script directory:
audio/output/
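If you prefer to create the folder from Python rather than by hand, a minimal sketch could look like the following. The helper name `ensure_output_dir` is hypothetical and is not part of the scripts; it only assumes that the scripts expect `audio/output/` relative to their own directory.

```python
from pathlib import Path


def ensure_output_dir(script_dir: Path) -> Path:
    """Create audio/output/ under script_dir if it does not exist yet."""
    out_dir = Path(script_dir) / "audio" / "output"
    # parents=True also creates audio/; exist_ok=True makes repeated calls safe.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


if __name__ == "__main__":
    # Run this from the directory that contains the scripts.
    print(ensure_output_dir(Path.cwd()))
```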
1. audio2tract_v1.py
Simple audio-to-speech conversion with optional manipulations of the VTL parameter timeseries.
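The `-m` option used in the command line examples for this script takes a parameter name, an operation, and a value. As a rough illustration of what each operation might do to one parameter's timeseries, here is a sketch in plain Python. It is not the script's actual code: the function names are hypothetical, `smooth` is assumed to be a centered moving average, and `invert` is assumed to mirror the curve around its mean and ignore the value argument.

```python
from statistics import fmean


def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges (assumption)."""
    window = max(1, int(window))
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i - half + window)
        out.append(fmean(values[lo:hi]))
    return out


def apply_operation(values, op, amount):
    """Apply one manipulation to a parameter timeseries (list of floats)."""
    if op == "multiply":
        return [v * amount for v in values]
    if op == "add":
        return [v + amount for v in values]
    if op == "set":
        return [float(amount)] * len(values)
    if op == "smooth":
        return moving_average(values, amount)
    if op == "invert":
        # Mirror around the mean; the amount argument is ignored (assumption).
        mean = fmean(values)
        return [2 * mean - v for v in values]
    raise ValueError(f"unknown operation: {op}")


tcx = [0.0, 1.0, 2.0, 3.0]
print(apply_operation(tcx, "multiply", 1.5))
print(apply_operation(tcx, "invert", 0))
```

Several `-m` options would then correspond to calling `apply_operation` once per manipulation, each on its own parameter.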
Command Line Examples:
# Basic conversion (no manipulation)
python audio2tract_v1.py my_audio.wav
# Multiply TCX by 1.5 (increase by 50%)
python audio2tract_v1.py my_audio.wav -m TCX multiply 1.5
# Set f0 to constant 150 Hz
python audio2tract_v1.py my_audio.wav -m f0 set 150
# Add 0.5 to TTY
python audio2tract_v1.py my_audio.wav -m TTY add 0.5
# Smooth the pressure parameter with window size 10
python audio2tract_v1.py my_audio.wav -m pressure smooth 10
# Invert TCX around its mean
python audio2tract_v1.py my_audio.wav -m TCX invert 0
# Multiple manipulations at once
python audio2tract_v1.py my_audio.wav -m TCX multiply 1.5 -m TTY add 0.5 -m f0 smooth 10
# List available parameters
python audio2tract_v1.py --list-params
# List available operations
python audio2tract_v1.py --list-ops
2. audio2tract_v1_morph2files_drawreplace_GUI.py
Patch for morphing articulator timeseries between two audio files.
Features:
- Vox Prima and Vox Secunda inputs
- Morphs between timeseries over the whole duration
- Each parameter can:
  - follow the global curve (Global Settings tab)
  - follow its own local curve (Tract Parameters & Glottis Parameters tabs)
  - be replaced by the Vox Secunda timeseries
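The three per-parameter options above can be sketched as a frame-by-frame crossfade. This is only an illustration under stated assumptions: both timeseries are taken to have the same number of frames, the curve is a list of weights in [0, 1] where 0 keeps Vox Prima and 1 gives Vox Secunda, and all function names are hypothetical rather than taken from the GUI script.

```python
def linear_ramp(n):
    """Default morph curve: 0.0 at the first frame, 1.0 at the last."""
    if n == 1:
        return [0.0]
    return [i / (n - 1) for i in range(n)]


def morph(prima, secunda, curve):
    """Blend two equal-length timeseries using per-frame weights."""
    if not (len(prima) == len(secunda) == len(curve)):
        raise ValueError("prima, secunda and curve must have the same length")
    return [(1.0 - w) * a + w * b for a, b, w in zip(prima, secunda, curve)]


def morph_parameter(prima, secunda, global_curve, local_curve=None, replace=False):
    """Pick one of the three behaviours for a single parameter."""
    if replace:
        # Use the Vox Secunda timeseries unchanged.
        return list(secunda)
    # A local curve, if given, takes precedence over the global one.
    curve = local_curve if local_curve is not None else global_curve
    return morph(prima, secunda, curve)


prima = [0.0, 0.0, 0.0]
secunda = [10.0, 10.0, 10.0]
print(morph_parameter(prima, secunda, linear_ramp(3)))
```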
3. audio2tract_v1_scaling_dynamicmovement2_freezepass_individual_shift_segmentandtimemod_stoch_directSynthesis.py + GUI
audio2tract_v1_scaling_dynamicmovement2_freezepass_individual_shift_segmentandtimemod_stoch_directSynthesis_GUI.py
Advanced segmentation and time-stretching patch using FluCoMa novelty detection.
Features:
| Feature | Description |
|---|---|
| Novelty Slicing | Auto-segment audio based on spectral changes (spectrum/mfcc/chroma/pitch/loudness) |
| Time Stretching | Speed up/slow down segments independently |
| Stochastic Stretch | Stretch factors evolve across iterations (starts at 1.0x, drifts within range) |
| Parameter Scaling | Dynamic variation with stochastic curves (passthrough/freeze/stochastic modes) |
| Parameter Shifting | Remap movements between articulators (numeric or visual drag-and-drop) |
| Iterations | Generate multiple concatenated variations |
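To make the Time Stretching row more concrete, the sketch below resamples each segment of one parameter's timeseries by its own factor and concatenates the results. It rests on several assumptions: a factor above 1.0 is taken to lengthen a segment, segment boundaries are given as frame indices (in the real patch they would come from the novelty slicer), and linear interpolation is used here to keep the example dependency-free, whereas the installation table lists scipy for cubic interpolation. The function names are hypothetical.

```python
def stretch_segment(frames, factor):
    """Resample frames to round(len(frames) * factor) frames (linear interpolation)."""
    n_in = len(frames)
    if n_in == 0:
        return []
    n_out = max(1, round(n_in * factor))
    if n_in == 1 or n_out == 1:
        return [frames[0]] * n_out
    out = []
    for j in range(n_out):
        # Position of output frame j on the input frame axis.
        pos = j * (n_in - 1) / (n_out - 1)
        i = min(int(pos), n_in - 2)
        frac = pos - i
        out.append(frames[i] * (1 - frac) + frames[i + 1] * frac)
    return out


def stretch_segments(frames, boundaries, factors):
    """Stretch each segment independently, then concatenate."""
    edges = [0, *boundaries, len(frames)]
    if len(factors) != len(edges) - 1:
        raise ValueError("need exactly one stretch factor per segment")
    pieces = []
    for start, end, factor in zip(edges, edges[1:], factors):
        pieces.extend(stretch_segment(frames[start:end], factor))
    return pieces


frames = [0.0, 1.0, 2.0, 3.0]
# One boundary at frame 2 gives two segments; only the second is slowed down.
print(len(stretch_segments(frames, [2], [1.0, 2.0])))
```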
Stochastic Stretch Distributions:
- random_walk: Small random steps each iteration
- uniform: Completely random within range
- gaussian: Normal distribution around current value
- brownian: Accumulating random motion
- mean_reverting: Tends back toward center
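One possible reading of these five distributions is sketched below. The update rules are assumptions inferred from the one-line descriptions, not the script's actual formulas; in particular, the `pull` strength for `mean_reverting`, the use of the range midpoint as "center", and the clamping to the range are all hypothetical. In this simplified reading, `gaussian` and `brownian` both add a normally distributed step to the current value, so they behave alike; the real scripts may distinguish them.

```python
import random


def next_stretch(current, dist, lo, hi, step, rng, pull=0.3):
    """Propose the next stretch factor and clamp it to [lo, hi]."""
    if dist == "random_walk":
        proposal = current + rng.uniform(-step, step)
    elif dist == "uniform":
        proposal = rng.uniform(lo, hi)
    elif dist == "gaussian":
        proposal = rng.gauss(current, step)
    elif dist == "brownian":
        proposal = current + rng.gauss(0.0, step)
    elif dist == "mean_reverting":
        center = (lo + hi) / 2
        proposal = current + pull * (center - current) + rng.gauss(0.0, step)
    else:
        raise ValueError(f"unknown distribution: {dist}")
    return min(hi, max(lo, proposal))


def stretch_sequence(dist, iterations, lo=0.7, hi=1.5, step=0.15, seed=0):
    """Return one stretch factor per iteration, starting at 1.0."""
    rng = random.Random(seed)
    factors = [1.0]
    for _ in range(iterations - 1):
        factors.append(next_stretch(factors[-1], dist, lo, hi, step, rng))
    return factors


for name in ("random_walk", "uniform", "gaussian", "brownian", "mean_reverting"):
    print(name, [round(f, 2) for f in stretch_sequence(name, 5)])
```

The default arguments mirror the values in the CLI example for stochastic stretch (`--stretch-min 0.7`, `--stretch-max 1.5`, `--stretch-step-size 0.15`, `--num-iterations 5`).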
CLI Examples:
# Basic with segmentation
python audio2tract_v2_segments.py audio.wav --enable-slicing
# With stochastic stretch evolution
python audio2tract_v2_segments.py audio.wav \
--enable-slicing \
--stochastic-stretch \
--stretch-min 0.7 --stretch-max 1.5 \
--stretch-step-size 0.15 \
--num-iterations 5
# With parameter shifting
python audio2tract_v2_segments.py audio.wav \
--enable-shift --tract-shift 3 --glottis-shift -2
Tract Parameters (0-18):
HX, HY, JX, JA, LP, LD, VS, VO, TCX, TCY, TTX, TTY, TBX, TBY, TRX, TRY, TS1, TS2, TS3
Glottis Parameters (19-29):
f0, pressure, x_bottom, x_top, chink_area, lag, rel_amp, double_pulsing, pulse_skewness, flutter, aspiration_strength
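The index ranges above can be turned into a simple name-to-index lookup, as sketched below. The second function illustrates one way a numeric shift such as `--tract-shift 3` or `--glottis-shift -2` could be interpreted: as a circular rotation within each group, so that every parameter's movement is routed to the parameter `shift` positions further along. That interpretation, and the names `PARAM_INDEX` and `shift_targets`, are assumptions made for illustration; the scripts may remap differently.

```python
TRACT_PARAMS = [
    "HX", "HY", "JX", "JA", "LP", "LD", "VS", "VO", "TCX", "TCY",
    "TTX", "TTY", "TBX", "TBY", "TRX", "TRY", "TS1", "TS2", "TS3",
]
GLOTTIS_PARAMS = [
    "f0", "pressure", "x_bottom", "x_top", "chink_area", "lag", "rel_amp",
    "double_pulsing", "pulse_skewness", "flutter", "aspiration_strength",
]

# Tract parameters occupy indices 0-18, glottis parameters 19-29.
PARAM_INDEX = {name: i for i, name in enumerate(TRACT_PARAMS + GLOTTIS_PARAMS)}


def shift_targets(names, shift):
    """Map each parameter to the one `shift` positions along (circular, assumed)."""
    n = len(names)
    return {name: names[(i + shift) % n] for i, name in enumerate(names)}


print(PARAM_INDEX["TCX"], PARAM_INDEX["f0"])
print(shift_targets(TRACT_PARAMS, 3)["HX"])
print(shift_targets(GLOTTIS_PARAMS, -2)["f0"])
```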
Generates VTL tract sequence files (.txt) in audio/output/, compatible with VocalTractLab 2.3.