hogobogobogo/audio2tract

Audio2Tract Patches

A collection of tools that convert audio to vocal tract (articulatory) parameters and apply transformations to the resulting parameter time series.


Requirements

| Package | Installation |
| --- | --- |
| python-flucoma | https://github.com/jamesb93/python-flucoma (follow instructions there) |
| tensortract2 | pip install tensortract2 |
| vocaltractlab-cython | pip install vocaltractlab-cython |
| soundfile | pip install soundfile |
| scipy | pip install scipy (for cubic interpolation) |

Setup

Create the output folder structure from the script directory:

audio/output/
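
If you prefer to create the folder from Python rather than by hand, a minimal sketch could look like the following. The helper name `ensure_output_dir` is hypothetical and not part of this repository; only the `audio/output/` path comes from the instructions above.

```python
from pathlib import Path


def ensure_output_dir(script_dir: Path) -> Path:
    """Create audio/output/ under script_dir if it does not already exist."""
    out = script_dir / "audio" / "output"
    # parents=True also creates audio/; exist_ok=True makes repeated calls harmless
    out.mkdir(parents=True, exist_ok=True)
    return out
```

Calling `ensure_output_dir(Path.cwd())` from the script directory should leave you with the expected `audio/output/` folder.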

Patches

1. audio2tract_v1.py

Simple audio-to-speech conversion with optional manipulations of the VocalTractLab (VTL) parameter time series.

Command Line Examples:

# Basic conversion (no manipulation)
python audio2tract_v1.py my_audio.wav

# Multiply TCX by 1.5 (increase by 50%)
python audio2tract_v1.py my_audio.wav -m TCX multiply 1.5

# Set f0 to constant 150 Hz
python audio2tract_v1.py my_audio.wav -m f0 set 150

# Add 0.5 to TTY
python audio2tract_v1.py my_audio.wav -m TTY add 0.5

# Smooth the pressure parameter with window size 10
python audio2tract_v1.py my_audio.wav -m pressure smooth 10

# Invert TCX around its mean
python audio2tract_v1.py my_audio.wav -m TCX invert 0

# Multiple manipulations at once
python audio2tract_v1.py my_audio.wav -m TCX multiply 1.5 -m TTY add 0.5 -m f0 smooth 10

# List available parameters
python audio2tract_v1.py --list-params

# List available operations
python audio2tract_v1.py --list-ops
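
To make the examples above more concrete, here is a rough sketch of how repeated `-m PARAM OP VALUE` options could be parsed and applied to parameter time series. It is not the code of audio2tract_v1.py: the function names, the moving-average interpretation of `smooth`, and the mirroring formula for `invert` are assumptions based only on the comments in the examples.

```python
import argparse
from statistics import fmean


def smooth(values, window):
    """Centered moving average (assumed meaning of 'smooth'); truncated at the edges."""
    half = int(window) // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(fmean(values[lo:hi]))
    return out


def invert(values, _unused):
    """Mirror every sample around the series mean: x -> 2 * mean - x."""
    m = fmean(values)
    return [2 * m - v for v in values]


# Operation name -> function(series, value); names taken from the CLI examples
OPS = {
    "multiply": lambda xs, k: [x * k for x in xs],
    "add": lambda xs, k: [x + k for x in xs],
    "set": lambda xs, k: [k for _ in xs],
    "smooth": smooth,
    "invert": invert,
}


def apply_manipulations(series, manipulations):
    """series: dict name -> list of floats; manipulations: (param, op, value) triples."""
    result = {name: list(vals) for name, vals in series.items()}
    for param, op, value in manipulations:
        result[param] = OPS[op](result[param], float(value))
    return result


def parse_args(argv):
    """Collect every '-m PARAM OP VALUE' occurrence into a list of triples."""
    parser = argparse.ArgumentParser()
    parser.add_argument("audio")
    parser.add_argument("-m", action="append", nargs=3, default=[],
                        metavar=("PARAM", "OP", "VALUE"), dest="manipulations")
    return parser.parse_args(argv)
```

Because manipulations are applied in the order given, `-m TCX multiply 1.5 -m TCX add 0.5` would scale first and then offset in this sketch.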

2. audio2tract_v1_morph2files_drawreplace.py + GUI

⚠️ Run GUI script only: audio2tract_v1_morph2files_drawreplace_GUI.py

Morphing patch that interpolates between the articulator timeseries of two audio files.

Features:

  • Vox Prima and Vox Secunda inputs
  • Morphs between timeseries over the whole duration
  • Each parameter can:
    • Follow the global curve (Global Settings tab)
    • Follow its own local curve (Tract Parameters & Glottis Parameters tabs)
    • Be replaced by the Vox Secunda timeseries
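
The three per-parameter options listed above can be illustrated with a small sketch. It assumes morphing is a frame-by-frame linear blend and that both timeseries already have the same number of frames; the actual patch may blend or align differently, and all function names here are hypothetical.

```python
def morph(prima, secunda, curve):
    """Blend two equal-length series frame by frame.

    curve[i] is the weight of `secunda` at frame i (0.0 = prima, 1.0 = secunda).
    """
    if not (len(prima) == len(secunda) == len(curve)):
        raise ValueError("all inputs must have the same number of frames")
    return [(1.0 - w) * a + w * b for a, b, w in zip(prima, secunda, curve)]


def linear_ramp(n):
    """Example global curve that moves from 0.0 to 1.0 over n frames."""
    return [i / (n - 1) if n > 1 else 0.0 for i in range(n)]


def morph_all(prima, secunda, global_curve, local_curves=None, replace=()):
    """prima/secunda: dict name -> series.

    Per parameter: replace with Vox Secunda, follow a local curve, or follow the global curve.
    """
    local_curves = local_curves or {}
    out = {}
    for name in prima:
        if name in replace:
            out[name] = list(secunda[name])
        else:
            curve = local_curves.get(name, global_curve)
            out[name] = morph(prima[name], secunda[name], curve)
    return out
```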

3. audio2tract_v1_scaling_dynamicmovement2_freezepass_individual_shift_segmentandtimemod_stoch_directSynthesis.py + GUI

⚠️ Run GUI script only: audio2tract_v1_scaling_dynamicmovement2_freezepass_individual_shift_segmentandtimemod_stoch_directSynthesis_GUI.py

Advanced segmentation and time-stretching patch using FluCoMa novelty detection.

Features:

| Feature | Description |
| --- | --- |
| Novelty Slicing | Auto-segment audio based on spectral changes (spectrum/mfcc/chroma/pitch/loudness) |
| Time Stretching | Speed up/slow down segments independently |
| Stochastic Stretch | Stretch factors evolve across iterations (starts at 1.0x, drifts within range) |
| Parameter Scaling | Dynamic variation with stochastic curves (passthrough/freeze/stochastic modes) |
| Parameter Shifting | Remap movements between articulators (numeric or visual drag-and-drop) |
| Iterations | Generate multiple concatenated variations |
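
As an illustration of independent per-segment time stretching, the sketch below resamples each segment of a parameter series to a new length. It uses plain linear interpolation to stay dependency-free, whereas the requirements suggest the patch itself uses scipy cubic interpolation; the function names and the assumption that slice points are given as frame indices are hypothetical.

```python
def stretch_segment(values, factor):
    """Resample one segment to round(len * factor) frames by linear interpolation."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_in = len(values)
    if n_in == 0:
        return []
    n_out = max(1, round(n_in * factor))
    if n_in == 1 or n_out == 1:
        return [values[0]] * n_out
    out = []
    for j in range(n_out):
        # Map output frame j to a fractional position in the input segment
        pos = j * (n_in - 1) / (n_out - 1)
        i = min(int(pos), n_in - 2)
        frac = pos - i
        out.append((1 - frac) * values[i] + frac * values[i + 1])
    return out


def stretch_segments(values, boundaries, factors):
    """Split at frame indices `boundaries`, stretch each segment, then concatenate."""
    edges = [0, *boundaries, len(values)]
    out = []
    for (start, end), factor in zip(zip(edges, edges[1:]), factors):
        out.extend(stretch_segment(values[start:end], factor))
    return out
```

A factor above 1.0 lengthens (slows down) a segment and a factor below 1.0 shortens it in this sketch.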

Stochastic Stretch Distributions:

  • random_walk – Small random steps each iteration
  • uniform – Completely random within range
  • gaussian – Normal distribution around current value
  • brownian – Accumulating random motion
  • mean_reverting – Tends back toward center
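
One possible reading of these distributions is sketched below. The update rules, the 0.5 reversion rate, and the clamping to the allowed range are assumptions; the list above does not say how `gaussian` and `brownian` differ, so this sketch models both as a normally distributed step from the current value.

```python
import random


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def next_stretch(current, dist, lo, hi, step, rng):
    """Propose the next stretch factor and keep it inside [lo, hi]."""
    center = (lo + hi) / 2
    if dist == "random_walk":
        proposal = current + rng.uniform(-step, step)   # small bounded step
    elif dist == "uniform":
        proposal = rng.uniform(lo, hi)                  # ignores the current value
    elif dist == "gaussian":
        proposal = rng.gauss(current, step)             # normal around current value
    elif dist == "brownian":
        proposal = current + rng.gauss(0.0, step)       # accumulated normal increments
    elif dist == "mean_reverting":
        # Pull halfway back toward the center (assumed rate), then add noise
        proposal = current + 0.5 * (center - current) + rng.gauss(0.0, step)
    else:
        raise ValueError(f"unknown distribution: {dist}")
    return clamp(proposal, lo, hi)


def stretch_sequence(dist, lo, hi, step, iterations, seed=None):
    """Stretch factor per iteration, starting at 1.0 as described in the feature table."""
    rng = random.Random(seed)
    factors = [1.0]
    for _ in range(iterations - 1):
        factors.append(next_stretch(factors[-1], dist, lo, hi, step, rng))
    return factors
```

For example, `stretch_sequence("random_walk", 0.7, 1.5, 0.15, 5, seed=0)` yields five factors that start at 1.0 and stay between 0.7 and 1.5.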

CLI Examples:

# Basic with segmentation
python audio2tract_v2_segments.py audio.wav --enable-slicing

# With stochastic stretch evolution
python audio2tract_v2_segments.py audio.wav \
    --enable-slicing \
    --stochastic-stretch \
    --stretch-min 0.7 --stretch-max 1.5 \
    --stretch-step-size 0.15 \
    --num-iterations 5

# With parameter shifting
python audio2tract_v2_segments.py audio.wav \
    --enable-shift --tract-shift 3 --glottis-shift -2
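
The meaning of `--tract-shift` and `--glottis-shift` is not spelled out here. One plausible interpretation, shown purely as an assumption, is a circular shift of trajectories within each parameter group, so that each parameter's movement is handed to the parameter a fixed number of positions away. The function name is hypothetical, and a real remap would probably also need to rescale values to each target parameter's valid range, which this sketch omits.

```python
def shift_group(frames, start, stop, shift):
    """Circularly shift trajectories among columns [start, stop).

    frames: list of per-frame value lists. The trajectory of column start + i
    is moved to column start + (i + shift) % width; other columns are untouched.
    """
    width = stop - start
    out = []
    for frame in frames:
        new = list(frame)
        for i in range(width):
            new[start + (i + shift) % width] = frame[start + i]
        out.append(new)
    return out
```

With the index layout from the parameter reference, the tract group would be `shift_group(frames, 0, 19, 3)` and the glottis group `shift_group(frames, 19, 30, -2)`.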

Parameter Reference

Tract Parameters (0-18): HX, HY, JX, JA, LP, LD, VS, VO, TCX, TCY, TTX, TTY, TBX, TBY, TRX, TRY, TS1, TS2, TS3

Glottis Parameters (19-29): f0, pressure, x_bottom, x_top, chink_area, lag, rel_amp, double_pulsing, pulse_skewness, flutter, aspiration_strength
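
For scripting against these indices, the reference above can be written as a lookup table. The constant and function names are hypothetical; the parameter names and their order are taken directly from the two lists.

```python
TRACT_PARAMS = [
    "HX", "HY", "JX", "JA", "LP", "LD", "VS", "VO", "TCX", "TCY",
    "TTX", "TTY", "TBX", "TBY", "TRX", "TRY", "TS1", "TS2", "TS3",
]  # indices 0-18

GLOTTIS_PARAMS = [
    "f0", "pressure", "x_bottom", "x_top", "chink_area", "lag", "rel_amp",
    "double_pulsing", "pulse_skewness", "flutter", "aspiration_strength",
]  # indices 19-29

ALL_PARAMS = TRACT_PARAMS + GLOTTIS_PARAMS
PARAM_INDEX = {name: i for i, name in enumerate(ALL_PARAMS)}


def split_frame(frame):
    """Split one 30-value frame into (tract values, glottis values)."""
    n_tract = len(TRACT_PARAMS)
    return frame[:n_tract], frame[n_tract:]
```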


Output

Generates VTL tract sequence files (.txt), compatible with VocalTractLab 2.3, in audio/output/.
