A fully cross-platform audio generation system using Stability AI's Stable Audio Open (small) model. This project generates musical loops from text prompts using C++ and ONNX Runtime for true native performance on Windows, macOS, Linux, Android, and iOS.
- ✅ Load
stabilityai/stable-audio-open-smallmodel locally via ONNX Runtime - ✅ Generate playable WAV files from text prompts
- ✅ Cross-platform support (Windows, macOS, Linux, Android, iOS)
- ✅ Native C++ implementation for performance and portability
- ✅ Command-line interface for audio generation
- ✅ Mobile proof-of-concept (model loading and inference demonstration)
This project uses C++ with ONNX Runtime for cross-platform compatibility:
- Core: C++ library with ONNX Runtime
- Model: ONNX format (converted from PyTorch)
- Build: CMake for all platforms
- Deployment: Native binaries (no Python runtime needed)
See ARCHITECTURE.md for detailed architecture documentation.
- CMake 3.18+
- C++17 compiler (GCC, Clang, or MSVC)
- ONNX Runtime (platform-specific, see BUILD.md)
- Python 3.10+ (only for model conversion, not runtime)
See BUILD.md for platform-specific instructions.
Quick example (macOS/Linux):
cd third_party
wget https://github.com/microsoft/onnxruntime/releases/download/v1.15.0/onnxruntime-linux-x64-1.15.0.tgz
tar -xzf onnxruntime-linux-x64-1.15.0.tgz
mv onnxruntime-linux-x64-1.15.0 onnxruntime-linux# Install Python dependencies (only needed for conversion)
pip install torch onnx transformers diffusers huggingface_hub
# Convert model
python scripts/convert_to_onnx.py \
--model-id stabilityai/stable-audio-open-small \
--output models/stable_audio.onnxmacOS/Linux:
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . -j$(nproc)Windows:
mkdir build
cd build
cmake .. -G "Visual Studio 16 2019" -A x64
cmake --build . --config Release./build/bin/stable_audio_cli \
--model models/stable_audio.onnx \
--prompt "120 BPM, 4 bar loop, electric piano, C major" \
--output output.wav \
--seconds 11 \
--seed 42stable-audio-generator/
├── cpp/ # C++ core library
│ ├── include/ # Public headers
│ ├── src/ # Implementation
│ ├── cli/ # Command-line tool
│ └── CMakeLists.txt # Build configuration
├── scripts/ # Python utilities
│ └── convert_to_onnx.py # Model conversion script
├── models/ # ONNX models (generated, not in git)
├── third_party/ # ONNX Runtime (downloaded separately)
├── BUILD.md # Detailed build instructions
├── README_CPP.md # C++ version quick start
├── ARCHITECTURE.md # Architecture documentation
└── CPP_MIGRATION_SUMMARY.md # Migration details
stable_audio_cli \
--model models/stable_audio.onnx \
--prompt "Your text prompt" \
--output output.wav \
--seconds 11 \
--bars 4 \
--bpm 120 \
--key C \
--seed 42 \
--device cpuParameters:
--model(required): Path to ONNX model file--prompt(required): Text description of desired audio--output: Output WAV file path (default: output.wav)--seconds: Length in seconds (default: 11.0)--bars: Number of musical bars (overrides seconds)--bpm: Beats per minute (default: 120)--key: Musical key (optional)--seed: Random seed (0 = random)--device: Device (cpu, cuda, coreml)
| Platform | Status | Build Instructions |
|---|---|---|
| Windows | ✅ | See BUILD.md |
| macOS | ✅ | See BUILD.md |
| Linux/Ubuntu | ✅ | See BUILD.md |
| Android | 🔄 | See mobile/android/README.md |
| iOS | 🔄 | See mobile/ios/README.md |
To verify Milestone 1 is working:
- ✅ Build project:
cmake --build . - ✅ Convert model:
python scripts/convert_to_onnx.py - ✅ Generate test audio:
./stable_audio_cli --model models/stable_audio.onnx --prompt "test" - ✅ Verify
output.wavis created and plays correctly
See verification.md for detailed acceptance criteria.
- BUILD.md - Detailed build instructions for all platforms
- README_CPP.md - C++ version quick start guide
- ARCHITECTURE.md - Technical architecture details
- CPP_MIGRATION_SUMMARY.md - Migration from Python to C++
- cpp/README.md - C++ library API documentation
- Ensure ONNX Runtime is extracted to
third_party/onnxruntime-{platform}/ - Check that
include/andlib/directories exist - See BUILD.md for platform-specific setup
- Ensure Hugging Face authentication:
huggingface-cli login - Accept model license: https://huggingface.co/stabilityai/stable-audio-open-small
- Check Python dependencies are installed
- Verify CMake version:
cmake --version(needs 3.18+) - Check C++17 compiler is available
- See BUILD.md troubleshooting section
- This project code is provided under the license specified in the contract
- Stable Audio Open model is provided by Stability AI under their license
- Model must be downloaded and converted separately after accepting Hugging Face terms
- See
LICENSE.mdfor full details
This project uses C++ with ONNX Runtime instead of Python to achieve:
- ✅ True cross-platform: Same codebase for Windows, macOS, Linux, Android, iOS
- ✅ Native performance: No interpreter overhead
- ✅ Mobile support: Native Android/iOS builds
- ✅ Small binaries: No Python runtime needed
- ✅ Optimized inference: ONNX Runtime optimized for each platform
See CPP_MIGRATION_SUMMARY.md for details.
For issues or questions:
- Check BUILD.md for build issues
- Review verification.md for common problems
- Contact developer per contract terms
- C++ project structure with CMake
- ONNX Runtime integration
- Command-line audio generation tool
- Model conversion script (Python)
- Cross-platform build configurations
- Documentation (build, architecture, API)
- Mobile proof-of-concept structure
- Model conversion completed (requires Hugging Face access)
- Mobile builds (Android/iOS)
Milestone 1 Status: ✅ C++/ONNX architecture complete, ready for model integration and testing
For detailed build instructions, see BUILD.md