
VoxCPM2 API

Production-ready FastAPI and WebSocket service for VoxCPM2, with:

  • REST synthesis via /v1/speech
  • streaming synthesis via /v1/stream
  • optional transcription via /v1/transcribe
  • Linux CUDA auto-selection for nano-vllm-voxcpm
  • official voxcpm backend support for prompt and reference audio
  • macOS Apple Silicon compatibility patches for VoxCPM2 CPU inference
  • a matching Tauri desktop client released alongside the API

Release Artifacts

Every tagged release publishes three separate artifact families:

  • voxcpm2_api-<version>-py3-none-any.whl: the API package with the FastAPI service, WebSocket endpoints, Docker support, and the built-in compatibility layer.
  • voxcpm2_compat-<version>-py3-none-any.whl: the standalone compatibility wrapper for custom Python integrations that need the macOS and CPU safety patches without the API service.
  • VoxCPM2-ui-macos-<tag>-<arch>.zip: the Tauri desktop application bundle for macOS.

Installation

From a GitHub release wheel

Install the API directly from a release asset:

pip install "voxcpm2-api[voxcpm] @ https://github.com/fabianzimber/voxcpm2-api/releases/download/v0.2.0/voxcpm2_api-0.2.0-py3-none-any.whl"

If you only want the compatibility wrapper for your own VoxCPM2 code:

pip install "voxcpm2-compat @ https://github.com/fabianzimber/voxcpm2-api/releases/download/v0.2.0/voxcpm2_compat-0.2.0-py3-none-any.whl"

From source

python3 -m venv .venv
. .venv/bin/activate
pip install -e ".[dev,voxcpm]"
cp .env.example .env
voxcpm2-api

The helper scripts do the same:

./scripts/bootstrap.sh
./scripts/run-dev.sh

Docker

The default Docker image now installs the official voxcpm backend automatically.

cp .env.example .env
docker compose up --build

Backend Strategy

The service keeps one API surface and swaps runtimes underneath it:

  • Linux + NVIDIA CUDA: prefers nano-vllm-voxcpm for plain text synthesis
  • macOS, Windows, generic Linux, or conditioned requests: uses the official voxcpm Python package
  • prompt or reference audio requests: always route to the official voxcpm backend
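The routing rules above can be sketched as a small selection function. This is an illustrative reduction of the strategy, not the service's actual code; the function and parameter names are invented for the example.

```python
def choose_backend(platform: str, has_cuda: bool, conditioned: bool) -> str:
    """Pick the runtime for a request, per the strategy described above.

    conditioned: True when the request carries prompt or reference audio.
    """
    if conditioned:
        # Prompt/reference audio always routes to the official backend.
        return "voxcpm"
    if platform == "linux" and has_cuda:
        # Fast path for plain text synthesis on NVIDIA CUDA.
        return "nano-vllm-voxcpm"
    # macOS, Windows, and generic Linux fall back to the official package.
    return "voxcpm"
```

The key property is that conditioning wins over hardware: even on a CUDA Linux box, a request with prompt audio lands on the official voxcpm backend.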

macOS / Apple Silicon

VoxCPM2 is currently safest on macOS when it runs on CPU:

  • MPS is disabled because VoxCPM2 uses bfloat16 and the public runtime is not stable on Apple MPS
  • the API patches PyTorch scaled_dot_product_attention for the exact CPU decoding path used by VoxCPM2
  • OpenMP duplicate-library crashes are suppressed for mixed native dependency stacks

These patches are built into voxcpm2-api and published separately as voxcpm2-compat.
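A minimal sketch of the environment side of these patches, for readers wiring VoxCPM2 into their own scripts without voxcpm2-compat. The function name is illustrative; `KMP_DUPLICATE_LIB_OK` is the standard OpenMP knob for tolerating duplicate runtimes, and it must be set before the native stacks load.

```python
import os

def apply_macos_cpu_safety() -> str:
    """Suppress the duplicate-OpenMP crash and pin inference to CPU.

    Returns the device string to pass to the model loader.
    """
    # Tolerate multiple OpenMP runtimes in mixed native dependency stacks.
    os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    # MPS is skipped entirely: VoxCPM2 uses bfloat16, which is not stable
    # on Apple MPS in the public runtime.
    return "cpu"
```

The scaled_dot_product_attention patch itself is model-specific and lives in voxcpm2-compat; this snippet only covers the environment setup.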

Configuration

All runtime configuration is environment-driven.

Important variables:

  • VOXCPM2_MODEL_ID
  • VOXCPM2_MODEL_PATH
  • VOXCPM2_MODEL_CACHE_DIR
  • VOXCPM2_PREFER_BACKEND
  • VOXCPM2_LOAD_DENOISER
  • VOXCPM2_OPTIMIZE_MODEL
  • VOXCPM2_LOCAL_FILES_ONLY
  • VOXCPM2_STARTUP_LOAD_MODEL
  • VOXCPM2_HF_ENDPOINT
  • VOXCPM2_CORS_ORIGINS
  • VOXCPM2_NANOVLLM_DEVICES

See ./.env.example for the full template.
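A few of these variables in context, with illustrative values only (they are guesses from the names, not documented defaults; ./.env.example is authoritative):

```shell
# Example overrides; values shown are assumptions, not defaults.
export VOXCPM2_MODEL_CACHE_DIR="$HOME/.cache/voxcpm2"
export VOXCPM2_LOCAL_FILES_ONLY=1        # never fetch weights from the network
export VOXCPM2_STARTUP_LOAD_MODEL=1      # load the model at startup, not first request
export VOXCPM2_CORS_ORIGINS="http://localhost:5173"
```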

API

Health

curl http://localhost:8000/health
curl http://localhost:8000/api/status
curl http://localhost:8000/v1/runtime

Synthesis

Return WAV:

curl -X POST http://localhost:8000/v1/speech \
  -H 'Content-Type: application/json' \
  -d '{"text":"Hello from VoxCPM2","response_format":"wav"}' \
  --output out.wav

Return base64:

curl -X POST http://localhost:8000/v1/speech \
  -H 'Content-Type: application/json' \
  -d '{"text":"Hello from VoxCPM2","response_format":"base64"}'

Prompt continuation:

{
  "text": "Continue this in the same voice.",
  "prompt_text": "Earlier context",
  "prompt_audio_base64": "<wav-as-base64>",
  "reference_audio_base64": "<optional-reference-wav>",
  "response_format": "base64"
}

Streaming

Connect to /v1/stream, send one JSON request, then read:

  • session.started
  • repeated audio.chunk
  • audio.completed

Example payload:

{
  "text": "Stream this sentence.",
  "chunk_format": "pcm16"
}
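Client-side, the event sequence above reduces to a small accumulator loop. This sketch assumes each WebSocket message decodes to a dict with an "event" name and, for chunks, a "data" field of audio bytes; the actual wire format may differ.

```python
def collect_stream(events) -> tuple[bool, bytes]:
    """Consume decoded /v1/stream events and join the audio chunks.

    Returns (session_started, concatenated_audio).
    """
    started = False
    chunks = []
    for msg in events:
        if msg["event"] == "session.started":
            started = True
        elif msg["event"] == "audio.chunk":
            chunks.append(msg["data"])
        elif msg["event"] == "audio.completed":
            # Terminal event: stop reading.
            break
    return started, b"".join(chunks)
```

In a real client the `events` iterable would wrap a WebSocket connection to /v1/stream, decoding each frame before dispatch.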

Transcription

curl -X POST http://localhost:8000/v1/transcribe \
  -H 'Content-Type: application/json' \
  -d '{"audio_base64":"<wav-as-base64>"}'

Desktop Client

The repository also contains a Tauri desktop app in ./voxcpm2-ui. Tagged releases attach a macOS bundle ZIP so the UI can be downloaded without building Rust locally.

Local development:

cd voxcpm2-ui/src-tauri
cargo tauri dev

Testing and Release

. .venv/bin/activate
pytest
ruff check .
./scripts/build-release.sh

CI runs on GitHub Actions for Linux and macOS. Tagged releases automatically publish:

  • the API wheel and sdist
  • the standalone compatibility wheel and sdist
  • the macOS Tauri desktop bundle

License

AGPL-3.0-only. See ./LICENSE.

About

RESTful API and streaming API for integrating VoxCPM2, with one-step setup and cross-platform optimization.
