5 changes: 4 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -170,5 +170,8 @@ tests/data
.secrets
.ruff_cache
.aider*
.DS_Store
speaches_debug

# OS-specific
Thumbs.db
.DS_Store
39 changes: 35 additions & 4 deletions README.md
@@ -1,12 +1,43 @@
# Speaches

> [!NOTE]
> This project was previously named `faster-whisper-server`. I've decided to change the name from `faster-whisper-server`, as the project has evolved to support more than just ASR.

# Speaches

`speaches` is an OpenAI API-compatible server supporting streaming transcription, translation, and speech generation. Speech-to-Text is powered by [faster-whisper](https://github.com/SYSTRAN/faster-whisper); Text-to-Speech uses [piper](https://github.com/rhasspy/piper) and [Kokoro](https://huggingface.co/hexgrad/Kokoro-82M). This project aims to be Ollama, but for TTS/STT models.

See the documentation for installation instructions and usage: [speaches.ai](https://speaches.ai/)

## Quick Start

Get a fully functional `speaches` server running in a few commands.

### 1. Installation

Install the `speaches` command-line tool and all its dependencies using `uv`. The default installation includes the web server and UI.

```bash

git clone https://github.com/speaches-ai/speaches.git
cd speaches
uv venv
source .venv/bin/activate
uv sync --all-extras --upgrade
uv tool install .

# Downloading a Text To Speech (TTS) model:
uvx speaches model download speaches-ai/Kokoro-82M-v1.0-ONNX

# Downloading a Speech To Text (STT) model:
uvx speaches model download Systran/faster-distil-whisper-small.en

# run the speaches server then open http://localhost:8000 in your web browser to try speaches
speaches serve --host 0.0.0.0 --port 8000
```

Visit http://localhost:8000 in your web browser.

The server will start, and the console will display the correct URL (e.g., `http://localhost:8000`) to access the Gradio web UI. Once the server is running, you can open a new terminal to use client commands like `speaches model ls`.
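Before running client commands from a second terminal, it can help to confirm that the server is answering. The following is a minimal sketch, not part of the project: `speaches_url` and `wait_for_speaches` are hypothetical helper names, `curl` is assumed to be installed, and the `/v1/models` path is an assumption based on the OpenAI-style API rather than something this README states.

```bash
# Hypothetical helper: join the base URL and an endpoint path.
# Falls back to http://localhost:8000, the address used in the quick start.
speaches_url() {
  base="${SPEACHES_BASE_URL:-http://localhost:8000}"
  # Drop a trailing slash from the base and a leading slash from the path.
  printf '%s/%s\n' "${base%/}" "${1#/}"
}

# Hypothetical helper: poll the server once per second until it answers
# or the attempts run out. Assumes curl and an OpenAI-style /v1/models route.
wait_for_speaches() {
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    if curl --silent --fail --output /dev/null "$(speaches_url /v1/models)"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}
```

With the functions defined, `wait_for_speaches 10 && speaches model ls` would run the client command only once the server responds. If your server listens elsewhere, set `SPEACHES_BASE_URL` first.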

## Features:

- OpenAI API compatible. All tools and SDKs that work with OpenAI's API should work with `speaches`.
@@ -19,8 +50,8 @@ See the documentation for installation instructions and usage: [speaches.ai](htt
- Text-to-Speech via `kokoro`(Ranked #1 in the [TTS Arena](https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena)) and `piper` models.
- GPU and CPU support.
- [Deployable via Docker Compose / Docker](https://speaches.ai/installation/)
- [Highly configurable](https://speaches.ai/usage/realtime-api)
- [Realtime API](https://speaches.ai/configuration/)
- [Highly configurable](https://speaches.ai/configuration/)
- [Realtime API](https://speaches.ai/usage/realtime-api/)

Please create an issue if you find a bug, have a question, or a feature suggestion.

28 changes: 25 additions & 3 deletions contributing.md
@@ -1,3 +1,25 @@
uv venv
source .venv/bin/activate
uv sync --all-extras
### Development Environment Setup

We use `uv` for fast and reliable dependency management. Follow these steps to set up your environment for contributing.

1. **Clone the Repository:**
```bash
git clone https://github.com/speaches-ai/speaches.git
cd speaches
```

2. **Create and Activate a Virtual Environment:**
Using a virtual environment is essential for isolating project dependencies.
```bash
uv venv
source .venv/bin/activate
uv sync --all-extras --upgrade
```

3. **Install All Dependencies in Editable Mode:**
The following command installs the `speaches` package itself, plus all optional dependencies required for development and running the full test suite. The `-e` flag (for "editable") links the installation to your source code, so you don't need to reinstall after making changes.
```bash
uv pip install -e '.[dev]'
```

You are now set up for development. You can run the server with `speaches serve` and run the test suite with `pytest`.
38 changes: 36 additions & 2 deletions docs/installation.md
@@ -108,11 +108,45 @@ docker compose up --detach

## Python (requires Python 3.12+ and `uv` package manager)

# Installation

The `speaches` package is distributed as a single, "batteries-included" application. The standard installation provides all features, including the API server, web UI, and client tools.

## For Users

The recommended way to install `speaches` is as a command-line tool using `uv`. This installs the application and its dependencies into an isolated environment, making the `speaches` command available globally on your system.

```bash
git clone https://github.com/speaches-ai/speaches.git
cd speaches
uv venv
source .venv/bin/activate
uv sync --all-extras
uvicorn --factory --host 0.0.0.0 speaches.main:create_app
uv sync --all-extras --upgrade
uv tool install .
speaches serve --host 0.0.0.0 --port 8000
```

After installation, you can run the server with `speaches serve` or explore other commands with `speaches --help`.
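
The requirements named in this section (Python 3.12+ and `uv`) can be checked before installing. The following is a minimal sketch, not part of the project: `version_ge` and `check_speaches_prereqs` are hypothetical helper names, and it assumes `python3` is the interpreter on your `PATH` and that versions are written as `MAJOR.MINOR[.PATCH]`.

```bash
# Hypothetical helper: succeed when version $1 is at least version $2.
# Only the MAJOR and MINOR components are compared.
version_ge() {
  have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
  want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Hypothetical helper: verify that uv is installed and Python is 3.12 or newer.
check_speaches_prereqs() {
  if ! command -v uv >/dev/null 2>&1; then
    echo "uv not found on PATH" >&2
    return 1
  fi
  # Ask the interpreter for its own MAJOR.MINOR version.
  py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')" || return 1
  if ! version_ge "$py_version" 3.12; then
    echo "Python 3.12+ required, found $py_version" >&2
    return 1
  fi
  echo "prerequisites look fine (Python $py_version, uv present)"
}
```

Calling `check_speaches_prereqs` before `uv tool install .` reports a missing tool early. The Python check may be stricter than necessary if `uv` is set up to provision its own interpreter.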

## For Developers (Contributing to Speaches)

If you plan to contribute to the `speaches` project, you must install it in "editable" mode from a local clone of the repository. This setup links the `speaches` command directly to your source code, so your edits are reflected immediately without reinstalling.

1. **Clone the Repository:**
```bash
git clone https://github.com/speaches-ai/speaches.git
cd speaches
```

2. **Create and Activate a Virtual Environment:**
```bash
uv venv
source .venv/bin/activate
```

3. **Install in Editable Mode with Development Extras:**
This command installs the project along with all optional dependencies needed for running tests and other development tasks.
```bash
uv pip install -e '.[dev]'
```
The `speaches` command is now available in your shell for development and testing.
8 changes: 4 additions & 4 deletions docs/usage/model-discovery.md
@@ -7,7 +7,7 @@ Before you can do anything useful with `speaches`, you'll need to want to downlo
=== "Speaches CLI"

```bash
uvx speaches-cli registry ls
uvx speaches registry ls
```

=== "cURL"
@@ -21,7 +21,7 @@ The above command will display a list of all available models. You can filter th
=== "Speaches CLI"

```bash
uvx speaches-cli registry ls --task automatic-speech-recognition
uvx speaches registry ls --task automatic-speech-recognition
```

=== "cURL"
@@ -37,7 +37,7 @@ You'll then want to download the model you want to use. You can do this by makin
=== "Speaches CLI"

```bash
uvx speaches-cli model download Systran/faster-distil-whisper-small.en
uvx speaches model download Systran/faster-distil-whisper-small.en
```

=== "cURL"
@@ -51,7 +51,7 @@ The downloaded model will now be included in the list of available models when y
=== "Speaches CLI"

```bash
uvx speaches-cli model ls
uvx speaches model ls
```

=== "cURL"
6 changes: 3 additions & 3 deletions docs/usage/speech-to-text.md
@@ -15,13 +15,13 @@ TODO: add a note about vad
export SPEACHES_BASE_URL="http://localhost:8000"

# Listing all available STT models
uvx speaches-cli registry ls --task automatic-speech-recognition | jq '.data | [].id'
uvx speaches registry ls --task automatic-speech-recognition | jq '.data[].id'

# Downloading a Systran/faster-distil-whisper-small.en model
uvx speaches-cli model download Systran/faster-distil-whisper-small.en
uvx speaches model download Systran/faster-distil-whisper-small.en

# Check that the model has been installed
uvx speaches-cli model ls --task text-to-speech | jq '.data | map(select(.id == "Systran/faster-distil-whisper-small.en"))'
uvx speaches model ls --task automatic-speech-recognition | jq '.data | map(select(.id == "Systran/faster-distil-whisper-small.en"))'
```

## Usage
6 changes: 3 additions & 3 deletions docs/usage/text-to-speech.md
@@ -8,13 +8,13 @@
export SPEACHES_BASE_URL="http://localhost:8000"

# Listing all available TTS models
uvx speaches-cli registry ls --task text-to-speech | jq '.data | [].id'
uvx speaches registry ls --task text-to-speech | jq '.data[].id'

# Downloading a TTS model
uvx speaches-cli model download speaches-ai/Kokoro-82M-v1.0-ONNX
uvx speaches model download speaches-ai/Kokoro-82M-v1.0-ONNX

# Check that the model has been installed
uvx speaches-cli model ls --task text-to-speech | jq '.data | map(select(.id == "speaches-ai/Kokoro-82M-v1.0-ONNX"))'
uvx speaches model ls --task text-to-speech | jq '.data | map(select(.id == "speaches-ai/Kokoro-82M-v1.0-ONNX"))'
```

## Usage
1 change: 0 additions & 1 deletion packages/speaches-cli/.python-version

This file was deleted.

Empty file removed packages/speaches-cli/README.md
Empty file.
20 changes: 0 additions & 20 deletions packages/speaches-cli/pyproject.toml

This file was deleted.

4 changes: 0 additions & 4 deletions packages/speaches-cli/src/speaches_cli/__init__.py

This file was deleted.

63 changes: 0 additions & 63 deletions packages/speaches-cli/src/speaches_cli/main.py

This file was deleted.
