GitHub - omachala/diction: iOS keyboard that transcribes speech to text using Whisper

The free, open-source alternative to Wispr Flow.
Self-hosted speech-to-text keyboard for iOS.

Website • Self-Hosting Guide • Privacy Policy

Why Diction?

Voice-to-text keyboards like Wispr Flow cost $15/month and send your audio to their cloud. Apple's built-in dictation is free but unreliable.

Diction is different:

Self-hosted is free - no subscription, no word limits, no trial that expires. Bring your own server.
Your server, your data - audio goes to a Whisper server you run. Not our cloud. Not anyone's cloud. Your network.
Open source infrastructure - the server setup is right here. Inspect it, modify it, contribute to it.
Model agnostic - point it at any OpenAI-compatible endpoint. Whisper tiny, large-v3, distil, fine-tuned models, future models. You choose.
Zero-dependency iOS app - pure Swift, no third-party SDKs, no analytics, no tracking. Fully auditable.

Don't want to self-host? Diction Cloud provides the same experience with zero setup.

Think of it like Bitwarden - free and self-hosted for those who want control, with a hosted cloud option for convenience.

How It Works

Run the gateway + a Whisper model on any machine (home server, NAS, cloud VM, Raspberry Pi)
Make it reachable from your phone (local IP, reverse proxy, or Cloudflare Tunnel)
Paste the URL into the Diction app
Switch to the Diction keyboard in any app → tap mic → speak → text appears

That's the entire setup. Three commands to start the server:

git clone https://github.com/omachala/diction.git
cd diction
docker compose up -d gateway whisper-small

Gateway is now running at http://<your-server-ip>:9000. Done.

How is this different from...

Feature	Diction	Wispr Flow	Apple Dictation
Price	Free (self-hosted)	$15/month	Free
Audio stays on your network	✅	❌ Cloud	✅
Open source server	✅	❌	❌
iOS keyboard	✅	✅	✅ Built-in
Model agnostic	✅ Any model, any URL	❌ Locked in	❌ Locked in
Zero third-party SDKs	✅	❌	N/A

Diction is pure transcription: what you say is what you get. No AI rewriting, no filler word removal. If you want that, paid alternatives exist. Diction's trade-off is freedom, privacy, and cost.

Gateway

The gateway sits in front of your Whisper models and provides:

Model routing - switch models from the app without changing your server URL
WebSocket streaming - audio streams to the server while you record, so there is no upload wait after you stop and transcription can begin immediately
Health checks - GET /health and GET /v1/models report which backends are up

docker compose up -d gateway whisper-small

Point the Diction app to http://<your-server-ip>:9000. The gateway routes requests to the right model backend automatically.

You can also skip the gateway and connect directly to a model (e.g. http://<ip>:9002 for small). The gateway is optional but recommended.
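
A quick way to confirm the gateway is reachable before pasting the URL into the app is to query the two health endpoints listed above. This is a minimal sketch, assuming the default port 9000 used in this README. The helper names are illustrative and not part of Diction. The response bodies are not documented here, so the script only checks for a successful HTTP status.

```shell
#!/bin/sh
# Illustrative helpers; the function names are assumptions, not part of Diction.

# Build the gateway base URL from a host and an optional port (default 9000).
gateway_url() {
  printf 'http://%s:%s' "$1" "${2:-9000}"
}

# Return 0 if the given URL answers with a successful status within 5 seconds.
endpoint_ok() {
  curl --fail --silent --show-error --max-time 5 --output /dev/null "$1"
}

# Check both endpoints named in this README and print one result line for each.
check_gateway() {
  base="$(gateway_url "$1" "$2")"
  for path in /health /v1/models; do
    if endpoint_ok "$base$path"; then
      echo "ok   $base$path"
    else
      echo "FAIL $base$path"
    fi
  done
}
```

After sourcing the file, run check_gateway <your-server-ip> to test the gateway, or pass a second argument such as 9002 to test a model backend directly.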

Models

Diction is model agnostic. It works with any OpenAI-compatible speech-to-text endpoint - public models, private models, fine-tuned models, future models. You're not locked into anything.

This repo includes a Docker Compose setup with popular faster-whisper models to get you started:

docker compose up -d whisper-tiny          # ~350 MB RAM, ~1-2s
docker compose up -d whisper-small         # ~800 MB RAM, ~3-4s  ← recommended
docker compose up -d whisper-medium        # ~1.8 GB RAM, ~8-12s
docker compose up -d whisper-large         # ~3.5 GB RAM, ~20-30s
docker compose up -d whisper-distil-large  # ~2 GB RAM, ~4-6s

Run multiple models at once and switch between them in the app:

docker compose up -d gateway whisper-small whisper-medium whisper-large

But you can point Diction at anything: whisper.cpp, OpenAI's API, a custom fine-tuned model for your language or domain, or any future model that speaks the same protocol. If it has a /v1/audio/transcriptions endpoint, Diction works with it.
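
Because the endpoint follows the OpenAI transcription route, a server can be tested from a terminal before the phone is involved. The following is a minimal sketch rather than the app's actual request: the OpenAI audio API accepts a multipart form with file and model fields, but the model identifier your backend expects is not specified in this README, so the third argument is a placeholder you must supply. The function names are hypothetical.

```shell
#!/bin/sh
# Illustrative helpers; function names and the model argument are assumptions.

# Build the transcription endpoint for a base URL, tolerating a trailing slash.
transcribe_url() {
  printf '%s/v1/audio/transcriptions' "${1%/}"
}

# Upload an audio file as multipart/form-data and print the server's response.
# $1 = base URL, $2 = audio file, $3 = model identifier expected by your backend.
transcribe() {
  [ -f "$2" ] || { echo "no such file: $2" >&2; return 1; }
  curl --fail --silent --show-error \
    --form "file=@$2" \
    --form "model=$3" \
    "$(transcribe_url "$1")"
}
```

For example, transcribe http://<your-server-ip>:9000 sample.wav <model-id> uploads sample.wav through the gateway. OpenAI-compatible servers typically answer with JSON containing a text field, though the exact shape depends on the backend.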

No Public IP?

No problem. You don't need to open ports on your router:

Cloudflare Tunnel - free, outbound-only connection to Cloudflare's edge. No port forwarding needed.
Tailscale - free WireGuard mesh VPN. Install on server + phone, connect from anywhere.
ngrok - instant public URL, great for testing.

See the Self-Hosting Guide for detailed instructions.
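
As a rough orientation before reading the guide, the three options map to one command each on the server. This sketch assumes the tool is already installed and signed in, and that the gateway listens on port 9000 as elsewhere in this README. The wrapper function and its DRY_RUN switch are illustrative only. For Tailscale, tailscale up must already have been run on both the server and the phone; the command shown prints the server's tailnet address to use in the URL.

```shell
#!/bin/sh
# Illustrative wrapper; expose_gateway and DRY_RUN are assumptions, not part of Diction.
# Set DRY_RUN=1 to print the command instead of running it.
expose_gateway() {
  method="$1"
  port="${2:-9000}"
  case "$method" in
    # Quick tunnel: prints a temporary public URL that forwards to the gateway.
    cloudflare) cmd="cloudflared tunnel --url http://localhost:$port" ;;
    # Prints this machine's Tailscale IPv4 address; use http://<that-ip>:<port>.
    tailscale)  cmd="tailscale ip -4" ;;
    # Opens a public forwarding URL for the local port.
    ngrok)      cmd="ngrok http $port" ;;
    *) echo "usage: expose_gateway cloudflare|tailscale|ngrok [port]" >&2; return 2 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}
```

Running DRY_RUN=1 expose_gateway cloudflare shows the command without starting a tunnel. Quick tunnels and ngrok URLs are public, so they suit testing more than long-term use.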

Privacy

This is a keyboard extension. We take privacy seriously:

Self-hosted: Audio goes only to your server. Full stop.
Cloud mode: Audio is processed and immediately discarded. Not stored, not used for training.
No analytics, no tracking, no telemetry. The app contains zero third-party SDKs.
Full Access is required by iOS for network access - the keyboard needs to reach the Whisper endpoint. No keylogging, no clipboard access.

Read the full Privacy Policy.

Requirements

iOS 16.0+ (iPhone)
For self-hosting: any machine that can run Docker (the gateway itself uses ~15 MB RAM)

Diction Cloud

Don't want to self-host? Diction Cloud is a hosted alternative - same accuracy, zero setup, no server to maintain. Priced to be cheaper than running your own VPS. See diction.one for details.

Contributing

We welcome contributions to the self-hosting infrastructure, documentation, and Docker setup. See CONTRIBUTING.md.

License

MIT - see LICENSE.

The iOS app is distributed via the App Store. This repository contains the self-hosting infrastructure and documentation.
