The free, open-source alternative to Wispr Flow.
Self-hosted speech-to-text keyboard for iOS.
Website • Self-Hosting Guide • Privacy Policy
Voice-to-text keyboards like Wispr Flow cost $15/month and send your audio to their cloud. Apple's built-in dictation is free but unreliable.
Diction is different:
- Self-hosted is free - no subscription, no word limits, no trial that expires. Bring your own server.
- Your server, your data - audio goes to a Whisper server you run. Not our cloud. Not anyone's cloud. Your network.
- Open source infrastructure - the server setup is right here. Inspect it, modify it, contribute to it.
- Model agnostic - point it at any OpenAI-compatible endpoint. Whisper tiny, large-v3, distil, fine-tuned models, future models. You choose.
- Zero-dependency iOS app - pure Swift, no third-party SDKs, no analytics, no tracking. Fully auditable.
Don't want to self-host? Diction Cloud provides the same experience with zero setup.
Think of it like Bitwarden - free and self-hosted for those who want control, with a hosted cloud option for convenience.
- Run the gateway + a Whisper model on any machine (home server, NAS, cloud VM, Raspberry Pi)
- Make it reachable from your phone (local IP, reverse proxy, or Cloudflare Tunnel)
- Paste the URL into the Diction app
- Switch to the Diction keyboard in any app → tap mic → speak → text appears
That's the entire setup. Three commands to start the server:
git clone https://github.com/omachala/diction.git
cd diction
docker compose up -d gateway whisper-small

Gateway is now running at http://<your-server-ip>:9000. Done.
| | Diction | Wispr Flow | Apple Dictation |
|---|---|---|---|
| Price | Free (self-hosted) | $15/month | Free |
| Audio stays on your network | ✅ | ❌ Cloud | ✅ |
| Open source server | ✅ | ❌ | ❌ |
| iOS keyboard | ✅ | ✅ | ✅ Built-in |
| Model agnostic | ✅ Any model, any URL | ❌ Locked in | ❌ Locked in |
| Zero third-party SDKs | ✅ | ❌ | N/A |
Diction is pure transcription: what you say is what you get. No AI rewriting, no filler word removal. If you want that, paid alternatives exist. Diction's trade-off is freedom, privacy, and cost.
The gateway sits in front of your Whisper models and provides:
- Model routing - switch models from the app without changing your server URL
- WebSocket streaming - audio streams to the server during recording, so transcription starts instantly when you stop (no upload wait)
- Health checks - `GET /health` and `GET /v1/models` report which backends are up
docker compose up -d gateway whisper-small

Point the Diction app to http://<your-server-ip>:9000. The gateway routes requests to the right model backend automatically.
You can also skip the gateway and connect directly to a model (e.g. http://<ip>:9002 for small). The gateway is optional but recommended.
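To confirm the gateway is reachable before configuring the app, the two health endpoints can be queried from any machine on the same network. The sketch below is a minimal example, not part of this repo: the function names and the IP address are placeholders, and it makes no assumption about the response body of either endpoint.

```shell
# Sketch only: the IP address in the usage lines is a placeholder.

# Build a base URL from a host and an optional port
# (9000 is the gateway port used in this README).
gateway_url() {
  printf 'http://%s:%s\n' "$1" "${2:-9000}"
}

# Return 0 if GET /health answers with a 2xx status within 5 seconds.
gateway_is_up() {
  curl --silent --fail --max-time 5 --output /dev/null "$1/health"
}

# Print whatever the gateway reports at GET /v1/models.
gateway_models() {
  curl --silent --fail --max-time 5 "$1/v1/models"
}

# Usage (uncomment and replace the placeholder IP):
# base="$(gateway_url 192.168.1.50)"
# gateway_is_up "$base" && gateway_models "$base"
```

Passing a second argument such as `9002` to `gateway_url` builds the URL for a direct model connection instead of the gateway.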
Diction is model agnostic. It works with any OpenAI-compatible speech-to-text endpoint - public models, private models, fine-tuned models, future models. You're not locked into anything.
This repo includes a Docker Compose setup with popular faster-whisper models to get you started:
docker compose up -d whisper-tiny # ~350 MB RAM, ~1-2s
docker compose up -d whisper-small # ~800 MB RAM, ~3-4s ← recommended
docker compose up -d whisper-medium # ~1.8 GB RAM, ~8-12s
docker compose up -d whisper-large # ~3.5 GB RAM, ~20-30s
docker compose up -d whisper-distil-large # ~2 GB RAM, ~4-6s
Run multiple models at once and switch between them in the app:
docker compose up -d gateway whisper-small whisper-medium whisper-large

But you can point Diction at anything: whisper.cpp, OpenAI's API, a custom fine-tuned model for your language or domain, or any future model that speaks the same protocol. If it has a /v1/audio/transcriptions endpoint, Diction works with it.
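To check whether an endpoint speaks the expected protocol, you can send it an audio file by hand. The sketch below assumes the request shape of OpenAI's audio transcription API: a multipart form with a `file` part and a `model` field. The function names, the file name, and the model name `small` are placeholders, and the model names your server accepts may differ.

```shell
# Sketch only: the URL, file name, and model name in the usage line are placeholders.

# Build the transcription URL, tolerating a trailing slash on the base URL.
transcriptions_url() {
  printf '%s/v1/audio/transcriptions\n' "${1%/}"
}

# POST an audio file as multipart form data.
#   $1 = base URL, $2 = path to audio file, $3 = model name
transcribe() {
  curl --silent --fail \
    --form "file=@$2" \
    --form "model=$3" \
    "$(transcriptions_url "$1")"
}

# Usage (uncomment and adjust):
# transcribe http://192.168.1.50:9000 recording.wav small
```

If the call returns a transcription, the same base URL should work in the Diction app.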
No problem. You don't need to open ports on your router:
- Cloudflare Tunnel - free, outbound-only connection to Cloudflare's edge. No port forwarding needed.
- Tailscale - free WireGuard mesh VPN. Install on server + phone, connect from anywhere.
- ngrok - instant public URL, great for testing.
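As a rough illustration, each option comes down to a single command on the server. The sketch below assumes the gateway listens on localhost:9000 and that the chosen tool is already installed (and logged in, where the tool requires it). The wrapper function and its `DRY_RUN` switch are hypothetical conveniences for previewing the command, not part of this repo.

```shell
# Sketch only: assumes the gateway is on localhost:9000 and the tool is installed.

# Print (DRY_RUN=1) or run the command that makes the local gateway reachable.
expose_gateway() {
  case "$1" in
    cloudflare) set -- cloudflared tunnel --url http://localhost:9000 ;;
    ngrok)      set -- ngrok http 9000 ;;
    tailscale)  set -- tailscale up ;;
    *) echo "usage: expose_gateway cloudflare|ngrok|tailscale" >&2; return 2 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # preview only
  else
    "$@"        # run the selected tool
  fi
}

# Usage:
# DRY_RUN=1 expose_gateway cloudflare   # show the command without running it
# expose_gateway ngrok                  # start the tunnel
```

Cloudflare Tunnel and ngrok print a public URL to paste into the app. With Tailscale there is no public URL: the phone joins the same tailnet and reaches the server at its Tailscale address on port 9000.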
See the Self-Hosting Guide for detailed instructions.
This is a keyboard extension. We take privacy seriously:
- Self-hosted: Audio goes only to your server. Full stop.
- Cloud mode: Audio is processed and immediately discarded. Not stored, not used for training.
- No analytics, no tracking, no telemetry. The app contains zero third-party SDKs.
- Full Access is required by iOS for network access - the keyboard needs it to reach the Whisper endpoint. No keylogging, no clipboard access.
Read the full Privacy Policy.
- iOS 16.0+ (iPhone)
- For self-hosting: any machine that can run Docker (the gateway itself uses ~15 MB RAM)
Don't want to self-host? Diction Cloud is a hosted alternative - same accuracy, zero setup, no server to maintain. Priced to be cheaper than running your own VPS. See diction.one for details.
We welcome contributions to the self-hosting infrastructure, documentation, and Docker setup. See CONTRIBUTING.md.
MIT - see LICENSE.
The iOS app is distributed via the App Store. This repository contains the self-hosting infrastructure and documentation.

