A macOS menu bar dictation tool that runs entirely on-device. Hold the Fn key to record, release to transcribe and paste into any app.
No cloud services, no subscriptions, no data leaves your Mac.
- Push-to-talk dictation via the Fn key (or configurable hotkey)
- Two speech-to-text backends:
  - Moonshine -- lightweight, CPU-based, 5 model sizes
  - MLX Whisper -- Apple Silicon GPU-accelerated, 6 model sizes
- Live model switching from the menu bar (no restart needed)
- Post-processing pipeline:
  - Hallucination loop detection and removal
  - Trailing silence trimming
  - Filler word removal (um, uh, you know, etc.)
  - Vocabulary corrections via user-editable files
- Transcription history viewer with copy support
- Training data collection -- saves audio/transcript pairs for future fine-tuning
- Paste Last Transcription -- re-inject the last result into any app
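The filler-word removal step of the pipeline can be sketched with a few regular expressions. This is an illustrative sketch, not the app's actual implementation; the pattern list and function name are assumptions:

```python
import re

# Illustrative filler patterns; the app's real list lives in its own config.
FILLERS = [r"\bum\b", r"\buh\b", r"\byou know\b"]

def remove_fillers(text: str) -> str:
    """Strip common filler words, then tidy leftover spacing and commas."""
    for pattern in FILLERS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    text = re.sub(r"\s*,\s*,", ",", text)        # collapse doubled commas
    text = re.sub(r"\s{2,}", " ", text).strip()  # collapse extra spaces
    text = re.sub(r"^[,\s]+", "", text)          # drop leading punctuation
    return text

print(remove_fillers("Um, so I think, uh, we should ship it"))
# -> so I think, we should ship it
```

The cleanup passes after the removals matter: deleting a filler mid-sentence leaves doubled commas and spaces that would otherwise survive into the pasted text.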
- macOS 13+ (Ventura or later)
- Python 3.10+
- For MLX Whisper: Apple Silicon Mac (M1/M2/M3/M4), ffmpeg (`brew install ffmpeg`)
- For Moonshine: Any Mac (CPU-only; also works on Apple Silicon)
```bash
git clone https://github.com/JohnAllsopp/our-voice.git
cd our-voice
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# For MLX Whisper backend (recommended on Apple Silicon):
pip install mlx-whisper
brew install ffmpeg
```

```bash
# Run with defaults (MLX Whisper large-v3-turbo)
python run.py

# Run with Moonshine backend
python run.py --backend moonshine

# Run with a specific MLX Whisper model
python run.py --backend mlx-whisper --model large-v3-turbo

# Disable post-processing
python run.py --no-post-processing
```

Once running, look for OV in your menu bar:
- Hold Fn to start recording (red circle appears)
- Release Fn to stop and transcribe
- Text is automatically pasted into the focused app
- Model -- switch between Moonshine and MLX Whisper models on the fly
- Vocabulary > Edit Corrections -- add word corrections (e.g., whisper eye -> WhisperAI)
- Vocabulary > Edit Prompt Terms -- add domain-specific terms to improve recognition
- View Transcriptions -- browse all past transcriptions
- Paste Last Transcription -- re-paste the most recent result
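Corrections like the whisper eye -> WhisperAI example above are find-and-replace rules. A minimal sketch of loading and applying such rules follows; the one-rule-per-line `wrong -> right` file format assumed here is an illustration, not the app's documented syntax:

```python
import re

def load_corrections(lines):
    """Parse 'wrong -> right' rules, skipping blanks and # comments."""
    rules = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "->" not in line:
            continue
        wrong, right = (part.strip() for part in line.split("->", 1))
        # Word-boundary match so 'eye' inside another word is left alone.
        rules.append((re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE), right))
    return rules

def apply_corrections(text, rules):
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text

rules = load_corrections(["# my corrections", "whisper eye -> WhisperAI"])
print(apply_corrections("I tried whisper eye yesterday", rules))
# -> I tried WhisperAI yesterday
```

Case-insensitive, word-boundary matching is a reasonable default for dictation, where the recognizer's casing of a misheard phrase is unpredictable.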
**Moonshine**

| Model | Type | Notes |
|---|---|---|
| tiny | Non-streaming | Smallest, fastest |
| base | Non-streaming | Larger, better accuracy |
| tiny-streaming | Streaming arch | Can also run in batch mode |
| small-streaming | Streaming arch | Mid-size |
| medium-streaming | Streaming arch | Best accuracy (default) |
**MLX Whisper**

| Model | Size | Notes |
|---|---|---|
| tiny | ~75 MB | Fastest, lower accuracy |
| base | ~140 MB | Good for quick tasks |
| small | ~460 MB | Good accuracy |
| medium | ~1.5 GB | High accuracy |
| large-v3 | ~3 GB | Highest accuracy, slower |
| large-v3-turbo | ~1.6 GB | Near-large accuracy, much faster (default) |
Models are downloaded automatically on first use.
Our Voice needs two macOS permissions:
- Accessibility -- to paste text into the focused application
- Microphone -- to record audio
The app prompts for both on first run; you can also verify them from the menu bar via Check Permissions.
```
run.py                  -- Entry point and CLI
our_voice/
  app.py                -- Menu bar app (rumps), UI, queue consumer
  config.py             -- All configuration constants
  transcription.py      -- Recording, transcription pipeline, post-processing
  hotkey.py             -- Fn key listener (Quartz CGEventTap)
  text_injector.py      -- Paste text via clipboard + Cmd+V
  permissions.py        -- macOS permission checks
  vocabulary.py         -- Initial prompt and user vocabulary
  post_processing.py    -- Regex-based corrections
  training_data.py      -- Audio/transcript pair saving
  backends/
    base.py             -- Abstract TranscriptionBackend interface
    mlx_whisper.py      -- MLX Whisper backend
    moonshine.py        -- Moonshine Voice backend
```
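The abstract interface in backends/base.py presumably looks something like the sketch below. Only the class name `TranscriptionBackend` comes from the layout above; the method names, signatures, and the stand-in subclass are guesses for illustration:

```python
from abc import ABC, abstractmethod

class TranscriptionBackend(ABC):
    """Common interface both backends implement (methods are illustrative)."""

    @abstractmethod
    def load_model(self, model_name: str) -> None:
        """Download (if needed) and load the requested model."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the transcript for a recorded audio file."""

class EchoBackend(TranscriptionBackend):
    """Stand-in backend used here only to show the interface in action."""

    def load_model(self, model_name: str) -> None:
        self.model_name = model_name

    def transcribe(self, audio_path: str) -> str:
        return f"[{self.model_name}] transcript of {audio_path}"

backend = EchoBackend()
backend.load_model("tiny")
print(backend.transcribe("clip.wav"))  # -> [tiny] transcript of clip.wav
```

An interface like this is what makes live model switching from the menu bar possible: the app holds one `TranscriptionBackend` reference and swaps the concrete implementation behind it.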
All user data is stored in ~/.our-voice/:
- transcriptions.jsonl -- transcription log
- corrections.txt -- user word corrections
- prompt_terms.txt -- custom vocabulary terms
- training_data/ -- saved audio/transcript pairs
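As a JSONL file, transcriptions.jsonl holds one JSON object per line. The record below is a hypothetical example of what an entry might contain, not the app's documented schema:

```json
{"timestamp": "2025-01-15T09:30:00", "backend": "mlx-whisper", "model": "large-v3-turbo", "text": "so I think we should ship it"}
```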
MIT -- see LICENSE.