Add ExecuWhisper macOS dictation example (Parakeet + LFM2.5 formatter) #237
Open
seyeong-han wants to merge 8 commits into
Add a native macOS dictation app that runs fully on-device using ExecuTorch:
NVIDIA Parakeet-TDT for ASR (Metal backend) plus a fine-tuned LiquidAI
LFM2.5-350M for cleaning up disfluencies, casing, and punctuation (MLX
delegate).
Layout follows the voxtral_realtime/macos/ convention:
  execuwhisper/
    macos/
      ExecuWhisper/         Swift app source
      ExecuWhisperTests/    XCTest target
      docs/                 Demo script, support runbook, release QA checklist
      scripts/              Build / DMG / sign / verify / probe scripts
      project.yml           xcodegen spec (no DEVELOPMENT_TEAM hard-coded;
                            supply via env var)
      README.md             Public README with prebuilt + from-source paths
      THIRD_PARTY_NOTICES   Upstream component attribution
      CHANGELOG.md          v0.1.0 initial open-source release notes
      .gitignore            xcodeproj/, build/, DMG, etc.
Models live in two Hugging Face repos:
  younghan-meta/Parakeet-TDT-ExecuTorch-Metal (ASR runtime)
  younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter (formatter runtime + fp32)
Helper binaries depend on three upstream ExecuTorch PRs in review:
  pytorch/executorch#18861 - parakeet_helper (ASR runtime)
  pytorch/executorch#19195 - LFM2.5 MLX export pipeline
  pytorch/executorch#19562 - lfm25_formatter_helper (formatter runtime)
Until those land, build via the README from-source path or use the
prebuilt arm64 helpers attached to the GitHub Release on this PR.
Eval: AMI release-gate run for the formatter shows forbidden 0.030 (gate
0.10) and coverage 0.874 (gate 0.85). Full eval reports in the formatter
HF repo under eval/.
No telemetry. The only network call is the first-launch model download
from huggingface.co.
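As a rough illustration of what the first-launch download has to resolve, the sketch below builds Hugging Face Hub download URLs for the two model repos named above. The repo IDs come from this PR; the file name `model.pte` and the helper `resolve_url` are hypothetical placeholders, and nothing here claims to match how the app actually fetches its models. It only relies on the Hub's standard `/resolve/<revision>/<path>` URL layout.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"

# Repo IDs are taken from the PR text. The file names passed to resolve_url
# are placeholders (assumptions), not the real artifact names in these repos.
MODEL_REPOS = {
    "asr": "younghan-meta/Parakeet-TDT-ExecuTorch-Metal",
    "formatter": "younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter",
}


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a Hub download URL for one file in a model repo."""
    # Keep "/" in the repo id and file path; encode it inside the revision,
    # since a branch name containing "/" must be percent-encoded in the URL.
    return (
        f"{HF_BASE}/{quote(repo_id)}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


# Hypothetical file name, for illustration only.
example_url = resolve_url(MODEL_REPOS["asr"], "model.pte")
```

Pinning `revision` to a commit hash instead of `main` would make the download reproducible across releases; whether the app does so is not stated in this PR.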
Fully On-Device Free Dictation App
clean-reformatting.mp4
What you hear in the clip:
What gets pasted:
execuwhisper-overlay-dictation.mp4
Summary
ExecuWhisper is a native macOS dictation app that runs fully on-device using ExecuTorch. ASR via NVIDIA Parakeet-TDT (Metal backend); a fine-tuned LiquidAI LFM2.5-350M cleans disfluencies, casing, and punctuation (MLX delegate). No cloud, no API keys, no telemetry.
This PR adds the source under execuwhisper/macos/, mirroring the existing voxtral_realtime/macos/ layout.
What's in this diff
execuwhisper/macos/ directory:
- ExecuWhisper/: Swift app source.
- ExecuWhisperTests/: XCTest target.
- docs/: demo script, support runbook, release-QA checklist.
- scripts/: build.sh, create_dmg.sh, sign_release.sh, verify_*, probe_formatter.py, benchmark_helper.py.
- project.yml: xcodegen spec (no DEVELOPMENT_TEAM hard-coded; user supplies via env var).
- README.md, CHANGELOG.md, THIRD_PARTY_NOTICES.md, .gitignore.
Upstream dependencies (in review)
The helper binaries this app embeds live in three PRs against pytorch/executorch that are still in review. Reviewers/users have two paths:
- [...] pytorch/executorch:main and build everything from a single source checkout.
The three PRs:
- pytorch/executorch#18861: parakeet_helper (ASR runtime, Metal) + make parakeet-metal
- pytorch/executorch#19195: lfm2_5_350m model class + lfm_2_5-mlx Makefile target
- pytorch/executorch#19562: lfm25_formatter_helper (formatter runtime, MLX) + make lfm_2_5_formatter-mlx
Models
- younghan-meta/Parakeet-TDT-ExecuTorch-Metal
- younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter (new dedicated repo for the ExecuWhisper-specific fine-tune; includes fp32 weights, MLX-exported .pte, tokenizer, and full eval reports)
Prebuilt binaries
GitHub Release execuwhisper-v0.1.0 on the fork (until this PR merges) ships:
- lfm25_formatter_helper-arm64-darwin.tar.gz
- parakeet_helper-arm64-darwin.tar.gz
- mlx.metallib
- helpers-arm64-darwin.tar.gz (one-shot bundle of the above)
Helpers are pre-signed with the hardened runtime plus the disable-library-validation and allow-dyld-environment-variables entitlements. libomp.dylib (third-party LLVM OpenMP) is not redistributed; users obtain it via brew install libomp. The .app/.dmg is not redistributed either (legal review pending).
Validation
Release-gate eval for the formatter (AMI corpus subset): forbidden 0.030 (gate 0.10) and coverage 0.874 (gate 0.85).
In-app smoke tests: dictation overlay, hotkey, replacements, snippets, session export, long-input chunking, helper warm reuse — all passing on macOS 14.x and 15.x, MacBook Pro M3 / Mac Studio M2.
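The release-gate figures quoted in this PR (forbidden 0.030 against a gate of 0.10, coverage 0.874 against a gate of 0.85) imply one metric that must stay below its threshold and one that must stay above it. A minimal sketch of such a check follows. The `Gate` class, the function name, and the choice to treat a value exactly at the threshold as passing are assumptions for illustration; the actual eval harness lives in the formatter HF repo and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str
    threshold: float
    higher_is_better: bool

    def passes(self, value: float) -> bool:
        # Assumption: a value exactly at the threshold counts as passing.
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


# Thresholds as quoted in the PR; the direction of each gate is inferred
# from the reported passing values.
GATES = (
    Gate("forbidden", 0.10, higher_is_better=False),
    Gate("coverage", 0.85, higher_is_better=True),
)


def check_release_gates(metrics: dict[str, float]) -> dict[str, bool]:
    """Return a pass/fail flag per gate for the given metric values."""
    return {gate.name: gate.passes(metrics[gate.name]) for gate in GATES}


reported = check_release_gates({"forbidden": 0.030, "coverage": 0.874})
```

With the reported values both gates pass; a run with forbidden above 0.10 or coverage below 0.85 would flag the corresponding entry as `False`.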
Known limitations
What changed since the previous push of this branch
- [...] ExecuWhisper/ (top-level) to execuwhisper/macos/ (mirrors voxtral_realtime/macos/ convention).
- [...] .xcodeproj/; added .gitignore and an xcodegen generate build step.
- [...] DEVELOPMENT_TEAM; users supply via env var.
- [...] requirements_et-mlx.txt / requirements_et-metal.txt (removed @ file:///Users/younghan/... editable installs; documented pip install -e <executorch_path> step in README).
- [...] younghan-meta/LFM2.5-ExecuTorch-MLX to the new dedicated younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter.
- [...] FINETUNING_HANDOVER.md, FINETUNING_DATASET_HANDOFF.md).
- [...] THIRD_PARTY_NOTICES.md, ASCII architecture diagram, acknowledgements.
- [...] TextPipelineTests.swift with a generic example.
- [...] CHANGELOG.md, RELEASE_QA_CHECKLIST.md, and create_dmg.sh.
- [...] probe_formatter.py, sign_release.sh, verify_project_settings.sh, verify_release.sh.
- [...] eval/.
Reviewer guidance
- [...] ExecuWhisper/Services/, ExecuWhisper/Views/), the build wiring (project.yml, scripts/build.sh), and the README's prebuilt-vs-source story.
- xcodebuild -scheme ExecuWhisper test exercises the formatter prompt builder, replacement pipeline, session compatibility, and helper bridge reuse.
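For reviewers who want to script the test run, the sketch below chains the xcodegen generate step and the xcodebuild test command mentioned in this PR, and fails early if DEVELOPMENT_TEAM is not set. The function names are hypothetical, the working directory is assumed to be execuwhisper/macos/, and requiring DEVELOPMENT_TEAM for a test run is an assumption: the PR only says the value is supplied via an environment variable, not when it is needed. scripts/build.sh may already cover the same steps.

```python
import os
import subprocess


def build_and_test_commands(env=None):
    """Return the commands for a local test run, checking the env first."""
    env = dict(os.environ if env is None else env)
    # Assumption: the xcodegen spec reads DEVELOPMENT_TEAM from the environment,
    # so a missing value is reported before any tool is invoked.
    if not env.get("DEVELOPMENT_TEAM", "").strip():
        raise RuntimeError("DEVELOPMENT_TEAM is not set in the environment")
    return [
        ["xcodegen", "generate"],                          # regenerate the .xcodeproj
        ["xcodebuild", "-scheme", "ExecuWhisper", "test"],  # run the XCTest target
    ]


def run_all(project_dir="execuwhisper/macos"):
    """Run each command in project_dir, stopping at the first failure."""
    for command in build_and_test_commands():
        subprocess.run(command, cwd=project_dir, check=True)
```

Calling `run_all()` from the repository root on a Mac with Xcode and xcodegen installed would execute both steps in order; `build_and_test_commands` can be inspected on any platform without invoking either tool.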