feat: add client-side audio chunking for large file transcription #503

Open
NathanSkene wants to merge 1 commit into OpenWhispr:main from NathanSkene:pr/byok-chunking
Conversation

@NathanSkene
Contributor

Summary

  • Splits large audio recordings into smaller chunks for transcription
  • Enables transcription of longer recordings without API timeout issues
  • Chunks are processed sequentially and results concatenated
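
The sequential processing and concatenation described above could look roughly like the sketch below. This is not the code from this PR: `transcribeChunksSequentially`, the `TranscribeChunk` callback type, and the `onProgress` hook are hypothetical names, and joining chunk texts with a single space is an assumption about how the results are reassembled.

```typescript
// Hypothetical callback: takes the path of one audio chunk and resolves
// to the transcribed text for that chunk (e.g. by calling a BYOK API).
type TranscribeChunk = (chunkPath: string) => Promise<string>;

async function transcribeChunksSequentially(
  chunkPaths: string[],
  transcribeChunk: TranscribeChunk,
  onProgress?: (done: number, total: number) => void,
): Promise<string> {
  const parts: string[] = [];
  let done = 0;
  for (const chunkPath of chunkPaths) {
    // Await each chunk before starting the next, so only one request is
    // in flight at a time and the results stay in recording order.
    const text = await transcribeChunk(chunkPath);
    const trimmed = text.trim();
    // Skip empty results (e.g. a silent chunk) to avoid double spaces.
    if (trimmed.length > 0) {
      parts.push(trimmed);
    }
    done += 1;
    onProgress?.(done, chunkPaths.length);
  }
  // Assumption: chunk texts are joined with a single space.
  return parts.join(" ");
}
```

Processing one chunk at a time trades throughput for simplicity: ordering is trivial and rate limits are less likely to be hit, at the cost of total latency growing linearly with the number of chunks.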

Test plan

  • Record a long dictation (>30 seconds)
  • Verify transcription completes without timeout
  • Verify output text is coherent across chunk boundaries

Instead of rejecting files >25MB with a paywall message, split large
audio files into 10-minute MP3 chunks using ffmpeg, transcribe each
chunk sequentially via the third-party API, and reassemble the full
transcript. Reuses the existing chunk progress UI from OpenWhispr Cloud.
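
As an illustration of the splitting step, the sketch below builds an argument list for ffmpeg's segment muxer to cut the input into 10-minute MP3 files. It is a minimal sketch, not the PR's implementation: `buildSegmentArgs` and `estimateChunkCount` are hypothetical helpers, the 64 kbps bitrate is an assumed value, and it presumes an ffmpeg build that includes the `libmp3lame` encoder. At 64 kbps a 10-minute chunk is about 600 s × 8 kB/s = 4.8 MB, comfortably below the 25 MB limit mentioned above.

```typescript
// 10-minute chunks, as described in the commit message.
const CHUNK_SECONDS = 10 * 60;

// Build ffmpeg arguments that split `inputPath` into fixed-length MP3
// segments. `outputPattern` is a printf-style pattern such as
// "chunk_%03d.mp3", which ffmpeg expands to chunk_000.mp3, chunk_001.mp3, ...
function buildSegmentArgs(
  inputPath: string,
  outputPattern: string,
  chunkSeconds: number = CHUNK_SECONDS,
): string[] {
  return [
    "-i", inputPath,
    "-vn",                                  // ignore any video stream
    "-c:a", "libmp3lame",                   // encode audio as MP3
    "-b:a", "64k",                          // assumed bitrate, not from the PR
    "-f", "segment",                        // use the segment muxer
    "-segment_time", String(chunkSeconds),  // target length of each segment
    "-reset_timestamps", "1",               // each segment starts at t = 0
    outputPattern,
  ];
}

// Rough number of segments to expect for a recording of the given length.
// ffmpeg cuts on packet boundaries, so treat this as an estimate.
function estimateChunkCount(
  durationSeconds: number,
  chunkSeconds: number = CHUNK_SECONDS,
): number {
  return Math.max(1, Math.ceil(durationSeconds / chunkSeconds));
}
```

For example, `estimateChunkCount(25 * 60)` gives 3 for a 25-minute meeting, and the array from `buildSegmentArgs("meeting.wav", "chunk_%03d.mp3")` can be passed as the argument list when spawning `ffmpeg`.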

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@NathanSkene
Contributor Author

Manual validation update on macOS with the packaged installed app:

This feature passed validation.

What I tested:

  • a genuinely long BYOK transcription using meeting audio
  • the transcription completed successfully instead of failing or timing out
  • the output stayed coherent across the full recording without obvious chunk-boundary corruption (no abrupt truncation, duplicated joins, or missing blocks)

Follow-up note: the output still needs better formatting / paragraphing for long meeting transcripts, but that is a separate UX/output-quality issue rather than a failure of the chunking feature itself.
