Fix Groq hallucinations around long silence #517

Open
NathanSkene wants to merge 1 commit into OpenWhispr:main from NathanSkene:codex/groq-no-speech-filter
@NathanSkene
Contributor

Summary

  • replace the failed #504 approach (feat: add VAD-based silence compression for efficient transcription) with a Groq-specific long-silence fix
  • gate full-clip silence in the recorder before cloud transcription
  • trim long leading/trailing silence before Groq upload
  • suppress custom dictionary prompt bias for long-leading-silence Groq clips
  • use Groq verbose_json segment metadata to reject likely silence hallucinations
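
The recorder gate and the pre-upload trim listed above can be pictured with a simple energy-based sketch. This is illustrative only and is not the code in this PR: `trimSilence`, `frameRms`, `TrimOptions`, and every threshold value are hypothetical names and numbers chosen for the example, and it assumes mono PCM samples in a `Float32Array`.

```typescript
// Hypothetical helper names and thresholds; the PR's actual values are not shown here.
interface TrimOptions {
  sampleRate: number;    // samples per second of the mono PCM input
  frameMs: number;       // analysis window length in milliseconds
  rmsThreshold: number;  // frames at or below this RMS count as silence
  keepPaddingMs: number; // silence kept on each side of the detected audio
}

// Root-mean-square energy of samples[start, end).
function frameRms(samples: Float32Array, start: number, end: number): number {
  let sumSquares = 0;
  for (let i = start; i < end; i++) {
    const s = samples[i] ?? 0;
    sumSquares += s * s;
  }
  return Math.sqrt(sumSquares / Math.max(1, end - start));
}

// Returns null when no frame exceeds the threshold (full-clip silence gate),
// otherwise a copy of the clip with leading and trailing silence removed,
// keeping keepPaddingMs of context on each side.
function trimSilence(samples: Float32Array, opts: TrimOptions): Float32Array | null {
  const frameLen = Math.max(1, Math.round((opts.sampleRate * opts.frameMs) / 1000));
  let firstLoud = -1;
  let lastLoud = -1;

  for (let start = 0; start < samples.length; start += frameLen) {
    const end = Math.min(samples.length, start + frameLen);
    if (frameRms(samples, start, end) > opts.rmsThreshold) {
      if (firstLoud < 0) firstLoud = start;
      lastLoud = end;
    }
  }

  // Nothing above the threshold: skip the cloud request entirely.
  if (firstLoud < 0) return null;

  const pad = Math.round((opts.sampleRate * opts.keepPaddingMs) / 1000);
  const from = Math.max(0, firstLoud - pad);
  const to = Math.min(samples.length, lastLoud + pad);
  return samples.slice(from, to);
}
```

A caller would treat a `null` result as "return a blank transcript without uploading". An energy threshold alone cannot tell speech from tapping or room noise, which is one reason a second check on the transcription response is still useful.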

Why

Manual validation showed #504 was not usable: long silence still produced hallucinated text. This replacement was tuned against real failing clips and validated in the installed app.

Validated behavior

  • pure silence for ~10s -> blank, no hallucination
  • ~10s silence then short phrase -> only the spoken phrase
  • short phrase then ~10s silence -> only the spoken phrase
  • ambient tapping / room noise without speech -> blank
  • custom-dictionary term after silence still transcribes
  • normal dictation remains intact
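
The blank results for pure silence and for room noise are the kind of outcome a filter on `verbose_json` segment metadata can produce. Below is a minimal sketch of such a filter, assuming each segment carries the Whisper-style fields `no_speech_prob`, `avg_logprob`, and `compression_ratio`. The function names and thresholds are hypothetical; the defaults are borrowed from the open-source Whisper decoder settings, not from this PR, and dropping a segment on a high compression ratio is an assumption made for illustration.

```typescript
// Subset of the per-segment fields in a Whisper-style verbose_json response.
interface Segment {
  text: string;
  no_speech_prob: number;    // model's estimate that the segment has no speech
  avg_logprob: number;       // mean token log-probability (closer to 0 = more confident)
  compression_ratio: number; // high values suggest repetitive, looping text
}

interface FilterThresholds {
  maxNoSpeechProb: number;
  minAvgLogprob: number;
  maxCompressionRatio: number;
}

// Assumed defaults, taken from open-source Whisper decoder settings for illustration.
const DEFAULT_THRESHOLDS: FilterThresholds = {
  maxNoSpeechProb: 0.6,
  minAvgLogprob: -1.0,
  maxCompressionRatio: 2.4,
};

// A segment is suspect when the model both doubts there is speech and is
// unsure of its own tokens, or when the text looks degenerate and repetitive.
function isLikelyHallucination(
  seg: Segment,
  t: FilterThresholds = DEFAULT_THRESHOLDS,
): boolean {
  const probablySilence =
    seg.no_speech_prob > t.maxNoSpeechProb && seg.avg_logprob < t.minAvgLogprob;
  const repetitive = seg.compression_ratio > t.maxCompressionRatio;
  return probablySilence || repetitive;
}

// Joins the text of the segments that survive the filter; returns "" when
// every segment is rejected, which the app can surface as a blank result.
function filterTranscript(
  segments: Segment[],
  t: FilterThresholds = DEFAULT_THRESHOLDS,
): string {
  return segments
    .filter((seg) => !isLikelyHallucination(seg, t))
    .map((seg) => seg.text.trim())
    .filter((text) => text.length > 0)
    .join(" ");
}
```

Requiring both a high `no_speech_prob` and a low `avg_logprob` keeps confident transcriptions of quiet speech, at the cost of letting through hallucinations the model is confident about. Real thresholds would need tuning against failing clips, as described under Why.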

Notes
