[🐞 BUG] Dictation activation delay causes first words to be dropped #236

## Bug Description

When activating dictation via hotkey, there is a significant delay (~500ms–1s+) before the microphone actually starts capturing audio. Any words spoken during this startup window are permanently lost, as there is no mechanism to retroactively capture them.

This is especially noticeable when you start speaking immediately after pressing the hotkey, which is the natural user behavior.

## Steps to Reproduce

1. Launch FluidVoice with Parakeet (or any model)
2. Press the dictation hotkey
3. Immediately begin speaking a sentence (e.g., "Hello this is a test")
4. Observe the transcription output

**Expected:** The full sentence is captured: "Hello this is a test"
**Actual:** The first word is dropped: "This is a test"


**The following analysis of why this happens was generated by Opus:**

## Root Cause Analysis

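The activation path up to and including `ASRService.start()` runs several sequential blocking steps **before** installing the audio tap (steps 1–4 below), and a further buffering requirement applies once capture has begun (step 6):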

1. **Hotkey → `start()` dispatch** (~200ms) — `Task { @MainActor }` dispatch plus ContentView callback overhead
2. **Start sound plays synchronously** — before `ASRService.start()` is even called
3. **`configureSession()`** (~85–275ms) — forces `AVAudioEngine` input node instantiation
4. **`startEngine()`** (~150–320ms) — device binding + `prepare()` + `engine.start()`
5. **`setupEngineTap()`** (~1ms) — mic finally starts capturing here
6. **1-second minimum audio accumulation** — `processStreamingChunk()` requires 16,000 samples (1s at 16kHz) before attempting transcription

The core issue is that **audio capture does not begin until all setup is complete**, and `ThreadSafeAudioBuffer` is cleared on every `start()` call with no circular/ring buffer to preserve earlier audio.

Additionally, `stop()` deallocates the `AVAudioEngine` entirely, forcing a full rebuild on every subsequent `start()`.
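
To make the size of the startup window concrete, the per-step estimates listed above can be summed. The sketch below is only arithmetic on the figures quoted in this issue; the type and function names are hypothetical, and the start sound is left out because no duration is given for it. The total comes to about 436–796ms before the tap is installed, which is roughly in line with the ~500ms–1s+ delay reported once the unmeasured start sound is added.

```swift
// Per-step estimates in milliseconds, copied from the list above.
// The start sound is omitted because the issue gives no figure for it.
struct StepEstimate {
    let name: String
    let minMs: Double
    let maxMs: Double
}

let stepsBeforeCapture = [
    StepEstimate(name: "hotkey -> start() dispatch", minMs: 200, maxMs: 200),
    StepEstimate(name: "configureSession()", minMs: 85, maxMs: 275),
    StepEstimate(name: "startEngine()", minMs: 150, maxMs: 320),
    StepEstimate(name: "setupEngineTap()", minMs: 1, maxMs: 1),
]

/// Sums the lower and upper bounds of all steps.
func totalDelay(_ steps: [StepEstimate]) -> (minMs: Double, maxMs: Double) {
    let low = steps.reduce(0.0) { $0 + $1.minMs }
    let high = steps.reduce(0.0) { $0 + $1.maxMs }
    return (low, high)
}

/// Converts a delay into the number of samples that go uncaptured at a given rate.
func samplesMissed(delayMs: Double, sampleRate: Double = 16_000) -> Int {
    Int((delayMs / 1000.0 * sampleRate).rounded())
}

let delay = totalDelay(stepsBeforeCapture)
print("Delay before capture: \(Int(delay.minMs))-\(Int(delay.maxMs)) ms")
print("Samples missed at 16 kHz: \(samplesMissed(delayMs: delay.minMs))-\(samplesMissed(delayMs: delay.maxMs))")
```

At 16kHz that window corresponds to roughly 7,000 to 12,700 samples that never reach the buffer, which is enough to cover a short spoken word.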

## Related Issues

- #188 — ~5 second delay traced to `MediaRemoteAdapter` blocking main thread
- #98 — ~1 second delay acknowledged by developer, partially reduced by ~150ms

## Suggested Improvements

### 1. Always-on mic with a rolling circular buffer
Keep a ~2–3 second ring buffer of audio running in the background. On activation, prepend the buffered audio to the recording session. This way words spoken during startup are preserved.
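
A minimal sketch of such a pre-roll buffer, assuming mono `Float` samples at the 16kHz rate mentioned above (so 2–3 seconds is 32,000–48,000 samples). `AudioRingBuffer` is a hypothetical type, not something that exists in FluidVoice, and it is deliberately not thread-safe; a real tap callback would need synchronization comparable to `ThreadSafeAudioBuffer`.

```swift
/// Fixed-capacity circular buffer of audio samples.
/// Once full, each new sample overwrites the oldest one.
struct AudioRingBuffer {
    private var storage: [Float]
    private var writeIndex = 0
    private(set) var count = 0

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        storage = [Float](repeating: 0, count: capacity)
    }

    var capacity: Int { storage.count }

    /// Appends samples, overwriting the oldest ones once the buffer is full.
    mutating func write(_ samples: [Float]) {
        for sample in samples {
            storage[writeIndex] = sample
            writeIndex = (writeIndex + 1) % storage.count
            count = min(count + 1, storage.count)
        }
    }

    /// Returns the buffered samples in chronological order (oldest first).
    func snapshot() -> [Float] {
        let start = (writeIndex - count + storage.count) % storage.count
        return (0..<count).map { storage[(start + $0) % storage.count] }
    }
}
```

On activation, the session's audio would start as `snapshot()` followed by the live samples, so anything spoken during startup is already at the front of the recording.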

### 2. Keep `AVAudioEngine` alive between sessions
Instead of tearing down and recreating the engine on every `stop()`/`start()` cycle, keep it running (or at least prepared) between sessions. This would remove the `configureSession()` + `startEngine()` cost from each activation, which is roughly 235–595ms combined going by the estimates above (85–275ms plus 150–320ms).
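
One way this lifecycle could look, as a sketch rather than a description of the current code: the engine is built and prepared on first use, and `stop()` pauses it instead of releasing it. `CaptureEngine` and `ReusableEngineSession` are hypothetical names. The protocol only lists methods that `AVAudioEngine` already provides (`prepare()`, `start()`, `pause()`), so the real engine should be adoptable with an empty extension, while the sketch itself stays free of AVFoundation and can be exercised with a fake.

```swift
/// The subset of engine operations this sketch needs. AVAudioEngine has
/// methods with these names and shapes, so in the app it could likely adopt
/// the protocol via `extension AVAudioEngine: CaptureEngine {}` (assumption).
protocol CaptureEngine: AnyObject {
    func prepare()
    func start() throws
    func pause()
}

/// Hypothetical session holder: builds the engine once and reuses it.
final class ReusableEngineSession<Engine: CaptureEngine> {
    private let makeEngine: () -> Engine
    private var engine: Engine?
    private(set) var buildCount = 0

    init(makeEngine: @escaping () -> Engine) {
        self.makeEngine = makeEngine
    }

    /// Creates and prepares the engine on first use; later calls reuse it.
    func start() throws {
        let active: Engine
        if let existing = engine {
            active = existing
        } else {
            active = makeEngine()
            active.prepare()   // pay the setup cost only once
            engine = active
            buildCount += 1
        }
        try active.start()
    }

    /// Pauses instead of deallocating, so the next start() skips the rebuild.
    func stop() {
        engine?.pause()
    }
}
```

The design choice here is `pause()` over `stop()`: per Apple's documentation, pausing an `AVAudioEngine` halts the hardware without releasing the resources allocated by `prepare()`, whereas stopping releases them.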

### 3. Start audio capture before UI work
Move `setupEngineTap()` to fire as early as possible in the hotkey callback — before the start sound, before UI state updates, before media pause. Capture first, update UI after.
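
The intended ordering can be sketched as follows. `ActivationSteps` and `handleHotkey` are hypothetical names; each step is passed in as a closure so the sequence is explicit and can be checked without audio hardware. The sketch presumes the tap can be installed immediately, which in practice depends on the engine already being prepared (see suggestion 2).

```swift
/// Hypothetical bundle of the work done when the dictation hotkey fires.
struct ActivationSteps {
    var startCapture: () -> Void     // e.g. install the tap and begin buffering
    var pauseMedia: () -> Void       // pause any playing media
    var playStartSound: () -> Void   // audible feedback
    var updateUI: () -> Void         // overlay, menu bar state, and so on
}

/// Runs the activation steps with audio capture first.
func handleHotkey(_ steps: ActivationSteps) {
    // Capture first: nothing that can block should run before this call.
    steps.startCapture()
    // The remaining steps are user feedback and can tolerate a short delay.
    steps.pauseMedia()
    steps.playStartSound()
    steps.updateUI()
}
```

One trade-off to weigh is that a start sound played after capture begins could be picked up by the microphone, so it may need to be filtered out or kept very short.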

### 4. Reduce the 16,000-sample minimum for first chunk
Allow the first `processStreamingChunk()` call to run with less than 1 second of audio. Even a partial first transcription is better than dropping words entirely.
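
A sketch of what a lower first-chunk threshold might look like. The function names and the 4,800-sample value (0.3s at 16kHz) are assumptions chosen for illustration, not values taken from `processStreamingChunk()`; whether the model produces useful output on such a short window would need to be tested.

```swift
let steadyStateMinimumSamples = 16_000   // 1 s at 16 kHz, as described in this issue
let firstChunkMinimumSamples = 4_800     // 0.3 s at 16 kHz; hypothetical, needs tuning

/// Returns how many samples must be buffered before a chunk is transcribed.
/// Only the first chunk (index 0) gets the reduced threshold.
func minimumSamples(forChunkAt index: Int) -> Int {
    index == 0 ? firstChunkMinimumSamples : steadyStateMinimumSamples
}

/// Decides whether enough audio has accumulated to attempt transcription.
func shouldTranscribe(bufferedSamples: Int, chunkIndex: Int) -> Bool {
    bufferedSamples >= minimumSamples(forChunkAt: chunkIndex)
}
```

With these values, the first partial transcription could be attempted after 0.3s of audio instead of 1s, while later chunks keep the existing 16,000-sample requirement.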

## Environment

- macOS (Apple Silicon)
- FluidVoice latest
- Parakeet TDT v3

