Skip to content

stability: polish, stability & performance sprint (P0-P3)#57

Merged
shuv1337 merged 2 commits intomasterfrom
claude/polish-stability-performance-6stZv
Apr 16, 2026
Merged

stability: polish, stability & performance sprint (P0-P3)#57
shuv1337 merged 2 commits intomasterfrom
claude/polish-stability-performance-6stZv

Conversation

@shuv1337
Copy link
Copy Markdown
Owner

@shuv1337 shuv1337 commented Apr 15, 2026

Thread safety:

  • Add lock protection to waybar keybind caches (1.1)
  • Protect NeMo ASR encode cache with inference lock (1.2)
  • Replace _asr_disabled bool with threading.Event for safe signaling (1.3)

Resource cleanup:

  • Ensure subprocess file handles are closed on all exception paths
    in tts_melotts.py and tts_local.py (1.4)
  • Track daemon threads and join with timeout in do_shutdown (1.5)
  • Add stderr drain thread for Piper subprocess IPC to prevent
    pipe buffer deadlock (1.6)

Performance:

  • Replace blocking sleep loop in ASR drain with Event.wait (2.1)
  • Add exponential backoff to TTS queue retry loops (2.2)
  • Memoize resolved_sherpa_decode_mode property (2.3)
  • Bound transcript overlap detection to 200-char window (2.4)
  • Add configurable typing_subprocess_timeout config key (2.5)

Polish & error handling:

  • Implement ASR circuit breaker with 30s cooldown recovery (3.1)
  • Surface voice-list loading failures to TTS overlay (3.2)
  • Improve download cancellation feedback in wizard (3.3)
  • Replace silent except/pass blocks with diagnostic logging across
    control.py, app.py, tts_player.py, config_io.py (3.4)
  • Fix dropped chunk logging to capture first drop (3.5)

Security:

  • Validate tts_model_id and tts_default_voice_id against safe regex (4.1)
  • Truncate oversized clipboard selection before TTS with warning (4.2)
  • Validate model download URLs against HTTPS + allowed domains (4.3)

Code health:

  • Enable ruff B, SIM, UP, RUF rule sets and fix all violations (5.4)

https://claude.ai/code/session_01Gmuqv2TiFvfWejD3d5Bv5r

Greptile Summary

A broad stability and polish sprint touching 39 files: thread-safe threading.Event for ASR disable signaling, circuit-breaker recovery (30s cooldown), Event-based drain wait replacing busy-poll, daemon thread join on shutdown, stderr drain thread for Piper deadlock prevention, exponential backoff in TTS queue retries, URL allow-list + safe-ID regex validation, and a full ruff lint pass (B/SIM/UP/RUF rule sets).

  • P1 — dependency_errors() always returns []: The ruff F401 auto-fix replaced import torch/nemo/sherpa_onnx/moonshine_onnx # noqa: F401 with bare pass in all three ASR backends (asr_nemo.py, asr_sherpa.py, asr_moonshine.py). Because pass never raises, every except block is dead code and the methods unconditionally return an empty list. shuvoice preflight and the wizard's dependency probe will silently report "deps OK" even when the backend libraries are not installed.

Confidence Score: 4/5

Safe to merge after fixing the broken dependency_errors() methods in the three ASR backends; all other changes are well-structured improvements.

One P1 finding: ruff F401 auto-fix silently replaced all import-based dependency checks with bare pass, making shuvoice preflight and wizard dependency probes report OK for any missing ASR library. The fix is mechanical. All other thread-safety, resource-cleanup, security, and performance changes look correct.

shuvoice/asr_nemo.py, shuvoice/asr_sherpa.py, shuvoice/asr_moonshine.py — dependency_errors() dead-code fix required before merge.

Important Files Changed

Filename Overview
shuvoice/app.py Adds circuit breaker for ASR (30s cooldown), replaces _asr_disabled bool with threading.Event, adds _processing_done Event replacing busy-poll, tracks and joins daemon threads on shutdown, truncates oversized TTS clipboard input, surfaces TTS voice-list errors to overlay — logic looks sound.
shuvoice/asr_nemo.py Adds inference lock to encode cache; ruff F401 auto-fix replaced try-import dependency checks with bare pass, silently breaking dependency_errors() — always returns [].
shuvoice/asr_sherpa.py Same ruff F401 regression in dependency_errors() and dead import guard in _cuda_provider_available(); other changes (round() cleanup, noqa removals) are correct.
shuvoice/asr_moonshine.py dependency_errors() broken by same ruff F401 pass substitution; exception chaining fix (raise ... from exc) is correct.
shuvoice/tts_local.py Adds stderr drain thread to prevent Piper pipe-buffer deadlock; adds finally block to close all subprocess handles on all exit paths — both changes are correct.
shuvoice/tts_melotts.py Adds finally block for handle cleanup (correct), but does not add stderr drain thread, leaving the same pipe-buffer deadlock risk that was fixed in tts_local.py.
shuvoice/tts_player.py Adds exponential backoff to TTS queue put retries with a max-attempt cap; improves worker join logging — changes are correct and improve reliability.
shuvoice/config.py Adds typing_subprocess_timeout config key, safe-ID regex validation for tts_model_id/tts_default_voice_id, and memoizes resolved_sherpa_decode_mode property — all correct.
shuvoice/piper_setup.py Adds _validate_download_url enforcing HTTPS + allowlist (github.com, huggingface.co) before every download; curated voice URLs all use huggingface.co, so the allowlist is satisfied.
shuvoice/waybar/hyprland.py Adds threading.Lock to protect module-level keybind cache globals — correct fix for the stated thread-safety concern.
shuvoice/transcript.py Bounds suffix-overlap search to 200 chars to prevent O(n²) stutter on long transcripts — straightforward and correct.
shuvoice/runtime/state_machine.py Updates all _asr_disabled bool references to _asr_disabled_event.is_set(); adds _processing_done.clear() before _processing.set() to correctly arm the new Event-based wait — logic is sound.
shuvoice/control.py Replaces socket.timeout with TimeoutError (Python 3.3+ alias, correct), replaces silent except/pass blocks with diagnostic log.debug calls — clean improvement.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Recording stops\non_recording_stop] -->|_processing_done.clear\n_processing.set| B[ASR Worker\n_asr_worker loop]
    B --> C{_asr_disabled_event\n.is_set?}
    C -->|Yes| D[_try_asr_circuit_recovery\ncooldown elapsed?]
    D -->|No — wait| B
    D -->|Yes — asr.reset| E{reset OK?}
    E -->|OK| F[_asr_disabled_event.clear\ncircuit closed]
    E -->|Failed| G[reset open_at\ncooldown restarts]
    G --> B
    F --> B
    C -->|No| H[_process_chunk_safe\nunder _asr_lock]
    H --> I{consecutive_failures\n>= threshold?}
    I -->|Yes| J[_disable_asr\ncircuit opens\n30s cooldown]
    I -->|No| K[_recover_asr_after_failure]
    K --> B
    J --> B
    H -->|success| L[transcript commit]
    L -->|_processing.clear\n_processing_done.set| M[_wait_for_processing\nunblocked]
Loading

Comments Outside Diff (2)

  1. shuvoice/asr_nemo.py, line 80-93 (link)

    P1 Dead dependency checks — dependency_errors() always returns []

    The ruff F401 auto-fix replaced import torch # noqa: F401 (and import nemo.collections.asr # noqa: F401) with bare pass statements. Because pass never raises, the except clauses are unreachable dead code, and dependency_errors() unconditionally returns an empty list regardless of whether dependencies are installed.

    This silently breaks shuvoice preflight (check_asr_stack calls dependency_errors() and treats no errors as "deps OK"), and the wizard's maybe_download_model backend dependency probe. The same pattern is broken identically in asr_sherpa.py (line 65–73) and asr_moonshine.py (line 120–131).

    The fix is to restore the guarded imports, e.g. by renaming to avoid the F401 violation:

    try:
        import torch as _torch_check  # noqa: F401
    except Exception as e:
        errors.append(
            f"Missing PyTorch dependency: {e}. Install torch (or python-pytorch-cuda on Arch)."
        )
    
    try:
        import nemo.collections.asr as _nemo_check  # noqa: F401
    except Exception as e:
        errors.append(
            f"Missing NeMo ASR dependency: {e}. "
            "Install nemo-toolkit[asr] (or NeMo from git main)."
        )
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: shuvoice/asr_nemo.py
    Line: 80-93
    
    Comment:
    **Dead dependency checks — `dependency_errors()` always returns `[]`**
    
    The ruff F401 auto-fix replaced `import torch  # noqa: F401` (and `import nemo.collections.asr  # noqa: F401`) with bare `pass` statements. Because `pass` never raises, the `except` clauses are unreachable dead code, and `dependency_errors()` unconditionally returns an empty list regardless of whether dependencies are installed.
    
    This silently breaks `shuvoice preflight` (`check_asr_stack` calls `dependency_errors()` and treats no errors as "deps OK"), and the wizard's `maybe_download_model` backend dependency probe. The same pattern is broken identically in `asr_sherpa.py` (line 65–73) and `asr_moonshine.py` (line 120–131).
    
    The fix is to restore the guarded imports, e.g. by renaming to avoid the F401 violation:
    
    ```python
    try:
        import torch as _torch_check  # noqa: F401
    except Exception as e:
        errors.append(
            f"Missing PyTorch dependency: {e}. Install torch (or python-pytorch-cuda on Arch)."
        )
    
    try:
        import nemo.collections.asr as _nemo_check  # noqa: F401
    except Exception as e:
        errors.append(
            f"Missing NeMo ASR dependency: {e}. "
            "Install nemo-toolkit[asr] (or NeMo from git main)."
        )
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. shuvoice/tts_melotts.py, line 184-192 (link)

    P2 MeloTTS still reads stderr synchronously — same deadlock risk as Piper

    PR item 1.6 added a stderr drain thread to tts_local.py to prevent a pipe-buffer deadlock where the Piper subprocess fills the 64 KB OS pipe buffer on stderr while the parent is blocked reading stdout. The same pattern is not applied here: proc.stderr.read() is still called synchronously after the stdout yield loop. If the MeloTTS helper emits PyTorch/CUDA progress output that exceeds the pipe buffer while stdout is being read, both sides block and the call never completes.

    The fix is the same as tts_local.py — start a threading.Thread to drain stderr into a buffer before entering the stdout read loop, then join it and decode after proc.wait().

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: shuvoice/tts_melotts.py
    Line: 184-192
    
    Comment:
    **MeloTTS still reads stderr synchronously — same deadlock risk as Piper**
    
    PR item 1.6 added a stderr drain thread to `tts_local.py` to prevent a pipe-buffer deadlock where the Piper subprocess fills the 64 KB OS pipe buffer on stderr while the parent is blocked reading stdout. The same pattern is not applied here: `proc.stderr.read()` is still called synchronously after the stdout yield loop. If the MeloTTS helper emits PyTorch/CUDA progress output that exceeds the pipe buffer while stdout is being read, both sides block and the call never completes.
    
    The fix is the same as `tts_local.py` — start a `threading.Thread` to drain stderr into a buffer before entering the stdout read loop, then join it and decode after `proc.wait()`.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: shuvoice/asr_nemo.py
Line: 80-93

Comment:
**Dead dependency checks — `dependency_errors()` always returns `[]`**

The ruff F401 auto-fix replaced `import torch  # noqa: F401` (and `import nemo.collections.asr  # noqa: F401`) with bare `pass` statements. Because `pass` never raises, the `except` clauses are unreachable dead code, and `dependency_errors()` unconditionally returns an empty list regardless of whether dependencies are installed.

This silently breaks `shuvoice preflight` (`check_asr_stack` calls `dependency_errors()` and treats no errors as "deps OK"), and the wizard's `maybe_download_model` backend dependency probe. The same pattern is broken identically in `asr_sherpa.py` (line 65–73) and `asr_moonshine.py` (line 120–131).

The fix is to restore the guarded imports, e.g. by renaming to avoid the F401 violation:

```python
try:
    import torch as _torch_check  # noqa: F401
except Exception as e:
    errors.append(
        f"Missing PyTorch dependency: {e}. Install torch (or python-pytorch-cuda on Arch)."
    )

try:
    import nemo.collections.asr as _nemo_check  # noqa: F401
except Exception as e:
    errors.append(
        f"Missing NeMo ASR dependency: {e}. "
        "Install nemo-toolkit[asr] (or NeMo from git main)."
    )
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: shuvoice/asr_sherpa.py
Line: 156-161

Comment:
**Dead import guard in `_cuda_provider_available()`**

The same ruff F401 auto-fix replaced `import sherpa_onnx  # noqa: F401` with `pass`, making the `except` block unreachable. When `sherpa_onnx` is not installed, the early-exit path `return False, "failed to import sherpa_onnx (...)"` is never taken. Instead the function falls through to `sherpa_lib_dir()`, which catches the missing import itself and returns `None`, so the function ends up returning `(False, "runtime lib directory not found for sherpa_onnx")` — different and less informative, but still `False`. Functional impact is minor, but the diagnostic message is misleading.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: shuvoice/tts_melotts.py
Line: 184-192

Comment:
**MeloTTS still reads stderr synchronously — same deadlock risk as Piper**

PR item 1.6 added a stderr drain thread to `tts_local.py` to prevent a pipe-buffer deadlock where the Piper subprocess fills the 64 KB OS pipe buffer on stderr while the parent is blocked reading stdout. The same pattern is not applied here: `proc.stderr.read()` is still called synchronously after the stdout yield loop. If the MeloTTS helper emits PyTorch/CUDA progress output that exceeds the pipe buffer while stdout is being read, both sides block and the call never completes.

The fix is the same as `tts_local.py` — start a `threading.Thread` to drain stderr into a buffer before entering the stdout read loop, then join it and decode after `proc.wait()`.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "stability: polish, stability & performan..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

Thread safety:
- Add lock protection to waybar keybind caches (1.1)
- Protect NeMo ASR encode cache with inference lock (1.2)
- Replace _asr_disabled bool with threading.Event for safe signaling (1.3)

Resource cleanup:
- Ensure subprocess file handles are closed on all exception paths
  in tts_melotts.py and tts_local.py (1.4)
- Track daemon threads and join with timeout in do_shutdown (1.5)
- Add stderr drain thread for Piper subprocess IPC to prevent
  pipe buffer deadlock (1.6)

Performance:
- Replace blocking sleep loop in ASR drain with Event.wait (2.1)
- Add exponential backoff to TTS queue retry loops (2.2)
- Memoize resolved_sherpa_decode_mode property (2.3)
- Bound transcript overlap detection to 200-char window (2.4)
- Add configurable typing_subprocess_timeout config key (2.5)

Polish & error handling:
- Implement ASR circuit breaker with 30s cooldown recovery (3.1)
- Surface voice-list loading failures to TTS overlay (3.2)
- Improve download cancellation feedback in wizard (3.3)
- Replace silent except/pass blocks with diagnostic logging across
  control.py, app.py, tts_player.py, config_io.py (3.4)
- Fix dropped chunk logging to capture first drop (3.5)

Security:
- Validate tts_model_id and tts_default_voice_id against safe regex (4.1)
- Truncate oversized clipboard selection before TTS with warning (4.2)
- Validate model download URLs against HTTPS + allowed domains (4.3)

Code health:
- Enable ruff B, SIM, UP, RUF rule sets and fix all violations (5.4)

https://claude.ai/code/session_01Gmuqv2TiFvfWejD3d5Bv5r
Comment thread shuvoice/asr_sherpa.py
Comment on lines 156 to 161
@staticmethod
def _cuda_provider_available() -> tuple[bool, str]:
try:
import sherpa_onnx # noqa: F401
except Exception as exc: # noqa: BLE001
pass
except Exception as exc:
return False, f"failed to import sherpa_onnx ({exc})"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Dead import guard in _cuda_provider_available()

The same ruff F401 auto-fix replaced import sherpa_onnx # noqa: F401 with pass, making the except block unreachable. When sherpa_onnx is not installed, the early-exit path return False, "failed to import sherpa_onnx (...)" is never taken. Instead the function falls through to sherpa_lib_dir(), which catches the missing import itself and returns None, so the function ends up returning (False, "runtime lib directory not found for sherpa_onnx") — different and less informative, but still False. Functional impact is minor, but the diagnostic message is misleading.

Prompt To Fix With AI
This is a comment left during a code review.
Path: shuvoice/asr_sherpa.py
Line: 156-161

Comment:
**Dead import guard in `_cuda_provider_available()`**

The same ruff F401 auto-fix replaced `import sherpa_onnx  # noqa: F401` with `pass`, making the `except` block unreachable. When `sherpa_onnx` is not installed, the early-exit path `return False, "failed to import sherpa_onnx (...)"` is never taken. Instead the function falls through to `sherpa_lib_dir()`, which catches the missing import itself and returns `None`, so the function ends up returning `(False, "runtime lib directory not found for sherpa_onnx")` — different and less informative, but still `False`. Functional impact is minor, but the diagnostic message is misleading.

How can I resolve this? If you propose a fix, please make it concise.

- Restore guarded imports (torch, nemo, sherpa_onnx, moonshine_onnx)
  in dependency_errors() and _cuda_provider_available() that were
  incorrectly replaced with `pass` by ruff F401 auto-fix. Without
  the imports, dependency checks always returned [] and preflight/
  wizard probes silently reported deps as OK.

- Add stderr drain thread to tts_melotts.py matching the fix already
  applied to tts_local.py, preventing pipe buffer deadlock when the
  MeloTTS helper emits verbose PyTorch/CUDA output on stderr.

https://claude.ai/code/session_01Gmuqv2TiFvfWejD3d5Bv5r
@shuv1337 shuv1337 merged commit a65c604 into master Apr 16, 2026
2 of 7 checks passed
@shuv1337 shuv1337 deleted the claude/polish-stability-performance-6stZv branch April 16, 2026 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants