Add Vosk Offline Speech‑to‑Text Integration to GreenProductive#3

Open
Lumb3 wants to merge 3 commits into Btelgeuse:master from Lumb3:test_branch

Lumb3 commented Dec 23, 2025


Summary

This pull request introduces a new feature that allows users to add tasks using voice input instead of typing. The feature is powered by the Vosk offline speech‑to‑text model, enabling full functionality without requiring an internet connection.

Users can now click the microphone icon in the UI and simply speak their task, which will automatically populate the task input field.


What’s Included

1. New Backend Speech Service

  • backend/speech_service.py
    A Python script that loads the Vosk model and streams microphone audio for transcription.
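
For reviewers who have not opened the file, the loop such a script typically runs can be sketched as below. This is a minimal sketch, not the contents of backend/speech_service.py: it assumes the `vosk` and `sounddevice` packages, a 16 kHz model, and a hypothetical one-JSON-object-per-line output format whose "final" and "partial" types are named after the onSpeechFinal and onSpeechPartial handlers in this PR.

```python
import json
import queue

SAMPLE_RATE = 16000  # assumption: must match the rate the chosen Vosk model expects


def to_message(raw_json, final):
    """Turn a Vosk result string into a one-line JSON message, or None if empty.

    The {"type": ..., "text": ...} shape is hypothetical, not taken from the PR.
    """
    payload = json.loads(raw_json)
    # Vosk reports finished utterances under "text" and in-progress ones under "partial".
    text = payload.get("text" if final else "partial", "").strip()
    if not text:
        return None
    return json.dumps({"type": "final" if final else "partial", "text": text})


def main(model_path):
    # Third-party imports are deferred so to_message() works without them installed.
    import sounddevice as sd
    from vosk import KaldiRecognizer, Model

    audio = queue.Queue()

    def on_audio(indata, frames, time, status):
        # Called on the audio thread: hand the raw bytes to the main loop.
        audio.put(bytes(indata))

    recognizer = KaldiRecognizer(Model(model_path), SAMPLE_RATE)
    with sd.RawInputStream(samplerate=SAMPLE_RATE, blocksize=8000,
                           dtype="int16", channels=1, callback=on_audio):
        while True:
            chunk = audio.get()
            if recognizer.AcceptWaveform(chunk):
                # True means the recognizer detected the end of an utterance.
                message = to_message(recognizer.Result(), final=True)
            else:
                message = to_message(recognizer.PartialResult(), final=False)
            if message:
                # flush so a parent process sees each line immediately
                print(message, flush=True)
```

Calling `main("model")` would start listening, assuming an unpacked Vosk model directory named `model`. Printing one flushed line per result lets a parent process such as Electron's main.js read stdout line by line and forward each message to the renderer.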

2. PyInstaller Build Configuration

  • speech_service.spec
    Used to package the Python script into a standalone binary so users do not need Python installed.
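
The build step could also be scripted from Python. The helper below is a sketch under assumptions: it presumes the spec produces a one-file build named speech_service (as the dist/speech_service path in this PR suggests), and the function names `expected_binary` and `build` are illustrative, not part of the PR.

```python
import sys
from pathlib import Path


def expected_binary(dist_dir="dist", name="speech_service", platform=sys.platform):
    """Path where a one-file PyInstaller build is expected to land."""
    # PyInstaller appends .exe to executables built on Windows.
    suffix = ".exe" if platform.startswith("win") else ""
    return Path(dist_dir) / (name + suffix)


def build(spec_file="speech_service.spec"):
    """Run PyInstaller on the spec file and return the path of the built binary."""
    import PyInstaller.__main__  # deferred: only needed when actually building

    # --noconfirm overwrites existing build/ and dist/ output without prompting.
    PyInstaller.__main__.run([spec_file, "--noconfirm"])
    binary = expected_binary()
    if not binary.is_file():
        raise FileNotFoundError(f"expected build output at {binary}")
    return binary
```

Because PyInstaller does not cross-compile, `build()` would need to run once on each target operating system.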

3. Binary Build Output (Ignored in Git)

  • build/ and dist/ folders
    Both are generated by PyInstaller and are listed in .gitignore, so they are not committed.
    The dist/ folder contains the compiled binary:

    dist/speech_service

    This binary must be shipped with the application for speech recognition to work.


4. Electron Integration

  • main.js updated to:
    • Launch the speech_service binary as a child process
    • Stream transcription results back to the renderer
    • Provide IPC handlers for:
      • startSpeechService
      • stopSpeechService
      • onSpeechFinal
      • onSpeechPartial

5. UI Enhancements

  • Added a microphone button to the task input area.
  • When activated, the microphone triggers the speech service and inserts recognized text directly into the task field.
  • The icon visually toggles between active/inactive states.

Screenshot for reference: (image: Screenshot 2025-12-22 at 16 50 11)

Important Note for Reviewers

Because platform‑specific build output is not committed to the repository (build/ and dist/ are git‑ignored), the speech_service binary is not included in this PR.
To test the feature locally:

  1. Install Python and download a Vosk model
  2. Run PyInstaller using speech_service.spec to generate:
     dist/speech_service

OR

  1. Download the prebuilt binary from test_branch → Releases and place it in the dist/ folder

Alternatively, we can discuss automating builds through CI.

Why This Feature Matters

  • Enables hands‑free task entry
  • Improves accessibility
  • Works fully offline
  • Integrates seamlessly with the existing UI
