N1ptic/theWhisperer
🎙️ WhispererAI 🤖

An intelligent voice-based AI assistant that transcribes speech and answers questions in real time, using OpenAI's Whisper for transcription and a Llama model (via Ollama) for responses.




✨ Core Features

  • 🎤 Real-time Audio Recording & Transcription: Capture and convert speech to text instantly.
  • 🧠 Local Speech Recognition: Utilizes the Whisper Base model for efficient on-device processing.
  • 💡 AI-Powered Responses: Leverages Llama (via Ollama) for intelligent question answering.
  • 🔊 High-Quality Audio Processing: Includes noise filtering for clearer audio input.
  • 🚀 CUDA Acceleration: Supports GPU acceleration for faster performance.
  • 💻 Cross-Platform Compatibility: Works on Windows, Linux, and macOS.

🛠️ Tech Stack

  • Programming Language: Python 3.8+
  • Speech-to-Text: OpenAI Whisper (Base model)
  • Language Model: Llama (via Ollama)
  • Core Libraries:
    • PyTorch
    • Transformers
    • SoundFile
  • Audio Backend: FFMPEG

📋 Prerequisites

  • 🐍 Python 3.8 or higher.
  • 🎮 CUDA-capable GPU (Optional, but highly recommended for performance).
  • 🎞️ FFMPEG installed and accessible in your system's PATH.
  • 🦙 Ollama installed and running locally.
  • 🎧 A compatible audio input device (Defaults to HyperX Cloud Stinger Core Wireless on Windows, or system default otherwise).
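
Before installing, you can sanity-check the environment with a short script. This is an illustrative helper, not part of the repository; the PyTorch import is optional and only used to report CUDA status.

```python
import shutil
import sys


def python_ok(min_version=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version


def ffmpeg_ok():
    """Return True if an `ffmpeg` executable is on the system PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    print(f"Python >= 3.8: {python_ok()}")
    print(f"FFMPEG on PATH: {ffmpeg_ok()}")
    try:
        import torch  # optional; only needed to report GPU availability
        print(f"CUDA available: {torch.cuda.is_available()}")
    except ImportError:
        print("PyTorch not installed yet; CUDA status unknown.")
```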

🚀 Getting Started: Installation

  1. Clone the Repository:

    git clone https://github.com/yourusername/WhispererAI.git
    cd WhispererAI
  2. Set Up a Virtual Environment:

    python -m venv venv
    • On Windows:
      venv\Scripts\activate
    • On macOS/Linux:
      source venv/bin/activate
  3. Install Dependencies:

    pip install -r requirements.txt
  4. Install and Run Ollama:

    • Download and install Ollama from ollama.com.
    • Ensure the Ollama service is running.
    • Pull the Llama model you intend to use (e.g., ollama pull llama3.2).
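
To confirm the Ollama service is actually reachable before launching the app, you can query its HTTP API (by default on port 11434; the `/api/tags` endpoint lists locally pulled models). A minimal stdlib-only sketch, not part of the repository:

```python
import json
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the Ollama HTTP API answers at `base_url`."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def list_models(base_url="http://localhost:11434", timeout=2.0):
    """Return the names of locally pulled models, or [] if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return []


if __name__ == "__main__":
    if ollama_is_running():
        print("Ollama is up; models:", list_models())
    else:
        print("Ollama is not reachable; start it and pull llama3.2 first.")
```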

💻 How to Use

  1. Launch the Application:

    python app.py
  2. Interact with the Assistant:

    • Press R to Start Recording your voice.
    • Press S to Stop Recording and process the audio.
    • Press C to Clear the terminal screen.
    • Press Q to Quit the application.
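
The key controls above map naturally to a dispatch table. A minimal sketch of that pattern — the names here are illustrative and not taken from `app.py`:

```python
def make_dispatcher(recorder):
    """Bind single-key commands to recorder actions, per the controls above."""
    return {
        "r": recorder.start,  # R: start recording
        "s": recorder.stop,   # S: stop recording and process the audio
        "c": recorder.clear,  # C: clear the terminal screen
        "q": recorder.quit,   # Q: quit the application
    }


def handle_key(dispatch, key):
    """Run the action bound to `key` (case-insensitive); ignore unknown keys."""
    action = dispatch.get(key.lower())
    if action is None:
        return False
    action()
    return True
```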

⚙️ Configuration Details

The application comes with the following default settings:

  • Audio Sample Rate: 48kHz
  • Audio Channels: Mono
  • Whisper Model: openai/whisper-base
  • LLM (via Ollama): llama3.2 (Ensure this model is available in your Ollama setup)
  • Processing Device: CUDA (if available), otherwise CPU.
  • Audio Filters:
    • High-pass: 50Hz
    • Low-pass: 15kHz
    • Volume Boost: 1.5x
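
The filter chain above can be approximated with first-order IIR filters plus a clipped gain stage. The following is a hedged sketch using those defaults — the actual implementation in `app.py` may use different filter designs:

```python
import numpy as np

SAMPLE_RATE = 48_000    # 48 kHz mono, per the defaults above
HIGHPASS_HZ = 50.0      # removes low-frequency rumble
LOWPASS_HZ = 15_000.0   # removes high-frequency hiss
VOLUME_BOOST = 1.5      # gain applied after filtering


def lowpass(x, cutoff_hz, fs=SAMPLE_RATE):
    """First-order IIR low-pass: attenuates content above `cutoff_hz`."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * np.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    y = np.empty_like(x)
    acc = 0.0
    for i, sample in enumerate(x):
        acc += alpha * (sample - acc)
        y[i] = acc
    return y


def highpass(x, cutoff_hz, fs=SAMPLE_RATE):
    """First-order IIR high-pass: removes content below `cutoff_hz`."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * np.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    y = np.empty_like(x)
    prev_x, prev_y = x[0], 0.0
    for i, sample in enumerate(x):
        prev_y = alpha * (prev_y + sample - prev_x)
        prev_x = sample
        y[i] = prev_y
    return y


def process(x):
    """High-pass at 50 Hz, low-pass at 15 kHz, then 1.5x boost with clipping."""
    y = highpass(np.asarray(x, dtype=np.float64), HIGHPASS_HZ)
    y = lowpass(y, LOWPASS_HZ)
    return np.clip(y * VOLUME_BOOST, -1.0, 1.0)
```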

🎤 Audio Device Setup

  • Windows: Attempts to automatically detect "Microphone (HyperX Cloud Stinger Core Wireless DTS)".
  • Linux/macOS: Uses the default system audio input device.
  • ℹ️ If the preferred device isn't found, the application will list available audio devices. You may need to modify app.py to specify your device.
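
Device detection of this kind amounts to substring matching over the device list. A sketch using the `sounddevice` library (an assumption — the repository's actual audio backend may differ); the matching helper itself works on any list of device dicts:

```python
def find_input_device(devices, name_fragment):
    """Return the index of the first input-capable device whose name contains
    `name_fragment` (case-insensitive), or None if no device matches."""
    fragment = name_fragment.lower()
    for index, dev in enumerate(devices):
        if dev.get("max_input_channels", 0) > 0 and fragment in dev.get("name", "").lower():
            return index
    return None


if __name__ == "__main__":
    try:
        # `sounddevice` wraps PortAudio; install with `pip install sounddevice`.
        import sounddevice as sd
    except ImportError:
        sd = None
    if sd is None:
        print("sounddevice not installed; cannot enumerate audio devices.")
    else:
        devices = sd.query_devices()
        idx = find_input_device(devices, "HyperX Cloud Stinger")
        if idx is None:
            print("Preferred microphone not found; available devices:")
            print(devices)
        else:
            print(f"Using input device {idx}: {devices[idx]['name']}")
```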

🤝 Contributing

Contributions are welcome! To submit an improvement or bug fix:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/YourAmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/YourAmazingFeature).
  5. Open a Pull Request.

⚠️ Important Notes

  • Ensure Ollama is running with the specified model before starting WhispererAI.
  • Configure your audio input device in app.py if the default settings don't work for your setup.
  • For the best performance, a CUDA-capable GPU is recommended.

📝 License

This project is licensed under the MIT License. See the LICENSE file for details.


Happy Whispering! 💬
