A simplified, general-purpose voice assistant backend you can use as a starting point to build your own custom voice-driven applications.
This project lets you ask questions by voice and get AI-generated audio replies, demonstrating streaming audio processing, speech-to-text, LLM-powered responses, and text-to-speech output.
Demo video: `demo.mp4`

## Features
- Accepts voice input and generates AI audio replies
- Streams audio for low latency
- Modular FastAPI backend
- Easy to extend for your own use case
## Use cases

- Personal voice assistants
- Customer support bots
- Interactive voice apps
- Smart home interfaces
## Setup

⚠️ This project requires Python 3.11.
Clone the repository:

```bash
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
```

Make sure you're using Python 3.11:

```bash
python --version
# Should output: Python 3.11.x
```

If not, install it from [python.org](https://www.python.org/).

Create and activate a virtual environment, then install the dependencies:

```bash
python -m venv venv
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows

pip install --upgrade pip
pip install -r requirements.txt
```

This project needs ffmpeg for audio processing:
- macOS

  ```bash
  brew install ffmpeg
  ```

- Ubuntu/Linux

  ```bash
  sudo apt update
  sudo apt install ffmpeg
  ```

- Windows

  Download from [FFmpeg.org](https://ffmpeg.org/) and add it to your PATH.
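To verify that ffmpeg is actually reachable from Python before starting the server, a quick check like the following can help (the `ffmpeg_available` helper is illustrative, not part of this project):

```python
import shutil
import subprocess

def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None

if ffmpeg_available():
    # Print the first line of the version banner, e.g. "ffmpeg version 6.x ..."
    out = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
    print(out.stdout.splitlines()[0])
else:
    print("ffmpeg not found - install it before starting the server")
```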
Get your Mistral API key by signing up and following their API Quickstart Guide.
Create a `.env` file in the project root:

```env
API_KEY=your_mistral_api_key_here
```
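The backend presumably loads this file with a library such as python-dotenv; for illustration, here is a stdlib-only sketch of how `.env` parsing works (the `load_env` helper is hypothetical, and it ignores quoting and multiline values):

```python
from pathlib import Path

def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env-style file."""
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and malformed lines
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# Example: write a throwaway file and read the key back
Path(".env.example").write_text("API_KEY=your_mistral_api_key_here\n")
print(load_env(".env.example")["API_KEY"])
```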
This project uses eSpeak to enable additional Coqui TTS voice models.
💻 Windows

- Download and install eSpeak for Windows: 👉 https://espeak.sourceforge.net/
- After installation, add the `espeak/command-line` folder to your PATH environment variable so the `espeak` command is available in the terminal.
🐧 Linux (Ubuntu/Debian)

```bash
sudo apt update
sudo apt install espeak
```

🍎 macOS

```bash
brew install espeak
```

## Run the server

```bash
uvicorn src.main:app --reload --host 0.0.0.0 --port 8000
```

Access it in your browser or API client at:

```
http://localhost:8000
```
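To confirm the server is reachable from a script, a quick stdlib check works; nothing here assumes any endpoint beyond the root URL, and `server_up` is an illustrative helper, not part of this project:

```python
import urllib.request
import urllib.error

def server_up(url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if the backend answers an HTTP request at `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False

print(server_up())  # False until uvicorn is running
```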
## Usage

- Visit the `/ui` endpoint in your browser: `http://localhost:8000/ui`
- Use the web interface to record your question.
- Hear an AI-generated audio reply instantly.
## How it works

- Gradio web UI streams audio to the FastAPI backend.
- Voice Activity Detection (VAD) finds when speech starts and stops.
- Detected speech is transcribed to text.
- Text is sent to the LLM (Mistral) to generate a response.
- The response is converted to speech with TTS.
- Audio reply streams back to the user in real time.
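The pipeline above can be sketched end to end. Everything in this snippet is illustrative: the function names (`detect_speech`, `transcribe`, `ask_llm`, `synthesize`) are hypothetical stand-ins for the real VAD, STT, Mistral, and Coqui TTS components, with dummy bodies so the flow is runnable:

```python
from typing import Iterator

def detect_speech(audio: bytes) -> bytes:
    """VAD stand-in: trim the stream down to the span containing speech."""
    return audio.strip(b"\x00")

def transcribe(speech: bytes) -> str:
    """STT stand-in: turn speech audio into text."""
    return "what is the weather"

def ask_llm(prompt: str) -> str:
    """LLM stand-in: the real backend calls Mistral here."""
    return f"Answer to: {prompt}"

def synthesize(text: str) -> bytes:
    """TTS stand-in: turn the answer text back into audio bytes."""
    return text.encode()

def stream_chunks(audio: bytes, size: int = 4) -> Iterator[bytes]:
    """Yield the reply in small chunks so playback can start early."""
    for i in range(0, len(audio), size):
        yield audio[i:i + size]

def answer(audio_in: bytes) -> bytes:
    speech = detect_speech(audio_in)   # 1. find speech
    text = transcribe(speech)          # 2. speech -> text
    reply = ask_llm(text)              # 3. text -> LLM response
    audio_out = synthesize(reply)      # 4. response -> audio
    return b"".join(stream_chunks(audio_out))  # 5. stream back

print(answer(b"\x00\x00raw-mic-bytes\x00"))
```

In the real backend, step 5 would return the generator to a streaming HTTP response rather than joining the chunks.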
## Extending

- Swap in your preferred LLM
- Customize prompts or dialogue logic
- Add authentication and logging
- Containerize with Docker
- Deploy on AWS Lambda, ECS, etc.
Built to help anyone bootstrap their own voice assistant with clean, minimal code.
This project is licensed under the MIT License. See the LICENSE file for details.
Open an issue or reach out on LinkedIn.