nexpectArpit/VoiGent

VoiGent 🤖

The Multilingual AI Voice Assistant

VoiGent is a professional-grade, voice-enabled AI assistant designed to handle To-Do lists and remember important user details across sessions. It features a high-fidelity voice interface with automatic language detection between English and Hindi.

Deployments:

Frontend: https://voigent.netlify.app

Backend: https://voigent.onrender.com

Key Features

  • High-Fidelity Transcription: Uses Groq Whisper-v3 for near-instant and highly accurate speech-to-text.
  • Automatic Language Detection: Identifies whether you are speaking English, Hindi, or "Hinglish", with no manual toggle.
  • Hybrid TTS Engine:
    • English (Fast Path): Uses the browser's native speech synthesis, which avoids a backend round trip and keeps latency low.
    • Hindi (Quality Path): Routes through the backend using gTTS for authentic pronunciation.
  • Long-term Memory: Remembers your name, preferences, and important facts using an integrated SQLite database.
  • Task Management: Fully functional To-Do list (Add, List, Update, Delete) via LLM function calling.
  • Premium Glassmorphic UI: A modern, responsive interface with vibrant gradients and smooth animations.
  • Mindful Usage System: Built-in persistent message counter and popup to encourage responsible use of free-tier AI resources.
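The language detection and hybrid TTS routing described above can be illustrated with a minimal sketch. This is not the project's actual `detect.py`; it assumes a simple script-based heuristic (any Devanagari character means Hindi), and the names `detect_language` and `choose_tts_path` are hypothetical.

```python
# Hypothetical sketch: script-based EN/HI detection and TTS routing.
# Assumption: Hindi replies are written in Devanagari (U+0900 to U+097F).

def detect_language(text: str) -> str:
    """Return "hi" if the text contains any Devanagari character, else "en"."""
    for ch in text:
        if "\u0900" <= ch <= "\u097F":
            return "hi"
    return "en"


def choose_tts_path(text: str) -> str:
    """Pick the TTS route: backend gTTS for Hindi, browser synthesis for English."""
    if detect_language(text) == "hi":
        return "backend-gtts"
    return "browser-native"


if __name__ == "__main__":
    print(choose_tts_path("Add milk to my list"))
    print(choose_tts_path("मेरा नाम अर्पित है"))
```

Note that this heuristic treats Hinglish typed in Latin script as English. A real implementation might instead rely on the language reported by the speech-to-text model.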

Project Architecture

graph TD
    User((User)) <--> UI[Frontend: Vanilla JS/CSS3]
    UI <--> BE[Backend: FastAPI]
    BE <--> Groq[Groq AI: Llama 3.1 & Whisper]
    BE <--> DB[(SQLite: agent_data.db)]
    BE <--> gTTS[Google TTS: Hindi Voice]
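To make the backend-to-database path concrete, here is a minimal sketch of how LLM function calls could be dispatched to SQLite-backed To-Do operations. It is an illustration under assumptions, not the project's `agent.py` or `database.py`: the `todos` table schema, the tool names, and the `dispatch` helper are all hypothetical, and an in-memory database stands in for `agent_data.db`.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) todos table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS todos ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL)"
    )
    conn.commit()


def add_todo(conn, title):
    cur = conn.execute("INSERT INTO todos (title) VALUES (?)", (title,))
    conn.commit()
    return {"id": cur.lastrowid, "title": title}


def list_todos(conn):
    rows = conn.execute("SELECT id, title FROM todos ORDER BY id").fetchall()
    return [{"id": r[0], "title": r[1]} for r in rows]


def update_todo(conn, todo_id, title):
    cur = conn.execute("UPDATE todos SET title = ? WHERE id = ?", (title, todo_id))
    conn.commit()
    return {"updated": cur.rowcount}


def delete_todo(conn, todo_id):
    cur = conn.execute("DELETE FROM todos WHERE id = ?", (todo_id,))
    conn.commit()
    return {"deleted": cur.rowcount}


# Map tool names (as an LLM might emit them) to handlers.
TOOLS = {
    "add_todo": add_todo,
    "list_todos": list_todos,
    "update_todo": update_todo,
    "delete_todo": delete_todo,
}


def dispatch(conn, name: str, arguments_json: str):
    """Run the handler for a tool call whose arguments arrive as a JSON string."""
    handler = TOOLS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    return handler(conn, **json.loads(arguments_json))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for agent_data.db
    init_db(conn)
    dispatch(conn, "add_todo", '{"title": "Buy milk"}')
    print(dispatch(conn, "list_todos", "{}"))
```

Keeping the handlers in a plain dictionary means the set of callable tools is explicit, and an unrecognized tool name returns an error instead of raising.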

Project Structure

voice-agent/
├── backend/
│   ├── voice/
│   │   ├── detect.py       # Language detection (EN/HI)
│   │   └── tts.py          # gTTS Hindi synthesis
│   ├── agent.py            # LLM & Function calling logic
│   ├── database.py         # SQLite persistence
│   ├── main.py             # FastAPI entry point
│   └── requirements.txt    # Python dependencies
├── frontend/
│   ├── api.js              # Backend API wrappers
│   ├── app.js              # UI & Voice logic
│   ├── index.html          # Application structure
│   └── style.css           # Premium styling
└── README.md

Quick Start

1. Prerequisites

2. Backend Setup

  1. From the project root (voice-agent/), create a virtual environment:
    python3 -m venv venv
  2. Activate the virtual environment:
    • Mac/Linux: source venv/bin/activate
    • Windows: venv\Scripts\activate
  3. Navigate to the backend folder:
    cd backend
  4. Create a .env file and add your key:
    GROQ_API_KEY=your_key_here
  5. Install dependencies:
    pip install -r requirements.txt
  6. Start the server:
    uvicorn main:app --reload
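Before starting the server, it can help to confirm that the .env file from step 4 actually contains a real key. The following is a small standard-library sketch, not part of the project; `read_env_file` and `has_groq_key` are hypothetical helper names, and the parser handles only simple `KEY=value` lines.

```python
from pathlib import Path


def read_env_file(path) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_groq_key(path=".env") -> bool:
    """True if the file exists and GROQ_API_KEY is set to a non-placeholder value."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    key = read_env_file(env_path).get("GROQ_API_KEY", "")
    return bool(key) and key != "your_key_here"


if __name__ == "__main__":
    if has_groq_key():
        print("GROQ_API_KEY found in .env")
    else:
        print("GROQ_API_KEY is missing or still set to the placeholder")
```

Run it from the backend folder so that the relative `.env` path resolves to the file created in step 4.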

3. Frontend Setup

  1. In a new terminal, navigate to the frontend folder from the project root:
    cd frontend
  2. Start a local server:
    python3 -m http.server 3000
  3. Open http://localhost:3000 in your browser.
