garvit-arora/Jarvis

React Vite Python OpenAI Cloudinary Playwright


🤖 A voice-first AI assistant with offline speech recognition, OpenAI-powered responses and text-to-speech, wrapped in a sleek React + Vite interface.



🧠 Architecture

Voice Pipeline

  🎤 Mic  ──▶  🎙 Vosk STT  ──▶  🧠 OpenAI LLM  ──▶  🔊 OpenAI TTS  ──▶  🔈 Speaker
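
The flow above can be read as a chain of three stages. The following is a minimal illustration rather than the project's actual code: `run_pipeline` and the stub stages are hypothetical names, and the real stages would presumably wrap Vosk, an OpenAI chat call and OpenAI TTS.

```python
from typing import Callable

# Hypothetical stage signatures; the real project may organise this differently.
SpeechToText = Callable[[bytes], str]    # e.g. a wrapper around Vosk
Responder = Callable[[str], str]         # e.g. a wrapper around an OpenAI chat call
TextToSpeech = Callable[[str], bytes]    # e.g. a wrapper around OpenAI TTS


def run_pipeline(mic_audio: bytes, stt: SpeechToText, llm: Responder, tts: TextToSpeech) -> bytes:
    """Pass one utterance through STT -> LLM -> TTS and return audio to play."""
    transcript = stt(mic_audio)
    if not transcript.strip():
        return b""  # nothing recognised, so skip the LLM and TTS calls
    reply = llm(transcript)
    return tts(reply)


# Stub stages so the sketch runs without a microphone, model or API key.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")
```

Passing the stages in as callables keeps the offline STT step and the two network-bound OpenAI steps independently replaceable, which also makes the chain easy to exercise with stubs like the ones above.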

Tech Stack

| Layer | Technology | Role |
|---|---|---|
| 🖥️ Frontend | React + Vite | UI & hot module replacement |
| 🐍 Backend | Python | Server & orchestration |
| 🎙️ STT | Vosk | Offline speech-to-text |
| 🧠 AI | OpenAI GPT | Response generation |
| 🔊 TTS | OpenAI TTS | Voice synthesis |
| ☁️ Storage | Cloudinary | Audio & asset hosting |
| 🎭 Automation | Playwright | Browser control |

Project Structure

jarvis-main/
├── src/                              # ⚛️  React + Vite frontend
├── server/
│   ├── main.py                       # 🐍  Backend entry point
│   ├── speech_io.py                  # 🎙️  Vosk STT (loads model at L13)
│   └── requirements.txt
├── vosk-model-small-en-us-0.15/      # 📦  Must stay in project root
├── .env                              # 🔑  Credentials (git-ignored)
└── vite.config.js

⚙️ Setup

Step 1 — Frontend dependencies

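Install the npm packages, then the operating-system libraries that Playwright's browsers depend on. Note that install-deps does not download the browsers themselves; npx playwright install does that if they are missing: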

npm install
sudo npx playwright install-deps

Step 2 — Python virtual environment

Create an isolated environment and install backend dependencies:

python3 -m venv .venv
source .venv/bin/activate
pip install -r server/requirements.txt

Step 3 — Environment variables

Create a .env file in the project root (or, alternatively, at server/.env):

OPENAI_API_KEY=your_key_here
YOUR_CLOUD_NAME=your_cloud_name
YOUR_API_KEY=your_cloudinary_key
YOUR_API_SECRET=your_cloudinary_secret

Warning

Never commit .env files. Both .env and server/.env are already listed in .gitignore.
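
Since credentials may live in either .env or server/.env, the backend has to look in both places. A minimal standard-library sketch follows; `load_env_file` and `load_first_env` are hypothetical helpers, the lookup order is an assumption, and the project may well use a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_first_env(project_root: Path) -> dict[str, str]:
    """Load the first of .env or server/.env that exists (order is an assumption)."""
    for candidate in (project_root / ".env", project_root / "server" / ".env"):
        if candidate.is_file():
            values = load_env_file(candidate)
            # Variables already set in the real environment win over the file.
            for key, value in values.items():
                os.environ.setdefault(key, value)
            return values
    return {}
```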


Step 4 — Download the Vosk model

wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip

Important

Keep the extracted vosk-model-small-en-us-0.15/ folder in the project root. speech_io.py loads it from there at line 13.
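
Because the model folder is git-ignored and downloaded separately, a missing directory is a likely first-run failure. One way to fail early with a clear message is sketched below; `resolve_model_dir` is a hypothetical helper, not the code in speech_io.py, and the returned path would then be handed to `vosk.Model`.

```python
from pathlib import Path

MODEL_DIR_NAME = "vosk-model-small-en-us-0.15"


def resolve_model_dir(project_root: Path) -> Path:
    """Return the model directory, or raise with a hint if it is missing."""
    model_dir = project_root / MODEL_DIR_NAME
    if not model_dir.is_dir():
        raise FileNotFoundError(
            f"Vosk model not found at {model_dir}. "
            "Download and unzip it into the project root (see Step 4)."
        )
    return model_dir
```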


🔑 Environment Variables

| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI secret key for LLM completions and TTS |
| YOUR_CLOUD_NAME | Cloudinary cloud name from your dashboard |
| YOUR_API_KEY | Cloudinary API key |
| YOUR_API_SECRET | Cloudinary API secret (treat it as a password) |
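
A startup check can report every missing variable at once instead of failing on the first API call. The sketch below assumes all four variables are required, which this README does not explicitly confirm; `missing_env_vars` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Assumption: all four variables are needed for a full run.
EXPECTED_VARS = ("OPENAI_API_KEY", "YOUR_CLOUD_NAME", "YOUR_API_KEY", "YOUR_API_SECRET")


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the expected variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` with no argument inspects the real process environment; passing a plain dict makes the check easy to try out in isolation.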

🚀 Running the App

Start the frontend (in one terminal):

npm run dev

Start the backend (in another terminal, from the project root so that the server.main module resolves):

source .venv/bin/activate
python -m server.main

📦 Notes

  • The following are excluded from version control via .gitignore:

    vosk-model-small-en-us-0.15/
    .venv/
    .env
    server/.env
    audio output files
    screenshots/
    
  • The Vosk model enables fully offline speech recognition — no API call needed for STT.

  • OpenAI is used for both response generation (GPT) and voice synthesis (TTS).


Made with ❤️ — "Just A Rather Very Intelligent System"
