GitHub - garvit-arora/Jarvis

🤖 A voice-first AI assistant with offline speech recognition, OpenAI-powered responses and text-to-speech, wrapped in a sleek React + Vite interface.

📋 Table of Contents

🧠 Architecture
⚙️ Setup
🔑 Environment Variables
🚀 Running the App
📦 Notes

🧠 Architecture

Voice Pipeline

  🎤 Mic  ──▶  🎙 Vosk STT  ──▶  🧠 OpenAI LLM  ──▶  🔊 OpenAI TTS  ──▶  🔈 Speaker
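The flow above can be sketched as three pluggable stages chained together. This is only an illustration of the data flow, not the code in server/main.py: the `VoicePipeline` class and its field names are hypothetical, and the stand-in lambdas replace the real Vosk and OpenAI calls so the example runs without credentials.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical wrapper chaining the stages between mic and speaker."""

    transcribe: Callable[[bytes], str]   # STT stage (Vosk in this project)
    respond: Callable[[str], str]        # LLM stage (OpenAI in this project)
    synthesize: Callable[[str], bytes]   # TTS stage (OpenAI in this project)

    def run(self, mic_audio: bytes) -> bytes:
        text = self.transcribe(mic_audio)
        if not text.strip():
            # Nothing was recognized, so skip the remote API calls entirely.
            return b""
        reply = self.respond(text)
        return self.synthesize(reply)


# Stand-in stages for illustration only; real stages would call Vosk and OpenAI.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)
```

Keeping each stage behind a plain callable means the offline STT step can be tested without network access or API keys.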

Tech Stack

Layer	Technology	Role
🖥️ Frontend	React + Vite	UI & hot module replacement
🐍 Backend	Python	Server & orchestration
🎙️ STT	Vosk	Offline speech-to-text
🧠 AI	OpenAI GPT	Response generation
🔊 TTS	OpenAI TTS	Voice synthesis
☁️ Storage	Cloudinary	Audio & asset hosting
🎭 Automation	Playwright	Browser control

Project Structure

jarvis-main/
├── src/                              # ⚛️  React + Vite frontend
├── server/
│   ├── main.py                       # 🐍  Backend entry point
│   ├── speech_io.py                  # 🎙️  Vosk STT (loads model at L13)
│   └── requirements.txt
├── vosk-model-small-en-us-0.15/      # 📦  Must stay in project root
├── .env                              # 🔑  Credentials (git-ignored)
└── vite.config.js

⚙️ Setup

Step 1 — Frontend dependencies

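Install npm packages and the operating-system dependencies that Playwright's browsers require (if the browser binaries themselves are missing, npx playwright install downloads them):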

npm install
sudo npx playwright install-deps

Step 2 — Python virtual environment

Create an isolated environment and install backend dependencies:

python3 -m venv .venv
source .venv/bin/activate
pip install -r server/requirements.txt

Step 3 — Environment variables

Create a .env file in the project root, or a server/.env file, with the following contents:

OPENAI_API_KEY=your_key_here
YOUR_CLOUD_NAME=your_cloud_name
YOUR_API_KEY=your_cloudinary_key
YOUR_API_SECRET=your_cloudinary_secret

Warning

Never commit .env files. Both .env and server/.env are already listed in .gitignore.
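Since either file location is accepted, a loader has to check both. The sketch below shows one way to do that with only the standard library. It is an assumption about how such a lookup could work, not the backend's actual code: `read_env_file` and `load_project_env` are hypothetical names, and letting the root .env override server/.env is an arbitrary choice made for the example.

```python
from pathlib import Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; a missing file yields an empty dict."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_project_env(project_root: Path) -> dict[str, str]:
    """Merge both locations; the root .env wins (assumed precedence)."""
    merged = read_env_file(project_root / "server" / ".env")
    merged.update(read_env_file(project_root / ".env"))
    return merged
```

Calling `load_project_env(Path.cwd())` from the project root returns a plain dictionary, which keeps the secrets out of the process environment until the caller decides to export them.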

Step 4 — Download the Vosk model

wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip

Important

Keep the extracted vosk-model-small-en-us-0.15/ folder in the project root. speech_io.py loads it from there at line 13.
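A quick check that the model folder is where the backend expects it can save a confusing startup error. The sketch below is not the contents of speech_io.py: `find_vosk_model` and `load_recognizer` are hypothetical helpers, and the 16000 Hz sample rate is an assumption about the microphone stream. `Model` and `KaldiRecognizer` are the standard classes from the vosk package, imported lazily so the path check works even before vosk is installed.

```python
from pathlib import Path

MODEL_DIR_NAME = "vosk-model-small-en-us-0.15"


def find_vosk_model(project_root: Path) -> Path:
    """Return the model directory or fail with an actionable message."""
    model_dir = project_root / MODEL_DIR_NAME
    if not model_dir.is_dir():
        raise FileNotFoundError(
            f"Expected the Vosk model at {model_dir}; "
            "download and unzip it in the project root first."
        )
    return model_dir


def load_recognizer(project_root: Path, sample_rate: int = 16000):
    """Build a recognizer; the sample rate is an assumed default."""
    # Imported here so find_vosk_model stays usable without vosk installed.
    from vosk import KaldiRecognizer, Model

    model = Model(str(find_vosk_model(project_root)))
    return KaldiRecognizer(model, sample_rate)
```

Running `find_vosk_model(Path.cwd())` from the project root after Step 4 should return the extracted folder.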

🔑 Environment Variables

Variable	Description	Required
`OPENAI_API_KEY`	OpenAI secret key for LLM completions and TTS	✅
`YOUR_CLOUD_NAME`	Cloudinary cloud name from your dashboard	✅
`YOUR_API_KEY`	Cloudinary API key	✅
`YOUR_API_SECRET`	Cloudinary API secret — treat as a password	✅
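Because all four variables are required, it can help to fail fast at startup when one is absent. The following is a minimal sketch of such a check, assuming the variable names in the table above; `missing_env_vars` is a hypothetical helper and is not part of the repository.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "YOUR_CLOUD_NAME",
    "YOUR_API_KEY",
    "YOUR_API_SECRET",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """List required variables that are unset or blank."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not source.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Passing a dictionary instead of relying on `os.environ` makes the check easy to exercise in tests.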

🚀 Running the App

Start the frontend (in one terminal):

npm run dev

Start the backend (in another terminal):

source .venv/bin/activate
python -m server.main

📦 Notes

The following are excluded from version control via .gitignore:

vosk-model-small-en-us-0.15/
.venv/
.env
server/.env
audio output files
screenshots/

The Vosk model enables fully offline speech recognition — no API call needed for STT.
OpenAI is used for both response generation (GPT) and voice synthesis (TTS).

Made with ❤️ — "Just A Rather Very Intelligent System"
