Commute Zen is a personalised voice-first news assitant that generates calm, personalized commute briefings from current news across your chosen topics.
It uses Gemini Live for conversation, Gemini models for search + summarization + text-to-speech, and Firebase for authentication and cloud history.
Prerequisites: Ensure you have Node.js (v18 or newer) installed.
- Open your terminal and navigate to your project folder.
- Install dependencies:
npm install- Create a
.env.localfile in the root of your project and add your Gemini API key (you can generate one in Google AI Studio):
NEXT_PUBLIC_GEMINI_API_KEY="your_actual_gemini_api_key_here"- Start the local development server:
npm run dev- Open your browser and go to:
http://localhost:3000
graph TD
subgraph Client ["Client (Web Browser)"]
UI["React UI (Next.js)"]
AudioEngine["Web Audio API / AudioWorklet"]
State["React State & idb-keyval"]
GenAISDK["Google GenAI SDK"]
end
subgraph Firebase ["Firebase Services"]
Auth["Firebase Auth (Google)"]
Firestore["Cloud Firestore"]
FS_Users["/users"]
FS_History["/history (metadata)"]
FS_Chunks["/chunks (audio chunks)"]
Firestore --> FS_Users
Firestore --> FS_History
FS_History --> FS_Chunks
end
subgraph Gemini ["Google Gemini API"]
LiveModality["Gemini Live (Multimodal Audio)"]
SearchModel["Gemini 2.5 Flash (Google Search)"]
ScriptModel["Gemini 3 Flash (Scripting)"]
TTSModel["Gemini 2.5 Flash TTS (Audio Gen)"]
end
subgraph External ["External Data"]
WebNews["Web / Google Search Index"]
end
%% Interactions
UI <--> GenAISDK
UI <--> Auth
UI <--> Firestore
GenAISDK <--> LiveModality
GenAISDK -- Fetch News --> SearchModel
GenAISDK -- Draft Script --> ScriptModel
GenAISDK -- Generate Audio --> TTSModel
SearchModel <--> WebNews
%% Logic Flow
LiveModality -- "Tool Call: generateCommuteSummary" --> UI
UI -- "Process Summary" --> SearchModel
TTSModel -- "PCM Audio" --> UI
UI -- "Store Summary & Audio" --> Firestore
- Starts a live voice session with Gemini and asks for your preferred news domains/topics.
- Fetches current topic-specific news with Gemini + Google Search tool.
- Produces a short, podcast-style transcript for commute listening.
- Generates spoken audio and plays it in a custom in-app player.
- Stores transcript + audio history in Firestore when signed in.
- Falls back to local browser storage (IndexedDB) when signed out.
- Next.js 15 (App Router) + React 19 + TypeScript
- Google GenAI SDK (
@google/genai) - Firebase Auth + Firestore
- Tailwind CSS v4 + Motion (animations)
- idb-keyval (local fallback history persistence)
- User taps the mic button to start Gemini Live.
- Live agent asks for topics and calls a tool function (
generateCommuteSummary). - App fetches fresh topic-specific news via Gemini with
googleSearch. - App creates a concise spoken script with Gemini.
- App converts script to audio using Gemini TTS.
- App saves summary metadata, transcript, and chunked audio:
- Firestore when authenticated
- IndexedDB when anonymous
app/
├─ globals.css # Global Tailwind + theme setup
├─ layout.tsx # Root layout, fonts, metadata, ErrorBoundary wrapper
└─ page.tsx # Main UI + Gemini live flow + summary generation logic
components/
└─ ErrorBoundary.tsx # Runtime + Firestore-aware UI error boundary
hooks/
└─ use-mobile.ts # Responsive utility hook
lib/
├─ firebase.ts # Firebase app/auth/firestore bootstrap
└─ utils.ts # Utility className merger
firestore.rules # Firestore security rules
firebase-blueprint.json # Firestore entity + path blueprint
firebase-applet-config.json # Firebase client config
users/{userId}: profile metadatausers/{userId}/history/{historyId}: one generated summary (metadata + transcript)users/{userId}/history/{historyId}/chunks/{chunkId}: base64 audio chunks
Security rules enforce:
- Authenticated access only
- Owner-only read/write per user path
- Field validation and size limits
- Immutable field constraints on updates
Create a .env.local for local development.
NEXT_PUBLIC_GEMINI_API_KEY(required): Gemini API key used by the clientAPP_URL(optional for local, used in hosted contexts): app base URL
Reference: .env.example includes both keys and comments.
npm run dev- Start local development servernpm run build- Create production buildnpm run start- Run production servernpm run lint- Run ESLintnpm run clean- Project clean script as currently configured
- Microphone access is required for live voice interaction.
- Google sign-in popup must be allowed by the browser.
- If Firestore permissions fail, the app surfaces structured diagnostics through the Error Boundary.
- Current creation flow is voice/topic driven (despite some UI text mentioning links).
- "Gemini API key is missing": verify
.env.localhasNEXT_PUBLIC_GEMINI_API_KEYand restart dev server. - Mic button does nothing: check browser mic permissions and HTTPS/localhost context.
- Sign-in fails or closes: allow popups and confirm Firebase Auth provider setup.
- History not loading: verify Firestore rules and that authenticated user owns the data path.