🎙️ Voice Memoir Agent

A lightweight, AI-powered voice journaling assistant for recording memories by speech. It transcribes each recording, replies in natural language, and uses semantic search to recall what you have said before.

🧠 Features

🎤 Voice Input: Upload voice recordings and have them transcribed via OpenAI Whisper (openai-whisper).
💬 AI Conversation: GPT-4 guides and responds to user memories with context-awareness.
🔊 TTS Reply: Replies are converted to audio using pyttsx3.
🧾 Semantic Memory: Stores past memories as SentenceTransformers embeddings and retrieves related ones with FAISS.
⚡ FastAPI Backend: Simple, extendable architecture for demo or production.
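
The Semantic Memory feature comes down to two steps: embed each memory as a vector, then return the stored memories whose vectors are closest to the query. The sketch below illustrates only that retrieval idea, using the standard library so it runs anywhere. It is not the project's code: a toy bag-of-words vector stands in for SentenceTransformers embeddings, a brute-force cosine scan stands in for a FAISS index, and `ToyMemory`, `embed`, `cosine`, `add`, and `search` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """Hypothetical in-memory store; the project itself relies on FAISS."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        """Store a memory together with its (toy) embedding."""
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return up to k stored texts, most similar to the query first."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

Real sentence embeddings capture meaning rather than shared words, so the ranking quality here is far cruder than what the actual stack would give; only the store-then-search shape carries over.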

🗂️ Project Structure

voice_memoir_agent/
├── app.py               # FastAPI entrypoint
├── whisper_utils.py     # Speech-to-text (openai-whisper)
├── gpt_utils.py         # GPT-4 conversation logic
├── tts_utils.py         # Text-to-speech (pyttsx3)
├── memory_faiss.py      # Memory vector store (FAISS)
└── requirements.txt
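
Read together, the modules suggest a single request flow: transcribe the audio, recall related memories, ask the model for a reply, store the new memory, and synthesize speech. The sketch below shows that orchestration with every stage passed in as a plain callable, so it runs without any of the real dependencies. The names `Stages` and `handle_memoir`, the stage signatures, and the order of the steps are assumptions for illustration and may not match what `app.py` actually does.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stages:
    """Hypothetical stage functions; the real ones would live in the modules above."""
    transcribe: Callable[[bytes], str]         # audio bytes -> text
    recall: Callable[[str], list[str]]         # text -> related past memories
    respond: Callable[[str, list[str]], str]   # text + context -> reply text
    remember: Callable[[str], None]            # store the new memory
    speak: Callable[[str], bytes]              # reply text -> audio bytes


def handle_memoir(audio: bytes, stages: Stages) -> bytes:
    """Run one voice entry through the assumed pipeline and return reply audio."""
    text = stages.transcribe(audio)
    context = stages.recall(text)   # look up related memories before storing the new one
    reply = stages.respond(text, context)
    stages.remember(text)
    return stages.speak(reply)
```

Recalling before storing keeps the new entry from matching itself as its own closest memory.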

🚀 Quickstart (Local)

1. Clone repo

git clone https://github.com/xbfightn/voice-memoir-agent.git
cd voice-memoir-agent

2. Set up the environment (conda recommended)

Create and activate the environment:

  conda create -n voice-memoir python=3.11
  conda activate voice-memoir

Install with conda:

  conda install -c conda-forge fastapi uvicorn openai faiss-cpu

Install with pip:

  pip install -r requirements.txt
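
After installing, a quick check that the expected modules are importable can save a confusing startup error. The sketch below is one way to do that with the standard library. The module list is an assumption inferred from the Tech Stack section rather than read from `requirements.txt`, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names differ from some package names:
# faiss-cpu -> faiss, openai-whisper -> whisper,
# sentence-transformers -> sentence_transformers.
EXPECTED_MODULES = [
    "fastapi", "uvicorn", "openai", "faiss",
    "whisper", "pyttsx3", "sentence_transformers",
]


def missing_modules(names: list[str]) -> list[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    print("All modules found." if not missing else f"Missing: {', '.join(missing)}")
```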

Run

Run the project through its actual entry file, for example:

python app.py

3. Set your OpenAI API key

Edit gpt_utils.py:

openai.api_key = "sk-xxxx"
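
A key written directly into `gpt_utils.py` is easy to commit by accident. One common alternative, sketched here as a suggestion and not as the project's current code, is to read it from an environment variable. `load_api_key` is a hypothetical helper; `OPENAI_API_KEY` is the variable name the OpenAI Python library conventionally looks for.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message if the variable is unset or blank.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before starting the app.")
    return key


# Assumed usage inside gpt_utils.py, replacing the hardcoded string:
# openai.api_key = load_api_key()
```

With this in place the key would be exported in the shell (for example `export OPENAI_API_KEY=sk-xxxx`) before starting the server.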

4. Run the app

uvicorn app:app --reload

5. Test with cURL or Postman

curl -X POST "http://localhost:8000/memoir" \
  -H "accept: audio/wav" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@your_audio.mp3" \
  --output reply.wav
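
The same request can be sent from Python. The sketch below mirrors the cURL call using only the standard library: it encodes the audio as `multipart/form-data` under the field name `file`, posts it to `/memoir`, and writes the audio response to disk. `build_multipart` and `send_memoir` are hypothetical helper names, and the server is assumed to be running locally on port 8000 as in the command above.

```python
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def send_memoir(audio_path: str, out_path: str = "reply.wav",
                url: str = "http://localhost:8000/memoir") -> None:
    """POST an audio file to the running server and save the audio reply."""
    with open(audio_path, "rb") as f:
        body, content_type = build_multipart("file", os.path.basename(audio_path), f.read())
    request = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": content_type, "Accept": "audio/wav"},
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

Calling `send_memoir("your_audio.mp3")` while the server is up should leave the reply in `reply.wav`, the same file the cURL command writes.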

🧰 Tech Stack

Layer	Tool
ASR	OpenAI Whisper (openai-whisper)
LLM	OpenAI GPT-4
Memory	SentenceTransformers + FAISS
TTS	pyttsx3
Backend	FastAPI + Uvicorn

📌 TODOs

Frontend: Add React-based voice recording UI
Auth: Per-user memory separation
Memory persistence: Save FAISS + memory to disk
Multilingual support
Optional: Local model support (Hugging Face or other local LLMs)

📄 License

MIT License
