AI-Powered Memory Assistant for Smart Glasses
MindTrace is a production-ready AI memory assistant designed for Ray-Ban Meta smart glasses and similar wearable devices. It combines real-time face recognition, live speech transcription, and context-aware AI assistance to help users—particularly those with memory challenges—navigate social interactions with confidence.
💡 See it in action: Check out the Screenshots section below to explore the dashboard, smart glasses HUD, and key features.
- Real-Time Face Recognition: Instant identification using InsightFace Buffalo-S with ArcFace embeddings
- Live Speech-to-Text: Continuous transcription via Faster Whisper with WebRTC VAD
- Context-Aware AI Assistant: Google Gemini 2.5 Flash-powered chat with RAG over user data
- AI Summarizer & Insights: Generate conversation summaries and behavioral insights
- Vector Search: ChromaDB for semantic search across conversations and face embeddings
- Emergency SOS System: One-touch alerts with GPS location sharing
- Smart Reminders: Medication, meal, and activity scheduling with notifications
- Comprehensive Dashboard: Mobile-responsive web interface for caregivers and users
Main dashboard showing interaction statistics, recent contacts, and quick access to key features
Manage contacts with profile photos, relationships, and interaction history
Generate intelligent summaries and insights from your interaction history
Set medication, meal, and activity reminders with customizable schedules
One-touch emergency alerts with GPS location sharing for caregivers
Real-time HUD overlay on Ray-Ban Meta smart glasses with face recognition and transcription
Technology Stack:
- Detection Model: RetinaFace (InsightFace Buffalo-S)
- Embedding Model: ArcFace (512-dimensional face embeddings)
- Inference: ONNX Runtime 1.23.2 with CPU optimization
- Storage: ChromaDB with cosine similarity search
- Threshold: 0.45 similarity score for positive identification
- Detection Size: 320x320 for optimal speed/accuracy balance
How It Works:
- Camera captures frame from smart glasses
- RetinaFace detects all faces with bounding boxes (det_score ≥ 0.5)
- ArcFace generates 512-dim embeddings for each detected face
- ChromaDB performs vector similarity search against stored contacts
- Results streamed back to HUD overlay with name, relationship, and confidence
- Multi-face detection is supported, with results sorted by confidence
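The matching step above can be sketched in plain Python. This is a minimal illustration of the cosine-similarity check that the vector search performs; the 0.45 threshold comes from the configuration above, and `identify` is a hypothetical helper, not the project's actual API:

```python
import math

RECOGNITION_THRESHOLD = 0.45  # cosine similarity cutoff from the config above

def cosine_similarity(a, b):
    # ArcFace embeddings are L2-normalized, but normalize defensively anyway
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def identify(query, stored):
    """stored maps contact name -> embedding; returns (name, score) or (None, score)."""
    best_name, best_score = None, -1.0
    for name, emb in stored.items():
        score = cosine_similarity(query, emb)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= RECOGNITION_THRESHOLD:
        return best_name, best_score
    return None, best_score
```

In the real pipeline, ChromaDB's HNSW index performs this comparison against all stored contacts at once instead of a Python loop.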
Performance:
- Detection: ~50-100ms per frame on CPU
- Recognition: ~20-30ms per face
- Model warmup on server startup for zero cold-start latency
Technology Stack:
- ASR Model: Faster Whisper (base.en model)
- Backend: CTranslate2 with INT8 quantization
- VAD: WebRTC Voice Activity Detection (aggressiveness=2, min_silence=500ms)
- Streaming: WebSocket with 30ms frame duration
- Sample Rate: 16kHz mono audio
- Beam Size: 1 (greedy decoding for maximum speed)
Pipeline:
- Audio captured from smart glasses microphone
- WebRTC VAD filters non-speech frames
- Speech segments buffered and sent to Faster Whisper
- Transcriptions streamed to HUD in real-time
- Full conversations stored in ChromaDB for semantic search
- Conversation history maintained with automatic session management
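The framing stage of the pipeline above can be sketched as follows. Only the frame math is shown; the actual speech/non-speech decision is delegated to the webrtcvad library, and `split_frames` is an illustrative helper, not project code:

```python
SAMPLE_RATE = 16_000   # Hz, mono (per the stack above)
FRAME_MS = 30          # WebRTC VAD accepts 10, 20, or 30 ms frames
SAMPLE_WIDTH = 2       # bytes per sample for 16-bit PCM

def frame_size_bytes():
    # samples per frame * bytes per sample
    return SAMPLE_RATE * FRAME_MS // 1000 * SAMPLE_WIDTH

def split_frames(pcm: bytes):
    """Split a raw PCM buffer into fixed-size VAD frames, dropping any partial tail."""
    size = frame_size_bytes()
    return [pcm[i:i + size] for i in range(0, len(pcm) - size + 1, size)]
```

At 16 kHz mono 16-bit audio, each 30 ms frame is 480 samples (960 bytes), which is what the WebSocket client streams per chunk.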
Optimizations:
- INT8 quantization for 3-4x speedup
- VAD filtering reduces unnecessary processing
- Greedy decoding reduces hallucinations on short chunks
- Transcript caching for smoother output
Technology Stack:
- Model: Google Gemini 2.5 Flash
- RAG Framework: LangChain 1.1+ with HuggingFace embeddings
- Vector DB: ChromaDB with all-MiniLM-L6-v2 embeddings
- Context Window: Multi-turn conversation history (last 3 turns)
- Retrieval: Top-K semantic search (K=5-30 depending on query type)
Features:
- Multi-turn Chat: Natural conversations about your history with context retention
- Summarization: Generate brief, detailed, or analytical summaries of interactions
- Brief: 2-3 paragraphs highlighting key patterns
- Detailed: Comprehensive breakdown by person, timeline, and topics
- Analytical: Insights, trends, and relationship recommendations
- Insights: Discover patterns in conversations (health topics, family interactions, etc.)
- Contact-Aware: Integrates PostgreSQL contact data with ChromaDB interactions
- Statistics: Real-time aggregation of interaction counts, frequencies, and trends
- Plain Text Output: No markdown formatting for clean HUD display
RAG Pipeline:
- Query analysis to determine data sources (contacts, stats, interactions)
- Semantic search across ChromaDB conversation collection
- Structured queries to PostgreSQL for contact info and statistics
- Context assembly with contact details, stats, and relevant interactions
- Prompt engineering with strict anti-hallucination instructions
- Gemini 2.5 Flash generation with plain text formatting
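The context-assembly and prompting steps above can be sketched as follows. The field names and the exact instruction wording are illustrative assumptions, not the project's actual prompt:

```python
ANTI_HALLUCINATION = (
    "Answer using ONLY the context below. "
    "If the answer is not in the context, say you don't know. "
    "Respond in plain text with no markdown."
)

def build_prompt(question, contacts, stats, interactions):
    """Assemble contact details, stats, and retrieved interactions into one prompt."""
    context = []
    if contacts:
        context.append("Contacts:\n" + "\n".join(
            f"- {c['name']} ({c['relationship']})" for c in contacts))
    if stats:
        context.append("Stats:\n" + "\n".join(
            f"- {k}: {v}" for k, v in stats.items()))
    if interactions:
        context.append("Relevant interactions:\n" + "\n".join(
            f"- [{i['timestamp']}] {i['contact_name']}: {i['snippet']}"
            for i in interactions))
    return f"{ANTI_HALLUCINATION}\n\n" + "\n\n".join(context) + f"\n\nQuestion: {question}"
```

The assembled string is then sent to Gemini 2.5 Flash for generation.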
Supported Query Types:
- Contact information ("Who is Sarah?", "What's John's phone number?")
- Interaction history ("What did I discuss with Mom last week?")
- Statistics ("How many times did I talk to my doctor?")
- Temporal queries ("When did I last see my neighbor?")
- Pattern analysis ("What topics do I discuss most with family?")
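Routing between these query types (step one of the RAG pipeline) can be approximated with simple keyword heuristics. This is a hedged sketch; the project's actual query analysis may use different rules entirely:

```python
def classify_query(question: str) -> str:
    """Rough keyword-based routing into the query types listed above."""
    q = question.lower()
    if any(w in q for w in ("how many", "how often", "count")):
        return "statistics"
    if any(w in q for w in ("when did", "last see", "last time")):
        return "temporal"
    if any(w in q for w in ("phone", "email", "who is")):
        return "contact"
    if any(w in q for w in ("topics", "patterns", "trend")):
        return "pattern"
    return "interaction"  # default: semantic search over history
```

The returned label decides which data sources (PostgreSQL contacts/stats vs. ChromaDB interactions) are queried for context.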
Technology Stack:
- Framework: React 19.2 with React Router 7.10
- Build Tool: Vite 7.2 for lightning-fast HMR
- Styling: Tailwind CSS 4.1 with glassmorphism design
- Icons: Lucide React 0.555
- Animations: Framer Motion 12.23
- Maps: React Leaflet 5.0 for GPS tracking
- HTTP Client: Axios 1.13
Features:
- Adaptive Layouts: Grids transform to lists/cards on small screens
- Touch-Optimized: Larger touch targets for mobile interactions
- Progressive Web App (PWA): Installable on home screen
- Theme: Modern glassmorphism UI with smooth animations
- Real-time Updates: Live data synchronization with backend
- Responsive Navigation: Collapsible sidebar for mobile devices
Pages:
- Dashboard Home: Quick stats, recent interactions, and alerts
- Contact Management: Add, edit, delete contacts with profile photos
- Interaction History: Searchable timeline of all interactions
- AI Summarizer: Generate insights and summaries
- Reminders: Medication, meal, and activity scheduling
- SOS: Emergency alert system with GPS tracking
- Settings: User preferences and system configuration
┌─────────────────────────────────────────────────────────────────┐
│ Smart Glasses │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Camera │ │ Microphone │ │ GPS/Sensors │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
└─────────┼──────────────────┼──────────────────┼─────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ Glass Client (React 19) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ HUD Overlay: Face labels, Transcriptions, Alerts │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────┬───────────────────────────────────┘
│ HTTP/WebSocket
▼
┌─────────────────────────────────────────────────────────────────┐
│ FastAPI Server (Python) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Face/ASR │ │ AI/RAG │ │ Stats/Search│ │
│ │ Routes │ │ Routes │ │ Routes │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ InsightFace │ │Faster Whisper│ │ Gemini │ │
│ │ Buffalo-S │ │ + WebRTC │ │ 2.5 Flash │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
└─────────┼──────────────────┼──────────────────┼─────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ Data Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ PostgreSQL │ │ ChromaDB │ │ File Store │ │
│ │ /SQLite │ │ (Vectors) │ │ (Photos) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
▲
│
┌─────────┴───────────────────────────────────────────────────────┐
│ Dashboard Client (React 19) │
│ Contact Management │ Reminders │ SOS │ AI Insights │ History │
└─────────────────────────────────────────────────────────────────┘
Core Framework:
- FastAPI 0.123+ (async web framework)
- Uvicorn 0.38+ (ASGI server)
- SQLAlchemy 2.0+ (ORM)
- Pydantic 2.12+ (data validation)
AI/ML Models:
- Face Recognition: InsightFace (Buffalo-S model with RetinaFace + ArcFace)
- Speech-to-Text: Faster Whisper 1.2+ (base.en model with CTranslate2)
- LLM: Google Gemini 2.5 Flash via google-genai 1.0+
- Embeddings: LangChain + HuggingFace (all-MiniLM-L6-v2)
Deep Learning:
- PyTorch 2.9+ (neural network framework)
- TorchVision 0.24+ (computer vision utilities)
- TorchAudio 2.9+ (audio processing)
- Transformers 4.57+ (HuggingFace models)
- ONNX 1.20 + ONNX Runtime 1.23 (optimized inference)
Computer Vision:
- OpenCV 4.10+ (image processing)
- Pillow 12.0 (image manipulation)
- scikit-image 0.25 (advanced image processing)
- Albumentations 2.0 (image augmentation)
Audio Processing:
- SoundDevice 0.5+ (audio I/O)
- WebRTC VAD 2.0+ (voice activity detection)
- NumPy 2.0+ (numerical computing)
Vector Database:
- ChromaDB 0.4.22+ (vector storage and similarity search)
Relational Database:
- PostgreSQL 13+ (production) / SQLite (development)
- psycopg2-binary 2.9+ (PostgreSQL adapter)
Authentication & Security:
- python-jose 3.5+ (JWT tokens)
- passlib 1.7+ with bcrypt 4.1 (password hashing)
- python-multipart 0.0.20+ (file uploads)
Utilities:
- python-dotenv 1.2+ (environment variables)
- httpx 0.28+ (async HTTP client)
- requests 2.32 (HTTP client)
- pyyaml 6.0 (YAML parsing)
- python-dateutil 2.9 (date utilities)
- tqdm 4.67 (progress bars)
- coloredlogs 15.0 (colored logging)
Scientific Computing:
- NumPy 2.0+ (arrays and matrices)
- SciPy 1.15+ (scientific algorithms)
- scikit-learn 1.7+ (machine learning utilities)
- matplotlib 3.10+ (plotting)
Dashboard Client:
- React 19.2 (UI framework)
- React Router 7.10 (routing)
- Vite 7.2 (build tool)
- Tailwind CSS 4.1 (styling)
- Lucide React 0.555 (icons)
- Framer Motion 12.23 (animations)
- Axios 1.13 (HTTP client)
- React Hot Toast 2.6 (notifications)
- React Leaflet 5.0 (maps)
- Lenis 1.3 (smooth scrolling)
Glass Client:
- React 19.2 (UI framework)
- React Router 7.10 (routing)
- Vite 7.2 (build tool)
- Tailwind CSS 4.1 (styling)
- Lucide React 0.555 (icons)
- Axios 1.13 (HTTP client)
Development Tools:
- ESLint 9.39 (linting)
- Vite Plugin React 5.1 (Fast Refresh)
- TypeScript types for React 19.2
- Node.js 18+ and npm
- Python 3.10-3.12
- uv (Python package manager) - Installation Guide
- PostgreSQL 13+ (optional, SQLite works for development)
- ChromaDB server (optional, can use embedded mode)
# 1. Clone repository
git clone https://github.com/yourusername/mindtrace.git
cd mindtrace
# 2. Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# 3. Setup server
cd server
uv sync # Installs all dependencies from pyproject.toml
# 4. Configure environment variables
cp .env.example .env
# Edit .env with your API keys and configuration
# 5. Setup dashboard client
cd ../client
npm install
cp .env.example .env
# 6. Setup glass client (optional)
cd ../glass-client
npm install
cp .env.example .env

# Server Configuration
PORT=8000
CLIENT_URL=http://localhost:5173
GLASS_URL=http://localhost:5174
SECRET_KEY=your-secret-key-here-min-32-chars-for-jwt
# AI Services (Required)
GEMINI_API_KEY=your-gemini-api-key-here
# Database (PostgreSQL or SQLite)
DATABASE_URL=sqlite:///./mindtrace.db
# For PostgreSQL: postgresql://user:password@localhost:5432/mindtrace
# ChromaDB Configuration
CHROMA_HOST=localhost
CHROMA_PORT=8000
CHROMA_API_KEY= # Optional, for cloud ChromaDB
CHROMA_TENANT=default_tenant
CHROMA_DATABASE=default_database

Dashboard client .env:

VITE_API_URL=http://localhost:8000

Glass client .env:

VITE_API_URL=http://localhost:8000

# Terminal 1: Start ChromaDB (if using external server)
# Skip this if using embedded mode
chroma run --host localhost --port 8000
# Terminal 2: Start FastAPI server
cd server
uv run main.py
# Server runs at http://localhost:8000
# API docs at http://localhost:8000/docs
# Terminal 3: Start dashboard client
cd client
npm run dev
# Dashboard runs at http://localhost:5173
# Terminal 4: Start glass client (optional)
cd glass-client
npm run dev
# Glass HUD runs at http://localhost:5174

- Create an account: Navigate to http://localhost:5173 and register
- Add contacts: Upload profile photos and contact information
- Sync face embeddings: The system will automatically generate face embeddings
- Test face recognition: Use the glass client to test real-time recognition
- Record interactions: Start conversations and see transcriptions in real-time
- Explore AI features: Try the summarizer and chat with your memory
mindtrace/
├── server/ # FastAPI Backend
│ ├── main.py # Application entry point
│ ├── pyproject.toml # Python dependencies (uv)
│ ├── requirements.txt # Python dependencies (pip)
│ ├── .env # Environment variables
│ │
│ ├── app/ # Main application package
│ │ ├── app.py # FastAPI app initialization
│ │ ├── database.py # SQLAlchemy database setup
│ │ ├── models.py # Database models
│ │ ├── chroma_client.py # ChromaDB client singleton
│ │ ├── scheduler.py # Reminder scheduler
│ │ │
│ │ └── routes/ # API route handlers
│ │ ├── authRoutes.py # Authentication (login, register)
│ │ ├── faceRoutes.py # Face recognition API
│ │ ├── asrRoutes.py # Speech-to-text WebSocket
│ │ ├── aiRoutes.py # AI Summarizer & RAG
│ │ ├── contactRoutes.py # Contact CRUD operations
│ │ ├── interactionRoutes.py # Interaction history
│ │ ├── reminderRoutes.py # Reminder management
│ │ ├── sosRoutes.py # Emergency SOS system
│ │ ├── statsRoutes.py # Dashboard statistics
│ │ ├── searchRoutes.py # Semantic search
│ │ ├── chatRoutes.py # AI chat interface
│ │ ├── alertRoutes.py # Alert management
│ │ └── userRoutes.py # User profile management
│ │
│ ├── ai_engine/ # ML/AI Models
│ │ ├── face_engine.py # InsightFace (Buffalo-S)
│ │ ├── rag_engine.py # RAG with Gemini 2.5 Flash
│ │ ├── summarizer.py # Interaction summarization
│ │ │
│ │ └── asr/ # Speech recognition
│ │ ├── asr_engine.py # Faster Whisper engine
│ │ └── conversation_store.py # Conversation management
│ │
│ └── data/ # Local storage
│ ├── faces/ # Face embeddings cache
│ └── conversations/ # Conversation transcripts
│
├── client/ # Dashboard (React 19 + Vite)
│ ├── package.json # Node dependencies
│ ├── vite.config.js # Vite configuration
│ ├── tailwind.config.js # Tailwind CSS config
│ ├── index.html # HTML entry point
│ │
│ └── src/
│ ├── main.jsx # React entry point
│ ├── App.jsx # Main app component
│ ├── index.css # Global styles
│ │
│ ├── pages/ # Page components
│ │ ├── DashboardHome.jsx # Dashboard overview
│ │ ├── InteractionHistory.jsx # Interaction timeline
│ │ ├── AiSummarizer.jsx # AI insights & summaries
│ │ ├── Contacts.jsx # Contact management
│ │ ├── Reminders.jsx # Reminder management
│ │ ├── SOS.jsx # Emergency system
│ │ └── Settings.jsx # User settings
│ │
│ ├── components/ # Reusable components
│ │ ├── DashboardLayout.jsx # Layout wrapper
│ │ ├── Sidebar.jsx # Navigation sidebar
│ │ ├── EditContactModal.jsx # Contact editor
│ │ ├── chatbot/ # AI chat components
│ │ └── ...
│ │
│ ├── services/ # API services
│ │ └── api.js # Axios API client
│ │
│ ├── hooks/ # Custom React hooks
│ ├── utils/ # Utility functions
│ ├── constants/ # Constants and configs
│ └── types/ # TypeScript types (JSDoc)
│
└── glass-client/ # Smart Glasses HUD
├── package.json # Node dependencies
├── vite.config.js # Vite configuration
│
└── src/
├── main.jsx # React entry point
├── App.jsx # Main app component
│
├── pages/
│ └── FaceRecognition.jsx # Main HUD page
│
└── components/
└── HUDOverlay.jsx # Overlay UI component
Recognize faces in an uploaded image.
Request:
{
"image": "base64_encoded_image_data"
}

Response:
{
"faces": [
{
"name": "John Doe",
"relation": "Friend",
"confidence": 0.87,
"bbox": [100, 150, 300, 400],
"det_score": 0.95,
"contact_id": 123
}
]
}

Sync face embeddings from contact profile photos.
Response:
{
"success": true,
"count": 15
}

Stream audio for real-time transcription.
Message Format:
{
"audio": "base64_encoded_audio_chunk",
"user_id": 1,
"contact_name": "John Doe"
}

Response:
{
"transcript": "Hello, how are you doing today?",
"is_final": true
}

Generate a summary of interactions.
Request:
{
"summary_type": "brief",
"days": 7,
"contact_id": 123,
"focus_areas": ["health", "family"]
}

Response:
{
"summary": "Over the past week, you had 5 interactions...",
"interaction_count": 5,
"time_period": {
"start": "2024-01-01T00:00:00Z",
"end": "2024-01-07T23:59:59Z",
"days": 7
}
}

Ask questions about your interaction history.
Request:
{
"question": "What did I discuss with Sarah last week?",
"user_id": 1,
"n_results": 10
}

Response:
{
"answer": "Last week, you discussed...",
"sources": [
{
"interaction_id": 456,
"contact_name": "Sarah",
"timestamp": "2024-01-05T14:30:00Z",
"relevance_score": 0.92,
"snippet": "We talked about the upcoming project..."
}
],
"retrieved_count": 5
}

Multi-turn conversation with context.
Request:
{
"question": "What about her family?",
"user_id": 1,
"conversation_history": [
{
"question": "What did I discuss with Sarah?",
"answer": "You discussed work projects..."
}
]
}

Generate insights about interaction patterns.
Request:
{
"user_id": 1,
"topic": "health"
}

Response:
{
"insights": "Your health-related interactions show...",
"analyzed_interactions": 30,
"total_contacts": 15
}

Get all contacts for a user.
Response:
{
"contacts": [
{
"id": 1,
"name": "John Doe",
"relationship": "friend",
"relationship_detail": "College friend",
"phone_number": "+1234567890",
"email": "john@example.com",
"notes": "Met at university",
"visit_frequency": "weekly",
"last_seen": "2024-01-05T14:30:00Z"
}
]
}

Create a new contact with optional profile photo.
Request (multipart/form-data):
name: "Jane Smith"
relationship: "family"
relationship_detail: "Sister"
phone_number: "+1234567890"
email: "jane@example.com"
profile_photo: [file]
Update contact information.
Delete a contact.
Get interaction history with optional filters.
Query Parameters:
- contact_id: Filter by contact
- start_date: Filter by start date
- end_date: Filter by end date
- limit: Number of results (default: 50)
Create a new interaction record.
Request:
{
"contact_id": 1,
"contact_name": "John Doe",
"summary": "Discussed project timeline",
"full_details": "We talked about...",
"key_topics": ["work", "deadlines"],
"location": "Office"
}

Get dashboard statistics.
Response:
{
"total_contacts": 25,
"total_interactions": 150,
"interactions_this_week": 12,
"interactions_this_month": 45,
"top_contacts": [
{
"name": "John Doe",
"count": 20,
"last_interaction": "2024-01-05T14:30:00Z"
}
],
"recent_interactions": [...],
"interaction_trend": [...]
}

Semantic search across interactions.
Request:
{
"query": "health discussions",
"user_id": 1,
"n_results": 10
}

Response:
{
"results": [
{
"interaction_id": 123,
"contact_name": "Dr. Smith",
"timestamp": "2024-01-03T10:00:00Z",
"content": "Discussed blood pressure...",
"relevance_score": 0.89
}
]
}

For complete API documentation, visit http://localhost:8000/docs after starting the server.
Model Architecture:
- Detection: RetinaFace (lightweight variant)
- Embedding: ArcFace ResNet-50
- Input Size: 320x320 pixels
- Output: 512-dimensional L2-normalized embeddings
- Inference Backend: ONNX Runtime with CPU optimization
Performance Characteristics:
- Speed: ~3x faster than Buffalo-L with minimal accuracy loss
- Detection Threshold: 0.5 (det_score)
- Recognition Threshold: 0.45 (cosine similarity)
- Optimal Range: 0.5m - 3m from camera
- Multi-face: Supports multiple faces per frame
Why Buffalo-S?
- Optimized for real-time wearable applications
- Lower memory footprint (~100MB vs ~300MB for Buffalo-L)
- Faster inference on CPU (50-100ms vs 150-300ms)
- Sufficient accuracy for close-range face recognition
- Better suited for battery-powered devices
Model Architecture:
- Base Model: OpenAI Whisper base.en
- Backend: CTranslate2 (optimized C++ inference)
- Quantization: INT8 (4x speedup, minimal accuracy loss)
- Parameters: ~74M (base model)
- Languages: English only (en)
Performance Characteristics:
- Speed: ~4x faster than original Whisper
- Latency: ~100-200ms per chunk
- Accuracy: ~95% word accuracy (≈5% WER) on clean speech
- Memory: ~500MB RAM
- Beam Size: 1 (greedy decoding for speed)
Optimizations:
- VAD filtering reduces unnecessary processing by ~60%
- INT8 quantization provides 3-4x speedup
- Greedy decoding reduces hallucinations
- Condition on previous text disabled for short chunks
Why Faster Whisper?
- 4x faster than original Whisper implementation
- Lower memory usage with INT8 quantization
- Better suited for real-time streaming
- CTranslate2 backend optimized for CPU inference
- Maintains high accuracy on conversational speech
Model Characteristics:
- Version: Gemini 2.5 Flash
- Context Window: 1M tokens input, 8K tokens output
- Latency: ~1-2 seconds for typical queries
- Cost: Optimized for high-volume applications
- Capabilities: Multi-turn conversation, RAG, summarization
Use Cases in MindTrace:
- RAG Query Answering: Retrieve and synthesize information from interaction history
- Summarization: Generate brief, detailed, or analytical summaries
- Insights Generation: Analyze patterns and provide recommendations
- Multi-turn Chat: Maintain conversation context across multiple turns
- Contact Analysis: Understand relationships and communication patterns
Prompt Engineering:
- Strict anti-hallucination instructions
- Plain text output (no markdown) for HUD display
- Context-aware prompts with contact data, stats, and interactions
- Explicit instructions to only use provided data
- Differentiation between "Last Seen" and "Interactions"
Why Gemini 2.5 Flash?
- Fast inference for real-time applications
- Large context window for comprehensive RAG
- Cost-effective for high-volume usage
- Strong reasoning capabilities for insights
- Reliable plain text generation
Model Characteristics:
- Architecture: Sentence Transformer (MiniLM)
- Dimensions: 384
- Max Sequence Length: 256 tokens
- Use Case: Semantic search over conversations
Performance:
- Speed: ~10ms per sentence on CPU
- Quality: High semantic similarity accuracy
- Size: ~80MB model file
Integration:
- Used by ChromaDB for automatic text embedding
- Powers semantic search across interaction history
- Enables RAG retrieval for AI assistant
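The retrieval pattern looks like the following sketch. A toy bag-of-words vectorizer stands in for all-MiniLM-L6-v2 so the example stays dependency-free; the real embeddings are dense 384-dimensional vectors, and `embed`/`top_k` are illustrative names, not project functions:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a sentence transformer: sparse word counts instead of dense vectors
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, docs: list[str], k: int = 5) -> list[str]:
    """Rank stored documents by similarity to the query, keep the top k."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]
```

ChromaDB performs the same rank-by-similarity operation over its HNSW index, but against learned sentence embeddings that also capture paraphrases, not just shared words.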
- Model Selection: Buffalo-S provides 3x speedup over Buffalo-L
- Detection Size: 320x320 balances speed and accuracy
- Warmup: Models pre-loaded on server startup (zero cold start)
- Filtering: det_score ≥ 0.5 reduces false positives
- Batch Processing: Multiple faces processed in single inference
- ONNX Runtime: Optimized C++ inference engine
- Faster Whisper: 4x faster than original Whisper
- INT8 Quantization: 3-4x speedup with minimal accuracy loss
- VAD Filtering: Reduces processing by ~60%
- Greedy Decoding: Beam size 1 for maximum speed
- Chunk Size: 30ms frames for low latency
- Transcript Caching: Smoother output with deque cache
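The transcript-caching idea from the list above can be sketched with a `deque` (a minimal illustration under assumed behavior; the project's actual cache logic may differ):

```python
from collections import deque

class TranscriptCache:
    """Keep the last few finalized chunks so the HUD shows a smooth rolling transcript."""

    def __init__(self, maxlen: int = 5):
        self._chunks = deque(maxlen=maxlen)  # oldest chunks fall off automatically

    def add(self, text: str) -> None:
        text = text.strip()
        # Skip empties and exact repeats, which streaming ASR often emits
        if text and (not self._chunks or text != self._chunks[-1]):
            self._chunks.append(text)

    def current(self) -> str:
        return " ".join(self._chunks)
```

A bounded deque keeps memory constant during long conversations while still giving the overlay a few chunks of trailing context.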
- ChromaDB: HNSW index for fast vector similarity search
- PostgreSQL: Indexed queries on user_id, contact_id, timestamp
- Connection Pooling: SQLAlchemy connection pool
- Lazy Loading: Relationships loaded on-demand
- Batch Operations: Bulk inserts for face embeddings
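The bulk-insert point can be illustrated with stdlib sqlite3; the table name and schema here are hypothetical, and the project actually goes through SQLAlchemy models:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE face_embeddings (contact_id INTEGER, vector BLOB)")

# executemany batches all rows into one prepared statement
# instead of issuing N separate INSERTs
rows = [(i, b"\x00" * 16) for i in range(100)]
conn.executemany("INSERT INTO face_embeddings VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM face_embeddings").fetchone()[0]
```

The same pattern applies with SQLAlchemy's bulk insert APIs when syncing many face embeddings at once.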
- Vite: Lightning-fast HMR and optimized builds
- Code Splitting: React.lazy for route-based splitting
- Image Optimization: Lazy loading and responsive images
- Debouncing: Search and input debouncing
- Memoization: React.memo for expensive components
Recommended Stack:
- Server: Ubuntu 20.04+ or similar Linux distribution
- Python: 3.10-3.12 with uv package manager
- Database: PostgreSQL 13+ (managed service recommended)
- Vector DB: ChromaDB Cloud or self-hosted with persistent storage
- Web Server: Nginx as reverse proxy
- Process Manager: systemd or supervisor
Environment Setup:
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and setup
git clone https://github.com/yourusername/mindtrace.git
cd mindtrace/server
uv sync
# Configure production environment
cp .env.example .env
# Edit .env with production values
# Run with systemd
sudo systemctl start mindtrace

Systemd Service Example:
[Unit]
Description=MindTrace API Server
After=network.target
[Service]
Type=simple
User=mindtrace
WorkingDirectory=/opt/mindtrace/server
Environment="PATH=/home/mindtrace/.local/bin:/usr/bin"
ExecStart=/home/mindtrace/.local/bin/uv run uvicorn app.app:app --host 0.0.0.0 --port 8000
Restart=always
[Install]
WantedBy=multi-user.target

Nginx Configuration:
server {
listen 80;
server_name api.mindtrace.com;
location / {
proxy_pass http://127.0.0.1:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
location /asr/stream {
proxy_pass http://127.0.0.1:8000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
}

Build for Production:
# Dashboard
cd client
npm run build
# Output: dist/
# Glass Client
cd glass-client
npm run build
# Output: dist/

Deployment Options:
- Static Hosting: Vercel, Netlify, Cloudflare Pages
- CDN: AWS CloudFront, Cloudflare CDN
- Self-hosted: Nginx serving static files
Nginx Static Hosting:
server {
listen 80;
server_name mindtrace.com;
root /var/www/mindtrace/client/dist;
index index.html;
location / {
try_files $uri $uri/ /index.html;
}
}

Docker Compose Example:
version: '3.8'
services:
postgres:
image: postgres:15
environment:
POSTGRES_DB: mindtrace
POSTGRES_USER: mindtrace
POSTGRES_PASSWORD: ${DB_PASSWORD}
volumes:
- postgres_data:/var/lib/postgresql/data
ports:
- "5432:5432"
chromadb:
image: chromadb/chroma:latest
environment:
CHROMA_SERVER_AUTH_CREDENTIALS: ${CHROMA_API_KEY}
volumes:
- chroma_data:/chroma/chroma
ports:
- "8001:8000"
api:
build: ./server
environment:
DATABASE_URL: postgresql://mindtrace:${DB_PASSWORD}@postgres:5432/mindtrace
CHROMA_HOST: chromadb
CHROMA_PORT: 8000
GEMINI_API_KEY: ${GEMINI_API_KEY}
SECRET_KEY: ${SECRET_KEY}
ports:
- "8000:8000"
depends_on:
- postgres
- chromadb
volumes:
- ./server/data:/app/data
dashboard:
build: ./client
ports:
- "80:80"
depends_on:
- api
volumes:
postgres_data:
chroma_data:

- API Keys: Store in environment variables, never commit to git
- JWT Secrets: Use strong, randomly generated secrets (32+ characters)
- HTTPS: Always use SSL/TLS in production
- CORS: Restrict origins to known domains
- Rate Limiting: Implement rate limiting on API endpoints
- Input Validation: Pydantic models validate all inputs
- SQL Injection: SQLAlchemy ORM prevents SQL injection
- File Uploads: Validate file types and sizes
- Authentication: JWT tokens with expiration
- Database: Use strong passwords and restrict network access
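The file-upload check from the list above can be sketched as follows; the size limit and MIME allow-list are illustrative choices, not the project's actual values:

```python
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}  # assumed allow-list
MAX_BYTES = 5 * 1024 * 1024  # assumed 5 MB cap

def validate_upload(content_type: str, size_bytes: int) -> tuple[bool, str]:
    """Return (ok, reason) for a profile-photo upload."""
    if content_type not in ALLOWED_TYPES:
        return False, f"unsupported type: {content_type}"
    if size_bytes > MAX_BYTES:
        return False, "file too large"
    return True, "ok"
```

In FastAPI this check would typically run inside the route handler before the `UploadFile` contents are written to disk.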
Recommended Tools:
- Application Monitoring: Sentry, DataDog, New Relic
- Logging: ELK Stack (Elasticsearch, Logstash, Kibana)
- Metrics: Prometheus + Grafana
- Uptime: UptimeRobot, Pingdom
FastAPI Logging:
import logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
handlers=[
logging.FileHandler('mindtrace.log'),
logging.StreamHandler()
]
)

# Backend development
cd server
uv sync
uv run main.py # Auto-reload enabled
# Frontend development
cd client
npm install
npm run dev # HMR enabled
# Glass client development
cd glass-client
npm install
npm run dev

Backend (Python):
# Linting
ruff check .
# Formatting
black .
# Type checking
mypy .

Frontend (JavaScript):
# Linting
npm run lint
# Formatting
npx prettier --write .

Backend Tests:
cd server
pytest tests/

Frontend Tests:
cd client
npm test

Using Alembic:
cd server
# Create migration
alembic revision --autogenerate -m "Add new column"
# Apply migration
alembic upgrade head
# Rollback
alembic downgrade -1

Adding New Models:

- Face Recognition: Replace Buffalo-S in ai_engine/face_engine.py
- Speech Recognition: Replace Faster Whisper in ai_engine/asr/asr_engine.py
- LLM: Replace Gemini in ai_engine/rag_engine.py and ai_engine/summarizer.py
We welcome contributions! Please follow these guidelines:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
Contribution Guidelines:
- Follow existing code style and conventions
- Add tests for new features
- Update documentation as needed
- Keep commits atomic and well-described
- Ensure all tests pass before submitting PR
Problem: No faces detected or low accuracy
Solutions:
- Check lighting conditions (face recognition works best in good lighting)
- Ensure camera is 0.5m - 3m from subject
- Verify face embeddings are synced: POST /face/sync
- Check ChromaDB connection: curl http://localhost:8000/health
- Lower the detection threshold in face_engine.py if needed
Problem: High latency or slow transcription
Solutions:
- Ensure Faster Whisper is using INT8 quantization
- Check CPU usage (should be < 50% per core)
- Verify VAD is enabled and filtering silence
- Consider using the tiny.en model for faster inference
- Check audio quality (16kHz mono recommended)
Problem: Cannot connect to ChromaDB
Solutions:
- Verify ChromaDB is running: chroma run --host localhost --port 8000
- Check environment variables in .env
- Test connection: curl http://localhost:8000/api/v1/heartbeat
- Check firewall settings
- For cloud ChromaDB, verify API key and tenant/database names
Problem: Alembic migration fails
Solutions:
- Check the database connection string in .env
- Ensure PostgreSQL is running
- Verify the database user has proper permissions
- Reset migrations: alembic downgrade base && alembic upgrade head
- For SQLite, check file permissions
Problem: Server crashes with OOM error
Solutions:
- Reduce batch size for face recognition
- Use a smaller Whisper model (tiny.en or base.en)
- Limit ChromaDB query results (n_results)
- Increase server RAM (minimum 4GB recommended)
- Enable swap space on Linux
Problem: ASR WebSocket disconnects frequently
Solutions:
- Check network stability
- Increase WebSocket timeout in Nginx/proxy
- Verify audio chunk size (30ms recommended)
- Check server logs for errors
- Ensure client is sending keep-alive messages
Hardware: MacBook Pro M1, 16GB RAM
| Operation | Latency | Throughput |
|---|---|---|
| Face Detection (single face) | 50-100ms | 10-20 FPS |
| Face Recognition (query) | 20-30ms | 30-50 queries/sec |
| ASR Transcription (1s audio) | 100-200ms | 5-10 chunks/sec |
| RAG Query | 1-2s | 0.5-1 queries/sec |
| Summarization | 2-5s | 0.2-0.5 summaries/sec |
| Database Query (indexed) | 5-10ms | 100-200 queries/sec |
| ChromaDB Vector Search | 10-50ms | 20-100 queries/sec |
Note: Performance varies based on hardware, model size, and data volume.
- ✅ Real-time face recognition with InsightFace Buffalo-S
- ✅ Live speech-to-text with Faster Whisper
- ✅ Context-aware AI assistant with Gemini 2.5 Flash
- ✅ RAG over interaction history
- ✅ AI summarization and insights
- ✅ Contact management with profile photos
- ✅ Interaction history tracking
- ✅ Emergency SOS system
- ✅ Smart reminders
- ✅ Mobile-responsive dashboard
- ✅ Semantic search
- 🔄 Multi-language support (Spanish, French, German)
- 🔄 Emotion detection in conversations
- 🔄 Voice cloning for personalized responses
- 🔄 Offline mode with local models
- 🔄 Mobile apps (iOS/Android)
- 🔄 Integration with calendar and email
- 🔄 Advanced analytics dashboard
- 🔄 Export data to PDF reports
- 📋 Real-time object recognition
- 📋 Scene understanding and context
- 📋 Multi-modal AI (vision + audio + text)
- 📋 Predictive reminders based on patterns
- 📋 Social network graph visualization
- 📋 Integration with health monitoring devices
- 📋 Voice commands for hands-free operation
- 📋 Collaborative features for caregivers
Q: What smart glasses are supported? A: MindTrace is designed for Ray-Ban Meta smart glasses but can work with any device that has a camera, microphone, and can run a web browser.
Q: Can I use this without smart glasses? A: Yes! You can use the dashboard and upload photos/audio manually. The glass client is optional.
Q: Is my data private? A: Yes. All data is stored locally or in your own database. Face embeddings and conversations never leave your server unless you use cloud services (Gemini API, ChromaDB Cloud).
Q: Can I use different AI models? A: Yes! The system is modular. You can replace Gemini with OpenAI, Anthropic, or local models. See "Adding New Models" in the Development section.
Q: What's the minimum hardware requirement? A: Server: 4GB RAM, 2 CPU cores, 10GB storage. Client: Any modern browser. Smart glasses: Ray-Ban Meta or similar.
Q: Does this work offline? A: Partially. Face recognition and speech-to-text work offline, but the AI assistant requires internet for Gemini API. Offline mode with local LLMs is planned for v1.1.
Q: How accurate is the face recognition? A: ~95% accuracy at 0.5-3m range in good lighting. Accuracy decreases with poor lighting, extreme angles, or occlusions.
Q: Can I use this for commercial purposes? A: Yes, under the MIT license. However, check the licenses of individual models (InsightFace, Whisper, etc.) for commercial use restrictions.
- InsightFace - State-of-the-art face recognition models
- Faster Whisper - Optimized Whisper implementation
- OpenAI Whisper - Robust speech recognition
- Google Gemini - Powerful language model
- ChromaDB - Vector database for embeddings
- FastAPI - Modern Python web framework
- React - UI library
- Vite - Next-generation frontend tooling
- Tailwind CSS - Utility-first CSS framework
- LangChain - LLM application framework
- PyTorch - Deep learning framework
- SQLAlchemy - SQL toolkit and ORM
- ArcFace: Deng, J., et al. (2019). "ArcFace: Additive Angular Margin Loss for Deep Face Recognition"
- RetinaFace: Deng, J., et al. (2020). "RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild"
- Whisper: Radford, A., et al. (2022). "Robust Speech Recognition via Large-Scale Weak Supervision"
- Sentence Transformers: Reimers, N., & Gurevych, I. (2019). "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks"
Thank you to all contributors who have helped make MindTrace better!
This project is licensed under the MIT License - see the LICENSE file for details.
- InsightFace: open-source code, but pretrained models are limited to non-commercial research use
- Whisper: MIT License
- FastAPI: MIT License
- React: MIT License
- Tailwind CSS: MIT License
- ChromaDB: Apache 2.0 License
- PyTorch: BSD-style License
Note: Some models (InsightFace) have restrictions on commercial use. Please review individual licenses before deploying commercially.
- Documentation: This README and inline code comments
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: support@mindtrace.com (if applicable)
When reporting bugs, please include:
- Operating system and version
- Python version
- Node.js version
- Steps to reproduce
- Expected vs actual behavior
- Error messages and logs
- Screenshots (if applicable)
We welcome feature requests! Please:
- Check existing issues first
- Describe the feature and use case
- Explain why it would be valuable
- Provide examples if possible
If you use MindTrace in your research or project, please cite:
@software{mindtrace2024,
title = {MindTrace: AI-Powered Memory Assistant for Smart Glasses},
author = {Your Name},
year = {2024},
url = {https://github.com/yourusername/mindtrace}
}

Project Maintainer: Your Name
Email: your.email@example.com
GitHub: @yourusername
Website: https://mindtrace.com (if applicable)
Built with ❤️ for people who need a little help remembering