Transform your precious memories into beautiful stories with AI-powered storytelling and multilingual audio support
A comprehensive FastAPI service that turns your photos into vivid narratives using advanced vision-language models with Cantonese translation and text-to-speech capabilities
Recent Updates - MongoDB Integration and Nominated as Finalist for the 2025 Inter-University GenAI Hackathon
My team, Mingle #1101, has been nominated as a Finalist in the Hackathon!
Your Memory Garden FastAPI has been upgraded from JSON file storage to a production-ready MongoDB backend with significant improvements:
From: JSON file-based storage (stories.json)
To: MongoDB with proper async operations and indexing
# Environment Variables
MONGODB_URI = "mongodb://localhost:27017" # MongoDB connection string
MONGODB_DB_NAME = "community_platform" # Database name
MONGODB_COLLECTION_NAME = "stories"      # Collection for stories

The application now uses a custom ElderDB class loaded from the Backend module:
# Dynamic import from Backend/backend/agents/utils/mongo.py
ElderDB = _load_elder_db()
elder_db = ElderDB()
story_collection = elder_db.connect_collection(
    db_name=MONGODB_DB_NAME,
    collection_name=MONGODB_COLLECTION_NAME,
)

All database operations are now fully asynchronous:
# Before (synchronous JSON operations)
def add(self, record: StoryRecord) -> StoryRecord:
    # File I/O operations
    ...

# After (async MongoDB operations)
async def add(self, record: StoryRecord) -> StoryRecord:
    def _insert() -> str:
        result = self._collection.insert_one(self._serialize(record))
        return str(result.inserted_id)

    inserted_id = await run_in_threadpool(_insert)
    return record.model_copy(update={"id": inserted_id})

Automatic index creation for optimized queries:
@app.on_event("startup")
async def on_startup() -> None:
    await story_repository.ensure_indexes()  # Creates index on 'created_at'

The application now requires additional MongoDB packages:
# Add to your requirements.txt
pymongo>=4.6.0  # MongoDB Python driver (bundles its own bson module)
motor>=3.3.0    # Async MongoDB driver (optional)

pip install pymongo
# or
pip install -r requirements.txt  # if updated

Note: do not install the standalone `bson` package from PyPI; it conflicts with the bson module bundled with pymongo.

- Story IDs: Now use MongoDB ObjectId format instead of UUID
- Photo IDs: Still use UUID for file system consistency
- Validation: Proper ObjectId validation with error handling
# ID Validation Example
from bson import ObjectId
from bson.errors import InvalidId

try:
    object_id = ObjectId(story_id)
except (InvalidId, TypeError):
    return None  # Handle invalid ID gracefully

Enhanced serialization for MongoDB compatibility:
def _serialize(self, record: StoryRecord) -> dict:
    payload = record.model_dump(mode="python", exclude_none=True)
    payload.pop("id", None)  # MongoDB handles _id separately
    return payload

def _deserialize(self, payload: dict) -> StoryRecord:
    data = payload.copy()
    mongo_id = data.pop("_id", None)
    if mongo_id is not None:
        data["id"] = str(mongo_id)  # Convert ObjectId to string
    return StoryRecord(**data)

Proper database connection lifecycle management:
@app.on_event("startup")
async def on_startup() -> None:
    await story_repository.ensure_indexes()

@app.on_event("shutdown")
async def on_shutdown() -> None:
    elder_db.close_connection()  # Clean database connection closure

If you have existing JSON story data, you can migrate it:
# Migration script (run once)
import asyncio
import json
from pathlib import Path

async def migrate() -> None:
    # Read old JSON data
    old_data = json.loads(Path("data/stories.json").read_text())
    # Insert into MongoDB
    for story_data in old_data:
        story_record = StoryRecord(**story_data)
        await story_repository.add(story_record)

asyncio.run(migrate())

# Set MongoDB connection string
export MONGODB_URI="mongodb://localhost:27017"
export MONGODB_DB_NAME="community_platform"
export MONGODB_COLLECTION_NAME="stories"
# For production with authentication
export MONGODB_URI="mongodb://username:password@localhost:27017/database"

- Connection String Security: Use environment variables for credentials
- Database Authentication: Support for MongoDB authentication
- Connection Pooling: Automatic connection management via ElderDB
- Horizontal Scaling: Ready for MongoDB replica sets
- Indexing Strategy: Optimized queries with proper indexes
- Async Operations: Non-blocking database operations
- Connection Pooling: Efficient resource utilization
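The non-blocking pattern behind these bullets wraps synchronous driver calls in a worker thread. A minimal sketch using the stdlib's `asyncio.to_thread` (the app itself uses Starlette's `run_in_threadpool`, which behaves similarly; `blocking_query` is a hypothetical stand-in for a synchronous PyMongo call):

```python
import asyncio

def blocking_query() -> list[str]:
    # Stand-in for a synchronous driver call such as collection.find()
    return ["story-1", "story-2"]

async def fetch_stories() -> list[str]:
    # Offload the blocking call so the event loop stays responsive
    return await asyncio.to_thread(blocking_query)

print(asyncio.run(fetch_stories()))
```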
# Start MongoDB locally
mongod --dbpath /usr/local/var/mongodb
# Or with Docker
docker run -d -p 27017:27017 --name mongodb mongo:latest
# Test the connection
python -c "from pymongo import MongoClient; print(MongoClient().server_info())"

# Test story creation (should return MongoDB ObjectId)
curl -X POST "http://localhost:8000/upload/stories" \
-F "photos=@test.jpg" \
-F "date=2024-05-24" \
-F "weather=Sunny" \
-F "location=Test Location"
# Response will include MongoDB ObjectId:
# {"id": "507f1f77bcf86cd799439011", ...}

All existing API endpoints work the same way:
- ✅ POST /upload/stories - Same functionality
- ✅ GET /stories - Same response format
- ✅ GET /stories/{id} - Now accepts MongoDB ObjectId
- ✅ PUT /stories/{id}/photos - Enhanced performance
- ✅ DELETE /stories/{id}/photos - Atomic operations
The API responses maintain the same structure, with MongoDB ObjectIds converted to strings for JSON compatibility.
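Since story IDs are now ObjectId strings, clients can pre-validate the 24-character hexadecimal format before calling the API. A minimal sketch (the `looks_like_object_id` helper is hypothetical, not part of the service):

```python
import re

# MongoDB ObjectIds serialize to exactly 24 hexadecimal characters
_OBJECT_ID_RE = re.compile(r"^[0-9a-fA-F]{24}$")

def looks_like_object_id(value: str) -> bool:
    # Cheap client-side check before hitting /stories/{id}
    return bool(_OBJECT_ID_RE.fullmatch(value))

print(looks_like_object_id("507f1f77bcf86cd799439011"))  # a valid ObjectId string
print(looks_like_object_id("not-an-id"))
```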
- Story IDs: Now MongoDB ObjectIds instead of UUIDs
- Database Dependency: Requires MongoDB server running
- Environment Variables: New MongoDB configuration required
- Advanced Querying: Complex MongoDB queries for filtering and search
- Aggregation Pipelines: Advanced analytics on story data
- GridFS Support: Large file storage for high-resolution images
- Replica Sets: High availability and read scaling
- Sharding: Horizontal scaling for massive datasets
This MongoDB integration provides a solid foundation for production deployment with enterprise-grade data persistence and scalability.
- AI-Powered Storytelling - Generate compelling narratives using Ollama's LLaVA vision-language model
- Multi-Photo Processing - Upload up to 10 photos and create cohesive stories from multiple images
- Multilingual Audio Support - Automatic Cantonese translation and text-to-speech synthesis
- Comprehensive Storage System - JSON-based persistence with atomic operations and photo management
- Full CRUD Operations - Complete story lifecycle management with photo updates and deletions
- RESTful API Design - Well-structured endpoints with proper HTTP methods and status codes
- High Performance - Async operations with thread pooling for optimal file I/O
- Interactive Documentation - Auto-generated API docs with Swagger UI
- Thread-Safe Operations - Concurrent request handling with proper locking mechanisms
- Rich Context Integration - Include date, location, weather, and personal metadata
- Python 3.8+
- Ollama installed and running locally
- LLaVA model downloaded in Ollama
- Google Translate API access (optional for translations)
# Clone the repository
git clone https://github.com/your-username/Memory-Garden.git
cd Memory-Garden
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Mac/Linux
# or
venv\Scripts\activate # On Windows
# Install dependencies
pip install fastapi uvicorn python-multipart ollama googletrans==3.1.0a0 gtts

# Start Ollama service
ollama serve
# Pull the LLaVA model
ollama pull llava

# Navigate to FastAPI App directory
cd "FastAPI App"
# Start the development server
uvicorn main:app --reload
# API available at: http://localhost:8000
# Documentation: http://localhost:8000/docs

GET / - Returns API information and welcome message.
POST /upload/stories - Upload photos with metadata and generate an AI-powered story.
Form Parameters:
- photos (files, required): Image files (max 10)
- date (string, required): Memory date (YYYY-MM-DD)
- weather (string, required): Weather description
- location (string, required): Location information
Response Example:
{
  "message": "Photo uploaded and story generated successfully.",
  "id": "a1b2c3d4e5f6",
  "date": "2024-05-24",
  "weather": "Sunny with light breeze",
  "location": "Central Park, New York",
  "photos": [
    {
      "id": "photo123",
      "filename": "sunset.jpg",
      "content_type": "image/jpeg",
      "size": 234829,
      "path": "uploads/a1b2c3d4e5f6.jpg"
    }
  ],
  "story": "The golden hour cast its warm glow across Central Park as I captured this perfect moment...",
  "created_at": "2024-05-24T18:30:00Z",
  "updated_at": "2024-05-24T18:30:00Z"
}

GET /stories - Retrieve all stored stories with metadata.
GET /stories/{story_id} - Retrieve a specific story by its ID.
PUT /stories/{story_id}/photos - Update story metadata, photos, and regenerate the narrative.
Form Parameters:
- date (string, optional): Updated date
- weather (string, optional): Updated weather
- location (string, optional): Updated location
- keep_photo_ids (string, optional): Comma-separated IDs of photos to keep
- photos (files, optional): New photos to add
DELETE /stories/{story_id}/photos?photoIds=id1,id2 - Remove specific photos from a story and clear the generated narrative.
GET /stories/{story_id}/audio - Download the Cantonese audio file for a story.
GET /stories/{story_id}/audio/stream - Stream Cantonese audio directly in the browser.
GET /stories/{story_id}/photos - Get all photos associated with a specific story.
GET /stories/{story_id}/photos/{photo_id} - Download a specific photo file.
Handles AI story generation using the Ollama LLaVA model with contextual prompts.
story_generator = OllamaStoryTeller(ollama_client)
story = story_generator.generate_story(
    prompt=contextual_prompt,
    encoded_images=base64_images
)

Manages file system operations with UUID-based naming and CRUD operations.
photo_storage = PhotoStorage(UPLOAD_DIR)
stored_photos = await photo_storage.persist(uploaded_files)

JSON-based persistence layer with atomic operations and thread safety.
story_repository = StoryRepository(STORIES_FILE)
await story_repository.add(story_record)

{
  "id": "unique_uuid",
  "filename": "original_name.jpg",
  "content_type": "image/jpeg",
  "size": 1024000,
  "path": "uploads/uuid.jpg"
}

{
  "id": "story_uuid",
  "date": "2024-05-24",
  "weather": "Sunny",
  "location": "Central Park",
  "photos": [StoredPhoto, ...],
  "story": "Generated narrative...",
  "created_at": "2024-05-24T18:30:00Z",
  "updated_at": "2024-05-24T18:30:00Z"
}

Stories are automatically translated to Cantonese using the Google Translate API:
translator = Translator()
cantonese_text = translator.translate(story_text, dest='yue').text  # .text extracts the translated string

Cantonese audio files are generated using gTTS (Google Text-to-Speech):
tts = gTTS(text=cantonese_text, lang='yue', slow=False)
tts.save(audio_file_path)

- Automatic Generation: Audio files created on-demand
- Caching: Generated audio cached for future requests
- Cleanup: Audio files deleted when stories are updated
- Streaming Support: Both download and streaming endpoints
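The generate-on-demand and caching behaviour can be sketched with the synthesiser passed in as a callable, so the TTS backend (gTTS in this project) is easy to swap or stub. `get_or_create_audio` is a hypothetical helper, not the service's actual function:

```python
from pathlib import Path
from typing import Callable

def get_or_create_audio(
    story_id: str,
    text: str,
    audio_dir: Path,
    synthesize: Callable[[str, Path], None],
) -> Path:
    # Reuse the cached file when present; otherwise synthesise and cache it
    audio_dir.mkdir(parents=True, exist_ok=True)
    audio_path = audio_dir / f"{story_id}_cantonese.mp3"
    if not audio_path.exists():
        synthesize(text, audio_path)
    return audio_path
```

With gTTS, the `synthesize` argument could be something like `lambda text, path: gTTS(text=text, lang='yue').save(str(path))`.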
FastAPI App/
├── main.py              # Main FastAPI application
├── uploads/             # Photo storage directory
├── data/                # JSON persistence layer
│   └── stories.json     # Story database
└── audio/               # Generated Cantonese audio files
    └── {story_id}_cantonese.mp3
# Ollama Configuration
OLLAMA_MODEL=llava
OLLAMA_HOST=http://localhost:11434
# File Paths
UPLOAD_DIR=uploads
DATA_DIR=data
AUDIO_DIR=audio
# API Configuration
MAX_PHOTOS_PER_UPLOAD=10

OLLAMA_STORY_PROMPT = (
    "You are a compassionate storyteller. Receive a sequence of photos and "
    "the contextual details (date, weather, place). Craft a vivid, coherent "
    "narrative that connects all of the photos into a single memory, written "
    "in the first person. Avoid bullet points and reference visual details "
    "from the images when possible."
)

Visit http://localhost:8000/docs for the Swagger UI interface.
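The static storytelling prompt is combined with each story's metadata before being sent to LLaVA. A minimal sketch of that composition (`build_prompt` and its exact context format are hypothetical, not the app's actual code):

```python
def build_prompt(base_prompt: str, date: str, weather: str, location: str) -> str:
    # Append the per-story context to the base storytelling instructions
    context = f"Date: {date}. Weather: {weather}. Place: {location}."
    return f"{base_prompt}\n\n{context}"

print(build_prompt("You are a compassionate storyteller.",
                   "2024-05-24", "Sunny", "Central Park"))
```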
Upload and Generate Story:
curl -X POST "http://localhost:8000/upload/stories" \
-F "photos=@photo1.jpg" \
-F "photos=@photo2.jpg" \
-F "date=2024-05-24" \
-F "weather=Sunny" \
-F "location=Central Park"

Get Cantonese Audio:
curl -O "http://localhost:8000/stories/{story_id}/audio"

Update Story:
curl -X PUT "http://localhost:8000/stories/{story_id}/photos" \
-F "keep_photo_ids=photo1,photo2" \
-F "weather=Cloudy" \
-F "photos=@new_photo.jpg"

Flexible input handling for photo IDs:
# Supports various formats
"photo1,photo2,photo3"            # Comma-separated
["photo1", "photo2", "photo3"]    # Array format
'["photo1", "photo2"]'            # JSON string

All file operations use atomic writes to prevent data corruption:
temp_path = storage_path.with_suffix(".tmp")
temp_path.write_text(json_data)
temp_path.replace(storage_path)  # Atomic replacement

Comprehensive error handling with proper cleanup:
- Failed uploads trigger photo deletion
- Story generation errors clean up stored files
- Audio generation failures provide fallback responses
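The flexible photo-ID input handling described earlier can be sketched as a small normaliser (`parse_photo_ids` is a hypothetical helper name, not the app's actual function):

```python
import json

def parse_photo_ids(raw) -> list[str]:
    # Accept a Python list, a JSON-encoded list, or a comma-separated string
    if isinstance(raw, list):
        return [str(item) for item in raw]
    text = str(raw).strip()
    if text.startswith("["):
        try:
            return [str(item) for item in json.loads(text)]
        except json.JSONDecodeError:
            pass  # fall through to comma-separated parsing
    return [part.strip() for part in text.split(",") if part.strip()]

print(parse_photo_ids("photo1,photo2,photo3"))
print(parse_photo_ids('["photo1", "photo2"]'))
```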
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

- Async Operations: All I/O operations use async/await
- Thread Pooling: CPU-intensive tasks run in thread pools
- File Streaming: Large files streamed for memory efficiency
- Connection Pooling: Ollama client reuse for AI requests
- Health check endpoint: GET /health
- Metrics integration ready for Prometheus
- Structured logging for production debugging
"No module named 'fastapi'"
# Ensure virtual environment is activated
source venv/bin/activate
pip install -r requirements.txt

Ollama Connection Errors
# Verify Ollama is running
ollama serve
ollama list  # Check if the LLaVA model is installed

Translation Errors
# Check Google Translate API access
pip install googletrans==3.1.0a0

Audio Generation Issues
# Verify gTTS installation
pip install gtts

- Increase max_workers for thread pool operations
- Use SSD storage for better I/O performance
- Consider Redis for caching in production
- Implement connection pooling for database operations
- MongoDB Integration: Replace JSON storage with MongoDB
- Multi-language Support: Add more languages beyond Cantonese
- Real-time Processing: WebSocket support for live story generation
- Advanced AI Models: Support for GPT-4V and other vision models
- Cloud Storage: Integration with AWS S3, Google Cloud Storage
- Authentication: User management and story privacy controls
- Batch Processing: Handle multiple story generation requests
- Analytics: Story generation statistics and user insights
Built with ❤️ using FastAPI, Ollama, and modern Python
Documentation • GitHub • Issues

