This project is a backend memory infrastructure system built in Go. It stores, retrieves, and searches user memories by combining relational storage, vector search, caching, and ranking.
The system evolves across four versions, starting from a simple CRUD backend and progressing toward a production-grade hybrid retrieval engine.
- Provide persistent memory storage for users
- Support session-based context grouping
- Enable hybrid search (keyword + semantic)
- Implement ranking over retrieved memories
- Support scalable and modular backend design
- Demonstrate production-ready system design principles
- Go (Backend API)
- PostgreSQL (Relational storage)
- Qdrant (Vector database for embeddings)
- Redis (Caching layer)
- Docker (Containerization)
- Next.js (Optional frontend dashboard)
The system is designed in layered components:
- API Layer (Go HTTP server)
- Service Layer (Business logic)
- Repository Layer (Database access)
- Storage Layer (PostgreSQL, Qdrant, Redis)
- Async Worker (Embedding pipeline)
- Search Engine (Hybrid retrieval + ranking)
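
The layering above can be sketched as a service that depends only on a repository interface. This is a minimal illustration, not the project's actual code: `Memory`, `MemoryRepository`, `MemoryService`, and the in-memory repository are hypothetical names, and a PostgreSQL-backed type would take the place of `inMemoryRepo` in a real deployment.

```go
package main

import (
	"errors"
	"strconv"
	"strings"
	"sync"
)

// Memory is a hypothetical record shape; the field names are assumptions.
type Memory struct {
	ID      string
	UserID  string
	Content string
}

// MemoryRepository is the repository-layer contract. The service layer
// sees only this interface, never the storage technology behind it.
type MemoryRepository interface {
	Insert(userID, content string) (Memory, error)
	ListByUser(userID string) ([]Memory, error)
}

// inMemoryRepo is a stand-in for a PostgreSQL-backed repository.
type inMemoryRepo struct {
	mu   sync.Mutex
	rows []Memory
}

func (r *inMemoryRepo) Insert(userID, content string) (Memory, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	m := Memory{ID: "mem-" + strconv.Itoa(len(r.rows)+1), UserID: userID, Content: content}
	r.rows = append(r.rows, m)
	return m, nil
}

func (r *inMemoryRepo) ListByUser(userID string) ([]Memory, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	var out []Memory
	for _, m := range r.rows {
		if m.UserID == userID {
			out = append(out, m)
		}
	}
	return out, nil
}

// ErrEmptyContent is returned when a memory has no usable text.
var ErrEmptyContent = errors.New("memory content must not be empty")

// MemoryService holds the business rules (service layer).
type MemoryService struct {
	repo MemoryRepository
}

func NewMemoryService(repo MemoryRepository) *MemoryService {
	return &MemoryService{repo: repo}
}

// Create validates the input before handing it to the repository.
func (s *MemoryService) Create(userID, content string) (Memory, error) {
	content = strings.TrimSpace(content)
	if content == "" {
		return Memory{}, ErrEmptyContent
	}
	return s.repo.Insert(userID, content)
}

// List returns only the memories that belong to the given user.
func (s *MemoryService) List(userID string) ([]Memory, error) {
	return s.repo.ListByUser(userID)
}
```

Calling `NewMemoryService(&inMemoryRepo{})` yields a service that can be exercised without any database, which is the main reason to keep the repository behind an interface.
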
- Create memory
- Read memory
- Delete memory
- List memories by user or session
- Create sessions
- Attach memories to sessions
- Retrieve session history
- Context-based grouping of memories
- Semantic search using vector embeddings (Qdrant)
- Keyword search using PostgreSQL full-text search
- Merging and deduplication of results
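
One possible way to merge and deduplicate the two result lists is to key candidates by memory ID and keep the score from each backend side by side. The `Hit` and `Candidate` types and the `mergeHits` function below are assumptions made for illustration; the source does not specify how results are represented.

```go
package main

// Hit is one result from a single backend. The shape is an assumption.
type Hit struct {
	MemoryID string
	Score    float64
}

// Candidate is a merged result that keeps the score from each backend.
type Candidate struct {
	MemoryID     string
	VectorScore  float64 // 0 when the vector search did not return it
	KeywordScore float64 // 0 when the keyword search did not return it
}

// mergeHits combines both result lists and deduplicates by memory ID.
// Candidates are returned in first-seen order; if a backend returns the
// same ID twice, the higher score is kept.
func mergeHits(vector, keyword []Hit) []Candidate {
	byID := make(map[string]*Candidate)
	var order []string

	get := func(id string) *Candidate {
		c, ok := byID[id]
		if !ok {
			c = &Candidate{MemoryID: id}
			byID[id] = c
			order = append(order, id)
		}
		return c
	}

	for _, h := range vector {
		if c := get(h.MemoryID); h.Score > c.VectorScore {
			c.VectorScore = h.Score
		}
	}
	for _, h := range keyword {
		if c := get(h.MemoryID); h.Score > c.KeywordScore {
			c.KeywordScore = h.Score
		}
	}

	out := make([]Candidate, 0, len(order))
	for _, id := range order {
		out = append(out, *byID[id])
	}
	return out
}
```

Keeping the two scores separate, rather than collapsing them here, leaves the weighting decision to the ranking step.
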
- Similarity score
- Recency score
- Importance score
- Weighted ranking formula
- Async worker-based processing
- Generates embeddings on memory creation
- Stores vectors in Qdrant
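
The embedding pipeline can be approximated with a pool of goroutines that drain a job channel. Everything named here is hypothetical: `Embedder` stands in for whichever embedding model is called, `VectorStore` for a Qdrant client, and `lengthEmbedder` and `mapStore` are toy implementations so the sketch runs without external services.

```go
package main

import "sync"

// EmbedJob is enqueued when a memory is created.
type EmbedJob struct {
	MemoryID string
	Content  string
}

// Embedder turns text into a vector (hypothetical interface).
type Embedder interface {
	Embed(text string) ([]float32, error)
}

// VectorStore persists vectors; a Qdrant-backed type would implement it.
type VectorStore interface {
	Upsert(id string, vec []float32) error
}

// startWorkers launches n goroutines that process jobs until the channel
// is closed. Call Wait on the returned WaitGroup to block until they exit.
func startWorkers(n int, jobs <-chan EmbedJob, e Embedder, s VectorStore,
	onError func(EmbedJob, error)) *sync.WaitGroup {
	wg := &sync.WaitGroup{}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				vec, err := e.Embed(job.Content)
				if err == nil {
					err = s.Upsert(job.MemoryID, vec)
				}
				if err != nil {
					onError(job, err) // e.g. log, retry, or dead-letter
				}
			}
		}()
	}
	return wg
}

// lengthEmbedder is a toy embedder: a 1-dimensional "embedding" holding
// the text length. It exists only to make the sketch runnable.
type lengthEmbedder struct{}

func (lengthEmbedder) Embed(text string) ([]float32, error) {
	return []float32{float32(len(text))}, nil
}

// mapStore is an in-memory VectorStore guarded by a mutex.
type mapStore struct {
	mu      sync.Mutex
	vectors map[string][]float32
}

func (s *mapStore) Upsert(id string, vec []float32) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.vectors[id] = vec
	return nil
}
```

In this design the API handler only pushes an `EmbedJob` onto the channel and returns, so embedding latency does not sit on the request path.
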
- Redis-based caching for:
  - search results
  - session data
  - frequent queries
- API key-based authentication
- Middleware-based request validation
- User-scoped data access
When a search request is made:
- Check Redis cache (return cached results on a hit)
- Query vector database (Qdrant)
- Query PostgreSQL full-text search
- Merge results
- Apply ranking engine
- Return top-K results
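
The search flow can be sketched end to end with the collaborators passed in as plain functions. `SearchDeps`, `Result`, and `search` are hypothetical names. Summing the two backend scores is only a placeholder for the ranking engine, and the two backends are queried one after the other for simplicity.

```go
package main

import "sort"

// Result is one ranked memory. The shape is an assumption.
type Result struct {
	MemoryID string
	Score    float64
}

// SearchDeps bundles the collaborators as functions so the flow can be
// exercised without Redis, Qdrant, or PostgreSQL.
type SearchDeps struct {
	CacheGet func(key string) ([]Result, bool)
	CacheSet func(key string, results []Result)
	Vector   func(query string) ([]Result, error)
	Keyword  func(query string) ([]Result, error)
}

// search follows the documented steps: cache check, vector query,
// keyword query, merge, rank, and top-K truncation.
func search(d SearchDeps, cacheKey, query string, topK int) ([]Result, error) {
	// 1. Cache check: a hit skips both backends.
	if cached, ok := d.CacheGet(cacheKey); ok {
		return cached, nil
	}

	// 2 and 3. Query both backends.
	vec, err := d.Vector(query)
	if err != nil {
		return nil, err
	}
	kw, err := d.Keyword(query)
	if err != nil {
		return nil, err
	}

	// 4. Merge and deduplicate by memory ID. Summing scores is a
	// placeholder for the real ranking engine.
	combined := make(map[string]float64)
	for _, r := range vec {
		combined[r.MemoryID] += r.Score
	}
	for _, r := range kw {
		combined[r.MemoryID] += r.Score
	}

	// 5. Rank: highest score first, ties broken by ID for determinism.
	ranked := make([]Result, 0, len(combined))
	for id, score := range combined {
		ranked = append(ranked, Result{MemoryID: id, Score: score})
	}
	sort.Slice(ranked, func(i, j int) bool {
		if ranked[i].Score != ranked[j].Score {
			return ranked[i].Score > ranked[j].Score
		}
		return ranked[i].MemoryID < ranked[j].MemoryID
	})

	// 6. Keep the top K and cache them for the next identical request.
	if topK >= 0 && len(ranked) > topK {
		ranked = ranked[:topK]
	}
	d.CacheSet(cacheKey, ranked)
	return ranked, nil
}
```

Because the two backend queries are independent, running them concurrently is a natural optimization once the sequential version works.
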
Focus: Basic functionality
- Memory CRUD
- Session system
- PostgreSQL integration
- REST API
- API key authentication
Outcome: A fully functional backend capable of storing and retrieving structured memories.
Focus: Intelligence layer
- Embedding pipeline
- Qdrant integration
- Semantic search
- Keyword search (PostgreSQL FTS)
- Hybrid search system
Outcome: The system becomes a retrieval engine rather than just a database.
Focus: Intelligence and optimization
- Ranking engine (similarity, recency, importance)
- Redis caching layer
- Session-aware boosting
- Performance improvements
Outcome: A context-aware retrieval system with optimized response quality.
Focus: Scalability and observability
- Async job improvements
- Rate limiting
- Logging and latency tracking
- Search tracing
- Optional Next.js dashboard
Outcome: A production-like distributed system with observability and monitoring.
- POST /memories
- GET /memories
- DELETE /memories/:id
- POST /sessions
- GET /sessions
- GET /sessions/:id
- GET /sessions/:id/memories
- POST /search
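
These endpoints could be wired up with the standard library's `http.ServeMux`. The handlers below are stubs that only return a status code and a label, and `byMethod`, `stub`, `newRouter`, and `send` are hypothetical helpers. The sketch uses the classic path-only patterns and dispatches on the HTTP method by hand.

```go
package main

import (
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// byMethod dispatches on the HTTP method and answers 405 otherwise.
func byMethod(handlers map[string]http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if h, ok := handlers[r.Method]; ok {
			h(w, r)
			return
		}
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
	}
}

// stub stands in for a real handler: it writes a status and a label.
func stub(status int, label string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
		if label != "" {
			io.WriteString(w, label)
		}
	}
}

// newRouter registers the eight documented endpoints.
func newRouter() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/memories", byMethod(map[string]http.HandlerFunc{
		http.MethodPost: stub(http.StatusCreated, "create memory"),
		http.MethodGet:  stub(http.StatusOK, "list memories"),
	}))
	// Subtree pattern: handles DELETE /memories/:id.
	mux.HandleFunc("/memories/", func(w http.ResponseWriter, r *http.Request) {
		id := strings.TrimPrefix(r.URL.Path, "/memories/")
		if id == "" || strings.Contains(id, "/") {
			http.NotFound(w, r)
			return
		}
		byMethod(map[string]http.HandlerFunc{
			http.MethodDelete: stub(http.StatusNoContent, ""),
		})(w, r)
	})
	mux.HandleFunc("/sessions", byMethod(map[string]http.HandlerFunc{
		http.MethodPost: stub(http.StatusCreated, "create session"),
		http.MethodGet:  stub(http.StatusOK, "list sessions"),
	}))
	// Subtree pattern: GET /sessions/:id and GET /sessions/:id/memories.
	mux.HandleFunc("/sessions/", func(w http.ResponseWriter, r *http.Request) {
		parts := strings.Split(strings.TrimPrefix(r.URL.Path, "/sessions/"), "/")
		var h http.HandlerFunc
		switch {
		case len(parts) == 1 && parts[0] != "":
			h = stub(http.StatusOK, "session "+parts[0])
		case len(parts) == 2 && parts[0] != "" && parts[1] == "memories":
			h = stub(http.StatusOK, "memories of session "+parts[0])
		default:
			http.NotFound(w, r)
			return
		}
		byMethod(map[string]http.HandlerFunc{http.MethodGet: h})(w, r)
	})
	mux.HandleFunc("/search", byMethod(map[string]http.HandlerFunc{
		http.MethodPost: stub(http.StatusOK, "search"),
	}))
	return mux
}

// send is a demo helper that runs one request through h in memory.
func send(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}
```

From Go 1.22 onward, `ServeMux` also accepts patterns with a method and wildcards, such as `DELETE /memories/{id}`, which would remove most of the manual dispatch shown here.
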
- Distributed worker system (Kafka or NATS)
- Advanced ranking models (ML-based scoring)
- Multi-tenant architecture
- Real-time streaming memory ingestion
- Analytics dashboard
This project demonstrates how to evolve a simple CRUD backend into a full hybrid retrieval system using modern backend engineering practices, including vector search, caching, async processing, and ranking.