This project is a full-stack AI-powered document search system that implements a Retrieval-Augmented Generation (RAG) pipeline using NestJS, PostgreSQL (pgvector), and React.
It allows users to upload documents, perform semantic search using vector similarity, and generate grounded answers with source citations.
The system is fully containerized and runs locally using Docker Compose.
- Upload and process `.txt`, `.md`, and `.pdf` documents
- Automatic text chunking with overlap for retrieval quality
- Vector embeddings and cosine similarity search with pgvector
- Grounded LLM answers with source citations
- React frontend for upload, querying, and source inspection
- Local-first Docker Compose setup for reproducible runs
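The cosine similarity used to rank chunks can be illustrated with a small TypeScript function. pgvector computes this in-database; the sketch below only shows the underlying formula:

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
// Returns a value in [-1, 1]; higher means more semantically similar embeddings.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```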
The system follows a modular pipeline:
- Document Upload -> Extract text (PDF/TXT/MD)
- Chunking -> Split text into overlapping segments
- Embeddings -> Convert chunks into vector representations
- Storage -> Store embeddings in PostgreSQL (pgvector)
- Query -> Embed user query
- Retrieval -> Perform cosine similarity search (top-k chunks)
- Generation -> Use retrieved context to generate grounded LLM response
- Response -> Return answer with source citations
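The chunking step above can be sketched as a character-based splitter with overlap. This is a minimal sketch: the 1000-character chunk size and 200-character overlap are illustrative assumptions, not the project's configured values:

```typescript
// Split text into overlapping character-based chunks.
// chunkSize and overlap defaults are illustrative, not the project's settings.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // each chunk starts `step` chars after the previous
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached end of text
  }
  return chunks;
}
```

The overlap means the tail of each chunk is repeated at the head of the next, so a sentence falling on a chunk boundary still appears whole in at least one chunk.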
```
User -> API -> Ingestion -> DB (pgvector)
                |
                v
              Query
                |
                v
            Retrieval
                |
                v
               LLM
                |
                v
            Response
```
- `frontend` (React + Vite)
- `backend` (NestJS)
- `postgres` (PostgreSQL + pgvector)
- Copy env files:

  ```bash
  cp backend/.env.example backend/.env
  cp frontend/.env.example frontend/.env
  ```

- Set `OPENAI_API_KEY` in `backend/.env`.
- Run:

  ```bash
  docker compose up --build
  ```

- Frontend: http://localhost:5173
- Backend: http://localhost:3000
- `POST /documents/upload`
- `GET /documents`
- `GET /documents/:id`
- `POST /chat/query`
- `POST /demo/seed`
```json
{
  "answer": "...",
  "sources": [
    {
      "documentId": "uuid",
      "chunkId": "uuid",
      "excerpt": "...",
      "filename": "...",
      "title": "...",
      "chunkIndex": 0,
      "score": 0.88
    }
  ]
}
```

Backend (`backend/.env`):
- `PORT`
- `DB_HOST`
- `DB_PORT`
- `DB_USER`
- `DB_PASSWORD`
- `DB_NAME`
- `OPENAI_API_KEY`
- `OPENAI_BASE_URL` (optional)
- `EMBEDDINGS_MODEL`
- `CHAT_MODEL`
- `EMBEDDING_DIMENSION`
Frontend (`frontend/.env`):

- `VITE_API_BASE_URL`
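A hypothetical `backend/.env` might look like the following; every value below is a placeholder assumption, not a real default shipped with the project:

```
PORT=3000
DB_HOST=postgres
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=postgres
DB_NAME=rag
OPENAI_API_KEY=sk-...
EMBEDDINGS_MODEL=text-embedding-3-small
CHAT_MODEL=gpt-4o-mini
EMBEDDING_DIMENSION=1536
```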
```sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
```

- Tables: `documents`, `chunks`, `query_logs`
- Vector column: `chunks.embedding VECTOR(1536)`
- Vector index: `ivfflat (vector_cosine_ops)`
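With this schema, the top-k retrieval step could be expressed with a query along these lines (a sketch: column names other than `chunks.embedding` are assumptions, and top-k is set to 5 for illustration):

```sql
-- <=> is pgvector's cosine distance operator; 1 - distance gives a similarity score.
SELECT id, document_id, content,
       1 - (embedding <=> $1) AS score
FROM chunks
ORDER BY embedding <=> $1
LIMIT 5;
```

Ordering by the raw `<=>` distance lets the `ivfflat (vector_cosine_ops)` index accelerate the search instead of forcing a full scan.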
Demo documents are generic and portfolio-safe:
- `berlin-public-transport-guide.md`
- `remote-work-policy.txt`
- `ai-ethics-overview.md`
Question: What is the difference between U-Bahn and S-Bahn?
Answer: U-Bahn is mainly metro-style urban rail, while S-Bahn connects city center with outer districts and regional links.
Sources:
- berlin-public-transport-guide.md
- similarity score: ~0.75
- Used PostgreSQL + pgvector to simplify architecture and avoid external vector database dependencies
- Chose NestJS for modular backend structure and maintainability
- Implemented character-based chunking for simplicity and fast iteration
- Used raw SQL for vector similarity search to maintain control over retrieval logic
- Designed system to be fully local and reproducible using Docker Compose
- Synchronous ingestion (not suitable for large-scale workloads)
- Basic top-k retrieval without reranking
- Character-based chunking instead of token-aware splitting
- No authentication or multi-user support
- Background jobs with retry handling for ingestion
- Hybrid retrieval (keyword + vector)
- Better citation formatting and confidence hints
- Pagination and filtering for larger corpora
- Integration and end-to-end test coverage
- `docker compose up --build` starts all services
- `GET /documents` returns ready documents
- Uploading `.txt`, `.md`, `.pdf` works
- `POST /chat/query` returns grounded answer + sources
- Unsupported file types are rejected cleanly