Production-oriented Retrieval-Augmented Generation (RAG) stack with a FastAPI backend and React frontend.
This project provides:
- Session-based document ingestion (PDF/TXT/MD/DOCX)
- Chunking + retrieval using embeddings with lexical fallback
- Chat answers with citations
- Optional streaming responses and web-search augmentation
- Multi-provider LLM support (Gemini/OpenAI/Anthropic/Ollama)
- Backend: FastAPI, SQLite, Python 3.11+
- Frontend: React + TypeScript + Vite + Mantine
- Providers: Gemini, OpenAI, Anthropic, Ollama
flowchart LR
A[Upload Docs] --> B[Chunk + Index]
B --> C[(SQLite + vectors)]
D[User Question] --> E[Retriever]
C --> E
E --> F[LLM]
F --> G[Answer + Citations]
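The chunk, index, and retrieve stages in the flowchart can be illustrated with a small, dependency-free sketch. It is not the project's actual implementation: the function names (`chunk_text`, `retrieve`), the character-window chunking, and the token-overlap score used as the lexical fallback are all assumptions chosen for illustration. The `embed` callable stands in for whichever embedding provider is configured.

```python
import math
import re
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical strategy)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_score(query: str, chunk: str) -> float:
    """Fraction of distinct query tokens that also appear in the chunk."""
    query_tokens = _tokens(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & _tokens(chunk)) / len(query_tokens)


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: str,
    chunks: Sequence[str],
    embed: Optional[Callable[[str], Vector]] = None,
    top_k: int = 3,
) -> tuple[list[tuple[str, float]], bool]:
    """Rank chunks by embedding similarity, falling back to lexical overlap.

    Returns (ranked (chunk, score) pairs, used_embeddings).
    """
    scores: Optional[list[float]] = None
    used_embeddings = False
    if embed is not None:
        try:
            query_vec = embed(query)
            scores = [cosine(query_vec, embed(chunk)) for chunk in chunks]
            used_embeddings = True
        except Exception:
            scores = None  # embedding provider failed; use the lexical path
    if scores is None:
        scores = [lexical_score(query, chunk) for chunk in chunks]
    ranked = sorted(zip(chunks, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k], used_embeddings
```

The boolean returned alongside the ranking mirrors the `used_embeddings` field in the chat response, so a caller can tell which path produced the citations.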
- Python 3.11+
- Node.js 18+
cd backend
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cd ../frontend
npm install
Create backend env file:
cd backend
cp .env.example .env
Key variables:
- LLM_PROVIDER (gemini|openai|anthropic|ollama)
- GEMINI_API_KEY / OPENAI_API_KEY / ANTHROPIC_API_KEY
- ENABLE_STREAMING (default true)
- ENABLE_WEB_SEARCH + TAVILY_API_KEY (optional)
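To show how these variables fit together, here is a hedged sketch of a settings loader that validates them at startup. The variable names come from this README; everything else is an assumption for illustration: the `Settings` dataclass, the `load_settings` helper, the accepted boolean spellings, treating Ollama as needing no API key, and defaulting `ENABLE_WEB_SEARCH` to false. The backend's real configuration code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Which API key each provider needs. Assumption: a local Ollama needs none.
PROVIDER_KEY_VARS: dict[str, Optional[str]] = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": None,
}


@dataclass(frozen=True)
class Settings:  # hypothetical container, not the project's own class
    provider: str
    api_key: Optional[str]
    enable_streaming: bool
    enable_web_search: bool


def _as_bool(value: Optional[str], default: bool) -> bool:
    """Parse a boolean env value; the accepted spellings are an assumption."""
    if value is None or value.strip() == "":
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in PROVIDER_KEY_VARS:
        allowed = "|".join(PROVIDER_KEY_VARS)
        raise ValueError(f"LLM_PROVIDER must be one of {allowed}, got {provider!r}")

    key_var = PROVIDER_KEY_VARS[provider]
    api_key = env.get(key_var) if key_var else None
    if key_var and not api_key:
        raise ValueError(f"{key_var} must be set when LLM_PROVIDER={provider}")

    # Assumption: web search is off unless explicitly enabled.
    web_search = _as_bool(env.get("ENABLE_WEB_SEARCH"), default=False)
    if web_search and not env.get("TAVILY_API_KEY"):
        raise ValueError("ENABLE_WEB_SEARCH requires TAVILY_API_KEY")

    return Settings(
        provider=provider,
        api_key=api_key,
        # The README documents true as the default for ENABLE_STREAMING.
        enable_streaming=_as_bool(env.get("ENABLE_STREAMING"), default=True),
        enable_web_search=web_search,
    )
```

Failing fast on a missing key surfaces a misconfigured provider when the settings load rather than on the first chat request.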
cd backend
source .venv/bin/activate
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
cd frontend
npm run dev
Frontend default URL: http://localhost:5173
Backend default URL: http://localhost:8000
Backend:
cd backend
pytest
Frontend checks:
cd frontend
npm run build
- Backend can be containerized and deployed as an ASGI app.
- Frontend builds as static assets with Vite.
# frontend
npm run build
For architecture and operational notes, see docs/.
- POST /api/sessions
- POST /api/sessions/{session_id}/files
- POST /api/sessions/{session_id}/chat
- POST /api/sessions/{session_id}/chat/stream
- GET /api/models
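The chat endpoint listed above can be called from any HTTP client. The following is a minimal standard-library sketch, assuming the backend runs at its default URL and that a `session_id` has already been obtained from `POST /api/sessions` (that endpoint's response shape is not documented here, so it is not parsed). The helper names `build_chat_request`, `send`, and `format_answer` are hypothetical; the `message`, `temperature`, `answer`, and `citations` fields follow the example request and response in this section.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # backend default URL from this README


def build_chat_request(
    session_id: str,
    message: str,
    temperature: float = 0.2,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/sessions/{session_id}/chat."""
    url = f"{base_url}/api/sessions/{quote(session_id, safe='')}/chat"
    body = json.dumps({"message": message, "temperature": temperature})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def format_answer(payload: dict) -> str:
    """Render the answer followed by one line per citation."""
    lines = [payload.get("answer", "")]
    for citation in payload.get("citations", []):
        lines.append(f"- {citation.get('file_name')} (score {citation.get('score')})")
    return "\n".join(lines)
```

With a running backend, `print(format_answer(send(build_chat_request(session_id, "Summarize the uploaded document"))))` would print the answer and its sources. Keeping request construction separate from sending makes the payload easy to inspect without a server.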
Example request (POST /api/sessions/{session_id}/chat):
{
  "message": "Summarize the uploaded document",
  "temperature": 0.2
}
Example response shape:
{
  "answer": "...",
  "citations": [
    {
      "file_name": "example.pdf",
      "score": 0.87,
      "snippet": "..."
    }
  ],
  "used_embeddings": true,
  "model": "gemini-2.5-flash"
}
- No model responses: confirm provider API key and LLM_PROVIDER value.
- CORS issues: verify CORS_ALLOW_ORIGINS in backend env.
- Low retrieval quality: inspect chunking settings and uploaded file quality.
- Streaming issues: verify reverse proxy supports SSE.
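When debugging the streaming item above, it can help to read the stream directly and see whether events arrive one at a time or in a single buffered burst. The sketch below parses standard Server-Sent Events framing (`data:` lines terminated by a blank line). It assumes the stream endpoint emits that framing; the payload format is not documented here, so each payload is returned as an opaque string. `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE text stream."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            buffer.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.
    # An unterminated trailing event is discarded, as SSE framing requires.
```

Feeding it decoded lines from an open HTTP response and printing each payload as it is yielded makes proxy buffering visible: if everything prints at once when the response ends, something between client and backend is holding the stream.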
See CHANGELOG.md.
MIT — see LICENSE.