RAG Engine SaaS

Production-oriented Retrieval-Augmented Generation (RAG) stack with a FastAPI backend and React frontend.

Overview

This project provides:

  • Session-based document ingestion (PDF/TXT/MD/DOCX)
  • Chunking + retrieval using embeddings with lexical fallback
  • Chat answers with citations
  • Optional streaming responses and web-search augmentation
  • Multi-provider LLM support (Gemini/OpenAI/Anthropic/Ollama)

Architecture / Stack

  • Backend: FastAPI, SQLite, Python 3.11+
  • Frontend: React + TypeScript + Vite + Mantine
  • Providers: Gemini, OpenAI, Anthropic, Ollama

The data flow, as a Mermaid diagram:

flowchart LR
  A[Upload Docs] --> B[Chunk + Index]
  B --> C[(SQLite + vectors)]
  D[User Question] --> E[Retriever]
  C --> E
  E --> F[LLM]
  F --> G[Answer + Citations]

Quickstart

Prerequisites

  • Python 3.11+
  • Node.js 18+

Install backend deps

cd backend
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Install frontend deps

cd ../frontend
npm install

Env vars

Create backend env file:

cd backend
cp .env.example .env

Key variables (a sample .env sketch follows the list):

  • LLM_PROVIDER (gemini | openai | anthropic | ollama)
  • GEMINI_API_KEY / OPENAI_API_KEY / ANTHROPIC_API_KEY
  • ENABLE_STREAMING (default true)
  • ENABLE_WEB_SEARCH + TAVILY_API_KEY (optional)
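
A minimal .env sketch, assuming Gemini as the provider; the values are placeholders, and only variable names listed above are used:

# backend/.env — minimal sketch, assuming the Gemini provider
LLM_PROVIDER=gemini
GEMINI_API_KEY=your-key-here
ENABLE_STREAMING=true
# Optional web-search augmentation:
# ENABLE_WEB_SEARCH=true
# TAVILY_API_KEY=your-tavily-key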

Run

Backend

cd backend
source .venv/bin/activate
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
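
Once the server is up, a quick smoke test against the models endpoint (listed under API snippets below) confirms the backend is reachable:

curl http://localhost:8000/api/models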

Frontend

cd frontend
npm run dev

Frontend default URL: http://localhost:5173
Backend default URL: http://localhost:8000

Test

Backend:

cd backend
pytest

Frontend checks:

cd frontend
npm run build

Deployment

  • The backend can be containerized and deployed as an ASGI app (a Dockerfile sketch follows below).
  • The frontend builds to static assets with Vite:

# frontend
npm run build
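
A minimal Dockerfile sketch for the backend, assuming the layout used in Quickstart (backend/requirements.txt, app.main:app entrypoint, port 8000); the base image is an assumption, not a project setting:

# backend/Dockerfile — minimal sketch, not an official project file
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Serve the FastAPI app with uvicorn on port 8000
CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]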

For architecture and operational notes, see docs/.

API snippets

  • POST /api/sessions
  • POST /api/sessions/{session_id}/files
  • POST /api/sessions/{session_id}/chat
  • POST /api/sessions/{session_id}/chat/stream
  • GET /api/models

Example request (POST /api/sessions/{session_id}/chat):

{
  "message": "Summarize the uploaded document",
  "temperature": 0.2
}
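
As a sketch of the full flow with curl, assuming session creation returns a session id and uploads use multipart form data (the field name "file" is an assumption, not a confirmed API detail):

# Create a session
curl -X POST http://localhost:8000/api/sessions

# Upload a document (multipart field name "file" is an assumption)
curl -X POST http://localhost:8000/api/sessions/<session_id>/files \
  -F "file=@example.pdf"

# Ask a question
curl -X POST http://localhost:8000/api/sessions/<session_id>/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Summarize the uploaded document", "temperature": 0.2}'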

Example response shape:

{
  "answer": "...",
  "citations": [
    {
      "file_name": "example.pdf",
      "score": 0.87,
      "snippet": "..."
    }
  ],
  "used_embeddings": true,
  "model": "gemini-2.5-flash"
}

Troubleshooting

  • No model responses: confirm provider API key and LLM_PROVIDER value.
  • CORS issues: verify CORS_ALLOW_ORIGINS in backend env.
  • Low retrieval quality: inspect chunking settings and uploaded file quality.
  • Streaming issues: verify the reverse proxy supports SSE; a direct curl check is sketched below.
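
To rule out the proxy, test streaming directly against the backend; -N disables curl's buffering so SSE events print as they arrive (the request body mirrors the chat example above):

curl -N -X POST http://localhost:8000/api/sessions/<session_id>/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"message": "Summarize the uploaded document"}'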

Changelog

See CHANGELOG.md.

License

MIT — see LICENSE.
