🧠 DocMind — Enhanced Production-Grade RAG System

DocMind is a full-stack, production-ready Retrieval-Augmented Generation (RAG) application built to simulate real-world AI retrieval behavior. It goes beyond basic vector search by implementing an intelligent multi-stage pipeline: query rewriting, multi-query generation, hybrid retrieval (BM25 + semantic), LLM-as-a-Judge relevance filtering, and a corrective retrieval loop — all with full transparency surfaced in the UI.

Powered by Gemini 2.0 Flash (via OpenRouter) and Pinecone Serverless, with a React + TypeScript frontend.

✨ Features

🔧 Enhanced RAG Pipeline (PRD §6.1–6.7)

Feature	Description
§6.1 Query Rewriting	Automatically fixes spelling mistakes, expands vague queries, and improves grammar before retrieval — preserving user intent
§6.2 Multi-Query Generation	Generates 3 semantically diverse variants of the rewritten query to maximize retrieval coverage
§6.3 Advanced Chunking	Section-aware chunking that detects headings (markdown, numbered, ALL CAPS) and stores `section`, `chunk_id`, `page`, and `source` metadata per chunk
§6.4 Hybrid Retrieval	Fuses vector similarity search with BM25 keyword scoring via Reciprocal Rank Fusion (RRF) — handles exact syntax, abbreviations, and numbers that semantic search misses
§6.5 LLM-as-a-Judge	Evaluates each retrieved chunk for relevance before generation — filters noisy context and reduces hallucinations
§6.6 Corrective RAG Loop	If all chunks are irrelevant, rewrites the query from a new angle and retries retrieval (max 2 retries, prevents infinite loops)
§6.7 Hallucination Prevention	System prompt strictly confines the LLM to provided context; responds with `"I could not find this information in the uploaded documents."` when unsupported
§6.8 Indexing Status	Step-by-step upload progress overlay: Uploading → Parsing → Chunking → Creating Embeddings → Saving to Vector DB → Complete
§6.9 Error Handling	Gracefully handles empty docs, scanned PDFs, malformed CSVs, unsupported formats, embedding rate limits, API timeouts, and auth failures

🏗️ Core Foundation

Multi-Tenant Isolation: Each user's vectors are isolated using userId metadata filtering in Pinecone — no cross-user data leakage
Integrated Inference: Pinecone's llama-text-embed-v2 embeds text automatically — no separate embedding service needed
Conversational Memory: Full chat history is injected into the LLM prompt for intelligent follow-up questions
Rate-Limit Resilience: Automatic 60s backoff on Pinecone free-tier rate limits during large document uploads
Client-Side Persistence: Chat history and document registry stored in localStorage — zero backend state
Pipeline Transparency: Every assistant response shows a collapsible "⚙️ Pipeline details" panel revealing the rewritten query, query variants, chunks retrieved vs. after judge, and corrective loop attempts
Dark / Light Mode: System-preference-aware with manual toggle

🛠️ Technology Stack

Layer	Technology
Frontend	React 18, TypeScript, Vite, Vanilla CSS
Backend	Node.js, Express 5, Multer
Vector Database	Pinecone Serverless (integrated `llama-text-embed-v2`)
LLM	Gemini 2.0 Flash via OpenRouter
Document Parsing	`pdf-parse` (per-page extraction)
Retrieval	Hybrid: Pinecone vector search + in-process BM25 + RRF fusion

🚀 Setup & Installation

Prerequisites

Node.js v18+
A Pinecone account and API key
An OpenRouter account and API key

1. Backend Setup

cd backend
npm install

Create a .env file in the backend/ directory:

PINECONE_API_KEY=your_pinecone_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
PORT=3001

Start the server:

npm run dev

You should see:

🚀 Enhanced RAG Backend running at http://localhost:3001
  ✅ §6.1 Query Rewriting
  ✅ §6.2 Multi-Query Generation
  ✅ §6.3 Advanced Metadata-Aware Chunking
  ✅ §6.4 Hybrid Retrieval (Semantic + BM25/RRF)
  ✅ §6.5 LLM-as-a-Judge
  ✅ §6.6 Corrective RAG Loop (max 2 retries)
  ✅ §6.7 Hallucination Prevention Prompt
  ✅ §6.8 Upload Status Steps
  ✅ §6.9 Comprehensive Error Handling

2. Frontend Setup

cd frontend
npm install
npm run dev

Open http://localhost:5173 in your browser.

📖 How to Use

Upload Documents — Click + in the Documents sidebar. Supports .pdf, .txt, .md, .csv, .json, .log (up to 50 MB). Watch the 6-step progress overlay as the document is parsed, chunked, and indexed.
Ask Questions — Type naturally, even with typos or vague phrasing. The pipeline automatically rewrites your query, generates variants, runs hybrid retrieval, judges relevance, and may self-correct before answering.
Inspect the Pipeline — Click ⚙️ Pipeline details under any assistant response to see:
- How your query was rewritten
- How many query variants were generated
- How many chunks were retrieved vs. passed the LLM judge
- Whether a corrective loop was triggered
View Citations — Click 📚 Sources to see the exact document, page, section, and chunk ID used. Sources include hybrid match scores.
Manage Chats & Docs — Use + to start new chats; × to delete chats or documents (deletion removes vectors from Pinecone).

Note: On a free Pinecone tier, uploading large documents (200+ pages) may trigger a 60-second automatic backoff — the app will handle this transparently.

🏛️ Architecture

User Query
      ↓
 Query Rewriting (§6.1)        ← fix typos, expand vague queries
      ↓
 Multi-Query Generation (§6.2) ← 3 semantic variants
      ↓
 Hybrid Retrieval (§6.4)
   ├── Semantic Search × N queries (Pinecone)
   └── BM25 Keyword Scoring
         ↓  Reciprocal Rank Fusion
      Top-8 Candidates
      ↓
 LLM-as-a-Judge (§6.5)         ← per-chunk relevance filter
      ↓
 All irrelevant? → Corrective Loop (§6.6) → retry (max 2×)
      ↓
 Final LLM Generation (§6.7)   ← context-only, hallucination-guarded
      ↓
 Answer + Sources + Pipeline Metadata

📁 Project Structure

Assignment/
├── backend/
│   ├── server.js      # Express API + full enhanced pipeline orchestration
│   ├── ai.js          # Query rewriting, multi-query, LLM judge, answer generation
│   ├── chunker.js     # Section-aware chunking with metadata extraction
│   ├── vectorDB.js    # Pinecone upsert + hybrid search (BM25 + RRF)
│   ├── parser.js      # PDF (per-page) and plain text parsing
│   └── .env           # API keys (not committed)
└── frontend/
    └── src/
        ├── App.tsx    # Main React app with pipeline UI + upload progress
        └── App.css    # Styling for all enhanced features

🔑 API Endpoints

Method	Endpoint	Description
`GET`	`/api/init`	Generate a new user ID
`POST`	`/api/upload`	Upload and index a document
`POST`	`/api/ask`	Run the full enhanced RAG pipeline
`GET`	`/api/documents`	List documents (client-managed)
`DELETE`	`/api/documents`	Clear all documents for a user
`DELETE`	`/api/documents/:id`	Delete a specific document
`GET`	`/api/health`	Health check

The /api/ask response includes a pipeline object:

{
  "answer": "...",
  "sources": [...],
  "pipeline": {
    "originalQuery": "how debug node",
    "rewrittenQuery": "How to debug a Node.js application?",
    "queriesGenerated": 4,
    "chunksRetrieved": 8,
    "chunksAfterJudge": 3,
    "correctiveAttempts": 0
  }
}

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
backend		backend
frontend		frontend
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🧠 DocMind — Enhanced Production-Grade RAG System

✨ Features

🔧 Enhanced RAG Pipeline (PRD §6.1–6.7)

🏗️ Core Foundation

🛠️ Technology Stack

🚀 Setup & Installation

Prerequisites

1. Backend Setup

2. Frontend Setup

📖 How to Use

🏛️ Architecture

📁 Project Structure

🔑 API Endpoints

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🧠 DocMind — Enhanced Production-Grade RAG System

✨ Features

🔧 Enhanced RAG Pipeline (PRD §6.1–6.7)

🏗️ Core Foundation

🛠️ Technology Stack

🚀 Setup & Installation

Prerequisites

1. Backend Setup

2. Frontend Setup

📖 How to Use

🏛️ Architecture

📁 Project Structure

🔑 API Endpoints

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages