Implemented a RAG-based Python app where I can chat with a website plus multiple uploaded documents.
- Scrape a website using crawl4ai
- Upload option in the UI for adding multiple documents
- Uploaded documents are embedded into ChromaDB
- A chat interface where, for each question, the LLM searches the website content and all uploaded documents to generate an answer
Learnings: RAG, vector DBs, embeddings, LLMs
streamlit: For the UI and document uploading
crawl4ai: Crawls a website and extracts its text (follows internal links)
langchain: Core framework for Retrieval-Augmented Generation (RAG)
chromadb: Stores and retrieves text chunks as embeddings (Vector DB)
sentence-transformers: Generates embeddings (vectors) from text locally
transformers: Backend dependency for sentence-transformers & HuggingFace models.
torch: Core machine learning backend for embeddings.
langchain-community: Adds support for community-built integrations, like Ollama
pymupdf: Extracts text from PDF files
Implementation and Overview of RAG

Query Translation: the process of transforming a user's natural-language query into a retrieval-friendly format that improves the quality of document or passage retrieval.
One approach is dividing the query into multiple queries for better answers.
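For example, a minimal multi-query sketch, assuming a local mistral model served through Ollama via langchain-community (the prompt wording and model name are illustrative):

```python
from langchain_community.llms import Ollama

llm = Ollama(model="mistral")  # assumes a local Ollama server is running

def generate_queries(question: str, n: int = 4) -> list[str]:
    """Ask the LLM for n retrieval-friendly rephrasings of the question."""
    prompt = (
        f"Generate {n} different rephrasings of the following question, "
        f"one per line, suitable for searching a document database:\n{question}"
    )
    lines = llm.invoke(prompt).splitlines()
    return [line.strip("- ").strip() for line in lines if line.strip()]
```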

Multi-Query + Result Fusion in a rank-based manner (RAG-Fusion):
- Generate multiple variants of the query
- Run all of them through the retriever
- Fuse the retrieved results (deduplication + re-ranking), as in the sketch below
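One standard way to do the fusion step is Reciprocal Rank Fusion (RRF); the sketch below is generic (the constant k=60 is the usual default from the RRF literature, not something this project mandates):

```python
def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one deduplicated ranking.

    Each document scores 1 / (k + rank) in every list it appears in, so
    documents ranked highly by many queries rise to the top.
    """
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```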
Decomposition: breaking the user query into sub-queries, then answering those to build up the final answer.

Step-Back Prompting: a technique where, instead of answering the user's query directly, the model is first asked to take a step back and generate a higher-level or supporting question, which then guides the retrieval and final answer.
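A minimal step-back sketch, again assuming a local Ollama mistral model (the prompt wording is illustrative); the returned broader question is what gets used for retrieval:

```python
from langchain_community.llms import Ollama

llm = Ollama(model="mistral")  # assumes a local Ollama server is running

def step_back(question: str) -> str:
    """Generate a broader question whose answer gives useful background."""
    return llm.invoke(
        "Rewrite the following as a more general, higher-level question "
        f"whose answer would help answer it: {question}"
    )
```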
HyDE (Hypothetical Document Embeddings): a clever technique where, instead of embedding the user query directly, you first ask the LLM to generate a fake (hypothetical) answer, then embed that answer to retrieve supporting documents from the vector database.
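A minimal HyDE sketch, with a sentence-transformers embedder standing in for the app's embedding model (model names are illustrative):

```python
from langchain_community.llms import Ollama
from sentence_transformers import SentenceTransformer

llm = Ollama(model="mistral")  # assumes a local Ollama server is running
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def hyde_embedding(question: str):
    """Embed a hypothetical answer instead of the raw question."""
    hypothetical = llm.invoke(f"Write a short passage that answers: {question}")
    return embedder.encode(hypothetical)  # query the vector DB with this vector
```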

Routing: directing the query to the right database (vector, relational, ...), i.e. smartly sending user queries to the most appropriate retrieval source, LLM behaviour, or toolchain path, based on the intent, type, or domain of the query.
Here, the LLM decides where to send a query based on its meaning, not just keywords or simple rules.
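A minimal semantic-routing sketch, assuming two hypothetical sources matching this app (the scraped website and the uploaded documents); the query goes to whichever route description it is most similar to in embedding space:

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical routes: each retrieval source gets a one-line description.
ROUTES = {
    "website_chunks": "questions about the scraped website's content",
    "uploaded_docs": "questions about the user's uploaded PDF/TXT files",
}
route_names = list(ROUTES)
route_vectors = embedder.encode(list(ROUTES.values()))

def route(question: str) -> str:
    """Send the query to the source whose description it most resembles."""
    similarities = util.cos_sim(embedder.encode(question), route_vectors)[0]
    return route_names[int(similarities.argmax())]
```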
Query Construction: the process of transforming a user's raw input into a better, retrieval-optimized query before passing it to the document retriever.

Indexing: the process of converting your documents into searchable, structured representations so they can be efficiently retrieved during a query.
Proposition indexing: instead of indexing raw text chunks, you extract and index discrete propositions from the documents.
Multi-representation indexing: a retrieval method that allows multiple embeddings per document or chunk, increasing the chance of retrieving relevant context even if the user query is phrased differently.
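A minimal sketch of the basic indexing step with chromadb and sentence-transformers (the app itself uses OllamaEmbeddings; the collection name, model, and pre-chunked input are illustrative):

```python
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="./chroma_store")
collection = client.get_or_create_collection("rag_docs")

def index_chunks(chunks: list[str]) -> None:
    """Embed text chunks and store them for later similarity search."""
    collection.add(
        ids=[f"chunk-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embedder.encode(chunks).tolist(),
    )
```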

RAPTOR: a RAG strategy designed to improve long-document retrieval by creating a tree of semantically summarized nodes, enabling hierarchical retrieval for better context relevance and scalability.
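A rough sketch of the summarization-tree idea, assuming a local Ollama mistral model; real RAPTOR clusters chunks by embedding similarity before summarizing, while fixed-size groups are used here only to keep the sketch short:

```python
from langchain_community.llms import Ollama

llm = Ollama(model="mistral")  # assumes a local Ollama server is running

def build_summary_tree(chunks: list[str], fanout: int = 5) -> list[list[str]]:
    """Build levels of increasingly abstract summaries over the chunks."""
    levels = [chunks]
    while len(levels[-1]) > 1:
        last = levels[-1]
        groups = [last[i:i + fanout] for i in range(0, len(last), fanout)]
        levels.append([llm.invoke("Summarize:\n" + "\n".join(g)) for g in groups])
    return levels  # index every level so retrieval can match any granularity
```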

ColBERT: a dense retrieval technique used in RAG systems to efficiently retrieve relevant documents from a large corpus using late interaction between query and document token embeddings.
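In late interaction, every query and document token keeps its own embedding, and a document's score is the sum over query tokens of each token's best match in the document (MaxSim). A toy sketch with numpy, assuming the token embeddings are already computed and L2-normalized (real ColBERT gets them from a trained BERT encoder):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style late interaction over per-token embeddings.

    query_vecs: (num_query_tokens, dim); doc_vecs: (num_doc_tokens, dim);
    rows are assumed L2-normalized so dot products are cosine similarities.
    """
    similarities = query_vecs @ doc_vecs.T        # all token-pair similarities
    return float(similarities.max(axis=1).sum())  # best doc match per query token
```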
Active RAG: making the retrieval step in the RAG pipeline dynamic, query-aware, and context-sensitive rather than static or passive.
Corrective RAG: a technique where the system actively detects, corrects, or refines its own mistakes during generation by re-triggering the retrieval step with improved or clarified queries.
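A minimal self-corrective loop, assuming a local Ollama mistral model and any `retrieve` callable (such as the retriever sketched later) passed in as a parameter; the grading and rewriting prompts are illustrative:

```python
from langchain_community.llms import Ollama

llm = Ollama(model="mistral")  # assumes a local Ollama server is running

def corrective_answer(question: str, retrieve, max_tries: int = 3) -> str:
    """Re-trigger retrieval with a refined query until the context looks usable.

    `retrieve` is any callable mapping a query string to a list of text chunks.
    """
    query, docs = question, []
    for _ in range(max_tries):
        docs = retrieve(query)
        verdict = llm.invoke(
            f"Question: {question}\nContext: {docs}\n"
            "Is this context sufficient to answer the question? Reply yes or no."
        )
        if "yes" in verdict.lower():
            break
        query = llm.invoke(f"Rewrite this search query to find better context: {query}")
    return llm.invoke(f"Answer using only this context:\n{docs}\n\nQuestion: {question}")
```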

This is a Retrieval-Augmented Generation (RAG) chatbot app built using Streamlit, capable of:
- Scraping content from websites
- Uploading and processing PDF/TXT files
- Embedding and indexing content into ChromaDB
- Asking questions based on the combined knowledge using an LLM (via Ollama)
utils/
├── data/
│   ├── doc_loader.py      # Load PDFs or text documents
│   └── scraper.py         # Scrape and parse websites
├── llm/
│   └── llm_generator.py   # Prompt and LLM response logic
├── rag/
│   └── retriever.py       # Retrieve top-k docs from Chroma
└── vectorstore/
    ├── chroma_handler.py  # Handler for ChromaDB
    ├── embeddings.py      # Ollama embedding loader
    └── indexer.py         # Document indexing pipeline
app.py                     # Streamlit app interface
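For illustration, the top-k lookup in retriever.py can be as small as the sketch below; it uses a sentence-transformers embedder to stay self-contained, whereas the app itself uses OllamaEmbeddings (the collection name, path, and k are illustrative):

```python
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="./chroma_store")
collection = client.get_or_create_collection("rag_docs")

def retrieve(question: str, k: int = 4) -> list[str]:
    """Return the k stored chunks most similar to the question."""
    result = collection.query(
        query_embeddings=[embedder.encode(question).tolist()],
        n_results=k,
    )
    return result["documents"][0]  # documents for the first (only) query
```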
- Install dependencies:
pip install -r requirements.txt
- Start the Ollama server (if not already running):
ollama serve
- Optionally pull a model, e.g.:
ollama pull mistral
- Run the app:
streamlit run app.py
🧠 Features
✅ Deep website scraping (via scraper.py)
✅ Document upload and parsing (doc_loader.py)
✅ Embedding using OllamaEmbeddings (e.g., Mistral)
✅ Indexing into ChromaDB
✅ Contextual answering via prompt + OllamaLLM
🔁 Session Behavior
On app start or refresh, the app:
- Deletes the previous ChromaDB (./chroma_store)
- Clears the previously scraped site and uploaded documents
- Resets chat history
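A sketch of that reset logic, assuming Streamlit session-state keys named scraped_site, uploaded_docs, and chat_history (the key names are illustrative):

```python
import shutil
import streamlit as st

def reset_session() -> None:
    """Wipe the vector store and chat state on app start or refresh."""
    shutil.rmtree("./chroma_store", ignore_errors=True)  # delete old ChromaDB
    for key in ("scraped_site", "uploaded_docs", "chat_history"):
        st.session_state.pop(key, None)                  # clear Streamlit state
```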
📌 Notes
- Make sure Ollama is running and mistral (or your chosen model) is already pulled.
- You can modify the model name and embedding dimensions via embeddings.py and llm_generator.py.
- RAG : RAG-Github
- Crawl4AI : crawl4ai


