This project implements a Retrieval-Augmented Generation (RAG) pipeline built on an LLM (Gemini 1.5 Pro) and a vector database (Chroma DB) that indexes document chunks by source and page number. It includes a SQL-based checker that detects documents that have already been processed and skips them, avoiding unnecessary work. The REST API is built with FastAPI, providing an end-to-end solution for document processing and querying.
- Installation
- Usage
- Functionality
- Clone the repository:
git clone https://github.com/jayjoshi1400/RAG-LLM-PDF-QueAns.git
- Install dependencies
pip install -r requirements.txt
- Create a .env file in the root directory and add the GEMINI API key.
- Running the server:
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
- Uploading PDF files:
curl -X POST "http://127.0.0.1:8000/upload/" -F "file=@data/paper1.pdf"
- To convert the uploaded files into chunks:
curl -X POST "http://127.0.0.1:8000/process/"
- Questions can be asked using the following command:
curl -X POST "http://127.0.0.1:8000/query/" -H "Content-Type: application/json" -d "{\"query\": \"User-Query\"}"
For example:
curl -X POST "http://127.0.0.1:8000/query/" -H "Content-Type: application/json" -d "{\"query\": \"How many players can there be in monopoly?\"}"
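The same query can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server is running on the host and port shown above. The helper names `build_query_request` and `ask` are hypothetical and not part of this repository, and the response is returned as raw text because its exact shape is not described here.

```python
import json
import urllib.request

# Host and port taken from the uvicorn command above.
BASE_URL = "http://127.0.0.1:8000"


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for the /query/ endpoint (hypothetical helper)."""
    # Same JSON body as the curl example: {"query": "..."}
    payload = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL) -> str:
    """Send the question to a running server and return the raw response body."""
    with urllib.request.urlopen(build_query_request(question, base_url)) as resp:
        return resp.read().decode("utf-8")
```

With the server running, `print(ask("How many players can there be in monopoly?"))` is the equivalent of the curl example.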
- Efficient Document Processing:
- Redundant Document Checker: The `file_processed_check` function checks whether files in the specified directory have already been processed, so that only new documents are processed. This saves time and computational resources.
- Document Loading and Chunking: The `get_docs` function loads PDF documents, and the `get_chunks` function splits them into manageable chunks for processing.
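To illustrate the idea behind the redundant document checker, here is a minimal sketch using `sqlite3` from the standard library. It is not the repository's `file_processed_check`; the table name `processed_files`, the helper names, and the choice to identify a document by its file name are all assumptions made for the example.

```python
import sqlite3
from pathlib import Path


def open_registry(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the table that records processed files.

    The table name and schema are assumptions for this sketch.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS processed_files (name TEXT PRIMARY KEY)")
    return conn


def unprocessed_files(conn: sqlite3.Connection, directory: str, pattern: str = "*.pdf") -> list[Path]:
    """Return files in `directory` whose names are not yet recorded as processed."""
    seen = {row[0] for row in conn.execute("SELECT name FROM processed_files")}
    return sorted(p for p in Path(directory).glob(pattern) if p.name not in seen)


def mark_processed(conn: sqlite3.Connection, path: Path) -> None:
    """Record a file as processed; recording it twice is a no-op."""
    conn.execute("INSERT OR IGNORE INTO processed_files (name) VALUES (?)", (Path(path).name,))
    conn.commit()
```

A processing run would call `unprocessed_files`, process each returned path, and then call `mark_processed` on it. Keying on the file name is the simplest option; a content hash would also catch renamed copies of the same file.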
- Unique chunk ID allocation:
- The `get_chunk_id` function assigns a unique ID to each chunk based on its source and page number. This makes documents easier to manage: it allows checking whether a chunk already exists in the DB and removing specific chunks if needed.
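One possible ID scheme is sketched below. It is not the repository's `get_chunk_id`: the `source:page:index` format, the per-page index used to separate several chunks from the same page, and the dictionary keys `source` and `page` are assumptions made for the example.

```python
from collections import defaultdict


def assign_chunk_ids(chunks: list[dict]) -> list[dict]:
    """Give each chunk an ID of the form 'source:page:index' (assumed format).

    Each chunk is expected to carry 'source' and 'page' keys. The index counts
    chunks within the same source and page, so IDs stay unique when one page
    is split into several chunks.
    """
    per_page_count: dict[str, int] = defaultdict(int)
    for chunk in chunks:
        page_key = f"{chunk['source']}:{chunk['page']}"
        chunk["id"] = f"{page_key}:{per_page_count[page_key]}"
        per_page_count[page_key] += 1
    return chunks
```

Because the ID is derived only from the source, the page, and the position on the page, re-running the pipeline on the same file with the same chunking settings yields the same IDs, which is what makes the duplicate check possible.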
- Vector Store Integration:
- The `get_vector_store` function stores document chunks in the vector database, ensuring that only new chunks are added.
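The "only new chunks" check can be sketched as follows. To keep the example self-contained, a plain dictionary stands in for the vector database, and no embedding is computed; the helper name `add_new_chunks` and the chunk keys `id` and `text` are assumptions, not the repository's code.

```python
def add_new_chunks(store: dict[str, str], chunks: list[dict]) -> list[str]:
    """Insert chunks whose ID is not in the store yet; return the IDs that were added.

    `store` is a stand-in for the vector database: it maps chunk ID to chunk text.
    """
    added = []
    for chunk in chunks:
        if chunk["id"] not in store:
            store[chunk["id"]] = chunk["text"]
            added.append(chunk["id"])
    return added
```

With a real vector database, the same pattern applies: fetch the IDs already stored, filter the incoming chunks against them, and add only the remainder.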
- Query Processing:
- The `query_rag` function queries the vector database with the user's query and returns the response generated by the language model. It first searches the vector DB for chunks of the uploaded documents that are similar to the query, uses them as context, and appends the user query to this context to form the final prompt presented to the LLM. Because the LLM has access to these chunks, it can give a more grounded and accurate answer and provide the source as well.
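The prompt assembly step described above can be sketched as follows. This is an illustration rather than the repository's `query_rag`: the template wording, the helper names, and the chunk keys `id` and `text` are assumptions, and the retrieval and LLM calls are left out.

```python
# Assumed template: retrieved context first, then the user query appended.
PROMPT_TEMPLATE = """Answer the question using only the context below.

{context}

---

Question: {question}"""


def build_prompt(question: str, retrieved: list[dict]) -> str:
    """Join the retrieved chunk texts into a context and append the user query."""
    context = "\n\n---\n\n".join(chunk["text"] for chunk in retrieved)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def collect_sources(retrieved: list[dict]) -> list[str]:
    """Return the chunk IDs used as context, so the answer can cite its sources."""
    return [chunk["id"] for chunk in retrieved]
```

The string returned by `build_prompt` is what would be sent to the LLM, and the list from `collect_sources` can be returned alongside the model's answer.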