DocQuery AI is an intelligent, session-isolated RAG (Retrieval-Augmented Generation) chatbot application designed to transform how you interact with PDF documents. Instead of searching by keywords or scrolling through massive files, you can converse directly with your documents in real time.
Built with Streamlit, LangChain, Chroma DB, HuggingFace, and Mistral AI, it processes documents locally, indexes content into a fresh vector store per session, and enables instant semantic retrieval and answering.
- 📂 Instant Document Processing: Upload any PDF through a clean sidebar layout to process and chunk it on the fly.
- 🔄 Session-Isolated Database: Uses an in-memory instance of Chroma DB. Every session is fresh, isolated, and private: no documents are persisted on disk unless you configure persistence.
- 🧠 Advanced Semantic Retrieval: Implements HuggingFace embeddings combined with Maximal Marginal Relevance (MMR) retrieval to extract the most relevant contexts while keeping content diverse.
- 💬 Intelligent QA Chatbot: Leverages Mistral AI's `mistral-small-latest` language model to summarize, synthesize, and answer questions accurately.
- 🧹 Single-Click Reset: Clean up your chat history and memory instantly using the "Clear Session" option.
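
The README does not say how the on-the-fly chunking mentioned above is done. As a rough illustration only, the sketch below splits extracted text into fixed-size character windows that overlap, so a sentence cut at a boundary still appears whole in a neighbouring chunk. The function name `chunk_text` and the `chunk_size` and `overlap` defaults are assumptions for this example, not the app's actual settings.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of at most chunk_size,
    where consecutive windows share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Larger overlaps preserve more context across boundaries at the cost of more chunks to embed and store.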
- Frontend & UI: Streamlit
- LLM Engine: Mistral AI API
- RAG Framework: LangChain
- Vector Database: Chroma DB (In-memory)
- Embeddings: HuggingFace Embeddings (`sentence-transformers`)
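
To make the Maximal Marginal Relevance idea concrete, here is a small pure-Python sketch of the selection rule: at each step it picks the candidate that is most similar to the query while being least similar to what has already been picked. The names `cosine` and `mmr_select` and the `lambda_mult` default are illustrative assumptions; in the app itself this step is presumably delegated to the LangChain and Chroma DB retriever rather than hand-written.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(
    query: list[float],
    candidates: list[list[float]],
    k: int = 3,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Return indices of up to k candidates chosen greedily by MMR.

    lambda_mult=1.0 ranks purely by relevance to the query;
    lower values penalise similarity to already selected items.
    """
    relevance = [cosine(query, c) for c in candidates]
    remaining = list(range(len(candidates)))
    selected: list[int] = []

    while remaining and len(selected) < k:
        best_idx, best_score = remaining[0], float("-inf")
        for i in remaining:
            # Redundancy: highest similarity to anything picked so far.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


if __name__ == "__main__":
    docs = [[1.0, 0.2], [1.0, 0.25], [1.0, -0.8]]
    print(mmr_select([1.0, 0.0], docs, k=2))
```

In this toy input the first two vectors are near-duplicates, so with the default `lambda_mult` the second pick skips the duplicate in favour of the more distinct third vector.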
- Python 3.10 or higher
- Mistral AI API Key (Get one from Mistral Console)

Clone the repository and enter the project directory:

    git clone https://github.com/coderashhar/Document.AI.git
    cd Document.AI

Create a .env file in the root directory and add your Mistral API key:

    MISTRAL_API_KEY=your_mistral_api_key_here

Create and activate your Python virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate # On macOS/Linux
    # .venv\Scripts\activate # On Windows

Install the dependencies:

    pip install -r requirements.txt

Run the app:

    streamlit run app.py

- Open your browser and navigate to the local address provided by Streamlit (typically http://localhost:8502).
- Upload a PDF document in the left sidebar.
- Once the green success banner appears ("Document processed successfully!"), type your question into the chat input box at the bottom.
- Get concise, context-aware answers directly extracted from your document.
- Hit Clear Session if you want to upload a different PDF and start a new conversation.
