The Document Chatbot System lets users query the content of uploaded PDF documents in natural language. It pairs a FastAPI backend with a Streamlit frontend and uses Groq AI to generate context-aware answers grounded in the documents.
- Supports uploading and processing multiple PDF documents.
- Extracts text content from PDFs.
- Indexes text using the BM25 algorithm for efficient query matching.
- Interactive interface built with Streamlit.
- Enables users to upload multiple files and ask questions seamlessly.
- Implements robust error-handling mechanisms, including:
  - Exception handling in API endpoints.
  - Validation of user inputs.
  - Fallback mechanisms to maintain functionality when Groq AI is unavailable.
- Provides clear error messages for issues such as file upload failures or processing errors.
- Ensures responses are rooted in the provided document content.
- Groq AI integration is used to produce concise answers and to limit irrelevant or misleading information.
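
The input validation and fallback behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `validate_query`, `answer_with_fallback`, the `llm_call` parameter, and the 500-character limit are all hypothetical names and values chosen for the example. The idea is that if the call to the language model raises, the best retrieved passage is returned instead, so the user still gets document-based output.

```python
def validate_query(query, max_len=500):
    """Return a cleaned query string, or raise ValueError if it is unusable.

    The max_len limit is an assumption made for this sketch.
    """
    if not isinstance(query, str):
        raise ValueError("Query must be a string.")
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("Query must not be empty.")
    if len(cleaned) > max_len:
        raise ValueError(f"Query must be at most {max_len} characters.")
    return cleaned


def answer_with_fallback(query, passages, llm_call):
    """Answer a query from retrieved passages, degrading gracefully.

    `passages` is a list of text snippets ordered from most to least relevant.
    `llm_call` is any callable taking (question, passages) and returning text;
    in the real system this role would be played by the Groq AI client.
    """
    question = validate_query(query)
    if not passages:
        return {
            "answer": "No relevant content was found in the uploaded documents.",
            "source": "none",
        }
    try:
        return {"answer": llm_call(question, passages), "source": "llm"}
    except Exception as exc:
        # The model is unavailable or failed: fall back to the top passage.
        return {"answer": passages[0], "source": "fallback", "error": str(exc)}
```

Returning a `source` field alongside the answer lets the frontend tell the user whether the reply came from the model or from the raw document text.
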
- FastAPI: For API endpoints.
- BM25: For text indexing and retrieval.
- Streamlit: For the interactive web interface.
- Groq AI: For enhanced natural language understanding and precise query responses.
- PyMuPDF: For PDF processing.
- Requests: For API communication.
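
To illustrate how BM25 indexing and retrieval works, here is a small from-scratch sketch using only the standard library. It is not the project's implementation (which may well rely on an existing BM25 package); the class name `BM25Index`, the simple regex tokenizer, and the parameter values `k1=1.5` and `b=0.75` are assumptions made for the example. It uses the variant of the IDF formula that adds 1 inside the logarithm so that scores never go negative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """Minimal Okapi BM25 index over a list of text chunks (illustrative)."""

    def __init__(self, documents, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.docs = [tokenize(d) for d in documents]
        self.term_freqs = [Counter(d) for d in self.docs]
        self.doc_lens = [len(d) for d in self.docs]
        total_len = sum(self.doc_lens)
        # Avoid division by zero when the index is empty or all docs are empty.
        self.avg_len = (total_len / len(self.docs)) if total_len else 1.0

        # Document frequency: in how many documents each term appears.
        doc_freq = Counter()
        for tf in self.term_freqs:
            doc_freq.update(tf.keys())
        n = len(self.docs)
        self.idf = {
            term: math.log(1 + (n - df + 0.5) / (df + 0.5))
            for term, df in doc_freq.items()
        }

    def score(self, query, doc_index):
        """Return the BM25 score of one document for the given query."""
        tf = self.term_freqs[doc_index]
        length_norm = self.k1 * (
            1 - self.b + self.b * self.doc_lens[doc_index] / self.avg_len
        )
        total = 0.0
        for term in tokenize(query):
            freq = tf.get(term, 0)
            if freq:
                total += self.idf[term] * freq * (self.k1 + 1) / (freq + length_norm)
        return total

    def top_k(self, query, k=3):
        """Return up to k (doc_index, score) pairs with a positive score."""
        scored = [(i, self.score(query, i)) for i in range(len(self.docs))]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return [(i, s) for i, s in scored[:k] if s > 0]


if __name__ == "__main__":
    index = BM25Index([
        "The invoice total is 400 dollars",
        "Cats sleep most of the day",
        "Payment of the invoice is due in March",
    ])
    print(index.top_k("invoice due date"))
```

In a document chatbot, the chunks returned by `top_k` would typically be passed to the language model as context, which is how answers stay tied to the uploaded content.
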
- Python 3.8+
- pip (Python package installer)
- Create and activate a virtual environment:

  ```bash
  python -m venv env
  source env/bin/activate  # On Windows: env\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Start the backend server:

  ```bash
  uvicorn app:app --reload
  ```

- Run the frontend application:

  ```bash
  streamlit run frontend.py
  ```

- Open the Streamlit app in your browser (usually available at http://localhost:8501).
- Upload one or more PDF documents.
- Enter your question in the query input field.
- View the AI-generated answer based on your uploaded documents.
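
The same question-and-answer flow can also be exercised without the Streamlit interface by calling the backend directly. The sketch below is only an illustration under stated assumptions: the `/query` path and the `question` JSON field are hypothetical (check the routes defined in `app.py` for the real ones), and it assumes the backend listens on `http://localhost:8000`, which is uvicorn's default address. It uses the standard library's `urllib` rather than Requests so that it runs without extra dependencies.

```python
import json
import urllib.request

# uvicorn's default address; adjust if the server was started with other options.
BACKEND_URL = "http://localhost:8000"


def build_query_request(question, base_url=BACKEND_URL):
    """Build a JSON POST request for a question.

    The "/query" path and the "question" field are hypothetical names used
    for illustration; they are not taken from the project's code.
    """
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, base_url=BACKEND_URL, timeout=30):
    """Send the question to the backend and return the decoded JSON reply."""
    request = build_query_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running and documents already uploaded, `ask("What is the invoice total?")` would return whatever JSON the endpoint produces.
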