🔎 Information Retrieval System using Gemini API (Combined README)

This document combines the README files for the "Heavier Version" (IRS.py) and the "Lighter Version" (irsai.py) of the Information Retrieval System, so both implementations are described in a single file.

🚀 README_IRS.md (for IRS.py - Heavier Version)

🔎 Information Retrieval System using Gemini API (Heavier Version)

This is a comprehensive Information Retrieval System built with Python, Streamlit, and Google's Gemini API. Users can upload a variety of document types (PDF, DOCX, TXT, PPTX, images, Excel, CSV) and interact with the content through context-aware answers, summaries, keyword searches, and multimedia exports. This heavier version includes advanced UI/UX features, detailed error handling, and extensive functionality.

📌 Use Case

Upload PDF, DOCX, TXT, PPTX, JPG/PNG, XLSX, CSV files (up to 200MB)
Ask natural language questions about uploaded content
Generate full or topic-based summaries
Export responses and summaries as PDFs or text
Convert outputs to text-to-speech audio
Search for keywords within documents
Export complete chat history
Common use cases include:
📘 Academic research or literature review
🏫 Educational queries from diverse study materials
📄 Legal/medical/business document analysis
🔉 Audio-based document briefing for accessibility
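The keyword search listed above can be pictured with a minimal sketch. It assumes the extracted text of each upload is already held in a dictionary keyed by file name; the function name `search_keyword`, the `width` parameter, and the shape of each hit are illustrative assumptions, not taken from IRS.py.

```python
import re
from typing import Dict, List


def search_keyword(documents: Dict[str, str], keyword: str, width: int = 40) -> List[dict]:
    """Return one hit per case-insensitive match, with a snippet of surrounding text."""
    hits: List[dict] = []
    if not keyword:
        return hits
    # re.escape treats the keyword as literal text, not as a regular expression.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    for name, text in documents.items():
        for match in pattern.finditer(text):
            start = max(0, match.start() - width)
            end = min(len(text), match.end() + width)
            hits.append({
                "file": name,
                "position": match.start(),
                "snippet": text[start:end].strip(),
            })
    return hits


# Hypothetical extracted text, for illustration only.
docs = {
    "notes.txt": "Gemini answers questions. The gemini model reads context.",
    "empty.txt": "",
}
results = search_keyword(docs, "gemini")
for hit in results:
    print(hit["file"], hit["position"], hit["snippet"])
```

Escaping the keyword keeps characters such as `.` or `(` from being interpreted as regex syntax, which matters when users search for literal strings.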

⚙️ Installation

Follow these steps to set up and run the Information Retrieval System locally:

Clone the Repository

git clone https://github.com/SANJAI-s0/IRS_genai.git
cd IRS_genai

Create and Activate a Virtual Environment (Recommended)

# For macOS/Linux
python3 -m venv venv
source venv/bin/activate

# For Windows
python -m venv venv
venv\Scripts\activate

Install Dependencies

pip install -r requirements.txt

Set Up Environment Variables

Create a .env file in the root directory and add your Gemini API key and PostgreSQL database credentials:

GEMINI_API_KEY=your_actual_gemini_api_key
DB_USERNAME=your_db_username
DB_PASSWORD=your_db_password
DB_HOST=your_db_host
DB_PORT=your_db_port
DB_NAME=your_db_name
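As a rough illustration of how these variables might be consumed, the sketch below assembles a SQLAlchemy-style PostgreSQL URL from the five `DB_*` settings. The helper name `build_database_url` is an assumption, and the actual code in IRS.py may read the values differently; in the app, python-dotenv's `load_dotenv()` would typically populate `os.environ` before a helper like this runs.

```python
from typing import Mapping
from urllib.parse import quote

# The DB_* names come from the .env file; the helper itself is hypothetical.
DB_KEYS = ("DB_USERNAME", "DB_PASSWORD", "DB_HOST", "DB_PORT", "DB_NAME")


def build_database_url(env: Mapping[str, str]) -> str:
    """Build a postgresql+psycopg2 URL, failing early if a setting is missing."""
    missing = [key for key in DB_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    # Percent-encode credentials so characters like '@' or '/' do not break the URL.
    user = quote(env["DB_USERNAME"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    return (
        f"postgresql+psycopg2://{user}:{password}"
        f"@{env['DB_HOST']}:{env['DB_PORT']}/{env['DB_NAME']}"
    )


# Placeholder values for illustration; real values would come from os.environ.
example_env = {
    "DB_USERNAME": "irs",
    "DB_PASSWORD": "p@ss/word",
    "DB_HOST": "localhost",
    "DB_PORT": "5432",
    "DB_NAME": "irsdb",
}
print(build_database_url(example_env))
```

Checking for missing keys up front gives a clear error at startup instead of a connection failure later.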

Run the Streamlit App

streamlit run IRS.py

Your app will be available at http://localhost:8501.

📦 Requirements

Python 3.9 or higher (the pinned pandas 2.2.2 does not support Python 3.8)
Gemini API key (from Google AI Studio)
PostgreSQL database
Internet connection

requirements.txt

streamlit==1.37.0
python-dotenv==1.0.1
google-generativeai==0.5.4
PyPDF2==3.0.1
pyttsx3==2.90
python-docx==1.1.0
pandas==2.2.2
Pillow==10.4.0
fpdf==1.7.2
psycopg2-binary==2.9.9
SQLAlchemy==2.0.31
unicodedata2==15.1.0
mimetypes==1.2.0

🔐 .env & .gitignore

.env

GEMINI_API_KEY=your_actual_gemini_api_key
DB_USERNAME=your_db_username
DB_PASSWORD=your_db_password
DB_HOST=your_db_host
DB_PORT=your_db_port
DB_NAME=your_db_name

.gitignore

.env
__pycache__/
*.pyc
*.mp3
*.wav
*.ogg

✅ This keeps secret keys, compiled Python files, and generated audio files out of the GitHub repository.

🛠 Tech Stack

Tool/Library	Purpose
Python	Core backend logic
Streamlit	Web-based UI framework
PyPDF2	PDF text extraction
python-docx	Word document parsing
google-generativeai	Gemini LLM API access
python-dotenv	Environment variable management
pyttsx3	Text-to-speech conversion
fpdf	Exporting summaries as PDFs
pandas	Excel/CSV processing
Pillow	Image handling
psycopg2	PostgreSQL database integration
SQLAlchemy	Database ORM
unicodedata2	Text sanitization
mimetypes	File type detection

📝 License

MIT License: use freely with attribution.

🧠 How It Works

📁 Users upload multiple file types (PDF, DOCX, TXT, etc.).
📄 Text is extracted and combined into a unified context with progress tracking.
👤 Users can:
Ask natural language questions
Request summaries (full or topic-based)
Search for keywords
Download responses as PDF, text, or audio
View and export chat history
🧠 Prompts are constructed using document context and sent to Gemini 2.0 Flash.
📤 Responses are displayed with advanced UI styling and error handling.
📥 Exports include detailed formatting and database storage.
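The extraction and prompt-construction steps above can be sketched as follows. This is a simplified illustration, not the code in IRS.py: the function names, the prompt wording, and the `MAX_CONTEXT_CHARS` cap are all assumptions.

```python
from typing import Dict

MAX_CONTEXT_CHARS = 20000  # assumed cap; the README does not state a limit


def combine_documents(extracted: Dict[str, str]) -> str:
    """Join the text of each non-empty file into one context, labelled by file name."""
    sections = [
        f"--- {name} ---\n{text.strip()}"
        for name, text in extracted.items()
        if text.strip()
    ]
    return "\n\n".join(sections)


def build_prompt(context: str, question: str, max_chars: int = MAX_CONTEXT_CHARS) -> str:
    """Wrap the (possibly truncated) context and the user's question in one prompt."""
    trimmed = context[:max_chars]
    return (
        "Answer the question using only the document context below.\n\n"
        f"Context:\n{trimmed}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Hypothetical extracted text, for illustration only.
extracted = {"a.txt": "Revenue grew 10%.  ", "b.txt": "   "}
context = combine_documents(extracted)
prompt = build_prompt(context, " How much did revenue grow? ")
print(prompt)

# Sending the prompt would go through google-generativeai, roughly:
#   import google.generativeai as genai
#   genai.configure(api_key=os.environ["GEMINI_API_KEY"])
#   answer = genai.GenerativeModel("gemini-2.0-flash").generate_content(prompt).text
```

Labelling each section with its file name lets the model refer back to a specific document, and truncating the context is one simple way to stay within the model's input limit.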

📬 Contact

Built by Sanjai.

For suggestions or contributions, open an issue or pull request on GitHub: https://github.com/SANJAI-s0/IRS_genai/issues

🚀 Notes for Heavier Version

Features: Includes progress bars, tooltips, extensive CSS customization, and robust error handling.
Database: Stores chat history in PostgreSQL with SSL support.
Limitations: Heavier resource usage due to detailed UI and database operations.
Recommendation: Use for production environments with large datasets or complex workflows.
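To give a feel for the chat-history storage and export mentioned in these notes, here is a small sketch. It swaps in Python's built-in sqlite3 for PostgreSQL so that it runs without a database server; the table name, column names, and helper functions are hypothetical and do not reflect the actual schema in IRS.py.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema; the real application stores history in PostgreSQL.
SCHEMA = """
CREATE TABLE IF NOT EXISTS chat_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    asked_at TEXT NOT NULL,
    question TEXT NOT NULL,
    answer TEXT NOT NULL
)
"""


def save_exchange(conn: sqlite3.Connection, question: str, answer: str) -> None:
    """Insert one question/answer pair with a UTC timestamp."""
    conn.execute(
        "INSERT INTO chat_history (asked_at, question, answer) VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), question, answer),
    )
    conn.commit()


def export_history(conn: sqlite3.Connection) -> str:
    """Return the whole history as plain text, oldest exchange first."""
    rows = conn.execute(
        "SELECT question, answer FROM chat_history ORDER BY id"
    ).fetchall()
    return "\n\n".join(f"Q: {q}\nA: {a}" for q, a in rows)


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the PostgreSQL database
conn.execute(SCHEMA)
save_exchange(conn, "What is the report about?", "Quarterly sales.")
save_exchange(conn, "Who wrote it?", "The finance team.")
history_text = export_history(conn)
print(history_text)
```

With PostgreSQL and SQLAlchemy, SSL is commonly requested by passing `connect_args={"sslmode": "require"}` to `create_engine`; whether IRS.py does exactly this is not stated here.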

💡 README_irsai.md (for irsai.py - Lighter Version)

🔎 Information Retrieval System using Gemini API (Lighter Version)

This is a streamlined Information Retrieval System built with Python, Streamlit, and Google's Gemini API. Users can upload PDF, DOCX, TXT, Excel, or CSV files and interact with the content through context-aware answers, summaries, and multimedia exports. This lighter version focuses on simplicity, reduced resource usage, and essential functionality.

📌 Use Case

Upload PDF, DOCX, TXT, XLSX, CSV files (up to 200MB); PPTX and image files are not supported in this version
Ask natural language questions about uploaded content
Generate full or topic-based summaries
Export responses and summaries as PDFs or text
Convert outputs to text-to-speech audio
Search for keywords within documents
Export complete chat history
Common use cases include:
📘 Academic research or literature review
🏫 Educational queries from diverse study materials
📄 Legal/medical/business document analysis
🔉 Audio-based document briefing for accessibility

⚙️ Installation

Follow these steps to set up and run the Information Retrieval System locally:

Clone the Repository

git clone https://github.com/SANJAI-s0/IRS_genai.git
cd IRS_genai

Create and Activate a Virtual Environment (Recommended)

# For macOS/Linux
python3 -m venv venv
source venv/bin/activate

# For Windows
python -m venv venv
venv\Scripts\activate

Install Dependencies

pip install -r requirements.txt

Set Up Environment Variables

Create a .env file in the root directory and add your Gemini API key and PostgreSQL database credentials:

GEMINI_API_KEY=your_actual_gemini_api_key
DB_USERNAME=your_db_username
DB_PASSWORD=your_db_password
DB_HOST=your_db_host
DB_PORT=your_db_port
DB_NAME=your_db_name

Run the Streamlit App

streamlit run irsai.py

Your app will be available at http://localhost:8501.

📦 Requirements

Python 3.9 or higher (the pinned pandas 2.2.2 does not support Python 3.8)
Gemini API key (from Google AI Studio)
PostgreSQL database
Internet connection

requirements.txt

streamlit==1.37.0
python-dotenv==1.0.1
google-generativeai==0.5.4
PyPDF2==3.0.1
pyttsx3==2.90
python-docx==1.1.0
pandas==2.2.2
fpdf==1.7.2
psycopg2-binary==2.9.9
SQLAlchemy==2.0.31

🔐 .env & .gitignore

.env

GEMINI_API_KEY=your_actual_gemini_api_key
DB_USERNAME=your_db_username
DB_PASSWORD=your_db_password
DB_HOST=your_db_host
DB_PORT=your_db_port
DB_NAME=your_db_name

.gitignore

.env
__pycache__/
*.pyc
*.mp3
*.wav
*.ogg

✅ This keeps secret keys, compiled Python files, and generated audio files out of the GitHub repository.

🛠 Tech Stack

Tool/Library	Purpose
Python	Core backend logic
Streamlit	Web-based UI framework
PyPDF2	PDF text extraction
python-docx	Word document parsing
google-generativeai	Gemini LLM API access
python-dotenv	Environment variable management
pyttsx3	Text-to-speech conversion
fpdf	Exporting summaries as PDFs
pandas	Excel/CSV processing
Pillow	Image handling
psycopg2	PostgreSQL database integration
SQLAlchemy	Database ORM
unicodedata2	Text sanitization
mimetypes	File type detection

📝 License

MIT License: use freely with attribution.

🧠 How It Works

📁 Users upload multiple file types (PDF, DOCX, TXT, etc.).
📄 Text is extracted and combined into a unified context (without the progress bars of the heavier version).
👤 Users can:
Ask natural language questions
Request summaries (full or topic-based)
Search for keywords
Download responses as PDF, text, or audio
View and export chat history
🧠 Prompts are constructed using document context and sent to Gemini 2.0 Flash.
📤 Responses are displayed in a simplified UI with basic styling and reduced error handling.
📥 Exports include detailed formatting and database storage.

📬 Contact

Built by Sanjai.

For suggestions or contributions, open an issue or pull request on GitHub: https://github.com/SANJAI-s0/IRS_genai/issues

🚀 Notes for Lighter Version

Features: Simplified UI with basic CSS, no progress bars or tooltips, and reduced error handling.
Database: Stores chat history in PostgreSQL with minimal configuration.
Limitations: Supports fewer file types (no PPTX or images) and lacks advanced UI features.
Recommendation: Use for quick prototyping or resource-constrained environments.
