Verbo is an AI-powered document intelligence system that turns static documents into searchable, interactive data. It combines Optical Character Recognition (OCR), Natural Language Processing (NLP), and Retrieval-Augmented Generation (RAG) so that users can search scanned PDFs and images and ask questions about their content.
- Intelligent OCR Integration: Extracts text from scanned PDFs and images using Tesseract OCR.
- Deep Document Analysis: Automatically generates summaries, extracts keywords, performs named entity recognition (NER), and runs sentiment analysis using Stanza and Hugging Face Transformers.
- Semantic Search & Indexing: Uses Sentence-Transformers to produce dense embeddings and FAISS for fast vector similarity search.
- Context-Aware RAG Pipeline: Uses a Retrieval-Augmented Generation workflow to answer user queries based specifically on the uploaded document's content.
- Intuitive Two-Column UI: A sleek Streamlit interface featuring a side-by-side view:
  - Left Panel: Upload, OCR processing, and analytical insights.
  - Right Panel: Interactive Q&A chat interface.
- Smart State Management: Optimized session handling to prevent redundant OCR/Analysis processing, ensuring a smooth user experience.
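
The state-management idea in the last bullet can be sketched as a cache keyed by a hash of the uploaded bytes. This is a minimal sketch under stated assumptions, not Verbo's actual code: `process_once`, `run_pipeline`, and the `verbo_cache` key are hypothetical names, and a plain dict stands in for Streamlit's dict-like `st.session_state` so the example runs without Streamlit installed.

```python
import hashlib
from typing import Callable, MutableMapping


def process_once(
    file_bytes: bytes,
    run_pipeline: Callable[[bytes], dict],
    state: MutableMapping,
) -> dict:
    """Run the expensive pipeline only when the uploaded content changes.

    `state` is any dict-like store; in a Streamlit app it could be
    st.session_state (an assumption about how Verbo wires this up).
    """
    digest = hashlib.sha256(file_bytes).hexdigest()
    cached = state.get("verbo_cache")  # hypothetical key name
    if cached is not None and cached["digest"] == digest:
        return cached["result"]  # same file: skip OCR and analysis
    result = run_pipeline(file_bytes)
    state["verbo_cache"] = {"digest": digest, "result": result}
    return result


# Usage: record how often the stand-in pipeline actually runs.
calls = []


def fake_pipeline(data: bytes) -> dict:
    calls.append(len(data))
    return {"n_bytes": len(data)}


state: dict = {}
first = process_once(b"scanned page", fake_pipeline, state)
second = process_once(b"scanned page", fake_pipeline, state)      # cache hit
third = process_once(b"a different file", fake_pipeline, state)   # cache miss
```

Hashing the content rather than the file name means that re-uploading the same file under a new name still hits the cache, while a changed file with the same name does not.
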
| Layer | Technology |
|---|---|
| Frontend | Streamlit |
| OCR | Tesseract OCR |
| NLP Engine | Hugging Face Transformers, Stanza, BERT |
| Embeddings | Sentence-Transformers |
| Summarization | distilbart-cnn-12-6 |
| Vector Store | FAISS |
| Language Models | Flan-T5 |
- Python 3.9 or higher
- Tesseract OCR engine installed on your system:
  - Ubuntu: `sudo apt install tesseract-ocr`
  - macOS: `brew install tesseract`
  - Windows: Download the installer
- Clone the repository: `git clone https://github.com/AbhashK1/Verbo.git`, then `cd Verbo`
- Create a virtual environment: `python -m venv venv`, then `source venv/bin/activate` (on Windows: `venv\Scripts\activate`)
- Install dependencies: `pip install -r requirements.txt`
- Run the application: `streamlit run app.py`

- Ingestion: Upload a PDF or Image.
- Extraction: The system detects if the file is text-based or an image and applies OCR if necessary.
- Analysis: The NLP pipeline processes the text to extract metadata, sentiment, and key entities.
- Vectorization: Text is split into semantic chunks and converted into embeddings stored in a local FAISS index.
- Querying: When you ask a question, Verbo retrieves the most relevant chunks and passes them to the LLM to generate a precise answer.
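
The extraction, vectorization, and querying steps above can be sketched end to end using only the standard library. This is an illustration, not Verbo's implementation: every function name is hypothetical, the OCR check is an assumed heuristic, a bag-of-words counter stands in for Sentence-Transformers embeddings, and a brute-force cosine ranking stands in for the FAISS index. The prompt format is likewise an assumption.

```python
import math
import re
from collections import Counter


def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Assumed heuristic: a file with almost no embedded text is a scan."""
    return len(extracted_text.strip()) < min_chars


def chunk_text(text: str, sentences_per_chunk: int = 1) -> list[str]:
    """Group sentences into chunks (a simple stand-in for semantic chunking)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Brute-force top-k search; a vector index does this job at scale."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text handed to the language model (hypothetical format)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


document = (
    "The invoice was issued on 3 March. "
    "Payment is due within thirty days of the invoice date. "
    "Late payments incur a two percent monthly fee."
)
question = "When is payment due?"

chunks = chunk_text(document)
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

Here `top` holds the single sentence about the payment deadline, and `prompt` is the string that would be passed to the language model. Restricting the prompt to retrieved chunks is what keeps answers grounded in the uploaded document rather than in the model's general knowledge.
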
Author: Abhash Kumar