This repository implements an offline OCR-powered review extraction system using PaddleOCR integrated with Streamlit.
- Clone the repository

      git clone https://github.com/ReviewAid/ReviewAid-OCR
      cd ReviewAid-OCR

- Install dependencies

      pip install -r requirements.txt

- Run Streamlit app

      streamlit run app.py

  (If your entry file has a different name, replace app.py accordingly.)
Once the dependencies are installed, the app runs fully locally and offline.
This project uses PaddleOCR, a deep-learning-based OCR engine that provides:
- Higher accuracy than traditional OCR tools like pytesseract
- Better performance on noisy and complex images
- Strong support for structured text extraction
PaddleOCR replaces the earlier pytesseract-based pipeline for improved reliability.
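As a rough illustration of how OCR output might be turned into review text, the sketch below filters recognized lines by confidence. It is not taken from this repository: `extract_lines` and `min_confidence` are hypothetical names, and the input shape (one `[box, (text, score)]` entry per detected line) is an assumption based on the per-page results that PaddleOCR 2.x returns from its `ocr()` call. Other PaddleOCR versions may structure results differently. The sample data is made up, so the sketch runs without PaddleOCR installed.

```python
from typing import Any, List, Optional, Sequence


def extract_lines(page_result: Optional[Sequence[Any]],
                  min_confidence: float = 0.5) -> List[str]:
    """Collect recognized text from one page of OCR output.

    Assumes each entry looks like [box, (text, score)], where box is a list
    of four [x, y] corner points. This layout is an assumption, not
    something this project documents.
    """
    lines: List[str] = []
    # A page with no detections may come back as None; treat it as empty.
    for entry in page_result or []:
        _box, (text, score) = entry
        cleaned = text.strip()
        # Keep only non-empty lines at or above the confidence threshold.
        if cleaned and score >= min_confidence:
            lines.append(cleaned)
    return lines


# Made-up sample data in the assumed format (no OCR engine is called here).
sample_page = [
    [[[10, 10], [200, 10], [200, 40], [10, 40]], ("Great product", 0.98)],
    [[[10, 50], [200, 50], [200, 80], [10, 80]], ("w0uld buy agin", 0.31)],
    [[[10, 90], [200, 90], [200, 120], [10, 120]], ("  Five stars ", 0.87)],
]

print(extract_lines(sample_page))
```

Dropping low-confidence lines is one simple way to deal with noisy images. The 0.5 threshold is arbitrary and would need tuning against real review screenshots.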

