DocuMind AI is an AI-powered enterprise document intelligence platform that automates document ingestion, OCR extraction, document classification, duplicate detection, and analytics. The platform transforms unstructured business documents into searchable, actionable intelligence through an interactive dashboard and real-time insights.
- Extract text and structured fields from invoices, receipts, purchase orders, and business documents
- High-accuracy OCR pipeline for scanned and digital documents
- Automated field extraction and metadata generation
- Automatically classify uploaded documents into categories
- AI-assisted extraction and categorization pipeline
- Supports invoices, receipts, purchase orders, and custom document types
- Detect duplicate invoices and receipts
- Invoice number matching
- Vendor similarity detection
- Duplicate confidence scoring
- Fraud prevention workflows
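
The duplicate-detection items above (invoice number matching, vendor similarity, confidence scoring) can be illustrated with a minimal sketch. This is not the project's actual implementation: the `Invoice` class, the `normalize` helper, and the 0.6 weighting are assumptions chosen for illustration, and the fuzzy match uses the standard library's `difflib` rather than any AI service.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Invoice:
    # Hypothetical minimal record; real documents carry many more fields.
    invoice_number: str
    vendor: str


def normalize(text: str) -> str:
    """Lowercase and keep only letters and digits so formatting differences are ignored."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def vendor_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two vendor names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def duplicate_confidence(a: Invoice, b: Invoice, number_weight: float = 0.6) -> float:
    """Blend an exact invoice-number match with fuzzy vendor similarity.

    The weighting is an assumed value, not one taken from the project.
    """
    same_number = normalize(a.invoice_number) == normalize(b.invoice_number)
    number_score = 1.0 if same_number else 0.0
    vendor_score = vendor_similarity(a.vendor, b.vendor)
    return number_weight * number_score + (1.0 - number_weight) * vendor_score
```

A pair whose score exceeds some chosen threshold could then be flagged for review; the right threshold depends on the data and is left open here.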
- View processed documents
- Search and filter documents
- Metadata management
- Processing status tracking
- Document lifecycle management
- Field extraction accuracy metrics
- OCR confidence scores
- Processing latency monitoring
- Success ratio tracking
- Operational insights and reporting
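
As a rough illustration of how the metrics above could be aggregated, the sketch below computes a success ratio, average latency, and average OCR confidence from a list of processing records. The `ProcessingRecord` shape and field names are assumptions, not the project's schema.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ProcessingRecord:
    # Hypothetical per-document record used only for this example.
    succeeded: bool
    latency_ms: float
    ocr_confidence: float  # expected range 0.0 to 1.0


def summarize(records: list[ProcessingRecord]) -> dict[str, float]:
    """Aggregate per-document records into dashboard-style metrics."""
    if not records:
        # Avoid dividing by zero when nothing has been processed yet.
        return {"success_ratio": 0.0, "avg_latency_ms": 0.0, "avg_ocr_confidence": 0.0}
    return {
        "success_ratio": sum(r.succeeded for r in records) / len(records),
        "avg_latency_ms": mean(r.latency_ms for r in records),
        "avg_ocr_confidence": mean(r.ocr_confidence for r in records),
    }
```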
- Fully responsive UI
- Mobile, tablet, and desktop support
- Modern dashboard interface
- Real-time data visualization
- Framework: React.js + TypeScript + Vite
- Styling: Tailwind CSS
- Charts: Recharts
- State Management: React Hooks & Context API
- Icons: Lucide React
- Framework: FastAPI (Python 3.11)
- Database: PostgreSQL
- ORM: SQLAlchemy
- Authentication: JWT
- AI Services: Gemini AI APIs
- OCR Engine: Document OCR & Information Extraction Pipeline
graph TD
User[User Browser]
FE[React + TypeScript Frontend]
API[FastAPI Backend]
Auth[JWT Authentication]
OCR[OCR Extraction Engine]
AI[Gemini AI Services]
Dup[Duplicate Detection Engine]
Analytics[Analytics Service]
DB[(PostgreSQL Database)]
User --> FE
FE -->|REST API Requests| API
API --> Auth
API --> OCR
API --> AI
API --> Dup
API --> Analytics
OCR --> DB
AI --> DB
Dup --> DB
Analytics --> DB
API --> DB
API --> FE
sequenceDiagram
participant U as User
participant F as React Frontend
participant A as FastAPI Backend
participant O as OCR Engine
participant G as Gemini AI
participant D as PostgreSQL
U->>F: Upload Document
F->>A: POST /api/v1/documents/upload
A->>O: Extract Text & Fields
O-->>A: Structured Data
A->>G: Classify Document
G-->>A: Category + Metadata
A->>D: Store Results
D-->>A: Persisted Records
A-->>F: Extraction Results
F-->>U: Dashboard & Analytics
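
The flow in the sequence diagram (extract, classify, store, return) can be sketched as a single orchestration function. Everything here is hypothetical: `process_upload`, `InMemoryStore`, and the `fake_extract` / `fake_classify` stand-ins only mirror the order of the steps and do not call a real OCR engine, Gemini, or PostgreSQL.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ProcessedDocument:
    filename: str
    fields: dict[str, str]
    category: str


@dataclass
class InMemoryStore:
    """Stand-in for the database persistence step."""
    records: list[ProcessedDocument] = field(default_factory=list)

    def save(self, doc: ProcessedDocument) -> ProcessedDocument:
        self.records.append(doc)
        return doc


def process_upload(
    filename: str,
    content: bytes,
    extract: Callable[[bytes], dict[str, str]],
    classify: Callable[[dict[str, str]], str],
    store: InMemoryStore,
) -> ProcessedDocument:
    """Run the steps in diagram order: extract, classify, then persist."""
    fields = extract(content)
    category = classify(fields)
    return store.save(ProcessedDocument(filename, fields, category))


def fake_extract(content: bytes) -> dict[str, str]:
    # Placeholder for OCR: pretends an invoice number was found in the text.
    text = content.decode("utf-8", errors="ignore")
    return {"invoice_number": "INV-001", "text": text}


def fake_classify(fields: dict[str, str]) -> str:
    # Placeholder for AI classification: a trivial rule on the extracted fields.
    return "invoice" if "invoice_number" in fields else "other"
```

Passing the extraction and classification steps in as callables keeps the orchestration testable without network access; real implementations would replace the two placeholder functions.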
- Drag-and-drop uploads
- PDF and image support
- Batch upload capabilities
- Upload progress tracking
- View processed documents
- Search and filtering
- Metadata management
- Processing status tracking
- Invoice number matching
- Vendor similarity detection
- Duplicate confidence scoring
- Fraud prevention workflows
- Extraction accuracy monitoring
- OCR confidence tracking
- Processing latency analysis
- Success ratio monitoring
- Operational reporting
- Python 3.11+
- Node.js 18+
- PostgreSQL
- Gemini API Key
cd backend
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate
pip install -r requirements.txt
Create a .env file:
DATABASE_URL=your_postgresql_connection_string
SECRET_KEY=your_secret_key
GEMINI_API_KEY=your_gemini_api_key
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=60
Run the server:
uvicorn app.main:app --reload
Backend URL:
http://127.0.0.1:8000
Swagger Documentation:
http://127.0.0.1:8000/docs
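
For reference, the variables from the .env file could be read and validated at startup along the following lines. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical names, and the code reads the process environment only, so the .env file would still need to be loaded first (for example by the shell or a tool such as python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    secret_key: str
    gemini_api_key: str
    algorithm: str = "HS256"
    access_token_expire_minutes: int = 60


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing values."""
    required = ("DATABASE_URL", "SECRET_KEY", "GEMINI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        secret_key=env["SECRET_KEY"],
        gemini_api_key=env["GEMINI_API_KEY"],
        # Optional values fall back to the defaults shown in the .env example.
        algorithm=env.get("ALGORITHM", "HS256"),
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "60")),
    )
```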
cd frontend
npm install
npm run dev
Frontend URL:
http://localhost:3000
POST /auth/register
POST /auth/login
GET /users/me
POST /api/v1/documents/upload
GET /api/v1/documents
GET /api/v1/documents/{id}
DELETE /api/v1/documents/{id}
GET /api/v1/duplicates
POST /api/v1/duplicates/analyze
GET /api/v1/stats
GET /api/v1/search
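
A client might prepare authenticated calls to these endpoints roughly as follows. The `build_request` helper is hypothetical, and sending the JWT as an `Authorization: Bearer` header is an assumption (it is the common convention, but the source does not state it). The sketch only constructs the request and does not send it.

```python
from typing import Optional
from urllib.request import Request

# Backend URL from the setup instructions.
BASE_URL = "http://127.0.0.1:8000"


def build_request(method: str, path: str, token: Optional[str] = None) -> Request:
    """Prepare a request for one of the listed endpoints without sending it."""
    headers = {"Accept": "application/json"}
    if token:
        # Assumed scheme: the JWT is passed as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return Request(BASE_URL + path, headers=headers, method=method)


# Example: list documents using a token obtained from POST /auth/login.
list_documents = build_request("GET", "/api/v1/documents", token="example-token")
```

With the backend running, the prepared request could be sent using `urllib.request.urlopen(list_documents)`.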
DATABASE_URL=
SECRET_KEY=
GEMINI_API_KEY=
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=60
VITE_API_URL=http://127.0.0.1:8000
docker build -t documind-backend .
docker run -p 8000:8000 documind-backend
Deployment Platforms:
- Render
- Railway
- AWS ECS
- Google Cloud Run
npm run build
Deployment Platforms:
- Vercel
- Netlify
- AWS Amplify
- Role-Based Access Control (RBAC)
- Vector Search for Semantic Document Retrieval
- RAG-Powered Document Question Answering
- Real-Time Processing Queues using Celery and Redis
- Multi-Tenant Enterprise Workspaces
- Cloud Storage Integration (AWS S3 / GCS)
- CI/CD Pipeline with GitHub Actions
- Docker Compose and Kubernetes Deployment
- AI-Powered Document Summarization
- Document Chat Assistant
- AI-Powered Enterprise Document Intelligence Platform
- End-to-End OCR Extraction Pipeline
- Duplicate Invoice Detection System
- Real-Time Analytics Dashboard
- FastAPI + React Full-Stack Architecture
- Gemini AI Integration
- Responsive Enterprise UI
- Production-Ready Modular Architecture
Geethanjali V N
GitHub: https://github.com/Geethanjaliii
Project Repository: https://github.com/Geethanjaliii/DocumindAI
This project is licensed under the MIT License.





