┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ ██████╗ ██╗ ██╗ █████╗ ██████╗ █████╗ ████████╗███████╗██╗ ██████╗ │
│ ██╔══██╗██║ ██║██╔══██╗██╔══██╗██╔══██╗╚══██╔══╝██╔════╝██║██╔════╝ │
│ ██████╔╝███████║███████║██████╔╝███████║ ██║ ███████╗██║██║ ███╗ │
│ ██╔══██╗██╔══██║██╔══██║██╔══██╗██╔══██║ ██║ ╚════██║██║██║ ██║ │
│ ██████╔╝██║ ██║██║ ██║██║ ██║██║ ██║ ██║ ███████║██║╚██████╔╝ │
│ ╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═════╝ │
│ │
│ S I G N · A I — F O R E N S I C │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
₹1.5 Lakh Crore is lost annually in India to cheque fraud and signature forgery.
Banks verify thousands of signatures daily — manually, slowly, and inaccurately.
Existing systems handle only one script. India has 22 official languages.
BharatSign AI targets this gap. It is built as the first forensic signature authentication platform designed specifically for India's multi-script banking reality, with separate engines for English, Bengali, and Hindi signatures.
┌──────────────────────┬──────────────────────┬──────────────────────┐
│ │ │ │
│ 🟡 CEDAR Engine │ 🔵 Bengali Engine │ 🟠 Hindi Engine │
│ │ │ │
│ English · Latin │ Bengali · বাংলা │ Hindi · हिन्दी │
│ │ │ │
│ Dataset: CEDAR │ Dataset: BHSig260 │ Dataset: BHSig260 │
│ Users : 55 │ Users : 100 │ Users : 160 │
│ Script : Western │ Script : Bengali │ Script : Devanagari │
│ Vocab : k=200 │ Vocab : k=100 │ Vocab : k=100 │
│ Dims : 212-D │ Dims : 112-D │ Dims : 112-D │
│ │ │ │
└──────────────────────┴──────────────────────┴──────────────────────┘
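Each engine is trained independently on its own script dataset, with its own visual vocabulary and classifier.
Samples from one script never enter another engine's training set, so script-specific stroke patterns are not diluted by unrelated scripts.
This separation is what makes BharatSign AI fundamentally different from single-model approaches.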
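The engine table can be read as a small configuration, sketched below. The class and key names (`EngineConfig`, `ENGINES`, `"cedar"`, and so on) are illustrative assumptions and not taken from the repository. The only relationship encoded is the one implied by the table: the feature dimension equals the vocabulary size k plus the 12 geometric features.

```python
from dataclasses import dataclass

N_GEOMETRIC = 12  # geometric contour features appended to every BoW histogram


@dataclass(frozen=True)
class EngineConfig:
    """One script-specific engine (hypothetical structure, values from the table)."""
    name: str
    dataset: str
    users: int
    script: str
    vocab_size: int  # k, the number of K-Means visual words

    @property
    def feature_dims(self) -> int:
        # Final vector = k-bin BoW histogram + 12 geometric features.
        return self.vocab_size + N_GEOMETRIC


ENGINES = {
    "cedar": EngineConfig("CEDAR", "CEDAR", 55, "Western", 200),
    "bengali": EngineConfig("Bengali", "BHSig260", 100, "Bengali", 100),
    "hindi": EngineConfig("Hindi", "BHSig260", 160, "Devanagari", 100),
}

for key, cfg in ENGINES.items():
    print(f"{key:8s} k={cfg.vocab_size:3d} dims={cfg.feature_dims}-D")
```

Keeping k per engine makes it explicit that each vocabulary is fitted separately, which matches the 212-D and 112-D figures above.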
┌─────────────────────────────────┐
│ BHARATSIGN AI PIPELINE │
└─────────────────────────────────┘
│
┌────────────────────┼────────────────────┐
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ SIGNATURE │ │ SIGNATURE │ │ SIGNATURE │
│ IMAGE │ │ IMAGE │ │ IMAGE │
│ (English) │ │ (Bengali) │ │ (Hindi) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────┐
│ IMAGE PRE-PROCESSING │
│ Grayscale → Otsu Threshold → Binary Inversion │
└─────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────┐
│ SIFT KEYPOINT DETECTION │
│ Scale-Invariant Feature Transform · 128-D Desc │
└─────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────┐
│ BAG OF WORDS ENCODING │
│ K-Means Clustering → Visual Vocabulary → BoW Vec │
└─────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────┐
│ 12 GEOMETRIC CONTOUR FEATURES │
│ AspectRatio · HullRatio · Eccentricity · Solidity │
│ Centroid · PixelRatio · Skewness · Kurtosis │
└─────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────┐
│ LINEARSVC CLASSIFIER │
│ Script-Specific Model · Class-Balanced Training │
└─────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│ ✅ GENUINE │ │ ✅ GENUINE │ │ ✅ GENUINE │
│ ❌ FORGED │ │ ❌ FORGED │ │ ❌ FORGED │
└────────────┘ └────────────┘ └────────────┘
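Two stages of the pipeline above, Otsu binarization with inversion and Bag of Words encoding, can be sketched in plain NumPy. This is an illustration under assumptions, not the project's code: the function names are hypothetical, and a real run would more likely use OpenCV (`cv2.threshold` with `cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU`, and `cv2.SIFT_create().detectAndCompute` for the 128-D descriptors). The SIFT step itself is not reimplemented here, so the demo uses tiny 2-D stand-in descriptors.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold t for a uint8 image; pixels <= t are the dark class."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # weight of the dark class at each t
    mu = np.cumsum(prob * np.arange(256))     # cumulative intensity mean
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    # Between-class variance; left at 0 where one class would be empty.
    sigma_b = np.zeros(256)
    valid = denom > 0
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def binarize_inverted(gray: np.ndarray) -> np.ndarray:
    """Dark ink becomes 255 (foreground); light paper becomes 0."""
    t = otsu_threshold(gray)
    return np.where(gray <= t, 255, 0).astype(np.uint8)


def bow_histogram(descriptors: np.ndarray, vocabulary: np.ndarray) -> np.ndarray:
    """Count how many descriptors fall nearest to each visual word (cluster centre)."""
    # Squared Euclidean distance from every descriptor to every centre.
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    return np.bincount(words, minlength=len(vocabulary))


# Toy image: light paper (220) with a dark ink rectangle (30).
gray = np.full((4, 6), 220, dtype=np.uint8)
gray[1:3, 1:5] = 30
binary = binarize_inverted(gray)
print(binary)

# Toy vocabulary with k = 2 words and three stand-in descriptors.
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[1.0, 0.0], [9.0, 9.0], [10.0, 11.0]])
print(bow_histogram(descriptors, vocabulary))
```

In the real pipeline the vocabulary would be the k K-Means centres (k = 100 or 200) fitted on that engine's own training descriptors, and the histogram would form the first k entries of the feature vector.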
| # | Feature | Description | Category |
|---|---|---|---|
| 1 to k | BoW Histogram | Visual word frequency from SIFT descriptors via K-Means (k = 200 for CEDAR, k = 100 for Bengali and Hindi) | SIFT |
| k+1 | Aspect Ratio | Width-to-height ratio of bounding box | Geometric |
| k+2 | Hull Ratio | Convex hull area / bounding box area | Geometric |
| k+3 | Contour Ratio | Contour area / bounding box area | Geometric |
| k+4 | Pixel Ratio | Ink pixel density | Geometric |
| k+5 to k+6 | Centroid (x, y) | Normalized center of mass | Spatial |
| k+7 | Eccentricity | Ellipse elongation measure | Shape |
| k+8 | Solidity | Contour area / convex hull area | Shape |
| k+9 to k+10 | Skewness (x, y) | Asymmetry of pixel distribution | Statistical |
| k+11 to k+12 | Kurtosis (x, y) | Peakedness of pixel distribution | Statistical |
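Several of the geometric features in the table can be computed directly from the binarized ink mask. The sketch below covers aspect ratio, pixel ratio, centroid, skewness, and kurtosis only. The normalization choices are assumptions: pixel ratio is taken relative to the bounding box, the centroid is expressed as a fraction of the bounding box, and kurtosis is the non-excess (Pearson) form. Hull ratio, contour ratio, eccentricity, and solidity need contour extraction (for example with `cv2.findContours` and `cv2.convexHull`) and are left out.

```python
import numpy as np


def _skew_kurtosis(coords: np.ndarray) -> tuple[float, float]:
    """Skewness and Pearson kurtosis of a 1-D sample of pixel coordinates."""
    std = coords.std()
    if std == 0:
        return 0.0, 0.0  # degenerate case: all ink in a single row or column
    z = (coords - coords.mean()) / std
    return float((z ** 3).mean()), float((z ** 4).mean())


def simple_geometric_features(binary: np.ndarray) -> dict[str, float]:
    """Compute a subset of the geometric features from a mask (non-zero = ink)."""
    ys, xs = np.nonzero(binary)
    if xs.size == 0:
        raise ValueError("mask contains no ink pixels")
    x0, y0 = int(xs.min()), int(ys.min())
    width = int(xs.max()) - x0 + 1
    height = int(ys.max()) - y0 + 1
    skew_x, kurt_x = _skew_kurtosis(xs.astype(np.float64))
    skew_y, kurt_y = _skew_kurtosis(ys.astype(np.float64))
    return {
        "aspect_ratio": width / height,
        "pixel_ratio": xs.size / (width * height),
        # Pixel centres sit at +0.5, so a symmetric shape yields 0.5.
        "centroid_x": (float(xs.mean()) - x0 + 0.5) / width,
        "centroid_y": (float(ys.mean()) - y0 + 0.5) / height,
        "skew_x": skew_x,
        "skew_y": skew_y,
        "kurt_x": kurt_x,
        "kurt_y": kurt_y,
    }


# Toy mask: a filled 2 x 4 ink rectangle inside a 4 x 6 image.
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1:3, 1:5] = 255
features = simple_geometric_features(mask)
for name, value in features.items():
    print(f"{name:13s} {value:.3f}")
```

Because these values live on very different scales from the BoW counts, a scaler such as the `StandardScaler` listed in the tech stack is a natural companion before classification.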
CEDAR (English) BHSig260-Bengali BHSig260-Hindi
───────────────── ──────────────────── ────────────────────
◆ 55 users ◆ 100 users ◆ 160 users
◆ 24 genuine / user ◆ 24 genuine / user ◆ 24 genuine / user
◆ 24 forged / user ◆ 30 forged / user ◆ 30 forged / user
◆ 2,640 total ◆ 5,400 total ◆ 8,640 total
◆ Western Latin script ◆ Bengali script ◆ Devanagari script
◆ Skilled forgeries ◆ Skilled forgeries ◆ Skilled forgeries
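The totals above follow from users × (genuine + forged) per user, and the unequal class sizes in BHSig260 are one reason class-balanced training is useful. The sketch below recomputes the totals and derives class weights with the common "balanced" heuristic, total / (number of classes × class count). Whether the project uses exactly this weighting is an assumption; the source only states that training is class-balanced.

```python
DATASETS = {
    # name: (users, genuine per user, forged per user)
    "CEDAR": (55, 24, 24),
    "BHSig260-Bengali": (100, 24, 30),
    "BHSig260-Hindi": (160, 24, 30),
}


def summarize(users: int, genuine_each: int, forged_each: int) -> dict:
    """Sample counts plus 'balanced' class weights for a two-class problem."""
    genuine = users * genuine_each
    forged = users * forged_each
    total = genuine + forged
    return {
        "genuine": genuine,
        "forged": forged,
        "total": total,
        # weight = total / (n_classes * class_count), with n_classes = 2
        "w_genuine": total / (2 * genuine),
        "w_forged": total / (2 * forged),
    }


SUMMARY = {name: summarize(*spec) for name, spec in DATASETS.items()}
for name, s in SUMMARY.items():
    print(f"{name:17s} total={s['total']:5d} "
          f"w_genuine={s['w_genuine']:.3f} w_forged={s['w_forged']:.3f}")
```

For CEDAR both weights come out equal because the classes are the same size; for the two BHSig260 subsets the smaller genuine class receives the larger weight.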
💡 Why script-isolated training matters:
Bengali strokes are fundamentally different from Latin strokes.
A model trained on mixed scripts learns noise, not signatures.
BharatSign AI trains each engine exclusively on its own script — this is the core innovation.
BharatSign-AI/
│
├── 📁 Verification_Phase/SVM/
│ │
│ ├── 📁 data/
│ │ ├── 📁 genuine/ ← Original 145 genuine signatures
│ │ ├── 📁 forged/ ← Original 145 forged signatures
│ │ ├── 📁 cedar/
│ │ │ ├── genuine/ ← CEDAR 1320 genuine
│ │ │ └── forged/ ← CEDAR 1320 forged
│ │ ├── 📁 bengali/
│ │ │ ├── genuine/ ← BHSig260 Bengali genuine
│ │ │ └── forged/ ← BHSig260 Bengali forged
│ │ └── 📁 hindi/
│ │ ├── genuine/ ← BHSig260 Hindi genuine
│ │ └── forged/ ← BHSig260 Hindi forged
│ │
│ ├── 🐍 retrain_all.py ← Train Bengali + Hindi engines
│ ├── 🐍 retrain_hindi.py ← Retrain Hindi with verified labels
│ ├── 🐍 sort_all_datasets.py ← Sort BHSig260 into folders
│ ├── 🐍 app.py ← Flask server · 3-engine routing
│ ├── 🌐 sigverify_ui.html ← Production UI · Bank terminal design
│ │
│ ├── 💾 model_combined.pkl ← CEDAR engine model
│ ├── 💾 voc_combined.pkl ← CEDAR visual vocabulary
│ ├── 💾 scaler_combined.pkl ← CEDAR feature scaler
│ ├── 💾 model_bengali.pkl ← Bengali engine model
│ └── 💾 model_hindi.pkl ← Hindi engine model
│
├── 📁 Detection_Phase/ ← Connected Components · OCR pipeline
├── 📁 Our_Dataset/ ← Original cheque image dataset
├── 📄 README.md
└── 📄 requirements.txt
Step 1: Clone the repository

    git clone https://github.com/Darshan-paapani06/BharatSign-AI.git
    cd BharatSign-AI

Step 2: Install dependencies

    pip install opencv-contrib-python scikit-learn flask flask-cors scipy numpy matplotlib pillow

Step 3: Sort BHSig260 datasets (edit paths in script first)

    cd Verification_Phase/SVM
    python sort_all_datasets.py

Step 4: Train Bengali & Hindi engines

    python retrain_all.py

⏳ Takes ~30 minutes. CEDAR model is pre-trained and included.

Step 5: Launch the server

    python app.py

Step 6: Open the platform

    http://127.0.0.1:5000

1. Select the correct script tab
├── English · CEDAR → for Western / Latin signatures
├── Bengali · বাংলা → for Bengali script signatures
└── Hindi · हिन्दी → for Devanagari script signatures
2. Drop or upload the signature image
└── Supports PNG · JPG · TIF · BMP
3. Click "Authenticate Signature"
└── Watch the 5-step forensic pipeline animate
4. Read the Forensic Report
├── GENUINE → AUTHENTICATED · Negligible Risk
└── FORGED → THREAT DETECTED · Elevated/Critical Risk
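The first two usage steps amount to a routing decision: the selected script tab picks the engine, and the upload must be one of the supported image types. A minimal sketch of that check follows. The model filenames come from the project tree, but the script keys (`"english"`, `"bengali"`, `"hindi"`) and the function name `select_engine` are assumptions, and the actual routing in `app.py` may look quite different.

```python
from pathlib import Path

# Model filenames are from the project tree; the dictionary keys are assumed.
ENGINE_MODELS = {
    "english": "model_combined.pkl",  # CEDAR engine
    "bengali": "model_bengali.pkl",
    "hindi": "model_hindi.pkl",
}

# Only the formats the usage steps list: PNG, JPG, TIF, BMP.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".tif", ".bmp"}


def select_engine(script: str, filename: str) -> str:
    """Return the model file for a script tab, rejecting unsupported uploads."""
    key = script.strip().lower()
    if key not in ENGINE_MODELS:
        raise ValueError(f"unknown script: {script!r}")
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported image type: {ext or '(none)'}")
    return ENGINE_MODELS[key]


print(select_engine("Hindi", "cheque_01.TIF"))
```

Rejecting an unknown script outright, rather than falling back to a default engine, keeps a Bengali or Hindi signature from being scored silently by the wrong model.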
| Feature | Other Systems | BharatSign AI |
|---|---|---|
| Script Support | Single script | 3 scripts — English, Bengali, Hindi |
| Training Approach | Global mixed model | Script-isolated engines |
| Feature Space | Basic pixel features | SIFT + 12 geometric contour features |
| Dataset | Single source | CEDAR + BHSig260 (3 variants) |
| Interface | CLI / basic web | Production banking terminal UI |
| India Focus | None | Built for Indian banking infrastructure |
| Label Verification | Not present | Filename-based label verification before training |
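Filename-based label verification, the last row of the table, can be sketched as a check that each file's name agrees with the folder it was sorted into. The naming rule used here, a `-G-` token for genuine and `-F-` for forged, is an assumption for illustration, as are the example paths and function names. The project's sorting and retraining scripts may apply a different rule.

```python
import re
from pathlib import PurePath

# Hypothetical naming rule: "-G-" marks genuine, "-F-" marks forged.
LABEL_TOKEN = re.compile(r"-([GF])-", re.IGNORECASE)


def label_from_filename(filename: str):
    """Return 'genuine', 'forged', or None when the name carries no label token."""
    match = LABEL_TOKEN.search(PurePath(filename).name)
    if match is None:
        return None
    return "genuine" if match.group(1).upper() == "G" else "forged"


def find_mislabeled(paths):
    """Return the paths whose parent folder disagrees with the filename label."""
    problems = []
    for path in paths:
        p = PurePath(path)
        expected = label_from_filename(p.name)
        # Files with no recognizable token are flagged too, for manual review.
        if expected is None or p.parent.name != expected:
            problems.append(path)
    return problems


# Made-up example paths following the data/ folder layout.
samples = [
    "hindi/genuine/H-S-1-G-01.tif",
    "hindi/genuine/H-S-1-F-02.tif",  # forged file sitting in genuine/
    "hindi/forged/H-S-2-F-01.tif",
    "hindi/forged/scan.tif",         # no label token in the name
]
print(find_mislabeled(samples))
```

Running a check like this before training means a mis-sorted file is caught as a data error instead of becoming a wrong label the classifier learns from.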
┌─────────────────┬──────────────────────────────────────────────────┐
│ Computer Vision │ OpenCV 4.x · SIFT (xfeatures2d) │
│ Machine Learning│ scikit-learn · LinearSVC · StandardScaler │
│ Feature Extract │ Bag of Words · K-Means · Contour Analysis │
│ Backend │ Python 3.12 · Flask · Flask-CORS │
│ Frontend │ Vanilla JS · CSS3 · Canvas API · Google Fonts │
│ Datasets │ CEDAR · BHSig260-Bengali · BHSig260-Hindi │
│ Serialization │ Pickle · NumPy · SciPy │
└─────────────────┴──────────────────────────────────────────────────┘
- Bank Terminal Clock — Mechanical flip digits with live ₹ / Au / $ market tickers
- Forensic Risk Matrix — 10-cell animated grid · NEGLIGIBLE → CRITICAL RISK
- Particle Network Background — Dynamic canvas with connected node graph
- Custom Gold Cursor — Smooth lagging ring cursor with hover expansion
- Scan Line Animation — CRT-style overlay for depth and atmosphere
- Processing Pipeline — 5-step animated forensic analysis sequence
- Three-Tab Engine Nav — Gold / Blue / Saffron color-coded engine selector
- Cormorant Garamond + IBM Plex Mono — Refined editorial typography pairing
Phase 2
- Deep Learning (CNN)
- More Indian scripts (Tamil · Telugu · Kannada)
- Live camera capture

Phase 3
- Mobile App (React Native)
- Cloud deployment
- Docker containerization
- REST API for banks

Phase 4
- RBI API Integration
- Real-time bank feed
- Blockchain audit log
- Multi-bank rollout
Darshan Paapani
Built with obsessive attention to detail for the Indian banking ecosystem.
From dataset sorting to forensic UI — every line crafted with purpose.
"Signatures are the last line of defense in financial trust.
BharatSign AI makes that defense unbreakable — in every script,
for every Indian."
— Darshan Paapani