🤟 SignSense — Gesture-Driven Assistive Interface

Master's-Level HCI + ML Project | Indian/American Sign Language → Spoken Text with Affective Computing

📌 Project Overview

SignSense is a real-time, web-based assistive communication platform that:

Detects hand, face, and pose landmarks via MediaPipe Holistic (543 landmarks: 33 pose, 468 face, 21 per hand)
Translates ISL/ASL signs into text using a Transformer / LSTM sequence model
Detects emotion using a CNN on facial micro-expressions (Happy / Serious / Urgent)
Speaks translated text with emotion-matched voice via Web Speech API
Predicts next words using an ML-powered suggestion engine (reducing signer fatigue)

This project targets the Deaf/Hard-of-Hearing community and prioritizes low cognitive load, low-latency real-time response (target: under 200 ms), and WCAG 2.1 Level AA accessibility compliance.
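
The overview does not say how the suggestion engine works. As one minimal illustration of next-word prediction, a bigram frequency model can rank likely followers of the last translated word. The class name `BigramSuggester` and the toy training sentences are hypothetical, and the actual engine may use a different model.

```python
from collections import Counter, defaultdict


class BigramSuggester:
    """Hypothetical sketch: rank next words by bigram frequency."""

    def __init__(self):
        # Maps a word to a Counter of the words observed right after it.
        self._following = defaultdict(Counter)

    def train(self, sentences):
        for sentence in sentences:
            words = sentence.lower().split()
            for current, nxt in zip(words, words[1:]):
                self._following[current][nxt] += 1

    def suggest(self, last_word, k=3):
        """Return up to k of the most frequent followers of last_word."""
        counts = self._following.get(last_word.lower())
        if not counts:
            return []
        return [word for word, _ in counts.most_common(k)]


# Toy corpus, for illustration only.
suggester = BigramSuggester()
suggester.train(["i need water", "i need help", "i need help now", "i want food"])
print(suggester.suggest("need"))
```

Returning an empty list for unseen words lets the UI hide the suggestion strip instead of showing guesses.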

🗂️ Project Structure

SignSense/
│
├── public/                    # Static assets
│   └── index.html
│
├── src/                       # React frontend
│   ├── components/
│   │   ├── CameraFeed.jsx         # Webcam capture + MediaPipe overlay
│   │   ├── TranslationPanel.jsx   # Real-time sign → text output
│   │   ├── EmotionBadge.jsx       # Live emotion indicator
│   │   ├── PredictiveBar.jsx      # Word suggestion strip
│   │   ├── SpeechControls.jsx     # TTS controls (rate, pitch, voice)
│   │   ├── LatencyMonitor.jsx     # System response time display (HCI eval)
│   │   ├── AccessibilityAudit.jsx # WCAG heuristic checker panel
│   │   └── Navbar.jsx
│   ├── hooks/
│   │   ├── useMediaPipe.js        # MediaPipe Holistic integration
│   │   ├── useWebSocket.js        # Real-time backend communication
│   │   └── useSpeech.js           # Web Speech API hook
│   ├── utils/
│   │   ├── landmarkUtils.js       # Normalize/flatten landmark arrays
│   │   ├── emotionMap.js          # Emotion → TTS pitch/rate mapping
│   │   └── wcagChecker.js         # WCAG 2.1 heuristic evaluator
│   ├── App.jsx
│   ├── main.jsx
│   └── index.css
│
├── backend/                   # Python FastAPI backend
│   ├── main.py                    # FastAPI app + WebSocket endpoint
│   ├── routes/
│   │   ├── predict.py             # Sign prediction route
│   │   └── emotion.py             # Emotion analysis route
│   ├── models/
│   │   ├── lstm_model.py          # LSTM sequence classifier
│   │   ├── transformer_model.py   # Transformer-based sign recognizer
│   │   └── emotion_cnn.py         # CNN facial emotion classifier
│   └── utils/
│       ├── landmark_processor.py  # Preprocess MediaPipe landmarks
│       └── tts_modifier.py        # Emotion-aware TTS parameter output
│
├── notebooks/                 # Jupyter notebooks for model training
│   ├── 01_data_collection.ipynb
│   ├── 02_landmark_extraction.ipynb
│   ├── 03_lstm_training.ipynb
│   ├── 04_emotion_cnn_training.ipynb
│   └── 05_evaluation_metrics.ipynb
│
├── docs/                      # HCI evaluation documentation
│   ├── HCI_Evaluation_Report.md
│   ├── Usability_Test_Protocol.md
│   ├── SUS_Score_Template.xlsx
│   └── Latency_Study.md
│
├── .github/
│   └── workflows/
│       └── ci.yml             # GitHub Actions CI
│
├── requirements.txt           # Python dependencies
├── package.json               # Node dependencies
├── .env.example               # Environment variable template
├── .gitignore
└── LICENSE
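
The tree above lists a landmark preprocessing step (landmarkUtils.js, landmark_processor.py) without showing it. A minimal sketch of the flattening part follows. It assumes the default MediaPipe Holistic output of 33 pose, 468 face, and 21 landmarks per hand, zero-fills any part that was not detected, and uses a pose, face, left hand, right hand ordering that is an assumption rather than the project's confirmed layout. The function names are hypothetical.

```python
# Landmark counts from MediaPipe Holistic with the default face mesh.
POSE, FACE, HAND = 33, 468, 21
# 543 landmarks * (x, y, z) = 1629 values per frame.
FEATURES_PER_FRAME = (POSE + FACE + 2 * HAND) * 3


def _flatten_part(points, expected):
    """Flatten [(x, y, z), ...]; zero-fill when the part was not detected."""
    if points is None:
        return [0.0] * (expected * 3)
    if len(points) != expected:
        raise ValueError(f"expected {expected} landmarks, got {len(points)}")
    return [float(c) for point in points for c in point]


def flatten_frame(pose=None, face=None, left_hand=None, right_hand=None):
    """Return one fixed-length feature vector for a single video frame."""
    return (
        _flatten_part(pose, POSE)
        + _flatten_part(face, FACE)
        + _flatten_part(left_hand, HAND)
        + _flatten_part(right_hand, HAND)
    )
```

A fixed-length vector per frame keeps the sequence model's input shape stable even when a hand leaves the camera view.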

🧠 Technical Architecture

Webcam Input
    │
    ▼
MediaPipe Holistic (Browser)
    │  500+ landmarks (x,y,z)
    ▼
WebSocket ──────────────────► FastAPI Backend
                                    │
                          ┌─────────┴──────────┐
                          ▼                     ▼
                  LSTM / Transformer        Emotion CNN
                  (Sign → Word)         (Face → Emotion)
                          │                     │
                          └─────────┬───────────┘
                                    ▼
                          Translated Text + Emotion Tag
                                    │
                                    ▼
                          Web Speech API (Emotion-aware TTS)
                                    │
                                    ▼
                          User hears spoken, emotionally
                          nuanced translation ✅
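
The last stage of the pipeline turns an emotion tag into speech parameters. The sketch below shows one way a backend helper such as tts_modifier.py might do this. The preset values and the function name `tts_params` are hypothetical and would need tuning with users. The only fixed facts are the ranges that the Web Speech API's SpeechSynthesisUtterance accepts: pitch from 0 to 2 and rate from 0.1 to 10.

```python
# Hypothetical pitch/rate presets; tune these in user testing.
EMOTION_TTS = {
    "happy": {"pitch": 1.3, "rate": 1.1},
    "serious": {"pitch": 0.9, "rate": 0.9},
    "urgent": {"pitch": 1.1, "rate": 1.4},
}
NEUTRAL = {"pitch": 1.0, "rate": 1.0}


def _clamp(value, low, high):
    return max(low, min(high, value))


def tts_params(emotion, confidence=1.0):
    """Blend from neutral toward the emotion preset by classifier confidence."""
    preset = EMOTION_TTS.get(emotion.lower(), NEUTRAL)
    weight = _clamp(confidence, 0.0, 1.0)
    pitch = NEUTRAL["pitch"] + weight * (preset["pitch"] - NEUTRAL["pitch"])
    rate = NEUTRAL["rate"] + weight * (preset["rate"] - NEUTRAL["rate"])
    # SpeechSynthesisUtterance accepts pitch in [0, 2] and rate in [0.1, 10].
    return {
        "pitch": round(_clamp(pitch, 0.0, 2.0), 2),
        "rate": round(_clamp(rate, 0.1, 10.0), 2),
    }
```

Scaling by confidence means a low-confidence emotion prediction stays close to a neutral voice, which limits the cost of a misclassification.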

🚀 Roadmap

Phase 1 — Data & Landmarks ✅

Set up MediaPipe Holistic in React
Extract and save landmark CSV files for 50 signs
Augment dataset (flipping, noise injection)
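
The two augmentations named above can be sketched as follows. This is an illustration under assumptions, not the project's notebook code: each frame is taken to be a dict with "left_hand" and "right_hand" lists of (x, y, z) tuples, x is normalized to [0, 1] as MediaPipe reports it, and the default noise level `sigma=0.01` is a guess to tune.

```python
import random


def mirror_frame(frame):
    """Mirror one frame horizontally.

    Mirroring flips x and also swaps which hand is left and which is right.
    """
    def flip(points):
        return [(1.0 - x, y, z) for x, y, z in points]

    return {
        "left_hand": flip(frame["right_hand"]),
        "right_hand": flip(frame["left_hand"]),
    }


def add_noise(frame, sigma, rng):
    """Add Gaussian jitter to every coordinate of every landmark."""
    return {
        part: [tuple(c + rng.gauss(0.0, sigma) for c in point) for point in points]
        for part, points in frame.items()
    }


def augment(sequence, sigma=0.01, seed=None):
    """Return the original sequence plus a mirrored and a noisy copy."""
    rng = random.Random(seed)
    mirrored = [mirror_frame(f) for f in sequence]
    noisy = [add_noise(f, sigma, rng) for f in sequence]
    return [sequence, mirrored, noisy]
```

Mirroring effectively simulates a signer with the opposite dominant hand, so it is worth checking that no sign in the vocabulary changes meaning when mirrored.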

Phase 2 — ML Model Training

Train LSTM on landmark sequences (ISL/ASL vocabulary)
Train CNN on facial expression dataset (FER-2013 or AffectNet)
Evaluate: Precision, Recall, F1 per sign class
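
Per-class precision, recall, and F1 can be computed directly from true and predicted labels. The sketch below uses only the standard library; the function name `per_class_metrics` is hypothetical, and a library such as scikit-learn would normally do the same job in the evaluation notebook.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed an instance of the true class

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics
```

Reporting per class rather than a single accuracy figure exposes signs that are systematically confused with one another.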

Phase 3 — Backend API

Build FastAPI WebSocket for real-time inference
Integrate both models with a unified prediction endpoint
Optimize for < 200ms latency
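
One way to check the 200 ms target is to time each inference call and compare a high percentile of the samples against the budget. The helper names below are hypothetical, and using the 95th percentile is an assumption; the roadmap only states the 200 ms figure.

```python
import math
import time

LATENCY_BUDGET_MS = 200.0  # target stated in the roadmap


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def within_budget(samples_ms, pct=95, budget_ms=LATENCY_BUDGET_MS):
    """True when the chosen percentile stays under the latency budget."""
    return percentile(samples_ms, pct) < budget_ms
```

A percentile is used instead of the mean because occasional slow frames are what a signer actually notices.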

Phase 4 — Frontend Dashboard

Build React UI with camera feed + translation panel
Add emotion badge, predictive word bar, TTS controls
Implement WCAG 2.1 heuristic audit panel

Phase 5 — HCI Evaluation

Conduct usability study (n ≥ 5 participants)
Calculate SUS (System Usability Scale) score
Document latency measurements and optimizations
Heuristic evaluation report
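
The SUS score follows the standard scoring rule: each of the 10 items is answered on a 1 to 5 scale, odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a value from 0 to 100. A short sketch, with hypothetical function names:

```python
def sus_score(responses):
    """Score one participant's 10 SUS answers (each 1 to 5) on a 0-100 scale."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly 10 responses between 1 and 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item % 2 else (5 - answer)
    return total * 2.5


def mean_sus(all_responses):
    """Average SUS score across participants."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)
```

The study-level figure to compare against the project's target is the mean across participants, not any single participant's score.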

⚙️ Setup & Installation

Prerequisites

Node.js 18+
Python 3.10+
Webcam
(Optional) NVIDIA GPU for faster model training

1. Clone the repository

git clone https://github.com/YOUR_USERNAME/SignSense.git
cd SignSense

2. Frontend Setup

npm install
npm run dev
# App runs at http://localhost:5173

3. Backend Setup

cd backend
python -m venv venv
source venv/bin/activate        # Windows: venv\Scripts\activate
pip install -r ../requirements.txt   # requirements.txt lives in the repository root
uvicorn main:app --reload --port 8000

4. Environment Variables

# Run from the repository root, where .env.example lives
cp .env.example .env
# Edit .env with your config

📊 HCI Evaluation Metrics

Metric                        Tool                          Target
Task Success Rate             Usability Test                ≥ 85%
System Usability Scale (SUS)  SUS Questionnaire             ≥ 75 / 100
System Response Time          LatencyMonitor component      < 200 ms
WCAG 2.1 Compliance           AccessibilityAudit component  Level AA
Word Prediction Accuracy      ML evaluation                 ≥ 70% top-3
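
The word prediction target is expressed as top-3 accuracy: a prediction counts as correct when the word the signer actually produced next appears among the first three suggestions. A minimal sketch of that metric, with a hypothetical function name:

```python
def top_k_accuracy(ranked_predictions, targets, k=3):
    """Fraction of targets found among the first k ranked suggestions."""
    if len(ranked_predictions) != len(targets) or not targets:
        raise ValueError("need equal-length, non-empty inputs")
    hits = sum(
        target in ranked[:k]
        for ranked, target in zip(ranked_predictions, targets)
    )
    return hits / len(targets)
```

Here `ranked_predictions` is a list of suggestion lists ordered from most to least likely, and `targets` holds the word that was actually signed next for each case.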

📚 References & Datasets

WLASL (Word-Level American Sign Language) — link
INCLUDE (Indian Sign Language) — link
FER-2013 (Facial Emotion Recognition) — Kaggle
AffectNet (High-res facial expressions)
MediaPipe Holistic: Google AI

👤 Author

Shubham Sharma
MSc Human-Computer Interaction / AI
Siegen University

📄 License

MIT License — see LICENSE
