Computer Vision • Hybrid Machine Learning • Low Latency
AI Mouse is a real-time computer vision research project that explores mouse control using hand gestures captured via a standard webcam.
Unlike typical deep learning approaches, this system combines MediaPipe hand tracking with a hybrid machine learning architecture (KNN + Random Forest). This design prioritizes ultra-low latency, stability, and the ability to adapt to new gestures on the fly without heavy retraining.
🎯 Focus: System design, real-time performance, and practical ML decision-making rather than deep learning complexity.
- 🎥 Real-Time Tracking: Uses MediaPipe for robust hand landmark detection.
- ✋ Gesture Control: Full mouse navigation including Move, Scroll, and Click.
- 🧠 Hybrid ML Engine:
  - KNN: fast, online incremental adaptation.
  - Random Forest: stabilizes confidence scores.
- 🎯 Smart Execution: Temporal buffering to reduce jitter and false positives.
- 📦 Modular Architecture: Clean separation between vision, logic, and execution layers.
- 🖥️ Fully Offline: No internet connection required.
The system avoids deep learning to prioritize speed and interpretability.
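The "Smart Execution" temporal buffering described above can be sketched as a majority vote over the last few frames (a minimal illustration; the project's actual buffer logic may differ):

```python
from collections import Counter, deque

class GestureBuffer:
    """Temporal buffer: only emit a gesture once it dominates the
    last N frames, suppressing single-frame jitter and false positives."""

    def __init__(self, size=8, min_ratio=0.7):
        self.window = deque(maxlen=size)
        self.min_ratio = min_ratio

    def push(self, gesture):
        self.window.append(gesture)
        label, count = Counter(self.window).most_common(1)[0]
        full = len(self.window) == self.window.maxlen
        if full and count / len(self.window) >= self.min_ratio:
            return label   # stable gesture -> safe to act on
        return "IDLE"      # not confident yet -> safety state

buf = GestureBuffer()
out = [buf.push(g) for g in ["MOVE"] * 10]   # stabilizes after the window fills
```

Requiring a full window before acting trades a few frames of latency for far fewer spurious clicks.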
```
AI_MOUSE/
│
├── core/
│   ├── config.py        # System sensitivity & configuration
│   ├── features.py      # Hand landmark feature extraction
│   ├── model.py         # Hybrid KNN + Random Forest logic
│   ├── actions.py       # PyAutoGUI execution (Mouse/Click)
│   └── __init__.py
│
├── main.py              # Camera loop & orchestration
├── requirements.txt     # Dependencies
└── README.md            # Documentation
```
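As one illustration of what `features.py` might compute (the actual implementation may differ), MediaPipe's 21 hand landmarks can be made translation- and scale-invariant before classification:

```python
import numpy as np

def extract_features(landmarks):
    """Convert 21 (x, y) hand landmarks into a translation- and
    scale-invariant feature vector. One common approach: subtract
    the wrist (landmark 0) and divide by the wrist -> middle-finger
    MCP distance (landmark 9)."""
    pts = np.asarray(landmarks, dtype=float)   # shape (21, 2)
    pts = pts - pts[0]                         # wrist-relative coordinates
    scale = np.linalg.norm(pts[9])             # characteristic hand size
    if scale > 0:
        pts /= scale
    return pts.flatten()                       # 42-dim feature vector

# Hypothetical landmark input for demonstration only.
vec = extract_features([(i * 0.01, i * 0.02) for i in range(21)])
```

Normalizing this way lets the same trained samples work when the hand moves around the frame or closer to the camera.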
This problem demands ultra-low latency and online learning. Deep learning models often introduce unnecessary overhead.
- KNN (K-Nearest Neighbors): Allows for instant adaptation to a specific user's hand shape.
- Random Forest: Acts as a stabilizer to filter out noise from the webcam.
- Result: A system that is faster and more responsive than heavy neural networks for this specific task.
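A minimal sketch of how such a KNN + Random Forest combination could work with scikit-learn (illustrative only; `model.py` may structure this differently):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

class HybridGestureModel:
    """KNN adapts instantly to each new sample; the forest's class
    probabilities act as a stabilizer, gating noisy predictions."""

    def __init__(self):
        self.knn = KNeighborsClassifier(n_neighbors=3)
        self.rf = RandomForestClassifier(n_estimators=50, random_state=0)
        self.X, self.y = [], []

    def add_sample(self, features, label):
        self.X.append(features)
        self.y.append(label)
        # Refitting is cheap at hand-gesture dataset sizes (hundreds of rows).
        self.knn.fit(self.X, self.y)
        self.rf.fit(self.X, self.y)

    def predict(self, features):
        knn_label = self.knn.predict([features])[0]
        rf_conf = self.rf.predict_proba([features])[0].max()
        # Only act when the forest is also confident; otherwise stay idle.
        return knn_label if rf_conf >= 0.6 else "IDLE"

# Demo on two well-separated synthetic "gesture" clusters.
rng = np.random.default_rng(0)
model = HybridGestureModel()
for _ in range(10):
    model.add_sample(rng.normal(0.0, 0.1, 4).tolist(), "MOVE")
    model.add_sample(rng.normal(5.0, 0.1, 4).tolist(), "CLICK")
```

Because both models refit in milliseconds at this scale, the system can accept new training samples mid-session without a separate training phase.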
Gestures are trained live at runtime to match the user's specific hand.
| ID | Gesture Name | Action |
|---|---|---|
| 1 | MOVE | Cursor follows hand movement |
| 2 | SCROLL | Scroll Up / Down |
| 3 | CLICK | Left Mouse Click |
| 4 | IDLE | No Action (Safety state) |
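The gesture-to-action mapping above might be dispatched as follows. This sketch returns the intended PyAutoGUI call instead of executing it, which keeps the example display-free; the real `actions.py` presumably calls PyAutoGUI directly and may be organized differently:

```python
def plan_action(gesture, x=None, y=None):
    """Map a stabilized gesture to the mouse action it should trigger.
    Tuples name the PyAutoGUI call an executor would make."""
    if gesture == "MOVE" and x is not None:
        return ("moveTo", x, y)   # pyautogui.moveTo(x, y)
    if gesture == "SCROLL":
        return ("scroll", 40)     # pyautogui.scroll(40), i.e. scroll up
    if gesture == "CLICK":
        return ("click",)         # pyautogui.click(), left button
    return None                   # IDLE: safety state, deliberately no-op
```

Separating "decide the action" from "execute the action" mirrors the project's split between the logic and execution layers.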
| Key | Function |
|---|---|
| `1` / `2` / `3` / `4` | Train the respective gesture (hold to capture data) |
| `s` | Save trained model data locally |
| `r` | Reset / clear current calibration |
| `ESC` | Exit the program |
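Inside the camera loop, this hotkey table might be handled like the sketch below (the key would come from `cv2.waitKey(1) & 0xFF`; method names such as `capture`, `save`, and `reset` are illustrative, not the project's actual API):

```python
ESC = 27  # key code returned by cv2.waitKey for the Escape key

def handle_key(key, model):
    """Dispatch one keypress; return False to exit the main loop."""
    if key in (ord("1"), ord("2"), ord("3"), ord("4")):
        model.capture(int(chr(key)))   # hold to record samples for gesture ID
    elif key == ord("s"):
        model.save()                   # persist calibration (.pkl)
    elif key == ord("r"):
        model.reset()                  # clear current calibration
    elif key == ESC:
        return False                   # exit the program
    return True
```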
- Python 3.10+
- A working Webcam
```bash
# Create virtual environment
py -3.10 -m venv .venv
.venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run
python main.py
```
This project creates a virtual mouse interface. To ensure smooth movement, PyAutoGUI's failsafe is disabled:

```python
pyautogui.FAILSAFE = False
```
If the mouse behaves unexpectedly or gets stuck:

- Press `ESC` immediately to kill the script.
- Or press `Alt + Tab` to switch windows.
- Or close the OpenCV window.

Use with caution. This behavior is intentional for experimentation.
- User Specific: Trained data (`.pkl`) is specific to your hand and lighting conditions. It is not synced to Git.
- OS: Designed and tested on Windows.
- Scope: This is an experimental research project, not intended for production accessibility tools.
- Visual overlays for gesture confidence.
- Dynamic sensitivity tuning via GUI.
- Comparative latency study against CNN models.