
GestureX: AI Gesture-based Computer Interaction

Control your computer's mouse, volume, and more using an AI-powered hand tracking engine. Built with Python, OpenCV, and MediaPipe, GestureX provides a smooth, intuitive, and highly configurable desktop application to map complex hand gestures to native OS commands.


Features

GestureX currently supports a wide array of advanced gestures out-of-the-box:

  • Mouse Movement: Smoothly guide the cursor using your index and middle fingers.
  • Left Click: "Pinch" your index and middle fingers together while moving.
  • Scroll (Up & Down): Move your hand up or down vertically with thumb, index, and middle fingers extended. (Pinching pauses the scroll).
  • Drag & Drop: Pinch your thumb and index fingers to grab, move your hand, and release to drop.
  • Volume Control: Extend all five fingers and rotate your hand like a physical knob (left for down, right for up).
  • Screenshot: Use a rapid sequence: Fist → Open Palm → Fist to capture the screen instantly. Includes native Windows toast notifications!
  • Desktop UI Toolkit: A modern, dark-themed GUI (built via CustomTkinter) that allows you to configure thresholds, tweak sensitivities, and launch the engine.
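Gesture classification like the above typically starts from deciding which fingers are raised. A minimal sketch of that idea, assuming MediaPipe's 21-landmark ordering and normalized coordinates (the helper name and the simplified thumb test are illustrative, not the repository's actual code):

```python
# MediaPipe hand landmark indices for each fingertip.
TIP_IDS = (4, 8, 12, 16, 20)   # thumb, index, middle, ring, pinky

def fingers_up(lm):
    """lm: 21 normalized (x, y) landmarks (y grows downward, as in MediaPipe).

    Returns [thumb, index, middle, ring, pinky] booleans.
    """
    up = [lm[4][0] < lm[3][0]]                   # thumb: x test (mirrored right hand)
    for tip in TIP_IDS[1:]:
        up.append(lm[tip][1] < lm[tip - 2][1])   # tip above its PIP joint = extended
    return up
```

The resulting boolean pattern (e.g. index and middle up, everything else down) is what selects which controller handles the frame.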

Gesture Reference Guide

| Feature | Gesture Requirement | Action |
| --- | --- | --- |
| Move Cursor | Index & Middle fingers UP | Move hand within the virtual bounds to push the cursor. |
| Left Click | Index & Middle fingers UP | Bring Index and Middle fingers close together to trigger a click. |
| Scroll | Thumb, Index, & Middle UP | Move hand up/down relative to the start position. Join fingers to Lock/Pause. |
| Drag & Drop | Thumb & Index fingers UP | Bring Thumb and Index close together to Start Drag. Release to Drop. |
| Volume Knob | All 5 fingers UP (Open Palm) | Rotate the hand clockwise (Up) or counter-clockwise (Down). |
| Screenshot | Multi-state Sequence | Fist → hold briefly → Open Palm → hold briefly → Fist. |

(Note: the engine uses edge triggering, debounce timers, and sticky modes to prevent gesture flickering and to ensure each command fires exactly once.)
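Edge triggering plus a debounce timer can be sketched as follows: fire only on the frame where a pinch first becomes active, and suppress re-fires within a cooldown window. This is a minimal illustration under assumed names and a 0.3 s cooldown, not the repository's actual controller code:

```python
import time

class DebouncedTrigger:
    """Fire once per pinch: edge-triggered (only on a False->True transition),
    with a cooldown so camera jitter cannot double-fire a click."""

    def __init__(self, cooldown=0.3, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock                 # injectable clock eases testing
        self._was_active = False
        self._last_fire = -float("inf")

    def update(self, active):
        """Call once per camera frame with the raw pinch state; returns True
        only on the frame where the command should actually fire."""
        now = self.clock()
        fire = (active and not self._was_active
                and now - self._last_fire >= self.cooldown)
        if fire:
            self._last_fire = now
        self._was_active = active
        return fire
```

Holding the pinch does nothing after the first frame, and a flickering detection inside the cooldown window is ignored.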


Installation & Setup

Install the exact dependency versions listed in the provided requirements.txt.

  1. Clone the repository

    git clone https://github.com/PhilemonTJ/Gesture-based-Computer-Interaction.git
    cd Gesture-based-Computer-Interaction
  2. Set up a Virtual Environment (Recommended)

    python -m venv venv
    # On Windows:
    venv\Scripts\activate
  3. Install Dependencies

    pip install -r requirements.txt

    (Required packages include opencv-python, mediapipe, numpy, mouse, pyautogui, customtkinter, Pillow, and win11toast).


Usage

GestureX features a settings manager and a centralized desktop app.

Start the GUI Application:

python src/app_ui.py

From the GUI you can:

  1. Adjust camera settings and interaction zones.
  2. Tweak gesture sensitivities (like scroll speed, join thresholds, angles).
  3. Click "Start Engine" to launch the gesture camera feed.

To just run the raw gesture engine directly without the GUI:

python src/main.py

Project Structure

Gesture-based-Computer-Interaction/
├── src/
│   ├── app_ui.py                 # Desktop GUI launcher
│   ├── main.py                   # Core gesture detection loop
│   ├── controllers/              # Modular controller logic
│   │   ├── drag_drop_controller.py
│   │   ├── mouse_click_controller.py
│   │   ├── mouse_movement_controller.py
│   │   ├── screenshot_controller.py
│   │   ├── scroll_controller.py
│   │   └── volume_controller.py
│   ├── core/
│   │   ├── camera_manager.py     # Webcam capture handler
│   │   ├── config.py             # Global constants & variables
│   │   └── preferences.py        # Settings save/load JSON manager
│   ├── ui/
│   │   ├── app.py                # Main CustomTkinter UI definitions
│   │   ├── engine.py             # Background thread runner for Main
│   │   └── settings.py           # Configuration window UI
│   └── utils/
│       ├── hand_detector.py      # MediaPipe wrapper & math helpers
│       ├── roi_tracker.py        # Cropping optimizer for faster FPS
│       └── volume_manager.py     # Windows native COM volume bindings
├── requirements.txt
└── README.md

How it Works

  1. Hand Tracking: Built on Google's MediaPipe, which detects 21 3D hand landmarks in real time.
  2. ROI Optimization: We use a custom Region of Interest (ROI) tracker. Once a hand is found, we crop the camera frame around it dynamically, boosting overall FPS and reducing CPU load substantially.
  3. One-Euro Filter: The mouse controller pushes the cursor coordinates through a 1€ Filter algorithm to eliminate jitter while preserving low latency during quick movements.
  4. State Machine: Complex features like Screenshot or persistent Drag-and-Drop are handled using isolated state machines inside their respective Controller classes.
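The 1€ Filter mentioned in step 3 is a standard adaptive low-pass filter (Casiez et al.): it smooths heavily when the hand moves slowly (killing jitter) and raises its cutoff when it moves fast (keeping latency low). A compact single-axis sketch, with default parameters chosen for illustration rather than taken from the repository:

```python
import math

class OneEuroFilter:
    """Minimal 1-Euro Filter: cutoff frequency rises with signal speed,
    trading smoothing for responsiveness only when the hand moves fast."""

    def __init__(self, min_cutoff=1.0, beta=0.007, d_cutoff=1.0):
        self.min_cutoff, self.beta, self.d_cutoff = min_cutoff, beta, d_cutoff
        self.x_prev = self.dx_prev = self.t_prev = None

    @staticmethod
    def _alpha(cutoff, dt):
        # Exponential-smoothing factor for a given cutoff and timestep.
        tau = 1.0 / (2 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, x, t):
        if self.x_prev is None:            # first sample passes through
            self.x_prev, self.dx_prev, self.t_prev = x, 0.0, t
            return x
        dt = t - self.t_prev               # assumed > 0 (one sample per frame)
        dx = (x - self.x_prev) / dt
        a_d = self._alpha(self.d_cutoff, dt)
        dx_hat = a_d * dx + (1 - a_d) * self.dx_prev   # smoothed derivative
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = a * x + (1 - a) * self.x_prev          # adaptive smoothing
        self.x_prev, self.dx_prev, self.t_prev = x_hat, dx_hat, t
        return x_hat
```

In a cursor controller, one filter instance per axis is fed the raw landmark coordinate and the frame timestamp each iteration.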
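A sequence gesture like the screenshot trigger (Fist → Open Palm → Fist, each held briefly) maps naturally onto a small state machine. The sketch below is a hypothetical illustration of that pattern; the stage names, dwell time, and timeout are assumptions, not the values used by the repository's `screenshot_controller.py`:

```python
import time

class ScreenshotSequence:
    """Advance through Fist -> Palm -> Fist, requiring each pose to be held
    for `dwell` seconds; abort if the whole sequence exceeds `timeout`."""

    ORDER = ("fist", "palm", "fist")

    def __init__(self, dwell=0.15, timeout=2.0, clock=time.monotonic):
        self.dwell, self.timeout, self.clock = dwell, timeout, clock
        self._reset()

    def _reset(self):
        self.stage = 0           # index into ORDER
        self.stage_since = None  # when the current pose was first seen
        self.started = None      # when the whole sequence began

    def update(self, pose):
        """pose: 'fist', 'palm', or 'other'. Returns True once on completion."""
        now = self.clock()
        if self.started is not None and now - self.started > self.timeout:
            self._reset()                        # took too long: start over
        if pose == self.ORDER[self.stage]:
            if self.started is None:
                self.started = now
            if self.stage_since is None:
                self.stage_since = now
            elif now - self.stage_since >= self.dwell:
                self.stage += 1                  # held long enough: advance
                self.stage_since = None
                if self.stage == len(self.ORDER):
                    self._reset()
                    return True                  # fire the screenshot
        else:
            self.stage_since = None              # pose lost before dwell met
        return False
```

Isolating each gesture in its own state machine is what keeps a long-running sequence like this from interfering with frame-by-frame gestures such as cursor movement.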
