Eye Tracking System with OpenCV and MediaPipe

Real-time eye tracking pipeline built with Python, OpenCV, and MediaPipe FaceMesh. Tracks iris position, estimates head pose in 3D, detects fixations and blinks, maps gaze to Areas of Interest, and exports per-frame metrics to CSV for offline analysis.

Demo

Features

Feature	Details
Iris gaze estimation	Tracks iris position within eye bounds using MediaPipe's 478-point mesh (landmarks 468–477). Outputs normalized horizontal/vertical gaze ratios.
3D head pose	`cv2.solvePnP` on 6 facial landmarks → roll, pitch, yaw in degrees. Axes drawn on nose tip in real time.
Blink detection	Eye Aspect Ratio (EAR) formula on both eyes; blink flagged when avg EAR < 0.20.
Gaze direction estimation	Fuses iris ratios with head-pose yaw/pitch to produce a head-independent gaze direction vector (dir_h, dir_v) in [-1, 1]. Visualised as a live miniature indicator overlay.
3D point cloud export	`--export-ply` writes two PLY files on exit: (1) sampled face mesh landmarks coloured by time, (2) 3D gaze ray endpoints coloured by horizontal position. Viewable in MeshLab, CloudCompare, or Open3D.
Fixation detection	Velocity-based classifier: gaze velocity < 25 px/s for ≥ 100 ms = fixation. Completed fixations logged with duration and position.
AOI tracking	Configurable rectangular Areas of Interest with per-AOI dwell time accumulation.
Heatmap overlay	Gaussian-blurred JET colormap overlaid on the live frame.
Gaze attention classifier	sklearn Random Forest trained on 5 gaze features → predicts `on_screen` / `peripheral` / `away` with >95% cross-validation accuracy on the synthetic training data. Demonstrated in `notebooks/classifier.ipynb`.
CSV export	Per-frame record saved to `data/gaze_<timestamp>.csv` on exit.
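
The blink rule in the table (average EAR below 0.20) can be illustrated with the commonly used six-point Eye Aspect Ratio formula. This is a minimal sketch, not the code in `eye_tracker.py`: the function names, the landmark ordering, and the toy coordinates are assumptions made for illustration.

```python
import math

EAR_BLINK_THRESHOLD = 0.20  # threshold quoted in the feature table


def eye_aspect_ratio(pts):
    """EAR for six (x, y) eye landmarks ordered p1..p6 (assumed ordering).

    p1 and p4 are the horizontal eye corners; (p2, p6) and (p3, p5)
    are the two upper/lower eyelid pairs.
    """
    p1, p2, p3, p4, p5, p6 = pts
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def is_blink(left_pts, right_pts, threshold=EAR_BLINK_THRESHOLD):
    """Flag a blink when the mean EAR of both eyes drops below the threshold."""
    avg = (eye_aspect_ratio(left_pts) + eye_aspect_ratio(right_pts)) / 2.0
    return avg < threshold


# Toy landmark sets: an open eye (tall) and a nearly closed eye (flat).
open_eye = [(0, 0), (10, -5), (20, -5), (30, 0), (20, 5), (10, 5)]
closed_eye = [(0, 0), (10, -1), (20, -1), (30, 0), (20, 1), (10, 1)]

print(is_blink(open_eye, open_eye))      # open eyes
print(is_blink(closed_eye, closed_eye))  # closed eyes
```

Because EAR is a ratio of distances, it stays roughly constant as the face moves closer to or further from the camera, which is why a fixed threshold is workable.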

Installation

conda (recommended):

git clone https://github.com/apayne185/cv2-eye-tracking-system.git
cd cv2-eye-tracking-system
conda env create -f environment.yml
conda activate eyetrack

pip / venv:

pip install -r requirements.txt

Requirements: Python 3.11+, opencv-python, numpy, mediapipe, pandas, scipy

Windows note: tested on Windows 11 with Python 3.12. The conda env is recommended — a pytest.ini is included that suppresses a known conflict between the dash pytest plugin and mediapipe's DLL initialisation on Windows.

Usage

# Default: webcam 0
python src/main.py

# Specific webcam index
python src/main.py --source 1

# Process a recorded video file
python src/main.py --source path/to/video.mp4

# Custom output directory
python src/main.py --source 0 --output-dir results/

# Export PLY point clouds (face mesh + gaze trajectory)
python src/main.py --source 0 --export-ply

Press q to quit — the session CSV and heatmap are saved automatically.

Output

Session summary (printed on exit and saved to `summary_<timestamp>.txt`)

--- Session Summary ---
Frames recorded:  2500
Blinks detected:  124
Fixation frames:  438  (17.5%)
Fixations:        39  avg=0.20s  max=0.65s

AOI dwell (frames):
  Center: 2158  (86.3%)
  Left:   120   (4.8%)

AOI dwell (seconds):
  Center: 78.77s
  Left:    4.14s
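
The per-AOI frame counts and seconds in the summary above could be accumulated along these lines. This is a hypothetical sketch and not the `AOITracker` class in `src/AOI.py`: the `AOI` and `DwellTracker` names, the first-match rule for overlapping rectangles, and the rectangle coordinates are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class AOI:
    """Axis-aligned rectangle in pixel coordinates (hypothetical layout)."""
    name: str
    x: float
    y: float
    w: float
    h: float

    def contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


class DwellTracker:
    """Accumulates dwell frames and dwell seconds per AOI."""

    def __init__(self, aois):
        self.aois = list(aois)
        self.frames = {a.name: 0 for a in self.aois}
        self.seconds = {a.name: 0.0 for a in self.aois}

    def update(self, px, py, dt):
        """Credit the first AOI containing (px, py); return its name or None."""
        for aoi in self.aois:
            if aoi.contains(px, py):
                self.frames[aoi.name] += 1
                self.seconds[aoi.name] += dt
                return aoi.name
        return None


tracker = DwellTracker([AOI("Center", 100, 100, 200, 200),
                        AOI("Left", 0, 100, 100, 200)])
hits = [tracker.update(150, 150, 0.5),
        tracker.update(50, 150, 0.25),
        tracker.update(1000, 1000, 0.1)]
print(hits, tracker.frames, tracker.seconds)
```

Passing the measured frame interval as `dt` rather than assuming a fixed frame rate keeps the seconds total meaningful when the capture rate fluctuates.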

CSV schema

Column	Description
`frame`	Frame index
`timestamp`	Unix timestamp
`gaze_x`, `gaze_y`	Iris center in pixel coordinates
`gaze_ratio_h`, `gaze_ratio_v`	Normalized gaze position within eye (0–1)
`pitch`, `yaw`, `roll`	Head Euler angles in degrees
`left_ear`, `right_ear`	Eye Aspect Ratio per eye
`is_blink`	Boolean
`is_fixation`	Boolean
`dir_h`, `dir_v`	Estimated gaze direction in [-1, 1] (iris + head pose fused)
`ray_ox`, `ray_oy`, `ray_oz`	3D gaze ray origin (eye midpoint in camera coords, mm)
`ray_dx`, `ray_dy`, `ray_dz`	3D gaze ray unit direction vector in camera coords
`active_aoi`	Name of active Area of Interest, or null
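
As a rough illustration of how these columns can be consumed, the sketch below recomputes a few summary numbers from the `is_blink` and `is_fixation` columns using only the standard library. It assumes booleans are serialised as `True`/`False` and counts a blink as each transition from not-blinking to blinking; both are assumptions, and the project may count blinks differently. The inline sample stands in for a real `data/gaze_<timestamp>.csv`.

```python
import csv
import io


def summarise(rows):
    """Count frames, blink events (rising edges), and fixation frames."""
    frames = blinks = fixation_frames = 0
    prev_blink = False
    for row in rows:
        frames += 1
        blink = row["is_blink"].strip().lower() == "true"
        if blink and not prev_blink:
            blinks += 1  # a new blink starts on this frame
        prev_blink = blink
        if row["is_fixation"].strip().lower() == "true":
            fixation_frames += 1
    return {"frames": frames, "blinks": blinks, "fixation_frames": fixation_frames}


# Tiny stand-in for a session file, restricted to the columns used here.
sample = io.StringIO(
    "frame,is_blink,is_fixation\n"
    "0,False,True\n"
    "1,True,False\n"
    "2,True,False\n"
    "3,False,True\n"
    "4,True,False\n"
)
stats = summarise(csv.DictReader(sample))
print(stats)
```

For a real session, replace `sample` with an open file handle on the exported CSV.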

Project Structure

cv2-eye-tracking-system/
├── src/
│   ├── main.py             # Entry point — argparse, main loop, CSV export
│   ├── eye_tracker.py      # EyeTracker class: iris gaze, EAR blink, fixation
│   ├── head_pose.py        # HeadPoseEstimator: solvePnP, draw_axes, gaze ray
│   ├── direction.py        # GazeDirectionEstimator: 2D direction + 3D gaze ray
│   ├── face_mesh_3d.py     # PLY point cloud export: face mesh + gaze trajectory
│   ├── gaze_analysis.py    # Heatmap accumulator and renderer
│   ├── AOI.py              # AOITracker class with dwell-time accumulation
│   └── old_work/           # Legacy scripts (reference only)
├── notebooks/
│   ├── analysis.ipynb      # Offline session analysis — plots, heatmap, stats
│   └── classifier.ipynb    # ML training pipeline — RF vs SVM vs MLP, CV, confusion matrix
├── tests/
│   ├── conftest.py              # sys.path setup for src/ imports
│   ├── test_direction.py        # Direction estimator unit tests
│   ├── test_fixation.py         # Fixation state machine unit tests
│   ├── test_gaze_analysis.py    # Heatmap accumulator unit tests
│   ├── test_face_mesh_3d.py     # PLY export unit tests
│   └── test_gaze_classifier.py  # Classifier training, inference, persistence
├── data/                   # Session output (CSV, heatmap, summary) — gitignored
├── eye_gaze_heatmap.jpg    # Sample heatmap output
├── pytest.ini              # Disables dash plugin (Windows mediapipe compatibility)
├── environment.yml         # conda env (Python 3.11, eyetrack)
├── requirements.txt
└── README.md

Technical Notes

Why iris landmarks over eye center averaging?
The earlier approach averaged the positions of all eye outline landmarks, which tracks face movement but not gaze direction. The iris landmarks (MediaPipe 468–477, enabled via refine_landmarks=True) give the actual pupil/iris position, so moving your eyes while keeping your head still produces a meaningful signal.
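
One way to turn an iris position into the normalized gaze ratios described above is to express the iris centre relative to the eye's bounding box. This is a sketch under assumptions: `iris_ratio` is a hypothetical helper, and the project may derive its eye bounds from different landmarks.

```python
def iris_ratio(iris_center, eye_min, eye_max):
    """Normalised iris position inside the eye box, each axis clamped to [0, 1].

    iris_center, eye_min, eye_max are (x, y) pixel tuples; eye_min/eye_max
    are the top-left and bottom-right corners of the eye bounds.
    """
    def norm(value, lo, hi):
        if hi <= lo:
            return 0.5  # degenerate box: report "centred"
        return min(1.0, max(0.0, (value - lo) / (hi - lo)))

    ratio_h = norm(iris_center[0], eye_min[0], eye_max[0])
    ratio_v = norm(iris_center[1], eye_min[1], eye_max[1])
    return ratio_h, ratio_v


# The same eye shifted 100 px across the frame yields the same ratios,
# so the signal reflects eye movement rather than face translation.
a = iris_ratio((12, 5), (10, 0), (20, 10))
b = iris_ratio((112, 5), (110, 0), (120, 10))
print(a, b)
```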

Head pose as gaze context
solvePnP maps six 2D facial landmarks to a known 3D face model to recover the rotation matrix. Roll/pitch/yaw complement the iris ratios: a centered iris with a 30° yaw still points off-center in world space. These extrinsics feed directly into the 3D gaze ray computation.
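
To make the rotation-matrix-to-angles step concrete, here is a small sketch that decomposes a rotation matrix into pitch, yaw, and roll. It assumes the convention R = Rz(roll) · Ry(yaw) · Rx(pitch); the source does not state which convention `head_pose.py` uses, so treat the axis assignment and signs as assumptions. With OpenCV, the matrix itself would typically come from passing the `solvePnP` rotation vector through `cv2.Rodrigues`.

```python
import math


def matmul(a, b):
    """3x3 matrix product on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_to_matrix(pitch, yaw, roll):
    """Build R = Rz(roll) @ Ry(yaw) @ Rx(pitch); angles in degrees."""
    a, b, g = (math.radians(v) for v in (pitch, yaw, roll))
    rx = [[1, 0, 0], [0, math.cos(a), -math.sin(a)], [0, math.sin(a), math.cos(a)]]
    ry = [[math.cos(b), 0, math.sin(b)], [0, 1, 0], [-math.sin(b), 0, math.cos(b)]]
    rz = [[math.cos(g), -math.sin(g), 0], [math.sin(g), math.cos(g), 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))


def matrix_to_euler(r):
    """Recover (pitch, yaw, roll) in degrees; valid away from yaw = +/-90."""
    yaw = math.atan2(-r[2][0], math.hypot(r[0][0], r[1][0]))
    pitch = math.atan2(r[2][1], r[2][2])
    roll = math.atan2(r[1][0], r[0][0])
    return tuple(math.degrees(v) for v in (pitch, yaw, roll))


# Round trip: angles -> matrix -> angles.
angles = matrix_to_euler(euler_to_matrix(10.0, 30.0, -5.0))
print(angles)
```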

Fixation vs. saccade
The velocity threshold (25 px/s) follows the I-VT (Identification by Velocity Threshold) algorithm common in psychophysics research. Saccades typically exceed 300 px/s; the threshold is conservative to reduce noise from head micro-movements.
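
The rule stated above (velocity below 25 px/s sustained for at least 100 ms) can be sketched as a simple pass over timestamped gaze samples. This is an illustrative I-VT variant, not the state machine in `eye_tracker.py`; the function name, the sample format, and the choice to report the mean position are assumptions.

```python
import math

VELOCITY_THRESHOLD = 25.0  # px/s, from the feature description
MIN_DURATION = 0.100       # seconds


def detect_fixations(samples, v_max=VELOCITY_THRESHOLD, min_dur=MIN_DURATION):
    """samples: time-sorted list of (t_seconds, x_px, y_px).

    Returns a list of dicts with start time, duration, and mean position.
    """
    fixations = []
    run = []  # consecutive samples joined by slow (sub-threshold) movement

    def flush():
        if len(run) >= 2 and run[-1][0] - run[0][0] >= min_dur:
            fixations.append({
                "start": run[0][0],
                "duration": run[-1][0] - run[0][0],
                "x": sum(s[1] for s in run) / len(run),
                "y": sum(s[2] for s in run) / len(run),
            })
        run.clear()

    for prev, cur in zip(samples, samples[1:]):
        dt = cur[0] - prev[0]
        if dt <= 0:
            continue  # ignore duplicate or out-of-order timestamps
        speed = math.dist(prev[1:], cur[1:]) / dt
        if speed < v_max:
            if not run:
                run.append(prev)
            run.append(cur)
        else:
            flush()  # fast movement ends the current run
    flush()
    return fixations


# 30 fps: ten still samples, a jump of 300 px, then two still samples.
samples = [(i / 30, 100.0, 100.0) for i in range(10)]
samples += [(10 / 30, 400.0, 100.0), (11 / 30, 400.0, 100.0)]
fixations = detect_fixations(samples)
print(fixations)
```

In this toy input the first run lasts 0.3 s and is reported, while the two samples after the jump span only about 33 ms and are discarded as too short.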

Gaze direction fusion
Iris ratios alone are relative to the eye socket — they correctly detect eye movement but are blind to head rotation. solvePnP yaw and pitch capture head orientation but ignore where the eyes point within the socket. GazeDirectionEstimator linearly combines both signals: dir_h = iris_deviation * EYE_SCALE + yaw * HEAD_SCALE. The weights are empirically tuned; a calibration step (mapping known gaze targets to measured ratios) would improve absolute accuracy.
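
The linear combination above can be written out as follows. The weight values, the mapping from a 0–1 iris ratio to a signed deviation, the sign conventions, and the symmetric treatment of the vertical axis are all assumptions for illustration; the tuned constants in `direction.py` are not given in this README.

```python
EYE_SCALE = 1.0          # hypothetical weight on the iris deviation
HEAD_SCALE = 1.0 / 45.0  # hypothetical: 45 degrees of head rotation saturates


def clamp(value, lo=-1.0, hi=1.0):
    return max(lo, min(hi, value))


def fuse_direction(ratio_h, ratio_v, yaw_deg, pitch_deg):
    """Combine iris ratios (0..1) with head yaw/pitch (degrees) into [-1, 1]."""
    # Map the 0..1 ratio to a signed deviation where 0 means a centred iris.
    iris_dev_h = (ratio_h - 0.5) * 2.0
    iris_dev_v = (ratio_v - 0.5) * 2.0
    dir_h = clamp(iris_dev_h * EYE_SCALE + yaw_deg * HEAD_SCALE)
    dir_v = clamp(iris_dev_v * EYE_SCALE + pitch_deg * HEAD_SCALE)
    return dir_h, dir_v


# A centred iris with the head turned 30 degrees still yields an
# off-centre horizontal direction.
print(fuse_direction(0.5, 0.5, 30.0, 0.0))
```

A calibration step of the kind mentioned above would replace the fixed scales with values fitted to known gaze targets.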

PLY point clouds and the gaze trajectory
The face mesh export writes MediaPipe's 478 per-landmark 3D coordinates (x, y in pixel space; z at the same relative scale) as a binary PLY file — the format used by depth cameras, LiDAR scanners, and 3D reconstruction pipelines. The gaze trajectory cloud projects each session's 3D gaze rays onto a virtual plane at 500 mm depth, producing a spatial map of where the subject's attention landed. Both files can be opened directly in MeshLab, CloudCompare, or Open3D for inspection.
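
Projecting a gaze ray onto a virtual plane is a ray-plane intersection. The sketch below assumes the plane is z = 500 mm in the same coordinate frame as the ray and that rays pointing away from the plane are dropped; the README does not specify either detail, so both are assumptions, as is the `project_to_plane` name.

```python
PLANE_DEPTH_MM = 500.0  # depth of the virtual plane, from the description above


def project_to_plane(origin, direction, depth=PLANE_DEPTH_MM):
    """Intersect the ray origin + t * direction (t >= 0) with the plane z = depth.

    Returns the (x, y, z) hit point, or None if the ray is parallel to the
    plane or points away from it.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dz) < 1e-9:
        return None  # ray parallel to the plane
    t = (depth - oz) / dz
    if t < 0:
        return None  # plane is behind the ray origin
    return (ox + t * dx, oy + t * dy, depth)


# A unit direction tilted towards +x: t = 500 / 0.8 = 625, so x = 375.
hit = project_to_plane((0.0, 0.0, 0.0), (0.6, 0.0, 0.8))
print(hit)
```

The inputs correspond to the `ray_ox`/`ray_oy`/`ray_oz` and `ray_dx`/`ray_dy`/`ray_dz` columns of the session CSV.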

Offline Analysis

Open the notebook to analyse any recorded session CSV:

conda activate eyetrack
jupyter lab notebooks/analysis.ipynb

The notebook loads the most recent data/gaze_*.csv automatically; set CSV_PATH manually to analyse a specific session. It produces the following plots: gaze scatter, EAR/blink, fixation timeline, head yaw, AOI dwell, and gaze heatmap.

Gaze attention classifier

conda activate eyetrack
jupyter lab notebooks/classifier.ipynb

Trains a Random Forest on synthetic gaze data (1800 samples, 3 classes) and demonstrates:

Feature distribution visualisation
Train/test split with classification report
5-fold cross-validation vs SVM and MLP
Confusion matrix and feature importances
Applying the model to a real session CSV (Section 7)

Running the tests

conda activate eyetrack
pip install pytest
pytest tests/ -v

The suite contains 29 tests covering direction estimation, face mesh export, fixation detection, and heatmap accumulation.
