SkiSense

English | 日本語

A computer-vision tool that estimates a skier's posture from video and quantitatively evaluates their form via joint angles.

Overview

SkiSense detects a skier in a clip, estimates their pose, and scores key joint angles (knee, hip, ankle, shoulder tilt) against ideal ranges drawn from competitive kihon (technical) skiing. It is a visualization and quantitative-feedback tool, not an automatic judge. The annotated video, the best-scoring frame, and per-joint readouts are written to disk.
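To make "joint angle" concrete: one common definition is the angle at the middle of three keypoints, for example hip, knee, ankle for the knee. The sketch below assumes that definition and is not taken from pose_analyzer.py; `joint_angle` is a hypothetical helper name.

```python
import math
from typing import Sequence


def joint_angle(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> float:
    """Angle in degrees at point b, between the segments b->a and b->c.

    Accepts 2D or 3D points, as long as all three have the same dimension.
    """
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("degenerate joint: two keypoints coincide")
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Example: hip above the knee, ankle out to the side of it.
knee_deg = joint_angle((0, 1, 0), (0, 0, 0), (1, 0, 0))
```

With 3D keypoints the result does not depend on the camera angle, which is presumably why the 3D backend is the default for knee, hip, and ankle.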

Features

Person detection — YOLOv8x
Pose estimation — selectable backend: SAM 3D Body (3D MHR-21, default) or YOLO11-Pose (2D COCO-17). See Pose backends
Joint-angle evaluation — knee / hip / ankle in 3D, shoulder tilt in 2D
Overall score — 0–100 over the angles that can be measured
Multi-person tracking — Deep SORT keeps a stable ID across frames
Auto zoom & centering — keeps the skier's torso centered
Best-shot extraction — saves the highest-scoring frame
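One way a 0–100 score over only the measurable angles could work is sketched below. Everything specific here is an assumption for illustration: the ideal ranges, the linear falloff, and the names `IDEAL_RANGES`, `score_angle`, and `overall_score` are hypothetical. The project's actual scoring logic is described in docs/project_details_ja.md.

```python
from typing import Dict, Optional

# Hypothetical ideal ranges in degrees; not the project's real values.
IDEAL_RANGES = {
    "knee": (90.0, 130.0),
    "hip": (80.0, 120.0),
    "ankle": (70.0, 100.0),
}
FALLOFF_DEG = 30.0  # assumed: score reaches 0 this many degrees outside the range


def score_angle(angle: float, low: float, high: float) -> float:
    """Return 100 inside [low, high], falling linearly to 0 outside it."""
    if low <= angle <= high:
        return 100.0
    distance = low - angle if angle < low else angle - high
    return max(0.0, 100.0 * (1.0 - distance / FALLOFF_DEG))


def overall_score(angles: Dict[str, Optional[float]]) -> Optional[float]:
    """Average the per-joint scores, skipping joints that could not be measured."""
    scores = [
        score_angle(value, *IDEAL_RANGES[name])
        for name, value in angles.items()
        if value is not None and name in IDEAL_RANGES
    ]
    return sum(scores) / len(scores) if scores else None
```

Passing `None` for a joint (for instance the ankle when no foot landmark is available) simply removes it from the average instead of counting it as zero.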

Quick start

Install PyTorch first, using the wheel index that matches your CUDA version (the example below targets CUDA 12.1):

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
# Only needed for the SAM 3D Body backend (see docs/pose_backends.md):
pip install -r requirements-sam3d.txt

Run:

python run.py video.mp4            # process a video (input/video.mp4 by default)
python run.py --fast video.mp4     # skip per-frame detect/track; one pass per frame
python run.py skier.jpg --image    # process a single image

Output is written to output/YYYYMMDD_HHMMSS/ (video_pose.mp4, best_shot.jpg, and a copy of the input).
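The timestamped folder name follows the usual strftime pattern. A minimal sketch of how such a directory could be created is shown below; `make_run_dir` is a hypothetical helper, not a function from this repository.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_run_dir(root: str = "output", now: Optional[datetime] = None) -> Path:
    """Create and return root/YYYYMMDD_HHMMSS for the given (or current) time."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    run_dir = Path(root) / stamp
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir
```

Because the name sorts lexicographically in time order, the most recent run is always the last entry in a sorted listing of output/.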

The default sam3d backend uses gated Hugging Face weights: request access to facebook/sam-3d-body-dinov3, then run hf auth login once before the first run. To skip this step (or to run on CPU), use the yolo11 backend instead.

Pose backends

The pose engine is selected in .env:

SKISENSE_POSE_BACKEND=sam3d    # default: SAM 3D Body (3D, CUDA required)
SKISENSE_POSE_BACKEND=yolo11   # YOLO11-Pose (2D, runs on CPU/MPS/CUDA)

SAM 3D Body — 3D MHR keypoints + body mesh; view-invariant joint angles including the ankle. CUDA required; ~1–2 s/frame.
YOLO11-Pose — 2D COCO-17 keypoints; fast and CPU-capable, but the ankle angle is N/A (no foot landmark).
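As an illustration of how the setting might be consumed, the sketch below reads SKISENSE_POSE_BACKEND and looks up the traits listed above. It assumes the .env values have already been loaded into the process environment (for example by a dotenv loader); `resolve_pose_backend` and `BACKEND_TRAITS` are hypothetical names, not the project's API.

```python
import os
from typing import Mapping, Optional

# Traits taken from the backend descriptions above.
BACKEND_TRAITS = {
    "sam3d": {"dims": 3, "needs_cuda": True, "ankle_angle": True},
    "yolo11": {"dims": 2, "needs_cuda": False, "ankle_angle": False},
}
DEFAULT_BACKEND = "sam3d"


def resolve_pose_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured backend name, falling back to the default."""
    env = os.environ if env is None else env
    name = env.get("SKISENSE_POSE_BACKEND", DEFAULT_BACKEND).strip().lower()
    if name not in BACKEND_TRAITS:
        valid = ", ".join(sorted(BACKEND_TRAITS))
        raise ValueError(f"unknown pose backend {name!r}; expected one of: {valid}")
    return name


backend = resolve_pose_backend({"SKISENSE_POSE_BACKEND": "yolo11"})
can_measure_ankle = BACKEND_TRAITS[backend]["ankle_angle"]  # False for yolo11
```

Failing fast on an unknown name avoids silently falling back to a backend that needs CUDA or gated weights.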

Full comparison, settings, and trade-offs: docs/pose_backends.md.

Architecture

Each frame runs a three-step pipeline:

Detection & tracking — YOLOv8x detects persons; Deep SORT assigns persistent track IDs.
Pose estimation — the selected backend returns keypoints (3D + 2D for SAM 3D Body, 2D for YOLO11-Pose); pose_analyzer evaluates joint angles.
Rendering — ZoomTracker applies smooth zoom, skeleton/bbox are drawn through the single transform_point_to_zoom() choke point, and the info panel is overlaid.
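The rendering step can be pictured as two small operations: easing the zoom state toward a target each frame, and mapping every drawn point through one function. The sketch below assumes exponential smoothing and a center-plus-scale transform; `ZoomState`, `smooth`, and `to_zoom_coords` are hypothetical names, and the real ZoomTracker and transform_point_to_zoom() may differ in both signature and behavior.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ZoomState:
    cx: float     # crop center x, in source-frame pixels
    cy: float     # crop center y, in source-frame pixels
    scale: float  # zoom factor (values above 1 magnify)


def smooth(prev: ZoomState, target: ZoomState, alpha: float = 0.2) -> ZoomState:
    """Exponential smoothing: move a fraction alpha toward the target each frame."""
    def lerp(p: float, t: float) -> float:
        return p + alpha * (t - p)

    return ZoomState(
        lerp(prev.cx, target.cx),
        lerp(prev.cy, target.cy),
        lerp(prev.scale, target.scale),
    )


def to_zoom_coords(
    x: float, y: float, state: ZoomState, out_w: int, out_h: int
) -> Tuple[float, float]:
    """Map a source-frame point to output pixels; the crop center lands mid-frame."""
    return (
        (x - state.cx) * state.scale + out_w / 2.0,
        (y - state.cy) * state.scale + out_h / 2.0,
    )
```

Routing skeleton joints and bounding-box corners through a single mapping like this keeps the overlays aligned with the zoomed image, since there is only one place where the transform can go wrong.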

Key modules: config.py, pose_topology.py, backends/, pose_analyzer.py, zoom_tracker.py, main.py, image_processor.py.

More (日本語)

Project background, design rationale, scoring logic, and lessons learned (Japanese): README_ja.md / docs/project_details_ja.md

License

MIT License. See LICENSE.
