🌟 StarNet-Pose

Heatmap-Free Lightweight Pose Estimation via Multiplicative Feature Interaction and Occlusion-Aware Training

🛠️ Installation | 🚀 Quick Start | 📊 Model Zoo | 📈 Ablations | 📜 Citation


📄 Introduction

StarNet-Pose is an efficient 2D human pose estimation framework built on the lightweight StarNet backbone, which relies on element-wise multiplicative feature interaction. Among the lightweight top-down pose estimators compared in the Model Zoo, it reaches the highest AP on COCO val2017 (up to 72.99) while running at up to 313.8 FPS on an NVIDIA RTX 5090.

The core contributions:

  • StarNet Backbone: A lightweight CNN architecture based on the "Star Operation" (element-wise multiplication of two feature branches), achieving strong feature representation with significantly fewer parameters than conventional backbones.
  • Multiplicative Feature Interaction: Replaces traditional additive feature fusion (e.g., skip connections in HRNet) with multiplicative interaction, enabling richer feature representation at lower computational cost.
  • Occlusion-Aware Training: Systematic evaluation of CoarseDropout augmentation strategies for improving robustness under heavy occlusion.
  • Two Efficient Variants: StarNet-Pose-T (3.5M params, 72.00 AP, 313.8 FPS) and StarNet-Pose-S (6.4M params, 72.99 AP, 173.5 FPS).

This repository provides the official implementation of "StarNet-Pose: Heatmap-Free Lightweight Pose Estimation via Multiplicative Feature Interaction and Occlusion-Aware Training", accepted at Neurocomputing.

🔗 Built upon MMPose, the OpenMMLab Pose Estimation Toolbox.

✨ Key Features

| Feature | Description |
| --- | --- |
| 🪶 Lightweight | Two variants: 3.5M (Tiny) and 6.4M (Small) parameters |
| ⚡ Ultra-Fast | Up to 313.8 FPS (Tiny) on RTX 5090 |
| 🎯 High Precision | 72.99 AP (Small) on COCO val2017, surpassing RTMPose-s by +1.4 AP |
| 🛡️ Occlusion Robust | Occlusion-aware training via CoarseDropout augmentation |
| 🔌 Plug-and-Play | Compatible with RTMPose head and full MMPose ecosystem |
| 📦 Two Variants | StarNet-Pose-T (Tiny) and StarNet-Pose-S (Small) |

🛠️ Installation

Prerequisites

  • Python >= 3.8
  • PyTorch >= 1.8 (Install Guide)
  • CUDA (optional, for GPU training)

Step 1: Install MMPose

StarNet-Pose is built upon MMPose. First install the base framework:

# Install MMEngine & MMCV
pip install mmengine mmcv

# Install MMPose (the full framework)
git clone https://github.com/open-mmlab/mmpose.git
cd mmpose
pip install -e .
cd ..

Step 2: Install StarNet-Pose

git clone https://github.com/lechan775/starnet-pose.git
cd starnet-pose

# Install dependencies
pip install -r requirements.txt

# Copy backbone modules into your MMPose installation
cp -r mmpose/models/backbones/* <mmpose_install_path>/mmpose/models/backbones/
cp -r configs/body_2d_keypoint/rtmpose/coco/* <mmpose_install_path>/configs/body_2d_keypoint/rtmpose/coco/
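
The `<mmpose_install_path>` placeholder above is the root of your MMPose checkout. If you are unsure where that is, the following minimal sketch locates the directory that contains an importable package using only the standard library. It is not part of this repository: the helper name `find_package_root` is hypothetical, and it assumes MMPose was installed in editable mode as in Step 1, so that the `mmpose/` package sits directly under the checkout root.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_root(package: str = "mmpose") -> Optional[Path]:
    """Return the directory that contains `package`, or None if it is not importable."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    # For a regular package, spec.origin is <root>/<package>/__init__.py
    return Path(spec.origin).resolve().parent.parent


if __name__ == "__main__":
    root = find_package_root("mmpose")
    print(root if root is not None else "mmpose is not importable in this environment")
```

The printed path can then be substituted for `<mmpose_install_path>` in the two `cp` commands.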

Step 3: Download Pre-trained Weights

Pre-trained StarNet backbone weights (ImageNet-1K) can be downloaded from the Rewrite-the-Stars GitHub Releases:

# StarNet-S1 backbone (for StarNet-Pose-T)
wget https://github.com/ma-xu/Rewrite-the-Stars/releases/download/checkpoints_v1/starnet_s1.pth.tar

# StarNet-S3 backbone (for StarNet-Pose-S)
wget https://github.com/ma-xu/Rewrite-the-Stars/releases/download/checkpoints_v1/starnet_s3.pth.tar

Full StarNet-Pose checkpoints are available at GitHub Releases.
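
Checkpoints saved as `.pth.tar` often wrap the weights in an outer dictionary instead of storing a bare state dict. As a hedged sketch, the weights can be unwrapped as shown below. The helper `extract_state_dict` is hypothetical, and both the `state_dict` key and the `module.` prefix are assumptions about how the files may have been saved, so inspect the loaded object before relying on them.

```python
from typing import Any, Dict, Mapping


def extract_state_dict(checkpoint: Mapping[str, Any], prefix: str = "module.") -> Dict[str, Any]:
    """Unwrap a checkpoint dict and strip an optional key prefix."""
    # Assumption: weights are either nested under "state_dict" or stored at the top level
    state = checkpoint.get("state_dict", checkpoint)
    cleaned = {}
    for key, value in state.items():
        # DataParallel-style checkpoints prepend "module." to every parameter name
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Plain floats stand in for tensors in this toy example
wrapped = {"epoch": 300, "state_dict": {"module.stem.0.weight": 1.0, "stem.1.bias": 2.0}}
print(extract_state_dict(wrapped))  # {'stem.0.weight': 1.0, 'stem.1.bias': 2.0}
```

With PyTorch, the input would typically come from `torch.load("starnet_s1.pth.tar", map_location="cpu")`, and the result can be passed to the backbone's `load_state_dict`.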

🚀 Quick Start

Standalone StarNet (no MMPose dependency)

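import torch
from starnet import starnet_s3

# Load pre-trained StarNet-S3 (ImageNet-1K weights) and switch to inference mode
model = starnet_s3(pretrained=True)
model.eval()

# Run a dummy 224x224 RGB image through the classifier without tracking gradients
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)  # shape (1, 1000): 1000-way ImageNet logits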

Training on COCO

cd <mmpose_install_path>

# Train StarNet-Pose-S (recommended: 8 GPUs); the trailing 8 is the GPU count
bash tools/dist_train.sh \
    configs/body_2d_keypoint/rtmpose/coco/rtmpose_starnetca-s3_8xb256-420e_coco-256x192.py 8

# Quick training run on COCO-mini for fast evaluation of changes
python tools/train.py \
    configs/body_2d_keypoint/rtmpose/coco/rtmpose_starnetca-s3_8xb256-420e_coco-256x192_mini.py

Evaluation

# Evaluate trained model on COCO val2017
python tools/test.py \
    configs/body_2d_keypoint/rtmpose/coco/rtmpose_starnetca-s3_8xb256-420e_coco-256x192.py \
    <path_to_checkpoint>.pth

FLOPs & Speed Benchmark

# Compute FLOPs and parameter count
python tools/analysis_tools/get_flops.py \
    configs/body_2d_keypoint/rtmpose/coco/rtmpose_starnetca-s3_8xb256-420e_coco-256x192.py

# Measure inference latency and FPS
python tools/benchmark_latency.py \
    configs/body_2d_keypoint/rtmpose/coco/rtmpose_starnetca-s3_8xb256-420e_coco-256x192.py \
    <path_to_checkpoint>.pth
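
For readers who want to sanity-check numbers outside `tools/benchmark_latency.py`, here is a minimal timing sketch. It is not the repository's benchmark script: `measure_latency` and its parameters are hypothetical, and it only shows the usual pattern of warming up, averaging over many calls, and deriving FPS as 1000 divided by the mean latency in milliseconds.

```python
import time
from typing import Callable, Tuple


def measure_latency(fn: Callable[[], object], warmup: int = 10, iters: int = 100) -> Tuple[float, float]:
    """Return (mean latency in ms, FPS) for a zero-argument callable."""
    if iters <= 0:
        raise ValueError("iters must be positive")
    # Warm-up calls are excluded from timing (caches, lazy initialisation, etc.)
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    latency_ms = 1000.0 * elapsed / iters
    fps = 1000.0 / latency_ms if latency_ms > 0 else float("inf")
    return latency_ms, fps


if __name__ == "__main__":
    # Stand-in workload; replace with a model forward pass at batch size 1
    latency_ms, fps = measure_latency(lambda: sum(i * i for i in range(10_000)))
    print(f"latency: {latency_ms:.3f} ms, FPS: {fps:.1f}")
```

When timing a GPU model in PyTorch, call `torch.cuda.synchronize()` before reading the clock, since CUDA kernels are launched asynchronously.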

📊 Model Zoo

Main Results on COCO val2017 (256×192)

| Model | Backbone | Params (M) | GFLOPs | AP | AP⁵⁰ | AP⁷⁵ | AR | Latency (ms) | FPS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| StarNet-Pose-T | StarNet-S1 | 3.504 | 0.435 | 72.00 | 91.51 | 79.62 | 75.03 | 3.19 | 313.8 |
| StarNet-Pose-S | StarNet-S3 + CA | 6.428 | 0.765 | 72.99 | 91.64 | 80.69 | 76.01 | 5.76 | 173.5 |

⚡ FPS measured on an NVIDIA RTX 5090 under a unified PyTorch/MMPose forward benchmark (batch=1). StarNet-Pose-T surpasses RTMPose-t by +3.8 AP while running at a higher frame rate (313.8 vs. 285.1 FPS).

Comparison with Lightweight SOTA (COCO val2017, 256×192)

| Method | Backbone | Params (M) | GFLOPs | AP | AP⁵⁰ | AP⁷⁵ | AR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RTMPose-t | CSPNeXt-t | 3.34 | 0.360 | 68.20 | 88.30 | 75.90 | 73.60 |
| RTMPose-s | CSPNeXt-s | 5.47 | 0.680 | 71.60 | 89.20 | 78.90 | 76.80 |
| Lite-HRNet-30 | Lite-HRNet-30 | 1.80 | 0.319 | 67.20 | 88.00 | 75.00 | 73.30 |
| X-HRNet-30 | X-HRNet-30 | 2.10 | 0.300 | 67.40 | 87.50 | 75.40 | 73.50 |
| LMFormer-L | LMFormer-L | 4.10 | 1.400 | 68.90 | 88.30 | 76.40 | 74.70 |
| LGM-Pose | LGM-Pose | 1.10 | 0.600 | 69.30 | 89.50 | 76.20 | 73.70 |
| StarNet-Pose-T | StarNet-S1 | 3.504 | 0.435 | 72.00 | 91.51 | 79.62 | 75.03 |
| StarNet-Pose-S | StarNet-S3 + CA | 6.428 | 0.765 | 72.99 | 91.64 | 80.69 | 76.01 |

Efficiency-Oriented Comparison

| Model | Params (M) | GFLOPs | AP | AP⁷⁵ | AR | Latency (ms) | FPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Lite-HRNet-18 | 1.10 | 0.205 | 64.80 | 73.00 | 71.20 | 18.14 | 55.1 |
| RTMPose-t | 3.34 | 0.360 | 68.20 | 75.90 | 73.60 | 3.51 | 285.1 |
| StarNet-Pose-T | 3.504 | 0.435 | 72.00 | 79.62 | 75.03 | 3.19 | 313.8 |
| Lite-HRNet-30 | 1.80 | 0.319 | 67.20 | 75.00 | 73.30 | 30.38 | 32.9 |
| RTMPose-s | 5.47 | 0.680 | 71.60 | 78.90 | 76.80 | 3.71 | 269.6 |
| StarNet-Pose-S | 6.428 | 0.765 | 72.99 | 80.69 | 76.01 | 5.76 | 173.5 |
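
Latency and FPS in this table describe the same measurement: at batch size 1, FPS should be roughly 1000 divided by the latency in milliseconds. The short check below uses values copied from the table above and confirms that the reported pairs agree up to rounding. The 0.5 FPS tolerance is an arbitrary choice made for this sketch.

```python
# (latency in ms, reported FPS), copied from the efficiency comparison
rows = {
    "Lite-HRNet-18": (18.14, 55.1),
    "RTMPose-t": (3.51, 285.1),
    "StarNet-Pose-T": (3.19, 313.8),
    "Lite-HRNet-30": (30.38, 32.9),
    "RTMPose-s": (3.71, 269.6),
    "StarNet-Pose-S": (5.76, 173.5),
}


def fps_from_latency(latency_ms: float) -> float:
    """Frames per second implied by a per-frame latency at batch size 1."""
    return 1000.0 / latency_ms


for name, (latency_ms, reported_fps) in rows.items():
    implied = fps_from_latency(latency_ms)
    # Latencies are rounded to two decimals, so allow a small mismatch
    status = "ok" if abs(implied - reported_fps) < 0.5 else "check"
    print(f"{name:15s} implied {implied:6.1f} FPS, reported {reported_fps:6.1f} -> {status}")
```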

Model Specification

| Component | StarNet-Pose-T (Tiny) | StarNet-Pose-S (Small) |
| --- | --- | --- |
| Stem | 3×3 Conv-BN-ReLU6, stride 2 | 3×3 Conv-BN-ReLU6, stride 2 |
| Stage 1 | 24 ch, depth 2, no CA | 32 ch, depth 2, no CA |
| Stage 2 | 48 ch, depth 2, no CA | 64 ch, depth 2, no CA |
| Stage 3 | 96 ch, depth 8, no CA | 128 ch, depth 8, CA enabled |
| Stage 4 | 192 ch, depth 3, no CA | 256 ch, depth 4, CA enabled |
| Head | RTMCCHead (SimCC), in-ch 192 | RTMCCHead (SimCC), in-ch 256 |

📈 Ablation Studies

Attention Mechanism Ablation (COCO-mini, 210 epochs)

| Configuration | AP | AP⁵⁰ | AP⁷⁵ | AR | Best AP (epoch) |
| --- | --- | --- | --- | --- | --- |
| StarNet baseline (w/o attention) | 50.62 | 77.94 | 54.08 | 54.42 | 50.67 (180) |
| StarNet + CBAM | 49.54 | 77.21 | 54.47 | 53.35 | 50.21 (190) |
| StarNet + CA | 49.65 | 78.15 | 51.26 | 53.46 | 50.60 (170) |

Cross-Scale Component Ablation (COCO-mini, S1/Tiny level)

| Configuration | Backbone | CoarseDropout | AP |
| --- | --- | --- | --- |
| RTMPose-T baseline | CSPNeXt-Tiny | 1.0→0.5 schedule | 42.14 |
| StarNet-S1 (w/o CA) | StarNet-S1 | 1.0→0.5 schedule | 47.30 |
| StarNetCA-S1 | StarNet-S1 | 1.0→0.5 schedule | 46.86 |
| StarNetCA-S1 + fixed dropout | StarNet-S1 | Fixed p=0.6 | 47.13 |
| StarNet-S1 + fixed dropout | StarNet-S1 | Fixed p=0.6 | 47.48 |

To reproduce these experiments: bash experiments/occlusion_prob_study/run_fast_track_experiments.sh
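
The table distinguishes a scheduled CoarseDropout probability (1.0→0.5) from a fixed one (p=0.6). The sketch below illustrates the idea only. It assumes the schedule decays linearly over training, which the table does not specify, and it uses a simplified NumPy stand-in for CoarseDropout rather than the augmentation actually used in the experiments. The names `dropout_prob` and `coarse_dropout`, the hole parameters, and the 210-epoch horizon in the demo are illustrative choices.

```python
import numpy as np


def dropout_prob(epoch: int, total_epochs: int, start: float = 1.0, end: float = 0.5) -> float:
    """Linearly interpolate the augmentation probability from `start` to `end`."""
    if total_epochs <= 1:
        return end
    t = min(max(epoch, 0), total_epochs - 1) / (total_epochs - 1)
    return start + t * (end - start)


def coarse_dropout(image: np.ndarray, p: float, rng: np.random.Generator,
                   max_holes: int = 4, max_size: int = 32) -> np.ndarray:
    """With probability p, zero out up to `max_holes` random rectangles."""
    if rng.random() >= p:
        return image
    out = image.copy()
    h, w = out.shape[:2]
    for _ in range(int(rng.integers(1, max_holes + 1))):
        hole_h = int(rng.integers(1, min(max_size, h) + 1))
        hole_w = int(rng.integers(1, min(max_size, w) + 1))
        top = int(rng.integers(0, h - hole_h + 1))
        left = int(rng.integers(0, w - hole_w + 1))
        out[top:top + hole_h, left:left + hole_w] = 0  # simulated occluder
    return out


rng = np.random.default_rng(0)
image = np.ones((256, 192, 3), dtype=np.uint8)
for epoch in (0, 105, 209):
    p = dropout_prob(epoch, total_epochs=210)
    occluded = coarse_dropout(image, p, rng)
    print(f"epoch {epoch:3d}: p={p:.3f}, zeroed values={(occluded == 0).sum()}")
```

A fixed-probability run corresponds to calling `coarse_dropout` with a constant `p` (for example 0.6) instead of the scheduled value.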

🏗️ Architecture

Input (256×192×3)
    │
    ▼
┌──────────────┐
│  Stem (Conv) │  3×3 Conv-BN-ReLU6, stride=2
└──────┬───────┘
       ▼
┌──────────────┐
│  Stage 1     │  StarBlock ×N₁  (no CA)
│  (64×48)     │
└──────┬───────┘
       ▼  Downsample (3×3 Conv, stride=2)
┌──────────────┐
│  Stage 2     │  StarBlock ×N₂  (no CA)
│  (32×24)     │
└──────┬───────┘
       ▼  Downsample
┌──────────────┐
│  Stage 3     │  StarBlock ×N₃  (CA enabled in S variant)
│  (16×12)     │
└──────┬───────┘
       ▼  Downsample
┌──────────────┐
│  Stage 4     │  StarBlock ×N₄  (CA enabled in S variant)
│  (8×6)       │
└──────┬───────┘
       ▼
┌──────────────┐
│  RTMCC Head  │  SimCC-based coordinate classification
└──────┬───────┘
       ▼
   Keypoints (17 × 2·(W+H) logits)
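
The head outputs, for each of the 17 COCO keypoints, one classification vector over x bins and one over y bins. With W=192, H=256 and a split ratio of 2 (the factor 2 in 2·(W+H)), that is 384 + 512 = 896 logits per keypoint. The sketch below shows SimCC-style argmax decoding under those assumptions. `decode_simcc` is a hypothetical helper written for illustration, not the RTMCCHead code, and the min-of-peaks score is just one simple confidence proxy.

```python
import numpy as np


def decode_simcc(x_logits: np.ndarray, y_logits: np.ndarray, split_ratio: float = 2.0):
    """Convert per-axis logits of shape (K, bins) into (K, 2) pixel coordinates."""
    x_bins = x_logits.argmax(axis=1)
    y_bins = y_logits.argmax(axis=1)
    # Each bin covers 1/split_ratio of a pixel
    coords = np.stack([x_bins, y_bins], axis=1) / split_ratio
    # Use the weaker of the two axis peaks as a simple confidence proxy
    scores = np.minimum(x_logits.max(axis=1), y_logits.max(axis=1))
    return coords, scores


K, W, H, ratio = 17, 192, 256, 2
x_logits = np.zeros((K, W * ratio))
y_logits = np.zeros((K, H * ratio))
x_logits[:, 100] = 5.0  # peak at x bin 100, i.e. x = 50.0 px
y_logits[:, 301] = 4.0  # peak at y bin 301, i.e. y = 150.5 px

coords, scores = decode_simcc(x_logits, y_logits, ratio)
print(x_logits.shape[1] + y_logits.shape[1])   # 896
print(coords[0].tolist(), float(scores[0]))    # [50.0, 150.5] 4.0
```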

The StarBlock core operation (multiplicative feature interaction):

         ┌─────────┐          ┌─────────┐
   x ───▶│ DWConv  │──▶ f₁ ──▶│ ReLU6   │──┐
         │  7×7    │          └─────────┘  │  element-wise
         └─────────┘                       ├──▶ multiply ──▶ g ──▶ DWConv2 ──▶ + ──▶ out
         ┌─────────┐                       │                                   ▲
         │ DWConv  │──▶ f₂ ────────────────┘                                   │
         │  7×7    │                                               input ──────┘ (residual)
         └─────────┘
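
As a rough PyTorch rendering of the diagram, the sketch below implements one block built around the star operation. It is illustrative rather than the repository's module: it assumes a single shared 7×7 depthwise convolution feeding two 1×1 branches f₁ and f₂ (the diagram draws the depthwise step once per branch), and the class name `StarBlockSketch` and the expansion ratio of 4 are choices made here.

```python
import torch
import torch.nn as nn


class StarBlockSketch(nn.Module):
    """DWConv -> two pointwise branches -> ReLU6(f1) * f2 -> g -> DWConv2 -> residual add."""

    def __init__(self, dim: int, mlp_ratio: int = 4):
        super().__init__()
        hidden = dim * mlp_ratio
        self.dwconv = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim),
            nn.BatchNorm2d(dim),
        )
        self.f1 = nn.Conv2d(dim, hidden, kernel_size=1)
        self.f2 = nn.Conv2d(dim, hidden, kernel_size=1)
        self.act = nn.ReLU6()
        self.g = nn.Sequential(nn.Conv2d(hidden, dim, kernel_size=1), nn.BatchNorm2d(dim))
        self.dwconv2 = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        identity = x
        x = self.dwconv(x)
        # Star operation: element-wise product of the activated branch and the linear branch
        x = self.act(self.f1(x)) * self.f2(x)
        x = self.dwconv2(self.g(x))
        return identity + x


block = StarBlockSketch(dim=24).eval()
with torch.no_grad():
    out = block(torch.randn(1, 24, 64, 48))
print(out.shape)  # torch.Size([1, 24, 64, 48])
```

The block preserves the input shape, so it can be stacked N times within a stage; the demo uses the Stage 1 width (24 channels) and resolution (64×48) of the Tiny variant.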

Coordinate Attention (applied after the g projection in StarNet-Pose-S):

x (B,C,H,W) ──┬──▶ Pool H (B,C,H,1) ──┐
              │                       ├──▶ Concat ──▶ Conv1×1+BN+ReLU ──▶ Split
              └──▶ Pool W (B,C,1,W) ──┘       │                    │
                                              ▼                    ▼
                                       ConvH → Sigmoid      ConvW → Sigmoid
                                              │                    │
                                              └────▶ x × attnH × attnW ──▶ out
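
The data flow above can be written compactly in PyTorch. This is a hedged sketch of Coordinate Attention as drawn in the diagram, not the contents of `coordinate_attention.py`: the class name `CoordAttSketch`, the reduction ratio of 32, and the floor of 8 hidden channels are assumptions, and ReLU is used for the shared transform because the diagram shows it.

```python
import torch
import torch.nn as nn


class CoordAttSketch(nn.Module):
    """Direction-aware attention: pool along W and H separately, then gate the input."""

    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        hidden = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (B,C,H,W) -> (B,C,H,1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (B,C,H,W) -> (B,C,1,W)
        self.shared = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
        )
        self.conv_h = nn.Conv2d(hidden, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        x_h = self.pool_h(x)                           # (B,C,H,1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)       # (B,C,W,1)
        y = self.shared(torch.cat([x_h, x_w], dim=2))  # (B,hidden,H+W,1)
        y_h, y_w = torch.split(y, [h, w], dim=2)
        attn_h = torch.sigmoid(self.conv_h(y_h))                      # (B,C,H,1)
        attn_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (B,C,1,W)
        return x * attn_h * attn_w  # broadcasts over W and H respectively


ca = CoordAttSketch(128).eval()
with torch.no_grad():
    out = ca(torch.randn(2, 128, 16, 12))
print(out.shape)  # torch.Size([2, 128, 16, 12])
```

The demo input matches Stage 3 of the Small variant (128 channels at 16×12), one of the two stages where CA is enabled.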

See figures/ for detailed architecture diagrams (.drawio format).

📁 Repository Structure

starnet-pose/
├── starnet.py                    # Standalone StarNet implementation (no MMPose dep.)
├── mmpose/models/backbones/
│   ├── starnet.py                # StarNet backbone for MMPose (StarNet-S1/S3)
│   ├── starnet_ca.py             # StarNetCA backbone (StarNet + Coordinate Attention)
│   └── utils/
│       └── coordinate_attention.py  # Coordinate Attention module (CVPR 2021)
├── configs/body_2d_keypoint/rtmpose/coco/
│   ├── rtmpose_starnet-s3_*.py            # StarNet-Pose-S (vanilla StarNet-S3)
│   ├── rtmpose_starnetca-s3_*.py          # StarNet-Pose-S (StarNet-S3 + CA stages 3-4)
│   ├── rtmpose_starnetca-s3_*_a800.py     # A800-optimized config
│   └── rtmpose_*_mini.py                  # COCO-mini fast evaluation configs
├── experiments/occlusion_prob_study/
│   ├── datasets.py               # Dataset path configuration
│   ├── experiment_matrix.py      # Experiment suite definitions
│   ├── generate_configs.py       # Config generator
│   ├── run_fast_track_experiments.sh  # Fast evaluation pipeline (~2 hours)
│   ├── run_all_paper_experiments.sh   # Full experiment suite
│   └── generated_configs/        # Auto-generated experiment .py files
├── tools/
│   ├── train.py                  # Training launch script
│   ├── benchmark_latency.py      # FPS & latency benchmark
│   ├── test_starnet_*.py         # Unit tests
│   └── visualize_*.py            # Visualization utilities
├── demo/                         # Demo examples
├── figures/                      # Architecture diagrams (.drawio)
├── requirements.txt              # Python dependencies
├── LICENSE                       # Apache 2.0 License
├── CITATION.cff                  # Citation metadata
└── README.md                     # This file

🀝 Acknowledgements

This work builds upon several excellent open-source projects, including MMPose, Rewrite-the-Stars (StarNet), and CoordAttention (Coordinate Attention).

📜 Citation

If you use StarNet-Pose in your research, please cite:

@article{pan2025starnetpose,
  title   = {StarNet-Pose: Heatmap-Free Lightweight Pose Estimation via
             Multiplicative Feature Interaction and Occlusion-Aware Training},
  author  = {Zheng Luo and Guowei Jiang and Runhang Pan and Qi Qi and Xin Xie and Siyuan Chen},
  journal = {Neurocomputing},
  year    = {2026},
  url     = {https://github.com/lechan775/starnet-pose}
}

Also consider citing the foundational works:

@inproceedings{ma2024starnet,
  title   = {Rewrite the Stars},
  author  = {Ma, Xu and ...},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer
               Vision and Pattern Recognition (CVPR)},
  year    = {2024}
}

@inproceedings{hou2021ca,
  title   = {Coordinate Attention for Efficient Mobile Network Design},
  author  = {Hou, Qibin and Zhou, Daquan and Feng, Jiashi},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer
               Vision and Pattern Recognition (CVPR)},
  year    = {2021}
}

@misc{mmpose2020,
  title  = {OpenMMLab Pose Estimation Toolbox and Benchmark},
  author = {MMPose Contributors},
  year   = {2020},
  url    = {https://github.com/open-mmlab/mmpose}
}

📝 License

This project is released under the Apache 2.0 License.

Note: The StarNet backbone code is adapted from Rewrite-the-Stars. The Coordinate Attention module is adapted from CoordAttention. This project inherits the Apache 2.0 license from MMPose.


Built with ❤️ upon the MMPose ecosystem
