Code for Leveraging Synthetic Adult Datasets for Unsupervised Infant Pose Estimation, CVPR 2025 Workshops
Sarosij Bose, Hannah Dela Cruz, Arindam Dutta, Elena Kokkoni, Konstantinos Karydis, and Amit Kumar Roy Chowdhury
From left to right: keypoint predictions from a baseline adult human pose estimation model (Xiao et al., 2018), predictions from a SOTA UDA pose estimation model (UDAPE; Kim et al., 2022), and predictions from our method, SHIFT. Adult pose estimation models fail when applied directly to infant data, and UDAPE likewise struggles to overcome the domain shift between adults and infants. In contrast, SHIFT accounts for the highly self-occluded pose distribution of infants, thereby adapting effectively to the infant domain.
SHIFT leverages the mean-teacher framework (Tarvainen et al., 2017) to adapt a model pretrained on a labeled adult source dataset to the unlabeled infant target domain.
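For readers unfamiliar with the mean-teacher setup, the sketch below shows its core exponential-moving-average (EMA) update in PyTorch. This is a minimal illustration, not code from this repository; the function names and the 0.999 decay are assumptions.

import copy
import torch

def make_teacher(student: torch.nn.Module) -> torch.nn.Module:
    # The teacher starts as a frozen copy of the (pretrained) student.
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, decay: float = 0.999):
    # Teacher weights track an exponential moving average of the student weights.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)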
Bibtex
@InProceedings{Bose_2025_CVPR,
author = {Bose, Sarosij and Cruz, Hannah Dela and Dutta, Arindam and Kokkoni, Elena and Karydis, Konstantinos and Chowdhury, Amit Kumar Roy},
title = {Leveraging Synthetic Adult Datasets for Unsupervised Infant Pose Estimation},
booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops},
month = {June},
year = {2025},
pages = {5562-5571}
}
Dataset Preparation
SURREAL Dataset: as instructed by UDA_PoseEstimation, the SURREAL dataset can be downloaded automatically by the training scripts.
MINI-RGBD Dataset: download the MINI-RGBD dataset manually and save it to ../MINI-RGBD_web
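A quick sanity check that MINI-RGBD ended up where the scripts expect it (illustrative only; the path follows the instruction above):

from pathlib import Path

mini_rgbd = Path("../MINI-RGBD_web")
assert mini_rgbd.is_dir(), f"MINI-RGBD not found at {mini_rgbd.resolve()}"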
Prior Module Training
See the prior folder for instructions.
Keypoint-to-Segmentation Module Training
python train_keypoint_to_segmentation.py --dset-root path/to/surreal_processed --dset SURREAL --arch pose_resnet101 --image-size 256 --heatmap-size 64 --batch-size 32 --log path/to/log/directory --lr 0.0003 --workers 2 --seg-threshold 0.5 --iters-per-epoch 500 --epochs 30 --seed 0 --print-freq 100 --save-dir path/to/save/directory
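The --seg-threshold flag suggests that predicted body-segmentation probabilities are binarized at 0.5. Below is a hedged sketch of that step, assuming sigmoid outputs of shape (B, 1, H, W); the actual post-processing lives in train_keypoint_to_segmentation.py.

import torch

def binarize_segmentation(seg_logits: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    # Convert raw logits to probabilities, then threshold to a {0, 1} mask.
    probs = torch.sigmoid(seg_logits)
    return (probs > threshold).float()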
SURREAL-to-MINIRGBD
python train_human_to_infant.py path/to/surreal path/to/mini-rgbd -s SURREAL -t MiniRGBD --target-train MiniRGBD_mt --log logs/surreal2minirgbd --prior path/to/prior_stage_3.pt --kp2seg /path/to/kp2seg_data/SURREAL_kp2seg_gan.pt --debug --lambda_c 1 --pretrain-epoch 40 --lambda_s 1e-6 --lambda_p 1e-6 --mode 'all' --rotation_stu 60 --shear_stu -30 30 --translate_stu 0.05 0.05 --scale_stu 0.6 1.3 --color_stu 0.25 --blur_stu 0 --rotation_tea 60 --shear_tea -30 30 --translate_tea 0.05 0.05 --scale_tea 0.6 1.3 --color_tea 0.25 --blur_tea 0 -b 32 --mask-ratio 0.5 --k 1 --s2t-freq 0.5 --s2t-alpha 0 1 --t2s-freq 0.5 --t2s-alpha 0 1 --occlude-rate 0.5 --occlude-thresh 0.9
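The *_stu / *_tea flags configure the student- and teacher-branch augmentations. The snippet below approximates the image-level effect of the student flags with torchvision; it is an assumption for illustration, since the repository applies its own keypoint-aware transforms so that heatmap labels stay aligned with the augmented images.

from torchvision import transforms

student_aug = transforms.Compose([
    transforms.RandomAffine(
        degrees=60,              # --rotation_stu 60
        translate=(0.05, 0.05),  # --translate_stu 0.05 0.05
        scale=(0.6, 1.3),        # --scale_stu 0.6 1.3
        shear=(-30, 30),         # --shear_stu -30 30
    ),
    transforms.ColorJitter(0.25, 0.25, 0.25, 0.25),  # --color_stu 0.25
])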
SURREAL-to-SyRIP
python train_human_to_infant.py path/to/surreal path/to/SyRIP/data/syrip/images -s SURREAL -t SyRIP --target-train SyRIP_mt --log logs/surreal2syrip --prior path/to/prior_stage_3.pt --kp2seg /path/to/kp2seg_data/SURREAL_kp2seg_gan.pt --debug --lambda_c 1 --pretrain-epoch 40 --lambda_s 1e-6 --lambda_p 1e-6 --mode 'all' --rotation_stu 60 --shear_stu -30 30 --translate_stu 0.05 0.05 --scale_stu 0.6 1.3 --color_stu 0.25 --blur_stu 0 --rotation_tea 60 --shear_tea -30 30 --translate_tea 0.05 0.05 --scale_tea 0.6 1.3 --color_tea 0.25 --blur_tea 0 -b 32 --mask-ratio 0.5 --k 1 --s2t-freq 0.5 --s2t-alpha 0 1 --t2s-freq 0.5 --t2s-alpha 0 1 --occlude-rate 0.5 --occlude-thresh 0.9
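The --occlude-rate and --occlude-thresh flags hint at confidence-gated occlusion of teacher predictions. The following is a hypothetical sketch of that idea; the parameter semantics here are assumptions, not taken from the training script.

import torch

def occlude_low_confidence(heatmaps: torch.Tensor, rate: float = 0.5, thresh: float = 0.9) -> torch.Tensor:
    # heatmaps: (B, K, H, W) teacher keypoint heatmaps.
    # With probability `rate`, zero out keypoint channels whose peak
    # confidence falls below `thresh`.
    out = heatmaps.clone()
    peak = heatmaps.amax(dim=(2, 3))                      # (B, K) peak per keypoint
    drop = (peak < thresh) & (torch.rand_like(peak) < rate)
    out[drop] = 0.0
    return out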

