
Chris-airobot/human_tracking


Human Tracking Synthetic Data Generation

A minimal but verified synthetic RGB-D data generation pipeline for human tracking and pose-related computer vision tasks in NVIDIA Isaac Sim.

This project uses SMPL/AMASS-style human motion as the source of body pose and shape, renders the human in Isaac Sim, and exports synchronized RGB, depth, mask, bounding box, 3D joints, camera metadata, and SMPL annotations.

Large-scale generation is not the focus yet. The first goal is to build a reliable geometry-and-annotation pipeline and verify that the exported labels align with the rendered image. Only then should the system be scaled, one step at a time.


Motivation

Synthetic data is useful for human tracking because it can provide labels that are difficult or expensive to collect in real-world data, such as depth, segmentation masks, 3D joints, and parametric body annotations.

However, a synthetic image is only useful if its labels are trustworthy. A rendered RGB image may look correct even when the 3D pose, camera metadata, depth, mask, or bounding box is misaligned with it. This project therefore focuses on internal consistency first, before scaling to more humans, scenes, cameras, and appearance variation.


Development workflow

Workflow diagram

The project follows an iterative development process. I first define the assumptions and output labels, then build a minimal end-to-end pipeline, and then validate whether the exported annotations are consistent with the rendered frame.

Only after this clean pipeline is verified should the system move to appearance variation, real-world sanity checking, design revision, and large-scale data generation.

The current implementation covers the green part of the workflow: a controlled SMPL/AMASS/Isaac Sim pipeline with internal consistency validation. The next major step is an appearance module prototype for clothing, eyewear, and headwear.


Current pipeline

Pipeline diagram

The current pipeline starts from an AMASS motion sequence. AMASS provides natural mocap-based motion in an SMPL-compatible format, including pose, translation, and body-shape information.
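As a rough illustration of what "pose, translation, and body-shape information" can look like in practice, the sketch below splits a pose array into named parts. It assumes the commonly distributed AMASS layout of 156 axis-angle values per frame (3 for the root, 63 for 21 body joints, 90 for the hands). The key names `poses`, `trans`, and `betas`, the helper `split_amass_poses`, and the zero-filled stand-in sequence are assumptions for illustration and are not taken from this repository.

```python
import numpy as np


def split_amass_poses(poses: np.ndarray) -> dict:
    """Split an assumed (T, 156) axis-angle pose array into named parts."""
    if poses.ndim != 2 or poses.shape[1] != 156:
        raise ValueError(f"expected shape (T, 156), got {poses.shape}")
    return {
        "global_orient": poses[:, :3],  # root rotation, 3 values per frame
        "body_pose": poses[:, 3:66],    # 21 body joints x 3 values
        "hand_pose": poses[:, 66:],     # remaining 90 hand values
    }


# Zero-filled stand-in for a loaded sequence (hypothetical, not real AMASS data).
num_frames = 4
sequence = {
    "poses": np.zeros((num_frames, 156)),
    "trans": np.zeros((num_frames, 3)),  # per-frame root translation
    "betas": np.zeros(16),               # body-shape coefficients
}

parts = split_amass_poses(sequence["poses"])
print({name: value.shape for name, value in parts.items()})
```

How the hand values are mapped onto a plain SMPL body is left out here, because the text above does not specify it.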

The SMPL body model converts this motion into a posed human mesh, SMPL parameters, and 3D joints. Isaac Sim then renders the human in an indoor scene using a configured camera. The pipeline exports RGB, depth, and mask outputs, along with SMPL annotations, bounding boxes, 3D joints, and camera metadata.

A final overlay check verifies that the exported annotations match the rendered frame.


Demo outputs

Rendered RGB

RGB example

This is an example RGB frame rendered from Isaac Sim. The current version uses a clean SMPL body, which is useful for verifying the geometry and annotation pipeline before adding clothing and appearance variation.

Depth ground truth

Depth example

The depth image is exported from the renderer and gives a per-pixel distance for the same camera frame as the RGB image.
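One way to sanity-check a depth image against the camera metadata is to back-project it into 3D points. The sketch below assumes a pinhole camera and assumes the depth values are measured along the optical axis (distance to the image plane) rather than along each ray. The function name `backproject_depth` and the intrinsics used are hypothetical.

```python
import numpy as np


def backproject_depth(depth, fx, fy, cx, cy):
    """Convert an (H, W) z-depth image into (H, W, 3) camera-frame points."""
    height, width = depth.shape
    # u runs along columns (x), v runs along rows (y).
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)


# Toy example: a flat surface 2 units in front of the camera.
depth = np.full((4, 6), 2.0)
points = backproject_depth(depth, fx=100.0, fy=100.0, cx=3.0, cy=2.0)
print(points.shape)
```

If the renderer instead exports distance along each camera ray, the values would need to be converted to z-depth before this step.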

Instance mask

Mask example

The instance mask identifies visible human pixels in the rendered image. This is useful for segmentation, occlusion analysis, bounding box generation, and debugging label-image alignment.
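Bounding box generation from a mask can be sketched as follows. This is a minimal example assuming a boolean mask for a single person. The helper name `bbox_from_mask` and the inclusive `(x_min, y_min, x_max, y_max)` convention are assumptions, not necessarily what the pipeline exports.

```python
import numpy as np


def bbox_from_mask(mask):
    """Return (x_min, y_min, x_max, y_max), inclusive, or None for an empty mask."""
    rows = np.any(mask, axis=1)  # which rows contain any mask pixel
    cols = np.any(mask, axis=0)  # which columns contain any mask pixel
    if not rows.any():
        return None
    y_min, y_max = np.flatnonzero(rows)[[0, -1]]
    x_min, x_max = np.flatnonzero(cols)[[0, -1]]
    return int(x_min), int(y_min), int(x_max), int(y_max)


# Toy mask with a filled rectangle standing in for the visible human.
mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:7] = True
box = bbox_from_mask(mask)
print(box)
```

Deriving the box from the mask this way gives a box around visible pixels only, so an occluded person yields a smaller box than the full body extent.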

Sequence preview

Sequence preview

The sequence preview shows the SMPL human animated over time in the Isaac Sim scene. The current motion comes from AMASS and is converted through the SMPL body model.


Internal consistency validation

Overlay example

The overlay image is used to validate that the exported labels are aligned with the rendered RGB frame.

The validation checks include:

  • the instance mask covers the visible human body
  • the bounding box encloses the human region
  • projected SMPL joints align with the rendered body
  • RGB, depth, mask, SMPL metadata, and camera metadata refer to the same frame

This does not prove real-world transfer, but it verifies that the synthetic labels are geometrically consistent before scaling the dataset.
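The joint-alignment and bounding-box checks above can be expressed numerically as well as visually. The sketch below projects 3D joints with a pinhole model and tests whether they land inside a box. It assumes a camera that looks along +Z with a 4x4 world-to-camera matrix. USD cameras conventionally look along -Z, so an axis flip would be needed before applying this to Isaac Sim camera poses. All names and values are hypothetical.

```python
import numpy as np


def project_points(points_world, world_to_cam, K):
    """Project (N, 3) world points to (N, 2) pixels; also return camera-frame depth."""
    ones = np.ones((len(points_world), 1))
    homogeneous = np.hstack([points_world, ones])
    cam = (world_to_cam @ homogeneous.T).T[:, :3]
    uvw = (K @ cam.T).T
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]


def joints_inside_bbox(pixels, bbox):
    """Boolean array: True where a projected joint lies inside the inclusive box."""
    x_min, y_min, x_max, y_max = bbox
    return (
        (pixels[:, 0] >= x_min) & (pixels[:, 0] <= x_max)
        & (pixels[:, 1] >= y_min) & (pixels[:, 1] <= y_max)
    )


# Assumed intrinsics and an identity camera pose, for illustration only.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
world_to_cam = np.eye(4)
joints = np.array([[0.0, 0.0, 2.0],
                   [0.1, -0.1, 2.0]])

pixels, z = project_points(joints, world_to_cam, K)
inside = joints_inside_bbox(pixels, bbox=(20, 10, 40, 30))
print(pixels, inside)
```

A joint outside the box is not automatically an error, since occluded joints can project outside a box built from visible pixels. A large fraction of outliers is a stronger sign of a frame or camera mismatch.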


Output data

For each generated frame, the pipeline can export:

Output                 Description
RGB image              Rendered camera image from Isaac Sim
Depth image            Ground-truth depth from the renderer
Instance mask          Visible human mask in image space
Bounding box           2D box around the visible human
SMPL parameters        Body pose, shape, and translation metadata
3D joints              SMPL-derived joints transformed into render/world frame
Camera metadata        Camera pose and intrinsics used for projection
Verification overlay   RGB image with mask, bbox, and projected joints
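One possible way to keep these outputs tied to the same frame is a single per-frame record that references the image files and carries the numeric annotations. The sketch below is an assumption about how such a record could look. The class `FrameAnnotation`, its field names, and the file paths are hypothetical and do not describe the repository's actual export format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class FrameAnnotation:
    """Hypothetical per-frame record linking image files to their labels."""
    frame_index: int
    rgb_path: str
    depth_path: str
    mask_path: str
    bbox_xyxy: List[int]           # 2D box around the visible human
    joints_3d: List[List[float]]   # joints in the render/world frame
    smpl: dict                     # pose, shape, and translation metadata
    camera: dict                   # camera pose and intrinsics


record = FrameAnnotation(
    frame_index=0,
    rgb_path="rgb/000000.png",      # placeholder paths
    depth_path="depth/000000.npy",
    mask_path="mask/000000.png",
    bbox_xyxy=[3, 2, 6, 4],
    joints_3d=[[0.0, 0.0, 2.0]],
    smpl={"betas": [0.0], "trans": [0.0, 0.0, 0.0]},
    camera={"fx": 100.0, "fy": 100.0, "cx": 32.0, "cy": 24.0},
)

# Round-trip through JSON to confirm the record serializes cleanly.
text = json.dumps(asdict(record), indent=2)
restored = FrameAnnotation(**json.loads(text))
print(restored == record)
```

Keeping the frame index and all file references in one record makes it harder for RGB, depth, mask, and metadata to drift out of sync.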

Current status

Implemented:

  • SMPL/AMASS-based human mesh sequence generation
  • Isaac Sim RGB-D rendering
  • instance mask export
  • bounding box export
  • SMPL annotation export
  • 3D joint export in render/world frame
  • annotation overlay validation

Planned:

  • more AMASS motion sequences
  • more camera viewpoints and indoor scenes
  • body-shape variation through SMPL parameters
  • SMPL-compatible clothing / appearance variation
  • small real-world RGB-D reference check
  • downstream model testing for detection, pose, or tracking

Roadmap

The planned development path is:

  1. Verify the clean SMPL-based pipeline
  2. Add a small SMPL-compatible clothing / appearance prototype
  3. Collect a small real-world RGB-D reference set
  4. Compare synthetic and real data to identify gaps
  5. Revise generator settings
  6. Scale one factor at a time: motion, body shape, camera/scene, appearance, and multi-person cases
  7. Test downstream utility on detection, pose, or tracking models

Notes on assets and licensing

This repository is intended to contain code, configuration files, documentation, and small example outputs.

Large datasets, SMPL model files, AMASS data, and third-party assets should not be committed directly unless their licenses explicitly allow redistribution.

