Add detailed hetero MIMO timeline profiling #25

Draft
yashaswikarnati wants to merge 1 commit into ykarnati/nmfw-464-nemotron-vlm-with-hetero-parallel from ykarnati/nmfw-464-timeline-profiling

@yashaswikarnati
Owner

Summary

  • add iteration-window and per-event flushing controls to hetero timeline tracing
  • add memory, optimizer, grad-finalize, MIMO forward, and bridge communicator timeline events
  • keep tracing inactive/no-op unless timeline profiling is enabled
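The behavior listed above (an iteration window plus a no-op path when profiling is off) could look roughly like the following. This is a minimal sketch, not the code in this PR: `TimelineTracer`, `start_iter`, `end_iter`, and `set_iteration` are hypothetical names chosen for illustration.

```python
import time
from contextlib import contextmanager


class TimelineTracer:
    """Hypothetical tracer: records events only when enabled and inside the window."""

    def __init__(self, enabled=False, start_iter=0, end_iter=None):
        self.enabled = enabled
        self.start_iter = start_iter
        self.end_iter = end_iter  # exclusive; None means no upper bound
        self.iteration = 0
        self.events = []  # (name, iteration, duration_seconds)

    def set_iteration(self, iteration):
        self.iteration = iteration

    def active(self):
        # Tracing is off unless explicitly enabled and within [start_iter, end_iter).
        if not self.enabled or self.iteration < self.start_iter:
            return False
        return self.end_iter is None or self.iteration < self.end_iter

    @contextmanager
    def event(self, name):
        if not self.active():
            # No-op path: run the wrapped code without timing or recording.
            yield
            return
        start = time.perf_counter()
        try:
            yield
        finally:
            self.events.append((name, self.iteration, time.perf_counter() - start))
```

With the default `enabled=False`, wrapping a step in `with tracer.event("optimizer"):` records nothing, so call sites can stay in place whether or not profiling is on.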

Testing

  • python3 -m py_compile examples/mimo/training/hetero/args.py examples/mimo/training/hetero/grad_sync.py examples/mimo/training/hetero/loop.py examples/mimo/training/hetero/step.py examples/mimo/training/hetero/timeline.py megatron/core/models/mimo/model/base.py megatron/core/pipeline_parallel/bridge_communicator.py megatron/core/pipeline_parallel/timeline.py
  • git diff --check --cached
  • pre-commit hooks during git commit/push: black, pylint, isort

Comment thread examples/mimo/training/hetero/args.py Outdated
action="store_true",
help="Push NVTX ranges with timeline event names for Nsight Systems.",
)
runtime.add_argument(
Owner Author


do we need this?

Owner Author


Removed it. This was only a hang-forensics escape hatch for flushing an event-start record before every traced event, and it is not needed for the steady-state 1F1B timeline profiling path. The PR now keeps normal buffered flush behavior only.
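For readers unfamiliar with the distinction, buffered flushing can be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: `BufferedEventWriter`, `flush_every`, and the JSON-lines output format are all hypothetical.

```python
import io
import json


class BufferedEventWriter:
    """Hypothetical writer: holds events in memory and writes them in batches."""

    def __init__(self, sink, flush_every=64):
        self.sink = sink  # any file-like object with write() and flush()
        self.flush_every = flush_every
        self.buffer = []

    def record(self, event):
        # Events accumulate in memory; nothing is written until the threshold.
        self.buffer.append(event)
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One write per batch, one JSON object per line.
        self.sink.write("".join(json.dumps(e) + "\n" for e in self.buffer))
        self.sink.flush()
        self.buffer.clear()
```

The trade-off is that a hang or crash can lose whatever is still in the buffer, which is why a per-event flush is useful for hang forensics but adds I/O on every event during steady-state profiling.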

@yashaswikarnati force-pushed the ykarnati/nmfw-464-timeline-profiling branch from 974b11b to 7c6eeb6 on May 14, 2026 20:34
1 participant