Investigate ROCm CI failure: test_profiler_cuda_sync_events #1243

@scotts

CI failure example: https://github.com/pytorch/kineto/actions/runs/21599182062/job/62240085046?pr=1240

_________________ TestProfiler.test_profiler_cuda_sync_events __________________
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/unittest/case.py", line 57, in testPartExecutor
    yield
  File "/opt/conda/lib/python3.11/unittest/case.py", line 623, in run
    self._callTestMethod(testMethod)
  File "/opt/conda/lib/python3.11/unittest/case.py", line 579, in _callTestMethod
    if method() is not None:
       ^^^^^^^^
  File "/pytorch/torch/testing/_internal/common_utils.py", line 3364, in wrapper
    method(*args, **kwargs)
  File "/pytorch/test/profiler/test_profiler.py", line 1515, in test_profiler_cuda_sync_events
    trace_and_check(exp_config=_ExperimentalConfig(enable_cuda_sync_events=True))
  File "/pytorch/test/profiler/test_profiler.py", line 1509, in trace_and_check
    self.assertTrue(
  File "/opt/conda/lib/python3.11/unittest/case.py", line 715, in assertTrue
    raise self.failureException(msg)
AssertionError: False is not true : Expected to find cuda_sync event found = {'cpu_op', 'Trace', 'cuda_runtime', None, 'ac2g', 'kernel'}

To execute this test, run the following from the base repo dir:
    python test/profiler/test_profiler.py TestProfiler.test_profiler_cuda_sync_events
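For triage, the check that fails can be approximated outside the test harness. The sketch below assumes the exported profile is a Chrome-trace style JSON object with a top-level "traceEvents" list, and that the test gathers each event's "cat" field (the None in the reported set suggests that some events carry no "cat" key). The helper name event_categories and the miniature sample trace are hypothetical, built only to reproduce the categories from the assertion message; they are not taken from test_profiler.py.

```python
import json


def event_categories(trace):
    """Collect the "cat" value of every event in a Chrome-trace dict.

    Events without a "cat" key contribute None, matching the None seen
    in the assertion message above.
    """
    return {event.get("cat") for event in trace.get("traceEvents", [])}


# Hypothetical miniature trace: one event per category reported in the
# failure, plus a metadata event that has no "cat" key. Event names are
# placeholders, not real profiler output.
sample_trace = json.loads("""
{
  "traceEvents": [
    {"ph": "M", "name": "process_name"},
    {"ph": "X", "cat": "cpu_op", "name": "example_op"},
    {"ph": "X", "cat": "cuda_runtime", "name": "example_runtime_call"},
    {"ph": "X", "cat": "kernel", "name": "example_kernel"},
    {"ph": "s", "cat": "ac2g", "name": "ac2g"},
    {"ph": "X", "cat": "Trace", "name": "example_trace_span"}
  ]
}
""")

found = event_categories(sample_trace)
has_sync = "cuda_sync" in found

# key=str lets None sort alongside the string categories.
print("categories:", sorted(found, key=str))
print("cuda_sync present:", has_sync)
```

Running the same helper on a trace exported from a ROCm run and from a CUDA run would show whether the category is missing entirely on ROCm or emitted under a different name.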
